Fix HTTP for kernel < 4.16 #2132
Signed-off-by: Jesse Ikawa <jikawa@amazon.com>
@ros-pull-request-builder please run the test again
Thanks for the review! I will update the PR.
one nitpick, lgtm
Looks like the checks passed. If it looks good, can we merge? Also, what's the process for backporting: should I just open another PR for melodic-devel?
I am good to go with this, but I am not a maintainer... @sloretz @jacobperron friendly ping.
I think the backport will be taken care of by CI.
Awesome, thanks for all the help!!
@jikawa-az |
Hi @sloretz @jacobperron, when you get a chance, may we please have a review? I would like to have this fix merged to address a customer's performance issues.
Friendly ping: @sloretz @jacobperron
Apologies for the delay. The change LGTM.
Addressing performance issues described in #2118
Signed-off-by: Jesse Ikawa <jikawa@amazon.com>
Co-authored-by: Emerson Knapp <537409+emersonknapp@users.noreply.github.com>
Update after @Crcodlus comment
I have the exact same problem. Any use of the rosmaster API, even a simple getPid, increases memory usage, and the memory is never released. After days of investigation, I found out it was caused by ros_comm itself and not by something in my project. I did a git-bisect and "8b4089917fef19ab9fd0e00f11ae06235ed7380b is the first bad commit". I work at a company with robots, and all robots updated to >= 1.15.10 with a Linux kernel < 4.16 have the bug. In our project it increases memory usage by 3 GB/hour and sometimes freezes the communication between nodes. I will have to force 1.15.9 or update the Linux kernel on all affected robots; I think that is a bit critical!
That's not good! This change to HTTP/1.1 that went in from Kinetic to Lunar made it so that our service calls would reliably fail. Is this application old enough that you happened to be running it on Kinetic? If so, did you experience any performance issues when you upgraded to Lunar/Melodic/Noetic? This issue is especially hard to test, because even containerizing doesn't help, since Docker shares the host kernel.
Yes, I was building from source to test the 1.15.9 version and then to do the git-bisect. I've just tested, and it is the I can't tell you about any performance issue; I didn't notice anything in the past (we were on Kinetic, then Melodic, and now Noetic) until this memory issue (which also triggers communication timeouts between some nodes).
Even if the code has changed, I think it is related to this: https://salsa.debian.org/debian/python-prometheus-client/commit/5aa256d8aab3b81604b855dc03f260342fc391fb
That's good to know! I see no reason why |
Sure: #2165
Please see #2182. This fix has unwanted side effects on 4.x systems: it does not allow setting parameters larger than 32 kB from roscpp.
@jikawa-az would you be able to write down a way to tell whether the performance hit you observed is resolved or not? There are several other issues around the change this PR made, and it would be good to know that solutions to those issues do not break the improvement your PR brought.
@jikawa-az or is the test app from #2118 still a good test for this?
Thanks for confirming that the test app from #2118 still fails. That's the only repro we currently have for the issue.
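For readers without access to that test app, a rough way to compare XML-RPC round-trip time under the two protocol versions is sketched below. This is not the #2118 repro and does not talk to rosmaster: it assumes Python's standard-library `SimpleXMLRPCServer` is a reasonable stand-in, and the `time_calls` helper, the `ThreadedXMLRPCServer` class, and the fixed-value `getPid` method are hypothetical names made up for illustration.

```python
import threading
import time
from socketserver import ThreadingMixIn
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCRequestHandler, SimpleXMLRPCServer


class ThreadedXMLRPCServer(ThreadingMixIn, SimpleXMLRPCServer):
    # Daemon threads so an open keep-alive connection cannot block shutdown.
    daemon_threads = True


def time_calls(protocol_version, n_calls=20):
    """Return the seconds taken by n_calls XML-RPC round trips.

    protocol_version is 'HTTP/1.0' or 'HTTP/1.1' and is applied to a
    throwaway handler subclass, so the stdlib handler class stays untouched.
    """
    handler = type("Handler", (SimpleXMLRPCRequestHandler,),
                   {"protocol_version": protocol_version})
    # Port 0 asks the OS for any free port on the loopback interface.
    server = ThreadedXMLRPCServer(("127.0.0.1", 0), requestHandler=handler,
                                  logRequests=False)
    # Hypothetical stand-in for a cheap master API call; returns a fixed value.
    server.register_function(lambda: 1234, "getPid")
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    port = server.server_address[1]
    try:
        with ServerProxy("http://127.0.0.1:%d" % port) as proxy:
            start = time.perf_counter()
            for _ in range(n_calls):
                proxy.getPid()
            return time.perf_counter() - start
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    for version in ("HTTP/1.0", "HTTP/1.1"):
        print(version, "%.3f s" % time_calls(version))
```

Running this on an affected kernel and on a 4.16+ kernel would only give a coarse signal; it measures latency on loopback and says nothing about the memory growth reported above.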
Fixes bug #2118
Linux kernels 4.15 and older encounter performance issues with HTTP/1.1.
I tested this against a customer's workflow on a 4.14 kernel, and it resolved their performance issues.
Will need to be backported to Melodic.
Signed-off-by: Jesse Ikawa <jikawa@amazon.com>
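The kernel check this description implies can be sketched roughly as follows. This is a minimal illustration and not the code from the PR diff: the function names `kernel_version` and `choose_protocol` are hypothetical, and falling back to HTTP/1.1 on non-Linux systems or on unparseable release strings is an assumption.

```python
import platform
import re

# From the PR title: kernels older than 4.16 are the affected ones.
MIN_KERNEL = (4, 16)


def kernel_version(release):
    """Parse (major, minor) from a release string such as '4.14.252-generic'.

    Returns None if the string does not start with 'major.minor'.
    """
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def choose_protocol(system=None, release=None):
    """Return 'HTTP/1.0' on Linux kernels older than 4.16, else 'HTTP/1.1'.

    system and release default to the running platform; passing them
    explicitly makes the decision easy to test.
    """
    system = platform.system() if system is None else system
    release = platform.release() if release is None else release
    if system != "Linux":
        return "HTTP/1.1"  # assumption: only Linux kernels are affected
    version = kernel_version(release)
    if version is not None and version < MIN_KERNEL:
        return "HTTP/1.0"
    return "HTTP/1.1"  # assumption: unparseable releases keep HTTP/1.1


if __name__ == "__main__":
    print(platform.system(), platform.release(), "->", choose_protocol())
```

The result could then be assigned to the `protocol_version` attribute of an XML-RPC request handler class, which is the standard-library switch between HTTP/1.0 and HTTP/1.1 behaviour.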