8268361: Fix the infinite loop in next_line #4378
… in container environment, the while loop may lead to 100% cpu usage.
👋 Welcome back UncleNine! A progress list of the required criteria for merging this PR into |
@UncleNine The following labels will be automatically applied to this pull request:
When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing lists. If you would like to change these labels, use the /label pull request command. |
I have encountered a similar problem before, so I've filed a new issue for this: https://bugs.openjdk.java.net/browse/JDK-8268361. You can change your title to "8268361: Fix the infinite loop in next_line"; the OpenJDK bot will then associate your PR with the corresponding JBS issue. Best regards,
Hi,
This fixes the potential infinite loop, but is it sufficient?
Thanks,
David
    int c;
    do {
        c = fgetc(f);
    } while(c != '\n' && c != EOF);
Style nit: please add space before (
It is not obvious to me that the caller of next_line will handle the fact that we have hit EOF?
In my case, it happened in a container environment.
The /proc filesystem of the container is provided by lxcfs, but an lxcfs bug can cause the /proc/stat mount point to change. The file descriptor is then different and the fgetc function returns EOF on error. Since EOF is never '\n', the condition c != '\n' stays true and this leads to the infinite loop.
Below is our flamegraph from production. It happens with several frameworks (micrometer, elasticsearch, ...) that use the API "sun/management/OperatingSystemImpl.getSystemCpuLoad".
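To make the failure mode concrete, here is a minimal sketch in C. It assumes the pre-fix loop tested only for '\n' (the original code is not quoted in this thread), and the function names and the iteration cap are hypothetical, added only so the demonstration terminates.

```c
#include <stdio.h>

/* Assumed shape of the pre-fix loop: only '\n' ends it.
 * max_iters is a hypothetical cap so this demo cannot spin forever.
 * Returns the number of fgetc() calls made. */
static long skip_line_unchecked(FILE *f, long max_iters) {
    long calls = 0;
    while (calls < max_iters) {
        calls++;
        if (fgetc(f) == '\n') {
            break;              /* never reached once fgetc() keeps returning EOF */
        }
    }
    return calls;
}

/* Loop shape from the patch under review: stop on '\n' or EOF.
 * Returns the last value read so a caller can tell the two apart. */
static int skip_line_checked(FILE *f) {
    int c;
    do {
        c = fgetc(f);
    } while (c != '\n' && c != EOF);
    return c;
}
```

On a stream that only ever yields EOF (an empty file, or a stream in an error state), the unchecked variant exhausts whatever cap it is given, while the checked variant returns EOF after a single call.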
@UncleNine We should handle the case in the get_totalticks() function - which seems to be the only user of next_line() - when next_line() returns EOF, as David said. One way would be to return the 'c' character read in next_line and, if it's EOF, return -2 in get_totalticks().
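The suggestion above could look roughly like the following sketch. It is not the committed change: read_user_ticks is a hypothetical, heavily simplified stand-in for get_totalticks() that parses only the first field of the aggregate "cpu" line, and the -1 return for a parse failure is an assumption made for illustration.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Skip the rest of the current line; return the last character read,
 * which is '\n' on success or EOF if the stream ended or failed. */
static int next_line(FILE *f) {
    int c;
    do {
        c = fgetc(f);
    } while (c != '\n' && c != EOF);
    return c;
}

/* Hypothetical simplified caller (stand-in for get_totalticks()).
 * Returns 0 on success, -1 if the "cpu" line cannot be parsed (assumed
 * error code), and -2 if EOF is hit while skipping to the next line. */
static int read_user_ticks(FILE *f, uint64_t *user) {
    if (fscanf(f, "cpu %" SCNu64, user) != 1) {
        return -1;
    }
    if (next_line(f) == EOF) {
        return -2;
    }
    return 0;
}
```

Returning the character from next_line keeps the EOF decision in the caller, which is where the error code has to be produced anyway.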
All right, I have updated the code.
In what way have you observed /proc/stat being changed which manifests in 100% cpu usage?
In my case, it happened in the container environment.
Looks fine to me. Please fix the jcheck (whitespace) issue.
It looks okay to me.
Thanks,
Serguei
This seems better to me now.
Thanks,
David
@UncleNine This change now passes all automated pre-integration checks. ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details. After integration, the commit message for the final commit will be:
You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed. At the time when this comment was updated there had been 113 new commits pushed to the
As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details. As you do not have Committer status in this project an existing Committer must agree to sponsor your change. Possible candidates are the reviewers of this PR (@dholmes-ora, @jerboaa, @sspitsyn) but any other Committer may sponsor as well. ➡️ To flag this PR as ready for integration with the above commit message, type |
LGTM. Thanks for your patience!
/integrate |
@UncleNine |
/sponsor |
Going to push as commit 7267227.
Your commit was automatically rebased without conflicts. |
@jerboaa @UncleNine Pushed as commit 7267227. 💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored. |
@UncleNine please do not force push commits to an open PR as it makes it difficult for reviewers to track the changes. The PR can contain as many commits as you like as it will all be squashed to a single clean commit when integrating. Thanks. |
Would it not be better to read the whole content of /proc/stat with a single read() call instead of line by line? I don't think proc fs guarantees any kind of consistency with separate reads. |
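As a rough illustration of that alternative, the sketch below pulls a file into a caller-supplied buffer with one read() call and leaves parsing to the caller. The helper name read_once is hypothetical, the buffer is assumed to be large enough for the whole file, and whether a single read() of /proc/stat actually yields a consistent snapshot is the commenter's premise rather than something verified here.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: read up to cap - 1 bytes from path with a single
 * read() call and NUL-terminate the result.
 * Returns the number of bytes read, or -1 on any error. */
static ssize_t read_once(const char *path, char *buf, size_t cap) {
    if (cap == 0) {
        return -1;
    }
    int fd = open(path, O_RDONLY);
    if (fd < 0) {
        return -1;
    }
    ssize_t n = read(fd, buf, cap - 1);   /* one call, no per-line loop */
    close(fd);
    if (n < 0) {
        return -1;
    }
    buf[n] = '\0';
    return n;
}
```

The buffer can then be parsed in memory (for example with sscanf or strchr), so there is no character-by-character loop on the stream that could spin on EOF.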
Yes, so fgetc's return value needs to be checked.
If the /proc/stat mount point is changed in a container environment, the while loop may lead to 100% cpu usage.
Reviewing
Using git
Checkout this PR locally:
$ git fetch https://git.openjdk.java.net/jdk pull/4378/head:pull/4378
$ git checkout pull/4378
Update a local copy of the PR:
$ git checkout pull/4378
$ git pull https://git.openjdk.java.net/jdk pull/4378/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 4378
View PR using the GUI difftool:
$ git pr show -t 4378
Using diff file
Download this PR as a diff file:
https://git.openjdk.java.net/jdk/pull/4378.diff