Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix total disk bytes returning negative value #23093

Merged

Conversation

dakrone
Copy link
Member

@dakrone dakrone commented Feb 9, 2017

https://bugs.openjdk.java.net/browse/JDK-8162520
Some filesystems can be so large that they return a negative value for their
free/used/available disk bytes due to being larger than Long.MAX_VALUE.

This adds protection for our FsProbe implementation and adds a test that it
does the right thing.

You can see a failure from this here: https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+master+nfs/4/console

@dakrone dakrone added :Data Management/Stats Statistics tracking and retrieval APIs discuss v6.0.0-alpha1 labels Feb 9, 2017
fsPath.available = nodePath.fileStore.getUsableSpace();
fsPath.total = adjustForHugeFilesystems(nodePath.fileStore.getTotalSpace());
fsPath.free = adjustForHugeFilesystems(nodePath.fileStore.getUnallocatedSpace());
fsPath.available = adjustForHugeFilesystems(nodePath.fileStore.getUsableSpace());
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I am on the fence about these two, free and available, because I have seen filesystems where a negative "free" value is legal (I believe ZFS has returned negative free values before for disk space), so I'm not sure whether we should drop these.

I do think we should have the protection on getTotalSpace though, I don't think it's as likely for it to be negative and still a legal value.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should only make the change for total until we see evidence that adjustments need to be made for the others.

@dakrone dakrone force-pushed the workaround-for-openjdk-8162520-large-fs branch from af51c93 to f6a3d01 Compare February 11, 2017 00:31
@jasontedor
Copy link
Member

Can you please change the title of this PR to reflect the underlying problem (a JDK bug ID is too opaque), and rewrite the commit message to do the same?

Copy link
Member

@jasontedor jasontedor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left a comment on the code, and I would prefer a more transparent commit message.

fsPath.available = nodePath.fileStore.getUsableSpace();
fsPath.total = adjustForHugeFilesystems(nodePath.fileStore.getTotalSpace());
fsPath.free = adjustForHugeFilesystems(nodePath.fileStore.getUnallocatedSpace());
fsPath.available = adjustForHugeFilesystems(nodePath.fileStore.getUsableSpace());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should only make the change for total until we see evidence that adjustments need to be made for the others.

This adds a workaround for JDK-8162520 -
https://bugs.openjdk.java.net/browse/JDK-8162520

Some filesystems can be so large that they return a negative value for their
free/used/available disk bytes due to being larger than `Long.MAX_VALUE`.

This adds protection for our `FsProbe` implementation and adds a test that it
does the right thing.
@dakrone dakrone force-pushed the workaround-for-openjdk-8162520-large-fs branch from f6a3d01 to 2cb898b Compare February 13, 2017 16:30
@dakrone
Copy link
Member Author

dakrone commented Feb 13, 2017

@jasontedor I pushed a new commit that has a different commit header (and removed the changes so it only affects total bytes)

@dakrone dakrone changed the title Add a workaround for JDK-8162520 Fix total disk bytes returning negative value Feb 13, 2017
Copy link
Member

@jasontedor jasontedor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. I left one more nit, but the change is good.

@@ -135,14 +135,22 @@ public FsInfo stats(FsInfo previous, @Nullable ClusterInfo clusterInfo) throws I
return Files.readAllLines(PathUtils.get("/proc/diskstats"));
}

/** See: https://bugs.openjdk.java.net/browse/JDK-8162520 */
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry to be a pain, but can you make this not a Javadoc, just a regular comment? 😦

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Certainly!

Copy link
Member

@jasontedor jasontedor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

@dakrone
Copy link
Member Author

dakrone commented Feb 13, 2017

retest this please

@dakrone dakrone merged commit b42d477 into elastic:master Feb 13, 2017
jasontedor added a commit to jasontedor/elasticsearch that referenced this pull request Feb 13, 2017
* master:
  Update to forbiddenapis 2.3 (improves Gradle configuration time) (elastic#23154)
  Make the version of the remote node accessible on a transport channel (elastic#23019)
  Fix total disk bytes returning negative value (elastic#23093)
  Fix communication with 5.3.0 nodes
  Update redirects.asciidoc (elastic#23148)
dakrone added a commit to dakrone/elasticsearch that referenced this pull request Feb 22, 2017
From elastic#23093, we fixed the issue where a filesystem can be so large that it
overflows and returns a negative number. However, there is another issue when
adding a path as a sub-path to another `FsInfo.Path` object, when adding the
totals the values can still overflow.

This adds the same safety to return `Long.MAX_VALUE` instead of the negative
number, as well as a test exercising the logic.
jasontedor pushed a commit that referenced this pull request Mar 19, 2017
* Fix total disk bytes returning negative value

This adds a workaround for JDK-8162520 -
https://bugs.openjdk.java.net/browse/JDK-8162520

Some filesystems can be so large that they return a negative value for their
free/used/available disk bytes due to being larger than `Long.MAX_VALUE`.

This adds protection for our `FsProbe` implementation and adds a test that it
does the right thing.
jasontedor pushed a commit that referenced this pull request Mar 19, 2017
From #23093, we fixed the issue where a filesystem can be so large that it
overflows and returns a negative number. However, there is another issue when
adding a path as a sub-path to another `FsInfo.Path` object, when adding the
totals the values can still overflow.

This adds the same safety to return `Long.MAX_VALUE` instead of the negative
number, as well as a test exercising the logic.
jasontedor pushed a commit that referenced this pull request Mar 19, 2017
* Fix total disk bytes returning negative value

This adds a workaround for JDK-8162520 -
https://bugs.openjdk.java.net/browse/JDK-8162520

Some filesystems can be so large that they return a negative value for their
free/used/available disk bytes due to being larger than `Long.MAX_VALUE`.

This adds protection for our `FsProbe` implementation and adds a test that it
does the right thing.
jasontedor pushed a commit that referenced this pull request Mar 19, 2017
From #23093, we fixed the issue where a filesystem can be so large that it
overflows and returns a negative number. However, there is another issue when
adding a path as a sub-path to another `FsInfo.Path` object, when adding the
totals the values can still overflow.

This adds the same safety to return `Long.MAX_VALUE` instead of the negative
number, as well as a test exercising the logic.
@jasontedor
Copy link
Member

I backported this to 5.3 and 5.x.

dakrone added a commit to dakrone/elasticsearch that referenced this pull request May 26, 2017
In elastic#23093 we made a change so that total bytes for a filesystem would not be a
negative value when the total bytes were > Long.MAX_VALUE.

This fixes elastic#24453 which had a related issue where `available` and `free` bytes
could also be so large that they were negative. These will now return
`Long.MAX_VALUE` for the bytes if the JDK returns a negative value.
@dakrone dakrone deleted the workaround-for-openjdk-8162520-large-fs branch December 13, 2017 20:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants