-
Notifications
You must be signed in to change notification settings - Fork 33
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Large files have negative available bytes due to long to int type cas… #21
Conversation
It was observed that NFSFileInputStream was coming back with 0 bytes if file was larger than 2.1GB (basically Integer.MAX_VALUE) |
This is actually an interface problem - for practical Java coding reasons, I implemented the The correct solution may be to return |
-1 implies end of stream. The use of this field is being used improperly as it is right now. Available is not meant to show the amount of bytes remaining in the file but rather the amount available to fetch from the stream without blocking (is the proper use case). Since it is being used to imitate this for pulling the chunked bytes it seemed better suited to imply that there are bytes available for now as total file size here is not at all a valid dataset . The interface is not the problem
… On Jun 29, 2018, at 12:27 PM, DavidASeibert ***@***.***> wrote:
This is actually an interface problem - for practical Java coding reasons, I implemented the InputStream interface, but that interface doesn't support large streams properly. The proposed fix will make the returned value look more sensible, but it will still be wrong, and it makes it harder than necessary for users to see if the value is correct or not.
The correct solution may be to return -1 if the value is larger than Integer.MaxValue, and then to implement an interface extension that handles these larger streams. I have not yet figured out how to design that extension for ease of use, though, and I do not know of any standard fix that has been developed for that interface. Do you have any suggestions for the complete fix?
—
You are receiving this because you authored the thread.
Reply to this email directly, view it on GitHub <#21 (comment)>, or mute the thread <https://github.com/notifications/unsubscribe-auth/AABaEtCMUdP9ZM7iG-yW4m51Bsfsb-Wvks5uBlWEgaJpZM4U7-yM>.
|
The proper answer or definition of the available() method from the javadoc is this... " So if implementing properly it would estimate this value and never be based on the size of the file itself. |
I should highlight this is a minimal fix so that it behaves acceptably with other streams for fetching files larger than 2.1GB (in our case its QCOW2 , RAW , or VMDK image files for backup/recovery) |
Yes, my initial thoughts were a bit off. Coming back here from other code...
That will give either the remaining bytes that can be read without going to the NFS file, or the best estimate of what the next NFS read will give if the buffer is currently empty and there is still data left to read. That will also give good processing results, as the buffer size is typically set to the optimum NFS read size for that system. That will also ensure that it never returns |
Yes it does. -1 is actually used a lot on HTTP InputStreams in their docs to imply EOF because 0 doesnt mean the stream is at end of file. It means the transfer might be hung and theres nothing to be read at this moment but there may be shortly.
… On Jun 29, 2018, at 1:00 PM, DavidASeibert ***@***.***> wrote:
Yes, my initial thoughts were a bit off. Coming back here from other code...
-1 does not imply EOF for available(), but 0 does according to the javadoc. Probably the best estimate the code can give is the following:
bytesLeftInBuffer() if that is > 0, or
0 if _isEof, or
_bytesInBuffer otherwise.
That will give either the remaining bytes that can be read without going to the NFS file, or the best estimate of what the next NFS read will give if the buffer is currently empty and there is still data left to read. That will also give good processing results, as the buffer size is typically set to the optimum NFS read size for that system. Does that make sense to you?
—
You are receiving this because you authored the thread.
Reply to this email directly, view it on GitHub <#21 (comment)>, or mute the thread <https://github.com/notifications/unsubscribe-auth/AABaEi5pwtpJu52VDJKIILHbKp_uv1ltks5uBl1IgaJpZM4U7-yM>.
|
The javadoc says that https://docs.oracle.com/javase/7/docs/api/java/io/InputStream.html#available() Ignoring |
Im reading the java doc and see nothing of the sort. As a matter of fact on the base implementation "The available method for class InputStream always returns 0.
“
It says it will return an ESTIMATE of the bytes available for reading or it will return 0 when it reaches the end of the stream. 0 could also be an estimate of bytes available for being read without blocking
… On Jun 29, 2018, at 1:08 PM, DavidASeibert ***@***.***> wrote:
The javadoc says that 0 means you are at the end of the stream. Yes, there may be extensions for subclasses that have special values for -1, but that doesn't really make sense for anything other than special devices, and we're trying to stay as plain vanilla as possible.
https://docs.oracle.com/javase/7/docs/api/java/io/InputStream.html#available() <https://docs.oracle.com/javase/7/docs/api/java/io/InputStream.html#available()>
—
You are receiving this because you authored the thread.
Reply to this email directly, view it on GitHub <#21 (comment)>, or mute the thread <https://github.com/notifications/unsubscribe-auth/AABaEgrLjPesBz7LQhxIWB8HG6EGA-8Lks5uBl8fgaJpZM4U7-yM>.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approving, although the solution will likely need tweaking before the next release.
I'm leaving this comment here, even though this PR has already been merged, because we are trying to get a release out that includes a debug fix (#42). Seems to me, based on the method contract, |
…ting..