Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CharSource.asByteSource(...).openStream() does not implement InputStream.available() #5377

Open
Marcono1234 opened this issue Jan 2, 2021 · 4 comments
Labels
P3 package=io type=enhancement Make an existing feature better

Comments

@Marcono1234
Copy link
Contributor

The InputStream returned by CharSource.asByteSource(...).openStream() (i.e. internal class ReaderInputStream) does not implement InputStream.available().

Example:

InputStream in = CharSource.wrap("a").asByteSource(StandardCharsets.UTF_8).openStream();
System.out.println(in.available()); // Returns 0

Possible implementation:

  • If com.google.common.io.ReaderInputStream.byteBuffer is non empty, its remaining() value can be returned
  • Otherwise try encoding remaining chars in com.google.common.io.ReaderInputStream.charBuffer
    Though note that even if buffer is not empty it might require more chars, e.g. when it only contains high surrogate and is missing low surrogate
  • Otherwise check if com.google.common.io.ReaderInputStream.reader is ready() and if so call reader.read() (other read(...) methods might block?); afterwards repeat steps above
  • Otherwise return 0

Though the above might be pretty inefficient in case someone repeatedly calls available() because it would then only encode one (or at most two) char(s) at a time. However, java.io.Reader.ready() sadly only guarantees that read() won't block. Therefore there might be implementations whose other read(...) methods might block so encoding more chars would not be possible.

@netdpb netdpb added P3 package=io type=enhancement Make an existing feature better labels Jan 6, 2021
@1c3t3a
Copy link

1c3t3a commented Feb 1, 2021

We are a group of students (Tobias, Elena, Dominik and me), who would like to contribute to guava, could we pick this up?

@Marcono1234
Copy link
Contributor Author

Additionally I think this check will always be false:

if (doneFlushing) {
result = CoderResult.UNDERFLOW;

If doneFlushing == true it was set in this invocation because otherwise the draining check at the beginning of the method would have returned fast.
However, if doneFlushing was set in this invocation, then draining would have started afterwards:

doneFlushing = true;
startDraining(false);
continue DRAINING;

Therefore that other doneFlushing check mentioned above is impossible to reach.

@cgdecker
Copy link
Member

cgdecker commented Feb 8, 2021

Something you don't explain in this issue that I'm curious about is why you want this stream to implement available(). As far as I can tell the main reason to want to use available() in the first place is to implement some kind of non-blocking thing where you only read from an InputStream when it says it has bytes available, but:

  • This isn't something you can rely on as a general rule given that not every InputStream implementation will actually implement available().
  • InputStream isn't great for implementing any kind of non-blocking I/O regardless; any modern code should probably be using Channels in some form for that.

Do you have a specific use case for this?

@Marcono1234
Copy link
Contributor Author

My rationale was that if an InputStream can in some way provide bytes without blocking, it should override available() because I assumed in general it would be useful to provide this information to the user of the API (without a specific use case in mind).
When I wrote this GitHub issue I was only aware of some of the corner cases, which need to be handled, but not all and only became aware of the remaining ones during the process of reviewing the pull request.
At this point I am also not really sure anymore if the amount of work required (and the implicit maintenance costs associated with it) really justifies the small benefit this would create (if any, maybe available() is indeed not widely used).
Sorry for all the trouble this caused, especially for @1c3t3a. Unless 1c3t3a wants to continue, feel free to close this issue if you think that this is not needed.

It would be possible to implement a 'lite' version which only covers certain cases but is greatly simplified. However, it might not be worth having an available() method which only reports in some cases that it won't block.

public int available() {
  if (draining) {
    return byteBuffer.remaining();
  }
  return 0;
}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
P3 package=io type=enhancement Make an existing feature better
Projects
None yet
Development

No branches or pull requests

4 participants