Add retries to handle connection reset errors for GZIPInputStream #16606
processing/src/main/java/org/apache/druid/data/input/impl/RetryingInputStreamUtils.java
processing/src/main/java/org/apache/druid/data/input/impl/RetryingGZIPInputStream.java
This error doesn't seem limited to the GZIPInputStream. What will happen if you retry the experiment, but not with a .gz file? Do we still retry in that case?
@LakshSingla I haven't tried it out. But this particular error

    public static InputStream decompress(final InputStream in, final String fileName) throws IOException
    {
      if (fileName.endsWith(Format.GZ.getSuffix())) {
        return gzipInputStream(in);
      } else if (fileName.endsWith(Format.BZ2.getSuffix())) {
        return new BZip2CompressorInputStream(in, true);
      } else if (fileName.endsWith(Format.XZ.getSuffix())) {
        return new XZCompressorInputStream(in, true);
      } else if (fileName.endsWith(Format.SNAPPY.getSuffix())) {
        return new FramedSnappyCompressorInputStream(in);
      .
      .
      .
    }
I don't think we need to solve for the error, but we are adding the missing retries. The error that the input stream throws when the server is disconnected is irrelevant as long as we retry it. Hence I was curious if the other streams retry normally if the server is disconnected.
Does it mean that when the server is disconnected, the inner input stream is working fine (and returning partial data), however the wrapping GZ stream is throwing because the data is incomplete? If so, why is the inner retrying stream not throwing an error and retrying on server disconnection?
Where are these other streams getting created?
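The question above turns on how a failure in the inner stream surfaces through the wrapping GZIPInputStream. As a minimal sketch using in-memory streams rather than a real HTTP connection, the snippet below contrasts two cases: an inner stream that ends early without an error, and an inner stream that throws. The class `TruncatedGzipSketch` and its helpers are hypothetical names made up for illustration; this is not Druid code and it does not reproduce the network behavior discussed in this thread.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical illustration only; none of these names come from the PR.
class TruncatedGzipSketch {
  // Builds a gzip payload large enough that cutting it in half lands inside the deflate data.
  static byte[] sampleGzip() throws IOException {
    StringBuilder sb = new StringBuilder();
    for (int i = 0; i < 2000; i++) {
      sb.append("row-").append(i).append('\n');
    }
    ByteArrayOutputStream bos = new ByteArrayOutputStream();
    try (GZIPOutputStream gz = new GZIPOutputStream(bos)) {
      gz.write(sb.toString().getBytes(StandardCharsets.UTF_8));
    }
    return bos.toByteArray();
  }

  // Drains the stream through GZIPInputStream and reports which exception class, if any, surfaced.
  static String drainGzip(InputStream compressed) {
    byte[] buf = new byte[4096];
    try (InputStream in = new GZIPInputStream(compressed)) {
      while (in.read(buf) != -1) {
        // discard the decompressed bytes
      }
      return "none";
    } catch (IOException e) {
      return e.getClass().getName();
    }
  }

  // Inner stream that throws once `limit` bytes have been served, simulating a reset.
  static InputStream failingAfter(byte[] data, int limit) {
    return new InputStream() {
      private int pos;

      @Override
      public int read() throws IOException {
        if (pos >= limit) {
          throw new IOException("Connection reset (simulated)");
        }
        return data[pos++] & 0xFF;
      }
    };
  }

  public static void main(String[] args) throws IOException {
    byte[] gz = sampleGzip();
    // Complete payload: no exception.
    System.out.println(drainGzip(new ByteArrayInputStream(gz)));
    // Inner stream reports a clean end-of-stream halfway: the gzip layer raises EOFException.
    System.out.println(drainGzip(new ByteArrayInputStream(gz, 0, gz.length / 2)));
    // Inner stream throws halfway: the IOException passes through the gzip layer.
    System.out.println(drainGzip(failingAfter(gz, gz.length / 2)));
  }
}
```

In the second case the exception originates in the gzip layer, not in the inner stream, so a retry mechanism that lives only inside the inner stream would never see it.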
No, the inner stream stops working. But since the inner stream isn't Previously:
In the code that I added above (
Linking the root cause: openjdk/jdk#19909
We don't use it directly; it gets used internally within some other JDK layer.
This will get fixed in newer JDK versions. The JDK fix linked above by Karan will be part of JDK 24. We could probably wrap a bunch of input stream layers with our own custom versions, but that would be very hard to maintain. The same applies to the S3 input source. Since this is a transient issue and the fix will ship in JDK 24, I'm closing this PR.
Description
This PR adds retries to handle connection reset errors while reading through a GZIPInputStream. We were running into the following error when the connection reset in the middle of ingesting a .gz file, causing the ingestion task to fail:
This PR wraps the GZIPInputStream in a retry wrapper using a new class, RetryingGZIPInputStream, that handles such connection failures and tries to continue the operation using retries.
Test plan
python -m http.server 9333
- I had wikipedia.json.gz in the same directory for the server to serve.
http://localhost:9333/wikipedia.json.gz
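To make the retry idea from the description concrete, here is a rough sketch of one way a wrapper could resume after a failed read: reopen the source, decompress from the start, and discard the bytes the caller has already received. This is an assumption about the general technique, not the PR's actual RetryingGZIPInputStream; `RetryingGzipSketch`, `StreamOpener`, `RetryingStream`, and the helpers are hypothetical names. It also omits backoff and any filtering of which exceptions are worth retrying.

```java
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical sketch of a retry wrapper; not the PR's implementation.
class RetryingGzipSketch {
  /** Hypothetical callback: opens a fresh decompressed stream positioned at byte 0. */
  interface StreamOpener {
    InputStream open() throws IOException;
  }

  static class RetryingStream extends InputStream {
    private final StreamOpener opener;
    private final int maxRetries;
    private InputStream delegate;
    private long consumed; // decompressed bytes already returned to the caller

    RetryingStream(StreamOpener opener, int maxRetries) {
      this.opener = opener;
      this.maxRetries = maxRetries;
    }

    @Override
    public int read() throws IOException {
      byte[] one = new byte[1];
      int n;
      do {
        n = read(one, 0, 1);
      } while (n == 0);
      return n == -1 ? -1 : (one[0] & 0xFF);
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
      for (int attempt = 0; ; attempt++) {
        try {
          if (delegate == null) {
            delegate = opener.open();
            skipFully(delegate, consumed); // discard what the caller already has
          }
          int n = delegate.read(buf, off, len);
          if (n > 0) {
            consumed += n;
          }
          return n;
        } catch (IOException e) {
          close(); // drop the broken stream; the next attempt reopens it
          if (attempt >= maxRetries) {
            throw e;
          }
        }
      }
    }

    @Override
    public void close() {
      if (delegate != null) {
        try {
          delegate.close();
        } catch (IOException ignored) {
          // the stream is being discarded anyway
        }
        delegate = null;
      }
    }
  }

  // Reads and discards exactly `count` bytes, failing if the stream ends first.
  static void skipFully(InputStream in, long count) throws IOException {
    byte[] scratch = new byte[8192];
    while (count > 0) {
      int n = in.read(scratch, 0, (int) Math.min(scratch.length, count));
      if (n < 0) {
        throw new EOFException("source is shorter than on the previous attempt");
      }
      count -= n;
    }
  }

  // Test helper: an inner stream that throws once `limit` bytes have been served.
  static InputStream failingAfter(byte[] data, int limit) {
    return new InputStream() {
      private int pos;

      @Override
      public int read() throws IOException {
        if (pos >= limit) {
          throw new IOException("Connection reset (simulated)");
        }
        return data[pos++] & 0xFF;
      }
    };
  }

  static byte[] gzip(byte[] raw) throws IOException {
    ByteArrayOutputStream bos = new ByteArrayOutputStream();
    try (GZIPOutputStream gz = new GZIPOutputStream(bos)) {
      gz.write(raw);
    }
    return bos.toByteArray();
  }

  static byte[] readAll(InputStream in) throws IOException {
    ByteArrayOutputStream bos = new ByteArrayOutputStream();
    byte[] buf = new byte[4096];
    for (int n; (n = in.read(buf, 0, buf.length)) != -1; ) {
      bos.write(buf, 0, n);
    }
    return bos.toByteArray();
  }
}
```

A caller would pass an opener such as `() -> new GZIPInputStream(openConnection())`, where `openConnection()` stands in for whatever re-fetches the compressed bytes. Skipping decompressed bytes keeps the sketch simple, at the cost of re-downloading and re-inflating the prefix on every retry.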
Key changed/added classes in this PR
- RetryingGZIPInputStream: Class to facilitate retries over a GZIPInputStream.
- RetryingInputStreamUtils: A utils class to extract common functionality between RetryingGZIPInputStream and RetryingInputStream.
This PR has: