Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closing a ReleasableBytesStreamOutput closes the underlying BigArray #23941

Merged
merged 5 commits into from Apr 14, 2017

Conversation

jaymode
Copy link
Member

@jaymode jaymode commented Apr 6, 2017

This commit makes closing a ReleasableBytesStreamOutput release the underlying BigArray so
that we can use try-with-resources with these streams and avoid leaking memory by not returning
the BigArray. As part of this change, the ReleasableBytesStreamOutput adds protection to only
release the BigArray once.

In order to make some of the changes cleaner, the ReleasableBytesStream interface has been
removed. The BytesStream interface is changed to a abstract class so that we can use it as a
useable return type for a new method, Streams#flushOnCloseStream. This new method wraps a
given stream and overrides the close method so that the stream is simply flushed and not closed.
This behavior is used in the TcpTransport when compression is used with a
ReleasableBytesStreamOutput as we need to close the compressed stream to ensure all of the data
is written from this stream. Closing the compressed stream will try to close the underlying stream
but we only want to flush so that all of the written bytes are available.

Additionally, an error message method added in the BytesRestResponse did not use a builder
provided by the channel and instead created its own JSON builder. This changes that method to use
the channel builder and in turn the bytes stream output that is managed by the channel.

Note, this commit differs from 6bfecdf in that it updates
ReleasableBytesStreamOutput to handle the case of the BigArray decreasing in size, which changes
the reference to the BigArray. When the reference is changed, the releasable needs to be updated
otherwise there could be a leak of bytes and corruption of data in unrelated streams.

This reverts commit afd45c1.

This commit makes closing a ReleasableBytesStreamOutput release the underlying BigArray so
that we can use try-with-resources with these streams and avoid leaking memory by not returning
the BigArray. As part of this change, the ReleasableBytesStreamOutput adds protection to only
release the BigArray once.

In order to make some of the changes cleaner, the ReleasableBytesStream interface has been
removed. The BytesStream interface is changed to a abstract class so that we can use it as a
useable return type for a new method, Streams#flushOnCloseStream. This new method wraps a
given stream and overrides the close method so that the stream is simply flushed and not closed.
This behavior is used in the TcpTransport when compression is used with a
ReleasableBytesStreamOutput as we need to close the compressed stream to ensure all of the data
is written from this stream. Closing the compressed stream will try to close the underlying stream
but we only want to flush so that all of the written bytes are available.

Additionally, an error message method added in the BytesRestResponse did not use a builder
provided by the channel and instead created its own JSON builder. This changes that method to use
the channel builder and in turn the bytes stream output that is managed by the channel.

Note, this commit differs from 6bfecdf in that it updates
ReleasableBytesStreamOutput to handle the case of the BigArray decreasing in size, which changes
the reference to the BigArray. When the reference is changed, the releasable needs to be updated
otherwise there could be a leak of bytes and corruption of data in unrelated streams.

This reverts commit afd45c1.
@jaymode jaymode added :Core/Infra/Circuit Breakers Track estimates of memory consumption to prevent overload v5.4.0 v6.0.0-alpha1 WIP labels Apr 6, 2017
@jaymode
Copy link
Member Author

jaymode commented Apr 6, 2017

jenkins test this

3 similar comments
@jaymode
Copy link
Member Author

jaymode commented Apr 6, 2017

jenkins test this

@jaymode
Copy link
Member Author

jaymode commented Apr 6, 2017

jenkins test this

@jaymode
Copy link
Member Author

jaymode commented Apr 6, 2017

jenkins test this

@rjernst
Copy link
Member

rjernst commented Apr 6, 2017

This seems odd to me. If there is an underlying stream that we wrap, we should be closing it. The problem then seems to me that we are wrapping/delegating a stream we should not be closing? To be clear, flushing on close seems ok, but only before doing the actual close. Hacking a wrapper to not call close when close is called seems like it could lead to errors in the future where we don't close the underlying stream when we should have.

@jaymode
Copy link
Member Author

jaymode commented Apr 6, 2017

I agree that this is less than ideal. This is the second approach I have taken and this was given a LGTM in #23572; this PR is here to test in Jenkins because the previous pr had a bug that led to corrupted bytes but it never reproduced locally for me.

A lot of the ugliness stems from compression and how deflate compressor works and the fact that closing a releasable bytes stream should release its resources, which is what this change does. the deflateoutputstream needs to be closed in order to write all of the bytes and it closes the output stream it is given. The same holds for an xcontentbuilder which closes the stream when getting the bytes. If you have a idea to simplify this that would be great but this fixes a potential leak of bytes that we keep track of.

@jaymode
Copy link
Member Author

jaymode commented Apr 7, 2017

jenkins test this

@jaymode
Copy link
Member Author

jaymode commented Apr 13, 2017

jenkins test this

@jaymode jaymode added v5.5.0 and removed v5.4.0 labels Apr 13, 2017
@jaymode jaymode removed the WIP label Apr 14, 2017
@jaymode jaymode merged commit 30ab873 into elastic:master Apr 14, 2017
@jaymode jaymode deleted the bytes_tracking_2 branch April 14, 2017 15:42
jaymode added a commit that referenced this pull request Apr 14, 2017
…23941)

This commit makes closing a ReleasableBytesStreamOutput release the underlying BigArray so
that we can use try-with-resources with these streams and avoid leaking memory by not returning
the BigArray. As part of this change, the ReleasableBytesStreamOutput adds protection to only
release the BigArray once.

In order to make some of the changes cleaner, the ReleasableBytesStream interface has been
removed. The BytesStream interface is changed to a abstract class so that we can use it as a
useable return type for a new method, Streams#flushOnCloseStream. This new method wraps a
given stream and overrides the close method so that the stream is simply flushed and not closed.
This behavior is used in the TcpTransport when compression is used with a
ReleasableBytesStreamOutput as we need to close the compressed stream to ensure all of the data
is written from this stream. Closing the compressed stream will try to close the underlying stream
but we only want to flush so that all of the written bytes are available.

Additionally, an error message method added in the BytesRestResponse did not use a builder
provided by the channel and instead created its own JSON builder. This changes that method to use
the channel builder and in turn the bytes stream output that is managed by the channel.

Note, this commit differs from 7412555 in that it updates
ReleasableBytesStreamOutput to handle the case of the BigArray decreasing in size, which changes
the reference to the BigArray. When the reference is changed, the releasable needs to be updated
otherwise there could be a leak of bytes and corruption of data in unrelated streams.

This reverts commit a7c9658, which reverted #23572.
jaymode added a commit that referenced this pull request Apr 15, 2017
The cat APIs and rest tables would obtain a stream from the RestChannel, which happened to be a
ReleasableBytesStreamOutput. These APIs used the stream to write content to, closed the stream,
and then tried to send a response. After #23941 was merged, closing the stream meant that the bytes
were released for use elsewhere. This caused occasional corruption of the response when the bytes
were used prior to the response being sent.

This commit changes these two usages to wrap the stream obtained from the channel in a flush on
close stream so that the bytes are still reserved until the message is sent.
jaymode added a commit that referenced this pull request Apr 15, 2017
The cat APIs and rest tables would obtain a stream from the RestChannel, which happened to be a
ReleasableBytesStreamOutput. These APIs used the stream to write content to, closed the stream,
and then tried to send a response. After #23941 was merged, closing the stream meant that the bytes
were released for use elsewhere. This caused occasional corruption of the response when the bytes
were used prior to the response being sent.

This commit changes these two usages to wrap the stream obtained from the channel in a flush on
close stream so that the bytes are still reserved until the message is sent.
Tim-Brooks added a commit that referenced this pull request May 31, 2017
This is a follow-up to #23941. Currently there are a number of
complexities related to compression. The raw DeflaterOutputStream must
be closed prior to sending bytes to ensure that EOS bytes are written.
But the underlying ReleasableBytesStreamOutput cannot be closed until
the bytes are sent to ensure that the bytes are not reused.

Right now we have three different stream references hanging around in
TCPTransport to handle this complexity. This commit introduces
CompressibleBytesOutputStream to be one stream implemenation that will
behave properly with or without compression enabled.
jasontedor pushed a commit to jasontedor/elasticsearch that referenced this pull request Nov 27, 2017
This is a follow-up to elastic#23941. Currently there are a number of
complexities related to compression. The raw DeflaterOutputStream must
be closed prior to sending bytes to ensure that EOS bytes are written.
But the underlying ReleasableBytesStreamOutput cannot be closed until
the bytes are sent to ensure that the bytes are not reused.

Right now we have three different stream references hanging around in
TCPTransport to handle this complexity. This commit introduces
CompressibleBytesOutputStream to be one stream implemenation that will
behave properly with or without compression enabled.
jasontedor added a commit that referenced this pull request Nov 29, 2017
This is a follow-up to #23941. Currently there are a number of
complexities related to compression. The raw DeflaterOutputStream must
be closed prior to sending bytes to ensure that EOS bytes are written.
But the underlying ReleasableBytesStreamOutput cannot be closed until
the bytes are sent to ensure that the bytes are not reused.

Right now we have three different stream references hanging around in
TCPTransport to handle this complexity. This commit introduces
CompressibleBytesOutputStream to be one stream implemenation that will
behave properly with or without compression enabled.

Relates #27540
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Core/Infra/Circuit Breakers Track estimates of memory consumption to prevent overload >enhancement v5.5.0 v6.0.0-alpha1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants