Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HTTP/1.x and HTTP/2.0 decoder Buffer visibility and data corruption #898

Merged
merged 1 commit into from Dec 17, 2019

Conversation

@Scottmitch
Copy link
Member

Scottmitch commented Dec 13, 2019

Motivation:
ServiceTalk attempts to effectivley disable Netty's reference counting by using
a strategy similar to Netty's UnreleasableByteBuf to short circit operations
which attempt to modify the reference count, and therefore prevent the reference
count from reaching zero and prematruley freeing memory. ServiceTalk's
ByteBuf instances will therefore always have a refCnt() == 1. However Netty's
ByteToMessageDecoder uses refCnt() > 1 to determine if the cumulation ByteBuf
maybe shared, and if not it may compact data in place. This compacting data in
place may result in a corrupted view of data from the perspective of offloaded
threads in ServiceTalk, and for example result in failures to decode HTTP/2 data
such as gRPC frames.

ServiceTalk's ByteToMessageDecoder uses Netty's
ByteToMessageDecoder#MERGE_CUMULATOR to manage accumulation of data. As of
Netty 4.1.43.Final this may result in writing into the cumulation ByteBuf such
that a reallocation opearation occurs. This reallocation operation is not done
in a thread safe way and this may result in visibility issues when referencing
the same Buffer object from another thread (e.g. when offloading is enabled).
This may result in the applicaiton thread observing an empty Buffer if the
reference to the newly allocated memory is visibile before the copy operation is
visible.

Modifications:

  • All ByteBuf objects generated by ServiceTalkBufferAllocator should have a
    refCnt() > 1 to avoid any optimizations which may assume exclusive ownership
    of a ByteBuf in Netty. This will prevent corrupted content for HTTP/2.x and
    gRPC use cases. This may result in more reallocation/copy operations in Netty's
    ByteToMessageDecoder and there is a Netty PR
    netty/netty#9877 intended to improve this behavior.
  • ServiceTalk's ByteToMessageDecoder should not use
    ByteToMessageDecoder#MERGE_CUMULATOR and instead use a strategy similar to
    netty/netty#9877 to do the accumulation of bytes. It is
    also unecessary to attempt to compact the buffer after accumulation has been
    attempted, as any compaction can be handled in the accumulation.

Result:
No more decode related visibility issues for HTTP/1.x and no more content
corruption and visibility issues for HTTP/2.0 / gRPC.

Motivation:
ServiceTalk attempts to effectivley disable Netty's reference counting by using
a strategy similar to Netty's `UnreleasableByteBuf` to short circit operations
which attempt to modify the reference count, and therefore prevent the reference
count from reaching zero and prematruley freeing memory. ServiceTalk's
ByteBuf instances will therefore always have a `refCnt() == 1`. However Netty's
ByteToMessageDecoder uses `refCnt() > 1` to determine if the cumulation ByteBuf
maybe shared, and if not it may compact data in place. This compacting data in
place may result in a corrupted view of data from the perspective of offloaded
threads in ServiceTalk, and for example result in failures to decode HTTP/2 data
such as gRPC frames.

ServiceTalk's `ByteToMessageDecoder` uses Netty's
`ByteToMessageDecoder#MERGE_CUMULATOR` to manage accumulation of data. As of
Netty 4.1.43.Final this may result in writing into the `cumulation` ByteBuf such
that a reallocation opearation occurs. This reallocation operation is not done
in a thread safe way and this may result in visibility issues when referencing
the same `Buffer` object from another thread (e.g. when offloading is enabled).
This may result in the applicaiton thread observing an empty `Buffer` if the
reference to the newly allocated memory is visibile before the copy operation is
visible.

Modifications:
- All `ByteBuf` objects generated by `ServiceTalkBufferAllocator` should have a
`refCnt() > 1` to avoid any optimizations which may assume exclusive ownership
of a `ByteBuf` in Netty. This will prevent corrupted content for HTTP/2.x and
gRPC use cases. This may result in more reallocation/copy operations in Netty's
`ByteToMessageDecoder` and there is a Netty PR
netty/netty#9877 intended to improve this behavior.
- ServiceTalk's `ByteToMessageDecoder` should not use
`ByteToMessageDecoder#MERGE_CUMULATOR` and instead use a strategy similar to
netty/netty#9877 to do the accumulation of bytes. It is
also unecessary to attempt to compact the buffer after accumulation has been
attempted, as any compaction can be handled in the accumulation.

Result:
No more decode related visibility issues for HTTP/1.x and no more content
corruption and visibility issues for HTTP/2.0 / gRPC.
Copy link
Member

NiteshKant left a comment

💯

@jayv
jayv approved these changes Dec 16, 2019
Copy link
Member

idelpivnitskiy left a comment

LGTM!
One question:

protected ByteBuf swapCumulation(ByteBuf cumulation, ByteBufAllocator allocator) {
protected ByteBuf swapAndCopyCumulation(final ByteBufAllocator alloc, final ByteBuf cumulation, final ByteBuf in) {
ByteBuf newCumulation = alloc.buffer(alloc.calculateNewCapacity(
cumulation.readableBytes() + in.readableBytes(), MAX_VALUE));

This comment has been minimized.

Copy link
@idelpivnitskiy

idelpivnitskiy Dec 17, 2019

Member

Should we take care about int overflow here? Otherwise, users may get IllegalArgumentException from calculateNewCapacity

This comment has been minimized.

Copy link
@Scottmitch

Scottmitch Dec 17, 2019

Author Member

if we overflow int then we wouldn't be able to accommodate all the data in a single buffer. We could attempt to fall back to a multi-buffer approach, or provide a pre-check for a more informative exception, but since this case is unlikely to occur I think the current "check in a single spot" approach is sufficient for now.

@Scottmitch Scottmitch merged commit c3c0e81 into apple:master Dec 17, 2019
3 checks passed
3 checks passed
pull request validation (jdk11) Build finished.
Details
pull request validation (jdk8) Build finished.
Details
pull request validation (quality) Build finished.
Details
@Scottmitch Scottmitch deleted the Scottmitch:grpc_deserialize branch Dec 17, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

None yet

4 participants
You can’t perform that action at this time.