In a Dependabot PR to upgrade to the latest v1.71.0 version of this library (link), CI encountered a panic inside the gRPC runtime. Looking at the stack trace, it seems impossible: it hits a nil pointer dereference on a field named `hl`, but the preceding stack frame sets `hl` to a non-nil pointer.
The only way I can think of this happening is if a compressor is incorrectly used concurrently -- for example, another goroutine resetting the compressor between the write and the subsequent read of that field. And that could happen if the compressor is accidentally returned to a `sync.Pool` before we are actually finished with it.
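To make that concrete, here is a minimal sketch of the failure mode I have in mind. All of the names (`decomp`, `hl`, `reset`) are hypothetical stand-ins for the real flate/gzip state, not gRPC's actual types:

```go
package main

import "sync"

// decomp is a stand-in for the decompressor state; hl mirrors the field
// named in the panic. Everything here is hypothetical, not gRPC's code.
type decomp struct{ hl *int }

func (d *decomp) reset() { d.hl = nil } // what reuse through the pool does

var pool = sync.Pool{New: func() any { return new(decomp) }}

func main() {
	d := pool.Get().(*decomp)

	done := make(chan struct{})
	go func() { // this goroutine still believes it owns d
		defer close(done)
		v := 42
		d.hl = &v // the frame that "sets hl to a non-nil pointer"...
		_ = *d.hl // ...yet this can nil-deref if d was reset in between
	}()

	pool.Put(d) // the premature Put that makes the race possible
	if d2 := pool.Get().(*decomp); d2 == d {
		d2.reset() // another holder resets the same object concurrently
	}
	<-done
}
```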
The only commit between v1.70.0 and v1.71.0 that looks like a plausible culprit is #7918. I am still analyzing the code and have also been re-running this with the race detector enabled, hoping it will highlight a smoking gun -- though I know `sync.Pool` behaves a little differently when race detection is enabled. So far it has not found anything, but I was able to reproduce the issue with a different panic (described further below).
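A stress harness along the following lines should exercise the same code paths through grpc-go's public `encoding` API (this is a sketch, not my exact CI workload: the goroutine and iteration counts and the payload are arbitrary); run it with `go run -race` to engage the race detector:

```go
package main

import (
	"bytes"
	"io"
	"sync"

	"google.golang.org/grpc/encoding"
	_ "google.golang.org/grpc/encoding/gzip" // registers the "gzip" compressor
)

func main() {
	comp := encoding.GetCompressor("gzip")
	payload := bytes.Repeat([]byte("hello gzip "), 1024)

	var wg sync.WaitGroup
	for g := 0; g < 64; g++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := 0; i < 1000; i++ {
				var buf bytes.Buffer
				w, err := comp.Compress(&buf)
				if err != nil {
					panic(err)
				}
				if _, err := w.Write(payload); err != nil {
					panic(err)
				}
				if err := w.Close(); err != nil {
					panic(err)
				}
				r, err := comp.Decompress(&buf)
				if err != nil {
					panic(err)
				}
				// Drain the stream, reading through io.EOF.
				if _, err := io.Copy(io.Discard, r); err != nil {
					panic(err)
				}
			}
		}()
	}
	wg.Wait()
}
```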
The second panic I've seen, also inside the gRPC runtime and related to gzip compression, is a slice-bounds-out-of-range error. I have not dug into its stack trace as deeply as the one above, but it smells like it could also be a concurrent-modification issue.
I do see a possible issue: in `gzip.reader.Read`, the wrapper eagerly returns the underlying gzip reader to the pool on `io.EOF`. If the caller then tries to read again (after EOF), it will still call that gzip reader's `Read` method, even though the reader no longer "belongs" to the calling goroutine and may have been handed out, reset, and used concurrently by another one.
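Paraphrasing the shape of that code path (a sketch, not a verbatim copy of grpc-go's `encoding/gzip` source):

```go
package gzipsketch

import (
	"compress/gzip"
	"io"
	"sync"
)

// reader paraphrases the suspect pattern: the wrapper returns itself to
// the pool as soon as it sees io.EOF, while the caller still holds it.
type reader struct {
	*gzip.Reader
	pool *sync.Pool
}

func (z *reader) Read(p []byte) (n int, err error) {
	n, err = z.Reader.Read(p)
	if err == io.EOF {
		z.pool.Put(z) // eager Put: z no longer "belongs" to this caller...
	}
	return n, err // ...yet a second Read after EOF still uses z.Reader
}
```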
It's possible that the new changes in the mentioned PR end up making a follow-up Read call after EOF is returned, and that could be tickling what appears to be a pre-existing defect in the gzip encoding.
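For illustration, one possible hardening -- just a sketch of an idea, not a proposed patch -- would be to latch the EOF so that post-EOF Reads never touch the pooled reader and keep returning `(0, io.EOF)`, as well-behaved readers conventionally do:

```go
package gzipsketch

import (
	"compress/gzip"
	"io"
	"sync"
)

// guardedReader latches EOF so post-EOF Reads never touch the pooled
// gzip.Reader again. This is only a sketch of a possible fix.
type guardedReader struct {
	gz   *gzip.Reader
	pool *sync.Pool
	done bool
}

func (z *guardedReader) Read(p []byte) (n int, err error) {
	if z.done {
		return 0, io.EOF // idempotent EOF; the pooled reader is gone
	}
	n, err = z.gz.Read(p)
	if err == io.EOF {
		z.done = true
		z.pool.Put(z.gz) // safe: we never dereference z.gz after this
		z.gz = nil
	}
	return n, err
}
```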