Remove calls to grpc.channel.close(), which could cause segfaults
#266
PR Review Checklist
Do not edit the content of this comment. The PR reviewer should simply update this comment by ticking each review item below, as they get completed.
Code
Architecture
    type: foreground
    command: |
      pyenv install 3.7.12
We had to make a number of changes to the way we load Python and Pip and install packages in order to work in the new Factory CI machines.
    def close(self) -> None:
        super().close()
        self._channel.close()
See PR description for the full explanation of why we do this. In brief, explicitly calling close on a gRPC Channel in Python can result in resources being deallocated while they are still being used, which in turn may cause gRPC's native C libraries to segfault. Not closing the channel is the approach recommended in an issue in the gRPC repository itself.
## What is the goal of this PR?
We upgraded Client Python to the latest version, which should fix intermittent and common test failures in CI.
## What are the changes implemented in this PR?
Our CI jobs have been failing for some time due to a "metadata elements leaked" error, which we've fixed in Client Python in typedb/typedb-driver-python#266, and now we've upgraded to the latest release of Client Python.
## What is the goal of this PR?
We no longer call `close` on any of our gRPC Channels. This fixes possible segfaults caused by resources being deallocated while they are still in use.
## What are the changes implemented in this PR?
Users in a wide variety of scenarios had reported intermittent crashes, often accompanied by warnings saying "1 metadata element(s) leaked". The logs would look similar to the following:
The same issue has also been reported in googleads/google-ads-python#384, and a fix was suggested in:
From this issue we infer that `Channel.close` is not behaving nicely in gRPC Python, and can cause resources to be deallocated while they are still in use. As gRPC itself uses native C libraries, this results in segfaults and crashes. We determine that the best course of action is to not close the Channel ourselves.

We've simply deleted the 3 places in our code that called `close` on a gRPC `Channel`. It has passed our CI tests and fixed user-reported issues, and it is what the gRPC maintainers themselves appear to recommend in the linked `grpc` issue.
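
For illustration, here is a minimal sketch of what the change amounts to. It is not the actual Client Python code: `FakeChannel`, `BaseClient` and `ChannelClient` are hypothetical names, and `FakeChannel` is a stand-in for a gRPC `Channel` so that the sketch runs without gRPC installed. The wrapper's `close` still marks the client as closed, but it no longer calls `close` on the channel it holds.

```python
class FakeChannel:
    """Hypothetical stand-in for a gRPC Channel; it only counts close() calls."""

    def __init__(self) -> None:
        self.close_calls = 0

    def close(self) -> None:
        self.close_calls += 1


class BaseClient:
    """Hypothetical base class that tracks whether the client is open."""

    def __init__(self) -> None:
        self._is_open = True

    def is_open(self) -> bool:
        return self._is_open

    def close(self) -> None:
        self._is_open = False


class ChannelClient(BaseClient):
    """Hypothetical client that owns a channel but never closes it explicitly."""

    def __init__(self, channel: FakeChannel) -> None:
        super().__init__()
        self._channel = channel

    def close(self) -> None:
        super().close()
        # Previously: self._channel.close()
        # The explicit call is removed; the client only stops using the
        # channel and leaves its lifetime to the underlying library.


channel = FakeChannel()
client = ChannelClient(channel)
client.close()
print(client.is_open(), channel.close_calls)
```

The client-side state is still updated, so callers can check `is_open()` as before. Only the explicit teardown of the channel is skipped.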