-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
9374 Fixing circular references in t.p.policies and t.p.tls #955
9374 Fixing circular references in t.p.policies and t.p.tls #955
Conversation
It doesn't matter how many cycles you actually have in your process: (on CPython) the garbage collector runs with the same frequency regardless, and does the same amount of work. The purpose of the garbage collector is to find the cycles among all the objects that exist in your process. It has slightly more work to do actually collecting the cycles that it finds but this is basically negligible.
Since GC CPU considerations are orthogonal, this is basically orthogonal too (that is, you may choose to tweak your GC settings but doing so has little or nothing to do with these cycles). The difference that will be made is that some memory may be releasable to the OS slightly earlier if the cycles are broken earlier. In practice, I suspect it would be hard to observe this effect since memory is almost never released back to the OS anyway and over-commit means that unused memory is practically as good as released memory anyway. |
|
Sorry, but I'm not agree that GC in CPython has fixed run frequency.
So we can affect collection frequency by:
For example, I have the server that receives tons of short-lived TLS connections with loosened GC thresholds. Before this fix it took around 1.2Gb of RAM after 1 day after being started. With this fix it takes around 200Mb and not growing. |
So it does. I had forgotten about that trait. Thanks for the doc reference and sorry about the incorrect post.
It would be great if you could contribute this as a repeatable benchmark. |
|
Here is the benchmark script: https://gist.github.com/IlyaSkriblovsky/4dd3abfd5f67c64b13f1c673f56466f9 (the script is Py3-only because I don't know direct way to get collection count in Py2) The script simulates 10k (set by Results on my machine:
So, with default GC settings there is small effect on RAM usage, but several times lower GC run count. With high GC thresholds there is significant difference in RAM usage. |
|
@IlyaSkriblovsky please follow all the steps at http://twistedmatrix.com/trac/wiki/TwistedDevelopment#SubmittingaPatch . You've done most of the steps, but are missing step 11. |
|
Did it, thanks |
|
Looks like some of the earlier Windows failures with this PR were due to a broken version of pypiwin32: mhammond/pywin32#1151 . I triggered another build now that the broken version has been removed from pypiwin32 |
src/twisted/test/test_sslverify.py
Outdated
| clientWrappedProto = ListeningClient() | ||
| serverWrappedProto = GreetingServer() | ||
|
|
||
| cF = protocol.Factory() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would prefer not to use abbreviations and call this clientFactory. ... why not clientFact ? :)
...same for clientWrappedProto ... why not cWP ?
I see that the existing code uses abbreviations, but I don't think that this is a good reason to continue to write new code in this way :)
src/twisted/protocols/tls.py
Outdated
| @@ -196,7 +198,9 @@ def makeConnection(self, transport): | |||
| Connect this wrapper to the given transport and initialize the | |||
| necessary L{OpenSSL.SSL.Connection} with a memory BIO. | |||
| """ | |||
| self._tlsConnection = self.factory._createConnection(self) | |||
| # Using weakref.proxy to avoid circular references | |||
| selfProxy = weakref.proxy(self) | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just asking. Is using a weakref the only way to solve this?
Why not break it in connectionLost like done in the other place?
Using weakref can make the tests harder to read / write as you will need to take extra care on the scope of an instance..
...and I would like to write Python without having to care about the lifecycle of the objects.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for working on this.
Happy to see performance improvements :)
Is there a test for new the code in ProtocolWrapper ?
Can this be implemented without weakref?
I would like to see fewer (to none) abbreviations in the code and more docstrings, especially for the test part.
At 8AM and in the context of this PR the code is easy to ready. Hacking at 11PM and getting into some strange bugs, might prove that the code is not that easy to read.
Especially the dummy protocols and the Failure.trap in connectionLost
| @@ -0,0 +1 @@ | |||
| t.p.p.ProtocolWrapper and t.p.t.TLSMemoryBIOProtocol no longer create circular references that still alive after connection is closed | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please use full names and end the sentence with a full stop :)
src/twisted/protocols/policies.py
Outdated
| @@ -124,6 +124,9 @@ def connectionLost(self, reason): | |||
| self.factory.unregisterProtocol(self) | |||
| self.wrappedProtocol.connectionLost(reason) | |||
|
|
|||
| # Breaking reference cycle between self and wrappedProtocol | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use full for a sentence.
In this way, it is clear that the sentence has an end and that it was not left unfinished by accident.
| @@ -42,7 +44,8 @@ | |||
|
|
|||
| from twisted.internet.error import ConnectionDone, ConnectionLost | |||
| from twisted.internet.defer import Deferred, gatherResults | |||
| from twisted.internet.protocol import Protocol, ClientFactory, ServerFactory | |||
| from twisted.internet.protocol import (Protocol, ClientFactory, ServerFactory, | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is fine, but to reduce future dif size, I would prefer to have this list alphabetically ordered and each import on a line.
from twisted.internet.protocol import (
ClientServerFactory,
Factory,
Protocol,
ServerFactory,
)
| def test_noCircularReferences(self): | ||
| """ | ||
| TLSMemoryBIOProtocol doesn't leave circular references that make | ||
| it alive after connection is closed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
make it alive What does it mean?
| TLSMemoryBIOProtocol doesn't leave circular references that make | ||
| it alive after connection is closed. | ||
| """ | ||
| def nObjectsOfType(type): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
even if this is test code please consider adding a docstring to describe the purpose of this function.
There are no tests for the test code so in order to check whether the test code itself is correct we can only read the code.
| gc.disable() | ||
|
|
||
| try: | ||
| class DummyServerProtocol(Protocol): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this protocol "dummy" ? What means dummy?
This looks like a fully working protocol which is closing the connecting first time it receives data.
I have used these types of protocol in production code :)
Please consider adding some docstring to document this code.
| self.transport.loseConnection() | ||
|
|
||
|
|
||
| class DummyClientProtocol(Protocol): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again, why dummy?
This looks like a fully functional protocol which will ignore all received data and will send a hello message on a connection.
| self.transport.write(b'hello') | ||
|
|
||
| def connectionLost(self, reason): | ||
| reason.trap(ConnectionDone) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just asking. I have not executed (yet) the tests. Can the connection be closed for other reasons?
Please document why ConnectionDone is important here.
|
Thanks for review! I would be happy to explicitly break
|
| @@ -0,0 +1 @@ | |||
| twisted.protocols.policies.ProtocolWrapper and twisted.protocols.tls.TLSMemoryBIOProtocol no longer create circular references that still alive after connection is closed. | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
references that still alive after :) please also update this part
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done!
AFIK @exarkun Can you please confirm if there should be any case in which dataReceived is called after connectionLost ? I see this info in the API I can see that the API explicitly talks about cleaning circular references here, but does not make any statement about when dataReceived can be called... but since circularReferences are broken in connectionLost, I expect that this should be the last call for a certain instance. The fact that I will give it a try and see why dataReceived is called after connectionLost in this test. Thanks! |
|
May be Sorry if my explanations are confusing or inaccurate. |
|
For the TLS/SSL connections there is one corner case which might get into the way of this patch... and this might be an existing bug... see https://github.com/twisted/twisted/blob/trunk/src/twisted/protocols/tls.py#L346 When you connect to a TLS/SSL socket and the connection is closed right away without sending any data, the connection is not actually closed. I have checked the code and I see that connectionLost is called twice ...and before dataReceived, so I guess that there is a bug in iosim :) |
It is probably called once for the server and once for the client. But then one of the sides calls But I'm not sure where is the source of the bug: whether |
True. I am stupid and confused :) Let me dig more :) Also, if I run a single tests, I see 14 errors reported by trial. |
That's because I like the idea of adding check for |
All real It would be a good thing to make any test double for Three bonus points for also adding tests which demonstrate that |
|
@IlyaSkriblovsky I see that you are member of Twisted team. Also, please consider writing tests for any code that you write, including tests for the testing infrastructure, as suggested by Jean Paul... and with the docstring to document that calling write after connection lost is acceptable :) Thanks! |
|
@adiroiban I'm member of Twisted team because I'm was contibuting to TxMongo, but I haven't ever contributed to Twisted itself. Okay, I will create branches in the main repo next time. I will try to write tests for |
That is fine, you are good to create branches in Twisted :). The main reason why we don't allow anybody to commit to the main repo, is that commits will trigger builds and we don't want to end up with PR which are executing malicious code on our buildslaves. |
|
I've added a test for
Should I also find other fake |
I think that this can be done outside, so that we can merge this as it is. Thanks! |
| """ | ||
| return sum(1 for x in gc.get_objects() if isinstance(x, type)) | ||
|
|
||
| gc.disable() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fwiw you can get rid of the whole try/finally with self.addCleanup(gc.enable) right here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, I forgot about it
Sorry, I was wrong. Two
You are right, sorry my C-ism. But to be even more presice, |
Ah yes, sorry, that's what I meant but my statement was ambiguous (I meant the attribute doesn't exist on the object, not that the object doesn't exist 😄 ). Thanks for the clarification. |
|
Should this be back in review? It seems like all the blockers are dealt with but there’s a lot of discussion to review here :) |
|
Yes, please review it again.
I've added trivial check if TLSMemoryBIOProtocol already received
`connectionLost()` in its `loseConnection()`. I hope it is reasonable check
to do.
|
|
Sorry, I wasn't asking because I was about to review it :) The place Twisted reviewers (such as yourself, congrats on your addition to the team :)) should look to see what needs reviewing is here: Unfortunately this is what Trac is still required for, tracking this state. Please stick the 'review' keyword back on the Trac ticket so that it will show up there. |
|
Sorry my silliness. Still getting familiar with the process. I've added "review" keyword and reassigned the ticket to nobody. |
No worries. The process is definitely a bit clunky. One day somebody will have enough free time to migrate the rest of it to github and it won't require manually pushing so many buttons and twirling so many dials :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks a lot. Happy to see a green build :)
Looks good... but I think that it is missing a few tests.
Other than that, it should be ready for merging soon :)
|
|
||
| def loseConnection(self): | ||
| """ | ||
| Send a TLS close alert and close the underlying connection. | ||
| """ | ||
| if self.disconnecting: | ||
| if self.disconnecting or not self.connected: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is there a test for this change? :)
| @@ -124,6 +124,9 @@ def connectionLost(self, reason): | |||
| self.factory.unregisterProtocol(self) | |||
| self.wrappedProtocol.connectionLost(reason) | |||
|
|
|||
| # Breaking reference cycle between self and wrappedProtocol. | |||
| self.wrappedProtocol = None | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we have tests for this change?
Besides updating the existing tests for policies, I was expecting a new test (or update of an existing test) for connectionLost in which of which the docstring describe that on connection lost the wrappedProtocol is None
|
@IlyaSkriblovsky before pushing changes to this, please merge with trunk. Also, before pushing a code, try to run pyflakes and pycodestyle on local code to reduce the load on our testing servers.... Travis and Appveyor are free... but in the end, somebody has to pay for their usage |
|
@adiroiban you are right, sorry. Will do as you said. I've cancelled Travis build, but I can't cancel AppVeyor. |
|
@adiroiban test failure looks like unrelated to this PR because all buildbot builds was green until today changes, but today I only added some tests without changing implementation. Is it possible to restart this particular build to check if failure is permanent? |
|
@IlyaSkriblovsky I have restarted it. You can also restart it... check for credentials here https://twistedmatrix.com/trac/wiki/ContinuousIntegration/DeveloperWorkflow#WriteAccess :) |
|
but that is a flaky test... I saw it failing in other branches |
|
Thank you. I'm going to put this back in review. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good.
My only minor comment is about the short cF, sF, serverWrappedProto, cProto ... ex why cProto and not cP ... or why cF and not cFactory or cFact ...
I would like to see non-abreviated names... but I will not block the merge on that :)
Feel free to merge :)
Thanks a lot for the great work!
…ky-circular-refs-in-tls
|
I've renamed So, waiting for testing results after updating to trunk and merging. |
|
I'm not sure why this build failed: https://buildbot.twistedmatrix.com/builders/osx10.10-py2.7/builds/1350 What is going on? |
|
@IlyaSkriblovsky feel free to merge it. This is a known issue (see the conversation from the mailing list) It was green before...but while testing the OSX built, I picked your branch for testing. Sorry :) The OSX is executed over circle-ci so we are covered. Thanks! |
|
@adiroiban Done, thanks! |
https://twistedmatrix.com/trac/ticket/9374
ProtocolWrapper creates the circular reference between itself and its wrappedProtocol. Also TLSMemoryBIOProtocol creates the circular reference between itself and its _tlsConnection.
Both circles are never broken and still alive after connections are closed until garbage collector collects them.
This seems like a performance issue especially for apps handling many short-lived TLS connections. For default GC thresholds this will cause GC to be called very often wasting CPU time. For loosened GC thresholds (for the sake of performance) this leads to eating RAM very quickly.
Discussed in mailing list: https://twistedmatrix.com/pipermail/twisted-python/2018-January/031798.html