New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Ensure the name cache is rolled back when the packet reaches maximum size #318
Conversation
809c3d8
to
6ed8c0a
Compare
Any easy way we can modify the test suite to verify this? |
Coverage increased (+0.01%) to 93.387% when pulling 6ed8c0aaeb74e2c817e48d34539f1c73da544e08 on bdraco:fix_reset_on_packets into 4da1612 on jstasiak:master. |
I'll mark this as a draft while I think of a way to write a test for it. It will probably need to wait to the weekend when I have some more time. Its been like this for a while so I don't think its on fire and needs attention right now. |
Definitely needs a test as I just got another case of it in the logs so I may not have fixed it or at least not all the instances of it.
|
|
Looks like I need to dig in more |
Was able to verify its choking on a packet it is sending.
|
I think something in the first packet is pointing to something in the second packet. |
I think I've got enough to replay the packet construction now |
Here is the DNSOutgoing repr that triggers the issue:
|
And a second one to be sure
|
@stephenrauch Any chance you can take a look? I've got a script that replicates the compression issue: https://github.com/bdraco/python-zeroconf/blob/choke_fix/choke.py |
I finally found the actual problem. When we rollback because the packet is too large we fail to remove the names so we create pointers to names that have been moved to the next packet. |
6ed8c0a
to
0869f62
Compare
I'm going to take |
b40eeeb
to
f88f738
Compare
Verified the no more |
Nice! I'll have a look at this soon. |
f88f738
to
daae7dc
Compare
8f2667a
to
a71fa51
Compare
CI looks happy now: https://travis-ci.org/github/jstasiak/python-zeroconf/builds/752058074 |
a71fa51
to
5ca3c2b
Compare
I think I should take out the exact packet matching. As long as re-loading it with DNSIncoming doesn't throw we know its ok |
5ca3c2b
to
bef4c0d
Compare
That's cleaner. I think this is ready to go. I just retested it |
Running for 24 hours and no more errors in the log on the site that this was happening 👍 |
Still looking good. @jstasiak Any chance we can get a release with this before monday so we can get it into the next Home Assistant release on wednesday? Thank you |
@bdraco I'm sorry, didn't have time to look at this until now. The new test finishes successfully for me (Mac OS 10.14, CPython 3.9.0) without changes to the library code so while I trust the issue is fixed the test we have doesn't actually verify it. |
I'll roll the test back to a prior revision. I looks I was too aggressive in trying to reduce the size of the test case. |
…size If the packet was too large, it would be rolled back at the end of write_record. We need to remove the names that were added to the name cache (self.names) as well to avoid a case were we would create a pointer to a name that was rolled back. The size of the packet was incorrect at the end after the inserts because insert_short would increase self.size even though it was already accounted before. To resolve this insert_short_at_start was added which does not increase self.size. This did not cause an actual bug, however it sure made debugging this problem far more difficult. Additionally the size now inserted and then replaced when the actual size is known because it made debugging quite difficult since the size did not previously agree with the data.
bef4c0d
to
f20e851
Compare
I was able to keep the test almost as-is and add this:
Verified it passes on the new code and fails on the old code. |
@jstasiak retested after walking away for a minute and should be good to go. Thanks for validating the test. Sorry I missed it in the first place. |
Nice! I was afraid the binary dump of the DNS packets would be back (I could live with it, but it's better not to have it). I'll review (and hopefully release) soon. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
@bdraco Released as version 0.28.8. No worries about the test. :) |
Thank you! |
Fixes #308
If the packet was too large, it would be rolled back at the end of
write_record
.We need to remove the names that were added to the name cache (
self.names
)as well to avoid a case were we would create a pointer to a name that was
rolled back.
The size of the packet was incorrect at the end after the inserts because
insert_short
would increaseself.size
even though it was already accountedbefore. To resolve this
insert_short_at_start
was added which does notincrease
self.size
. This did not cause an actual bug, however it suremade debugging this problem far more difficult.
Additionally the size now inserted and then replaced when the actual
size is known because it made debugging quite difficult since the size
did not previously agree with the data.