New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Updates Tz Singletons and Caches with `WeakValueDictionary`s #672

Merged
merged 6 commits into from Apr 14, 2018

Conversation

Projects
2 participants
@cs-cordero
Contributor

cs-cordero commented Apr 13, 2018

Closes #635.

@cs-cordero

This comment has been minimized.

Contributor

cs-cordero commented Apr 13, 2018

@pganssle - I'm trying to figure out how my pytest failed on the AppVeyor CI. It failed on a line that asserts that the weak references are cleared immediately following a couple del statements and a gc.collect() statement.

Any ideas how this could happen? Is it possible that the gettz tests are running twice asynchronously? I noticed that if I comment out the test, the count of passing test cases jumps from 1700 to 1698. Maybe I have race condition introduced somewhere -- if two test runs happen to occur at the same time, one could be late to run a del on a strong reference and the data doesn't get cleaned up.

@@ -4,6 +4,10 @@
from datetime import datetime, time
class FakeWeakRef:

This comment has been minimized.

@pganssle

pganssle Apr 13, 2018

Member

.utils is a public module and very much not the right place for this. It should probably be in tz._common.py

@@ -1,5 +1,10 @@
from datetime import timedelta
try:
import weakref

This comment has been minimized.

@pganssle

pganssle Apr 13, 2018

Member

I would change this to:

try:
    from weakref import WeakRefDictionary
except ImportError:
    WeakRefDictionary = dict

This comment has been minimized.

@pganssle

pganssle Apr 13, 2018

Member

Actually it turns out that Python 2 has weak references, so we can just drop the whole compatibility shim. My bad.

from functools import partial
IS_WIN = sys.platform.startswith('win')
NOT_PY3 = not sys.version_info[0] == 3

This comment has been minimized.

@pganssle

pganssle Apr 13, 2018

Member

Normally you would not have to define this. There's already a six.PY3 and six.PY2 that will give you that.

That said, I may have been wrong about weak references being a Python 3 thing. Seems like they are available in Python 2.7, so we should drop this skip test.

@pganssle

This comment has been minimized.

Member

pganssle commented Apr 13, 2018

Hard to tell if this is just holding a strong reference somewhere we don't realize or if the assumptions about one of the inputs is wrong. Seems to only happen on Windows on old versions of Python, though.

@cs-cordero

This comment has been minimized.

Contributor

cs-cordero commented Apr 14, 2018

Sorry for the flurry of seemingly random commits. I can't seem to replicate AppVeyor's failure on my local machine even after matching the Python version, so I'm trying to troubleshoot with small changes.

@cs-cordero

This comment has been minimized.

Contributor

cs-cordero commented Apr 14, 2018

Just reverted all the troubleshooting commits, but I'm kind of at a loss here for how this is failing on the AppVeyor CI.

The stdout from the 46bdab7 commit is really what's really tripping me up:
https://ci.appveyor.com/project/dateutil/dateutil/build/2.3.1234/job/4ru7kdg7wipal15w

In this commit, I print the keys inside the WeakValueDictionary before and after deleting the strong ATK1 and ATK2 references, expecting the WeakValueDictionary to clear itself, but it does not. I also used pytest.raises(NameError) to assert that ATK1 and ATK2 are indeed deleted, and they are. There shouldn't be any remaining strong references to the value in the dictionary, but somehow it is persisting. Running the equivalent code locally off of CI yields the expected behavior.

I'll keep thinking about how to get around this. One idea I had was to have some kind of _cache_hit attribute on the factory that keeps a count of every time it is returning a value from its WeakValueDictionary, and not incrementing when it's creating a brand new instance. We can assert that this counter does/does not change depending on the test.

@pganssle

This comment has been minimized.

Member

pganssle commented Apr 14, 2018

I'm guessing that this is a real issue with Windows and not a problem with the tests. I'll grab this branch and try debugging it on a Windows VM.

@pganssle

This comment has been minimized.

Member

pganssle commented Apr 14, 2018

Ah, I figured this out. It's because ZoneInfoFile is read once, so it actually contains strong references to each of the tzfile objects it created. For now we'll mark it as xfail on Windows, and either it'll be solved by moving zoneinfo over to using a weak reference cache or we do some kind of thing where we install zoneinfo files somewhere and modify the Windows path as part of the tests, or just change that particular one to a skipif.

@pganssle

This comment has been minimized.

Member

pganssle commented Apr 14, 2018

Cleaned up the commit history. Will merge when tests pass.

Thanks so much @cs-cordero!

@cs-cordero

This comment has been minimized.

Contributor

cs-cordero commented Apr 14, 2018

Happy to help. Thanks for taking it home with that clutch Windows ZoneInfoFile issue.

@pganssle

This comment has been minimized.

Member

pganssle commented Apr 14, 2018

Hm.. The Python 3.5 test failed but succeeded on reboot. This "id won't get reused" gambit may be less stable than I hoped.

I just had a new idea for a more robust test, though. Create a new weak reference to the object before deleting it and make sure its referrent is gone after the gc step.

@pganssle pganssle added this to Submitted PRs in PyData NYC Sprint Apr 14, 2018

@pganssle

This comment has been minimized.

Member

pganssle commented Apr 14, 2018

@cs-cordero One last thing, can you add yourself to AUTHORS.md - names are in alphabetical order. You are welcome to put your real e-mail and encouraged to put github name. Make sure to put a D after your name line, which indicates your contributions are under the dual license.

@cs-cordero

This comment has been minimized.

Contributor

cs-cordero commented Apr 14, 2018

Will do re: authors.md once I get access to a computer and little later in the day. Thanks!

@pganssle

This comment has been minimized.

Member

pganssle commented Apr 14, 2018

@cs-cordero No this test assumes that the weakref module works as expected and tests that we didn't screw up and accidentally start holding a strong reference to the tzinfo objects in the cache.

If you do:

# Checkout base of the branch
git checkout e6644a4
# Checkout the commit where the tests were added
git checkout 75ae199 -- dateutil/test/test_tz.py
pytest -k test_tz.py

You'll see that these three tests fail, which means the problem - that the cache itself holds a strong reference to the cached time zones - is solved by this patch.

@cs-cordero

This comment has been minimized.

Contributor

cs-cordero commented Apr 14, 2018

Ah yes that makes perfect sense! Thank you for explaining.

cs-cordero and others added some commits Apr 12, 2018

Change weakref test mechanism
This is a more reliable mechanism to test that nothing continues
to hold a strong reference to the cached values.

I have also switched the tests over to using pytest-style bare
functions, and marked them all as smoke tests, since this will
not actually be a guaranteed behavior of the interface.
@pganssle

This comment has been minimized.

Member

pganssle commented Apr 14, 2018

@cs-cordero You'll probably need to do a force pull when you edit this branch again, because I've rebased your branch and rewritten the history. If you don't know how to do that it's git pull --force

@pganssle pganssle merged commit 07d8dc1 into dateutil:master Apr 14, 2018

3 checks passed

codecov/project 95.26% (target 80%)
Details
continuous-integration/appveyor/pr AppVeyor build succeeded
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details

PyData NYC Sprint automation moved this from Submitted PRs to Merged PRs Apr 14, 2018

@cs-cordero cs-cordero deleted the cs-cordero:weakrefs branch Apr 14, 2018

@cs-cordero

This comment has been minimized.

Contributor

cs-cordero commented Apr 14, 2018

🎉

@pganssle pganssle modified the milestones: Bugfix release, 2.7.3 Apr 23, 2018

@pganssle pganssle modified the milestones: 2.7.3, 2.8.0 May 9, 2018

@pganssle pganssle referenced this pull request May 9, 2018

Merged

Revert weakref #716

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment