Break Cluster-Topic refcycle #262

yungchin · 2015-09-10T13:01:43Z

This partially addresses #257, as discussed there. I figured it was best to make a separate pullreq, as it doesn't have much to do with the changes in #258.

This breaks a reference cycle between Cluster and Topic. As per the discussion in #257, we didn't want to make Topic._cluster a weakref, as this would break user code that retains a reference to the Topic only. Instead, we make TopicDict lazily instantiate Topics (which should have the added benefit of saving a bunch of space if a cluster has many topics and partitions), while storing only a weakref to it. This also moves Cluster._update_topics() into TopicDict, because it needs to know about the weakref implementation details: if a Topic hasn't been instantiated, it shouldn't try to update it. Signed-off-by: Yung-Chin Oei <yungchin@yungchin.nl>

Signed-off-by: Yung-Chin Oei <yungchin@yungchin.nl>

The grandparent commit (which made TopicDict carry weakrefs) introduced a bug, namely that topic auto-creation via __missing__() caused an AttributeError to be raised, because I forgot to update __missing__ for the new nature of the dict entries. There was no test coverage to catch this, so tests are added here too. To make those tests pass reliably, we needed to ensure that topic auto-creation works even on a fresh cluster. It turns out that with the updated version of Cluster._get_metadata (per the grandparent commit), that was just a one-line change - that is, as a side-effect, we've now also addressed #175: auto-creation works on a freshly bootstrapped kafka 0.8.2 cluster. Signed-off-by: Yung-Chin Oei <yungchin@yungchin.nl>

yungchin · 2015-09-10T22:53:53Z

As noted in the last commit msg, this now also happens to deal with #175 - topic auto-creation is enabled even on a fresh, empty cluster (there's still no broker metadata, but it can fall back on the bootstrap brokers).

emmettbutler · 2015-09-19T17:26:16Z

@yungchin thanks for making this totally separate from #258 - it makes them both a lot easier to think about. This will be next on my list of pull requests to review.

emmettbutler · 2015-09-20T22:19:41Z

I've spent a while trying to grok this, and I understand it satisfactorily. Even outside of the specific object-leak bugfix, lazily creating Topics is a good idea to mitigate memory usage. I'd like to run a few more tests on my own, but I'm close to totally convinced.

yungchin · 2015-09-20T22:58:27Z

@emmett9001 thanks for checking this out! I was a bit spooked by the two test failures on travis just now:
https://travis-ci.org/Parsely/pykafka/jobs/81294673#L516

Have not seen these in local testing, but to save me some RAM I often run a two-node cluster (whereas we run a three-node cluster on travis), and I guess the smaller cluster is quicker to finish setting up a new topic.

In the failing tests, I think what goes wrong is this: test_topic_autocreate tries to auto-create a topic but gives up on it too soon (the cluster is still electing a leader and stuff), and so the test fails. Then we enter test_topic_updates, and in the middle of that our new topic from test_topic_autocreate finally appears, so now this test errors too because it sees more new topics than it expected. (And both tests are new, which is why this has never come up before, but this seems to be the kind of failure that will keep coming up randomly....)

I'm not really sure what the best fix would be - I suppose TopicDict._create_topic needs to wait longer or should really be smarter about the wait, but I've no good suggestions on that :(

The tests added with d6255c5 would sometimes fail, when for one reason or another the test cluster would take longer to set up the new topic than Cluster._create_topic was willing to wait. We had to give up after some retries, because we didn't try to distinguish between cluster settings where auto-create was enabled or disabled. This commit adds that ability, by letting _create_topic inspect the error value in the metadata response. If the error is a LeaderNotAvailable, that means the cluster is actually busy setting up our new topic, so we know it may be worth waiting a bit longer than the fixed 5 retries we had before. Signed-off-by: Yung-Chin Oei <yungchin@yungchin.nl>

yungchin · 2015-09-21T22:05:43Z

Out of curiosity, I restarted the test on travis earlier (the previous run's log output is here for posterity and such), and indeed it passed just fine the second time around. Of course that doesn't mean it's all ok - the test would continue to occasionally fail. The commit I've just pushed should fix that.

Break Cluster-Topic refcycle

yungchin added 2 commits September 10, 2015 13:38

broker: fix incorrect docstring

9b5c47c

Signed-off-by: Yung-Chin Oei <yungchin@yungchin.nl>

amontalenti added the in progress label Sep 10, 2015

yungchin added needs review and removed in progress labels Sep 10, 2015

yungchin mentioned this pull request Sep 10, 2015

forget to disconect socket ? #257

Closed

merge master

a79b204

emmettbutler self-assigned this Sep 20, 2015

emmettbutler added a commit that referenced this pull request Sep 24, 2015

Merge pull request #262 from Parsely/fix/break-cluster-topic-refcycle

6768598

Break Cluster-Topic refcycle

emmettbutler merged commit 6768598 into master Sep 24, 2015

amontalenti removed the needs review label Sep 24, 2015

emmettbutler deleted the fix/break-cluster-topic-refcycle branch September 24, 2015 19:18

yungchin mentioned this pull request Oct 23, 2015

Register atexit handlers for balancedconsumer and producer #311

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Break Cluster-Topic refcycle #262

Break Cluster-Topic refcycle #262

Uh oh!

yungchin commented Sep 10, 2015

Uh oh!

yungchin commented Sep 10, 2015

Uh oh!

emmettbutler commented Sep 19, 2015

Uh oh!

emmettbutler commented Sep 20, 2015

Uh oh!

yungchin commented Sep 20, 2015

Uh oh!

yungchin commented Sep 21, 2015

Uh oh!

Uh oh!

Break Cluster-Topic refcycle #262

Break Cluster-Topic refcycle #262

Uh oh!

Conversation

yungchin commented Sep 10, 2015

Uh oh!

yungchin commented Sep 10, 2015

Uh oh!

emmettbutler commented Sep 19, 2015

Uh oh!

emmettbutler commented Sep 20, 2015

Uh oh!

yungchin commented Sep 20, 2015

Uh oh!

yungchin commented Sep 21, 2015

Uh oh!

Uh oh!