Skip to content
This repository has been archived by the owner on Aug 26, 2022. It is now read-only.

Nearest-zone-job adjustments 944186 #4241

Merged
merged 2 commits into from May 19, 2017
Merged

Nearest-zone-job adjustments 944186 #4241

merged 2 commits into from May 19, 2017

Conversation

escattone
Copy link
Contributor

@escattone escattone commented May 19, 2017

This PR addresses two problems introduced in #4209 (addressing bug 944186):

  • it hopefully eases the spike in celery tasks (see bug 1366044)
  • fixes an exception when getting the nearest-zone for a document whose parent-topic has been deleted (bug 1365960)

I wanted to prove (via a test case) that using None for the "empty" value for the DocumentNearestZoneJob class effectively bypasses the cache, along the lines of something like this:

# doc is some document that doesn't have a nearest zone.
assert not DocumentNearestZoneJob().get(doc.pk) 
with mock.patch.object(DocumentNearestZoneJob, 'fetch') as mock_fetch:
    DocumentNearestZoneJob().get(doc.pk)
    mock_fetch.assert_not_called()

which I thought should fail (on the mock assertion) when using None for empty and succeed when using False, but for the life of me I couldn't get it to fail when using None. A long, detailed look at the django-cacheback code has me convinced that it should fail, but it doesn't in my tests.

@escattone escattone requested a review from jwhitlock May 19, 2017 01:32
@jwhitlock
Copy link
Contributor

@escattone - I've done a little more research, and I think I sent you down a dead end.

First, we're on django-cacheback 1.0, so make sure you are reading that code, not the latest. I'd like to update that and all our dependencies, but that's another PR.

First place in the tour is get, which is where all the action happens. If the cache is unset, it returns None, which triggers the first branch. This does a sync or async call to get and store the value. A sync call calls refresh, which calls fetch and stores the result. An async call stores self.empty(), so that future calls will not schedule, and schedules a future refresh.

In both cases, cache_set is called. This stores the tuple (expiry, data). So, if fetch returns None, something like (1495211607.434809, None) is stored in the cache. This is different than the plain None returned by a cache miss.

So, None is a valid return from fetch, and we don't have to go through gymnastics to avoid None.

I have other guesses, but I don't want to send you down any other dead ends, so doing more research first.

@jwhitlock
Copy link
Contributor

OK, further research:

  • With fetch_on_miss=True, I think we go down the sync path for get.
  • The cache is invalidated whenever the root zone changes
  • The cache is refreshed as background tasks every 3 hours.
  • If you fetch a deeply nested translated page, like fr/docs/Web/JavaScript/Reference/Objets_globaux/Array/from, it generates many cacheback jobs (12 in this case, as it walks up the tree). This is what search engine spiders do periodically.

So, I don't think there is a error causing needless cacheback jobs. I'd support increasing the cache limit to something like 29 hours, so that the cache lasts more than a day and maybe avoids being triggered by a once-a-day crawler but instead gets distributed by organic traffic. Plus, 29 is prime.

@escattone
Copy link
Contributor Author

@jwhitlock: Oh, of course, you're absolutely right, thanks for looking into that. It's frustrating how at times something can be "staring you in the face" and I still don't see it. I'll close this PR and submit a new one just for bug 1365960.

@escattone
Copy link
Contributor Author

@jwhitlock: Just read your latest comment. Ah yes, the cache timeout is short, especially given that with #4209 we should be invalidating the cache properly. So how does this sound, I'll modify this PR as follows:

  • increase the cache timeout to 29 hours (or maybe even more?)
  • change the empty case back to None

@jwhitlock
Copy link
Contributor

The only thing I could think of that could go wrong is if you move a page into a zone, it could take a while for the zone to be applied. But, that could be fixed by invalidating on the Document save signal.

@escattone
Copy link
Contributor Author

@jwhitlock We are already invalidating the nearest-zone cache on document save so I think we're covered for that case?

@escattone
Copy link
Contributor Author

OK, I refactored the two commits:

  • I backed-out the change to the "empty" value of nearest-zone jobs, and instead increased their cache lifetime to 29 hours, which should hopefully ease the spike in celery tasks (see bug 1366044)
  • I kept the fix for the exception when getting the nearest-zone for a document whose parent-topic has been deleted (bug 1365960), but slightly modified the comment on the test function.

@escattone escattone changed the title Fix nearest zone job 944186 Nearest-zone-job adjustments 944186 May 19, 2017
@escattone
Copy link
Contributor Author

Ready for your review again @jwhitlock!

@codecov-io
Copy link

Codecov Report

Merging #4241 into master will not change coverage.
The diff coverage is 100%.

Impacted file tree graph

@@           Coverage Diff           @@
##           master    #4241   +/-   ##
=======================================
  Coverage   86.56%   86.56%           
=======================================
  Files         150      150           
  Lines        9065     9065           
  Branches     1198     1198           
=======================================
  Hits         7847     7847           
  Misses        990      990           
  Partials      228      228
Impacted Files Coverage Δ
kuma/wiki/jobs.py 96.72% <100%> (ø) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 101b0cf...9c5750b. Read the comment docs.

Copy link
Contributor

@jwhitlock jwhitlock left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 looks good, ship it! ⛵️

@jwhitlock jwhitlock merged commit 5c69670 into mdn:master May 19, 2017
@escattone escattone deleted the fix-nearest-zone-job-944186 branch May 19, 2017 16:34
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants