New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a five minute cache to get_destination_retry_timings #3933

Merged
merged 9 commits into from Oct 1, 2018

Conversation

Projects
None yet
3 participants
@erikjohnston
Member

erikjohnston commented Sep 21, 2018

Hopefully helps with #3931

Based on: #3932

@erikjohnston erikjohnston requested a review from matrix-org/synapse-core Sep 21, 2018

@erikjohnston

This comment has been minimized.

Member

erikjohnston commented Sep 21, 2018

Hmm, this breaks metrics due to it returning a length estimate less than 0:

2018-09-21 16:11:41,429 - twisted - 243 - CRITICAL -  - 
Traceback (most recent call last):
  File "/home/erikj/.synapse3/lib/python3.6/site-packages/twisted/web/server.py", line 198, in process
    self.render(resrc)
  File "/home/erikj/synapse/synapse/http/site.py", line 116, in render
    Request.render(self, resrc)
  File "/home/erikj/.synapse3/lib/python3.6/site-packages/twisted/web/server.py", line 258, in render
    body = resrc.render(self)
  File "/home/erikj/.synapse3/lib/python3.6/site-packages/twisted/web/resource.py", line 250, in render
    return m(request)
  File "/home/erikj/.synapse3/lib/python3.6/site-packages/prometheus_client/twisted/_exposition.py", line 18, in render_GET
    return generate_latest(self.registry)
  File "/home/erikj/.synapse3/lib/python3.6/site-packages/prometheus_client/exposition.py", line 69, in generate_latest
    for metric in registry.collect():
  File "/home/erikj/synapse/synapse/metrics/__init__.py", line 46, in collect
    for metric in REGISTRY.collect():
  File "/home/erikj/.synapse3/lib/python3.6/site-packages/prometheus_client/core.py", line 102, in collect
    for metric in collector.collect():
  File "/home/erikj/synapse/synapse/util/caches/__init__.py", line 85, in collect
    cache_size.labels(cache_name).set(len(cache))
ValueError: __len__() should return >= 0

erikjohnston and others added some commits Sep 21, 2018

Fix ExpiringCache.__len__ to be accurate
It used to try and produce an estimate, which was sometimes negative.
This caused metrics to be sad, so lets always just calculate it from
scratch.
if value is SENTINEL:
return default
if self.iterable:

This comment has been minimized.

@richvdh

richvdh Sep 25, 2018

Member

I would not call this an eviction. To me, eviction means that we have removed a valid entry to make room for a new one. I'd consider this as equivalent to an update with a magic "doesn't exist" value.

This comment has been minimized.

@hawkowl

hawkowl Sep 28, 2018

Contributor

I think the problem is that we're potentially conflating "eviction because of invalidation" (which this code may do) with "eviction because of some automated process" in the statistics. If something pops from the cache so it won't be cached anymore deliberately, I don't think that's useful to track in what this metric appears to do (assessing whether our cache is effective).

This comment has been minimized.

@richvdh

richvdh Sep 28, 2018

Member

@hawkowl: so I think you're agreeing with me? If so I'll remove this bit of code and merge it.

This comment has been minimized.

@erikjohnston

erikjohnston Oct 1, 2018

Member

Ah, for some reason I thought that this was consistent with what we did elsewhere, but it isn't.

richvdh added a commit that referenced this pull request Sep 26, 2018

Fix ExpiringCache.__len__ to be accurate
It used to try and produce an estimate, which was sometimes negative.
This caused metrics to be sad, so lets always just calculate it from
scratch.

(This appears to have been a longstanding bug, but one which has been made more
of a problem by #3932 and #3933).

(This was originally done by Erik as part of #3933. I'm cherry-picking it
because really it's a fix in its own right)

@erikjohnston erikjohnston requested a review from matrix-org/synapse-core Oct 1, 2018

@richvdh

richvdh approved these changes Oct 1, 2018

lgtm otherwise

@@ -95,6 +98,13 @@ def __getitem__(self, key):
return entry.value
def pop(self, key, default=None):
value = self._cache.pop(key, SENTINEL)

This comment has been minimized.

@richvdh

richvdh Oct 1, 2018

Member

this is the same as return self._cache.pop(key, default) now.

docstring wouldn't go amiss.

This comment has been minimized.

@erikjohnston

erikjohnston Oct 1, 2018

Member

this is the same as return self._cache.pop(key, default) now.

It's not, actually, as self._cache.pop won't raise if default is None. Thinking about it we can probably just do return self._cache.pop(key, **kwargs)

This comment has been minimized.

@erikjohnston

erikjohnston Oct 1, 2018

Member

Argh, I've completely messed this up

if value is SENTINEL:
raise KeyError(key)
return value

This comment has been minimized.

@erikjohnston

erikjohnston Oct 1, 2018

Member

I could have just done:

def pop(self, key, *args, **kwargs):
    "Identical functionality to `dict.pop(..)`"
    return self._cache.pop(key, *args, **kwargs)

But that feels quite opaque to me?

This comment has been minimized.

@richvdh

richvdh Oct 1, 2018

Member

yeah sorry, I hadn't quite twigged that dict.pop(foo) was different to dict.pop(foo, None).

@richvdh

richvdh approved these changes Oct 1, 2018

lgtm

@erikjohnston erikjohnston merged commit 8f5c23d into develop Oct 1, 2018

6 checks passed

ci/circleci: sytestpy2merged Your tests passed on CircleCI!
Details
ci/circleci: sytestpy2postgresmerged Your tests passed on CircleCI!
Details
ci/circleci: sytestpy3merged Your tests passed on CircleCI!
Details
ci/circleci: sytestpy3postgresmerged Your tests passed on CircleCI!
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
continuous-integration/travis-ci/push The Travis CI build passed
Details
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment