Speed up cached function access #2075

Merged
merged 7 commits into from Mar 31, 2017

Owner

erikjohnston commented Mar 28, 2017

Calls to cached functions are starting to turn up in the flame graphs of matrix.org.

We improve it by doing two things:

  • Don't wrap the return value of the cache descriptors in a deferred if the cache has the full value. This does mean that the storage functions may now return either a deferred or an actual value, but pretty much everywhere uses yield func(..) which happily deals with it.
  • Remove the call to getcallargs from the inner function, as that is quite expensive. We can pre-compute a bunch of stuff when we first set up the cache.

These two changes speed up cache access by 10x on my desktop.
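
A minimal sketch of why call sites that use yield tolerate the first change. It deliberately avoids Twisted: ToyDeferred, drive, get_user and caller are hypothetical stand-ins, and drive imitates only one behaviour of an inlineCallbacks-style driver, namely that a yielded value which is not a deferred is handed straight back to the generator.

```python
class ToyDeferred:
    """Hypothetical stand-in for an already-fired deferred (not Twisted's class)."""

    def __init__(self, result):
        self.result = result


def drive(gen):
    """Run a generator to completion, resolving each yielded value.

    A yielded ToyDeferred is unwrapped first; any other value is sent
    back into the generator unchanged.
    """
    try:
        yielded = next(gen)
        while True:
            value = yielded.result if isinstance(yielded, ToyDeferred) else yielded
            yielded = gen.send(value)
    except StopIteration as stop:
        return stop.value


_cache = {}


def get_user(user_id):
    """Toy cached lookup: plain value on a hit, toy deferred on a miss."""
    if user_id in _cache:
        return _cache[user_id]
    _cache[user_id] = "user-%s" % (user_id,)
    return ToyDeferred(_cache[user_id])


def caller(user_id):
    # The call site is identical whether get_user hits or misses the cache.
    name = yield get_user(user_id)
    return name


first = drive(caller(1))   # cache miss: the toy deferred is unwrapped
second = drive(caller(1))  # cache hit: the plain value passes straight through
```

Both calls go through the same yield and produce the same result; only code that treats the return value as a deferred directly would notice the difference.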

@erikjohnston erikjohnston requested a review from richvdh Mar 28, 2017

@erikjohnston erikjohnston changed the title from Speed up the cache to Speed up cached function access Mar 28, 2017

@@ -101,7 +101,7 @@ def remove(r):
return d
else:
success, res = self._result
- return defer.succeed(res) if success else defer.fail(res)
+ return res if success else defer.fail(res)
@richvdh

richvdh Mar 29, 2017

Member

I bet this will break something somewhere :/

@richvdh

richvdh Mar 30, 2017

Member

to be clear: I'm not suggesting doing much about it, other than watching for breakage when it lands.

@erikjohnston

erikjohnston Mar 30, 2017

Owner

To be honest I'm concerned about this, but it makes cache hits much cheaper.

@richvdh

richvdh Mar 30, 2017

Member

probably worth documenting it in a docstring.

synapse/util/caches/descriptors.py
@@ -200,6 +200,7 @@ def __init__(self, orig, num_args, inlineCallbacks, cache_context=False):
arg_spec = inspect.getargspec(orig)
all_args = arg_spec.args
+ self.arg_spec = arg_spec
@richvdh

richvdh Mar 29, 2017

Member

what's this for?

synapse/util/caches/descriptors.py
@@ -229,6 +230,14 @@ def __init__(self, orig, num_args, inlineCallbacks, cache_context=False):
self.num_args = num_args
self.arg_names = all_args[1:num_args + 1]
+ if arg_spec.defaults:
+ self.arg_defaults = dict(zip(
@richvdh

richvdh Mar 29, 2017

Member

please can you document what this thing is and what type it has.

synapse/util/logcontext.py
@@ -311,6 +311,9 @@ def preserve_context_over_deferred(deferred, context=None):
"""Given a deferred wrap it such that any callbacks added later to it will
be invoked with the current context.
"""
+ if not isinstance(deferred, defer.Deferred):
@richvdh

richvdh Mar 29, 2017

Member

I am not in favour of further bodges on top of preserve_context_over_deferred, since afaict it is broken. Let me send you a counter-PR here.

Member

richvdh commented Mar 30, 2017

ok so this now has a conflict, and I think the preserve_context_over_deferred hackery isn't needed.

@richvdh richvdh assigned erikjohnston and unassigned richvdh Mar 30, 2017

Owner

erikjohnston commented Mar 30, 2017

Thanks for doing the preserve PR! Will rebase this in a bit

erikjohnston added some commits Mar 28, 2017

Manually calculate cache key as getcallargs is expensive
This is because getcallargs recomputes the getargspec, amongst other
things, which we don't need to do as it's already been done
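
The idea in this commit message can be sketched roughly as follows. This is an illustration under assumptions, not the code from the PR: make_key_builder, build_key and get_event are hypothetical names, and inspect.getfullargspec is used in place of the inspect.getargspec seen in the diff because the latter is no longer available in current Python versions.

```python
import inspect


def make_key_builder(func, num_args):
    """Inspect ``func`` once and return a cheap per-call key builder."""
    spec = inspect.getfullargspec(func)
    # Skip the first argument (assumed to be ``self``), as the diff does
    # with all_args[1:num_args + 1].
    arg_names = spec.args[1:num_args + 1]
    defaults = spec.defaults or ()
    # Defaults line up with the tail of the positional argument list.
    arg_defaults = dict(zip(spec.args[len(spec.args) - len(defaults):], defaults))

    def build_key(args, kwargs):
        """Build the cache key; ``args`` is assumed to exclude ``self``."""
        key = []
        for i, name in enumerate(arg_names):
            if i < len(args):
                key.append(args[i])
            elif name in kwargs:
                key.append(kwargs[name])
            else:
                # Raises KeyError if a required argument was not supplied.
                key.append(arg_defaults[name])
        return tuple(key)

    return build_key


def get_event(self, event_id, room_id, allow_rejected=False):
    """Hypothetical storage function, used only for its signature."""


build_key = make_key_builder(get_event, num_args=3)
key = build_key(("$ev",), {"room_id": "!room"})
```

The signature is inspected once when the builder is created; each later call only does tuple indexing and dictionary lookups.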
Owner

erikjohnston commented Mar 30, 2017

@richvdh PTAL

@erikjohnston erikjohnston assigned richvdh and unassigned erikjohnston Mar 30, 2017


synapse/util/caches/descriptors.py
self.arg_names = all_args[1:num_args + 1]
+ # The arg spec of the wrapped function, see `inspect.getargspec` for
+ # the type.
+ self.arg_spec = arg_spec
@richvdh

richvdh Mar 30, 2017

Member

I still don't think this is used?

@erikjohnston

erikjohnston Mar 30, 2017

Owner

Err, good point

@richvdh

richvdh Mar 30, 2017

Member

... and yet it is still here...

@@ -341,7 +370,10 @@ def onErr(f):
cache.set(cache_key, result_d, callback=invalidate_callback)
observer = result_d.observe()
- return logcontext.make_deferred_yieldable(observer)
+ if isinstance(observer, defer.Deferred):
@richvdh

richvdh Mar 30, 2017

Member

make_deferred_yieldable will work ok with a non-deferred, so I think this is redundant. otoh I guess it optimises the common path?

@erikjohnston

erikjohnston Mar 30, 2017

Owner

Because some of the speed up comes from not having to bounce through all the deferred stuff (which is much more complicated than just unwrapping to get the value) at the call sites.

Though we can also leave that to another PR

@richvdh

richvdh Mar 30, 2017

Member

understood, leave it how it is

synapse/util/logcontext.py
@@ -315,6 +315,9 @@ def preserve_context_over_deferred(deferred, context=None):
the deferred follow the synapse logcontext rules: try
``make_deferred_yieldable`` instead.
"""
+ if not isinstance(deferred, defer.Deferred):
@richvdh

richvdh Mar 30, 2017

Member

now redundant afaict?

@richvdh richvdh assigned erikjohnston and unassigned richvdh Mar 30, 2017

erikjohnston added some commits Mar 30, 2017

Owner

erikjohnston commented Mar 30, 2017

The concern is that changing a bunch of functions to only sometimes return deferreds is a bit risky, though at worst it should only cause an exception to be raised when someone tries to add a callback to the returned value (i.e., we're not going to get incorrect values coming out).
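
A small illustration of that failure mode, assuming a hypothetical cached function get_display_name_cached that returns a plain string on a cache hit. Calling addCallback on the plain value fails loudly with AttributeError rather than yielding a wrong result.

```python
def get_display_name_cached():
    # Hypothetical cached function: on a hit it returns the value itself
    # rather than a deferred wrapping it.
    return "alice"


result = get_display_name_cached()
try:
    # Code written for the old behaviour treats the result as a deferred.
    result.addCallback(print)
    error = None
except AttributeError as exc:
    # A plain str has no addCallback attribute.
    error = exc
```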

lgtm apart from spurious self.arg_spec.

@erikjohnston erikjohnston merged commit 9cee0ce into develop Mar 31, 2017

7 of 8 checks passed

Sytest Dendron (Commit) Build #1864 origin/erikj/cache_speed in progress...
Sytest Dendron (Merged PR) Build finished.
Sytest Postgres (Commit) Build #2688 origin/erikj/cache_speed succeeded in 7 min 39 sec
Sytest Postgres (Merged PR) Build finished.
Sytest SQLite (Commit) Build #2759 origin/erikj/cache_speed succeeded in 6 min 27 sec
Sytest SQLite (Merged PR) Build finished.
continuous-integration/travis-ci/pr The Travis CI build passed
continuous-integration/travis-ci/push The Travis CI build passed

psaavedra added a commit to psaavedra/synapse that referenced this pull request May 19, 2017

Merge tag 'v0.21.0' into v0.21.0_no_federate_by_default
Changes in synapse v0.21.0 (2017-05-18)
=======================================

No changes since v0.21.0-rc3

Changes in synapse v0.21.0-rc3 (2017-05-17)
===========================================

Features:

* Add per user rate-limiting overrides (PR #2208)
* Add config option to limit maximum number of events requested by ``/sync``
  and ``/messages`` (PR #2221) Thanks to @psaavedra!

Changes:

* Various small performance fixes (PR #2201, #2202, #2224, #2226, #2227, #2228,
  #2229)
* Update username availability checker API (PR #2209, #2213)
* When purging, don't de-delta state groups we're about to delete (PR #2214)
* Documentation to check synapse version (PR #2215) Thanks to @hamber-dick!
* Add an index to event_search to speed up purge history API (PR #2218)

Bug fixes:

* Fix API to allow clients to upload one-time-keys with new sigs (PR #2206)

Changes in synapse v0.21.0-rc2 (2017-05-08)
===========================================

Changes:

* Always mark remotes as up if we receive a signed request from them (PR #2190)

Bug fixes:

* Fix bug where users got pushed for rooms they had muted (PR #2200)

Changes in synapse v0.21.0-rc1 (2017-05-08)
===========================================

Features:

* Add username availability checker API (PR #2183)
* Add read marker API (PR #2120)

Changes:

* Enable guest access for the 3pl/3pid APIs (PR #1986)
* Add setting to support TURN for guests (PR #2011)
* Various performance improvements (PR #2075, #2076, #2080, #2083, #2108,
  #2158, #2176, #2185)
* Make synctl a bit more user friendly (PR #2078, #2127) Thanks @APwhitehat!
* Replace HTTP replication with TCP replication (PR #2082, #2097, #2098,
  #2099, #2103, #2014, #2016, #2115, #2116, #2117)
* Support authenticated SMTP (PR #2102) Thanks @DanielDent!
* Add a counter metric for successfully-sent transactions (PR #2121)
* Propagate errors sensibly from proxied IS requests (PR #2147)
* Add more granular event send metrics (PR #2178)

Bug fixes:

* Fix nuke-room script to work with current schema (PR #1927) Thanks
  @zuckschwerdt!
* Fix db port script to not assume postgres tables are in the public schema
  (PR #2024) Thanks @jerrykan!
* Fix getting latest device IP for user with no devices (PR #2118)
* Fix rejection of invites to unreachable servers (PR #2145)
* Fix code for reporting old verify keys in synapse (PR #2156)
* Fix invite state to always include all events (PR #2163)
* Fix bug where synapse would always fetch state for any missing event (PR #2170)
* Fix a leak with timed out HTTP connections (PR #2180)
* Fix bug where we didn't time out HTTP requests to ASes  (PR #2192)

Docs:

* Clarify doc for SQLite to PostgreSQL port (PR #1961) Thanks @benhylau!
* Fix typo in synctl help (PR #2107) Thanks @HarHarLinks!
* ``web_client_location`` documentation fix (PR #2131) Thanks @matthewjwolff!
* Update README.rst with FreeBSD changes (PR #2132) Thanks @feld!
* Clarify setting up metrics (PR #2149) Thanks @encks!

@erikjohnston erikjohnston deleted the erikj/cache_speed branch Oct 26, 2017
