Skip to content
This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

[discussion] Changing the behaviour of ResponseCache.wrap_conditional to only honour the very first caller #9525

Closed
ShadowJonathan opened this issue Mar 2, 2021 · 7 comments

Comments

@ShadowJonathan
Copy link
Contributor

ShadowJonathan commented Mar 2, 2021

cc @richvdh, @clokep

I feel like this warrants some further discussion, as i think i don't exactly understand the rationale of why we're pretty much doing this, and i think that some gotchas werent received during discussion in #synapse-dev, so i'm making this verbose issue on it to get on the same page.

Status

#9358 introduced ResponseCache.wrap_conditional, #9522 builds on it while fixing #9507, and #9458 adds tests that validate its functionality (as given by #9358)

This feature (as of the making of this issue) has the following functionality;

  • It acts like wrap, but has an extra positional argument which the caller can provide a "conditional" with, this conditional would take the result of the call as it's first argument, and return a bool indicating if the result should be cached in a timed fashion or not.
  • Currently, if another caller would call wrap_conditional with the same key, their "conditional" callable would be added to the set of conditionals, to be all evaluated when the call returns.
  • When the call returns, this set of conditionals need all to return True (and not throw an exception, which is what Harden ResponseCache (and fix #9507) #9522 fixes) for the result to be cached in a timely fashion.
  • Any subsequent call to wrap_conditional will have their conditional argument ignored, no matter if it evaluated to False or not.

Discussion terminology

For the purpose of discussion, I'll define the following terms;

  • wrap* is wrap and/or wrap_conditional, this is in cases where the desired functionality is indistinguishable between the two. (e.g. "doesn't matter if a conditional is supplied or not")
  • the "callers", or "the caller", are any function that calls wrap* (with the same key, in the context of this discussion)
  • "the very first caller" is the first (timeline-wise) to call to wrap* or wrap_conditional, and start the inner call. This excludes any future callers.
  • "the first callers" are the callers that have called wrap* but are waiting for the inner call to complete, awaiting on a Deferred-like object.
  • "subsequent callers" are the callers excluded from the first callers, these callers

Arguments

My argument on going forward with this is to make any caller (first or subsequent) be able to evict/invalidate the cache when a conditional they give evaluates as False.

My rationale for this is that calls to wrap*, as they stand now, should all have the same "effect". i.e. invalidating the cache if conditional returns False, this is the careful approach, and so any cache is invalidated whenever it is deemed unsafe or bad to be cached, "better to be safe than sorry", because invalid cache can result in anything from delays to corruption.

@richvdh's argument is to make the very first caller be able to have a say if the result should be cached, because that is semantically equivalent to how it was previously, where the call would decide if it's value is cachable or not.

I agree with this, and that was how #9358 initially approached it, with a relatively-hacky approach of having any call wrap its value in NoTimedCache (see these 5 commits).

However, I wasn't satisfied with this, and looked at alternatives (like contextvars, which I didn't feel are very stable or compatible with twisted at the moment), and eventually settled on turning the approach on its head (to avoid rewriting semantics of many functions that ResponseCache wrapped around across the codebase, the intention was to be non-invasive) by introducing wrap_conditional.

Current situation

This approach changed some semantics, because now it was the caller to decide if the result would be cached, and there wasn't a good way to get a "voice" from the called function, so it became a "counsel" of first callers that all had to agree on something being cached (better safe than sorry).

Any function could be (or could become) a caller, and so it was best to be as agnostic as possible, and don't make assumptions who would possibly be callers across the codebase (especially not assuming it'd be a controlled subset, as that creates more implicit contracts that're not easily maintainable), and just value their "opinion" (in the form of conditional) equally with vetoing.

With a set of {X Y Z} possibly being the first callers, and only the very first call being able to set the conditional, it could introduce hard-to-resolve bugs where (hypothetically) Z here being the one that introduces a faulty conditional, but then X and Y also being candidates for becoming the very first caller as well, and this creating a situation where synchronisation of these calls had to be taken into account to fix those bugs.

This is where we are right now, and I'm curious how i could change this approach to be more correct while also semantically making sense.

I agree with richard on a call being able to set its cachability, but i think because of how i had to change my approach, a different situation has arisen, and i think we're not on the same page on how to best change the situation/approach, or how to work with the new semantics.

@ShadowJonathan
Copy link
Contributor Author

I think one additional (weak) argument to make any caller be able to invalidate cache is to possibly have their conditional linked with the Cache-Control: no-store header of any HTTP request, which'd then unconditionally return False in their conditional, to make sure any future requests would not be cached. Additional such control mechanisms could happen or arise in other functions (see the "being agnostic" part in my argument), which'd then be able to signal individually if something is cached, rather than have something or someone else decide for them. (this isn't intended to be my prime argument though, arguably this is a bit weak to lean on when considering this behaviour, but i think it could matter)

@richvdh
Copy link
Member

richvdh commented Mar 2, 2021

for our other caches, evicting things from the cache (whether it is done before the original operation completes or not) is done via a separate interface. I'm not convinced we need to scope-creep the wrap interface into supporting eviction at any time.

@ShadowJonathan
Copy link
Contributor Author

Then what do you propose we do? #9358 is reverted, and now we can take another approach, I've lined out what i stumbled on and how i went about making this ergonomic, and why some approaches didn't go well for me.

@richvdh
Copy link
Member

richvdh commented Mar 2, 2021

I think I've been pretty clear about what I propose.

@ShadowJonathan
Copy link
Contributor Author

ShadowJonathan commented Mar 2, 2021

You haven't, what you proposed with making #9358's approach only honor the very first caller will in my opinion only introduce hard-to-discover bugs (which is a principle i'd want your opinion on) for a pretty questionable semantics displacement against the new situation ("let the called function decide cachability" when now the caller is deciding it, but then only the very first caller can decide it because it has to be done "once"?).

I'm not convinced we need to scope-creep the wrap interface into supporting eviction at any time.

btw, this is suggesting you want to switch to another approach entirely, that's why i asked what your thoughts on that was.

@ShadowJonathan ShadowJonathan changed the title [discussion] Changing the behaviour of ResponseCache.wrap_conditional to only honour the very first caller [discussion] Changing the behaviour of ResponseCache.wrap_conditional to only honour the very first caller Mar 24, 2021
@ShadowJonathan
Copy link
Contributor Author

Hey, i made #9739, which applies the Very First Caller behaviour to /sync specifically.

I want to continue talking about this, with this discussion i want to find what behaviour (a) ResponseCache.wrap_conditional would have, so that something like it can be used in more places.

Can you imagine more places where such a thing can be used? A caller deeming the result coming from a function not cachable, or more accurately, where such a function (that is being called) might find "hey, i don't want to cache this, but still return it".

If so, then i'd wanna make a generic ResponseCache.wrap_conditional, if not, then we might better close this discussion(/issue).

@richvdh
Copy link
Member

richvdh commented Jul 8, 2021

I think we resolved this with #10157.

@richvdh richvdh closed this as completed Jul 8, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants