Add cache_key_wrapper to Jinja template processor #7816
@duffar12 I would appreciate your thoughts on this PR, as I believe it fixes your immediate problem and could be useful to others, too.
It seems feedback from the original issue poster isn't forthcoming, but I would like to get this processed anyway. @mistercrunch @john-bodley do you have any thoughts on this?
@villebro I think this approach seems valid. My only question is it seems that
@john-bodley The idea is basically to gather up all the objects that have been passed to
@villebro agreed.
Ok to merge this?
@villebro, sorry I missed all of this. Looks good though. Thanks for getting this in.
@villebro: We’re running into an issue where charts and dashboards are significantly slower after your change. This seems to be because we’re “compiling” the query every time before checking the cache for it, which results in several requests to our Presto/Hive backend (show partitions, show columns, etc.) to resolve the templates. When we load a dashboard with 20 charts, each of which uses the latest partition, loading slows down significantly and caching becomes much less useful. I had a couple of possible solutions to this:
What do you think?
I propose introducing a method
CATEGORY
Choose one
SUMMARY
Currently, using dynamic filters that change based on the logged-in user can have unintended consequences when caching is enabled. For example, if a where-clause references the `current_username()` function to filter only rows for the currently logged-in user, the result will be cached with the same key that another user would get, despite both users getting different results from the `current_username()` function. This is because the rendered result of calling `current_username()` is never stored in the `query_obj` that is the basis for the cache key.
This PR adds a new function `cache_key_wrapper` to the Jinja context, which can be wrapped around any function call, and stores the called values in a list `extra_cache_keys`, which is added to the `cache_dict` prior to hashing. This ensures that both users get unique cache keys when referencing the same datasource. In practice this is done by "compiling" the query before calculation of the cache key, and storing all values that have been passed to `cache_key_wrapper`, which are then considered when calling the `cache_key` function in `viz.py` (legacy) and `query_context.py`/`query_object.py` (future). This adds some overhead, as the full SQLAlchemy selectable has to be generated and compiled once prior to cache key calculation, and again if there isn't a cache hit. The selectable could easily be stored and reused if there isn't a cache hit, but since the overhead is rather unnoticeable, I decided against it in favor of code readability.
SCREENSHOT OF NEW DOCS
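To illustrate the mechanism described in the SUMMARY, here is a minimal, standalone sketch. It is not the code from this PR: the `ExtraCacheKeys` class, the `key_for_user` helper, the SHA-256 hashing and the sample `query_obj` are assumptions made up for the example, and the Jinja rendering step is replaced by a plain function call. Only the names `cache_key_wrapper`, `extra_cache_keys`, `cache_dict`, `cache_key` and `current_username` come from the description above.

```python
import hashlib
import json
from typing import Any, Dict, List


class ExtraCacheKeys:
    """Hypothetical collector for values passed through cache_key_wrapper."""

    def __init__(self) -> None:
        self.extra_cache_keys: List[Any] = []

    def cache_key_wrapper(self, value: Any) -> Any:
        # Record the rendered value, then return it unchanged so the
        # template output is the same as without the wrapper.
        self.extra_cache_keys.append(value)
        return value


def cache_key(query_obj: Dict[str, Any], extra_cache_keys: List[Any]) -> str:
    # Assumed hashing scheme: serialize the dict deterministically and hash it.
    cache_dict = dict(query_obj)
    cache_dict["extra_cache_keys"] = extra_cache_keys
    payload = json.dumps(cache_dict, sort_keys=True, default=str)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def key_for_user(username: str) -> str:
    collector = ExtraCacheKeys()

    def current_username() -> str:
        return username

    # Stand-in for "compiling" the templated query before computing the key.
    # In a template this would read:
    #   WHERE owner = '{{ cache_key_wrapper(current_username()) }}'
    collector.cache_key_wrapper(current_username())

    # The query object itself only holds the unrendered template text,
    # so it is identical for every user.
    query_obj = {
        "datasource": "example_table",
        "where": "owner = '{{ cache_key_wrapper(current_username()) }}'",
    }
    return cache_key(query_obj, collector.extra_cache_keys)
```

With this sketch, `key_for_user("alice")` and `key_for_user("bob")` differ even though `query_obj` is the same for both, because the wrapped value ends up in `extra_cache_keys` before hashing. Dropping the `cache_key_wrapper` call would make both users share one key, which is the problem the PR describes.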
TEST PLAN
Tested locally + CI
ADDITIONAL INFORMATION
REVIEWERS
@mistercrunch @betodealmeida @john-bodley @duffar12