Make Caching and Memoization easy and powerful #1179

lrq3000 · 2020-03-22T11:09:48Z

Panel is a wonderful piece of software, I love it, thanks a lot for making it!

However, one thing bothers me: the lack of caching and memoization. A simple, and I think very common, case is caching the fetch of an external CSV file. So far it seems that every time a user reloads the page, the whole script is re-executed and the CSV file must be downloaded again, which makes loading quite slow (the user sees a blank page for about a minute, because I have several such CSV files).

I know streamlit supports this case, and so does hvplot through datashader, but that route is quite complicated and not as streamlined. Also, as far as I understand, it does not let multiple users share the cache built by the first user, which would be ideal in terms of loading time.

Is there a caching and if possible memoization feature, maybe undocumented, in Panel? If not, is such a feature planned, or is there a known easy workaround (eg, by using another module)?

Thanks a lot in advance,
Best regards,
Stephen

philippjfr · 2020-03-22T13:42:03Z

The initial plan for the param.depends and pn.depends based APIs was always that we would eventually add a memoization decorator, so this is most certainly in scope. That said, you make a good point that in many cases it would be desirable to memoize across user sessions. At the moment this can be achieved by manually reading from and writing to pn.state.cache, but if we want to provide some built-in form of memoization we will have to tinker a bit, because at least with panel serve the script (or notebook) you are serving is re-executed for each user, so simple memoization will not work. I definitely think this is a very important feature, though, and I'd also like to write a whole user guide about performance tips once this is in place.

lrq3000 · 2020-03-23T15:17:18Z

@philippjfr Thank you very much for your fast reply! Those plans sound great, I would love to see Panel support caching and memoization!

Meanwhile, I tried to work around this with cachier or joblib's Memory. Both correctly reuse the cache between Jupyter Notebook runs (after restarting the kernel), but not when Panel is launched with a Bokeh server, because each session's cache gets a unique handle instead of reusing the same one (eg: bk_script_1489.my_function(), where bk_script_1489. is the prefix added by Bokeh).

Do you have any idea, by any chance, how I might work around this (forcing all caching requests to use the same cache, without Bokeh's unique id prepended)?
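One way to picture the problem described above is a cache key that leaves the module name out. The following is only a minimal sketch, assuming the per-session module prefix (such as bk_script_1489) is the only part of the key that differs between sessions; `stable_cache_key` and the body of `my_function` are hypothetical and are not part of cachier, joblib, Bokeh or Panel.

```python
import hashlib
import pickle


def stable_cache_key(func, *args, **kwargs):
    """Build a cache key from the function's qualified name and its arguments.

    func.__module__ is deliberately left out, so a per-session module name
    does not change the key (hypothetical helper, for illustration only).
    """
    payload = pickle.dumps((args, sorted(kwargs.items())))
    digest = hashlib.sha256(payload).hexdigest()[:16]
    return f"{func.__qualname__}:{digest}"


def my_function(url):
    return url.upper()


# Key as computed in one "session"
key_a = stable_cache_key(my_function, "data.csv")

# Simulate a second session in which the script got a different module name
my_function.__module__ = "bk_script_1489"
key_b = stable_cache_key(my_function, "data.csv")

print(key_a == key_b)
```

Both keys are equal because only the qualified name and the pickled arguments feed the hash. Two unrelated functions with the same name would collide under this scheme, so it only suits a single application script.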

lrq3000 · 2020-03-23T15:59:53Z

Update: if anyone needs a workaround in the meantime, I have found that simple_cache and cache (though the latter has no licence) work well with pandas dataframes and a standalone Panel/Bokeh server. I implemented my solution with simple_cache, and it reuses the same cache across Bokeh server sessions/users (so different users share the same cache, which is nice!).

Nithanaroy · 2020-03-28T19:06:51Z

Yes! A pattern that automatically caches the source data frame would really help. For now I was able to achieve this manually using pn.state.cache.

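import holoviews as hv
import panel as pn
import param

# load_input() and transform() are the application's own helpers (not shown)

class MyExplorer(param.Parameterized):

    def __init__(self, **kwargs):
        super().__init__(**kwargs)  # initialise the Parameterized instance first
        # Load the data only if no earlier session has cached it already
        if "data" not in pn.state.cache:
            pn.state.cache["data"] = load_input()
        self.df = pn.state.cache["data"]

    @param.depends("...")
    def make_view(self):
        plot_df = transform(self.df)
        return hv.Curve...

explorer = MyExplorer(name="")
dashboard = pn.Column(explorer.param, explorer.make_view)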

lrq3000 · 2020-03-28T21:14:41Z

Oh thank you for the example, very helpful! I think your code snippet can be easily converted to a simple caching function decorator, which would be better than my current solution because then no other dependency would be needed :-)
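The decorator idea mentioned above could look roughly like the following. This is a minimal sketch and not an existing Panel API: `cached_in`, `shared_cache` and `load_input` are hypothetical names, and a plain dict stands in for pn.state.cache (assumed here to behave like an ordinary dict) so that the example runs without a server.

```python
import functools


def cached_in(cache):
    """Return a decorator that stores results in the dict-like `cache`.

    Under `panel serve`, pn.state.cache could be passed as `cache`
    (assumption: it behaves like a plain dict shared across sessions).
    """
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # String key built from the function name and its arguments
            key = f"{func.__qualname__}:{args!r}:{sorted(kwargs.items())!r}"
            if key not in cache:
                cache[key] = func(*args, **kwargs)
            return cache[key]
        return wrapper
    return decorator


shared_cache = {}  # stand-in for pn.state.cache
calls = []         # records how often the real loader runs


@cached_in(shared_cache)
def load_input(path):
    calls.append(path)
    return {"path": path, "rows": 3}


first = load_input("data.csv")
second = load_input("data.csv")  # served from shared_cache, no second load
print(len(calls))
```

Keying on the `repr` of the arguments keeps the sketch short, but it only works for arguments whose `repr` identifies them uniquely, such as strings and numbers.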

MarcSkovMadsen · 2021-05-28T03:04:37Z

Background

I have been using DiskCache for memoization and caching for some time now. It is easy to use and powerful. It persists your data to disk, which speeds up your development process because your app/server reloads so much faster. And the experience for users is great.

Requirements

My requirements for easier-to-use, more powerful caching would be:

It provides an easy way to memoize.
The arguments to memoize are like those for functools.lru_cache and diskcache.Cache.set.
It provides a way to set expiration.
It provides a way to cache globally across sessions.
It provides a way to cache per session.
It provides a way, during development and debugging, to key the cache on a hash of the function's code (i.e. the cache is cleared when the code of the function changes). This truly speeds up your development process.
It provides a way to persist the cache easily, without configuring external services.
It is highly performant.
The caching is pluggable and can be extended with, or integrated with, caching packages and functions for Redis etc., i.e. an initial configuration step extends the caching functionality and no other changes are required.
It can cache the most commonly used data app objects, such as pandas DataFrames, machine learning models, deep learning models and HoloViews objects. (Streamlit has been struggling with this and has had a lot of bug reports.)
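Two of the requirements above, expiration and keying the cache on a hash of the function's code, can be sketched with the standard library alone. This is an illustration under stated assumptions, not how DiskCache or Panel implement it: `memoize`, `code_fingerprint` and the injectable `clock` argument are hypothetical, and the fingerprint only covers the function's own bytecode, constants and names.

```python
import hashlib
import time


def code_fingerprint(func):
    """Hash the function's bytecode, constants and referenced names.

    Editing the function body changes the fingerprint, so old entries are
    simply never looked up again (nested functions are not handled here).
    """
    code = func.__code__
    extra = repr((code.co_consts, code.co_names)).encode()
    return hashlib.sha256(code.co_code + extra).hexdigest()[:12]


def memoize(cache, expire=None, clock=time.monotonic):
    """Memoize into the dict-like `cache`, with an optional lifetime in seconds."""
    def decorator(func):
        fingerprint = code_fingerprint(func)

        def wrapper(*args):
            key = (func.__qualname__, fingerprint, args)
            now = clock()
            hit = cache.get(key)
            if hit is not None and (expire is None or now - hit[0] < expire):
                return hit[1]          # fresh entry, reuse it
            value = func(*args)        # missing or expired, recompute
            cache[key] = (now, value)
            return value
        return wrapper
    return decorator


fake_now = [0.0]  # controllable clock so the example does not need to sleep
cache = {}
calls = []


@memoize(cache, expire=60, clock=lambda: fake_now[0])
def square(x):
    calls.append(x)
    return x * x


square(4)          # computed
square(4)          # served from the cache
fake_now[0] = 61.0
square(4)          # entry is older than 60 s, so it is recomputed
print(len(calls))
```

Passing the clock in as an argument is a design choice made for testability; a real implementation would more likely read the time directly.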

Solution

I would suggest building it into pn.bind, pn.depends and param.depends.

Api

pn.bind(my_func, input_value=input_widget, cache=True)
pn.bind(my_func, input_value=input_widget, cache=True, cache_options={"expire": 60}) # expires every 60s
pn.bind(my_func, input_value=input_widget, cache=True, cache_options={"caches": ["panel", "diskcache"]})

@pn.depends(input_value=input_widget, cache=True)
def my_func(input_value):
    ...

philippjfr · 2021-05-28T10:01:20Z

Thanks for that proposal @MarcSkovMadsen. I strongly agree with that and in fact when we first designed param.depends it was always planned that we would eventually build support for memoization into it.

philippjfr · 2022-08-13T21:13:57Z

As of #2411 we now have a panel.cache function which allows caching the return values of functions with options for different eviction policies (least-recently-used, least-frequently-used, first-in-first-out), time-to-live and disk caching.
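To make the idea of an eviction policy concrete, here is a minimal standard-library sketch of a size-limited cache that evicts either the least recently used entry or the oldest inserted one. It is not Panel's implementation, and `BoundedCache` with its `max_items` and `policy` arguments is a hypothetical class written only for this illustration.

```python
from collections import OrderedDict


class BoundedCache:
    """Size-limited cache with an 'LRU' or 'FIFO' eviction policy."""

    def __init__(self, max_items, policy="LRU"):
        if policy not in ("LRU", "FIFO"):
            raise ValueError(f"unknown policy: {policy!r}")
        self.max_items = max_items
        self.policy = policy
        self._data = OrderedDict()

    def get(self, key, default=None):
        if key not in self._data:
            return default
        if self.policy == "LRU":
            self._data.move_to_end(key)  # a read makes the entry "recent"
        return self._data[key]

    def put(self, key, value):
        if key in self._data:
            self._data[key] = value
            if self.policy == "LRU":
                self._data.move_to_end(key)
            return
        if len(self._data) >= self.max_items:
            self._data.popitem(last=False)  # drop the entry at the front
        self._data[key] = value

    def keys(self):
        return list(self._data)


lru = BoundedCache(max_items=2, policy="LRU")
fifo = BoundedCache(max_items=2, policy="FIFO")
for c in (lru, fifo):
    c.put("a", 1)
    c.put("b", 2)
    c.get("a")     # touch "a"
    c.put("c", 3)  # cache is full, one entry must go

print(lru.keys())
print(fifo.keys())
```

With the same sequence of operations, the LRU cache keeps "a" because it was read recently and drops "b", while the FIFO cache drops "a" because it was inserted first.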

philippjfr added the type: discussion and type: enhancement labels Mar 22, 2020

MarcSkovMadsen changed the title ~~Memoization and caching?~~ Make Memoization and Caching easy and powerful May 28, 2021

MarcSkovMadsen changed the title ~~Make Memoization and Caching easy and powerful~~ Make Caching and Memoization easy and powerful May 28, 2021

philippjfr added this to the v0.12.0 milestone May 28, 2021

philippjfr modified the milestones: v0.12.0, next Jun 29, 2021

philippjfr modified the milestones: next, 0.13.0 Aug 12, 2021

philippjfr modified the milestones: v0.13.0, next Apr 4, 2022

philippjfr modified the milestones: next, Version 0.14.0 Aug 13, 2022

philippjfr closed this as completed Aug 13, 2022
