Improving the "Hidden Div" Solution #290

Open
chriddyp opened this issue Jul 13, 2018 · 8 comments
Labels
feature something new P3 not needed for current cycle

Comments

@chriddyp
Member

Currently, there are several ways to share data between callbacks. These are outlined here: https://dash.plot.ly/sharing-data-between-callbacks

Some users find these methods complex or clunky and others have trouble understanding which method is best for them.

Let's discuss some ways that we can improve this aspect of Dash.


How Are Hidden Divs Being Used?
A1. To store intermediate state that is expensive to generate (in time, # of connections to e.g. a database, in CPU) and to use that intermediate state in several callbacks
- e.g. 5 graphs generated from the same expensive SQL query
A2. To store data for controls that are being used across multiple pages or tabs
- e.g. Tab 1 is the "controls" page and Tab 2 has several graphs that depend on those controls. Since the controls aren't in the component tree in Tab 2, they are stored in a global div that can be updated from Tab 1 and referenced in Tab 2.
A3. As a mutable, derived-state store
- e.g. Appending click data from a graph into a click data store that contains all of the previous click events (https://community.plot.ly/t/solution-persistent-click-events/6590)
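As a rough illustration of A3, here is a minimal sketch of the "append to a store" pattern. It assumes the hidden div holds a JSON string; `append_click` is a hypothetical helper, and in a real app it would be the body of a callback that takes the store's current value as state and writes the new value back as output.

```python
import json


def append_click(stored_json, click_event):
    """Return the updated JSON string to write back into the hidden store.

    stored_json is the store's current content (None before the first click);
    click_event is the latest click payload (None if nothing was clicked).
    """
    history = json.loads(stored_json) if stored_json else []
    if click_event is not None:
        history.append(click_event)
    return json.dumps(history)


# Simulate two successive clicks, each one a separate callback invocation.
store = None
store = append_click(store, {"x": 1, "y": 2})
store = append_click(store, {"x": 3, "y": 4})
print(store)
```

Note that the whole history is deserialized and reserialized on every click, which is part of the friction discussed in this issue.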

How Is Memoization Caching Being Used?
B1. Like A1, except persisting the data across all user sessions, perhaps with a time expiring cache
- e.g. There is a dropdown with 5 values and it makes 5 expensive unique SQL queries. These get cached so that all users retrieve the same cached value.
B2. A filesystem/Redis-based version of A1. User session IDs are created in the layout function (def layout) and used as the memoization key
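To make B1 and B2 concrete, here is a minimal sketch of a time-expiring memoization decorator. It is an assumption-laden stand-in: the in-process dict plays the role that Redis or the filesystem would play in a real deployment, and `memoize_with_ttl`, `run_query`, and `run_session_query` are hypothetical names, not part of Dash.

```python
import time
import uuid
from functools import wraps


def memoize_with_ttl(ttl_seconds):
    """Cache results per argument tuple and recompute once they expire."""
    def decorator(func):
        cache = {}  # args -> (timestamp, value); stand-in for Redis/filesystem

        @wraps(func)
        def wrapper(*args):
            now = time.monotonic()
            hit = cache.get(args)
            if hit is not None and now - hit[0] < ttl_seconds:
                return hit[1]
            value = func(*args)
            cache[args] = (now, value)
            return value
        return wrapper
    return decorator


query_log = []  # records every time the "expensive" query really runs


@memoize_with_ttl(ttl_seconds=300)
def run_query(dropdown_value):
    # B1: keyed on the dropdown value only, so all users share the result.
    query_log.append(dropdown_value)
    return f"rows for {dropdown_value}"


session_log = []


@memoize_with_ttl(ttl_seconds=300)
def run_session_query(session_id, dropdown_value):
    # B2: the session ID is part of the key, so each session gets its own entry.
    session_log.append((session_id, dropdown_value))
    return f"rows for {dropdown_value} (session {session_id})"


run_query("NYC")
run_query("NYC")  # served from the cache

session_a, session_b = str(uuid.uuid4()), str(uuid.uuid4())
run_session_query(session_a, "NYC")
run_session_query(session_a, "NYC")  # cached for session_a
run_session_query(session_b, "NYC")  # a different session recomputes
```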


How Could These Methods Be Improved?

  • For A1 - Allowing Multiple Outputs in a callback would remedy many (all?) of these issues. It's also the easiest to grok as there are no intermediate steps.

  • For A2 and B1 - Easier Serialization would reduce friction. This is the #1 issue that I hear users encounter: confusion around serializing and deserializing JSON, or even an inability to do so (with complex data structures).

    • Perhaps we can just use pickle instead so that serialization and deserialization "just works"?
    • The Redis memoization is hard to wrap your head around.
      • Decorators are confusing and require a fair amount of refactoring: instead of df = pd.read_csv it's df = get_data(), where get_data is a complex decorated function (see the solution in https://dash.plot.ly/sharing-data-between-callbacks)
      • Perhaps we can create an object that looks and feels like a global dict but is in fact saving to Redis
    • The per-user Redis cache is also confusing as it requires unique user IDs (see solution in https://dash.plot.ly/sharing-data-between-callbacks)
      • Perhaps we can leverage flask session IDs or roll our own and just make them available to everyone
      • If there was a nice global dict object, then users could figure out how to assign items on a per-user basis to it, e.g. dash.datastore[dash.user_id] = some_dataframe(dash.user_id)
  • For A2 and B1 - Faster Serialization

    • See Apache Arrow serialization above
    • If the issue is network delay, then we could come up with a better solution with Redis.
      • The Redis Cache Solution will act differently than dcc.Store, as the dcc.Store would be part of the Dash DAG: updating it would trigger other callbacks. The Redis Cache Solution wouldn't trigger anything; it would just memoize
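The "global dict that is really saving to Redis" idea could look roughly like the sketch below. Everything here is hypothetical: `PickledStore` is not a Dash API, and a plain dict stands in for the Redis client so the example runs on its own. Values are pickled on write and unpickled on read, so complex objects round-trip without manual JSON handling.

```python
import pickle
from collections.abc import MutableMapping


class PickledStore(MutableMapping):
    """Dict-like object that pickles values into a key/value backend.

    The backend only needs item access on bytes values; a dict is used
    here as a stand-in for a Redis-like service (an assumption).
    """

    def __init__(self, backend=None):
        self._backend = {} if backend is None else backend

    def __getitem__(self, key):
        return pickle.loads(self._backend[key])

    def __setitem__(self, key, value):
        self._backend[key] = pickle.dumps(value)

    def __delitem__(self, key):
        del self._backend[key]

    def __iter__(self):
        return iter(self._backend)

    def __len__(self):
        return len(self._backend)


datastore = PickledStore()
user_id = "user-123"  # hypothetical per-user key

# Sets and tuples are not JSON-friendly, but they survive pickling.
datastore[user_id] = {"filters": {"a", "b"}, "rows": [(1, 2.5)]}
restored = datastore[user_id]
```

One caveat with this design: pickle should only ever load data from a trusted source, since unpickling can execute arbitrary code. That matters if the backend is reachable by anything other than the app itself.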

There's some more stuff to discuss here, but this is as far as I've gotten in exploring this problem space today.


cc @plotly/dash

@nicolaskruchten
Contributor

Let's not forget the security concerns raised in #262 :)

@bpostlethwaite
Member

A customer suggested having access to an automatically user-scoped dict of some sort.

Dash-auth could, for example, provide a user-scoped Store in each callback. It wouldn't need to be saved in the DOM at all: it could be Redis-backed or fall back transparently to in-process memory if Redis is not available.

Shiny has something like this but I haven't investigated it yet.
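A minimal sketch of what such a user-scoped store with a transparent fallback might look like. All names are hypothetical (`UserScopedStore`, `make_backend`, and the `connect` callable are illustrations, not dash-auth APIs), and the backend is assumed to behave like a dict.

```python
class UserScopedStore:
    """View onto a shared key/value backend that namespaces keys per user."""

    def __init__(self, backend, user_id):
        self._backend = backend
        self._prefix = f"user:{user_id}:"

    def set(self, key, value):
        self._backend[self._prefix + key] = value

    def get(self, key, default=None):
        return self._backend.get(self._prefix + key, default)


def make_backend(connect=None):
    """Use connect() if it succeeds, otherwise fall back to in-process memory.

    connect is a hypothetical callable that returns a dict-like client
    and raises ConnectionError when the service is unreachable.
    """
    if connect is not None:
        try:
            return connect()
        except ConnectionError:
            pass
    return {}


def unavailable_redis():
    # Simulates a deployment where no Redis server is running.
    raise ConnectionError("no Redis server")


backend = make_backend(connect=unavailable_redis)
alice = UserScopedStore(backend, "alice")
bob = UserScopedStore(backend, "bob")

alice.set("selection", [1, 2, 3])
print(alice.get("selection"), bob.get("selection"))
```

The in-process fallback only works for a single worker process; with several workers, each would see its own dict.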

@cpsievert

If any official solution(s) require redis, 'rolling our own' wrappers seems like a good idea...that way we can better ensure a consistent API/experience across languages. Also, FWIW, R has a low-level interface to redis and I doubt many folks would want to work with it directly (for one, R objects must be serialized manually).

@radekwlsk

radekwlsk commented Aug 3, 2018

> For A1 - Allowing Multiple Outputs in a callback would remedy many (all?) of these issues. It's also the easiest to grok as there are no intermediate steps.

That would solve a lot of other issues too, not only those related to hidden divs. And isn't it fairly straightforward? Instead of updating one output with the result of the wrapped function, unpack a list of outputs and apply each result to its output?

It could still be done as multiple callbacks on the front end; just calculate the result of the function in Python once, as that is the most expensive task.

As #149 is pretty old and long awaited, I think implementing it for users in any way, and then polishing it in the back end of Dash while keeping the interface, would make sense.
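A rough sketch of the "unpack a list of outputs" idea in plain Python. This is not how Dash implements callbacks; `multi_output_callback` and the output IDs are hypothetical. It only shows the expensive function running once while its results are fanned out to several outputs.

```python
from functools import wraps


def multi_output_callback(output_ids):
    """Map the tuple returned by the wrapped function onto several outputs."""
    def decorator(func):
        @wraps(func)
        def wrapper(*inputs):
            results = func(*inputs)
            if len(results) != len(output_ids):
                raise ValueError(
                    f"expected {len(output_ids)} results, got {len(results)}"
                )
            return dict(zip(output_ids, results))
        return wrapper
    return decorator


query_log = []  # records how often the expensive step actually runs


@multi_output_callback(["graph-1", "graph-2", "table"])
def update_all(selection):
    query_log.append(selection)
    rows = [selection * n for n in range(1, 4)]  # stand-in for an expensive query
    return sum(rows), max(rows), rows


updates = update_all(2)
print(updates)
```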

@zouhairm

While allowing multiple outputs would help in some respects with sharing data between callbacks, it doesn't solve the use case of saving "derived state" (as hidden divs allow as described in A3)

There's mention of using session data from Flask. I posted something about it in the community forum, as I've had a use case for that.

I found that server-side session data as implemented in Flask-sessionstore works better, since in my case it made sense not to pass data back and forth between client and server.

I'd also add that another thing to account for is support for data that might not be easily serializable. Flask sessions require JSON serializability, while the server-side sessionstore doesn't (presumably because the data doesn't need to be transported)
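To illustrate the serializability gap with standard-library tools only: the sketch below shows a state dict that `json.dumps` rejects but that round-trips through `pickle`. Whether a given server-side session store actually uses pickle internally is an assumption here; the point is only that data which never leaves the server is not limited to JSON types.

```python
import json
import pickle
from datetime import datetime

# A set and a datetime: both common in app state, neither JSON-serializable.
state = {"selected": {"a", "b"}, "updated": datetime(2018, 8, 3)}

try:
    json.dumps(state)
    json_ok = True
except TypeError:
    json_ok = False

# The same object survives a pickle round trip unchanged.
restored = pickle.loads(pickle.dumps(state))

print(json_ok, restored == state)
```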

@ned2
Contributor

ned2 commented Aug 31, 2018

> If there was a nice global dict object, then users could figure out how to assign items on a per-user basis to it, e.g. dash.datastore[dash.user_id] = some_dataframe(dash.user_id)

This made me think of the Flask Request Context and Application Context.

I wonder if a global context mechanism like these could be used, or if we could potentially even piggyback off them, since Dash is already tied to Flask.

@prasadovhal

I have used JSON to transfer a dataframe. This method is breaking my app: it works well at the start, then after about 30 seconds it starts to lag, and then my computer freezes.
Is there any solution for this?
I have also tried SQL format, but with that format I lose information, e.g. the attribute names are replaced by numbers.
Is there any solution for this one too?

@gvwilson gvwilson self-assigned this Jul 17, 2024
@gvwilson gvwilson removed their assignment Aug 2, 2024
@gvwilson
Contributor

gvwilson commented Aug 7, 2024

@T4rk1n is this one still relevant or do we have more recent solutions?

@gvwilson gvwilson added P3 not needed for current cycle feature something new and removed Status: Discussion Needed labels Aug 13, 2024

9 participants