
Duplicate callback outputs with identical inputs (and potentially different states) fail #2486

Open
RenaudLN opened this issue Mar 31, 2023 · 13 comments

@RenaudLN
Contributor

RenaudLN commented Mar 31, 2023

Environment

dash                 2.9.2

Describe the bug

When two callbacks are created with a duplicate output (allow_duplicate=True) and the same Inputs (their States can differ), Dash raises a "Duplicate callback outputs" error.

from dash import Input, Output, Patch, State, callback

@callback(
    Output("deck", "data", allow_duplicate=True),
    Input("filter1", "value"),
    Input("filter2", "value"),
    State("state1", "value),
    prevent_initial_call=True,
)
def add_layer1(filter1, filter2, state1):
    layer1 = load_layer1(filter1, filter2, state1)
    update = Patch()
    update.layers[0] = layer1
    return update


@callback(
    Output("deck", "data", allow_duplicate=True),
    Input("filter1", "value"),
    Input("filter2", "value"),
    State("state2", "value),
    prevent_initial_call=True,
)
def add_layer2(filter1, filter2, state2):
    layer2 = load_layer2(filter1, filter2, state2)
    update = Patch()
    update.layers[1] = layer2
    return update 

Expected behavior

This should either:

  • work; the use case is to load the data in several chunks/layers and return each with Patch so that things start displaying on the screen sooner
  • or give a descriptive error message (something like "Several callbacks with duplicate Outputs have identical Inputs")

I assume this is due to a hash computed from the Inputs only; maybe it could include the decorated function's name and the States as well?

Current workaround: adding a dummy Input to the callback saves the day (sketched below).
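
For reference, a minimal sketch of that dummy-Input workaround; the "dummy" component id is a hypothetical placeholder (e.g. a dcc.Store whose value never changes), and load_layer1 is the same placeholder helper as in the report:

from dash import Input, Output, Patch, State, callback

# The extra Input points at a hidden component (e.g. dcc.Store(id="dummy"))
# whose value never changes; its only purpose is to make this callback's
# set of Inputs different from the other callback's.
@callback(
    Output("deck", "data", allow_duplicate=True),
    Input("filter1", "value"),
    Input("filter2", "value"),
    Input("dummy", "data"),  # dummy Input (hypothetical component id)
    State("state1", "value"),
    prevent_initial_call=True,
)
def add_layer1(filter1, filter2, _dummy, state1):
    layer1 = load_layer1(filter1, filter2, state1)
    update = Patch()
    update.layers[0] = layer1
    return update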

Screenshots

[screenshot: the "Duplicate callback outputs" error message]

@T4rk1n
Contributor

T4rk1n commented Mar 31, 2023

This is the same callback defined twice; only the State differs, and both will fire at the same time. They should be combined for performance reasons so there is only one request.

Duplicate outputs now create distinct callbacks based on the Inputs, so each callback should have a different set of Inputs.
I am not really sure we should fix this, as I think it creates an undesirable effect: if the returned object is not a Patch, there is no guarantee which one will update first without a priority system. Maybe add a more comprehensive error instead.
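
For illustration, a minimal sketch of the combined callback described here, reusing the placeholder load_layer1/load_layer2 helpers from the report:

from dash import Input, Output, Patch, State, callback

@callback(
    Output("deck", "data", allow_duplicate=True),
    Input("filter1", "value"),
    Input("filter2", "value"),
    State("state1", "value"),
    State("state2", "value"),
    prevent_initial_call=True,
)
def add_layers(filter1, filter2, state1, state2):
    # A single request updates both layers; the trade-off is that nothing
    # renders until the slower of the two loads has finished.
    update = Patch()
    update.layers[0] = load_layer1(filter1, filter2, state1)
    update.layers[1] = load_layer2(filter1, filter2, state2)
    return update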

@RenaudLN
Contributor Author

I'm fine with a different error message; the current one is just really unhelpful for solving the problem :)

@alexcjohnson
Collaborator

alexcjohnson commented May 31, 2023

An interesting point from @tlauli, merging this in from #2546:

I think that two callbacks with matching signatures and allow_duplicate=True on all Outputs should be allowed.
My use case would be using Patch to update individual traces of a dcc.Graph. Each callback would update a different trace, so there is no ambiguity in the final state. It also allows the callbacks to run in parallel, improving responsiveness.

Being able to parallelize a callback this way is a really interesting use case. Unfortunately it's also one that potentially breaks many of the things we could do to disambiguate these callbacks in our internal tracking, which needs to be deterministic ESPECIALLY when we're talking about parallelizing, since that implies multiple processes. Consider this pattern, which might look natural for this case:

def create_trace_callback(trace_num):
    @callback(
        Output("graph", "figure", allow_duplicate=True),
        Input("control", "value"),
        prevent_initial_call=True,  # required when allow_duplicate=True
    )
    def update_one_trace(value):
        update = Patch().data[trace_num]
        update.x, update.y = get_new_data(value, index=trace_num)
        return update

for i in range(n_traces):
    create_trace_callback(i)

(edit: example updated; @T4rk1n points out that my original syntax of creating the callback directly in the for loop would leave i at its final value in all cases by the time the callback executes)
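
As an aside, a minimal plain-Python illustration of the late-binding behavior that edit refers to:

# Closures created directly in a loop all see the loop variable's final value
# when they eventually run.
fns = [lambda: i for i in range(3)]
print([f() for f in fns])  # [2, 2, 2]

# Binding the value through a factory function (as in the example above)
# captures each value at creation time.
fns = [(lambda n: (lambda: n))(i) for i in range(3)]
print([f() for f in fns])  # [0, 1, 2]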

These would all be exactly the same function, just with a different value of the variable i from the enclosing scope. So what do we use to create the unique ID for tracking each instance of this callback separately? The only solution that's apparent to me is to let you specify a disambiguation key in the callback definition, and then create a nice error message that says something like "You have duplicate callbacks with identical inputs and outputs. If this is really what you want, please add a non-random key to distinguish these", which would look something like:

@callback(Output("graph", "figure", allow_duplicate=True), Input("control", "value"), key=i)

@T4rk1n
Contributor

T4rk1n commented May 31, 2023

I intended allow_duplicate to really allow any duplicate; if it returns a Patch object there should be no problem with response order, but it would trigger a 2x render.

Initially it created a random UUID for the duplicate output, but with multiple workers and server restarts the UUID would not be the same and the callback would not be found. Thus the id for the output needs to be deterministic, and it is now created from a hash of the callback's Inputs.

A workaround is to change the order of the inputs or add dummy inputs that are different across the callbacks.

I think the States could be added to the hash; that would cover the case reported in this issue, where the callbacks share the same Inputs and differ only in their States.

@callback(Output("graph", "figure", allow_duplicate=True), Input("control", "value"), key=i)

That could be a good solution; it would use the key instead of the hash.
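
As a rough, non-authoritative illustration of the deterministic-id idea being discussed (this is not Dash's actual implementation; duplicate_output_id is a hypothetical helper): derive the duplicate-output id from a hash over the Inputs plus the States, or from an explicit key when one is given.

import hashlib
import json

def duplicate_output_id(inputs, states, key=None):
    """Illustrative only: build a deterministic id for a duplicate output
    from the callback's Input/State specs, or from an explicit key.
    No randomness, so it is stable across workers and server restarts."""
    seed = json.dumps(key if key is not None else {"inputs": inputs, "states": states},
                      sort_keys=True)
    return hashlib.sha1(seed.encode()).hexdigest()[:8]

# Two callbacks with the same Inputs but different States would get distinct ids.
assert duplicate_output_id([("filter1", "value")], [("state1", "value")]) != \
       duplicate_output_id([("filter1", "value")], [("state2", "value")])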

@alexcjohnson
Collaborator

Or include the key in the hash, in case you have other callbacks with the same outputs but different inputs.

but it would trigger 2x render

Does it? In principle we should be able to get around that, detecting that another callback is in progress for the same output and waiting to rerender until that's done.

@T4rk1n
Contributor

T4rk1n commented May 31, 2023

Kinda; React does handle batch updates, and it changed to be more automatic in React 18.
We have something in our code that handles this, I think for identical callbacks only.

@tlauli

tlauli commented Jun 5, 2023

Triggering the render multiple times might be desirable. Back to the example of updating multiple traces of a dcc.Graph: triggering the render only once would mean waiting until all traces are ready before drawing any of them, right? And if one trace takes a significant amount of time to get (a slow DB call, post-processing, ...), then all the other traces would be delayed by the one slow trace.

@olejorgenb

olejorgenb commented Nov 7, 2023

Note that this also happens when the callbacks have different outputs in addition to the common duplicated output.

I guess I see the argument that they might as well be combined, but this can make the code less organized IMO. One example would be when an Interval is the input and multiple mostly independent things should happen regularly (sketched after this comment). The logic inside these callbacks could be such that the same input rarely triggers writes to the same output at the same time (and when it does, the only expense is some extra compute).

A better error message at the very least would be helpful.

And if I understand correctly, this does not trigger if the inputs are not exactly the same? The exact same issues can easily happen for such callbacks, so I'm not sure why there should be an exception for the case where the input set is exactly the same.

(adding the error message in plaintext for searchability: "Output 1 (...) is already in use. To resolve this, set allow_duplicate=True on
duplicate outputs")

(Also note that it's OK to have two such callbacks if one of them uses a plain Output and the other an Output with allow_duplicate=True)
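
A minimal sketch of the Interval scenario described above, with hypothetical component ids and helpers (check_health, fetch_metrics, build_figure): two mostly independent jobs fire on the same tick, and each patches its own slice of a shared store.

from dash import Input, Output, Patch, callback

@callback(
    Output("status-store", "data", allow_duplicate=True),
    Input("poll-interval", "n_intervals"),
    prevent_initial_call=True,
)
def refresh_health(_n):
    update = Patch()
    update["health"] = check_health()  # hypothetical helper
    return update

@callback(
    Output("status-store", "data", allow_duplicate=True),
    Output("metrics-graph", "figure"),  # extra output does not avoid the error
    Input("poll-interval", "n_intervals"),
    prevent_initial_call=True,
)
def refresh_metrics(_n):
    metrics = fetch_metrics()  # hypothetical helper
    update = Patch()
    update["metrics"] = metrics
    return update, build_figure(metrics)  # hypothetical helper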

@Coding-with-Adam

Thank you @olejorgenb and @tlauli for your comments.
Making note of possible ways for us to move forward:

  • Include State in the hash
  • Improve error message
  • Add an arbitrary extra key (folks who prefer keeping identical Inputs so the callbacks run in parallel won't have to add extra Inputs)

@franekp

franekp commented Feb 7, 2024

I have an input (a button in this case) and 2 callbacks. The first callback is fast and inserts the "loading..." indicator into the UI. The second, slow one clears the loading indicator and displays results.

I believe this is a valid use-case for having 2 callbacks with exactly the same inputs and a duplicate output.

At the moment this seems to be the simplest way to show a loading indicator for a callback that runs in the same process as the main app.
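
For concreteness, a sketch of the pattern described here, with hypothetical component ids and a hypothetical compute_results helper; today this combination is rejected with the "Duplicate callback outputs" error:

from dash import Input, Output, State, callback

# Fast callback: show a loading indicator as soon as the button is clicked.
@callback(
    Output("results", "children", allow_duplicate=True),
    Input("run-button", "n_clicks"),
    prevent_initial_call=True,
)
def show_loading(_n_clicks):
    return "Loading..."

# Slow callback: same Input and the same duplicated Output, so it currently
# collides with the callback above.
@callback(
    Output("results", "children", allow_duplicate=True),
    Input("run-button", "n_clicks"),
    State("query", "value"),
    prevent_initial_call=True,
)
def run_query(_n_clicks, query):
    return compute_results(query)  # hypothetical slow helper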

@T4rk1n
Contributor

T4rk1n commented Feb 7, 2024

@franekp We have loading states for that: https://dash.plotly.com/loading-states, and you can achieve the same behavior with the running argument of background callbacks: each entry is an Output, then the value while running, and finally the value after running.

Example:

running=[(Output("status", "children"), "Running", "Finished")],
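
For context, a minimal self-contained sketch of a background callback using running; the component ids, the slow_computation helper, and the diskcache setup are illustrative:

import diskcache
from dash import Dash, DiskcacheManager, Input, Output, callback, html

cache = diskcache.Cache("./cache")
app = Dash(__name__, background_callback_manager=DiskcacheManager(cache))
app.layout = html.Div([
    html.Button("Run", id="run-button"),
    html.Div(id="status"),
    html.Div(id="results"),
])

@callback(
    Output("results", "children"),
    Input("run-button", "n_clicks"),
    background=True,
    running=[(Output("status", "children"), "Running", "Finished")],
    prevent_initial_call=True,
)
def run_long_job(n_clicks):
    # "status" shows "Running" while this executes and "Finished" once done.
    return slow_computation(n_clicks)  # hypothetical slow helper

if __name__ == "__main__":
    app.run(debug=True)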

@franekp

franekp commented Feb 7, 2024

Oh, this is nice, thanks!

@gvwilson
Contributor

@T4rk1n is this one still relevant?

@gvwilson gvwilson added P3 not needed for current cycle bug something broken labels Aug 13, 2024
@gvwilson gvwilson changed the title [BUG] Duplicate callback outputs with identical inputs (and potentially different states) fail Duplicate callback outputs with identical inputs (and potentially different states) fail Aug 13, 2024