Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow user to send inputs as dictionary #2550

Merged
merged 15 commits into from Oct 28, 2022
Merged

Allow user to send inputs as dictionary #2550

merged 15 commits into from Oct 28, 2022

Conversation

aliabid94
Copy link
Collaborator

This is an implementation of a new API that allows users to not have to specify input and output components anymore. The way it works is, if an event listener has inputs='all', then all the IOComponents in the app are inputs to this function. The function now only receives a single input - a dictionary that maps components to their values. To change values as output, simply edit this dictionary - gradio will diff to find the edits made to the dictionary and send those back only. This fixes much of the existing ugliness with the Gradio API, and I think could eventually become the default API of event listeners.

Example code in demo/calculator_all_inputs/run.py, snippet copied below:

with gr.Blocks() as demo:
    a = gr.Number(label="a")
    b = gr.Number(label="b")
    add_btn = gr.Button("Add")
    c = gr.Number(label="sum")

    def add(data):
        data[c] = data[a] + data[b]

    add_btn.click(add, inputs="all")

Earlier I listed the problems with our existing API, copied again below:

  1. Inputs defined in 3 different places (event listener, function header, and wherever inputs are used in function body)
  2. Each input component also needs a second name as a function arg. Sometimes this can be the same name as the variable name of the component, but this will overwrite access to the original component. If different names are used, then confusion from 2 different names.
  3. Outputs defined in 2 different places (event listener, return statement)
  4. Outputs can only be set in the return statement, so all outputs must be set simultaneously
  5. Must collect all input data from frontend at once (cannot collect conditionally)

This API would fix 1-4.

Drawbacks of this approach:

  • Requires frontend to send ALL component data to the backend. This could be fixed keeping a copy of the data always in the backend, and on every data submit to the frontend, we only send what has changed since the last submit.
  • We no longer know the inputs/outputs of this type of event listener beforehand. Side effects of this:
    • ‘interactive’ will have to be defined manually for components
    • Cannot automatically generate API documentation for this type of event listener
    • Will not have loading animation on outputs b/c we don’t know what they are (maybe we can animate the triggering component, e.g. the button clicked itself instead?)

Fixes: #2471

@pngwn
Copy link
Member

pngwn commented Oct 27, 2022

I don't understand where this data goes, what is the output (edit: i worked it out as i looked more)? I think this looks cleaner but i don't think it is really easier to understand due to the magic that is happening, i think we need to be careful not to conflate 'clean' with 'simple'.

Additionally, I think the drawbacks of this approach degrade the UX, and potentially the DX, so much that I really think we should consider other approaches. Typically users will stick with defaults and the 'defaults' in this case lead to a much worse experience for the end user. If app authors do decide to annotate certain parts of the app then we are back to more manual work for them which reduces the value of a leaner API like this. The fact that we will lose the loading notifications and the generated API documentation feels like a bigger issue.

I'm also not entire sure i agree with some of these 'pain points', for example:

Inputs defined in 3 different places (event listener, function header, and wherever inputs are used in function body)

Inputs aren't really defined in three places. The function we write is just a callback with a specific signature, which naturally means the parameters need defining. This isn't a redefinition of the inputs, it's just the nature of callbacks. Passing all of the args as a dict doesn't really mean we don't need to make those function definitions, they're just hidden in a dict. We'd still need to pull them off the dict in order to do anything with them, so I'm not sure we gain much. I'm not against the components receiving the args as a dict, i think it is probably simpler to not rely on order, especially when there are more than 2/3 inputs but suggesting this change removes the need to define them is a little misleading imo, it is just a change in signature which we should probably do more widely to make things easier.

This 'all' API really only removes one instance of passing that variable, potentially it could remove the need to assign it to a variable but we would need to figure out how to generate a 'name' for that component then. The other changes are due to the signature being different. I actually think this is another downside because we introduce a different event handler API depending on whether or not this flag is set. I would rather we change all of the function signatures rather than having a special case like this. Since this would be a breaking change we could also take care of this issue which would also be a huge improvement.

Likewise, i don't really think returning some data is a redefinition of the outputs.

I'm not really sure how i feel about mutating the dict directly and doing a deep diff or the keys, i think it would be better to return a new dict with only the keys you care about set, then we can send a partial dict down to the frontend and it can be merged there, this will reduce the payload that is sent back at least and avoid mutating inputs (which can introduce nasty bugs, make things harder to test and harder to reason about). eg:

# state is {"a": 1, "b":2, "c": 3}
fn predict(state) {
  return {
    "c": state["a"] + state["b"] + state["c"]
  }
}

We could just send this partial state down with some additional metadata and merge it on the frontend easily enough. In some instances it may be technically possible to figure out what the return values are in order to know what the output for a given function is without calling it but due to the dynamic nature of python it isn't foolproof.

@aliabid94
Copy link
Collaborator Author

aliabid94 commented Oct 27, 2022

After the discussion, I agree that much of the existing confusion would simply solved by allowing inputs to take a dictionary. The code above could become:

with gr.Blocks() as demo:
    a = gr.Number(label="a")
    b = gr.Number(label="b")
    add_btn = gr.Button("Add")
    c = gr.Number(label="sum")

    def add(data):
        return {c: data[a] + data[b]}
    add_btn.click(add, [a,b], [c], input_format=dict)

I think it would be a good idea to push for the input_format=dict to become the default way of setting inputs, but it would not be backwards compatible to set that as the kwarg default value. What if we also allow users to bind event listeners via decorators, and that method uses dict as default? E.g.

with gr.Blocks() as demo:
    a = gr.Number(label="a")
    b = gr.Number(label="b")
    add_btn = gr.Button("Add")
    c = gr.Number(label="sum")

    @add_btn.click([a,b], [c])
    def add(data):
        return {c: data[a] + data[b]}

I think this is the cleanest approach, and has none of the drawbacks listed earlier. (Will have to see if its possible to use a decorator like this when we've already defined add_btn.click() to normally to have a function as its first param)

@aliabid94
Copy link
Collaborator Author

aliabid94 commented Oct 27, 2022

Actually there might be an intuitive way (that might be too "clever") to distinguish b/w list input vs dict input.
List mode:

add_btn.click(add, [a, b], [c], input_format=dict)

Dict mode:

add_btn.click(add, {a, b}, [c], input_format=dict)

Basically we pass in a set using curly brackets for "inputs as dictionary" mode . This matches the curly brackets of a dictionary, and makes sense as the keys of a dictionary are a set. So [input1, input2] refers to list input, and {input1, input2} refers to dict input.

@abidlabs
Copy link
Member

Actually there might be an intuitive way (that might be too "clever") to distinguish b/w list input vs dict input.

You can just drop the input_format=dict in these two examples, right?

@aliabid94
Copy link
Collaborator Author

Yep

@abidlabs
Copy link
Member

abidlabs commented Oct 27, 2022

To get the similar behavior as we started out with (with inputs="all"), we could even define a Blocks.components variable that is a set of all of the components in a given Blocks. Then, you could pass in

with gr.Blocks() as demo:
  ...
  add_btn.click(add, demo.components, [c])

to pass in all of the Blocks components if you're working with a lot of components and you don't want to deal with adding them in both places

@freddyaboulton
Copy link
Collaborator

freddyaboulton commented Oct 27, 2022

Looks cool!

If you use the "set syntax"

add_btn.click(add, {a, b}, [c])

You can still return/yield like you can right now right? You don't have to return a dict and can still return a single value, a tuple of values, or a dict.

Basically this PR only impacts how the inputs of an event are defined?

@freddyaboulton
Copy link
Collaborator

freddyaboulton commented Oct 27, 2022

I don't think we should implement demo.components or any other shorthand for "pass me all components" because we would still suffer from these downsides @aliabid94 noted, right?

We no longer know the inputs/outputs of this type of event listener beforehand. Side effects of this:
‘interactive’ will have to be defined manually for components
Cannot automatically generate API documentation for this type of event listener

@abidlabs
Copy link
Member

abidlabs commented Oct 27, 2022

There are use cases where I think the trade off is worth it! For example, it will significantly make the DX easier for complex GUIs like Stable Diffusion Web UI where you don’t really expect anyone to use the Gradio app’s REST API. It’s quite painful to have to reference the components in multiple places

Even with the set notation, you go from having to maintain a list to maintaining a set — which is better since you don’t have to worry about the order of parameters but you still have to add components in two different places and it would be nice to have a way to avoid that (with the caveats documented)

@abidlabs
Copy link
Member

Let’s make sure this works with batch functions as well, lmk if you wanna discuss that @aliabid94

@freddyaboulton
Copy link
Collaborator

There are use cases where I think the trade off is worth it!

I think that is true @abidlabs ! I'm worried about losing automatic interactivity though. Seems like it's a big loss in DX if components that were previously interactive by default are no longer. Not sure how frequently that will come up in practice though.

I think the progress animations will still be preserved if we all blocks.components since the outputs are still explicitly defined, which is good!

@freddyaboulton
Copy link
Collaborator

Very cool proposal @aliabid94 ! I worry about adding another input type to our event signature but since it's an opt-in feature, I think it's worth it to implement so we can get some validation from users about what the new api should be for 4.0 ahead of time. Let's implement and convert some demos to see how we like it!

@github-actions
Copy link
Contributor

All the demos for this PR have been deployed at https://huggingface.co/spaces/gradio-pr-deploys/pr-2550-all-demos

@aliabid94
Copy link
Collaborator Author

I agree with @freddyaboulton, let's not deal with sending all inputs right now, there are many tradeoffs to consider there. Let's keep the scope of this PR small, and figure out the larger API changes we want to make as part of 4.0. Right now, this PR should simplify complex apps a good deal without any tradeoffs.

@aliabid94 aliabid94 marked this pull request as ready for review October 28, 2022 04:07
@aliabid94
Copy link
Collaborator Author

Ready for review

@aliabid94 aliabid94 changed the title Allow user to send all inputs, and get state dictionary Allow user to send inputs as dictionary Oct 28, 2022
@pngwn
Copy link
Member

pngwn commented Oct 28, 2022

Agree with @aliabid94 and @freddyaboulton about holding off on applying all components as inputs/outputs.

I like the new api proposal, i think it should be straightforward enough to follow due to the syntactic similarities between sets + dicts. My only question is if users wanted to rename the keys of their inputs to the function, is there anything stopping us supporting sets and dictionaries? A user may not want the key names to match the variable names.

Small thing though, and I'm happy for this to go in.

@d8ahazard
Copy link

d8ahazard commented Oct 28, 2022

Please merge this?

I have a function that needs to pass like, 23 ui values back and forth, and this would make my life so much easier. ;)

@aliabid94
Copy link
Collaborator Author

@pngwn Do you mean passing a dictionary to inputs= that remaps the components to string keys? I don't think we need to expand this API for this use case just yet, maybe if we hear about it?

@d8ahazard you got it!

@aliabid94 aliabid94 merged commit 10552cb into main Oct 28, 2022
@aliabid94 aliabid94 deleted the state_event_listener branch October 28, 2022 17:52
@d8ahazard
Copy link

d8ahazard commented Oct 28, 2022 via email

@freddyaboulton freddyaboulton mentioned this pull request Oct 28, 2022
7 tasks
@vzakharov
Copy link
Contributor

This is great! Is there any way to add an additional parameter, though, key or something, to avoid using the labels for that? Labels are best when they are descriptive, not just some snake_case.

@aliabid94
Copy link
Collaborator Author

This is great! Is there any way to add an additional parameter, though, key or something, to avoid using the labels for that? Labels are best when they are descriptive, not just some snake_case.

Sorry @vzakharov I didn't understand, could you post an example snippet of what you are suggesting?

@aliabid94
Copy link
Collaborator Author

aliabid94 commented Oct 28, 2022

Sweet! Any idea when this will hit pypi?

@d8ahazard sometime today, will keep you updated!

EDIT: Now in 3.8

@An-u-rag
Copy link

An-u-rag commented Aug 1, 2023

Hi @aliabid94 , I am just looking into Gradio and trying to use it to build an interactive UI for my simple training and testing of a few models. I tried to use Blocks to specify the args that my function takes in as a dictionary input with keys as the parameter names and values as their user input values. The problem I am running into however is with components used as keys. That part makes no sense to me. My applications spans over a few different scripts calling each other and obviously I don't have the variable scope for the components in other scripts.

When I do something like this:

model = gr.Textbox(label="model")
dataPath = gr.Textbox(label="dataPath")
test = gr.Button("Test")
status = gr.Textbox()

inputs = {model, dataPath}
test_button.click(test.main, inputs=inputs, outputs=status)

In my test.main(args), my input args looks like this:
{textbox = "unet", textbox = "./data"}
What I expect it to look like:
{model = "unet", dataPath = "./data"}

SO I tried something else,

model = gr.Textbox(label="model")
dataPath = gr.Textbox(label="dataPath")
test = gr.Button("Test")
status = gr.Textbox()

inputs = {
"model": model, 
"dataPath": dataPath
}
test_button.click(test.main, inputs=inputs, outputs=status)

Now this gives me an error that Gradio can't find the attribute "_id" in the dict. I am assuming that Gradio uses that attribute to figure out the component type. How can I achieve what I want without breaking this?

@pngwn Do you mean passing a dictionary to inputs= that remaps the components to string keys? I don't think we need to expand this API for this use case just yet, maybe if we hear about it?

Pretty sure this is what I mean.

@aisu-wata0
Copy link

aisu-wata0 commented Aug 11, 2023

Changing inputs to a dict, my function receives a dictionary with multiple equal keys, I thought this wasn't even possible.
Is this expected?

inputs = {
    input_text,
    input_speaker,
    lang,
    noise_scale,
    noise_scale_w,
    length_scale,
    symbol_input,
}

btn.click(tts_fn, inputs=inputs, outputs=[out_message, out_audio], api_name=api_name)

# ...

# Printing the inputs (the first) argument on the function call (inside `tts_fn` registered above):

inputs={textbox: 'ザコザコザコ', dropdown: None, dropdown: 1, checkbox: False, slider: 0.6, slider: 0.668, slider: 1}`
type(inputs)=<class 'dict'>
inputs.keys()=dict_keys([textbox, dropdown, dropdown, checkbox, slider, slider, slider])
#            for k in text.keys():
#                print(f"inputs['{k}'] =", text[k])
inputs['textbox'] = ザコザコザコ
inputs['dropdown'] = None
inputs['dropdown'] = None
inputs['checkbox'] = False
inputs['slider'] = 0.6
inputs['slider'] = 0.668
inputs['slider'] = 1

How do these last 3 keys even work?

(Actually I got here because I wanted to change API to named arguments, through dictionaries/json, changing it from it being a list on data. This doesn't actually do that?)
(gradio Version: 3.39.0)

@freddyaboulton
Copy link
Collaborator

The keys are the component instances - it's just that the string representation of a component is just a component name

@aisu-wata0
Copy link

ah, assumed they would be strings because my mind was on the API, thanks

@wasidy
Copy link

wasidy commented Feb 17, 2024

I still can not use custom names for dict keys.
'dict' object has no attribute '_id'

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Pass in input parameters as a dict whose keys are all of the input components
9 participants