Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make detection of files in gradio_client.Client more robust #7360

Closed
abidlabs opened this issue Feb 8, 2024 · 3 comments · Fixed by #7575
Closed

Make detection of files in gradio_client.Client more robust #7360

abidlabs opened this issue Feb 8, 2024 · 3 comments · Fixed by #7575
Assignees
Labels
gradio_client Related to the one of the gradio client libraries needs designing The proposed feature needs to be discussed and designed before being implemented python Backend-related issue (Python)

Comments

@abidlabs
Copy link
Member

abidlabs commented Feb 8, 2024

The gradio_client.Client accepts string filepaths which are converted to file objects and uploaded to the server if they correspond to an input component that accepts files. However, this detection is not very robust.

Specifically, if any part of the input component payload can be a file, all parts of the payload are treated as potentially containing files. This leads to issues such as this:

import gradio as gr

with gr.Blocks() as demo:
    chatbot = gr.Chatbot()
    msg = gr.Textbox()
    clear = gr.Button("Clear")
    st = gr.State([1, 2, 3])

    def respond(message, st, chat_history):
        assert st[0] == 1 and st[1] == 2 and st[2] == 3
        bot_message = "I love you"
        chat_history.append((message, bot_message))
        return "", chat_history

    msg.submit(respond, [msg, st, chatbot], [msg, chatbot], api_name="submit")
    clear.click(lambda: None, None, chatbot, queue=False)
    
_, url, _ = demo.launch(inline=False)

from gradio_client import Client

client = Client(url)

initial_history = [["", None]]
message = "Hello"
ret = client.predict(message, initial_history, api_name="/submit")

This produces an error because the "" in the initial_history is treated as a potential file and is tried to be uploaded to the server. We should really validate against the API info to figure out which part of the payload could be a file and which format it needs to be in. We should only upload those files, from a security and robustness perspective. Or we need to think of something else altogether

@abidlabs abidlabs added python Backend-related issue (Python) gradio_client Related to the one of the gradio client libraries needs designing The proposed feature needs to be discussed and designed before being implemented labels Feb 8, 2024
@abidlabs abidlabs added this to the Clients 1.0 📡 milestone Feb 10, 2024
@abidlabs abidlabs self-assigned this Feb 12, 2024
@abidlabs
Copy link
Member Author

Ok so I'll work on writing a function to traverse a given value based on an arbitrary type hint and identify all parts of that value that could potentially match a FileData in the type hint. Its probably not going to be perfect, but it'll be better than what we have now. But I'm planning on making one big change. Right now, the process of detecting and uploading files happens like this:

  1. The Client decides whether a particular string could potentially be a file
  2. If it is, then it uploads it to the server and replaces the string with a FileData object
  3. It then sends the FileData object to the prediction API

Instead, I'd like to propose we do this:

  1. The Client sends the data as is to the Gradio prediction API
  2. As part of preprocess(), we figure out whether whether a particular string should really be a FileData object. Because we have access to the input components and their type hints as actual Python Types instead of just strings, its a little bit easier
  3. If indeed a particular string should be a FileData, then the Gradio server requests the client to upload the file and a FileData object is generated using the uploaded filepath

The big advantage of doing it this way is that we don't have to implement this logic in both of the clients separately. It may also help us w.r.t. #7407

cc @freddyaboulton @aliabid94 for your thoughts

@juanluisrto
Copy link

Not sure but I think this issue I just opened is related

@abidlabs
Copy link
Member Author

Yes I think its related @juanluisrto

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
gradio_client Related to the one of the gradio client libraries needs designing The proposed feature needs to be discussed and designed before being implemented python Backend-related issue (Python)
Projects
None yet
2 participants