Angle brackets break chatbot #7198

pseudotensor · 2024-01-28T09:03:34Z

Describe the bug

A user of h2oGPT reported this: h2oai/h2ogpt#1330

It is easily reproduced; ask pretty much any LLM:

You always wrap every word in your answer with angle brackets.
What is an apple?

More generally, <> may appear nested or in other contexts, and in each case the contents of <> are removed.

I know the rendering is complex, but we agreed before, with markdown etc., that we expect things to be rendered as faithfully as possible to the original output: improved, but not made worse.

This case seems definitely pretty bad.

Have you searched existing issues? 🔎

I have searched and found no existing issues

Reproduction

import random
import time


def server():
    import gradio as gr

    with gr.Blocks() as demo:

        def respond(message, chat_history):
            bot_message = random.choice(["<I am an artificial intelligent breaker of gradio chatbots.>"])
            chat_history.append((message, bot_message))
            time.sleep(0.01)
            return "", chat_history

        chatbot = gr.Chatbot()
        msg = gr.Textbox()
        msg.submit(respond, [msg, chatbot], [msg, chatbot])

    demo.launch()


if __name__ == "__main__":
    server()

Screenshot

Logs

No response

System Info

both gradio 3.50.2 and gradio 4.16.0

Severity

Blocking usage of gradio

abidlabs · 2024-01-29T13:01:23Z

Thanks @pseudotensor. This is because the text within the angle brackets is being treated as HTML, and marked.js is removing it when it renders the Markdown. We are not likely to change this behavior, as we do not want to get into the weeds of rendering Markdown ourselves. I'd suggest that you escape the angle brackets as part of your processing function, e.g. this works for me:

import random
import time


def server():
    import gradio as gr

    with gr.Blocks() as demo:

        def respond(message, chat_history):
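            # Raw string: the backslashes reach the Markdown renderer without relying on an invalid Python escape sequence.
            bot_message = random.choice([r"\<I am an artificial intelligent breaker of gradio chatbots.\>"])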
            chat_history.append((message, bot_message))
            time.sleep(0.01)
            return "", chat_history

        chatbot = gr.Chatbot()
        msg = gr.Textbox()
        msg.submit(respond, [msg, chatbot], [msg, chatbot])

    demo.launch()


if __name__ == "__main__":
    server()

Another option would be to disable Markdown rendering altogether by setting render_markdown=False in gr.Chatbot() (which I understand you probably do not want to do, but it may benefit other readers of this issue).
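For readers who want the escaping done inside the processing function rather than hard-coded into a string, here is a minimal sketch. The helper name escape_angle_brackets is hypothetical, not part of gradio, and it is deliberately naive: it escapes every bracket, including those inside code.

```python
def escape_angle_brackets(text: str) -> str:
    """Backslash-escape every '<' and '>' so a Markdown renderer shows them literally."""
    return text.replace("<", "\\<").replace(">", "\\>")


bot_message = escape_angle_brackets("<I am a breaker of chatbots.>")
print(bot_message)  # prints: \<I am a breaker of chatbots.\>
```

In the reproduction above, this would be applied to bot_message before it is appended to chat_history.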

pseudotensor · 2024-01-29T15:44:48Z

Ok then I guess the rendering for the chatbot will remain half-baked. In some places you guys put some effort, but it's not consistent.

Also, I'm betting that your solution won't work in general, which is the usual problem. With $ or new lines or other things, such hacks didn't work before because < can appear in other contexts.

For example, your hack won't work because < can appear in code blocks, and then one cannot replace it with the escaped version, or else the escape character will appear literally.

These are exactly the same issues as before with new lines.

This is not obscure: the poster on h2oGPT mentioned that things showed up in code blocks just fine, but with your hack they won't.

I tried before to hack the new lines inside and outside code blocks, but it's a mess. It would be better if gradio handled this.
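To make the code-block problem concrete, here is a rough sketch of escaping angle brackets only outside code. It assumes well-formed, non-nested fences and simple inline code spans; the names CODE_SPAN and escape_outside_code are hypothetical. An unterminated fence, as can occur while a response is still streaming, is not recognized as code and gets escaped anyway, which is part of why this approach gets messy.

```python
import re

# Matches a fenced code block (```...```) or an inline code span (`...`).
# Assumption: fences are closed and not nested; a real Markdown parser covers more cases.
CODE_SPAN = re.compile(r"(```.*?```|`[^`\n]*`)", re.DOTALL)


def escape_outside_code(text: str) -> str:
    """Backslash-escape '<' and '>' only in the parts of text that are not code."""
    parts = CODE_SPAN.split(text)
    # With one capturing group, re.split places the code spans at odd indices.
    for i in range(0, len(parts), 2):
        parts[i] = parts[i].replace("<", "\\<").replace(">", "\\>")
    return "".join(parts)


print(escape_outside_code("An <apple> and `a < b` in code"))
# prints: An \<apple\> and `a < b` in code
```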

abidlabs · 2024-01-29T18:42:07Z

> Also, I'm betting that your solution won't work in general, which is the usual problem. With $ or new lines or other things, such hacks didn't work before because < can appear in other contexts.

> For example, your hack won't work because < can appear in code blocks, and then one cannot replace it with the escaped version, or else the escape character will appear literally.

What you're describing is exactly why gradio should not try to handle this Markdown processing logic in gr.Chatbot for all users. We're using marked, which follows a standard Markdown convention (yes, we've exposed options for things like line_breaks where marked exposes these parameters), but handling all these edge cases is outside the scope of gr.Chatbot. Instead, it falls on the developer to process the raw Markdown so that marked can parse it correctly.

pseudotensor · 2024-01-29T19:11:39Z

So you expect every developer who uses gradio to come up with an independent solution for converting LLM output to markdown that can be handled by marked? I don't think that's what is generally understood, nor is it efficient.

@oobabooga, so you have a lot of code that handles all these issues, converting LLM output to marked-compatible code?

oobabooga · 2024-01-29T20:56:31Z

Yes. As a matter of fact, I don't use gr.Chatbot. I use gr.HTML for chat outputs and custom JavaScript to handle scrolling and position the UI elements.

For markdown conversion, see: html_generator.py#L50

I also escape HTML to prevent <img> tags generated by models from rendering in the browser and generating remote requests. chat.py#L283
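As an illustration of the HTML-escaping idea (not necessarily what the linked chat.py does), the standard library's html.escape turns tag characters into entities, so a model-generated tag is displayed as text instead of being rendered. The function name sanitize_model_output and the example URL are hypothetical.

```python
import html


def sanitize_model_output(text: str) -> str:
    """Convert HTML special characters to entities so tags display as text."""
    return html.escape(text)


print(sanitize_model_output('<img src="https://example.com/x.png">'))
# prints: &lt;img src=&quot;https://example.com/x.png&quot;&gt;
```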

pseudotensor · 2024-01-29T21:00:25Z

So I think the gradio team expects every app using gr.Chatbot to write its own conversion code. I don't think that's the right way to go, but it's what I understand the suggestion to be, so everyone who wants to use gradio will have to redo @oobabooga's efforts.

pseudotensor · 2024-02-10T00:46:35Z

The hack also fails when one needs to render HTML or other elements, since one can't escape the < then. Basically I'm ending up hacking a lot of things that other users will have to do too.
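One way to picture the conflict is an allowlist: keep a few tags that should render and escape every other <. This is only a sketch under stated assumptions; ALLOWED_TAGS and escape_unknown_tags are hypothetical names, the tag set is arbitrary, and the entities would still show up literally inside code blocks.

```python
import re

# Hypothetical allowlist; which tags are safe to render is an application decision.
ALLOWED_TAGS = {"a", "b", "br", "code", "pre"}

# A '<', an optional '/', and an optional tag name.
TAG_OR_BRACKET = re.compile(r"<(/?)([A-Za-z][A-Za-z0-9]*)?")


def _keep_or_escape(match: re.Match) -> str:
    name = match.group(2)
    if name and name.lower() in ALLOWED_TAGS:
        return match.group(0)  # opening or closing tag on the allowlist: keep as-is
    return "&lt;" + match.group(0)[1:]  # anything else: show the '<' as text


def escape_unknown_tags(text: str) -> str:
    """Replace '<' with '&lt;' unless it opens or closes an allowlisted tag."""
    return TAG_OR_BRACKET.sub(_keep_or_escape, text)


print(escape_unknown_tags("<b>bold</b> but <apple> and 1 < 2"))
# prints: <b>bold</b> but &lt;apple> and 1 &lt; 2
```

Every tag that should render has to be listed, so the allowlist grows with each new kind of output the app needs to support.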

pseudotensor · 2024-02-15T17:05:59Z

Another use case is even worse: I can't easily hack around this because of the large number of HTML tags that might appear.

pseudotensor added the bug label Jan 28, 2024

pseudotensor mentioned this issue Jan 28, 2024

Contents in angle brackets are removed from responses and can break WebUI. h2oai/h2ogpt#1330

Closed

abidlabs closed this as not planned Jan 29, 2024

pseudotensor added a commit to h2oai/h2ogpt that referenced this issue Feb 10, 2024

Fixes #1330 -- work around not-fix for gradio-app/gradio#7198

16e1268

pseudotensor added a commit to h2oai/h2ogpt that referenced this issue Feb 10, 2024

More for Fixes #1330, so some basic url stuff is rendered -- work around not-fix for gradio-app/gradio#7198

d485b0b

pseudotensor mentioned this issue Feb 15, 2024

final HYDE shows duplicate answer and html tags h2oai/h2ogpt#1408

Closed

Keldos-Li mentioned this issue Mar 9, 2024

paired HTML tags in Chatbot/ChatInterface output break WebUI down #7651

Closed
