Description
Red Hat provides guardrails for LLMs. These can check whether a model is saying something inappropriate. When an LLM is producing an answer and a guardrail is hit, we should provide guidance on how to message that to the user (there are some sample guardrail API messages in the above video). In that case, the response from the LLM may never arrive. Bill Murdock noted that with streaming, it is harder to run guardrails first; some models simply remove messages. Danielle Jett suggests running the response against the guardrail first and then simulating streaming, though this introduces a delay before anything is shown.
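A minimal sketch of the "guardrail first, then simulate streaming" approach described above. `getFullResponse`, `checkGuardrail`, and `emitChunk` are hypothetical stand-ins for the real model, guardrail, and UI APIs, not actual library calls:

```typescript
type GuardrailResult = { flagged: boolean; message?: string };

// Buffer the full answer, run the guardrail on it, and only then
// "stream" the vetted text to the UI. This trades latency for safety:
// nothing flagged is ever rendered.
async function respondWithGuardrail(
  prompt: string,
  getFullResponse: (p: string) => Promise<string>, // hypothetical model call
  checkGuardrail: (text: string) => Promise<GuardrailResult>, // hypothetical guardrail call
  emitChunk: (chunk: string) => void, // hypothetical UI callback
): Promise<void> {
  // Wait for the complete answer before showing anything (this is the delay).
  const full = await getFullResponse(prompt);
  const result = await checkGuardrail(full);
  if (result.flagged) {
    // Never render the flagged answer; show guidance messaging instead.
    emitChunk(result.message ?? "This response was blocked by a safety guardrail.");
    return;
  }
  // Simulate streaming by emitting the already-vetted text word by word.
  for (const word of full.split(" ")) {
    emitChunk(word + " ");
    await new Promise((resolve) => setTimeout(resolve, 30));
  }
}
```

The per-word delay is purely cosmetic; the whole response has already been checked before the first chunk appears.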
Assuming we are talking about removing a message from the DOM, @thatblindgeye wonders whether removing a message outright would be confusing. Eric wonders whether replacing the message may be a better solution in the case where the guardrail is not run before the message is displayed.
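A sketch of the "replace rather than remove" idea: if a guardrail flags a message after it has already streamed into the DOM, swap its content for guidance messaging instead of deleting the element, so the conversation flow stays intact. The interface, attribute name, and class name are illustrative, not an existing API:

```typescript
// Structural subset of HTMLElement, so the same function works on a real
// DOM node or a test stub.
interface MessageElement {
  textContent: string | null;
  setAttribute(name: string, value: string): void;
  classList: { add(name: string): void };
}

// Replace a flagged message's content in place rather than removing the
// element, keeping the message bubble where the user saw it.
function replaceFlaggedMessage(messageEl: MessageElement, guardrailMessage: string): void {
  messageEl.textContent = guardrailMessage;
  // Hypothetical hooks for styling/announcing the replacement.
  messageEl.setAttribute("data-guardrail-flagged", "true");
  messageEl.classList.add("guardrail-replaced");
}
```

Keeping the element in place also gives screen readers and scroll position a stable target, which removal would not.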
We should do some research and provide some design guidelines around this.
Potentially relevant research:
- Has some sample guardrail responses: https://developers.redhat.com/articles/2025/05/28/implement-ai-safeguards-nodejs-and-llama-stack#using_llamaguard_and_promptguard_with_llamastack_and_node_js
- Different approaches for streaming: https://aws.amazon.com/blogs/machine-learning/build-safe-and-responsible-generative-ai-applications-with-guardrails/