Feature Request: Validating prompt response from Triton server using NeMo Guardrails #16

programmah · 2024-02-15T13:48:55Z

This feature request is about creating content that demonstrates how to connect NeMo Guardrails to a Llama-2-7b-chat TensorRT engine deployed on Triton Inference Server. This approach avoids the need for an OpenAI key and bypasses the NeMo-LLM Service when using NeMo Guardrails to guard user prompts to, and responses from, the deployed model. You can use the LangChain framework to achieve this.
The feature is required to complete the End-to-End LLM pipeline.
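
As a rough starting point, the Triton side of such a connection might look like the sketch below. It assumes the model is served through Triton's HTTP generate endpoint and that the tensor names (`text_input`, `max_tokens`, `bad_words`, `stop_words`, `text_output`) and the model name `ensemble` match the TensorRT-LLM backend's example configuration; these are assumptions, so check them against the deployed model's `config.pbtxt`. The helper names (`build_generate_request`, `http_post_json`, `triton_generate`) are hypothetical and only for illustration.

```python
import json
from typing import Callable
from urllib import request


def build_generate_request(prompt: str, max_tokens: int = 128) -> dict:
    """Build the JSON body for the generate endpoint.

    The field names are assumed from the TensorRT-LLM backend's example
    ensemble; a different deployment may use other tensor names.
    """
    return {
        "text_input": prompt,
        "max_tokens": max_tokens,
        "bad_words": "",
        "stop_words": "",
    }


def http_post_json(url: str, payload: dict) -> dict:
    """POST a JSON payload and return the decoded JSON reply."""
    data = json.dumps(payload).encode("utf-8")
    req = request.Request(
        url,
        data=data,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with request.urlopen(req, timeout=60) as resp:
        return json.loads(resp.read().decode("utf-8"))


def triton_generate(
    prompt: str,
    base_url: str = "http://localhost:8000",  # assumed Triton HTTP address
    model: str = "ensemble",                  # assumed model name
    max_tokens: int = 128,
    post: Callable[[str, dict], dict] = http_post_json,
) -> str:
    """Send a prompt to the deployed model and return the generated text."""
    url = f"{base_url}/v2/models/{model}/generate"
    reply = post(url, build_generate_request(prompt, max_tokens))
    # "text_output" is the assumed name of the output tensor.
    return reply["text_output"]
```

A custom LangChain LLM class could call `triton_generate` from its `_call` method, and that object could then be handed to NeMo Guardrails (for example via the `llm` argument of `LLMRails`), so that no OpenAI key is involved. The `post` parameter is injectable so the function can be exercised without a running server.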

programmah assigned muntasers Feb 15, 2024

krrishdholakia mentioned this issue Feb 20, 2024

Support nemo guardrails on proxy BerriAI/litellm#2070

Open
