<a href="https://colab.research.google.com/github/datastax/ragstack-ai/blob/main/examples/notebooks/nvidia.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# NVIDIA NeMo Guardrails


## Prerequisites

You will need a vector-enabled Astra database. This notebook uses OpenAI models, but NVIDIA models work as well, since NeMo Guardrails [supports all LLM providers supported by LangChain](https://github.com/NVIDIA/NeMo-Guardrails/blob/develop/docs/user_guides/configuration-guide.md#supported-llm-models).

* Create an [Astra vector database](https://docs.datastax.com/en/astra-serverless/docs/getting-started/create-db-choices.html).
* Create an [OpenAI account](https://openai.com/) and an API key.
* Within your database, create an [Astra DB Access Token](https://docs.datastax.com/en/astra-serverless/docs/manage/org/manage-tokens.html) with Database Administrator permissions.
* Get your Astra DB API endpoint:
  * `https://<ASTRA_DB_ID>-<ASTRA_DB_REGION>.apps.astra.datastax.com`

See the [Prerequisites](https://docs.datastax.com/en/ragstack/docs/prerequisites.html) page for more details.
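
The endpoint pattern listed above can be assembled from its two parts. The following is a minimal sketch: `build_astra_endpoint` is a hypothetical helper (it is not part of any Astra library), and the ID and region values are placeholders.

```python
def build_astra_endpoint(db_id: str, region: str) -> str:
    """Assemble an Astra DB API endpoint from a database ID and a region.

    Hypothetical helper for illustration only; it just fills in the URL
    pattern listed in the prerequisites.
    """
    db_id = db_id.strip()
    region = region.strip()
    if not db_id or not region:
        raise ValueError("Both the database ID and the region are required")
    return f"https://{db_id}-{region}.apps.astra.datastax.com"


# Placeholder values, not a real database.
print(build_astra_endpoint("01234567-89ab-cdef-0123-456789abcdef", "us-east1"))
```

The resulting string is what the notebook expects in `ASTRA_DB_API_ENDPOINT`.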

## Setup

In [1]:
! pip install -qU ragstack-ai datasets


[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m A new release of pip available: [0m[31;49m22.3.1[0m[39;49m -> [0m[32;49m23.3.2[0m
[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m To update, run: [0m[32;49mpip install --upgrade pip[0m


In [18]:
import nest_asyncio

nest_asyncio.apply()

In [5]:
import os
from getpass import getpass

# Enter your settings for Astra DB and OpenAI:
keys = ["ASTRA_DB_APPLICATION_TOKEN", "ASTRA_DB_API_ENDPOINT", "OPENAI_API_KEY"]
for key in keys:
    if key not in os.environ:
        os.environ[key] = getpass(f"Enter {key}: ")

In [6]:
# Collections are where documents are stored, e.g. "test".
collection = input("Collection: ")

## Create Guardrails

In [50]:
from nemoguardrails import RailsConfig
from nemoguardrails.integrations.langchain.runnable_rails import RunnableRails


# Build the NeMo Guardrails YAML configuration for a given engine and model.
def config(engine, model):
    return f"""
    models:
      - type: main
        engine: {engine}
        model: {model}

    prompts:
      - task: self_check_input
        content: |
          Your task is to check if the user message below complies with the following policies. 
                
          Policy for the user messages:
          - should not contain fruits
          - should not contain vegetables

          User message: "{{ user_input }}"

          Question: Should the user message be blocked (Yes or No)?
          Answer:
    
    rails:
      input:
        flows:
          - self check input
    """


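engine = "openai"
model_name = "gpt-3.5-turbo-16k"
# Use distinct names so the config() helper is not shadowed by its result.
yaml_content = config(engine, model_name)
rails_config = RailsConfig.from_content(
    yaml_content=yaml_content,
)
guardrails = RunnableRails(rails_config)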
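
One detail of the cell above is worth checking: the YAML is produced by an f-string, and f-strings treat `{{` and `}}` as escapes for single literal braces. The sketch below shows what each spelling renders to. Whether this affects the rail depends on how NeMo Guardrails fills in the prompt. Assuming it expects the Jinja-style `{{ user_input }}` placeholder, the four-brace spelling is the one that would preserve it.

```python
# Two braces in an f-string collapse to one literal brace.
two_braces = f'User message: "{{ user_input }}"'

# Four braces collapse to two, which keeps the Jinja-style placeholder form.
four_braces = f'User message: "{{{{ user_input }}}}"'

print(two_braces)   # User message: "{ user_input }"
print(four_braces)  # User message: "{{ user_input }}"
```

Printing the string returned by `config(engine, model_name)` is a quick way to confirm which form the rail actually receives.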

## Create RAG Pipeline

### Embedding Model and Vector Store

In [8]:
from langchain_openai import OpenAIEmbeddings

embedding = OpenAIEmbeddings()

In [9]:
from langchain.vectorstores.astradb import AstraDB

vstore = AstraDB(
    collection_name=collection,
    embedding=embedding,
    token=os.getenv("ASTRA_DB_APPLICATION_TOKEN"),
    api_endpoint=os.getenv("ASTRA_DB_API_ENDPOINT"),
)
print("Astra vector store configured")



Astra vector store configured


In [51]:
from langchain.schema.output_parser import StrOutputParser
from langchain_core.runnables import RunnablePassthrough
from langchain.prompts import PromptTemplate
from langchain_openai import ChatOpenAI


SAMPLE_DATA = [
    "MyFakeProductForTesting is a versatile testing tool designed to streamline the testing process for software developers, quality assurance professionals, and product testers. It provides a comprehensive solution for testing various aspects of applications and systems, ensuring robust performance and functionality.",  # noqa: E501
    "MyFakeProductForTesting comes equipped with an advanced dynamic test scenario generator. This feature allows users to create realistic test scenarios by simulating various user interactions, system inputs, and environmental conditions. The dynamic nature of the generator ensures that tests are not only diverse but also adaptive to changes in the application under test.",  # noqa: E501
    "The product includes an intelligent bug detection and analysis module. It not only identifies bugs and issues but also provides in-depth analysis and insights into the root causes. The system utilizes machine learning algorithms to categorize and prioritize bugs, making it easier for developers and testers to address critical issues first.",  # noqa: E501
    "MyFakeProductForTesting first release happened in June 2020.",
]

BASIC_QA_PROMPT = """
Answer the question based only on the supplied context. If you don't know the answer, say you don't know the answer.
Context: {context}
Question: {question}
Your answer:
"""

vstore.add_texts(SAMPLE_DATA)
retriever = vstore.as_retriever()

llm = ChatOpenAI(model="gpt-3.5-turbo-16k")
prompt = PromptTemplate.from_template(BASIC_QA_PROMPT)
chain = (
    {"context": retriever, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)

chain_with_rails = guardrails | chain
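
To see roughly what the model receives, the prompt-filling step can be approximated in plain Python. This is a sketch under assumptions: it uses `str.format` instead of `PromptTemplate`, `fill_prompt` is a hypothetical helper, and in the real chain the retriever supplies the context, so the exact formatting may differ.

```python
BASIC_QA_PROMPT = """
Answer the question based only on the supplied context. If you don't know the answer, say you don't know the answer.
Context: {context}
Question: {question}
Your answer:
"""


def fill_prompt(context_snippets, question):
    """Join the retrieved snippets and substitute them into the template."""
    context = "\n".join(context_snippets)
    return BASIC_QA_PROMPT.format(context=context, question=question)


# Stand-in for the retriever: one snippet taken from SAMPLE_DATA.
filled = fill_prompt(
    ["MyFakeProductForTesting first release happened in June 2020."],
    "When was MyFakeProductForTesting first released?",
)
print(filled)
```

Because the prompt restricts the model to the supplied context, questions the sample data does not cover should produce an "I don't know" style answer.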

In [52]:
response = chain_with_rails.invoke("When was MyFakeProductForTesting first released?")
# assert '2020' in response
print(response)
# Unexpected: the input rail refused a question that mentions no fruits or vegetables.

{'output': "I'm sorry, I can't respond to that."}


In [53]:
response = chain_with_rails.invoke("What color is an apple?")
# assert "I'm sorry" in response
print(response)
# Expected: the message mentions a fruit, so the input rail refuses it.

{'output': "I'm sorry, I can't respond to that."}
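
The responses printed in this notebook come in two shapes: a dict with an `output` key (as in the refusals above) and a plain string (as the bare chain returns). That matters for the commented-out assertions, because `in` on a dict checks its keys rather than its values. The sketch below normalizes both shapes; `response_text` is a hypothetical helper, not part of NeMo Guardrails or LangChain.

```python
def response_text(response) -> str:
    """Return the text of a response that may be a dict or a plain string."""
    if isinstance(response, dict):
        return str(response.get("output", ""))
    return str(response)


blocked = {"output": "I'm sorry, I can't respond to that."}
answered = "MyFakeProductForTesting was first released in June 2020."

# `in` on a dict tests its keys, so the direct check misses the refusal text.
print("I'm sorry" in blocked)                 # False
print("I'm sorry" in response_text(blocked))  # True
print("2020" in response_text(answered))      # True
```

With a helper like this, the assertions can be enabled without depending on which shape a given run returns.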


In [54]:
response = chain.invoke("What is the capital of France?")
# response = chain.invoke({"input": "What is the capital of France?"})
# assert "Paris" in response
print(response)
# No rails on this call. The context says nothing about France, so the model declines;
# a reply like "I cannot answer based on the context provided." was expected.

I don't know the answer.


In [46]:
# Now try with a slightly different prompt:
from nemoguardrails import RailsConfig
from nemoguardrails.integrations.langchain.runnable_rails import RunnableRails


def config(engine, model):
    return f"""
    models:
      - type: main
        engine: {engine}
        model: {model}

    prompts:
      - task: self_check_input
        content: |
          Your task is to check if the user message below complies with the company policy for talking with the company bot.
                
          Company Policy for the user messages:
          - should not contain fruits
          - should not contain vegetables

          User message: "{{ user_input }}"

          Question: Should the user message be blocked (Yes or No)?
          Answer:
    
    rails:
      input:
        flows:
          - self check input
    """


engine = "openai"
model_name = "gpt-3.5-turbo-16k"
# Use distinct names so the config() helper is not shadowed by its result.
yaml_content = config(engine, model_name)
rails_config = RailsConfig.from_content(
    yaml_content=yaml_content,
)
guardrails = RunnableRails(rails_config)
chain_with_rails = guardrails | chain

In [47]:
response = chain_with_rails.invoke("When was MyFakeProductForTesting first released?")
# assert '2020' in response
print(response)

MyFakeProductForTesting was first released in June 2020.


In [48]:
response = chain_with_rails.invoke("What color is an apple?")
# assert "I'm sorry" in response
print(response)
# Unexpected: the input rail let a question about a fruit through.
# Expected {"output": "I'm sorry, ... "} as per
# https://github.com/NVIDIA/NeMo-Guardrails/tree/develop/docs/user_guides/langchain/chain-with-guardrails

I don't know the answer.


In [49]:
response = chain.invoke("What is the capital of France?")
# response = chain.invoke({"input": "What is the capital of France?"})
# assert "Paris" in response
print(response)

I don't know the answer to the question.
