# Implement RAG Use Cases in watsonx.ai

In this lab you will review and run examples of LLM applications that implement the Retrieval Augmented Generation (RAG) pattern of working with LLMs. We will expand on the concepts that you learned in the previous labs.

RAG is one of the most common use cases in generative AI because it lets us work with data "external to the model", that is, data that was not used for model training. Many use cases depend on proprietary company data, which is one reason RAG is used so frequently in generative AI applications. Because answers are grounded in retrieved text, RAG also lets us add guardrails to the generated output and reduce hallucination. RAG can be combined with several generative AI use cases, including:

- Question and answer
- Summarization
- Content generation

> A "human interaction" analogy of RAG is providing a document to a person and asking them to answer questions based on the information in the document.
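
The retrieve, augment, and generate steps behind that analogy can be sketched in a few lines of plain Python. This is only an illustration under simplifying assumptions: the passages are made up, retrieval is naive word overlap rather than a vector search, and `generate` is a hypothetical stand-in for a real LLM call.

```python
import re

def tokenize(text):
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(question, passages, top_k=1):
    """Rank passages by how many words they share with the question."""
    question_words = tokenize(question)
    ranked = sorted(
        passages,
        key=lambda passage: len(question_words & tokenize(passage)),
        reverse=True,
    )
    return ranked[:top_k]

def augment(question, context_passages):
    """Build a prompt that places the retrieved text next to the question."""
    context = "\n\n".join(context_passages)
    return (
        "Answer the question using the context provided.\n\n"
        f"Context: {context}\n\n"
        f"Question: {question}\n"
    )

def generate(prompt):
    """Hypothetical stand-in for an LLM call; it only reports the prompt size."""
    return f"[an LLM would answer here, given a {len(prompt)}-character prompt]"

# Made-up passages playing the role of the "document handed to a person".
passages = [
    "The cafeteria is open from 8 am to 3 pm on weekdays.",
    "Employees accrue 1.5 vacation days per month of service.",
    "Badge access to the lab requires manager approval.",
]
question = "How many vacation days do employees accrue?"

prompt = augment(question, retrieve(question, passages))
print(prompt)
print(generate(prompt))
```

The rest of this lab follows the same three steps, with a vector database doing the retrieval and watsonx.ai doing the generation.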

To get started, we'll first install the dependencies needed to run this notebook.

Go ahead and run the following code cell. **This may take a few seconds to complete.**

In [None]:
# Install dependencies
import sys
!{sys.executable} -m pip install -q chromadb==0.4.22
!{sys.executable} -m pip install -q ibm_watson_machine_learning==1.0.342
!{sys.executable} -m pip install -q langchain==0.1.4
!{sys.executable} -m pip install -q langchain_community==0.0.15
!{sys.executable} -m pip install -q pypdf==4.0.1
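
If you want to confirm that the pinned versions are the ones actually installed, a check along the following lines may help. It is a minimal sketch using the standard library's `importlib.metadata`; the `check_versions` helper is hypothetical and not part of the lab, and it assumes the distribution names match the names passed to `pip` above.

```python
from importlib import metadata

# Versions pinned by the install cell in this lab.
PINNED = {
    "chromadb": "0.4.22",
    "ibm_watson_machine_learning": "1.0.342",
    "langchain": "0.1.4",
    "langchain_community": "0.0.15",
    "pypdf": "4.0.1",
}

def check_versions(pinned):
    """Return {package: (expected, installed)} for each mismatch or missing package."""
    problems = {}
    for name, expected in pinned.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != expected:
            problems[name] = (expected, installed)
    return problems

problems = check_versions(PINNED)
for name, (expected, installed) in problems.items():
    print(f"{name}: expected {expected}, found {installed or 'not installed'}")
if not problems:
    print("All pinned dependencies are installed.")
```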


## Bring in dependencies

In this next code cell we'll bring in all the dependencies we'll need for later use.

Go ahead and run the following code cell. **You should see a confirmation message once the imports succeed.**

In [None]:
# Bring in dependencies
# SQLite fix: https://docs.trychroma.com/troubleshooting#sqlite
# __import__('pysqlite3')
# import sys
# sys.modules['sqlite3'] = sys.modules.pop('pysqlite3')

import requests
import chromadb
from langchain.text_splitter import RecursiveCharacterTextSplitter
from chromadb.utils import embedding_functions

# Document loaders
from langchain_community.document_loaders import PyPDFLoader, TextLoader
from langchain_core.documents import Document

# WML python SDK
from ibm_watson_machine_learning.foundation_models import Model
from ibm_watson_machine_learning.metanames import GenTextParamsMetaNames as GenParams
from ibm_watson_machine_learning.foundation_models.utils.enums import DecodingMethods

print("Successfully loaded dependencies!")

FILE_TYPE_TXT = "txt"
FILE_TYPE_PDF = "pdf"


## Some important variables

In this next code cell you'll define some variables that will be used in order to interact with your instance of watsonx.ai.

Go ahead and run the following code cell. **There should be no output**

In [None]:
# Update the global variables that will be used for authentication in another function
watsonx_project_id = "PASTE_PROJECT_ID_HERE"
api_key = "PASTE_API_KEY_HERE"
url = "https://us-south.ml.cloud.ibm.com"
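
If you would rather not paste credentials into the notebook, one option is to read them from environment variables. The sketch below is an assumption-laden illustration: the `load_watsonx_settings` helper and the variable names `WATSONX_PROJECT_ID`, `WATSONX_API_KEY`, and `WATSONX_URL` are hypothetical and not defined by watsonx.ai or this lab.

```python
import os

DEFAULT_URL = "https://us-south.ml.cloud.ibm.com"

def load_watsonx_settings(env=None):
    """Read connection settings from a mapping (defaults to os.environ).

    The variable names are hypothetical, chosen only for this sketch.
    """
    env = os.environ if env is None else env
    settings = {
        "project_id": env.get("WATSONX_PROJECT_ID", ""),
        "api_key": env.get("WATSONX_API_KEY", ""),
        "url": env.get("WATSONX_URL", DEFAULT_URL),
    }
    # Fail early with a readable message instead of a later authentication error.
    missing = [key for key in ("project_id", "api_key") if not settings[key]]
    if missing:
        raise ValueError(f"Missing watsonx.ai settings: {', '.join(missing)}")
    return settings

# Demonstration with placeholder values instead of the real environment.
example = load_watsonx_settings(
    {"WATSONX_PROJECT_ID": "my-project-id", "WATSONX_API_KEY": "my-api-key"}
)
print(example["url"])
```

The returned values could then be assigned to `watsonx_project_id`, `api_key`, and `url` in place of the pasted strings.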


## Understanding the code

In this next code cell we'll create some functions that make it easier to interact with watsonx.ai. These functions are `get_model`, `create_embedding`, and `create_prompt`:

- `get_model`: Creates a model object that will be used to invoke the LLM
- `create_embedding`: Loads text data from given file path into the in-memory `chromadb` instance
- `create_prompt`: Generates the prompt that is sent to watsonx.ai API
   - Notice that in the beginning of the function we query the vector database to retrieve information that’s related to our question (semantic search). Search results are appended to the prompt, and the prompt instruction is "to give an answer using the provided text".
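
The semantic search step described above can be illustrated without a vector database. The sketch below is a deliberately simplified stand-in: it "embeds" text as word counts and ranks chunks by cosine similarity, whereas `chromadb` uses a learned embedding model that captures meaning rather than exact word matches. The chunk texts and the `embed`, `cosine_similarity`, and `query` helpers are made up for illustration.

```python
import math
import re
from collections import Counter

def embed(text):
    """Toy embedding: a bag-of-words count vector (real embeddings are dense vectors)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine_similarity(a, b):
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(a[word] * b[word] for word in a.keys() & b.keys())
    norm_a = math.sqrt(sum(value * value for value in a.values()))
    norm_b = math.sqrt(sum(value * value for value in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def query(chunks, question, n_results=3):
    """Return the n_results chunks most similar to the question."""
    question_vector = embed(question)
    ranked = sorted(
        chunks,
        key=lambda chunk: cosine_similarity(embed(chunk), question_vector),
        reverse=True,
    )
    return ranked[:n_results]

# Made-up chunks standing in for pieces of a document.
chunks = [
    "Invoices are due within 30 days of receipt.",
    "Late invoices incur a 2 percent monthly fee.",
    "The office is closed on public holidays.",
    "Parking permits renew every January.",
]
top = query(chunks, "When are invoices due?", n_results=2)
print(top)
```

The chunks returned by the search are what get joined into the `Context:` part of the prompt.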

Go ahead and run the following code cell. **There should be no output**

In [None]:
prompt_template = """
Answer the following question using the context provided. 
If there is no good answer, say "unanswerable".

Context: %s

Question: %s
"""

# Creates a model object that will be used to invoke the LLM
def get_model(model_type,max_tokens,min_tokens,decoding,temperature):

    generate_params = {
        GenParams.MAX_NEW_TOKENS: max_tokens,
        GenParams.MIN_NEW_TOKENS: min_tokens,
        GenParams.DECODING_METHOD: decoding,
        GenParams.TEMPERATURE: temperature
    }

    model = Model(
        model_id=model_type,
        params=generate_params,
        credentials={
            "apikey": api_key,
            "url": url
        },
        project_id=watsonx_project_id
        )

    return model

# Loads text data from given file path into the chromadb instance
def create_embedding(file_path,file_type,collection_name):
    documents = []

    if file_type == FILE_TYPE_TXT:
        if file_path.startswith('http'):
            r = requests.get(file_path)
            r.raise_for_status()  # stop here if the download failed
            metadata = {"source": file_path}
            raw_text = r.text.strip()  # keep as str; page_content expects text, not bytes
            documents = [Document(page_content=raw_text, metadata=metadata)]
        else:
            loader = TextLoader(file_path,encoding="1252")
            documents = loader.load()        
    elif file_type == FILE_TYPE_PDF:
        loader = PyPDFLoader(file_path)
        documents = loader.load()

    text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)
    texts = text_splitter.split_documents(documents)
  
    print(type(texts))
    print(len(texts))

    # Load chunks into chromadb
    client = chromadb.Client()
    collection = client.get_or_create_collection(name=collection_name,embedding_function=embedding_functions.DefaultEmbeddingFunction())
    collection.upsert(
        documents=[doc.page_content for doc in texts],
        ids=[str(i) for i in range(len(texts))],  # unique for each doc
    )

    return collection

# Generates the prompt that is sent to watsonx.ai API
def create_prompt(file_path, file_type, question, collection_name):
    # Create embeddings for the text file
    collection = create_embedding(file_path,file_type,collection_name)

    # Query relevant information
    relevant_chunks = collection.query(
        query_texts=[question],
        n_results=3,
    )

    context = "\n\n\n".join(relevant_chunks["documents"][0])
    prompt = prompt_template % ( context, question )
    return prompt
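
To see what `chunk_size=500` and `chunk_overlap=50` mean in `create_embedding`, consider the simplified sliding-window chunker below. It is not the algorithm `RecursiveCharacterTextSplitter` uses (that splitter prefers to break on separators such as paragraphs and sentences, so its chunks are usually shorter and uneven); the `chunk_text` helper is hypothetical and only shows how consecutive chunks share text.

```python
def chunk_text(text, chunk_size=500, chunk_overlap=50):
    """Split text into fixed-size windows where each window repeats the
    last chunk_overlap characters of the previous one."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the current window already reached the end of the text
    return chunks

# 1200 characters of sample text.
sample = "".join(str(i % 10) for i in range(1200))
chunks = chunk_text(sample)
ids = [str(i) for i in range(len(chunks))]  # same id scheme as create_embedding

print([len(chunk) for chunk in chunks])
print(ids)
print(chunks[0][-50:] == chunks[1][:50])
```

The overlap reduces the chance that a sentence relevant to the question is cut in half at a chunk boundary and missed by the search.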


## Gluing it together

The next function, `answer_questions_from_doc`, combines the three functions defined above. It is the wrapper we will call whenever we want to interact with watsonx.ai.

Go ahead and run the following code cell. **There should be no output**

In [None]:
def answer_questions_from_doc(file_path,file_type,question,collection_name):

    # Specify model parameters
    model_type = "meta-llama/llama-2-70b-chat"
    max_tokens = 300
    min_tokens = 100
    decoding = DecodingMethods.GREEDY
    temperature = 0.7  # only affects sampling; has no effect with greedy decoding

    # Get the watsonx model
    model = get_model(model_type, max_tokens, min_tokens, decoding, temperature)

    # Get the prompt
    complete_prompt = create_prompt(file_path, file_type, question, collection_name)

    generated_response = model.generate(prompt=complete_prompt)
    response_text = generated_response['results'][0]['generated_text']

    # print model response
    print("--------------------------------- Generated response -----------------------------------")
    print(response_text.strip("\n"))
    print("*********************************************************************************************")

    return response_text
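
The model parameters above interact in ways that are easy to miss: temperature applies only when sampling, and a `min_tokens` of 100 means the model cannot stop at a one-word reply such as "unanswerable". The sketch below shows one way to build a parameter dictionary that keeps only the settings relevant to the chosen decoding method. The `build_generate_params` helper is hypothetical, and the plain-string keys and values are assumed to match what `GenParams` and `DecodingMethods` resolve to.

```python
def build_generate_params(decoding_method, max_new_tokens, min_new_tokens=0, temperature=None):
    """Return a generation-parameter dict, adding temperature only for sampling."""
    params = {
        "decoding_method": decoding_method,
        "max_new_tokens": max_new_tokens,
        "min_new_tokens": min_new_tokens,
    }
    # Greedy decoding always picks the most likely next token,
    # so a temperature setting would have no effect there.
    if decoding_method == "sample" and temperature is not None:
        params["temperature"] = temperature
    return params

greedy = build_generate_params("greedy", max_new_tokens=300, min_new_tokens=100, temperature=0.7)
sampled = build_generate_params("sample", max_new_tokens=300, min_new_tokens=100, temperature=0.7)
print(greedy)
print(sampled)
```

Greedy decoding gives repeatable answers, which suits question answering over documents; sampling with a temperature is more useful when varied output is wanted.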


## Answering some questions

The next code cell uses all the code we've created so far to retrieve information from the input documents and ask questions about them using watsonx.ai. Notice that `answer_questions_from_doc` also returns the generated text.

To do so we'll pass in the question we want to ask, the file we want to reference, its file type, and the name of the collection that will hold the file's embeddings.

Go ahead and run the next code cell. **You will see output from this cell**

In [None]:
# Test answering questions based on the provided .txt file
question = "What did the president say about corporate tax?"
file_path = "https://raw.githubusercontent.com/CloudPak-Outcomes/Outcomes-Projects/main/L4assets/watsonx.ai-Assets/Documents/state_of_the_union.txt"
collection_name = "state_of_the_union_remote"
answer_questions_from_doc(file_path,FILE_TYPE_TXT, question, collection_name)

# Test answering questions based on the provided .pdf file
question = "How can you build a Generative AI model?"
file_path = "https://raw.githubusercontent.com/CloudPak-Outcomes/Outcomes-Projects/main/L4assets/watsonx.ai-Assets/Documents/Generative_AI_Overview.pdf"
collection_name = "generative_ai_doc"
answer_questions_from_doc(file_path, FILE_TYPE_PDF, question, collection_name)
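
Because `create_embedding` upserts chunks under the ids `"0"`, `"1"`, and so on, reusing one collection name for two different files could leave chunks from the longer file behind in the collection. One way to avoid that is to derive the collection name from the file path, as in the sketch below. The `collection_name_for` helper is hypothetical; the 63-character limit and the allowed characters reflect an assumed reading of Chroma's collection naming rules, so check them against the Chroma documentation for your version.

```python
import hashlib
import re

def collection_name_for(file_path, max_length=63):
    """Derive a stable collection name from a file path or URL."""
    # Use the last path segment as the readable part of the name.
    stem = file_path.rstrip("/").rsplit("/", 1)[-1]
    slug = re.sub(r"[^a-z0-9]+", "_", stem.lower()).strip("_") or "doc"
    # A short hash of the full path keeps same-named files in different folders apart.
    digest = hashlib.sha1(file_path.encode("utf-8")).hexdigest()[:8]
    slug = slug[: max_length - len(digest) - 1].rstrip("_")
    return f"{slug}_{digest}"

txt_url = (
    "https://raw.githubusercontent.com/CloudPak-Outcomes/Outcomes-Projects/main/"
    "L4assets/watsonx.ai-Assets/Documents/state_of_the_union.txt"
)
name = collection_name_for(txt_url)
print(name)
```

The result could be passed as `collection_name` to `answer_questions_from_doc` instead of a hand-written name.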
