![image](https://raw.githubusercontent.com/IBM/watson-machine-learning-samples/master/cloud/notebooks/headers/watsonx-Prompt_Lab-Notebook.png)

# Implement a simple RAG use case with LangChain

_Retrieval Augmented Generation (RAG)_ allows us to use LLMs to interact with "external data", that is, data that was not used to train the model. Many use cases require working with proprietary company data, which is one of the reasons RAG is so frequently used in generative AI applications.

There is more than one way to implement the RAG pattern, which we will cover in a later lab. In this notebook, we will use _LangChain's RetrievalQA_ API to demonstrate one implementation of a RAG pattern. In general, RAG can be used for more than just question-and-answer use cases, but as you can tell from the name of the API, _RetrievalQA_ was implemented specifically for question-and-answer. 
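
To make the pattern concrete before bringing in any libraries, here is a minimal, library-free sketch of the two steps that every RAG implementation shares: retrieve the most relevant text, then place it in the prompt ahead of the question. The word-overlap scoring, the `retrieve` and `build_prompt` helpers, and the sample passages are all illustrative assumptions; the notebook itself uses embeddings and a vector store rather than word overlap.

```python
# Toy illustration of the RAG pattern: retrieve relevant text, then build a prompt.
# Everything here is a stand-in; no LLM or vector store is involved.
import re

def tokenize(text):
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(question, passages, k=1):
    """Return the k passages that share the most words with the question."""
    q_words = tokenize(question)
    ranked = sorted(passages, key=lambda p: len(q_words & tokenize(p)), reverse=True)
    return ranked[:k]

def build_prompt(question, context_passages):
    """Place the retrieved passages in the prompt ahead of the question."""
    context = "\n".join(context_passages)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}\nAnswer:"

passages = [
    "Our refund policy allows returns within 30 days of purchase.",
    "The office is closed on public holidays.",
    "Support is available by email around the clock.",
]
question = "What is the refund policy for returns?"
top = retrieve(question, passages, k=1)
prompt = build_prompt(question, top)
print(prompt)
```

The resulting prompt is what would be sent to the LLM. Placing every retrieved passage into a single prompt in this way is the idea behind the `"stuff"` chain type used later in this notebook.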

To get started, we'll first install the dependencies needed to run this notebook.

Go ahead and run the following code cell. **This may take a few seconds to complete.**


In [1]:
# Install dependencies

import sys
%pip install SQLAlchemy==2.0.29
!{sys.executable} -m pip install -q ibm_watson_machine_learning==1.0.342
!{sys.executable} -m pip install -q chromadb==0.4.22
!{sys.executable} -m pip install -q langchain==0.1.4
!{sys.executable} -m pip install -q pypdf==4.0.1
!{sys.executable} -m pip install -q sentence-transformers

# !{sys.executable} -m pip install -q chardet


Collecting SQLAlchemy==2.0.29
  Downloading SQLAlchemy-2.0.29-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (9.6 kB)
Downloading SQLAlchemy-2.0.29-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (3.1 MB)
Installing collected packages: SQLAlchemy
  Attempting uninstall: SQLAlchemy
    Found existing installation: SQLAlchemy 1.4.39
    Uninstalling SQLAlchemy-1.4.39:
      Successfully uninstalled SQLAlchemy-1.4.39
Successfully installed SQLAlchemy-2.0.29
Note: you may need to restart the kernel to use updated packages.
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
ibm-watsonx-ai 0.2.6 requires ibm-watson-machine-learning>=1.0.349, but you have ibm-watson-machine-learning 1.0.342 which

## Bring in dependencies

In this next code cell we'll bring in all the dependencies we'll need for later use.

Go ahead and run the following code cell. **There should be no output.**

In [2]:
# Bring in dependencies
# SQLite fix: https://docs.trychroma.com/troubleshooting#sqlite
# __import__('pysqlite3')
# import sys
# sys.modules['sqlite3'] = sys.modules.pop('pysqlite3')

from langchain.document_loaders.pdf import PyPDFLoader
from langchain.chains import RetrievalQA
from langchain.indexes import VectorstoreIndexCreator
from langchain.embeddings import HuggingFaceEmbeddings
from langchain.text_splitter import CharacterTextSplitter

# WML python SDK
from ibm_watson_machine_learning.foundation_models import Model
from ibm_watson_machine_learning.metanames import GenTextParamsMetaNames as GenParams
from ibm_watson_machine_learning.foundation_models.utils.enums import ModelTypes, DecodingMethods
from ibm_watson_machine_learning.foundation_models.extensions.langchain import WatsonxLLM



## Some important variables

In this next code cell you'll define some variables that will be used in order to interact with your instance of watsonx.ai.

Go ahead and run the following code cell. **There should be no output.**

In [None]:

# Update the global variables that will be used for authentication in another function
watsonx_project_id = "PASTE_PROJECT_ID_HERE"
api_key = "PASTE_API_KEY_HERE"
url = "https://us-south.ml.cloud.ibm.com"
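
If you would rather not paste credentials into the notebook, one option is to read them from environment variables and fall back to the placeholders. This is only a sketch: the helper `read_setting` and the variable names `WATSONX_PROJECT_ID`, `WATSONX_API_KEY`, and `WATSONX_URL` are hypothetical, so substitute whatever names your environment actually defines.

```python
import os

def read_setting(name, default=None):
    """Return the environment variable `name`, or `default` if it is unset or empty."""
    value = os.environ.get(name, "").strip()
    return value or default

# Hypothetical variable names; use whatever names your environment defines.
watsonx_project_id = read_setting("WATSONX_PROJECT_ID", "PASTE_PROJECT_ID_HERE")
api_key = read_setting("WATSONX_API_KEY", "PASTE_API_KEY_HERE")
url = read_setting("WATSONX_URL", "https://us-south.ml.cloud.ibm.com")

# Report which values still hold the placeholder, without printing any secret.
missing = [label for label, value in [("project id", watsonx_project_id), ("API key", api_key)]
           if value.startswith("PASTE_")]
print("Still unset:", missing if missing else "nothing")
```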


## Understanding the code

In the next two code cells we'll create some functions that make it easier to interact with watsonx.ai: ***get_model()***, ***get_lang_chain_model()***, and ***answer_questions_from_doc()***. The first two are defined in the cell below; the third is defined in the next section.

- ***get_model()***: creates a model object that will be used to invoke the LLM. Because ***get_model()*** is parametrized, the same function can be reused in all examples.
- ***get_lang_chain_model()***: creates a model wrapper that will be used with the _LangChain_ API.
- ***answer_questions_from_doc()***: specifies model parameters, loads the PDF file, creates an index from the loaded document, then instantiates and invokes the chain.

Go ahead and run the following code cell. **There should be no output.**

In [None]:
def get_model(model_type, max_tokens, min_tokens, decoding, temperature):
    # Build a watsonx.ai Model object configured with the given generation parameters

    generate_params = {
        GenParams.MAX_NEW_TOKENS: max_tokens,
        GenParams.MIN_NEW_TOKENS: min_tokens,
        GenParams.DECODING_METHOD: decoding,
        GenParams.TEMPERATURE: temperature
    }

    model = Model(
        model_id=model_type,
        params=generate_params,
        credentials={
            "apikey": api_key,
            "url": url
        },
        project_id=watsonx_project_id
    )

    return model

def get_lang_chain_model(model_type, max_tokens, min_tokens, decoding, temperature):
    # Wrap the watsonx.ai model so that it can be used as an LLM by LangChain

    base_model = get_model(model_type, max_tokens, min_tokens, decoding, temperature)
    langchain_model = WatsonxLLM(model=base_model)

    return langchain_model
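
Because the generation parameters are passed around as a plain dictionary, it can help to validate them before building a model. The sketch below is a hypothetical helper, not part of the SDK. It assumes the `GenParams` constants resolve to the plain string keys shown, that decoding methods can be written as the strings `"greedy"` and `"sample"`, and that temperature is only meaningful when sampling; check each of these against the SDK version you have installed.

```python
def build_generate_params(max_tokens, min_tokens, decoding, temperature=None):
    """Hypothetical helper: validate and assemble a generation-parameter dict."""
    if min_tokens > max_tokens:
        raise ValueError("min_tokens must not exceed max_tokens")
    params = {
        "max_new_tokens": max_tokens,
        "min_new_tokens": min_tokens,
        "decoding_method": decoding,
    }
    # Assumption: temperature only matters when sampling, so omit it otherwise.
    if decoding == "sample" and temperature is not None:
        params["temperature"] = temperature
    return params

greedy_params = build_generate_params(300, 100, "greedy", temperature=0.7)
sample_params = build_generate_params(300, 100, "sample", temperature=0.7)
print(greedy_params)
print(sample_params)
```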


## Gluing it together

The next function, `answer_questions_from_doc`, combines the two helpers defined above with document loading, indexing, and retrieval. This is the wrapper that we will call when we want to interact with watsonx.ai.

Go ahead and run the following code cell. **There should be no output.**

In [None]:
def answer_questions_from_doc(file_path, question):

  # Specify model parameters
  model_type = "meta-llama/llama-2-70b-chat"
  max_tokens = 300
  min_tokens = 100
  decoding = DecodingMethods.GREEDY
  temperature = 0.7  # applies in sampling mode; expected to have no effect with GREEDY decoding

  # Get the watsonx model that can be used with LangChain
  model = get_lang_chain_model(model_type, max_tokens, min_tokens, decoding, temperature)

  loaders = [PyPDFLoader(file_path)]

  index = VectorstoreIndexCreator(
      embedding=HuggingFaceEmbeddings(),
      text_splitter=CharacterTextSplitter(chunk_size=1000, chunk_overlap=100)).from_loaders(loaders)

  chain = RetrievalQA.from_chain_type(llm=model,
                                      chain_type="stuff",
                                      retriever=index.vectorstore.as_retriever(),
                                      input_key="question")

  # Invoke the chain
  response_text = chain.run(question)

  # print model response
  print("--------------------------------- Generated response -----------------------------------")
  print(response_text)
  print("*********************************************************************************************")

  return response_text
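
The `chunk_size=1000` and `chunk_overlap=100` arguments control how the document is cut into pieces before it is embedded. The sketch below illustrates the idea with a fixed-width sliding window. It is a deliberate simplification and not LangChain's algorithm: `CharacterTextSplitter` first splits on a separator and then merges the pieces, so its chunks will generally differ from these. The helper name `chunk_text` is hypothetical.

```python
def chunk_text(text, chunk_size=1000, chunk_overlap=100):
    """Split text into fixed-width chunks whose neighbours share chunk_overlap characters."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the final chunk already reaches the end of the text
    return chunks

sample = "".join(str(i % 10) for i in range(2500))
chunks = chunk_text(sample)
print([len(c) for c in chunks])             # [1000, 1000, 700]
print(chunks[0][-100:] == chunks[1][:100])  # True: neighbouring chunks overlap
```

The overlap means that a sentence falling on a chunk boundary still appears in full in at least one chunk, at the cost of embedding some text twice.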


## Answering some questions

The next code cell uses the code we've written so far to retrieve information from the input document and answer a question about it with watsonx.ai (notice the call to `answer_questions_from_doc`).

To do so, we pass in the question we want to ask and the PDF file we want to consult. The embeddings for the file are created inside the function, so there is nothing else to supply.

Notice the commented-out questions as well? Feel free to uncomment these or create some of your own to ask.

Go ahead and run the next code cell. **You _will_ see output from this cell**

In [None]:
# Test answering questions based on the provided .pdf file
question = "What is Generative AI?"
# question = "What does it take to build a generative AI model?"
# question = "What are the limitations of generative AI models?"
file_path = "https://raw.githubusercontent.com/CloudPak-Outcomes/Outcomes-Projects/main/L4assets/watsonx.ai-Assets/Documents/Generative_AI_Overview.pdf"

answer_questions_from_doc(file_path, question)
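
To try all three questions in one go, a small loop is enough. The sketch below uses a hypothetical helper, `ask_all`, and a stand-in function, `fake_answer`, so that it runs without credentials; in the notebook you would pass `answer_questions_from_doc` as `answer_fn` and the real `file_path`.

```python
def ask_all(questions, file_path, answer_fn):
    """Call answer_fn(file_path, question) for each question and collect the answers."""
    answers = {}
    for q in questions:
        answers[q] = answer_fn(file_path, q)
    return answers

def fake_answer(file_path, question):
    """Stand-in for answer_questions_from_doc so this sketch runs without credentials."""
    return f"(answer to: {question})"

questions = [
    "What is Generative AI?",
    "What does it take to build a generative AI model?",
    "What are the limitations of generative AI models?",
]
answers = ask_all(questions, "Generative_AI_Overview.pdf", fake_answer)
for q, a in answers.items():
    print(q, "->", a)
```

Keep in mind that `answer_questions_from_doc`, as written above, loads the PDF and rebuilds the index on every call, so asking several questions this way repeats that work each time.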
