![image](https://raw.githubusercontent.com/IBM/watson-machine-learning-samples/master/cloud/notebooks/headers/watsonx-Prompt_Lab-Notebook.png)

# Implement a simple RAG use case with LangChain

_Retrieval Augmented Generation (RAG)_ allows us to use LLMs to interact with "external data", that is, data that was not used to train the model. Many use cases require working with proprietary company data, which is one of the reasons RAG is frequently used in generative AI applications.

There is more than one way to implement the RAG pattern, which we will cover in a later lab. In this notebook, we will use _LangChain's RetrievalQA_ API to demonstrate one implementation of a RAG pattern. In general, RAG can be used for more than just question-and-answer use cases, but as you can tell from the name of the API, _RetrievalQA_ was implemented specifically for question-and-answer. 
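
Before bringing in LangChain, it can help to see the bare RAG pattern in plain Python. The following is a minimal sketch, not the implementation used by _RetrievalQA_: it ranks text chunks by simple word overlap instead of embeddings, and the helper names (`tokenize`, `retrieve`, `build_prompt`) and the sample chunks are made up for illustration.

```python
import re


def tokenize(text: str) -> set:
    """Lowercase a string and return its set of word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question: str, chunks: list, k: int = 2) -> list:
    """Return the k chunks that share the most words with the question."""
    q_tokens = tokenize(question)
    ranked = sorted(chunks, key=lambda c: len(q_tokens & tokenize(c)), reverse=True)
    return ranked[:k]


def build_prompt(question: str, context_chunks: list) -> str:
    """Place the retrieved chunks into the prompt, ahead of the question."""
    context = "\n\n".join(context_chunks)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


# Hypothetical "external data" that the model was never trained on
chunks = [
    "Refund policy: purchases can be returned within 30 days.",
    "Cafeteria hours: open 8am to 3pm on weekdays.",
    "Refund payments go back to your original payment method.",
]

question = "What is the refund policy?"
top_chunks = retrieve(question, chunks, k=2)
prompt = build_prompt(question, top_chunks)
print(prompt)  # this prompt is what would be sent to the LLM
```

In a real pipeline the word-overlap ranking is replaced by a vector similarity search over embeddings, which is what the rest of this notebook sets up.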

To get started we'll first verify that you have the necessary dependencies installed to run this notebook.


## Considerations

At the time of writing (October 22nd, 2024), the Pandas package does not support **Python 3.13**. Run Python 3.11 instead. For more information, see [the following post](https://stackoverflow.com/questions/78718762/getting-error-when-trying-to-install-pandas-using-pip/78719227#78719227).
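
If you are unsure which interpreter your kernel uses, a quick check along these lines may help. This is a small sketch, and the helper name `is_supported_python` is made up for illustration; it only encodes the Python 3.13 restriction described above.

```python
import sys


def is_supported_python(version_info=sys.version_info) -> bool:
    """Return True when the interpreter is older than Python 3.13."""
    return tuple(version_info[:2]) < (3, 13)


if is_supported_python():
    print("Python version looks compatible:", sys.version.split()[0])
else:
    print("Python 3.13 or newer detected; consider switching to Python 3.11.")
```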

## Environment Setup

### watsonx.ai notebook environment

If you are running this from a Jupyter Notebook in watsonx.ai, then run the following code cell. 

**This may take a few seconds to complete.**

In [None]:
# Install dependencies
import sys

!{sys.executable} -m pip install -U langchain-ibm | tail -n 1

### Non-watsonx.ai environment

If you are running this notebook **outside** watsonx.ai, then run the following code cell.

In [None]:
# Install dependencies
import sys

%pip install SQLAlchemy==2.0.29

!{sys.executable} -m pip install -q chromadb
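!{sys.executable} -m pip install langchain | tail -n 1
# langchain_community is imported below for PyPDFLoader
!{sys.executable} -m pip install langchain-community | tail -n 1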
!{sys.executable} -m pip install -U langchain-ibm | tail -n 1
!{sys.executable} -m pip install -q pypdf
!{sys.executable} -m pip install -q sentence-transformers
!{sys.executable} -m pip install -U langchain-huggingface
!{sys.executable} -m pip install ibm-watsonx-ai | tail -n 1

# !{sys.executable} -m pip install -q chardet

## Bring in dependencies

In this next code cell we'll bring in all the dependencies we'll need for later use.

Go ahead and run the following code cell. **There should be no output.**

In [2]:
# Bring in dependencies
# SQLite fix: https://docs.trychroma.com/troubleshooting#sqlite
# __import__('pysqlite3')
# import sys
# sys.modules['sqlite3'] = sys.modules.pop('pysqlite3')

from langchain_community.document_loaders import PyPDFLoader
from langchain.chains import RetrievalQA
from langchain.indexes import VectorstoreIndexCreator
from langchain.text_splitter import CharacterTextSplitter

from langchain_huggingface import HuggingFaceEmbeddings

# WML python SDK
from ibm_watsonx_ai.foundation_models import Model
from ibm_watsonx_ai.metanames import GenTextParamsMetaNames as GenParams
from ibm_watsonx_ai.foundation_models.utils.enums import DecodingMethods


## Environment Setup

In this next code cell you'll define some variables that will be used in order to interact with your instance of watsonx.ai.

### Defining the WML credentials
This cell defines the WML credentials required to work with watsonx Foundation Model inferencing.

**Action:** Provide the IBM Cloud user API key. For details, see
[documentation](https://cloud.ibm.com/docs/account?topic=account-userapikey&interface=ui).

In [3]:
import getpass
from os import environ

try:
    REGION = environ["RUNTIME_ENV_REGION"]
except KeyError:
    # Set your region here if you are not running this notebook in the watsonx.ai Jupyter environment
    # us-south, eu-de, etc.
    REGION = "us-south"

credentials = {
    "url": "https://" + REGION + ".ml.cloud.ibm.com",
    "apikey": getpass.getpass("Please enter your WML api key (hit enter): "),
}
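
The same credentials dictionary can also be built by a small helper, which is convenient when you want to avoid typing the key on every run. This is only a sketch: the function `build_credentials` and the environment variable name `WATSONX_APIKEY` are assumptions made for this example, not something the notebook requires.

```python
import getpass
import os
from typing import Optional


def build_credentials(region: str, apikey: Optional[str] = None) -> dict:
    """Build a credentials dict with the same shape as the one in the cell above."""
    if apikey is None:
        # WATSONX_APIKEY is a hypothetical variable name chosen for this sketch;
        # fall back to an interactive prompt when it is not set
        apikey = os.environ.get("WATSONX_APIKEY") or getpass.getpass(
            "Please enter your WML api key (hit enter): "
        )
    return {
        "url": f"https://{region}.ml.cloud.ibm.com",
        "apikey": apikey,
    }


# Passing a placeholder key explicitly avoids the interactive prompt
example = build_credentials("eu-de", apikey="dummy-key")
print(example["url"])  # → https://eu-de.ml.cloud.ibm.com
```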

### Defining the project id
The foundation model requires a project ID that provides the context for the call. When this notebook runs inside a watsonx.ai project, the ID is read from the environment. Otherwise, please provide the project ID yourself.

In [4]:
try:
    project_id = environ["PROJECT_ID"]
except KeyError:
    # Enter project ID here if not running this notebook in the watsonx.ai Jupyter environment
    project_id = "MY_PROJECT_ID"

## Understanding the code

In this next code cell we'll create the `get_model()` function that we can use later to interact more easily with watsonx.ai.

The function creates a model object that will be used to invoke the LLM. Because `get_model()` is parameterized, the same function can be reused in all examples. If the `to_langchain` parameter is set to `True`, the function returns a model wrapper that can be used with the _LangChain_ API.

Go ahead and run the following code cell. **There should be no output**.

In [5]:
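from typing import Optional


def get_model(
    model_id: str,
    model_params: Optional[dict] = None,
    to_langchain: bool = False,
):
    """Create a watsonx.ai model object, optionally wrapped for LangChain."""
    # Build the defaults inside the function to avoid a shared mutable default argument
    if model_params is None:
        model_params = {
            GenParams.MAX_NEW_TOKENS: 300,
            GenParams.MIN_NEW_TOKENS: 10,
            GenParams.DECODING_METHOD: "greedy",
        }

    model = Model(
        model_id=model_id,
        params=model_params,
        credentials=credentials,
        project_id=project_id,
    )

    if to_langchain:
        # Return a wrapper that is compatible with the LangChain API
        return model.to_langchain()

    return model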

The next function, `answer_questions_from_doc`, combines the pieces we have defined so far. This is the wrapper that we will call when we want to interact with watsonx.ai.

The function specifies model parameters, loads the PDF file, creates an index from the loaded document, then instantiates and invokes the chain.

Go ahead and run the following code cell. **There should be no output**.

In [6]:
def answer_questions_from_doc(
    file_path: str, question: str, print_res: bool = True
) -> str:

    # Get the watsonx model that can be used with LangChain
    model = get_model(
        model_id="meta-llama/llama-3-8b-instruct",
        model_params={
            GenParams.MAX_NEW_TOKENS: 300,
            GenParams.MIN_NEW_TOKENS: 100,
            GenParams.DECODING_METHOD: DecodingMethods.GREEDY,
        },
        to_langchain=True,
    )

    loaders = [PyPDFLoader(file_path)]

    index = VectorstoreIndexCreator(
        embedding=HuggingFaceEmbeddings(),
        text_splitter=CharacterTextSplitter(chunk_size=1000, chunk_overlap=100),
    ).from_loaders(loaders)

    chain = RetrievalQA.from_chain_type(
        llm=model,
        chain_type="stuff",
        retriever=index.vectorstore.as_retriever(),
        input_key="question",
    )

    # Invoke the chain; RetrievalQA returns a dict whose "result" key holds the answer text
    response = chain.invoke(question)
    response_text = response["result"]

    if print_res:
        # print model response
        print(
            "--------------------------------- Generated response -----------------------------------"
        )
        print(response_text)
        print(
            "*********************************************************************************************"
        )

    return response_text
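
The `chunk_size=1000` and `chunk_overlap=100` settings passed to the text splitter above control how the document is cut into pieces before it is embedded. The sketch below illustrates the idea with a plain sliding window over characters. It is a simplification written for this notebook: LangChain's `CharacterTextSplitter` splits on a separator first and then merges pieces, so its chunk boundaries will differ. The function name `chunk_text` is made up for illustration.

```python
def chunk_text(text: str, chunk_size: int = 1000, chunk_overlap: int = 100) -> list:
    """Split text into fixed-size character windows that overlap by chunk_overlap."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")

    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
        start += step
    return chunks


# A 2500-character dummy document
sample = "".join(str(i % 10) for i in range(2500))
pieces = chunk_text(sample)

print(len(pieces))                            # → 3
print([len(p) for p in pieces])               # → [1000, 1000, 700]
print(pieces[0][-100:] == pieces[1][:100])    # → True
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, which tends to help retrieval quality at the cost of some duplicated text.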

## Answering some questions

The next code cell uses the code we've written so far to retrieve information from the input document and ask watsonx.ai a question about it (notice the call to `answer_questions_from_doc`).

To do so, we pass in the question we want to ask and the path or URL of the PDF file we want to reference for that question. The function builds the vector index for the file on each call.

Notice the commented questions as well? Feel free to uncomment these or create some of your own to ask.

Go ahead and run the next code cell. **You _will_ see output from this cell.**

In [None]:
# Test answering questions based on the provided .pdf file
question = "What is Generative AI?"
# question = "What does it take to build a generative AI model?"
# question = "What are the limitations of generative AI models?"
file_path = "https://raw.githubusercontent.com/CloudPak-Outcomes/Outcomes-Projects/main/L4assets/watsonx.ai-Assets/Documents/Generative_AI_Overview.pdf"

answer_questions_from_doc(file_path=file_path, question=question, print_res=True)


### Authors:
- **Josefina Casanova**, Engagement Lead, Build Lab Americas. Edited for L4 watsonx course. 2024