# RAG with watsonx and LangChain

💡 To keep things simple, this lab builds a RAG application that answers questions about a single PDF file. You can use the PDF files provided with this repository or bring your own file.

### Contents

1. [Setup](#setup_environment)
1. [PDF to Text](#pdf_text)
1. [Initialize the model](#initialize_model)
1. [Create the inference function](#inference_function)


<a id="setup_environment"></a>
## 1. Set up the environment

In [None]:
!pip install -U ibm-watson-machine-learning --quiet

In [None]:
credentials = {
    "url": "URL",
    "apikey": "API_KEY"
}

In [None]:
project_id = 'PROJECT_ID'
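
Hard-coding the API key in a notebook makes it easy to leak when the notebook is shared. As one possible alternative, the sketch below reads the same three values from environment variables. The variable names `WATSONX_URL`, `WATSONX_APIKEY`, and `WATSONX_PROJECT_ID` and the helper `load_watsonx_settings` are assumptions made for this example, not names defined by the IBM SDK.

```python
import os


def load_watsonx_settings(env=os.environ):
    """Return (credentials, project_id) read from environment variables.

    The variable names are hypothetical; use whatever fits your setup.
    """
    required = ("WATSONX_URL", "WATSONX_APIKEY", "WATSONX_PROJECT_ID")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise KeyError(f"Missing environment variables: {', '.join(missing)}")

    # Same dictionary shape as the `credentials` cell above
    credentials = {"url": env["WATSONX_URL"], "apikey": env["WATSONX_APIKEY"]}
    return credentials, env["WATSONX_PROJECT_ID"]
```

With the variables set, `credentials, project_id = load_watsonx_settings()` would replace the two cells above.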

In [None]:
from langchain.document_loaders import PyPDFLoader
from langchain.chains import RetrievalQA
from langchain.indexes import VectorstoreIndexCreator
from langchain.embeddings import HuggingFaceEmbeddings
from langchain.text_splitter import CharacterTextSplitter

### 1.2 List available models

All available models are listed in the `ModelTypes` enum.

In [None]:
from ibm_watson_machine_learning.foundation_models.utils.enums import ModelTypes

print([model.value for model in ModelTypes])

<a id="pdf_text"></a>
## 2. PDF to Text

Let's load a PDF file to extract the text.

In [None]:
loader = PyPDFLoader("../documents/bank_faq_en.pdf")
pages = loader.load()

len(pages)

In [None]:
pages[0].page_content[0:500]

In [None]:
pages[0].metadata

<a id="initialize_model"></a>
## 3. Initialize the model

Initialize the `Model` class with the generation parameters, then wrap it in `WatsonxLLM`. `WatsonxLLM` is a wrapper around watsonx.ai models that lets them be used inside LangChain chains.

**Action:** For more details about `CustomLLM`, see the [LangChain documentation](https://python.langchain.com/docs/modules/model_io/models/llms/custom_llm).
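
To illustrate what "wrapper" means here, the sketch below shows the general adapter idea in plain Python: an object with its own generation method is wrapped so that a chain can call it with a prompt string. `StubModel` and `LLMWrapper` are hypothetical stand-ins written for this example; they are not the `Model` or `WatsonxLLM` classes and make no network calls.

```python
class StubModel:
    """Stand-in for a hosted text-generation model (hypothetical, illustration only)."""

    def generate_text(self, prompt):
        # A real model would call a remote service here
        return f"[stub completion for: {prompt}]"


class LLMWrapper:
    """Adapts a model object to a plain callable that takes a prompt string."""

    def __init__(self, model):
        self.model = model

    def __call__(self, prompt):
        # Delegate to the wrapped model's own generation method
        return self.model.generate_text(prompt)


llm = LLMWrapper(StubModel())
answer = llm("Hello")
print(answer)
```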

In [None]:
from ibm_watson_machine_learning.foundation_models import Model
from ibm_watson_machine_learning.metanames import GenTextParamsMetaNames as GenParams
from ibm_watson_machine_learning.foundation_models.extensions.langchain import WatsonxLLM

def get_langchain_model(model_id, decoding, credentials, project_id, max_tokens=150, min_tokens=30, temperature=0.5):
    """Build a watsonx.ai `Model` with the given generation parameters and wrap it for LangChain."""

    parameters = {
        GenParams.DECODING_METHOD: decoding,
        GenParams.MAX_NEW_TOKENS: max_tokens,
        GenParams.MIN_NEW_TOKENS: min_tokens,
        GenParams.TEMPERATURE: temperature,
    }

    model = Model(
        model_id=model_id,
        params=parameters,
        credentials=credentials,
        project_id=project_id
        )

    langchain_model = WatsonxLLM(model=model)

    return langchain_model
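
The `decoding` and `temperature` parameters interact: greedy decoding always picks the highest-scoring token, while sampling draws from a distribution that temperature sharpens or flattens. The toy sketch below shows the difference on made-up scores. It is a conceptual illustration only, not how watsonx.ai implements decoding, and all function names in it are invented for this example.

```python
import math
import random


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities; a lower temperature sharpens the distribution."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def pick_greedy(logits):
    """Greedy decoding: take the highest-scoring token; temperature plays no role."""
    return max(range(len(logits)), key=lambda i: logits[i])


def pick_sample(logits, temperature, rng):
    """Sampling: draw a token index according to the temperature-scaled probabilities."""
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]


toy_logits = [2.0, 1.0, 0.1]  # made-up scores for three candidate tokens
rng = random.Random(0)        # fixed seed so repeated runs draw the same samples

greedy_choice = pick_greedy(toy_logits)
sampled_choices = [pick_sample(toy_logits, 0.5, rng) for _ in range(5)]
print(greedy_choice, sampled_choices)
```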

<a id="inference_function"></a>
## 4. Create the inference function

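In this section we define the inference function. It builds a vector index over the PDF, retrieves the chunks most relevant to the question, and passes them to the model inside a prompt template.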
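
The retrieval step can be pictured without any libraries. The sketch below is a simplified stand-in: it replaces the neural embedding model with a bag-of-words count vector and ranks chunks by cosine similarity to the question. The sample chunks are invented for illustration and do not come from the lab's PDF, and the helper names are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy 'embedding': word counts (a stand-in for a neural embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, chunks, k=2):
    """Return the k chunks most similar to the question."""
    q = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


# Invented sample chunks, for illustration only
chunks = [
    "The gold card has no annual fee in the first year.",
    "You can block a lost card from the mobile app.",
    "Branches are open Monday to Friday.",
]
top = retrieve("How do I block a lost card?", chunks, k=1)
print(top)
```

A real embedding model captures meaning rather than exact word overlap, so it can match a question to a chunk even when they share few words.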

In [None]:
from ibm_watson_machine_learning.foundation_models.utils.enums import DecodingMethods
from langchain.prompts import PromptTemplate


def generate_from_doc(watsonx_credentials, watsonx_project_id, loader, question):

    # Specify model parameters
    model_id = "meta-llama/llama-2-70b-chat"
    max_tokens = 300
    min_tokens = 100
    decoding = DecodingMethods.GREEDY
    # Note: temperature only affects sampling; with greedy decoding it has no effect
    temperature = 0.7

    # Get the watsonx model that can be used with LangChain
    watsonx_langchain_model = get_langchain_model(model_id, decoding, watsonx_credentials, watsonx_project_id, 
                                                  max_tokens, min_tokens, temperature)

    # Build a vector index over the PDF: split the text into chunks, embed them, and store them
    index = VectorstoreIndexCreator(
        embedding=HuggingFaceEmbeddings(model_name='sentence-transformers/all-MiniLM-L12-v2'),
        text_splitter=CharacterTextSplitter(chunk_size=100, chunk_overlap=0)
    ).from_loaders([loader])
    
    # Building prompt template with langchain
    prompt_template = """
    Context: '''{context}'''

    Your task is to answer the question using the context given delimited with '''.
    - Don't add any additional information.
    - If you don't know the answer, just say that you don't know, don't try to make up an answer.
    - Use three sentences maximum.
    - Keep the answer as concise as possible

    Question: {question}

    Answer:"""

    langchain_prompt_template = PromptTemplate.from_template(prompt_template)

    chain = RetrievalQA.from_chain_type(llm=watsonx_langchain_model,
                                        chain_type="stuff",
                                        retriever=index.vectorstore.as_retriever(),
                                        chain_type_kwargs={"prompt": langchain_prompt_template}
                                        )

    # Invoke the chain
    response_text = chain.run({"query": question})

    return response_text
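
The `"stuff"` chain type places all retrieved chunks into the single `{context}` slot of the prompt. The sketch below imitates that step with plain string formatting, using a shortened version of the template. The `stuff_prompt` helper is hypothetical, and the `"\n\n"` separator is an assumption made for this example rather than a guaranteed match for LangChain's behavior.

```python
TEMPLATE = """Context: '''{context}'''

Question: {question}

Answer:"""


def stuff_prompt(template, retrieved_chunks, question, separator="\n\n"):
    """Join the retrieved chunks into one context string and fill the template."""
    context = separator.join(retrieved_chunks)
    return template.format(context=context, question=question)


prompt = stuff_prompt(TEMPLATE, ["chunk one", "chunk two"], "What is in the chunks?")
print(prompt)
```

Because every retrieved chunk lands in one prompt, the number and size of chunks must stay within the model's context window.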

In [None]:
generated_response = generate_from_doc(credentials, project_id, loader, 'How many miles does black card grant?')

print(generated_response)
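
The quality of the answer depends on how the document is chunked, and `chunk_size=100` characters is fairly small. To get a feel for the effect, the sketch below implements a simplified separator-based splitter: it splits on a separator and greedily merges neighbouring pieces up to `chunk_size`. It is loosely modeled on that idea and is not LangChain's `CharacterTextSplitter` implementation; `split_text` is a hypothetical helper.

```python
def split_text(text, chunk_size, separator="\n\n"):
    """Split on the separator, then merge neighbouring pieces up to chunk_size characters.

    A piece longer than chunk_size on its own is kept whole in this sketch.
    """
    chunks, current = [], ""
    for piece in (p.strip() for p in text.split(separator)):
        if not piece:
            continue  # skip empty pieces produced by repeated separators
        candidate = piece if not current else current + separator + piece
        if len(candidate) <= chunk_size or not current:
            current = candidate      # the piece still fits, or it starts a new chunk
        else:
            chunks.append(current)   # flush the full chunk and start a new one
            current = piece
    if current:
        chunks.append(current)
    return chunks


sample = "aaaa\n\nbbbb\n\ncccccccccccc\n\ndd"
result = split_text(sample, chunk_size=10)
print(result)
```

Trying a few `chunk_size` values on your own text shows the trade-off: small chunks give precise matches but little context, while large chunks carry more context per retrieved passage.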