# Lab 5: Using LangChain with IBM WatsonX

## 1. Intro to LangChain

[LangChain](https://docs.langchain.com/docs/) is an open-source development framework designed to simplify the creation of applications using large language models (LLMs).

The core idea of the library is that we can "chain" together different components to build more advanced use cases around LLMs. The main LangChain components are:

- Models: interact with various LLMs
- Prompts: the text that is sent to the LLMs
- Chains: combine multiple LLM calls and actions automatically
- Embeddings and Vector Stores: break large data into chunks and store them so they can be queried when relevant
- Agents: enable the LLMs to dynamically decide which tools to use in order to best respond to a given query

In short, **LangChain is a framework that can orchestrate a series of prompts to achieve a desired outcome.**


## 2. How to connect LangChain to WatsonX.ai

In [None]:
!pip install "ibm-generative-ai[langchain]"

In [None]:
!pip install pypdf

In [None]:
!pip install sentence_transformers

In [None]:
!pip install chromadb

In [38]:
import os, getpass
import requests
# from dotenv import load_dotenv
from ibm_cloud_sdk_core import IAMTokenManager
from typing import Any, List, Mapping, Optional, Union, Dict
from pydantic import BaseModel, Extra
try:
    from langchain import PromptTemplate
    from langchain.chains import LLMChain, SimpleSequentialChain
    from langchain.document_loaders import PyPDFLoader
    from langchain.indexes import VectorstoreIndexCreator  # builds a vector index backed by Chroma; does not work in a Cloud project
    from langchain.embeddings import HuggingFaceEmbeddings  # Hugging Face embedding models
    from langchain.text_splitter import CharacterTextSplitter  # text splitter
    from langchain.llms.base import LLM
    from langchain.llms.utils import enforce_stop_tokens
except ImportError:
    raise ImportError("Could not import langchain: Please install ibm-generative-ai[langchain] extension.")

from ibm_watson_machine_learning.foundation_models import Model
from ibm_watson_machine_learning.metanames import GenTextParamsMetaNames as GenParams

In [17]:
#config Watsonx.ai environment
endpoint_url = input("Please enter your WML endpoint url (hit enter): ")

Please enter your WML endpoint url (hit enter): https://us-south.ml.cloud.ibm.com


In [18]:
api_key = getpass.getpass("Please enter your IBM Cloud api key (hit enter): ")

Please enter your IBM Cloud api key (hit enter): ········


In [19]:
access_token = IAMTokenManager(
    apikey = api_key,
    url = "https://iam.cloud.ibm.com/identity/token"
).get_token()

In [20]:
try:
    project_id = os.environ["PROJECT_ID"]
except KeyError:
    project_id = input("Please enter your project_id (hit enter): ")

In [21]:
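# Helper class: sends a prompt to the watsonx.ai text generation REST endpoint
class Prompt:
    def __init__(self, access_token, project_id):
        self.access_token = access_token
        self.project_id = project_id

    def generate(self, input, model_id, parameters):
        """Send `input` to the model and return the first generation result."""
        # Uses the global endpoint_url entered earlier in the notebook
        wml_url = f"{endpoint_url}/ml/v1-beta/generation/text?version=2023-05-28"
        headers = {
            "Authorization": "Bearer " + self.access_token,
            "Content-Type": "application/json",
            "Accept": "application/json"
        }
        data = {
            "model_id": model_id,
            "input": input,
            "parameters": parameters,
            "project_id": self.project_id
        }
        response = requests.post(wml_url, json=data, headers=headers)
        if response.status_code != 200:
            # Fail loudly instead of returning the error body as if it were a result
            raise RuntimeError(f"Generation request failed ({response.status_code}): {response.text}")
        return response.json()["results"][0]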
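
To make the request shape explicit, the sketch below builds the same URL, headers, and JSON body that the helper sends, without making a network call. The function name `build_generation_request` and the placeholder token and project values are assumptions for illustration; the endpoint path and field names are copied from the helper above.

```python
import json

def build_generation_request(endpoint_url, access_token, project_id,
                             model_id, prompt_text, parameters):
    """Return (url, headers, body) mirroring what Prompt.generate posts."""
    url = f"{endpoint_url}/ml/v1-beta/generation/text?version=2023-05-28"
    headers = {
        "Authorization": "Bearer " + access_token,
        "Content-Type": "application/json",
        "Accept": "application/json",
    }
    body = {
        "model_id": model_id,
        "input": prompt_text,
        "parameters": parameters,
        "project_id": project_id,
    }
    return url, headers, body

url, headers, body = build_generation_request(
    "https://us-south.ml.cloud.ibm.com",
    "<access-token>",   # placeholder, not a real token
    "<project-id>",     # placeholder, not a real project
    "google/flan-ul2",
    "The sky is ...",
    {"decoding_method": "sample", "max_new_tokens": 100},
)
print(url)
print(json.dumps(body, indent=2))
```

Printing the body before sending it can help when debugging a non-200 response, since the service validates `model_id`, `project_id`, and `parameters` on its side.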

In [22]:
##initializing WatsonX model

# Parameters
model_params = {
         "decoding_method": "sample",
         "min_new_tokens": 1,
         "max_new_tokens": 100,
         "random_seed": 42,
         "temperature": 0.5,
         "top_k": 50,
         "top_p": 1
}

# Model ID
model_id="google/flan-ul2"
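
The sampling parameters above (`temperature`, `top_k`, `random_seed`) shape how the next token is drawn. One common way to combine them is to divide the logits by the temperature, keep only the `top_k` highest-scoring tokens, and renormalize before sampling. The sketch below shows that idea on a made-up four-token vocabulary; the logit values and the helper name `sampling_distribution` are assumptions, and this is not a description of how watsonx.ai implements decoding.

```python
import math
import random

def sampling_distribution(logits, temperature=1.0, top_k=None):
    """Return token -> probability after temperature scaling and top-k filtering."""
    scaled = {tok: score / temperature for tok, score in logits.items()}
    # Keep only the top_k highest-scoring tokens (all tokens if top_k is None).
    ranked = sorted(scaled, key=scaled.get, reverse=True)
    kept = ranked[:top_k] if top_k else ranked
    # Softmax over the kept tokens, shifted by the max for numerical stability.
    peak = max(scaled[tok] for tok in kept)
    weights = {tok: math.exp(scaled[tok] - peak) for tok in kept}
    total = sum(weights.values())
    return {tok: w / total for tok, w in weights.items()}

# Made-up logits for completing "The sky is ..."
logits = {"blue": 3.0, "clear": 2.0, "falling": 0.5, "green": 0.1}

warm = sampling_distribution(logits, temperature=1.0, top_k=3)
cool = sampling_distribution(logits, temperature=0.5, top_k=3)

rng = random.Random(42)  # fixed seed, analogous to random_seed in model_params
choice = rng.choices(list(cool), weights=list(cool.values()), k=1)[0]
print(warm)
print(cool)
print(choice)
```

In this toy setup, lowering the temperature concentrates probability on the highest-scoring token, and `top_k=3` removes the lowest-scoring token from consideration entirely.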

In [23]:
prompt = Prompt(access_token, project_id)

In [24]:
# test that everything was set up correctly
response = prompt.generate("The sky is ...", model_id, model_params)

In [25]:
response['generated_text']

'blue'

To use WatsonX-based LLMs with LangChain, the LLM object must be an instance of a class derived from `BaseLanguageModel` (see [LangChain docs](https://api.python.langchain.com/en/latest/schema/langchain.schema.language_model.BaseLanguageModel.html)). We'll use the custom class below to accomplish this.

In [26]:
# Wrap the WatsonX Model in a langchain.llms.base.LLM subclass to allow LangChain to interact with the model

class LangChainInterface(LLM, BaseModel):
    credentials: Optional[Dict] = None
    model: Optional[str] = None
    params: Optional[Dict] = None
    project_id : Optional[str]=None

    class Config:
        """Configuration for this pydantic object."""
        extra = Extra.forbid

    @property
    def _identifying_params(self) -> Mapping[str, Any]:
        """Get the identifying parameters."""
        _params = self.params or {}
        return {
            **{"model": self.model},
            **{"params": _params},
        }
    
    @property
    def _llm_type(self) -> str:
        """Return type of llm."""
        return "IBM WATSONX"

    def _call(self, prompt: str, stop: Optional[List[str]] = None) -> str:
        """Call the WatsonX model"""
        params = self.params or {}
        model = Model(model_id=self.model, params=params, credentials=self.credentials, project_id=self.project_id)
        text = model.generate_text(prompt)
        if stop is not None:
            text = enforce_stop_tokens(text, stop)
        return text
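
The `stop` handling in `_call` truncates the generated text as soon as any stop sequence appears. The sketch below is a plain-Python stand-in for that behavior, written so it can be run without LangChain; the function name `cut_at_stop_sequences` and the sample text are assumptions, and LangChain's own `enforce_stop_tokens` may differ in details such as regex escaping.

```python
import re
from typing import List, Optional

def cut_at_stop_sequences(text: str, stop: Optional[List[str]] = None) -> str:
    """Return text up to (not including) the first occurrence of any stop sequence."""
    if not stop:
        return text
    # Escape each sequence so characters like "." or "?" are matched literally.
    pattern = "|".join(re.escape(s) for s in stop)
    return re.split(pattern, text, maxsplit=1)[0]

raw = "seoul\nQuestion: What is the capital of Japan?"
print(cut_at_stop_sequences(raw, stop=["\nQuestion:", "\n\n"]))
```

Stop sequences are useful when a model keeps generating past the answer, for example by inventing a follow-up question.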

In [27]:
creds = {
        "url": endpoint_url,
        "apikey": api_key
    }

In [28]:
llm_model = LangChainInterface(model='google/flan-ul2', credentials=creds, params=model_params, project_id=project_id)

In [29]:
##predict with the model
text = "What is the capital of South Korea?"
llm_model(text)

'seoul'

## 3. Prompt Templates & Chains

In the previous example, the user input is sent directly to the LLM. However, when using an LLM in an application, you will usually need to reuse the same prompt across multiple scenarios, such as:

- Accepting user input and constructing a prompt from it
- Generating multiple prompts from a collection of data points in a dataset
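
Both scenarios come down to filling named placeholders in a fixed string. The sketch below illustrates this in plain Python without LangChain; the class `SimplePromptTemplate` and the sample rows are hypothetical and only mimic the general shape of a prompt template.

```python
from string import Formatter

class SimplePromptTemplate:
    """Tiny stand-in for a prompt template: a format string plus its declared variables."""

    def __init__(self, template, input_variables):
        # Fail early if the declared variables do not match the placeholders.
        found = {name for _, name, _, _ in Formatter().parse(template) if name}
        if found != set(input_variables):
            raise ValueError(
                f"template uses {sorted(found)}, declared {sorted(input_variables)}"
            )
        self.template = template
        self.input_variables = list(input_variables)

    def format(self, **values):
        return self.template.format(**values)

template = SimplePromptTemplate("What is the capital of {country}?", ["country"])

# Scenario 1: build a prompt from a single user input.
print(template.format(country="South Korea"))

# Scenario 2: build many prompts from a collection of data points.
rows = [{"country": "Japan"}, {"country": "France"}]
prompts = [template.format(**row) for row in rows]
print(prompts)
```

Declaring the input variables up front lets a mismatch between the template and its inputs surface when the template is created rather than when a prompt is sent.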

In [30]:
# Define the prompt templates
prompt = PromptTemplate(
  input_variables=["country"],
  template= "where is the capital of {country}?",
)

# Chaining 
chain = LLMChain(llm=llm_model, prompt=prompt)

# Getting predictions
countries = ["USA", "England", "Japan", "Saudi Arabia"]
for country in countries:
    response = chain.run(country)
    print(prompt.format(country=country) + " = " + response)

where is the capital of USA? = washington dc
where is the capital of England? = london
where is the capital of Japan? = tokyo
where is the capital of Saudi Arabia? = jeddah


## 4. Simple sequential chains
The utility of LangChain becomes apparent when we chain the output of one model call into the input of another. Here is a simple example in which one model generates a question and a second model answers it.

`SimpleSequentialChain` runs its chains in order and passes the single output of each chain as the single input of the next. In our example, the first prompt ends with "Question:", so the first model responds with a question; that text is then passed to the second chain, where it fills the `question` input variable.
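
The hand-off between steps can be sketched in plain Python. In the example below, `fake_llm`, `make_step`, and `run_sequential` are hypothetical names, and the canned replies are invented so the code runs without a model; it only illustrates the idea that each step's output becomes the next step's input.

```python
from typing import Callable, List

def fake_llm(prompt: str) -> str:
    """Stand-in for a model call: returns a canned reply based on the prompt."""
    if prompt.startswith("Generate a random question"):
        return "What is the tallest mountain on Earth?"
    if prompt.startswith("Answer the following question"):
        return "Mount Everest"
    return ""

def make_step(template: str, llm: Callable[[str], str]) -> Callable[[str], str]:
    """Build a one-input, one-output step: fill the template, then call the model."""
    def step(value: str) -> str:
        # The single placeholder is filled positionally, so its name does not matter here.
        return llm(template.format(value))
    return step

def run_sequential(steps: List[Callable[[str], str]], first_input: str) -> str:
    """Feed each step's output into the next step and return the final output."""
    value = first_input
    for step in steps:
        value = step(value)
    return value

steps = [
    make_step("Generate a random question about {}: Question: ", fake_llm),
    make_step("Answer the following question: {}", fake_llm),
]
print(run_sequential(steps, "geography"))
```

Because every step takes exactly one string and returns exactly one string, steps can be reordered or extended without any extra wiring.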

In [31]:
## Create two sequential prompts 
pt1 = PromptTemplate(input_variables=["topic"], template="Generate a random question about {topic}: Question: ")
pt2 = PromptTemplate(
    input_variables=["question"],
    template="Answer the following question: {question}",
)

In [32]:
flan = LangChainInterface(model='google/flan-ul2', credentials=creds, params=model_params, project_id=project_id)
model = LangChainInterface(model='google/flan-ul2', credentials=creds, project_id=project_id)

In [33]:
prompt_to_flan = LLMChain(llm=flan, prompt=pt1)
flan_to_model = LLMChain(llm=model, prompt=pt2)
qa = SimpleSequentialChain(chains=[prompt_to_flan, flan_to_model], verbose=True)

In [34]:
qa.run("the meaning of life")



> Entering new SimpleSequentialChain chain...
What does the word "eudaimonia" mean?
good life

> Finished chain.


'good life'

## 5. Easy Loading of Documents Using LangChain
LangChain makes it easy to extract passages from documents so that you can answer questions based on your document's content.

In [None]:
# !pip install -U sqlalchemy

In [39]:
# Locally we load the PDF file
# pdf='what is generative ai.pdf'
# loaders = [PyPDFLoader(pdf)]

# To make our lives a bit easier with this Cloud version notebook, we will load a PDF from a URL instead
loaders = [PyPDFLoader('https://www.mckinsey.com/~/media/mckinsey/featured%20insights/mckinsey%20explainers/what%20is%20generative%20ai/what%20is%20generative%20ai.pdf')]

In [None]:
index = VectorstoreIndexCreator(
    embedding=HuggingFaceEmbeddings(),
    text_splitter=CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)).from_loaders(loaders)
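
The `chunk_size` and `chunk_overlap` settings control how the document text is cut into pieces before embedding. The sketch below shows the basic idea with fixed-size character windows; the function `split_into_chunks` and the sample text are assumptions, and this is a simplification, since LangChain's `CharacterTextSplitter` also splits on a separator string, so its chunk boundaries will differ.

```python
def split_into_chunks(text, chunk_size, chunk_overlap=0):
    """Split text into windows of at most chunk_size characters.

    Consecutive windows share chunk_overlap characters.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    stride = chunk_size - chunk_overlap
    return [text[start:start + chunk_size] for start in range(0, len(text), stride)]

text = "abcdefghij" * 3  # 30 characters of sample text
no_overlap = split_into_chunks(text, chunk_size=10, chunk_overlap=0)
with_overlap = split_into_chunks(text, chunk_size=10, chunk_overlap=5)
print(len(no_overlap), len(with_overlap))
```

With no overlap the chunks tile the text exactly; a nonzero overlap repeats the tail of each chunk at the start of the next, which can help when an answer spans a chunk boundary, at the cost of storing more chunks.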

In [None]:
###initializing watsonx flan_ul2 model
params = {
    GenParams.DECODING_METHOD: "sample",
    GenParams.MIN_NEW_TOKENS: 50,
    GenParams.MAX_NEW_TOKENS: 300,
    GenParams.TEMPERATURE: 0.2,
    GenParams.TOP_K: 100,
    GenParams.TOP_P:1
}

model = LangChainInterface(model='google/flan-ul2', credentials=creds, params=params, project_id=project_id)

In [None]:
from langchain.chains import RetrievalQA
chain = RetrievalQA.from_chain_type(llm=model, 
                                    chain_type="stuff", 
                                    retriever=index.vectorstore.as_retriever(), 
                                    input_key="question")

In [None]:
##answering based on the documents 
chain.run("what is generative ai?")
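
Conceptually, a retrieval chain with the "stuff" chain type finds the chunks most similar to the question and places them, together with the question, into a single prompt. The sketch below imitates that flow in plain Python; the bag-of-words `embed` function is a toy stand-in for a real embedding model, and the prompt wording, helper names, and sample chunks are all assumptions rather than what LangChain uses internally.

```python
import math
from collections import Counter

def embed(text):
    """Toy embedding: a bag-of-words count vector (stand-in for a real embedding model)."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(question, chunks, k=2):
    """Return the k chunks most similar to the question."""
    q = embed(question)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]

def build_stuffed_prompt(question, chunks, k=2):
    """Place the retrieved chunks and the question into one prompt."""
    context = "\n\n".join(retrieve(question, chunks, k))
    return (
        "Use the context to answer the question.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )

chunks = [
    "generative ai creates new content such as text and images",
    "the report was published by a consulting firm",
    "machine learning models learn patterns from data",
]
print(build_stuffed_prompt("what is generative ai", chunks, k=1))
```

The resulting prompt is what would be sent to the LLM, so the model answers from the retrieved passages rather than from its training data alone.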