## 1. Programmatically using WatsonX.ai models

In [22]:


import sys

!{sys.executable} -m pip install "ibm-watson-machine-learning>=1.0.320" | tail -n 1
!{sys.executable} -m pip install "pydantic>=1.10.0" | tail -n 1
!{sys.executable} -m pip install langchain | tail -n 1
!{sys.executable} -m pip install langchain-community | tail -n 1
!{sys.executable} -m pip install -q sentence-transformers






In [3]:
%pip install pypdf

Note: you may need to restart the kernel to use updated packages.


In [4]:
%pip install SQLAlchemy==2.0.29

Note: you may need to restart the kernel to use updated packages.


In [20]:
# First import the dependencies we need:
import os
from time import sleep
try:
    from langchain import PromptTemplate
    from langchain.chains import LLMChain, SimpleSequentialChain
    from langchain.document_loaders import PyPDFLoader
    from langchain.indexes import VectorstoreIndexCreator # Vectorize db index with chromadb
    from langchain.embeddings import HuggingFaceEmbeddings # For using HuggingFace embedding models
    from langchain.text_splitter import CharacterTextSplitter # Text splitter

    from ibm_watson_machine_learning.metanames import GenTextParamsMetaNames as GenParams
    from ibm_watson_machine_learning.foundation_models import Model
    from ibm_watson_machine_learning.foundation_models.extensions.langchain import WatsonxLLM
    
except ImportError as e:
    print(e)

print("Done importing dependencies.")


Done importing dependencies.


In [6]:
# Get our API key and URL from .env

project_id = ""
api_key = ""
ibm_cloud_url = "https://us-south.ml.cloud.ibm.com"



if api_key is None or ibm_cloud_url is None or project_id is None:
    raise Exception("One or more environment variables are missing!")
else:
    creds = {
        "url": ibm_cloud_url,
        "apikey": api_key 
    }
print("Done getting env variables.")


Done getting env variables.


In [7]:
# Initialize the WatsonX model
params = {
    GenParams.DECODING_METHOD: "sample",
    GenParams.TEMPERATURE: 0.2,
    GenParams.TOP_P: 1,
    GenParams.TOP_K: 25,
    GenParams.REPETITION_PENALTY: 1.0,
    GenParams.MIN_NEW_TOKENS: 1,
    GenParams.MAX_NEW_TOKENS: 20
}

llm_model = Model(
    model_id="google/flan-ul2",
    params=params,
    credentials=creds,
    project_id=project_id
)
print("Done initializing LLM.")


Done initializing LLM.


In [8]:
# Predict with the model
countries = ["France", "Japan", "Australia"]

try:
  for country in countries:
    question = f"What is the capital of {country}"
    res = llm_model.generate_text(question)
    print(f"The capital of {country} is {res.capitalize()}")
except Exception as e:
  print(e)


The capital of France is Paris
The capital of Japan is Tokyo
The capital of Australia is Canberra


## 2. Prompt Templates & Chains

In the previous example, the user input is sent directly to the Watsonx LLM, without using Langchain. This is a basic use case, but real applications are rarely so simple. When using an LLM in an application, you will usually need to reuse the same prompt across multiple scenarios. We will now replicate the previous example, but use an LLM chain. This allows us to:

- Accept user input and contruct a prompt
- Generate multiple prompts from a collection of data points in a dataset 

In [9]:
# Define the prompt template
prompt = PromptTemplate(
  input_variables=["country"],
  template= "What is the capital of {country}?",
)

try:
  # In order to use Langchain, we need to instantiate Langchain extension
  lc_llm_model = WatsonxLLM(model=llm_model)
  
  # Define a chain based on model and prompt
  chain = LLMChain(llm=lc_llm_model, prompt=prompt)

  # Getting predictions
  countries = ["Sweden", "Mexico", "Vietnam"]
  for country in countries:
    response = chain.run(country)
    print(prompt.format(country=country) + " = " + response.capitalize())
    sleep(0.5)
except Exception as e:
  print(e)


  warn_deprecated(


What is the capital of Sweden? = Stockholm
What is the capital of Mexico? = Mexico city
What is the capital of Vietnam? = Hanoi


## 3. Simple sequential chains
The utility of LangChain becomes apparent as we chain outputs of one model as input to another model. Here's a simple example where one generates a question which the other model answers.

LangChain determines a model's output based on its response.  In our examples, the first model creates a response to the end prompt of "Question:" which LangChain maps as an input variable called "question" which it passes to the 2nd model.

In [10]:
# Create two sequential prompts 
pt1 = PromptTemplate(input_variables=["topic"], template="Generate a random question about {topic}: Question: ")
pt2 = PromptTemplate(
    input_variables=["question"],
    template="Answer the following question: {question}",
)
print("Done.")


Done.


In [11]:
# Instantiate 2 models (Note, these could be different models depending on use case)
# Note the .to_langchain() method which returns a WatsonxLLM wrapper, like above.
model_1 = Model(
    model_id="google/flan-ul2",
    params=params,
    credentials=creds,
    project_id=project_id
).to_langchain()
model_2 = Model(
    model_id="google/flan-ul2",
    credentials=creds,
    project_id=project_id
).to_langchain()

print("Done.")


Done.


In [12]:
# Construct the sequential chain
prompt_to_model_1 = LLMChain(llm=model_1, prompt=pt1)
prompt_to_model_2 = LLMChain(llm=model_2, prompt=pt2)
qa = SimpleSequentialChain(chains=[prompt_to_model_1, prompt_to_model_2], verbose=True)
print("Done.")


Done.


In [13]:
# Run our chain with the topic: "an animal"
# Play around with providing different topics to see the output. eg. cars, the Roman empire
try:
  qa.run("an animal")
except Exception as e:
  print(e)




[1m> Entering new SimpleSequentialChain chain...[0m
[36;1m[1;3mWhat is the name of the smallest mammal?[0m
[33;1m[1;3mopossum[0m

[1m> Finished chain.[0m


## 4. Easy Loading of Documents Using Lang Chain
LangChain makes it easy to extract passages from documents so that you can answer questions based on your document's content. First download the example PDF file to your working folder: [what-is-generative-ai.pdf](https://github.com/ibm-build-lab/VAD-VAR-Workshop/blob/main/content/Watsonx/WatsonxAI/105/what-is-generative-ai.pdf)

In [18]:
# Load PDF document
# pdf='what-is-generative-ai.pdf'
file_path = "https://raw.githubusercontent.com/CloudPak-Outcomes/Outcomes-Projects/main/L4assets/watsonx.ai-Assets/Documents/Generative_AI_Overview.pdf"
loaders = [PyPDFLoader(file_path)]
# loaders = [PyPDFLoader(pdf)]
print("Done.")


Done.


In [23]:
# Index loaded PDF
index = VectorstoreIndexCreator(
    embedding=HuggingFaceEmbeddings(),
    text_splitter=CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)).from_loaders(loaders)
print("Done.")


modules.json:   0%|          | 0.00/349 [00:00<?, ?B/s]

config_sentence_transformers.json:   0%|          | 0.00/116 [00:00<?, ?B/s]

README.md:   0%|          | 0.00/10.6k [00:00<?, ?B/s]

sentence_bert_config.json:   0%|          | 0.00/53.0 [00:00<?, ?B/s]

config.json:   0%|          | 0.00/571 [00:00<?, ?B/s]

model.safetensors:   0%|          | 0.00/438M [00:00<?, ?B/s]

tokenizer_config.json:   0%|          | 0.00/363 [00:00<?, ?B/s]

vocab.txt:   0%|          | 0.00/232k [00:00<?, ?B/s]

tokenizer.json:   0%|          | 0.00/466k [00:00<?, ?B/s]

special_tokens_map.json:   0%|          | 0.00/239 [00:00<?, ?B/s]

1_Pooling/config.json:   0%|          | 0.00/190 [00:00<?, ?B/s]

Done.


In [24]:
# Initialize watsonx google/flan-ul2 model
params = {
    GenParams.DECODING_METHOD: "sample",
    GenParams.TEMPERATURE: 0.2,
    GenParams.TOP_P: 1,
    GenParams.TOP_K: 100,
    GenParams.MIN_NEW_TOKENS: 50,
    GenParams.MAX_NEW_TOKENS: 300
}
model = Model(
    model_id="google/flan-ul2",
    params=params,
    credentials=creds,
    project_id=project_id
).to_langchain()

print("Done.")


Done.


In [25]:
# Init RAG chain
from langchain.chains import RetrievalQA

chain = RetrievalQA.from_chain_type(llm=model, 
                                    chain_type="stuff", 
                                    retriever=index.vectorstore.as_retriever(), 
                                    input_key="question")
print("Done.")


Done.


In [None]:
# Answer based on the document
res = chain.run("what is Machine Learning?")
print(res)


Retrieval Augmented Generation (RAG) is a common AI use case. Many companies have vast amounts of data about which they want an AI system to answer questions, do searches or perform summarization tasks. We will learn more about RAG in lab 106.