## Setup

Required libraries:
* langchain for integrating language models with retrieval components.
* ibm-watsonx-ai for accessing the watsonx Granite language model.
* wget for downloading files from the internet.
* sentence-transformers for computing dense vector representations of sentences, paragraphs, and images.
* chromadb, an open source embedding database, for storing and searching vectors.
* pydantic for data validation.
* sqlalchemy, a SQL toolkit and Object-Relational Mapper (ORM).

In [None]:
%%capture
!pip install langchain==0.2.6 | tail -n 1
!pip install langchain-community==0.2.6 | tail -n 1
!pip install ibm-watsonx-ai==1.0.10 | tail -n 1
!pip install langchain_ibm==0.1.8 | tail -n 1
!pip install wget==3.2 | tail -n 1
!pip install sentence-transformers==3.0.1 | tail -n 1
!pip install chromadb==0.5.3 | tail -n 1
!pip install pydantic==2.8.0 | tail -n 1
!pip install sqlalchemy==2.0.30 | tail -n 1
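
Because `%%capture` hides the pip output, it can be worth confirming afterwards which versions were actually installed. The following is a minimal sketch using the standard library's `importlib.metadata`; the helper name `installed_versions` is hypothetical and not part of the original notebook.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Return {package name: installed version, or None if it is not installed}."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


# Distribution names copied from the pip commands above.
pinned = [
    "langchain", "langchain-community", "ibm-watsonx-ai", "langchain_ibm",
    "wget", "sentence-transformers", "chromadb", "pydantic", "sqlalchemy",
]

for name, ver in installed_versions(pinned).items():
    print(f"{name}: {ver or 'not installed'}")
```

If a package reports a version other than the pinned one, restarting the kernel and re-running the install cell is usually the first thing to try.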

## Preparing

In [None]:
from ibm_watsonx_ai import APIClient
from ibm_watsonx_ai import Credentials
import os


credentials = Credentials(
    url="https://us-south.ml.cloud.ibm.com",
)

client = APIClient(credentials)

project_id  = "skills-network"

## Loading

CharacterTextSplitter parameters:
- **Chunk size**: The chunk_size parameter specifies the maximum number of characters in each chunk. 
- **Chunk overlap**: The chunk_overlap parameter determines the number of characters that should overlap between consecutive chunks.
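
To make the two parameters concrete, here is a simplified sketch of fixed-size chunking with overlap in plain Python. It is an illustration only: the function `chunk_text` is hypothetical, and LangChain's `CharacterTextSplitter` first splits on a separator and then merges the pieces, so its real chunk boundaries will differ from this sliding window.

```python
def chunk_text(text, chunk_size, chunk_overlap):
    """Split text into character windows of at most `chunk_size`,
    where consecutive windows share `chunk_overlap` characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    step = chunk_size - chunk_overlap  # how far each window advances
    return [text[start:start + chunk_size] for start in range(0, len(text), step)]


sample = "abcdefghijklmnopqrstuvwxyz"

print(chunk_text(sample, chunk_size=10, chunk_overlap=0))
# ['abcdefghij', 'klmnopqrst', 'uvwxyz']

print(chunk_text(sample, chunk_size=10, chunk_overlap=3))
# ['abcdefghij', 'hijklmnopq', 'opqrstuvwx', 'vwxyz']
```

With an overlap of 3, each chunk begins with the last three characters of the previous one, which helps keep a sentence that straddles a boundary retrievable from either side.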

In [None]:
import requests
from langchain_community.document_loaders import TextLoader
from langchain.text_splitter import CharacterTextSplitter

# Define filename and URL
filename = 'state_of_the_union.txt'
url = 'https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/zNYlnZMW6K-9GP72DDizOQ/state-of-the-union.txt'

# Download the file if it does not exist
if not os.path.isfile(filename):
    response = requests.get(url)
    response.raise_for_status()  # Stop here rather than saving an error page to disk
    with open(filename, 'wb') as f:
        f.write(response.content)

# Load the document
loader = TextLoader(filename)
documents = loader.load()

# Split the document into chunks using CharacterTextSplitter
text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
texts = text_splitter.split_documents(documents)

print(f"Number of chunks: {len(texts)}")

Create an embedding model

In [None]:
from ibm_watsonx_ai.foundation_models.utils import get_embedding_model_specs
from langchain_ibm import WatsonxEmbeddings
from ibm_watsonx_ai.foundation_models.utils.enums import EmbeddingTypes
from langchain_community.vectorstores import Chroma

get_embedding_model_specs(credentials.get('url'))

# Part 1: Create Embedding Model
# Set up the WatsonxEmbeddings object
embeddings = WatsonxEmbeddings(
    model_id=EmbeddingTypes.IBM_SLATE_30M_ENG.value,
    url=credentials["url"],
    project_id=project_id
    )

# Part 2: Embed Documents and Store
docsearch = Chroma.from_documents(texts, embeddings)

# Generate and print embedding vectors for a small sample of the chunks
sample_texts = texts[:3]  # Taking a sample of 3 documents for demonstration
sample_embeddings = embeddings.embed_documents([doc.page_content for doc in sample_texts])

print("Sample Embedding Vectors:")
for i, embedding in enumerate(sample_embeddings):
    print(f"Document {i + 1} Embedding Vector: Length: {len(embedding)}; {embedding}")
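
Once chunks are stored as vectors, retrieval amounts to finding the stored vectors closest to the query vector. The sketch below shows that idea with cosine similarity on toy three-dimensional vectors. The helper names and the toy data are hypothetical, and cosine similarity is an illustrative choice: the vector store may use a different distance metric and an approximate index internally.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec, doc_vecs, k=2):
    """Return the indices of the k document vectors most similar to the query."""
    scored = [(cosine_similarity(query_vec, v), i) for i, v in enumerate(doc_vecs)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [i for _, i in scored[:k]]


# Toy vectors standing in for real embeddings (which have hundreds of dimensions).
toy_docs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.7, 0.7, 0.0]]
toy_query = [1.0, 0.1, 0.0]

print(top_k(toy_query, toy_docs, k=2))  # [0, 2]
```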

To view the built-in documentation for WatsonxEmbeddings:

In [None]:
help(WatsonxEmbeddings)

Define the model: Granite

In [None]:
from ibm_watsonx_ai.metanames import GenTextParamsMetaNames as GenParams
from ibm_watsonx_ai.foundation_models.utils.enums import DecodingMethods
from langchain_ibm import WatsonxLLM

# Note: this dictionary is not passed to the model below; `params` is used instead.
parameters = {
    GenParams.DECODING_METHOD: DecodingMethods.GREEDY,
    GenParams.MIN_NEW_TOKENS: 1,
    GenParams.MAX_NEW_TOKENS: 100,
    GenParams.STOP_SEQUENCES: ["\n"],
}

# Create a dictionary to store credential information
credentials = {
    "url"    : "https://us-south.ml.cloud.ibm.com"
}

# Indicate the model we would like to initialize. In this case, ibm/granite-3-8b-instruct.
model_id    = 'ibm/granite-3-8b-instruct'

# Initialize some watsonx.ai model parameters
params = {
        "decoding_method": "greedy",
        "temperature": 0.4,  # Only takes effect with sampling decoding, not greedy
        "min_new_tokens": 1,
        "max_new_tokens": 100,
        #"stop_sequences":["\n"]
    }
project_id  = "skills-network" # <--- NOTE: specify "skills-network" as your project_id
# space_id    = None
# verify      = False

watsonx_granite = WatsonxLLM(
    model_id=model_id, 
    url=credentials["url"], 
    params=params, 
    project_id=project_id, 
)
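
The `decoding_method` and `temperature` settings interact: greedy decoding always takes the highest-scoring token, while temperature only reshapes the distribution that sampling draws from. The toy sketch below illustrates the difference on three made-up logits. All names and numbers here are hypothetical; this is not how watsonx.ai implements decoding, only the general idea.

```python
import math
import random


def softmax(logits, temperature=1.0):
    """Convert logits to probabilities; lower temperature sharpens the distribution."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def pick_greedy(logits):
    """Greedy decoding: always choose the index with the largest logit."""
    return max(range(len(logits)), key=lambda i: logits[i])


def pick_sampled(logits, temperature, rng):
    """Sampling decoding: draw an index according to temperature-scaled probabilities."""
    probs = softmax(logits, temperature)
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]


toy_logits = [2.0, 1.0, 0.5]  # made-up scores for three candidate tokens
rng = random.Random(0)

print(pick_greedy(toy_logits))  # 0, regardless of temperature
print([pick_sampled(toy_logits, 0.4, rng) for _ in range(10)])
```

Greedy decoding gives reproducible answers, which suits a question-answering demo; switching `decoding_method` to sampling is what would make a temperature setting matter.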


Generate a retrieval-augmented response

In [None]:
from langchain.chains import RetrievalQA

qa = RetrievalQA.from_chain_type(llm=watsonx_granite, chain_type="stuff", retriever=docsearch.as_retriever())

query = "What did the president say about highway and bridges in disrepair"
qa.invoke(query)

Output: 

{'query': 'What did the president say about highway and bridges in disrepair',
 'result': '\n\nThe president announced that this year, over 65,000 miles of highway and 1,500 bridges in disrepair will be fixed.'}
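
The `chain_type="stuff"` setting means the retrieved chunks are simply concatenated ("stuffed") into a single prompt together with the question. The sketch below shows the shape of that step in plain Python. The function `build_stuff_prompt`, the prompt wording, and the placeholder chunks are all hypothetical; LangChain's actual prompt template differs.

```python
def build_stuff_prompt(question, retrieved_chunks):
    """Join all retrieved chunks into one context block and append the question."""
    context = "\n\n".join(retrieved_chunks)
    return (
        "Use the following context to answer the question.\n\n"
        f"{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )


# Placeholder strings standing in for chunks returned by the retriever.
placeholder_chunks = ["<text of retrieved chunk 1>", "<text of retrieved chunk 2>"]

prompt = build_stuff_prompt("What did the president say about bridges?", placeholder_chunks)
print(prompt)
```

Because every retrieved chunk goes into one prompt, this approach is simple but bounded by the model's context window; larger `chunk_size` values or more retrieved chunks make the prompt longer.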