### Installing Required Packages for Hugging Face and LangChain
In this block, we install the packages needed to work with the **Hugging Face Hub**, **LangChain** (including its community modules and tools for building language model applications), **OpenAI**, **Chroma**, and **SerpAPI** (`google-search-results`).

In [14]:
# Install required packages for Hugging Face and LangChain usage

%pip install -q "langchain" "langchain-community" "langchain-huggingface" \
                 "langchain_openai" "huggingface_hub" "chromadb" "google-search-results"

  Preparing metadata (setup.py) ... done
  Building wheel for google-search-results (setup.py) ... done
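
Because `%pip install -q` suppresses most output, it can help to confirm what actually got installed. The following is a minimal sketch using only the standard library; the helper name `installed_versions` is hypothetical and not part of any of the packages above.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# Distribution names taken from the install command above
required = [
    "langchain", "langchain-community", "langchain-huggingface",
    "langchain-openai", "huggingface-hub", "chromadb", "google-search-results",
]
for name, version in installed_versions(required).items():
    print(f"{name}: {version or 'NOT INSTALLED'}")
```

Any package reported as `NOT INSTALLED` points to an install step that failed quietly.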


### Setting Up Hugging Face Access Token
We configure the environment with **access tokens** for Hugging Face, OpenAI, and Google Search (via SerpAPI). These tokens are required for programmatic access to the models and datasets on the Hugging Face Hub, to OpenAI models, and to Google Search results.

In [11]:
# Constants and API Key Configuration
import os
from google.colab import userdata

# === Load API keys securely from Google Colab Secrets ===
def load_api_keys():
    keys = {
        "HF_TOKEN": userdata.get("HF_TOKEN"),
        "OPENAI_API_KEY": userdata.get("OPENAI_API_KEY"),
        "SERPAPI_API_KEY": userdata.get("SERPAPI_API_KEY")
    }
    for key, value in keys.items():
        if not value:
            raise ValueError(f"❌ Missing {key}. Please set this API key in Colab secrets.")
        os.environ[key] = value
    print("✅ All API keys loaded and configured successfully.")

# Execute API key loading upon running this cell
load_api_keys()

✅ All API keys loaded and configured successfully.
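
`load_api_keys` stops at the first missing key, so fixing several missing secrets can take several runs. As a hedged variation, the check can be separated from the Colab lookup so that all gaps are reported together. `find_missing_keys` is a hypothetical helper, and the sample dictionary stands in for values that `userdata.get` would return.

```python
REQUIRED_KEYS = ["HF_TOKEN", "OPENAI_API_KEY", "SERPAPI_API_KEY"]


def find_missing_keys(required, values):
    """Return the names in `required` whose value is missing or empty."""
    return [name for name in required if not values.get(name)]


# Placeholder values for illustration only; real values would come from Colab secrets
sample_values = {"HF_TOKEN": "hf_placeholder", "OPENAI_API_KEY": "", "SERPAPI_API_KEY": None}

missing = find_missing_keys(REQUIRED_KEYS, sample_values)
if missing:
    print("Missing keys:", ", ".join(missing))
```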


### Building a Simple Q&A Chatbot Using LangChain
We will set up a basic **Q&A chatbot** using **LangChain** and a **small language model** from Hugging Face. This demonstrates how to chain a prompt template, a model, and an output parser.

Exercises:
- Experiment with different **small** models (uncomment one of the alternative `DEFAULT_MODEL` lines to test it).
- Adjust the temperature setting (`TEMP` of 0.9 for more varied responses, 0.1 for less varied ones).
- Try different values for the substitution variables (e.g., set `'language'` to "Spanish").
- Now try **OpenAI's GPT** large language model (by uncommenting the corresponding line below).
- Last, alter the prompt to try to trigger a rude response (e.g., "... you dummy").
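
To build intuition for the temperature exercise: sampling temperature divides the model's raw scores (logits) before they are turned into probabilities, so low values sharpen the distribution and high values flatten it. The sketch below uses made-up logits for three candidate tokens. It illustrates the general idea and is not the code any particular model runs.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Convert logits to probabilities after dividing by the temperature."""
    scaled = [value / temperature for value in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(value - peak) for value in scaled]
    total = sum(exps)
    return [value / total for value in exps]


example_logits = [2.0, 1.0, 0.5]  # hypothetical scores for three tokens
for temp in (0.1, 0.5, 0.9):
    probs = softmax_with_temperature(example_logits, temp)
    print(temp, [round(p, 3) for p in probs])
```

At the lowest temperature nearly all probability lands on the top-scoring token, which is why responses become less varied.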

In [3]:
# Candidate Models

#DEFAULT_MODEL = "openai/gpt-oss-20b"
#DEFAULT_MODEL = "HuggingFaceH4/zephyr-7b-beta"
#DEFAULT_MODEL = "meta-llama/Meta-Llama-3.1-8B-Instruct"
DEFAULT_MODEL = "mistralai/Mistral-7B-Instruct-v0.2"

In [16]:
# --- LangChain Chatbot ---

# Import necessary libraries for the updated LangChain structure
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_huggingface import ChatHuggingFace, HuggingFaceEndpoint
from langchain_openai import ChatOpenAI

# Define the temperature
TEMP = 0.5

# --- Model and Chain Setup (The 'New Way') ---

# 1. Define the base LLM (HuggingFaceEndpoint, equivalent to the original 'llm' instance)
base_llm = HuggingFaceEndpoint(
    repo_id=DEFAULT_MODEL,
    temperature=TEMP,
    # Cap the reply at 50 new tokens (a token limit, which usually means fewer than 50 words)
    max_new_tokens=50,
    # Return only the generated text, without echoing the prompt back
    return_full_text=False
)

# 2A. Wrap the base LLM in ChatHuggingFace
# This allows it to work seamlessly with ChatPromptTemplate messages
chat_llm = ChatHuggingFace(llm=base_llm)

# 2B. Alternative: use OpenAI's chat model instead (ChatOpenAI does not wrap base_llm)
# chat_llm = ChatOpenAI(temperature=TEMP)

# 3. Define the prompt template (Exactly the same as the 'Old Way')
prompt = ChatPromptTemplate.from_messages([
    ('system', 'Please respond in {language} in 25 words or less. {validate}'),
    ('human', '{input}')
])

# 4. Define the chatbot chain
# This uses the new structure: Prompt -> Chat LLM Wrapper -> Output Parser
chain = prompt | chat_llm | StrOutputParser()

# --- Invocation ---

# Invoke the chatbot with the sample input
response = chain.invoke({
    'input': 'Who is the tallest superhero?',
    'language': 'English',
    'validate': 'Keep it clean'
})

# Print the chatbot's response (now guaranteed to be a clean string due to StrOutputParser)
print("--- Response from Converted Chain ---")
print(response)

--- Response from Converted Chain ---
The tallest superhero is Giant-Man, also known as Ant-Man or Goliath, who can grow up to 100 feet tall.
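
The `|` operator in `prompt | chat_llm | StrOutputParser()` composes steps so that each step's output feeds the next. The toy sketch below mimics that idea with plain Python. The `Step` class and the echoing stand-in model are hypothetical, and this is not how LangChain implements its runnables.

```python
class Step:
    """A tiny wrapper that lets functions be chained with the | operator."""

    def __init__(self, fn):
        self.fn = fn

    def __or__(self, other):
        # Build a new step that runs this step first, then the next one
        return Step(lambda value: other.fn(self.fn(value)))

    def invoke(self, value):
        return self.fn(value)


# Stand-ins for the prompt template, the chat model, and the output parser
fill_prompt = Step(
    lambda v: f"Please respond in {v['language']} in 25 words or less. {v['validate']}\n{v['input']}"
)
fake_model = Step(lambda text: {"content": f"(echo) {text.splitlines()[-1]}"})
to_string = Step(lambda message: message["content"])

toy_chain = fill_prompt | fake_model | to_string
toy_response = toy_chain.invoke({
    "input": "Who is the tallest superhero?",
    "language": "English",
    "validate": "Keep it clean",
})
print(toy_response)  # prints "(echo) Who is the tallest superhero?"
```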


### RAG-Based Document Summarization
This section demonstrates a **Retrieval-Augmented Generation** (RAG) process: it splits a document into chunks, embeds them into a searchable vector database, retrieves the chunk most relevant to a query, and generates a summary of it with a language model.

Exercises:
- Change `query_text`, perhaps to something about "radon gas".
- Change the `language` value from English to Spanish.
- Update the system prompt to request special formatting (e.g., HTML or JSON).
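
For the formatting exercise, keep in mind that a model asked for JSON still returns plain text, which may or may not parse. A minimal sketch of a defensive parser follows. `parse_json_reply` is a hypothetical helper, and the sample strings are made up rather than real model output.

```python
import json


def parse_json_reply(reply_text):
    """Return the parsed JSON value from a model reply, or None if none is found."""
    try:
        return json.loads(reply_text)
    except json.JSONDecodeError:
        pass
    # Fall back to the outermost {...} span, in case the model added extra prose
    start, end = reply_text.find("{"), reply_text.rfind("}")
    if start != -1 and end > start:
        try:
            return json.loads(reply_text[start:end + 1])
        except json.JSONDecodeError:
            return None
    return None


print(parse_json_reply('{"summary": "Water is regulated."}'))
print(parse_json_reply('Here you go: {"summary": "Water is regulated."}'))
print(parse_json_reply("No JSON here"))
```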

In [6]:
# Import the embeddings model for RAG and the Chroma in-memory vector database
from langchain_openai import OpenAIEmbeddings, ChatOpenAI
from langchain_text_splitters import CharacterTextSplitter
from langchain_core.prompts import ChatPromptTemplate
from langchain_community.vectorstores import Chroma
import requests
import logging

logging.getLogger("langchain_text_splitters.base").setLevel(logging.ERROR)

# Load the document from GitHub, normalizing line endings
DOC_URL = "https://jerrycuomo.github.io/Think_Artificial_Intelligence/datasets/EPA-consumer-safety-safe-water.txt"
full_text = requests.get(DOC_URL).text.replace("\r\n", "\n").replace("\r", "\n")

# Split the document into chunks of roughly 300 characters
text_splitter = CharacterTextSplitter(chunk_size=300)
texts = text_splitter.split_text(full_text)
print(f"Document has been split into {len(texts)} chunks")

# Initialize the embedding model and create a searchable database from the chunked texts
embeddings = OpenAIEmbeddings()
db = Chroma.from_texts(texts, embeddings)

# Retrieving the context from the DB using similarity search
# query_text = "Can Radon gas enter your home?"
query_text = "What is considered safe drinking water?"
results = db.similarity_search(query_text, 1)

# Configure the prompt template for concise summarization
prompt = ChatPromptTemplate.from_messages([
    ("system", "Please summarize in {language} in 30 words or less. {validate}"),
    ("human", "{question} {input}")
])

# Set up the chat model and pipe the prompt into it to form the chain
llm = ChatOpenAI(temperature=.2)
chain = prompt | llm

# Execute the chain on the first retrieved document, specifying the output language and summary style
response = chain.invoke({"question": query_text,
                         "input": results[0].page_content,
                         "language": "English",
                         "validate": "Say response in plain english."})
print(response.content)

Document has been split into 113 chunks
Drinking water may contain contaminants like microbes, chemicals, and metals. Regulations by EPA and FDA ensure safe tap and bottled water by limiting certain contaminants. Contact EPA for more information.
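
Conceptually, `db.similarity_search(query_text, 1)` embeds the query and returns the stored chunk whose embedding is closest to it. The sketch below mimics that step with word-count vectors and cosine similarity. It is a stand-in for `OpenAIEmbeddings` and Chroma, not their implementation, and the two sample chunks are invented rather than taken from the EPA document.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: lowercase word counts (a stand-in for a learned embedding)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[word] * b[word] for word in a if word in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_chunk(query, chunks):
    """Return the chunk most similar to the query (the k=1 case)."""
    query_vector = embed(query)
    return max(chunks, key=lambda chunk: cosine(query_vector, embed(chunk)))


# Invented sample chunks, for illustration only
sample_chunks = [
    "Radon is a gas that can enter homes through cracks in the foundation.",
    "Public water systems are tested regularly for microbes and metals.",
]
best = top_chunk("Can radon gas enter your home?", sample_chunks)
print(best)
```

Real embeddings capture meaning rather than exact word overlap, so they can match a query to a chunk that shares no words with it; the ranking step works the same way.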


### LangChain Agent (Skilled in Web Search and Math)
This program initializes a LangChain-based agent equipped with search and math tools, allowing it to answer multi-step queries by retrieving information from the web and performing calculations as needed.

Exercise:
- Try different queries by uncommenting the options below.

In [15]:
from langchain_openai import ChatOpenAI
from langchain.agents import initialize_agent, AgentType
from langchain_community.agent_toolkits.load_tools import load_tools

# Initialize the OpenAI chat model with a low temperature for more consistent answers
llm = ChatOpenAI(temperature=.2)

# Load necessary tools for the agent, including SERPAPI for searches and llm-math for mathematical queries
tools = load_tools(["serpapi", "llm-math"], llm=llm)

# Initialize the agent with the loaded tools, setting it to a zero-shot react description mode for dynamic response handling
agent = initialize_agent(
    tools, llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True
)

# Define a query and invoke the agent to handle it, demonstrating the agent's capability to generate and evaluate responses
#query = "How much would it cost to fill a pool the size of an Olympic swimming pool using the average water price in Los Angeles?"
#query = "What’s the average monthly salary in Switzerland, and how long would it take a person earning that salary to save enough to buy a Tesla Model S, factoring in living costs of 70% of their income?"
#query = "What's the current price of Tesla stock, and how much would 15 shares cost?"

query = "What was the total score of the Super Bowl in the year Justin Bieber was born?"

agent.invoke(query)


> Entering new AgentExecutor chain...
I need to find out the total score of the Super Bowl in the year Justin Bieber was born.
Action: Search
Action Input: Super Bowl total score in [Justin Bieber birth year]
Observation: ['Justin Bieber ; Born. Justin Drew Bieber. (1994-03-01) March 1, 1994 (age 31). London, Ontario, Canada ; Occupations. Singer; songwriter ; Years active, 2007– ...', "Justin Bieber ranks No. 8 on Billboard's Super Bowl Halftime Show Top Picks: “Following the surprise drop of his R&B-leaning Swag album in ...", 'The singer made eight points, four assists and two rebounds for the West team, which lost the game 54-49.', 'The Super Bowl was a star-studded event. From Jay-Z to Justin Bieber, see all the celebrities who attended. The Chiefs and 49ers brought out ...', 'There is no gig in music like the Super Bowl halftime show. You have 15 minutes to justify your legend. You have 150 million people watching ...', "It's worth remember

{'input': 'What was the total score of the Super Bowl in the year Justin Bieber was born?',
 'output': '43'}
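
In the trace above, the model writes its next step as `Action:` and `Action Input:` lines, which the agent executor turns into a tool call. A simplified sketch of that parsing step is shown below. `parse_action` is a hypothetical helper written for illustration and is not LangChain's own output parser.

```python
import re


def parse_action(model_text):
    """Extract (tool_name, tool_input) from ReAct-style text, or None if absent."""
    match = re.search(r"Action:\s*(.+?)\s*\nAction Input:\s*(.*)", model_text)
    if match is None:
        return None
    return match.group(1).strip(), match.group(2).strip()


step_text = (
    "I need to find out the total score of the Super Bowl in the year Justin Bieber was born.\n"
    "Action: Search\n"
    "Action Input: Super Bowl total score in [Justin Bieber birth year]"
)
print(parse_action(step_text))
```

The tool name selects which tool to run (here the search tool), and the tool's result is fed back to the model as the next `Observation:`.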