# Building Language Model Applications with LangChain and Hugging Face

This notebook explores the use of **LangChain** and **Hugging Face** to build various language model applications. You will learn how to:

- Set up your environment with necessary packages and API keys.
- Build a simple Q&A chatbot using LangChain and a small language model from Hugging Face.
- Implement a Retrieval-Augmented Generation (RAG) system for document summarization.
- Utilize a LangChain agent equipped with tools for web search and calculations.
- Get started with **CrewAI**, a framework for orchestrating collaborative autonomous agents.

Each section includes code examples and exercises to help you experiment with different models, parameters, and techniques.

### Installing Required Packages for Hugging Face and LangChain
In this block, we install the packages used throughout the notebook: **Hugging Face Hub**, **LangChain** with its community, Hugging Face, and OpenAI integrations, the **Chroma** vector store, and the SerpAPI client (`google-search-results`).

In [None]:
# Install required packages for Hugging Face and LangChain usage

%pip install -q "langchain" "langchain-community" "langchain-huggingface" \
                 "langchain-openai" "huggingface_hub" "chromadb" "google-search-results"

### Setting Up Hugging Face Access Token and API Keys
We configure the environment with an **access token** for Hugging Face and **API keys** for OpenAI and Google Search (via SerpAPI and Serper). These are necessary for programmatic access to models and datasets on the Hugging Face Hub, to OpenAI models, and to the web search services used later in the notebook.

In [None]:
# Constants and API Key Configuration
import os
from google.colab import userdata

# === Load API keys securely from Google Colab Secrets ===
def load_api_keys():
    keys = {
        "HF_TOKEN": userdata.get("HF_TOKEN"),
        "OPENAI_API_KEY": userdata.get("OPENAI_API_KEY"),
        "SERPAPI_API_KEY": userdata.get("SERPAPI_API_KEY"),
        "SERPER_API_KEY": userdata.get("SERPER_API_KEY")
    }
    for key, value in keys.items():
        if not value:
            raise ValueError(f"❌ Missing {key}. Please set this API key in Colab secrets.")
        os.environ[key] = value
    print("✅ All API keys loaded and configured successfully.")

# Execute API key loading upon running this cell
load_api_keys()
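
The loader above stops at the first missing key. As a minimal sketch of an alternative check, the snippet below lists every missing key in one pass. It reads from plain environment variables rather than Colab secrets, and the helper name `find_missing_keys` is hypothetical, introduced only for illustration.

```python
import os

# Same key names the notebook loads from Colab secrets.
REQUIRED_KEYS = ["HF_TOKEN", "OPENAI_API_KEY", "SERPAPI_API_KEY", "SERPER_API_KEY"]


def find_missing_keys(required, env):
    """Return the names in `required` that are absent or empty in `env`."""
    return [name for name in required if not env.get(name)]


# Hypothetical usage: report all missing keys at once instead of one at a time.
missing = find_missing_keys(REQUIRED_KEYS, os.environ)
if missing:
    print("Missing keys: " + ", ".join(missing))
else:
    print("All keys present.")
```

Passing the mapping in as an argument keeps the check easy to try with a plain dictionary before pointing it at `os.environ`.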

### Building a Simple Q&A Chatbot Using LangChain
We will set up a basic **Q&A chatbot** using **LangChain** and a **small language model** from Hugging Face. This demonstrates how to chain a prompt template, a model, and an output parser.
Exercises:
- Experiment with different **small** models (uncomment one of the `DEFAULT_MODEL` lines to test alternatives).
- Adjust the temperature setting (`TEMP` near 0.9 for more varied responses, near 0.1 for less varied ones).
- Try different values for the substitution variables (e.g., set `'language'` to "Spanish").
- Try **OpenAI's GPT** large language model (uncomment the corresponding `ChatOpenAI` line below).
- Finally, alter the prompt to try to trigger a rude response (e.g., "... you dummy").
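
To build intuition for the temperature exercise, the sketch below applies the standard temperature-scaled softmax to a few made-up token scores. The logits are hypothetical, and a hosted inference backend may apply temperature together with other sampling settings, so treat this as an illustration of the idea rather than a description of any particular model.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Convert raw scores to probabilities, sharpened or flattened by temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical scores for three candidate next tokens.
logits = [2.0, 1.0, 0.5]

for temp in (0.1, 0.5, 0.9):
    probs = softmax_with_temperature(logits, temp)
    print(temp, [round(p, 3) for p in probs])
```

At a low temperature nearly all of the probability lands on the top-scoring token, which is why responses vary less; at a higher temperature the alternatives keep a meaningful share.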

In [None]:
# Candidate Models

#DEFAULT_MODEL = "openai/gpt-oss-20b"
#DEFAULT_MODEL = "HuggingFaceH4/zephyr-7b-beta"
#DEFAULT_MODEL = "meta-llama/Meta-Llama-3.1-8B-Instruct"
DEFAULT_MODEL = "mistralai/Mistral-7B-Instruct-v0.2"

In [None]:
# --- LangChain Chatbot ---

# Import necessary libraries for the updated LangChain structure
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_huggingface import ChatHuggingFace, HuggingFaceEndpoint
from langchain_openai import ChatOpenAI

# Define the temperature
TEMP = 0.5

# --- Model and Chain Setup ---

# 1. Define the base LLM, served through a Hugging Face inference endpoint
base_llm = HuggingFaceEndpoint(
    repo_id=DEFAULT_MODEL,
    temperature=TEMP,
    # Cap the reply at 50 new tokens (tokens, not words, so the reply is usually shorter than 50 words)
    max_new_tokens=50,
    # Return only the newly generated text, without echoing the prompt
    return_full_text=False
)

# 2A. Wrap the base LLM in ChatHuggingFace
# This allows it to work seamlessly with ChatPromptTemplate messages
chat_llm = ChatHuggingFace(llm=base_llm)

# 2B. Alternative: use an OpenAI chat model instead of the Hugging Face model above
# chat_llm = ChatOpenAI(temperature=TEMP)

# 3. Define the prompt template
prompt = ChatPromptTemplate.from_messages([
    ('system', 'Please respond in {language} in 25 words or less. {validate}'),
    ('human',  'What is the capital of North Carolina?'), # Example for model to reference
    ('ai', 'Raleigh is the capital of North Carolina.'),  # Corresponding answer from AI model
    ('human', '{input}')
])

# 4. Define the chatbot chain
# The pipe operator connects the steps: Prompt -> Chat LLM Wrapper -> Output Parser
chain = prompt | chat_llm | StrOutputParser()

# --- Invocation ---

# Invoke the chatbot with the sample input
response = chain.invoke({
    'input': 'Who is the tallest superhero?',
    'language': 'English',
    'validate': 'Keep it clean'
})

# Print the chatbot's response (a plain string, because StrOutputParser extracts the message text)
print("--- Response from Converted Chain ---")
print(response)
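
The chain above is composed with the `|` operator. As a toy illustration of that composition pattern, and not LangChain's actual implementation, the sketch below defines a hypothetical `Step` class whose `|` operator feeds one step's output into the next. The three lambda components are stand-ins invented for this example.

```python
class Step:
    """Toy stand-in for a chain component: wraps a function and supports `|`."""

    def __init__(self, func):
        self.func = func

    def invoke(self, value):
        return self.func(value)

    def __or__(self, other):
        # `a | b` builds a new step that runs a, then feeds its output to b.
        return Step(lambda value: other.invoke(self.invoke(value)))


# Hypothetical components mirroring prompt -> model -> parser.
fill_prompt = Step(lambda v: f"Respond in {v['language']}: {v['input']}")
fake_model = Step(lambda text: {"content": text.upper()})  # pretend LLM reply
parse_text = Step(lambda message: message["content"])

toy_chain = fill_prompt | fake_model | parse_text
print(toy_chain.invoke({"language": "English", "input": "hello"}))
```

Each stage only needs to accept the previous stage's output, which is why swapping `ChatHuggingFace` for `ChatOpenAI` in the notebook leaves the rest of the chain unchanged.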

### RAG-Based Document Summarization
This section demonstrates a **Retrieval-Augmented Generation** (RAG) process: it splits a document into chunks, embeds the chunks into a searchable vector database, retrieves the chunk most relevant to a question, and generates a summary with a language model.
Exercises:
- Change `query_text`, for example to a question about "radon gas".
- Change the `language` value to Spanish.
- Update the system prompt to request special formatting (e.g., HTML or JSON).

In [None]:
# Import the embeddings model for RAG and the Chroma in-memory vector database
# (the older langchain.text_splitter and langchain.vectorstores import paths are deprecated)
from langchain_openai import OpenAIEmbeddings, ChatOpenAI
from langchain_text_splitters import CharacterTextSplitter
from langchain_core.prompts import ChatPromptTemplate
from langchain_community.vectorstores import Chroma
import requests
import logging

logging.getLogger("langchain_text_splitters.base").setLevel(logging.ERROR)

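# Load the document from GitHub, normalizing line endings
DOC_URL = "https://jerrycuomo.github.io/Think_Artificial_Intelligence/datasets/EPA-consumer-safety-safe-water.txt"
doc_response = requests.get(DOC_URL, timeout=30)
doc_response.raise_for_status()  # Fail early if the download did not succeed
full_text = doc_response.text.replace("\r\n", "\n").replace("\r", "\n")

# Split the document into chunks of about 300 characters
# (a paragraph longer than that is kept whole, since the splitter only breaks on blank lines)
text_splitter = CharacterTextSplitter(chunk_size=300)
texts = text_splitter.split_text(full_text)
print(f"Document has been split into {len(texts)} chunks")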

# Initialize the embedding model and create a searchable database from the chunked texts
embeddings = OpenAIEmbeddings()
db = Chroma.from_texts(texts, embeddings)

# Question to search for in the vector database
# query_text = "Can Radon gas enter your home?"
# query_text = "What is considered safe drinking water?"
query_text = "What is the issue with Arsenic?"

# Retrieve the single most similar chunk from the database
results = db.similarity_search(query_text, k=1)

# Configure the prompt template for concise summarization
prompt = ChatPromptTemplate.from_messages([
    ("system", "Please summarize in {language} in 30 words or less. {validate}"),
    ("human", "{question} {input}")
])

# Set up the chat model and pipe the prompt into it
llm = ChatOpenAI(temperature=.2)
chain = prompt | llm

# Execute the chain on the first retrieved document, specifying the output language and summary style
response = chain.invoke({"question": query_text,
                         "input": results[0].page_content,
                         "language": "English",
                         "validate": "Respond in plain English."})
print(response.content)
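
The retrieval step can be easier to follow with the embedding model taken out of the picture. The sketch below is a simplified stand-in, assuming word-count vectors in place of OpenAI embeddings and a plain list in place of Chroma. The sample text is made up for illustration and is not taken from the EPA document, and all function names are hypothetical.

```python
import math
import re
from collections import Counter


def split_into_chunks(text):
    """Split on blank lines, dropping empty pieces."""
    return [part.strip() for part in text.split("\n\n") if part.strip()]


def embed(text):
    """Toy 'embedding': a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine of the angle between two count vectors (0.0 if either is empty)."""
    dot = sum(a[word] * b[word] for word in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def most_similar(query, chunks):
    """Return the chunk whose vector is closest to the query's vector."""
    query_vec = embed(query)
    return max(chunks, key=lambda chunk: cosine_similarity(query_vec, embed(chunk)))


# Made-up sample text, not taken from the EPA document.
sample_text = (
    "Radon is a gas that can seep into homes through cracks.\n\n"
    "Arsenic in well water is a health concern over long exposure.\n\n"
    "Boiling water kills many germs."
)
sample_chunks = split_into_chunks(sample_text)
print(most_similar("What is the issue with arsenic?", sample_chunks))
```

Real embeddings capture meaning rather than exact word overlap, so a production retriever can match a question to a chunk that shares no words with it; the ranking-by-similarity step is the same.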

### LangChain Agent (Skilled in Web Search and Math)
This program initializes a LangChain-based agent equipped with search and math tools, allowing it to answer complex queries by retrieving information from the web and performing calculations dynamically.

Exercise:
- Try different queries by uncommenting the options below.

In [None]:
from langchain_openai import ChatOpenAI
from langchain.agents import initialize_agent, AgentType
from langchain_community.agent_toolkits.load_tools import load_tools

# Initialize the OpenAI chat model that drives the agent, with a low temperature
llm = ChatOpenAI(temperature=.2)

# Load the agent's tools: SerpAPI for web searches and llm-math for calculations
tools = load_tools(["serpapi", "llm-math"], llm=llm)

# Initialize the agent with the loaded tools, using the zero-shot ReAct strategy,
# in which the model picks a tool at each step based on the tool descriptions
agent = initialize_agent(
    tools, llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True
)

# Define a query for the agent; uncomment one of the alternatives below to try it instead
#query = "How much would it cost to fill a pool the size of an Olympic swimming pool using the average water price in Los Angeles?"
#query = "What’s the average monthly salary in Switzerland, and how long would it take a person earning that salary to save enough to buy a Tesla Model S, factoring in living costs of 70% of their income?"
#query = "What's the current price of Tesla stock, and how much would 15 shares cost?"
#query = "What should I wear today?"

query = "What was the total score of the Super Bowl in the year Justin Bieber was born?"

agent.invoke(query)
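
With `verbose=True`, the agent prints its steps in a Thought / Action / Action Input / Observation pattern. The sketch below illustrates one such step under simplifying assumptions: the model output is a made-up string, the tool registry holds a single hypothetical `Calculator` tool, and the parsing is a plain regular expression rather than LangChain's own parser.

```python
import ast
import operator
import re

# Operators the toy calculator is willing to evaluate.
_OPERATORS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def calculate(expression):
    """Safely evaluate a basic arithmetic expression such as '15 * 4'."""
    def evaluate(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPERATORS:
            return _OPERATORS[type(node.op)](evaluate(node.left), evaluate(node.right))
        raise ValueError("Unsupported expression")
    return evaluate(ast.parse(expression, mode="eval").body)


# Hypothetical tool registry: tool name -> function taking a string input.
TOOLS = {"Calculator": lambda text: str(calculate(text))}


def parse_action(llm_text):
    """Extract (tool name, tool input) from ReAct-style model output."""
    match = re.search(r"Action:\s*(.+?)\s*\nAction Input:\s*(.+)", llm_text)
    if match is None:
        return None
    return match.group(1), match.group(2).strip()


def run_step(llm_text):
    """Dispatch one parsed action to its tool and return the observation."""
    parsed = parse_action(llm_text)
    if parsed is None:
        return "No action found"
    tool_name, tool_input = parsed
    if tool_name not in TOOLS:
        return f"Unknown tool: {tool_name}"
    return TOOLS[tool_name](tool_input)


# Made-up model output in the Thought / Action / Action Input format.
sample_output = "Thought: I need to multiply.\nAction: Calculator\nAction Input: 15 * 4"
print("Observation:", run_step(sample_output))
```

In a full agent loop the observation is appended to the prompt and the model is called again, repeating until it produces a final answer instead of another action.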


### CrewAI Agent: A Collaborative AI Framework
This section introduces **CrewAI**, a framework for orchestrating autonomous agents that can work together to solve complex tasks. In this example, we set up a single agent equipped with a **web search tool** to answer user queries. The agent is designed with a specific **goal** and **backstory** to guide its decision-making process, particularly in determining when to use its tools. The code demonstrates how to:

- Initialize a web search tool.
- Define an agent with a role, goal, backstory, and assigned tools.
- Create a task for the agent with a clear description and expected output.
- Assemble a `Crew` with the defined agent and task.
- Execute the crew with a user query and observe the agent's process and final answer.

This example highlights how CrewAI allows for the creation of intelligent agents capable of deciding whether to use external tools based on the nature of the query and their defined objectives.

In [None]:
%%capture --no-stderr
%pip install -U --quiet 'crewai[tools]' aisuite

In [None]:
import os
from crewai import Agent, Task, Crew, Process
from langchain_openai import ChatOpenAI
from crewai_tools import WebsiteSearchTool

# ----------------------------------------------------------------------
# 🔑 SETUP: Ensure your API Key is set as an environment variable (OPENAI_API_KEY)
# ----------------------------------------------------------------------

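# Initialize the Search Tool
# WebsiteSearchTool runs a semantic search over the content of a website, giving the
# agent access to external information. For general web search, crewai_tools also
# provides SerperDevTool, which reads the SERPER_API_KEY loaded earlier.
general_search_tool = WebsiteSearchTool()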

# Define the Language Model (LLM)
# gpt-4o-mini is cost-effective and capable of complex reasoning.
general_llm = ChatOpenAI(
    model_name="gpt-4o-mini",
    temperature=0.5,
    max_tokens=1000
)

# ----------------------------------------------------------------------
# 🧠 AGENT DEFINITION: The Intelligence Core
# ----------------------------------------------------------------------

general_agent = Agent(
    role="General Question Answerer",
    # The GOAL is the primary instruction for the agent's behavior.
    goal="Answer any user query. Use the search tool only for questions requiring current or external information.",
    backstory="A simple assistant ready to answer questions.",

    tools=[general_search_tool],
    llm=general_llm,
    max_iter=3, # Number of times to try to arrive at an answer
    verbose=True, # Shows the agent's 'Thought' process (CoT reasoning)
    allow_delegation=False
)

# ----------------------------------------------------------------------
# 🎯 TASK DEFINITION: The CoT Instruction
# ----------------------------------------------------------------------

answer_query_task = Task(
    # The agent's goal and the LLM's own reasoning guide the steps.
    description=(
        "**Analyze the user query: '{query}' and provide the most complete and accurate answer possible.** "
        "Use your tool to find any necessary current, real-time, or external data required to fulfill this request. "
        "If the query requires context (like location or time) that is not provided, use your tools to deduce it."
    ),
    expected_output="A single, concise, and accurate final answer to the user query.",
    agent=general_agent
)

# ----------------------------------------------------------------------
# ⚙️ CREW & EXECUTION
# ----------------------------------------------------------------------

research_crew = Crew(
    agents=[general_agent],
    tasks=[answer_query_task],
    process=Process.sequential,
    verbose=True # Shows the execution flow and final result
)

def run_lecture_demo(query: str):
    """Executes the CrewAI process for the given query."""
    if 'OPENAI_API_KEY' not in os.environ:
        print("\nERROR: Please set the OPENAI_API_KEY environment variable to run the demo.")
        return

    print("\n=====================================================")
    print(f"🧠 RUNNING QUERY: **{query}**")
    print("=====================================================")

    user_inputs = {"query": query}

    # Kickoff the crew execution
    result = research_crew.kickoff(inputs=user_inputs)

    print("\n-----------------------------------------------------")
    print(f"✅ FINAL ANSWER for '{query}':\n")
    print(result)
    print("-----------------------------------------------------\n")

# --- DEMONSTRATION QUERIES ---
if __name__ == "__main__":
    # 💡 Demo 1: Complex query that forces Deconstruction (CoT) and a Search.
    # The agent must realize: 1) It needs current info (weather). 2) It needs context (location).
    run_lecture_demo("What should I wear today?")

    # 💡 Demo 2 (Optional): Simple query that should use internal knowledge only.
    # Uncomment to show the agent intelligently skipping the search tool.
    # run_lecture_demo("Who wrote the novel 'Moby Dick'?")
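
The task description contains a `{query}` placeholder that is filled from the `inputs` dictionary passed to `kickoff`. As a rough sketch of that substitution, the snippet below uses Python's built-in `str.format`. This is an assumption made for illustration; CrewAI's internal interpolation may work differently, and the helper names are hypothetical.

```python
# Template with a named placeholder, modeled on the task description above.
description_template = "Analyze the user query: '{query}' and answer it concisely."


def fill_template(template, inputs):
    """Substitute each {name} placeholder with the matching entry in `inputs`."""
    return template.format(**inputs)


def safe_fill(template, inputs):
    """Like fill_template, but report a missing input instead of raising KeyError."""
    try:
        return fill_template(template, inputs)
    except KeyError as error:
        return f"Missing input: {error.args[0]}"


print(safe_fill(description_template, {"query": "What should I wear today?"}))
print(safe_fill(description_template, {}))
```

Keeping the placeholder name and the `inputs` key identical is the part that matters in practice: a mismatch leaves the agent with a task that never mentions the user's question.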