---
# LangChain Agentic Framework

LangChain is a framework for developing applications powered by language models. It provides abstractions for:

- **Models**: Unified interface for different LLM providers
- **Prompts**: Templates and management for prompts
- **Chains**: Combining multiple components sequentially
- **Agents**: Autonomous systems that decide which actions to take
- **Tools**: Functions that agents can use to interact with the world

Reference: https://docs.langchain.com/oss/python/langchain/overview
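
To make the "Chains" idea concrete, here is a plain-Python sketch of how piping components together can work. It does not use LangChain: `Step`, `format_prompt`, and `fake_model` are hypothetical names chosen for illustration, and the library's real runnables do considerably more (batching, streaming, async execution).

```python
class Step:
    """Wraps a function so that steps can be composed with the | operator."""

    def __init__(self, func):
        self.func = func

    def __or__(self, other):
        # (a | b).invoke(x) runs a first, then feeds its result to b
        return Step(lambda value: other.invoke(self.invoke(value)))

    def invoke(self, value):
        return self.func(value)


# Hypothetical stand-ins for a prompt template and a model
format_prompt = Step(lambda data: f"Summarize the campaign: {data['campaign']}")
fake_model = Step(lambda prompt: f"[model reply to: {prompt}]")

chain = format_prompt | fake_model
reply = chain.invoke({"campaign": "Q4 2024 Brand Campaign"})
print(reply)
```

The same shape appears later in this notebook as `prompt | llm`, where the left-hand side formats the input and the right-hand side consumes the formatted result.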

In [1]:
# Import LangChain components
from langchain_groq import ChatGroq
from langchain_google_genai import ChatGoogleGenerativeAI
from langchain_core.messages import HumanMessage, SystemMessage, AIMessage
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_core.tools import tool
from langchain_classic.agents import AgentExecutor, create_tool_calling_agent
import re
import json
from dotenv import load_dotenv
import os


In [2]:
# Initialize LangChain models
llm_groq = None
llm_gemini = None
load_dotenv("keys.env")  # adjust the path if keys.env is not next to the notebook
print("Environment variables loaded" if os.getenv("GROQ_API_KEY") or os.getenv("GOOGLE_API_KEY") else "No environment variables found")

Environment variables loaded


In [10]:

# Read keys from environment
GROQ_API_KEY = os.getenv("GROQ_API_KEY")
GEMINI_API_KEY = False  # Gemini is disabled in this run; use os.getenv("GOOGLE_API_KEY") to enable it

if GROQ_API_KEY:
    llm_groq = ChatGroq(
        api_key=GROQ_API_KEY,
        model_name="llama-3.3-70b-versatile",
        temperature=0
    )
    print("LangChain Groq model initialized")

if GEMINI_API_KEY:
    llm_gemini = ChatGoogleGenerativeAI(
        google_api_key=GEMINI_API_KEY,
        model="gemini-2.0-flash",
        temperature=0
    )
    print("LangChain Gemini model initialized")

# Select default model (Groq first, Gemini as fallback)
llm = llm_groq if llm_groq else llm_gemini
default_name = "Groq" if llm_groq else "Gemini" if llm_gemini else "none"
print(f"\nDefault LangChain model: {default_name}")

LangChain Groq model initialized

Default LangChain model: Groq


---
## Application 1: Conversational Assistant with LangChain

LangChain provides a more structured approach to building conversational assistants with built-in memory management.

In [11]:
print("llm initialized" if llm else "No LLM initialized. Please check your API keys in the keys.env file.")

llm initialized


The following cell defines a conversational chain that handles user queries and keeps per-session context using LangChain's prompt and message-history abstractions.

In [12]:
from langchain_core.chat_history import InMemoryChatMessageHistory # For storing chat history
from langchain_core.runnables.history import RunnableWithMessageHistory

# Create a simple conversational chain with memory
prompt = ChatPromptTemplate.from_messages([
    ("system", """You are a professional advertising campaign analyst.
Your role is to help marketing professionals understand and optimize their campaigns.
Be concise, data-driven, and use industry-standard terminology."""),
    MessagesPlaceholder(variable_name="history"),
    ("human", "{input}")
])


# The | operator builds a chain (a runnable): the input is first formatted by
# the prompt template, and the resulting messages are then passed to the LLM.
chain = prompt | llm

# Store for managing session histories
session_histories = {}

def get_session_history(session_id: str):
    """Get or create a session history."""
    if session_id not in session_histories:
        session_histories[session_id] = InMemoryChatMessageHistory()
    return session_histories[session_id]

# Create runnable with history
conversational_chain = RunnableWithMessageHistory( # RunnableWithMessageHistory to add history capability
    chain,
    get_session_history, # function to get history per session
    input_messages_key="input",
    history_messages_key="history"
)
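
Conceptually, a history-aware wrapper does three things on every call: it looks up the stored messages for the session, calls the chain with those messages plus the new input, and then stores both the input and the reply. The sketch below mirrors that flow in plain Python. It is an illustration under stated assumptions, not LangChain's implementation: `histories`, `echo_model`, and `invoke_with_history` are hypothetical names, and the stub model only reports how much context it received.

```python
from collections import defaultdict

# session_id -> list of (role, text) tuples; stands in for the chat history store
histories = defaultdict(list)


def echo_model(messages):
    """Stub model: reports how many earlier messages it was given."""
    earlier = len(messages) - 1
    return f"I can see {earlier} earlier message(s)."


def invoke_with_history(user_input, session_id):
    history = histories[session_id]
    # 1. Build the model input from the stored history plus the new message
    messages = history + [("human", user_input)]
    reply = echo_model(messages)
    # 2. Persist both sides of the exchange for the next turn
    history.append(("human", user_input))
    history.append(("ai", reply))
    return reply


first = invoke_with_history("What is the GRP?", "demo_session")
second = invoke_with_history("How does that compare?", "demo_session")
other = invoke_with_history("Hello", "another_session")
print(first)
print(second)
print(other)
```

Because each session ID maps to its own list, the second call in `demo_session` sees the first exchange, while `another_session` starts with an empty history.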

In [13]:
# Test the conversational chain
print("=" * 60)
print("LANGCHAIN CONVERSATIONAL ASSISTANT")
print("=" * 60)

session_id = "demo_session"

# First message
response1 = conversational_chain.invoke(
    {"input": "I have a TV campaign with 48% reach and frequency of 3.5. What is the GRP?"},
    config={"configurable": {"session_id": session_id}}
)
print(f"\nUser: I have a TV campaign with 48% reach and frequency of 3.5. What is the GRP?")
print(f"\nAssistant: {response1.content}")

# Follow-up (context should be maintained)
response2 = conversational_chain.invoke(
    {"input": "How does that compare to industry benchmarks?"},
    config={"configurable": {"session_id": session_id}}
)
print(f"\n{'='*60}")
print(f"\nUser: How does that compare to industry benchmarks?")
print(f"\nAssistant: {response2.content}")

LANGCHAIN CONVERSATIONAL ASSISTANT

User: I have a TV campaign with 48% reach and frequency of 3.5. What is the GRP?

Assistant: To calculate GRP (Gross Rating Point), you multiply the reach by the frequency. 

GRP = Reach x Frequency
GRP = 48 x 3.5
GRP = 168

So, the GRP for your TV campaign is 168.


User: How does that compare to industry benchmarks?

Assistant: Industry benchmarks for GRP vary depending on the category, target audience, and campaign goals. However, as a general guideline:

* Low-awareness campaigns: 100-200 GRP
* Mid-awareness campaigns: 200-400 GRP
* High-awareness campaigns: 400-600 GRP

Based on this, your campaign's GRP of 168 falls into the low-awareness category. This may indicate that the campaign could benefit from increased frequency or reach to achieve higher awareness levels.
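
The GRP arithmetic in the first reply can be checked directly. The helper below is a minimal sketch; `gross_rating_points` is a hypothetical name, and reach is assumed to be expressed in percent, as in the conversation above.

```python
def gross_rating_points(reach_pct: float, frequency: float) -> float:
    """GRP = reach (in percent) x average frequency."""
    return reach_pct * frequency


grp = gross_rating_points(48, 3.5)
print(grp)
```

Checking model arithmetic this way is cheap, and it is one reason the later sections delegate calculations to tools instead of relying on generated text.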


---
## Application 2: Document Analysis Agent with Tools

LangChain agents can use tools to perform specific tasks. This is particularly useful for document analysis where we need structured extraction.

In [14]:
# Define tools for document analysis

@tool
def extract_dates(text: str) -> str:
    """
    Extract all dates mentioned in the text.
    Use this tool when you need to find campaign periods or deadlines.
    """
    patterns = [
        r'\d{4}-\d{2}-\d{2}',  # YYYY-MM-DD, this is a regex, aka a regular expression
        r'\d{1,2}/\d{1,2}/\d{4}',  # DD/MM/YYYY or MM/DD/YYYY
        r'(?:January|February|March|April|May|June|July|August|September|October|November|December)\s+\d{1,2},?\s+\d{4}',
    ]
    
    dates_found = []
    for pattern in patterns:
        matches = re.findall(pattern, text, re.IGNORECASE)
        dates_found.extend(matches)
    
    if dates_found:
        return f"Dates found: {', '.join(dates_found)}"
    return "No dates found in the text."


@tool
def extract_monetary_values(text: str) -> str:
    """
    Extract monetary values and currencies from the text.
    Use this tool when you need to find budgets, costs, or prices.
    """
    patterns = [
        r'(?:EUR|USD|GBP|\$|\u20ac|\u00a3)\s*[\d,]+(?:\.\d{2})?',
        r'[\d,]+(?:\.\d{2})?\s*(?:EUR|USD|GBP|euros?|dollars?)',
    ]
    
    values_found = []
    for pattern in patterns:
        matches = re.findall(pattern, text, re.IGNORECASE)
        values_found.extend(matches)
    
    if values_found:
        return f"Monetary values found: {', '.join(values_found)}"
    return "No monetary values found in the text."


@tool
def extract_percentages(text: str) -> str:
    """
    Extract percentage values from the text.
    Use this tool when you need to find reach, budget utilization, or other percentage-based metrics.
    """
    pattern = r'\d+(?:\.\d+)?%'
    matches = re.findall(pattern, text)
    
    if matches:
        return f"Percentages found: {', '.join(matches)}"
    return "No percentages found in the text."


@tool
def analyze_sentiment(text: str) -> str:
    """
    Analyze the overall sentiment and tone of the document.
    Use this to understand if a report is positive, negative, or neutral.
    """
    positive_words = ['success', 'exceeded', 'growth', 'improved', 'excellent', 'strong', 'achieved']
    negative_words = ['failed', 'declined', 'below', 'poor', 'missed', 'weak', 'challenge']
    
    text_lower = text.lower()
    pos_count = sum(1 for word in positive_words if word in text_lower)
    neg_count = sum(1 for word in negative_words if word in text_lower)
    
    if pos_count > neg_count:
        sentiment = "positive"
    elif neg_count > pos_count:
        sentiment = "negative"
    else:
        sentiment = "neutral"
    
    return f"Document sentiment: {sentiment} (positive indicators: {pos_count}, negative indicators: {neg_count})"
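
Before handing the tools to an agent, it can help to try the regular expressions on a short string. The sketch below reuses the month-name and percentage patterns from the tools above without the `@tool` decorator; `sample` is a made-up sentence assembled from phrases in the test report.

```python
import re

sample = "Campaign Period: October 1, 2024 - December 31, 2024. Reach: 52.3% (target was 50%)."

month_names = (
    "January|February|March|April|May|June|July|"
    "August|September|October|November|December"
)
# Same shape as the third date pattern in extract_dates
date_pattern = rf"(?:{month_names})\s+\d{{1,2}},?\s+\d{{4}}"
# Same pattern as in extract_percentages
percent_pattern = r"\d+(?:\.\d+)?%"

dates = re.findall(date_pattern, sample, re.IGNORECASE)
percentages = re.findall(percent_pattern, sample)

print(dates)
print(percentages)
```

Note that the percentage pattern only matches values followed by a `%` sign, so a phrase such as "2.3 percentage points" is not picked up.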

In [15]:
# Create the document analysis agent
tools = [extract_dates, extract_monetary_values, extract_percentages, analyze_sentiment]

agent_prompt = ChatPromptTemplate.from_messages([
    ("system", """You are a document analysis assistant specialized in advertising campaign documents.
Your job is to extract and analyze information from documents using the available tools.

When analyzing a document:
1. Use the appropriate tools to extract specific information
2. Synthesize the extracted information into a coherent summary
3. Highlight the most important findings

Always use tools when relevant information can be extracted."""),
    MessagesPlaceholder(variable_name="chat_history", optional=True),
    ("human", "{input}"),
    MessagesPlaceholder(variable_name="agent_scratchpad")
])

# Create the agent
agent = create_tool_calling_agent(llm, tools, agent_prompt)
agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True)
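
The executor's job can be pictured as a loop: ask the model what to do, run the requested tool, record the result, and repeat until the model returns a final answer. The sketch below shows that loop in plain Python. It is a simplified illustration, not LangChain's implementation: `scripted_model`, `find_percentages`, `TOOLS`, and `run_agent` are hypothetical names, and the stub model follows a fixed script instead of calling an LLM.

```python
import re


def find_percentages(text: str) -> str:
    """Hypothetical tool: returns all percentages as a comma-separated string."""
    return ", ".join(re.findall(r"\d+(?:\.\d+)?%", text))


TOOLS = {"find_percentages": find_percentages}


def scripted_model(user_input, scratchpad):
    """Stub for the LLM: asks for a tool first, then answers from its result."""
    if not scratchpad:
        return {"tool": "find_percentages", "args": {"text": user_input}}
    last_result = scratchpad[-1][1]
    return {"final": f"Percentages in the document: {last_result}"}


def run_agent(user_input, max_steps=5):
    scratchpad = []  # (tool call, tool result) pairs seen so far
    for _ in range(max_steps):
        decision = scripted_model(user_input, scratchpad)
        if "final" in decision:
            return decision["final"]
        # Execute the requested tool and record the observation
        result = TOOLS[decision["tool"]](**decision["args"])
        scratchpad.append((decision, result))
    return "Stopped: step limit reached."


answer = run_agent("Reach achieved: 52.3% (target was 50%)")
print(answer)
```

The `scratchpad` list plays the role of the `agent_scratchpad` placeholder in the prompt: it is how earlier tool calls and their results are fed back to the model on the next step.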

In [16]:
# Test the document analysis agent
analysis_document = """
CAMPAIGN PERFORMANCE REPORT - Q4 2024

Campaign Period: October 1, 2024 - December 31, 2024
Total Investment: EUR 180,000

Key Results:
- Reach achieved: 52.3% (target was 50%)
- Average frequency: 4.1
- Campaign GRP: 214.4

The campaign showed strong performance, exceeding the reach target by 2.3 percentage points.
Digital channels achieved 38% reach while TV contributed 45% reach with some overlap.

Budget utilization was at 92%, with the remaining 8% to be reallocated to Q1 2025.
Overall, the campaign achieved its primary objectives of brand awareness growth.
"""

print("=" * 60)
print("DOCUMENT ANALYSIS AGENT")
print("=" * 60)

result = agent_executor.invoke({
    "input": f"""Analyze this campaign report and extract all key information:

{analysis_document}

Provide a summary of:
1. All dates found
2. All monetary values
3. All percentages
4. The overall sentiment of the report"""
})

print("\n" + "=" * 60)
print("ANALYSIS RESULT:")
print("=" * 60)
print(result["output"])

DOCUMENT ANALYSIS AGENT


> Entering new AgentExecutor chain...

Invoking: `extract_dates` with `{'text': 'CAMPAIGN PERFORMANCE REPORT - Q4 2024 Campaign Period: October 1, 2024 - December 31, 2024 Total Investment: EUR 180,000 Key Results: - Reach achieved: 52.3% (target was 50%) - Average frequency: 4.1 - Campaign GRP: 214.4 The campaign showed strong performance, exceeding the reach target by 2.3 percentage points. Digital channels achieved 38% reach while TV contributed 45% reach with some overlap. Budget utilization was at 92%, with the remaining 8% to be reallocated to Q1 2025. Overall, the campaign achieved its primary objectives of brand awareness growth.'}`


Dates found: October 1, 2024, December 31, 2024
Invoking: `extract_monetary_values` with `{'text': 'CAMPAIGN PERFORMANCE REPORT - Q4 2024 Campaign Period: October 1, 2024 - December 31, 2024 Total Investment: EUR 180,000 Key Results: - Reach achieved: 52.3% (target wa

### Exercise 2.2: Extend the Document Analysis Agent

Add a new tool to the document analysis agent.

Requirements:
- Create a new @tool function
- The tool should extract or analyze a specific type of information
- Integrate it with the existing agent
- Test with a sample document

In [10]:
# Exercise 2.2: Your solution here

# TODO: Create a new tool
# @tool
# def your_new_tool(text: str) -> str:
#     """
#     Description of what this tool does.
#     """
#     # Your implementation
#     pass

# TODO: Add the tool to the agent and test
# extended_tools = tools + [your_new_tool]
# extended_agent = create_tool_calling_agent(llm, extended_tools, agent_prompt)
# extended_executor = AgentExecutor(agent=extended_agent, tools=extended_tools, verbose=True)

# See solutions.py for reference implementation

---
## Application 3: Report Generation Agent

An agent that can generate comprehensive reports by using tools to gather data and format output.

In [11]:
# Define tools for report generation

# Simulated database of campaign data
CAMPAIGN_DATABASE = {
    "Q4_2024": {
        "name": "Q4 2024 Brand Campaign",
        "client": "TechCorp",
        "period": {"start": "2024-10-01", "end": "2024-12-31"},
        "budget": {"planned": 200000, "spent": 185000, "currency": "EUR"},
        "performance": {
            "reach": 52.3,
            "reach_target": 50.0,
            "frequency": 4.1,
            "impressions": 6200000,
            "grp": 214.4
        }
    },
    "Q3_2024": {
        "name": "Q3 2024 Summer Campaign",
        "client": "TechCorp",
        "period": {"start": "2024-07-01", "end": "2024-09-30"},
        "budget": {"planned": 150000, "spent": 148500, "currency": "EUR"},
        "performance": {
            "reach": 45.2,
            "reach_target": 48.0,
            "frequency": 3.8,
            "impressions": 5100000,
            "grp": 171.8
        }
    }
}


@tool
def get_campaign_data(campaign_id: str) -> str:
    """
    Retrieve campaign data from the database.
    Use this to get performance metrics, budget information, and other campaign details.
    Available campaigns: Q4_2024, Q3_2024
    """
    if campaign_id in CAMPAIGN_DATABASE:
        data = CAMPAIGN_DATABASE[campaign_id]
        return json.dumps(data, indent=2)
    return f"Campaign '{campaign_id}' not found. Available: {list(CAMPAIGN_DATABASE.keys())}"


@tool
def calculate_campaign_kpis(impressions: int, budget_spent: float, reach: float) -> str:
    """
    Calculate key performance indicators for a campaign.
    Provide impressions (total), budget_spent (in currency), and reach (percentage).
    Returns CPM, cost per reach point, and impressions per unit of currency spent.
    """
    if impressions <= 0 or budget_spent <= 0 or reach <= 0:
        return "Error: All values must be positive numbers."
    
    cpm = (budget_spent / impressions) * 1000
    cost_per_reach_point = budget_spent / reach
    
    return f"""Campaign KPIs:
- CPM (Cost per Mille): {cpm:.2f}
- Cost per Reach Point: {cost_per_reach_point:.2f}
- Impressions per Euro: {impressions / budget_spent:.1f}"""


@tool
def compare_campaigns(campaign_id_1: str, campaign_id_2: str) -> str:
    """
    Compare two campaigns and provide a performance comparison.
    Use this when asked to compare campaigns or analyze trends.
    """
    if campaign_id_1 not in CAMPAIGN_DATABASE or campaign_id_2 not in CAMPAIGN_DATABASE:
        return f"One or both campaigns not found. Available: {list(CAMPAIGN_DATABASE.keys())}"
    
    c1 = CAMPAIGN_DATABASE[campaign_id_1]
    c2 = CAMPAIGN_DATABASE[campaign_id_2]
    
    comparison = {
        "campaigns": [campaign_id_1, campaign_id_2],
        "reach_comparison": {
            campaign_id_1: c1["performance"]["reach"],
            campaign_id_2: c2["performance"]["reach"],
            # round to avoid floating-point noise in the reported difference
            "difference": round(c1["performance"]["reach"] - c2["performance"]["reach"], 2)
        },
        "frequency_comparison": {
            campaign_id_1: c1["performance"]["frequency"],
            campaign_id_2: c2["performance"]["frequency"],
            "difference": round(c1["performance"]["frequency"] - c2["performance"]["frequency"], 2)
        },
        "budget_utilization": {
            campaign_id_1: f"{c1['budget']['spent'] / c1['budget']['planned'] * 100:.1f}%",
            campaign_id_2: f"{c2['budget']['spent'] / c2['budget']['planned'] * 100:.1f}%"
        }
    }
    
    return json.dumps(comparison, indent=2)
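
As a worked example of the formulas used by the tools above, the sketch below recomputes the KPIs for the `Q4_2024` entry by hand. The figures are copied from `CAMPAIGN_DATABASE`; the last line is an extra consistency check, assuming the stored GRP was derived as reach times frequency.

```python
# Figures copied from the Q4_2024 entry of CAMPAIGN_DATABASE
impressions = 6_200_000
budget_spent = 185_000
budget_planned = 200_000
reach = 52.3
frequency = 4.1

cpm = budget_spent / impressions * 1000       # cost per thousand impressions
cost_per_reach_point = budget_spent / reach   # cost per percentage point of reach
budget_utilization = budget_spent / budget_planned * 100
grp_check = reach * frequency                 # should be close to the stored 214.4

print(f"CPM: {cpm:.2f}")
print(f"Cost per reach point: {cost_per_reach_point:.2f}")
print(f"Budget utilization: {budget_utilization:.1f}%")
print(f"Reach x frequency: {grp_check:.1f}")
```

Having these reference values at hand makes it easier to spot a report in which the agent paraphrased or miscopied a tool result.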

In [12]:
# Create the report generation agent
report_tools = [
    get_campaign_data, 
    calculate_campaign_kpis, 
    compare_campaigns
]

report_agent_prompt = ChatPromptTemplate.from_messages([
    ("system", """You are a professional report generation assistant for advertising campaigns.

Your capabilities:
1. Retrieve campaign data from the database
2. Calculate performance KPIs
3. Compare campaigns

When generating reports:
- Always start by retrieving the relevant campaign data
- Calculate KPIs when performance analysis is needed
- Use compare_campaigns when asked about trends or comparisons
- Structure the output professionally

Be thorough but concise in your analysis."""),
    MessagesPlaceholder(variable_name="chat_history", optional=True),
    ("human", "{input}"),
    MessagesPlaceholder(variable_name="agent_scratchpad")
])

report_agent = create_tool_calling_agent(llm, report_tools, report_agent_prompt)
report_executor = AgentExecutor(agent=report_agent, tools=report_tools, verbose=True)

In [13]:
# Test the report generation agent
print("=" * 60)
print("REPORT GENERATION AGENT")
print("=" * 60)

result = report_executor.invoke({
    "input": """Generate a performance report for Q4_2024 campaign that includes:
1. Campaign overview
2. Key performance metrics and KPIs
3. Comparison with Q3_2024 campaign
4. Brief recommendations for future campaigns"""
})

print("\n" + "=" * 60)
print("GENERATED REPORT:")
print("=" * 60)
print(result["output"])

REPORT GENERATION AGENT


> Entering new AgentExecutor chain...

Invoking: `get_campaign_data` with `{'campaign_id': 'Q4_2024'}`


{
  "name": "Q4 2024 Brand Campaign",
  "client": "TechCorp",
  "period": {
    "start": "2024-10-01",
    "end": "2024-12-31"
  },
  "budget": {
    "planned": 200000,
    "spent": 185000,
    "currency": "EUR"
  },
  "performance": {
    "reach": 52.3,
    "reach_target": 50.0,
    "frequency": 4.1,
    "impressions": 6200000,
    "grp": 214.4
  }
}
Invoking: `get_campaign_data` with `{'campaign_id': 'Q3_2024'}`


{
  "name": "Q3 2024 Summer Campaign",
  "client": "TechCorp",
  "period": {
    "start": "2024-07-01",
    "end": "2024-09-30"
  },
  "budget": {
    "planned": 150000,
    "spent": 148500,
    "currency": "EUR"
  },
  "performance": {
    "reach": 45.2,
    "reach_target": 48.0,
    "frequency": 3.8,
    "impressions": 5100000,
    "grp": 171.8
  }
}
Invok

### Exercise 3.2: Build a Complete Report Generation System

Create an enhanced report generation agent with additional tools.

Requirements:
- Add at least 2 new tools (e.g., trend analysis, forecast, visualization suggestions)
- Create a comprehensive report for a given campaign
- Include executive summary and detailed analysis sections

In [14]:
# Exercise 3.2: Your solution here

# TODO: Create new tools for enhanced reporting
# @tool
# def analyze_trend(...) -> str:
#     """..."""
#     pass

# @tool
# def generate_forecast(...) -> str:
#     """..."""
#     pass

# TODO: Create enhanced agent with new tools
# enhanced_report_tools = report_tools + [analyze_trend, generate_forecast]
# enhanced_report_agent = create_tool_calling_agent(llm, enhanced_report_tools, report_agent_prompt)
# enhanced_report_executor = AgentExecutor(agent=enhanced_report_agent, tools=enhanced_report_tools, verbose=True)

# TODO: Test with a comprehensive report request

# See solutions.py for reference implementation

---
## Summary

In this notebook, you learned to build three fundamental LLM applications:

### Part 1: Direct LLM Integration
1. **Conversational Assistants**: Managing context and conversation history
2. **Document Analysis**: Extracting structured data with schema validation
3. **Report Generation**: Creating professional documents from data

### Part 2: LangChain Framework
1. **Conversational Chains**: Using LangChain's memory management
2. **Tool-Using Agents**: Creating autonomous agents with specialized tools
3. **Multi-Step Workflows**: Combining tools for complex report generation

### Key Takeaways
- LangChain provides higher-level abstractions that simplify agent development
- Tools extend agent capabilities beyond pure text generation
- Proper prompt engineering is essential for both approaches
- Schema validation ensures reliable structured output
- Template-based approaches provide consistency in generated content

### Next Steps
- Complete the exercises to reinforce your understanding
- Experiment with different LLM providers and models
- Explore LangChain documentation for additional features
- Consider adding error handling and monitoring for production use
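
As a starting point for the error-handling item above, the sketch below wraps an arbitrary call in a retry loop with exponential backoff. It is a generic pattern, not a LangChain feature: `invoke_with_retry` and `flaky_call` are hypothetical names, and the `sleep` parameter exists only so the delay can be skipped when experimenting.

```python
import time


def invoke_with_retry(call, max_attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call `call()` and retry on any exception with exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return call()
        except Exception as exc:
            if attempt == max_attempts:
                raise  # give up and surface the last error
            delay = base_delay * 2 ** (attempt - 1)
            print(f"Attempt {attempt} failed ({exc}); retrying in {delay:.1f}s")
            sleep(delay)


# Hypothetical flaky call standing in for something like an executor's invoke
attempts = {"count": 0}


def flaky_call():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise RuntimeError("temporary failure")
    return "ok"


result = invoke_with_retry(flaky_call, sleep=lambda seconds: None)
print(result)
```

In real use it is usually better to catch only the exception types that indicate a transient problem (for example rate limits or timeouts) and to let configuration errors fail immediately.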

---
## Solutions

All exercise solutions are available in `solutions.py`. To check your answers:

```python
from solutions import (
    # Exercise 1.1
    exercise_1_1_solution,
    
    # Exercise 2.1
    exercise_2_1_solution,
    
    # Exercise 2.2
    exercise_2_2_solution,
    
    # Exercise 3.1
    exercise_3_1_solution,
    
    # Exercise 3.2
    exercise_3_2_solution
)
```