# Text summarizer and translator agent

This notebook demonstrates how to build an intelligent agent that can automatically summarize English text and translate that summary into German. Rather than handling these tasks separately, we will create a unified system that orchestrates both operations seamlessly.

Our approach centers on creating specialized tools that the agent can use autonomously, making decisions about when and how to apply each capability. This design pattern is particularly powerful because it mirrors how humans approach complex tasks - breaking them down into smaller, manageable steps and using the right tool for each job.

In [1]:
from langchain.agents import AgentExecutor, create_tool_calling_agent
from langchain_openai import ChatOpenAI
from langchain.prompts import PromptTemplate
from langchain.tools import StructuredTool
from pydantic import BaseModel, Field
import os
from dotenv import load_dotenv

# Load environment variables from .env file
load_dotenv()

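# Configure the OpenAI API key, failing early with a clear message if it is missing
# (assigning None to os.environ would otherwise raise a less helpful TypeError)
api_key = os.getenv("OPENAI_API_KEY")
if not api_key:
    raise RuntimeError("OPENAI_API_KEY is not set; add it to your environment or .env file")
os.environ["OPENAI_API_KEY"] = api_key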

### Language model initialization
We will configure our language model with specific parameters that optimize it for our summarization and translation tasks.

In [2]:
# Initialize the language model with optimized parameters
llm = ChatOpenAI(
    model="gpt-4o-mini-2024-07-18",
    max_tokens=1000,  # Limit response length to control output size
    temperature=0  # Set to 0 to make outputs as repeatable as possible
)

Here we instantiate our language model with two explicit parameters. The `temperature=0` setting makes outputs as repeatable as the API allows (it greatly reduces run-to-run variation, although identical outputs are not strictly guaranteed), which matters for summarization, where we want predictable results. The `max_tokens=1000` limit caps response length while still leaving room for comprehensive summaries and translations.

### Core function definitions
Now we will create the fundamental functions that perform our summarization and translation tasks. These functions encapsulate the logic for interacting with the language model and will later be wrapped as tools for our agent.

In [3]:
def summarize(text):
    # Create a structured prompt template for summarization
    prompt = PromptTemplate(
        input_variables=["text"],  # Specify the input variable
        template="Summarize the following text:\n\n{text}\n\nSummary:"  # Define the template for summarization
    )
    chain = prompt | llm  # Create a chain by piping the prompt to the language model
    return chain.invoke({"text": text}).content  # Invoke the chain with the input text and return the content of the response

def translate(text):
    # Create a structured prompt template for translation
    prompt = PromptTemplate(
        input_variables=["text"],  # Specify the input variable
        template="Translate the following text to German:\n\n{text}\n\nTranslation:"  # Define the template for translation
    )
    chain = prompt | llm  # Create a chain by piping the prompt to the language model
    return chain.invoke({"text": text}).content  # Invoke the chain with the input text and return the content of the response



These functions represent the core business logic of our application. Each function follows a consistent pattern: creating a prompt template, chaining it with the language model, and invoking the chain with input data. The use of `PromptTemplate` ensures consistent formatting and makes our prompts maintainable and reusable.

### Input validation schema
To ensure data integrity and provide clear interfaces for our tools, we will define a Pydantic model that validates input parameters.

In [4]:
class TextInput(BaseModel):
    # Define a Pydantic model for input validation
    text: str = Field(description="The text to summarize or translate")  # Define a text field with a description

This simple but important class provides input validation and documentation for our tools. Pydantic models automatically validate data types and can provide helpful error messages if invalid data is passed to our functions.
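To make the validation behavior concrete, here is a minimal sketch that builds the model from a dictionary and captures the error when the required field is missing. `try_build` is a hypothetical helper introduced only for illustration, and the class is repeated so the snippet runs on its own.

```python
from typing import Optional, Tuple

from pydantic import BaseModel, Field, ValidationError


class TextInput(BaseModel):
    # Same schema as in the notebook cell above
    text: str = Field(description="The text to summarize or translate")


def try_build(payload: dict) -> Tuple[Optional[TextInput], Optional[str]]:
    """Hypothetical helper: return (model, None) on success or (None, error message) on failure."""
    try:
        return TextInput(**payload), None
    except ValidationError as exc:
        return None, str(exc)


# A valid payload produces a model instance
ok, err = try_build({"text": "hello"})

# A payload without the required 'text' field is rejected
bad, bad_err = try_build({})
```

The error message names the offending field, which is the kind of feedback an agent framework can surface when a tool is called with malformed arguments.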

### Function testing
Before integrating our functions into the agent framework, let's verify they work correctly with a simple test case.

In [5]:
# Test our core functions with a sample sentence
test_text = "The quick brown fox jumps over the lazy dog."

print("Testing summarization function:")
print(summarize(test_text))
print("\nTesting translation function:")
print(translate(test_text))

Testing summarization function:
A fast brown fox leaps over a sluggish dog.

Testing translation function:
Die schnelle braune Füchsin springt über den faulen Hund.


This testing step verifies that the individual components work correctly before we integrate them into the more complex agent system, so we can catch and fix issues early in the development process.

### Tools definition for the agent
Now we will transform our functions into structured tools that the agent can use autonomously. This involves wrapping our functions with metadata that helps the agent understand when and how to use each tool.

In [6]:
# Transform our functions into structured tools for agent use - each tool includes metadata about its purpose and input requirements
tools = [
    StructuredTool.from_function(
        func=summarize,  # The function to be wrapped as a tool
        name="Summarize",  # Name of the tool
        description="Useful for summarizing text",  # Description of what the tool does
        args_schema=TextInput  # The Pydantic model defining the input schema
    ),
    StructuredTool.from_function(
        func=translate,  # The function to be wrapped as a tool
        name="Translate",  # Name of the tool
        description="Useful for translating text to German",  # Description of what the tool does
        args_schema=TextInput  # The Pydantic model defining the input schema
    )
]

By wrapping our functions as `StructuredTool` objects, we provide the agent with rich metadata about each capability. The descriptions are particularly important as they help the agent's reasoning process determine which tool to use in different situations.

### Agent prompt engineering
The prompt template is perhaps the most critical component of our agent system. It provides detailed instructions that guide the agent's behavior and ensure consistent and reliable operation.


In [7]:
# Create a comprehensive prompt template that guides agent behavior
prompt = PromptTemplate(
    input_variables=["input", "agent_scratchpad"],  # Define the input variables for the prompt
    template="""Summarize the following text and then translate the summary to German:

Text: {input}

Use the following steps:
1. Use the Summarize tool to summarize the text. Pass the entire text as the 'text' argument.
2. Use the Translate tool to translate the summary to German. Pass the summary as the 'text' argument.
3. Immediately after using both tools, respond with the final result in the following format:
   Summary (English): [English summary]
   Translation (German): [German translation]

Do not use any tools after providing the formatted output.

{agent_scratchpad}"""  # Define the template for the agent's instructions
)

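This prompt template serves as the agent's instruction manual. It clearly defines the workflow (summarize, then translate), specifies the exact steps to follow, and establishes the expected output format. The `agent_scratchpad` variable is where the executor injects the record of earlier tool calls and their results, so the model can see what it has already done. Note that `create_tool_calling_agent` is documented as expecting `agent_scratchpad` to be a `MessagesPlaceholder` inside a chat prompt; with a plain `PromptTemplate`, as used here, that history is rendered into the prompt as text, which may make it harder for the model to track its own progress.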
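Because the prompt fixes the output format, downstream code could split the agent's final answer back into its two parts. The following is a minimal sketch, assuming the model follows the requested format exactly. `parse_agent_output` is a hypothetical helper, not part of LangChain, and it returns `None` when the expected labels are missing.

```python
import re
from typing import Optional, Tuple

# Hypothetical pattern matching the two labels requested in the prompt
_OUTPUT_PATTERN = re.compile(
    r"Summary \(English\):\s*(?P<summary>.*?)\s*"
    r"Translation \(German\):\s*(?P<translation>.*)",
    re.DOTALL,
)


def parse_agent_output(output: str) -> Optional[Tuple[str, str]]:
    """Split formatted agent output into (summary, translation), or return None if it does not match."""
    match = _OUTPUT_PATTERN.search(output)
    if match is None:
        return None
    return match.group("summary").strip(), match.group("translation").strip()


# Illustrative input written by hand, not real model output
example = (
    "Summary (English): A fox jumps over a dog.\n"
    "Translation (German): Ein Fuchs springt über einen Hund."
)
parsed = parse_agent_output(example)
```

Returning `None` rather than raising keeps the caller in control of what to do when the agent stops early or ignores the format.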

### Agent initialization and configuration
With our tools and prompt ready, we can now create and configure the agent that will orchestrate our text processing workflow.

In [8]:
# Create an agent using the defined tools and prompt - the agent will use these components to make decisions about tool usage
agent = create_tool_calling_agent(llm, tools, prompt)

# Create an AgentExecutor to run the agent - the executor manages agent runtime behavior and constraints
agent_executor = AgentExecutor(
    agent=agent,  # The agent to execute
    tools=tools,  # The tools available to the agent
    verbose=True,  # Enable detailed logging of agent actions
    max_iterations=3,  # Set maximum number of iterations
    early_stopping_method="force"  # Force stop after max_iterations
)

The `AgentExecutor` acts as a runtime environment for our agent, providing important safeguards like iteration limits and verbose logging. These parameters help us monitor the agent's behavior and prevent runaway execution while maintaining transparency in the decision-making process.
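To build intuition for what an iteration limit does, here is a toy loop in plain Python. It is a simplified sketch, not LangChain's implementation: `run_toy_agent`, the scripted planner, and the stub tools are all hypothetical. The point it illustrates is that one iteration may execute several tool calls, so the number of tool calls can exceed the number of iterations.

```python
from typing import Callable, Dict, List, Tuple, Union

ToolCall = Tuple[str, str]          # (tool name, tool input)
Step = Tuple[str, str, str]         # (tool name, tool input, tool output)
Plan = Union[str, List[ToolCall]]   # a final answer, or a batch of tool calls


def run_toy_agent(
    planner: Callable[[List[Step]], Plan],
    tools: Dict[str, Callable[[str], str]],
    max_iterations: int = 3,
) -> Tuple[str, List[Step]]:
    """Toy loop: each iteration asks the planner once, then runs every call it requested."""
    history: List[Step] = []
    for _ in range(max_iterations):
        plan = planner(history)
        if isinstance(plan, str):   # the planner produced a final answer
            return plan, history
        for name, arg in plan:      # one iteration may run several tool calls
            history.append((name, arg, tools[name](arg)))
    return "stopped: iteration limit reached", history


# Stub tools and a scripted planner, for illustration only (no LLM involved)
toy_tools = {"Summarize": lambda t: t[:10], "Translate": lambda t: f"DE({t})"}


def scripted_planner(history: List[Step]) -> Plan:
    if not history:
        # First turn: two calls requested at once, so Translate cannot see the summary yet
        return [("Summarize", "a long English text"), ("Translate", "a long English text")]
    if len(history) == 2:
        # Second turn: translate the summary produced in the first turn
        return [("Translate", history[0][2])]
    return "done"


answer, steps = run_toy_agent(scripted_planner, toy_tools)
stopped_answer, stopped_steps = run_toy_agent(scripted_planner, toy_tools, max_iterations=1)
```

With the default limit the scripted run finishes after three tool calls spread over two tool-using iterations; with `max_iterations=1` the loop is cut off after the first batch.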

### Agent execution function
To simplify the process of running our agent with different inputs, we will create a utility function that handles the execution details.

In [9]:
def run_agent_with_query(agent_executor, query):
    """
    Execute the agent with a given query and return the output.

    Args:
        agent_executor (AgentExecutor): The configured AgentExecutor to run.
        query (str): The input text to be processed by the agent.

    Returns:
        str: The output generated by the agent after processing the query.
    """
    # Invoke the agent_executor with the query as input - the executor handles all the complexity of agent-tool interaction
    result = agent_executor.invoke({"input": query})

    # Extract and return the 'output' field from the result
    return result['output']

This wrapper function abstracts away the details of agent invocation and provides a clean, simple interface for using our text processing system. It handles the execution mechanics and returns just the final result.

### Demonstration and testing
Finally, let's demonstrate our complete system with a realistic example that shows how all components work together.

In [10]:
# Define the input query
query = """The quick brown fox jumps over the lazy dog. This sentence is often used as a pangram in typography
to display font examples, as it contains every letter of the English alphabet. However, it's not the only pangram
in existence. Another example is 'Pack my box with five dozen liquor jugs', which is shorter but less commonly used."""

# Run the agent with the query
result = run_agent_with_query(agent_executor, query)

# Print the original query
print("\nQuery:")
print(query)

# Print the result from the agent
print("\nResult:")
print(result)



> Entering new AgentExecutor chain...

Invoking: `Summarize` with `{'text': "The quick brown fox jumps over the lazy dog. This sentence is often used as a pangram in typography to display font examples, as it contains every letter of the English alphabet. However, it's not the only pangram in existence. Another example is 'Pack my box with five dozen liquor jugs', which is shorter but less commonly used."}`

The sentence "The quick brown fox jumps over the lazy dog" is a well-known pangram used in typography to showcase fonts, as it includes every letter of the English alphabet. Another, shorter pangram is "Pack my box with five dozen liquor jugs," though it is less commonly used.

Invoking: `Translate` with `{'text': "The quick brown fox jumps over the lazy dog. This sentence is often used as a pangram in typography to display font examples, as it contains every letter of the English alphabet. However, it's not the only pangram 

This demonstration shows our complete system in action. The intended behavior is for the agent to summarize the text first, then translate that summary into German, and finally present both results in a clear, formatted output. The verbose logging (enabled in our executor configuration) shows each step of the agent's tool usage, which lets us check whether it actually followed that plan.

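Looking at our output, the agent made **4 tool calls** even though it was limited to **3 iterations**. This is possible because a tool-calling agent can request more than one tool in a single iteration. Here's the sequence: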
1. **First call**: `Summarize` with original text
2. **Second call**: `Translate` with original text (WRONG - should be summary)
3. **Third call**: `Summarize` with original text again (redundant)
4. **Fourth call**: `Translate` with the summary (CORRECT)
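A tally like the one above could also be produced programmatically instead of by reading the log. The sketch below assumes the executor was created with `return_intermediate_steps=True`, so that the result dictionary contains an `intermediate_steps` list of `(action, observation)` pairs whose actions expose a `tool` attribute. `tool_call_sequence` is a hypothetical helper, and the data is a hand-written stand-in shaped like the sequence listed above, not real agent output.

```python
from collections import Counter
from types import SimpleNamespace
from typing import Any, Iterable, List, Tuple


def tool_call_sequence(intermediate_steps: Iterable[Tuple[Any, Any]]) -> List[str]:
    """Return the tool names in call order from (action, observation) pairs."""
    return [action.tool for action, _observation in intermediate_steps]


# Stand-in objects mimicking the shape of agent actions (assumption: each has .tool and .tool_input)
fake_steps = [
    (SimpleNamespace(tool="Summarize", tool_input={"text": "original"}), "summary"),
    (SimpleNamespace(tool="Translate", tool_input={"text": "original"}), "translated original"),
    (SimpleNamespace(tool="Summarize", tool_input={"text": "original"}), "summary"),
    (SimpleNamespace(tool="Translate", tool_input={"text": "summary"}), "translated summary"),
]

sequence = tool_call_sequence(fake_steps)
counts = Counter(sequence)
```

Comparing each `tool_input` against the previous observation would additionally reveal whether `Translate` received the summary or the original text.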

The issue stems from how the agent interpreted the prompt. Despite our clear instructions to summarize first and then translate the summary, the agent made several mistakes:
- It translated the **original text** instead of the **summary** in the second call
- It repeated the summarization unnecessarily
- It eventually figured out the correct workflow but wasted iterations

The executor keeps iterating until the agent returns a final answer or the maximum iteration limit is reached. In practice, extra iterations happen when:
1. The task requires multiple tools in sequence
2. The agent makes mistakes and needs to correct them
3. The agent has not yet produced the output its prompt asks for

The multiple tool invocations show the agent working through the problem step by step: despite the detours, it eventually passed the right input to each tool and produced the correct final output (an English summary plus its German translation).
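Since the workflow here is fixed (summarize, then translate), one alternative worth considering is to call the two functions in sequence without an agent, which guarantees that the translation step receives the summary. The following is a minimal sketch under that assumption. `summarize_then_translate` and `format_result` are hypothetical helpers, and the stub lambdas stand in for the LLM-backed `summarize` and `translate` functions defined earlier.

```python
from typing import Callable, Dict


def summarize_then_translate(
    text: str,
    summarize_fn: Callable[[str], str],
    translate_fn: Callable[[str], str],
) -> Dict[str, str]:
    """Run the fixed two-step pipeline: the translation always receives the summary."""
    summary = summarize_fn(text)
    translation = translate_fn(summary)
    return {"summary": summary, "translation": translation}


def format_result(result: Dict[str, str]) -> str:
    """Render the result in the same format the agent prompt asks for."""
    return (
        f"Summary (English): {result['summary']}\n"
        f"Translation (German): {result['translation']}"
    )


# Stub functions for illustration only; they do not call a language model
result = summarize_then_translate(
    "Some long text. More text.",
    summarize_fn=lambda t: t.split(".")[0] + ".",
    translate_fn=lambda t: f"[DE] {t}",
)
formatted = format_result(result)
```

In the notebook, the equivalent call would be `summarize_then_translate(query, summarize, translate)`. The agent-based version remains the better fit when the model genuinely needs to decide which tools to use.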