<a href="https://www.nvidia.com/dli"> <img src="images/nvidia_header.png" style="margin-left: -30px; width: 300px; float: left;"> </a>

# The Surge of Agents: A Comprehensive Tour of Modern Agentic Frameworks

## Introduction

Welcome to this hands-on exploration of cutting-edge agentic frameworks! In this notebook, we'll dissect the architecture and capabilities of four powerful frameworks that are revolutionizing how we build intelligent systems. Agentic systems—computational entities that can perceive, decide, and act autonomously—represent a paradigm shift in AI development, enabling more sophisticated reasoning, planning, and problem-solving capabilities.

We'll use a common task—generating and solving mathematical word problems—to showcase each framework's unique approach and strengths:

- **OpenAI Python Library**: The foundation of many agentic systems, providing direct access to state-of-the-art language models with a clean, intuitive API. We'll see how even with minimal scaffolding, powerful agents can be constructed.

- **LangChain**: A comprehensive toolkit that abstracts away complexity in prompt engineering, output parsing, and workflow composition. We'll explore how it enables structured data handling through Pydantic integration and composable processing chains.

- **LangGraph**: An emerging framework focused on graph-based workflow management that excels at modeling complex, non-linear interaction patterns. We'll demonstrate its power in creating modular, flexible agent architectures.

- **CrewAI**: A specialized framework for multi-agent orchestration that models complex systems as collaborative teams with distinct roles, goals, and capabilities. We'll examine how it facilitates agent specialization and structured task delegation.

By the end of this notebook, you'll understand the distinctive design philosophies of these frameworks and be equipped to select the right tool for your specific agentic system needs.


## Setup and Configuration

Before diving into the frameworks, we need to establish our development environment. We'll configure access to the NVIDIA AI Foundation Models platform, which provides access to powerful open-source models like Llama 3.1.

### Key Configuration Elements:

- **API Key**: The authentication token required to access NVIDIA's API services. In production environments, this should be stored securely as an environment variable rather than hardcoded in your notebooks. In this workshop environment we are providing an API key for your use.

- **Endpoint URL**: The base URL that directs our requests to NVIDIA's AI model serving infrastructure. This endpoint handles all communication between our code and the foundation models. The endpoint URL is typically `"https://integrate.api.nvidia.com/v1"`. In this workshop environment, however, all calls to NVIDIA's API service go through a proxy that we manage.

- **Model Selection**: We're using `meta/llama-3.1-70b-instruct`, a powerful open-source LLM that balances performance and efficiency and is served as an NVIDIA NIM.

Let's begin by setting up these configuration parameters:

In [1]:
import os

# In your own environment the `endpoint_url` should be "https://integrate.api.nvidia.com/v1". Here we set it to a proxy
# service we use in this workshop environment.
endpoint_url = os.getenv("NVIDIA_BASE_URL")
api_key = os.getenv("NVIDIA_API_KEY")
model_name = "meta/llama-3.1-70b-instruct"
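
Outside the workshop, the two environment variables may not be set. The following is a minimal sketch, not part of the workshop setup, of one way to fall back to the public endpoint mentioned above and fail early when the key is missing. The helper name `resolve_config` and the `DEFAULT_ENDPOINT` constant are hypothetical.

```python
import os

# Public endpoint named in the configuration notes above
DEFAULT_ENDPOINT = "https://integrate.api.nvidia.com/v1"


def resolve_config(env=None):
    """Return (endpoint_url, api_key), raising if no API key is available.

    `env` defaults to os.environ; passing a plain dict makes the function easy to test.
    """
    env = os.environ if env is None else env
    # Use the proxy URL when it is set, otherwise the public endpoint
    endpoint = env.get("NVIDIA_BASE_URL") or DEFAULT_ENDPOINT
    key = env.get("NVIDIA_API_KEY")
    if not key:
        raise RuntimeError("NVIDIA_API_KEY is not set")
    return endpoint, key
```

Failing at configuration time tends to give a clearer error than an authentication failure on the first model call.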

### Obtaining Your Own NVIDIA API Key

If you don't have an NVIDIA API key and need one for work in your own environment, you can generate one for free using the following steps:

1. Log in (or sign up) through [build.nvidia.com](https://build.nvidia.com/explore/discover).
2. Click the `Get API Key` button available on the `meta/llama-3_1-70b-instruct` page, found [here](https://build.nvidia.com/meta/llama-3_1-70b-instruct).

## Framework 1: The OpenAI Python Library

The OpenAI Python library provides a clean, straightforward interface to language models. While its name suggests exclusivity to OpenAI's models, this library can be configured to work with alternative endpoints, as we're doing with NVIDIA's API.

### Key Concepts:

1. **Client Initialization**: We configure the client with our API key, organization, and custom endpoint URL
2. **Completion Creation**: The primary method for generating text is through the `chat.completions.create()` method
3. **Message Structure**: Inputs are formatted as a list of message objects with roles (system, user, assistant) and content
4. **Response Handling**: Outputs are structured objects containing generated text and metadata
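
To make the message structure and response handling concrete, here is a small sketch that builds a message list and reads the text out of a response-shaped object. The helpers `build_messages` and `first_text` are hypothetical names, and the fake response below only mimics the `choices[0].message.content` path used in this notebook; no model is called.

```python
from types import SimpleNamespace


def build_messages(user_prompt, system_prompt=None, history=()):
    """Assemble a chat message list: optional system message, prior turns, then the user turn."""
    messages = []
    if system_prompt:
        messages.append({"role": "system", "content": system_prompt})
    messages.extend(history)
    messages.append({"role": "user", "content": user_prompt})
    return messages


def first_text(response):
    """Return the generated text of the first choice in a chat completion response."""
    return response.choices[0].message.content


messages = build_messages("Solve 2x + 7 = 19", system_prompt="You are a math tutor.")

# Stand-in object with the same attribute path as a real response
fake_response = SimpleNamespace(
    choices=[SimpleNamespace(message=SimpleNamespace(content="x = 6"))]
)
print(first_text(fake_response))
```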

### Talking to an LLM:

Here, we will generate a pre-algebra equation that later sections turn into a math word problem.

Depending on your definition of agent, the single call to the LLM in the next cell is not agentic. However, the OpenAI library is powerful, and calls could be chained together to create an agentic framework.

In [9]:
from openai import OpenAI

# Initialize the client with our configuration
openai = OpenAI(
    organization="nvidia",
    api_key=api_key,
    base_url=endpoint_url,
)

        
difficulty = "pre-algebra"

prompt = f"""
Create a math equation suitable for a {difficulty} student that involves solving for a single variable, x.
Use integers and basic operations (e.g., addition, subtraction, multiplication).
Provide only the equation, like "3x - 5 = 10".
"""

# Generate the equation
response = openai.chat.completions.create(
    model=model_name,
    messages=[{"role": "user", "content": prompt}],
    temperature=0.5
)

# Extract the generated problem from the response
equation_text = response.choices[0].message.content

print("Generated Equation:")
print("-" * 50)
print(equation_text)
print("-" * 50)


Generated Equation:
--------------------------------------------------
2x + 7 = 19
--------------------------------------------------
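
Because the prompt asks for only the equation, it can be useful to check that the reply really has that shape before passing it downstream. The sketch below is an illustration under a strong assumption: it only recognizes the simple form `ax + b = c` or `ax - b = c` with integer values, and returns `None` for anything else. The name `solve_linear` is hypothetical.

```python
import re
from fractions import Fraction

# Matches only "ax + b = c" or "ax - b = c" with integer a, b, c (a may be omitted)
LINEAR = re.compile(r"^\s*(-?\d*)x\s*([+-])\s*(\d+)\s*=\s*(-?\d+)\s*$")


def solve_linear(equation):
    """Solve ax +/- b = c for x, or return None if the text has another shape."""
    match = LINEAR.match(equation)
    if match is None:
        return None
    a_text, sign, b_text, c_text = match.groups()
    if a_text == "":
        a = 1
    elif a_text == "-":
        a = -1
    else:
        a = int(a_text)
    if a == 0:
        return None  # no unique solution when x has a zero coefficient
    b = int(b_text) if sign == "+" else -int(b_text)
    # ax + b = c  ->  x = (c - b) / a
    return Fraction(int(c_text) - b, a)
```

For the equation generated above, `2x + 7 = 19` gives `x = (19 - 7) / 2 = 6`.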


### Analysis - OpenAI Python Library

- **Minimal Setup**: Just a few lines of code to get started
- **Direct Control**: Low-level access to the model's capabilities
- **Flexibility**: Can be used with any compatible API endpoint

The OpenAI library provides a solid foundation for sending and receiving data from LLMs.
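
As noted earlier, single calls can be chained into something more agentic. The following sketch shows the idea with two calls, where the output of the first becomes part of the second prompt. The function `generate_then_solve` and its `complete` parameter are hypothetical: `complete` stands for any callable that takes a list of chat messages and returns the reply text.

```python
def generate_then_solve(complete, difficulty="pre-algebra"):
    """Chain two LLM calls: generate an equation, then ask for its solution steps.

    `complete` is any callable mapping a list of chat messages to a string.
    """
    equation = complete([
        {"role": "user",
         "content": f"Write one {difficulty} equation in x. Provide only the equation."}
    ]).strip()
    # The first call's output is interpolated into the second prompt
    solution = complete([
        {"role": "user",
         "content": f"Solve {equation} for x. Provide only the steps."}
    ]).strip()
    return {"equation": equation, "solution": solution}
```

With the client configured in this notebook, `complete` could be a small wrapper that calls `openai.chat.completions.create(model=model_name, messages=messages)` and returns `response.choices[0].message.content`.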


## Framework 2: LangChain - Composition and Structure

LangChain provides abstractions for composing multi-step workflows and handling structured outputs. It's designed to make complex agent patterns more manageable through reusable components.

### Key Concepts:

1. **Prompt Templates**: Parameterized text templates that can be reused across different contexts
2. **Output Parsers**: Specialized components that transform unstructured LLM outputs into structured data objects
3. **Chains**: Composable sequences of operations that can be executed as a single unit
4. **Pydantic Integration**: Using Python's type system to validate and structure data
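
To illustrate what an output parser does conceptually, here is a standard-library sketch that pulls a JSON object out of raw model text and validates one field. It is not LangChain's or Pydantic's implementation; `WordProblemRecord` and `parse_word_problem` are hypothetical names chosen for this example.

```python
import json
from dataclasses import dataclass


@dataclass
class WordProblemRecord:
    word_problem: str


def parse_word_problem(raw):
    """Parse the span from the first '{' to the last '}' as JSON and validate it."""
    start, end = raw.find("{"), raw.rfind("}")
    if start == -1 or end < start:
        raise ValueError("no JSON object found in model output")
    data = json.loads(raw[start:end + 1])
    value = data.get("word_problem")
    # Reject missing, non-string, or blank values instead of passing them on
    if not isinstance(value, str) or not value.strip():
        raise ValueError("'word_problem' must be a non-empty string")
    return WordProblemRecord(word_problem=value)
```

The benefit of this pattern is that malformed model output fails loudly at the parsing step rather than surfacing later as a confusing error.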

### Building a Structured Agent:

We'll take the equation generated earlier and build a word problem for it with a chain that:
1. Uses typed templates to generate the problem
2. Parses the output into a structured format using Pydantic models
3. Chains operations together using LangChain's pipeline operator (`|`)

This demonstrates how LangChain promotes robust and maintainable agent architectures.


In [18]:
#  LangChain Implementation
from langchain_core.prompts import PromptTemplate
from langchain.globals import set_debug  # Enables detailed logging of chain execution
from langchain.output_parsers import PydanticOutputParser
from pydantic import BaseModel, Field  # For structured data validation
from langchain_nvidia_ai_endpoints import ChatNVIDIA

# Enable debug mode to see the full chain execution details
set_debug(True)

# Initialize our LLM with the NVIDIA endpoint
llm = ChatNVIDIA(model=model_name, base_url=endpoint_url)

# Define a structured data model for word problems
class WordProblem(BaseModel):
    word_problem: str = Field(description="The text of the math word problem")

# Create a parser that will extract structured data from LLM responses
word_problem_parser = PydanticOutputParser(pydantic_object=WordProblem)

# Define a template for generating word problems with instructions for proper formatting
word_problem_prompt = PromptTemplate.from_template(
    """Given the equation {equation}, create a realistic pre-algebra word problem that matches it.
    The problem should involve a real-world scenario (e.g., shopping, travel) and require solving for x.
    Provide only the word problem.
    Format your response as JSON: {format_instructions}. Do not include any other text but the JSON.""",
    partial_variables={"format_instructions": word_problem_parser.get_format_instructions()}
)

# Compose the entire workflow as a chain using the pipeline operator
chain = word_problem_prompt | llm | word_problem_parser

# Execute the full chain with a single call
result = chain.invoke({"equation": equation_text})

print("Generated Equation:") # from the first agent
print("-" * 50)
print(equation_text)
print("-" * 50)

word_problem = result.word_problem
print("\nWord Problem:")
print("-" * 50)
print(word_problem)
print("-" * 50)

[chain/start] [chain:RunnableSequence] Entering Chain run with input:
{
  "equation": "2x + 7 = 19"
}
[chain/start] [chain:RunnableSequence > prompt:PromptTemplate] Entering Prompt run with input:
{
  "equation": "2x + 7 = 19"
}
[chain/end] [chain:RunnableSequence > prompt:PromptTemplate] [0ms] Exiting Prompt run with output:
[outputs]
[llm/start] [chain:RunnableSequence > llm:ChatNVIDIA] Entering LLM run with input:
{
  "prompts": [
    "Human: Given the equation 2x + 7 = 19, create a realistic pre-algebra word problem that matches it.\n    The problem should involve a real-world scenario (e.g., shopping, travel) and require solving for x.\n    Provide only the word problem.\n    Format your response as JSON: The output should be formatted as a JSON instance that conforms to the JSON schema below.\n\nAs an example, for the schema {\"properties\": {\"foo\": {\"title\": \"Foo\", \"descrip

### Analysis - LangChain

#### Strengths:
- **Structured Data Handling**: Pydantic integration ensures typed, validated outputs
- **Composable Patterns**: The pipeline operator (`|`) enables clean workflow composition
- **Reusable Components**: Templates and parsers can be shared across multiple agents
- **Simplified Chaining**: Automatic passing of outputs between steps

LangChain excels at creating structured, maintainable workflows.
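
The pipeline operator relies on Python's `__or__` hook. The toy class below sketches that idea only; LangChain's own runnable classes are considerably richer, and `Step`, `fill_prompt`, `fake_llm`, and `parse` are hypothetical names. The "LLM" here is a stand-in function, so nothing is sent to a model.

```python
import json


class Step:
    """Wrap a function so that `a | b` builds a new Step that runs a, then b."""

    def __init__(self, func):
        self.func = func

    def __or__(self, other):
        # Compose: feed this step's output into the next step
        return Step(lambda value: other.invoke(self.invoke(value)))

    def invoke(self, value):
        return self.func(value)


fill_prompt = Step(lambda inputs: f"Write a word problem for {inputs['equation']}")
fake_llm = Step(lambda prompt: json.dumps({"word_problem": prompt}))  # stand-in, no model call
parse = Step(lambda text: json.loads(text)["word_problem"])

toy_chain = fill_prompt | fake_llm | parse
print(toy_chain.invoke({"equation": "2x + 7 = 19"}))
```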


## Framework 3: LangGraph - Graph-Based Workflow Management

LangGraph models agentic workflows as directed graphs of interacting components. This approach offers a great deal of flexibility for creating sophisticated, non-linear agent architectures.

### Key Concepts:

1. **Nodes**: Discrete processing units that perform specific functions
2. **Edges**: Connections between nodes that define data flow and execution order
3. **Graphs**: Complete workflow definitions with nodes and edges
4. **State Management**: Tracking and updating context throughout execution
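
These four concepts can be sketched in a few lines of plain Python. The runner below is a toy, not LangGraph's implementation: nodes are functions, edges are a mapping from one node name to the next, and state is a dictionary that each node updates. The names `run_graph`, `toy_nodes`, and `toy_edges` are hypothetical, and the node bodies are placeholders rather than model calls.

```python
def run_graph(nodes, edges, start, state):
    """Follow single outgoing edges from `start`, merging each node's output into state."""
    current = start
    while current is not None:
        # Merge the node's partial update into the running state
        state = {**state, **nodes[current](state)}
        current = edges.get(current)  # a node with no outgoing edge ends the run
    return state


toy_nodes = {
    "solve": lambda s: {"solution": f"steps for {s['equation']}"},
    "explain": lambda s: {"explanation": f"why {s['solution']} works"},
}
toy_edges = {"solve": "explain"}

print(run_graph(toy_nodes, toy_edges, "solve", {"equation": "2x + 7 = 19"}))
```

Merging updates into the state is a design choice of this sketch; it lets each node return only the keys it produces instead of passing every key forward by hand.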

### Building a Graph-Based Agent:

We'll create a modular workflow that:
1. Defines distinct nodes for solving the equation and explaining the solution
2. Establishes connections between nodes to control data flow
3. Executes the graph to process our mathematical tasks

This demonstrates LangGraph's power for creating flexible, maintainable agent architectures.


In [None]:
from IPython.display import display, Markdown
from langgraph.graph import Graph, START
from langchain_core.prompts import PromptTemplate
from langchain_nvidia_ai_endpoints import ChatNVIDIA
from langchain.globals import set_debug

# Enable detailed logging
set_debug(True)

# Initialize our LLM
llm = ChatNVIDIA(model=model_name, base_url=endpoint_url)

# Create prompts
equation_solver_prompt = PromptTemplate(
        input_variables=["equation", "word_problem"],
        template="""Given the equation {equation} and matching word problem {word_problem}, solve it by providing only the mathematical steps as a list.
        Each part should be a single equation or expression, showing the progression to the final solution, without any explanatory text. For example, for "5 + x = 13", output:
        5 + x = 13 -> x = 13 - 5 -> x = 8"""
    )
    
equation_solution_explainer_prompt = PromptTemplate(
    input_variables=["equation", "word_problem", "solution"],
    template="""Given the equation {equation}, the matching word problem {word_problem}, and the solution {solution}, explain the solution in plain English using the fewest words possible."""
)

# Create chains
equation_solver_chain = equation_solver_prompt | llm
equation_solution_explainer_chain = equation_solution_explainer_prompt | llm

# Define node functions
def equation_solver_node(input_dict):
    equation = input_dict["equation"]
    word_problem = input_dict["word_problem"]
    solution = equation_solver_chain.invoke({"equation": equation, "word_problem": word_problem})
    # Ensure solution is a string (extract content if it's an AIMessage)
    if hasattr(solution, 'content'):
        solution = solution.content
    return {"solution": solution, "equation": equation, "word_problem": word_problem}

def equation_solution_explainer_node(input_dict):
    equation = input_dict["equation"]
    word_problem = input_dict["word_problem"]
    solution = input_dict["solution"]
    explanation = equation_solution_explainer_chain.invoke({"equation": equation, "word_problem": word_problem, "solution": solution})
    # Ensure explanation is a string (extract content if it's an AIMessage)
    if hasattr(explanation, 'content'):
        explanation = explanation.content
    return {"explanation": explanation, "equation": equation, "word_problem": word_problem, "solution": solution}

# Create our workflow graph
graph = Graph()

# Add our processing nodes
graph.add_node("Solve Equation", equation_solver_node)
graph.add_node("Explain Solution", equation_solution_explainer_node)

# Define the flow between nodes
graph.add_edge(START, "Solve Equation")
graph.add_edge("Solve Equation", "Explain Solution")

# Set the finish point to the last node so its output is returned
graph.set_finish_point("Explain Solution")

# Compile the graph into a runnable workflow
workflow = graph.compile()

# Run the workflow with our input data
workflow_result = workflow.invoke({
    "equation": equation_text,
    "word_problem": word_problem
})

# Print the workflow result to debug
print("Workflow Result:")
print(workflow_result)

# Check if workflow_result is valid before proceeding
if workflow_result is None:
    print("Error: Workflow returned None. Check node execution or LLM invocation.")
else:
    explanation = workflow_result.get('explanation', "No explanation available")
    solution = workflow_result.get('solution', "No solution available")
    # Display the results in a formatted way
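    # Build the Markdown flush-left: lines indented by four or more spaces
    # would render as a code block rather than as headings
    report = "\n\n".join([
        f"### Equation\n{equation_text}",
        f"### Word Problem\n{word_problem}",
        f"### Step-by-Step Solution\n{solution}",
        f"### Explanation\n{explanation}",
    ])
    display(Markdown(report))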

### Analysis - LangGraph

#### Strengths:
- **Maximum Flexibility**: Graph structures can represent arbitrary workflow patterns
- **Modular Design**: Clear separation of concerns with independent nodes
- **Transparent Data Flow**: Explicit edges show exactly how information passes between components
- **Scalability**: Complex architectures remain manageable through graph visualization

LangGraph provides the most power and flexibility among the frameworks we've explored so far, making it ideal for complex agent architectures with sophisticated reasoning patterns and state management needs.
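
A non-linear pattern is easiest to see with a loop. The plain-Python sketch below, which does not use LangGraph itself, routes between a checking node and a retry node until the check passes. All names (`run_with_routing`, `check`, `retry`, `route`) are hypothetical, and the "retry" simply tries the next integer where a real agent would ask the model again.

```python
def run_with_routing(nodes, route, start, state, max_steps=10):
    """Run nodes one at a time, letting `route` pick the next node from the current state."""
    current = start
    for _ in range(max_steps):
        state = {**state, **nodes[current](state)}
        current = route(current, state)
        if current is None:
            return state
    raise RuntimeError("graph did not finish within max_steps")


def check(state):
    # Substitute the candidate back into 2x + 7 = 19
    return {"ok": 2 * state["x"] + 7 == 19}


def retry(state):
    # Placeholder for "ask again": try the next integer and count the attempt
    return {"x": state["x"] + 1, "attempts": state.get("attempts", 0) + 1}


def route(current, state):
    if current == "check":
        return None if state["ok"] else "retry"
    return "check"  # after a retry, check again


final = run_with_routing({"check": check, "retry": retry}, route, "check", {"x": 4})
print(final)
```

The `max_steps` guard matters for any graph with cycles, since a routing mistake could otherwise loop forever.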


## Framework 4: CrewAI - Collaborative Multi-Agent Orchestration

CrewAI takes a different approach by modeling agentic systems as teams ("crews") of specialized agents with distinct roles, goals, and capabilities. This framework is particularly well-suited for complex tasks that benefit from division of labor and specialized expertise.

### Key Concepts:

1. **Agents**: Entities with defined roles, goals, and backstories that shape their behavior
2. **Tasks**: Units of work with descriptions and expected outputs
3. **Crews**: Collections of agents working together in a coordinated process
4. **Process Models**: Different approaches to task sequencing (sequential, hierarchical, etc.)
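
The sketch below illustrates these concepts with plain dataclasses and a sequential loop. It is an assumption-laden toy: how CrewAI actually assembles prompts and passes context between tasks is not shown in this notebook, and `ToyAgent`, `ToyTask`, and `run_sequential` are hypothetical names. The `complete` parameter stands for any callable that maps a prompt string to a reply.

```python
from dataclasses import dataclass


@dataclass
class ToyAgent:
    role: str
    goal: str
    backstory: str


@dataclass
class ToyTask:
    description: str  # may contain {placeholders} filled from the inputs
    agent: ToyAgent


def run_sequential(tasks, inputs, complete):
    """Run tasks in order; each prompt combines the agent persona, the task, and prior outputs."""
    outputs = []
    for task in tasks:
        prompt = (
            f"You are {task.agent.role}. {task.agent.backstory}\n"
            f"Goal: {task.agent.goal}\n"
            f"Task: {task.description.format(**inputs)}\n"
            f"Previous outputs: {outputs}"
        )
        outputs.append(complete(prompt))
    return outputs
```

In this toy version, the role, goal, and backstory simply become part of the prompt text, which is one plausible way such fields can shape an agent's behavior.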

### Building a Collaborative Agent Team:

We'll create a two-agent system that:
1. Uses a specialized "Accuracy Checker" agent to verify the equation, word problem, and solution steps
2. Hands the word problem and explanation to a "Clarity Reviewer" agent with teaching expertise
3. Coordinates their work through a sequential workflow

This demonstrates how CrewAI enables specialization through role definitions and coordinated execution.


In [None]:
from crewai import Agent, Task, Crew, LLM

# Initialize the LLM with the correct format for CrewAI
llm = LLM(
    model=f"nvidia_nim/{model_name}", base_url=endpoint_url, api_key=api_key
)

# Agent 1: Accuracy Checker
accuracy_checker_agent = Agent(
    role="Accuracy Checker",
    goal="Verify the mathematical correctness of the word problem, equation, and solution steps",
    backstory="You are a meticulous mathematician with a keen eye for detail. Your expertise lies in ensuring that every calculation and logical step in a math problem is correct, leaving no room for errors. You double-check solutions against the original problem to confirm accuracy.",
    llm=llm,
    verbose=True  # Enable detailed logging of agent actions
)

# Define the accuracy checking task
accuracy_task = Task(
    description="Review the following: word problem '{word_problem}', equation '{equation}', and solution '{solution}'. Verify that the solution steps correctly solve the equation and match the word problem. Output 'Correct' if accurate, or identify any errors if incorrect.",
    expected_output="A concise statement confirming accuracy ('Correct') or detailing any errors found.",
    agent=accuracy_checker_agent,
)

# Agent 2: Clarity Reviewer
clarity_reviewer_agent = Agent(
    role="Clarity Reviewer",
    goal="Ensure the word problem and solution explanation are clear, engaging, and educationally valuable for students",
    backstory="You are an experienced educator with a passion for making math accessible and engaging. You excel at evaluating whether problems and explanations are easy to understand, appropriately challenging, and relevant to students’ learning needs.",
    llm=llm,
    verbose=True,
)

# Define the clarity review task
clarity_task = Task(
    description="Review the following: word problem '{word_problem}' and solution explanation '{explanation}'. Assess if they are clear, engaging, and suitable for middle school students. Provide feedback, including at least one suggestion for improvement if applicable.",
    expected_output="A brief assessment of clarity and educational value, plus one suggestion for enhancement.",
    agent=clarity_reviewer_agent,
)

# Create a crew with both agents and their tasks
crew = Crew(
    agents=[accuracy_checker_agent, clarity_reviewer_agent], 
    tasks=[accuracy_task, clarity_task], 
    process="sequential",  # Tasks will be executed in order 
    verbose=True  # Enable detailed logging of crew coordination
)

# Example inputs from previous pipeline stages
inputs = {
    "word_problem": word_problem,
    "equation": equation_text,
    "solution": solution,
    "explanation": explanation
}

# Execute the full workflow
result = crew.kickoff(inputs=inputs)

# Display the result
print("\nCrewAI Result:")
print("-" * 50)
print(result)
print("-" * 50)

### Analysis - CrewAI

#### Strengths:
- **Role-Based Design**: Agents can be specialized with distinct capabilities and knowledge
- **Explicit Goals**: Each agent has clear objectives that guide its behavior
- **Narrative Elements**: Backstories help shape agent personalities and approaches
- **Flexible Coordination**: Multiple process models for different collaboration patterns

CrewAI excels at modeling collaborative agent teams.
