# Introduction to Agents with LangChain

LangChain is one of the most popular open-source libraries for AI engineers. It aims to abstract away the complexity of building AI software, provide easy-to-use building blocks, and make it easier to switch between AI service providers.

In this example we introduce LangChain Agents, which add the ability to use tools such as search and calculators to complete tasks that a language model cannot handle on its own. We will use OpenAI's gpt-4o-mini model.

In [6]:
# # Run this cell if you are working in Google Colab
# !pip install -qU \
#   langchain-core==0.3.33 \
#   langchain-openai==0.3.3 \
#   langchain-community==0.3.16 \
#   google-search-results==2.4.2

## 1. Introduction to Tools

Tools extend the capabilities of our large language models (LLMs) by letting them execute code. A tool is simply a function formatted so that our agent can understand how to use it and then run it. Let's start by creating a few simple tools.

We can use the @tool decorator to turn a standard Python function into a tool that an LLM can use. For best results, the function should include a few key elements:

* A docstring that describes what the tool does and when it should be used. Our LLM or agent reads it to decide when and how to use the tool.

* Clear parameter names that ideally tell the LLM what each one is for. If the purpose is not obvious, the docstring should explain what the parameter is for and how to use it.

* Type annotations for both the parameters and the return value.

In [2]:
from langchain_core.tools import tool

@tool
def add(x: float, y: float) -> float:
    """Add 'x' and 'y'."""
    return x + y

@tool
def multiply(x: float, y: float) -> float:
    """Multiply 'x' and 'y'."""
    return x * y

@tool
def exponentiate(x: float, y: float) -> float:
    """Raise 'x' to the power of 'y'."""
    return x ** y

@tool
def subtract(x: float, y: float) -> float:
    """Subtract 'x' from 'y'."""
    return y - x

In [3]:
add

StructuredTool(name='add', description="Add 'x' and 'y'.", args_schema=<class 'langchain_core.utils.pydantic.add'>, func=<function add at 0x00000221EFC53060>)

In [4]:
print(f"{add.name=}\n{add.description=}")

add.name='add'
add.description="Add 'x' and 'y'."


In [5]:
add.args_schema.model_json_schema()

{'description': "Add 'x' and 'y'.",
 'properties': {'x': {'title': 'X', 'type': 'number'},
  'y': {'title': 'Y', 'type': 'number'}},
 'required': ['x', 'y'],
 'title': 'add',
 'type': 'object'}
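
To make the link between the function and this schema more concrete, here is a minimal standard-library sketch that derives a similar dictionary from a function's signature, type hints and docstring. The names `sketch_schema`, `JSON_TYPES` and `plain_add` are hypothetical, only four Python types are mapped, and this is not how LangChain builds its schema internally (it relies on Pydantic).

```python
import inspect
from typing import get_type_hints

# Mapping from Python types to JSON Schema type names (small subset).
JSON_TYPES = {float: "number", int: "integer", str: "string", bool: "boolean"}


def sketch_schema(func) -> dict:
    """Build a JSON-Schema-like dict from a function's signature and docstring."""
    hints = get_type_hints(func)
    hints.pop("return", None)  # only parameters belong in the schema
    params = inspect.signature(func).parameters
    return {
        "title": func.__name__,
        "description": inspect.getdoc(func) or "",
        "type": "object",
        "properties": {
            name: {"title": name.capitalize(), "type": JSON_TYPES[hints[name]]}
            for name in params
        },
        # parameters without a default value are required
        "required": [
            name for name, p in params.items()
            if p.default is inspect.Parameter.empty
        ],
    }


def plain_add(x: float, y: float) -> float:
    """Add 'x' and 'y'."""
    return x + y


schema = sketch_schema(plain_add)
print(schema)
```

The sketch shows why the docstring, parameter names and type annotations matter: they are the only information available to describe the function to the LLM.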

When the tool is invoked, a JSON string generated by the LLM is parsed and then passed to the function as keyword arguments (kwargs), similar to the following:

In [7]:
import json

llm_output_string = "{\"x\": 5, \"y\": 2}"  # this is the output from the LLM
llm_output_dict = json.loads(llm_output_string)  # load as dictionary
llm_output_dict

{'x': 5, 'y': 2}

This dictionary is then passed to the tool function as keyword arguments (kwargs) via the ** operator, which unpacks the dictionary into named arguments.

In [8]:
exponentiate.func(**llm_output_dict)

25
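
Because the argument string comes from the LLM, it can be useful to check it against the function signature before calling. The following is a small sketch of that idea using only the standard library; `call_with_json_args` and `power` are hypothetical names, and LangChain performs its own validation through the tool's args schema rather than through this code.

```python
import inspect
import json


def call_with_json_args(func, raw_args: str):
    """Parse a JSON object string and call func with it as keyword arguments."""
    kwargs = json.loads(raw_args)
    if not isinstance(kwargs, dict):
        raise TypeError("tool arguments must be a JSON object")
    # bind() raises TypeError for missing or unexpected arguments
    bound = inspect.signature(func).bind(**kwargs)
    return func(*bound.args, **bound.kwargs)


def power(x: float, y: float) -> float:
    """Raise 'x' to the power of 'y'."""
    return x ** y


result = call_with_json_args(power, '{"x": 5, "y": 2}')
print(result)

try:
    call_with_json_args(power, '{"x": 5}')  # 'y' is missing
except TypeError as err:
    error_message = str(err)
    print(f"rejected: {error_message}")
```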

## 2. Creating an Agent

Let's build a simple agent that uses tools. We will build it with the LangChain Expression Language (LCEL). We cover LCEL in more detail in the next chapter; for now, all we need to know is that our agent will be built with the following syntax and components:

```
agent = (
    <input parameters, including chat history and user query>
    | <prompt>
    | <LLM with tools>
)
```
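
The pipe syntax in the outline relies on Python operator overloading. The sketch below illustrates the idea with a hypothetical `Step` class whose `__or__` method chains two steps; the three stages are plain stand-in functions, not a real prompt or LLM, and this is only an illustration of the mechanism, not LangChain's implementation.

```python
class Step:
    """Hypothetical wrapper that makes a function composable with `|`."""

    def __init__(self, func):
        self.func = func

    def invoke(self, value):
        return self.func(value)

    def __or__(self, other):
        # self | other -> a new Step that runs self first, then other
        return Step(lambda value: other.invoke(self.invoke(value)))


# Stand-ins for the three stages in the outline above.
inputs = Step(lambda x: {"input": x["input"], "chat_history": x.get("chat_history", [])})
prompt_step = Step(lambda d: f"history={len(d['chat_history'])} user={d['input']}")
fake_llm = Step(lambda text: f"LLM saw: {text}")

pipeline = inputs | prompt_step | fake_llm
piped = pipeline.invoke({"input": "hi"})
print(piped)
```

Each `|` returns a new composable object, so a pipeline of any length is still invoked with a single `.invoke()` call.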
When creating an agent, we need to add conversational memory so that the agent remembers previous interactions. We will use the older ConversationBufferMemory class rather than the newer RunnableWithMessageHistory, because we will also be using the older create_tool_calling_agent method and AgentExecutor class.


In [12]:
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder

# Define the prompt from a template

prompt = ChatPromptTemplate.from_messages([
    ("system", "you're a helpful assistant"),
    MessagesPlaceholder(variable_name="chat_history"),
    ("human", "{input}"),
    ("placeholder", "{agent_scratchpad}"),
])

# Define the memory type

from langchain.memory import ConversationBufferMemory

memory = ConversationBufferMemory(
    memory_key="chat_history",  # must align with MessagesPlaceholder variable_name
    return_messages=True  # to return Message objects
)
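
The role the buffer plays can be sketched without LangChain: it stores every turn and hands the stored turns back under the key the prompt expects. `BufferMemorySketch` and its `load`/`save` methods are hypothetical names chosen for this illustration; the real class has a different interface and stores message objects rather than tuples.

```python
class BufferMemorySketch:
    """Hypothetical stand-in for a conversation buffer: it keeps every turn."""

    def __init__(self, memory_key: str = "chat_history"):
        self.memory_key = memory_key
        self.messages: list[tuple[str, str]] = []

    def load(self) -> dict:
        # what gets merged into the prompt inputs before each call
        return {self.memory_key: list(self.messages)}

    def save(self, user_text: str, ai_text: str) -> None:
        # called after each call to record the turn
        self.messages.append(("human", user_text))
        self.messages.append(("ai", ai_text))


sketch_memory = BufferMemorySketch()
sketch_memory.save("My name is Ana", "Nice to meet you, Ana!")
prompt_inputs = {"input": "What is my name?", **sketch_memory.load()}
print(prompt_inputs)
```

This is why the memory key has to match the placeholder name in the prompt: the stored messages are injected under that key.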



In [None]:
import os
from getpass import getpass
from langchain_openai import ChatOpenAI

# Define the language model (prompts for the API key if it is not already set)

os.environ["OPENAI_API_KEY"] = os.getenv("OPENAI_API_KEY") \
    or getpass("Enter your OpenAI API key: ")

llm = ChatOpenAI(
    model_name="gpt-4o-mini",
    temperature=0.0,
)

Now we initialize our agent. For that we need:

* llm: already defined

* tools: still to be defined (simply a list of the tools we created earlier)

* prompt: already defined

* memory: already defined

In [None]:
from langchain.agents import create_tool_calling_agent

# Build the agent from the language model, tools and prompt

tools = [add, subtract, multiply, exponentiate]

agent = create_tool_calling_agent(
    llm=llm, tools=tools, prompt=prompt
)

In [14]:
# Show the agent's response

agent.invoke({
    "input": "what is 10.7 multiplied by 7.68?",
    "chat_history": memory.chat_memory.messages,
    "intermediate_steps": []  # agent will append its internal steps here
})

[ToolAgentAction(tool='multiply', tool_input={'x': 10.7, 'y': 7.68}, log="\nInvoking: `multiply` with `{'x': 10.7, 'y': 7.68}`\n\n\n", message_log=[AIMessage(content='', additional_kwargs={'tool_calls': [{'id': 'call_hinLdDgweN4rhnht16qRwXQ8', 'function': {'arguments': '{"x":10.7,"y":7.68}', 'name': 'multiply'}, 'type': 'function'}], 'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 21, 'prompt_tokens': 146, 'total_tokens': 167, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}, 'model_name': 'gpt-4o-mini-2024-07-18', 'system_fingerprint': 'fp_dbaca60df0', 'finish_reason': 'tool_calls', 'logprobs': None}, id='run-6f5b3b9d-2701-4215-9982-66d014733b8e-0', tool_calls=[{'name': 'multiply', 'args': {'x': 10.7, 'y': 7.68}, 'id': 'call_hinLdDgweN4rhnht16qRwXQ8', 'type': 'tool_call'}], usage_metadata={'input_tokens': 

Here we can see that the LLM has decided we should use the multiply tool and that the tool input should be {"x": 10.7, "y": 7.68}. However, the tool is not executed. For that we need an agent execution loop, which handles the repeated iterations of generation, tool call, generation, and so on.
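
The shape of such a loop can be sketched in plain Python. Here `scripted_model` is a hypothetical stand-in for the LLM that first requests a tool and then answers, and `run_loop` is a simplified illustration under that assumption, not the actual AgentExecutor logic.

```python
def scripted_model(scratchpad: list) -> dict:
    """Hypothetical stand-in for the LLM: request a tool first, then answer."""
    if not scratchpad:
        return {"tool": "multiply", "args": {"x": 10.7, "y": 7.68}}
    last_observation = scratchpad[-1][1]
    return {"final": f"The result is about {round(last_observation, 2)}."}


def run_loop(model, tool_registry: dict, max_steps: int = 5) -> str:
    scratchpad = []  # (action, observation) pairs, like intermediate_steps
    for _ in range(max_steps):
        action = model(scratchpad)  # generation
        if "final" in action:
            return action["final"]  # nothing left to execute
        # tool call: look up the requested tool and run it
        observation = tool_registry[action["tool"]](**action["args"])
        scratchpad.append((action, observation))
    raise RuntimeError("no final answer within max_steps")


loop_answer = run_loop(scripted_model, {"multiply": lambda x, y: x * y})
print(loop_answer)
```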

We use the AgentExecutor class to handle the execution loop.

In [17]:
from langchain.agents import AgentExecutor

agent_executor = AgentExecutor(
    agent=agent,
    tools=tools,
    memory=memory,
    verbose=True
)

# Now let's try the same query with the executor. Note that the intermediate_steps parameter
# we added before is no longer needed, because the executor handles it internally.

agent_executor.invoke({
    "input": "what is 10.7 multiplied by 7.68?",
    "chat_history": memory.chat_memory.messages,
})



> Entering new AgentExecutor chain...

Invoking: `multiply` with `{'x': 10.7, 'y': 7.68}`


82.17599999999999
10.7 multiplied by 7.68 is approximately 82.18.

> Finished chain.


{'input': 'what is 10.7 multiplied by 7.68?',
 'chat_history': [HumanMessage(content='what is 10.7 multiplied by 7.68?', additional_kwargs={}, response_metadata={}),
  AIMessage(content='10.7 multiplied by 7.68 is approximately 82.18.', additional_kwargs={}, response_metadata={}),
  HumanMessage(content='what is 10.7 multiplied by 7.68?', additional_kwargs={}, response_metadata={}),
  AIMessage(content='10.7 multiplied by 7.68 is approximately 82.18.', additional_kwargs={}, response_metadata={}),
  HumanMessage(content='what is 10.7 multiplied by 7.68?', additional_kwargs={}, response_metadata={}),
  AIMessage(content='10.7 multiplied by 7.68 is approximately 82.18.', additional_kwargs={}, response_metadata={})],
 'output': '10.7 multiplied by 7.68 is approximately 82.18.'}

In [19]:
print(f'Check against direct computation: {10.7*7.68}')

Check against direct computation: 82.17599999999999


### 2.1 Validating memory and tool use

In [20]:
name = input()

agent_executor.invoke({
    "input": f"My name is {name}",
    "chat_history": memory.chat_memory.messages
})



> Entering new AgentExecutor chain...
Nice to meet you, Bric! How can I assist you today?

> Finished chain.


{'input': 'My name is bric',
 'chat_history': [HumanMessage(content='what is 10.7 multiplied by 7.68?', additional_kwargs={}, response_metadata={}),
  AIMessage(content='10.7 multiplied by 7.68 is approximately 82.18.', additional_kwargs={}, response_metadata={}),
  HumanMessage(content='what is 10.7 multiplied by 7.68?', additional_kwargs={}, response_metadata={}),
  AIMessage(content='10.7 multiplied by 7.68 is approximately 82.18.', additional_kwargs={}, response_metadata={}),
  HumanMessage(content='what is 10.7 multiplied by 7.68?', additional_kwargs={}, response_metadata={}),
  AIMessage(content='10.7 multiplied by 7.68 is approximately 82.18.', additional_kwargs={}, response_metadata={}),
  HumanMessage(content='My name is bric', additional_kwargs={}, response_metadata={}),
  AIMessage(content='Nice to meet you, Bric! How can I assist you today?', additional_kwargs={}, response_metadata={})],
 'output': 'Nice to meet you, Bric! How can I assist you today?'}

In [22]:
agent_executor.invoke({
    "input": "What is nine plus 10, minus 4 * 2 and the result to the power of 3",
    "chat_history": memory.chat_memory.messages
})



> Entering new AgentExecutor chain...

Invoking: `add` with `{'x': 9, 'y': 10}`


19.0

Invoking: `multiply` with `{'x': 4, 'y': 2}`


8.0

Invoking: `exponentiate` with `{'x': 11, 'y': 3}`


1331.0
The result of \( (9 + 10 - 4 \times 2)^3 \) is \( 1331 \).

> Finished chain.


{'input': 'What is nine plus 10, minus 4 * 2 and the result to the power of 3',
 'chat_history': [HumanMessage(content='what is 10.7 multiplied by 7.68?', additional_kwargs={}, response_metadata={}),
  AIMessage(content='10.7 multiplied by 7.68 is approximately 82.18.', additional_kwargs={}, response_metadata={}),
  HumanMessage(content='what is 10.7 multiplied by 7.68?', additional_kwargs={}, response_metadata={}),
  AIMessage(content='10.7 multiplied by 7.68 is approximately 82.18.', additional_kwargs={}, response_metadata={}),
  HumanMessage(content='what is 10.7 multiplied by 7.68?', additional_kwargs={}, response_metadata={}),
  AIMessage(content='10.7 multiplied by 7.68 is approximately 82.18.', additional_kwargs={}, response_metadata={}),
  HumanMessage(content='My name is bric', additional_kwargs={}, response_metadata={}),
  AIMessage(content='Nice to meet you, Bric! How can I assist you today?', additional_kwargs={}, response_metadata={}),
  HumanMessage(content='What is nine 

In [25]:
answ = (9+10-(4*2))**3

print(f'Check against direct computation: {answ}')

Check against direct computation: 1331


In [26]:
agent_executor.invoke({
    "input": "What is my name",
    "chat_history": memory.chat_memory.messages
})



> Entering new AgentExecutor chain...
Your name is Bric.

> Finished chain.


{'input': 'What is my name',
 'chat_history': [HumanMessage(content='what is 10.7 multiplied by 7.68?', additional_kwargs={}, response_metadata={}),
  AIMessage(content='10.7 multiplied by 7.68 is approximately 82.18.', additional_kwargs={}, response_metadata={}),
  HumanMessage(content='what is 10.7 multiplied by 7.68?', additional_kwargs={}, response_metadata={}),
  AIMessage(content='10.7 multiplied by 7.68 is approximately 82.18.', additional_kwargs={}, response_metadata={}),
  HumanMessage(content='what is 10.7 multiplied by 7.68?', additional_kwargs={}, response_metadata={}),
  AIMessage(content='10.7 multiplied by 7.68 is approximately 82.18.', additional_kwargs={}, response_metadata={}),
  HumanMessage(content='My name is bric', additional_kwargs={}, response_metadata={}),
  AIMessage(content='Nice to meet you, Bric! How can I assist you today?', additional_kwargs={}, response_metadata={}),
  HumanMessage(content='What is nine plus 10, minus 4 * 2, to the power of 3', additiona

### 2.2 Web search with SerpAPI

In this example we use the same agent and executor setup as before, but add the SerpAPI service so that our agent can search the web.

To use this tool you need an API key. The free plan allows up to 100 searches per month.

https://serpapi.com/

In [27]:
os.environ["SERPAPI_API_KEY"] = os.getenv("SERPAPI_API_KEY") \
    or getpass("Enter your SerpAPI API key: ")

In [28]:
from langchain.agents import load_tools

toolbox = load_tools(tool_names=['serpapi'], llm=llm)

The custom tools below look up your IP address to work out where you currently are. A second helper function returns the current date and time, and the agent feeds this information into SerpAPI to find the weather in your area at the moment the function runs.

In [30]:
import requests
from datetime import datetime

# Tool that uses the public IP address to identify the approximate location

@tool
def get_location_from_ip() -> str:
    """Get the geographical location based on the IP address."""
    try:
        # a timeout keeps the agent from hanging if the service is unreachable
        response = requests.get("https://ipinfo.io/json", timeout=10)
        data = response.json()
        if 'loc' in data:
            latitude, longitude = data['loc'].split(',')
            return (
                f"Latitude: {latitude},\n"
                f"Longitude: {longitude},\n"
                f"City: {data.get('city', 'N/A')},\n"
                f"Country: {data.get('country', 'N/A')}"
            )
        return "Location could not be determined."
    except Exception as e:
        return f"Error occurred: {e}"

@tool
def get_current_datetime() -> str:
    """Return the current date and time."""
    return datetime.now().strftime("%Y-%m-%d %H:%M:%S")


# Create the prompt template

prompt = ChatPromptTemplate.from_messages([
    ("system", "you're a helpful assistant"),
    ("human", "{input}"),
    ("placeholder", "{agent_scratchpad}")
])

# Create the agent

tools = toolbox + [get_current_datetime, get_location_from_ip]

agent = create_tool_calling_agent(
    llm=llm, tools=tools, prompt=prompt
)

agent_executor = AgentExecutor(
    agent=agent, tools=tools, verbose=True
)

# Invoke the agent

out = agent_executor.invoke({
    "input": (
        "I have a few questions, what is the date and time right now? "
        "How is the weather where I am? Please give me degrees in Celsius"
    )
})



> Entering new AgentExecutor chain...

Invoking: `get_current_datetime` with `{}`


2025-05-18 18:04:48

Invoking: `get_location_from_ip` with `{}`


Latitude: 4.6097,
Longitude: -74.0817,
City: Bogotá,
Country: CO

Invoking: `Search` with `current weather in Bogotá, Colombia`


{'type': 'weather_result', 'temperature': '56', 'unit': 'Fahrenheit', 'precipitation': '65%', 'humidity': '99%', 'wind': '2 mph', 'location': 'Bogotá, Bogota, Colombia', 'date': 'Sunday 6:00 PM', 'weather': 'Light rain'}
The current date and time is **May 18, 2025, 18:04:48**. 

The weather in **Bogotá, Colombia** is **light rain** with a temperature of **13°C** (which is approximately 56°F). The humidity is at **99%** with a wind speed of **2 mph**.

> Finished chain.


In [31]:
from IPython.display import display, Markdown

display(Markdown(out["output"]))

The current date and time is **May 18, 2025, 18:04:48**. 

The weather in **Bogotá, Colombia** is **light rain** with a temperature of **13°C** (which is approximately 56°F). The humidity is at **99%** with a wind speed of **2 mph**.
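
The search tool returned the temperature in Fahrenheit and the LLM did the conversion itself, so, as with the earlier arithmetic checks, it is worth verifying the number directly. This is only a quick check using the standard conversion formula; the variable names are illustrative.

```python
fahrenheit = 56  # value returned by the search tool in the run above
celsius = (fahrenheit - 32) * 5 / 9
print(f"{fahrenheit}°F = {celsius:.1f}°C")
```

The result is about 13.3°C, which matches the 13°C reported by the agent.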

## 3. Agent Executor

When we talk about agents, a large part of an "agent" is simply code logic that iteratively calls the LLM and processes its outputs. The exact logic varies considerably, but one well-known example is the ReAct agent.

![ReAct process](https://www.aurelio.ai/_next/image?url=%2Fimages%2Fposts%2Fai-agents%2Fai-agents-00.png&w=640&q=75)

Reason + Action (ReAct) agents use iterative reasoning and action steps to bring chain-of-thought and tool use into their execution. During the reasoning step, the LLM generates the steps needed to answer the query. The LLM then generates the action input, which our code logic interprets as a tool call.

![Agentic graph of ReAct](https://www.aurelio.ai/_next/image?url=%2Fimages%2Fposts%2Fai-agents%2Fai-agents-01.png&w=640&q=75)

After the action step, we get an observation back from the tool call. We then feed that observation back into the agent executor logic to either get a final answer or continue with further reasoning and action steps.

The agent and agent executor we build will follow this pattern.
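
To illustrate how code logic can interpret a reasoning and action step, the sketch below parses one turn written in a text-based Thought / Action / Action Input layout. That layout and the `parse_react_turn` helper are assumptions made for illustration; the agent built in this notebook relies on the model's structured tool calls instead of parsing text.

```python
import re

# One model turn in a text-based ReAct layout (hand-written here).
model_turn = (
    "Thought: I need to multiply the two numbers.\n"
    "Action: multiply\n"
    'Action Input: {"x": 10.7, "y": 7.68}'
)


def parse_react_turn(text: str) -> dict:
    """Split a ReAct-style turn into its thought, action and action input."""
    match = re.search(
        r"Thought:\s*(?P<thought>.*?)\s*Action:\s*(?P<action>\S+)\s*"
        r"Action Input:\s*(?P<action_input>.*)",
        text,
        re.DOTALL,
    )
    if match is None:
        raise ValueError("text does not follow the Thought/Action layout")
    return match.groupdict()


parsed = parse_react_turn(model_turn)
print(parsed["action"], parsed["action_input"])
```

The parsed action name selects the tool and the action input becomes its arguments; the tool's return value is the observation fed into the next iteration.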

In [32]:
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder

prompt = ChatPromptTemplate.from_messages([
    ("system", (
        "You're a helpful assistant. When answering a user's question "
        "you should first use one of the tools provided. After using a "
        "tool the tool output will be provided in the "
        "'scratchpad' below. If you have an answer in the "
        "scratchpad you should not use any more tools and "
        "instead answer directly to the user."
    )),
    MessagesPlaceholder(variable_name="chat_history"),
    ("human", "{input}"),
    MessagesPlaceholder(variable_name="agent_scratchpad"),
])

To add tools to our LLM, we use the bind_tools method inside the LCEL constructor, which takes our tools and attaches them to the LLM. We also pass tool_choice="any" to bind_tools, which tells the LLM that it MUST use a tool, that is, it cannot give a final answer directly (and thereby skip using a tool).

Because we set tool_choice="any" to force tool use, the usual content field will be empty, since that field holds natural-language output, i.e. the LLM's final answer. To find the tool call the LLM generated, we need to look at the tool_calls field.

In [33]:
from langchain_core.runnables.base import RunnableSerializable

tools = [add, subtract, multiply, exponentiate]

# define the agent runnable
agent: RunnableSerializable = (
    {
        "input": lambda x: x["input"],
        "chat_history": lambda x: x["chat_history"],
        "agent_scratchpad": lambda x: x.get("agent_scratchpad", [])
    }
    | prompt
    | llm.bind_tools(tools, tool_choice="any")
)

tool_call = agent.invoke({"input": "What is 10 + 10", "chat_history": []})
tool_call

AIMessage(content='', additional_kwargs={'tool_calls': [{'id': 'call_r1hIi0hJqhLz2zfzYhDxxcw6', 'function': {'arguments': '{"x":10,"y":10}', 'name': 'add'}, 'type': 'function'}], 'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 17, 'prompt_tokens': 197, 'total_tokens': 214, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}, 'model_name': 'gpt-4o-mini-2024-07-18', 'system_fingerprint': 'fp_dbaca60df0', 'finish_reason': 'tool_calls', 'logprobs': None}, id='run-0a58f971-8ef3-4a15-b8b9-31ecf4f03cd1-0', tool_calls=[{'name': 'add', 'args': {'x': 10, 'y': 10}, 'id': 'call_r1hIi0hJqhLz2zfzYhDxxcw6', 'type': 'tool_call'}], usage_metadata={'input_tokens': 197, 'output_tokens': 17, 'total_tokens': 214, 'input_token_details': {'audio': 0, 'cache_read': 0}, 'output_token_details': {'audio': 0, 'reasoning': 0}})

In [34]:
tool_call.tool_calls

[{'name': 'add',
  'args': {'x': 10, 'y': 10},
  'id': 'call_r1hIi0hJqhLz2zfzYhDxxcw6',
  'type': 'tool_call'}]

In [35]:
# create tool name to function mapping
name2tool = {tool.name: tool.func for tool in tools}

In [37]:
tool_exec_content = name2tool[tool_call.tool_calls[0]["name"]](
    **tool_call.tool_calls[0]["args"]
)
tool_exec_content

20

In [38]:
from langchain_core.messages import ToolMessage

tool_exec = ToolMessage(
    content=f"The {tool_call.tool_calls[0]['name']} tool returned {tool_exec_content}",
    tool_call_id=tool_call.tool_calls[0]["id"]
)

out = agent.invoke({
    "input": "What is 10 + 10",
    "chat_history": [],
    "agent_scratchpad": [tool_call, tool_exec]
})
out

AIMessage(content='', additional_kwargs={'tool_calls': [{'id': 'call_8t9bYgAHEPVNJaGqzVwWlVMf', 'function': {'arguments': '{"x":10,"y":10}', 'name': 'add'}, 'type': 'function'}], 'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 17, 'prompt_tokens': 227, 'total_tokens': 244, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}, 'model_name': 'gpt-4o-mini-2024-07-18', 'system_fingerprint': 'fp_dbaca60df0', 'finish_reason': 'tool_calls', 'logprobs': None}, id='run-806e054b-3f78-476b-90f6-2589ef79fa8c-0', tool_calls=[{'name': 'add', 'args': {'x': 10, 'y': 10}, 'id': 'call_8t9bYgAHEPVNJaGqzVwWlVMf', 'type': 'tool_call'}], usage_metadata={'input_tokens': 227, 'output_tokens': 17, 'total_tokens': 244, 'input_token_details': {'audio': 0, 'cache_read': 0}, 'output_token_details': {'audio': 0, 'reasoning': 0}})

Even though the answer is already in our agent_scratchpad, the LLM still tries to use the tool again. This happens because we bound the tools to the LLM with tool_choice="any". Setting tool_choice to "any" or "required" tells the LLM that it MUST use a tool, so it cannot give a final answer directly.

There are two ways to fix this:

* Set tool_choice="auto" to tell the LLM that it may choose to either use a tool or give a final answer.

* Create a tool called final_answer; we explain this next.
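
With the first option, the loop has to decide on each reply whether the model asked for a tool or answered in text. A minimal sketch of that decision follows, using plain dictionaries shaped like the tool_calls output shown earlier; `next_step` and the `call_1` id are hypothetical, and real replies are AIMessage objects rather than dicts.

```python
def next_step(message: dict) -> tuple[str, object]:
    """Decide what to do with a model reply shaped like the outputs above."""
    calls = message.get("tool_calls") or []
    if calls:
        # the model asked for a tool: run it and loop again
        return ("call_tool", calls[0])
    # no tool call: under "auto" the text content is the final answer
    return ("finish", message.get("content", ""))


tool_reply = {
    "content": "",
    "tool_calls": [{"name": "add", "args": {"x": 10, "y": 10}, "id": "call_1"}],
}
text_reply = {"content": "10 + 10 equals 20.", "tool_calls": []}

print(next_step(tool_reply)[0])
print(next_step(text_reply))
```

The second option keeps tool use mandatory and instead treats a call to one designated tool as the stop signal.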

### 3.1 Serializable Agent

RunnableSerializable is a LangChain base class for an executable object (a runnable) that can also be serialized and deserialized. This means it is:

* Runnable: it can take an input, process it and return an output (like a function or a pipeline).

* Serializable: it can be saved in a persistent format (for example JSON, pickle or a custom format) so that it can be stored, shared or loaded later without losing its functionality.

| Feature                           | Description                                                                                         |
| --------------------------------- | --------------------------------------------------------------------------------------------------- |
| **Modular execution**             | Can be run through a standard method (for example `.invoke()`, `.batch()` or `.stream()`).          |
| **Chained composition**           | Lets you chain multiple `Runnable` objects to build complex pipelines easily.                       |
| **Serialization/deserialization** | Can be converted to a portable format and restored later without losing state or configuration.    |
| **Interoperability**              | Works with other LangChain components such as prompts, language models and tools.                   |
| **Flexible input/output**         | Supports input and output in different formats (strings, dictionaries, lists, objects).            |
| **Memory support**                | Can be combined with memory mechanisms to keep context between runs.                                |
| **Configurability**               | Accepts configuration such as LLM parameters, tools and control options.                            |




In [39]:
@tool
def final_answer(answer: str, tools_used: list[str]) -> dict:
    """Use this tool to provide a final answer to the user.
    The answer should be in natural language as this will be provided
    to the user directly. The tools_used must include a list of tool
    names that were used within the `scratchpad`.
    """
    return {"answer": answer, "tools_used": tools_used}

tools = [final_answer, add, subtract, multiply, exponentiate]

# we need to update our name2tool mapping too
name2tool = {tool.name: tool.func for tool in tools}

agent: RunnableSerializable = (
    {
        "input": lambda x: x["input"],
        "chat_history": lambda x: x["chat_history"],
        "agent_scratchpad": lambda x: x.get("agent_scratchpad", [])
    }
    | prompt
    | llm.bind_tools(tools, tool_choice="any")  # we're forcing tool use again
)

tool_call = agent.invoke({"input": "What is 10 + 10", "chat_history": []})
tool_call.tool_calls

[{'name': 'add',
  'args': {'x': 10, 'y': 10},
  'id': 'call_gQdDOqpIVFgqhEn59LXmzlEu',
  'type': 'tool_call'}]

In [40]:
tool_out = name2tool[tool_call.tool_calls[0]["name"]](
    **tool_call.tool_calls[0]["args"]
)

tool_exec = ToolMessage(
    content=f"The {tool_call.tool_calls[0]['name']} tool returned {tool_out}",
    tool_call_id=tool_call.tool_calls[0]["id"]
)

out = agent.invoke({
    "input": "What is 10 + 10",
    "chat_history": [],
    "agent_scratchpad": [tool_call, tool_exec]
})
out

AIMessage(content='', additional_kwargs={'tool_calls': [{'id': 'call_Y5UPFPYUrvENFO3XsWPeK8zC', 'function': {'arguments': '{"answer":"10 + 10 equals 20.","tools_used":["functions.add"]}', 'name': 'final_answer'}, 'type': 'function'}], 'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 27, 'prompt_tokens': 299, 'total_tokens': 326, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}, 'model_name': 'gpt-4o-mini-2024-07-18', 'system_fingerprint': 'fp_dbaca60df0', 'finish_reason': 'tool_calls', 'logprobs': None}, id='run-ae975fb6-ec6f-4596-9b23-d9b92f3effdc-0', tool_calls=[{'name': 'final_answer', 'args': {'answer': '10 + 10 equals 20.', 'tools_used': ['functions.add']}, 'id': 'call_Y5UPFPYUrvENFO3XsWPeK8zC', 'type': 'tool_call'}], usage_metadata={'input_tokens': 299, 'output_tokens': 27, 'total_tokens': 326, 'input_

In [41]:
out.tool_calls[0]["args"]

{'answer': '10 + 10 equals 20.', 'tools_used': ['functions.add']}

### 3.2 Agent Execution Loop

**Agent**: the logic or strategy that defines what to do.

* It contains the definition of how actions are chosen, which tools to use and how information is processed.

* For example, an agent may decide to call a calculator, run a search or answer directly.

* In essence, the agent is the "mind" or plan of action.

**AgentExecutor**: the execution mechanism or loop that handles how that logic is carried out.

* It controls the iterations between the agent generating actions, the tools being executed, the results being received and the process continuing.

* It keeps the agent running in a continuous flow until a final answer is reached.

* In other words, the agent_executor is the "engine" that makes the agent run.

In summary: the agent defines the logic of decisions and actions, while the agent_executor runs that logic in a controlled loop until the task is complete.

In [42]:
import json

from langchain_core.messages import AIMessage, BaseMessage, HumanMessage, ToolMessage


class CustomAgentExecutor:
    chat_history: list[BaseMessage]

    def __init__(self, max_iterations: int = 3):
        self.chat_history = []
        self.max_iterations = max_iterations
        self.agent: RunnableSerializable = (
            {
                "input": lambda x: x["input"],
                "chat_history": lambda x: x["chat_history"],
                "agent_scratchpad": lambda x: x.get("agent_scratchpad", [])
            }
            | prompt
            | llm.bind_tools(tools, tool_choice="any")  # we're forcing tool use again
        )

    def invoke(self, input: str) -> str:
        # invoke the agent iteratively in a loop until it reaches a final
        # answer or we run out of iterations
        count = 0
        agent_scratchpad = []
        final_out = None
        while count < self.max_iterations:
            # invoke a step for the agent to generate a tool call
            tool_call = self.agent.invoke({
                "input": input,
                "chat_history": self.chat_history,
                "agent_scratchpad": agent_scratchpad
            })
            # add the tool call to the scratchpad
            agent_scratchpad.append(tool_call)
            # execute the requested tool (only the first tool call is handled)
            tool_name = tool_call.tool_calls[0]["name"]
            tool_args = tool_call.tool_calls[0]["args"]
            tool_call_id = tool_call.tool_calls[0]["id"]
            tool_out = name2tool[tool_name](**tool_args)
            # add the tool output to the agent scratchpad
            tool_exec = ToolMessage(
                content=f"{tool_out}",
                tool_call_id=tool_call_id
            )
            agent_scratchpad.append(tool_exec)
            # add a print so we can see intermediate steps
            print(f"{count}: {tool_name}({tool_args})")
            count += 1
            # if the tool call is the final answer tool, we stop
            if tool_name == "final_answer":
                final_out = tool_out
                break
        if final_out is None:
            # max_iterations was reached without a final_answer call
            final_out = {"answer": "No final answer was reached.", "tools_used": []}
        # add the final output to the chat history
        self.chat_history.extend([
            HumanMessage(content=input),
            AIMessage(content=final_out["answer"])
        ])
        # return the final answer as a JSON string
        return json.dumps(final_out)

In [43]:
agent_executor = CustomAgentExecutor()

In [44]:
agent_executor.invoke(input="What is 10 + 10")

0: add({'x': 10, 'y': 10})
1: final_answer({'answer': '10 + 10 equals 20.', 'tools_used': ['functions.add']})


'{"answer": "10 + 10 equals 20.", "tools_used": ["functions.add"]}'

## 4. LCEL (LangChain Expression Language)

In [45]:
from langchain_core.prompts import PromptTemplate

prompt_template = "Give me a small report on {topic}"

prompt = PromptTemplate(
    input_variables=["topic"],
    template=prompt_template
)

import os
from getpass import getpass
from langchain_openai import ChatOpenAI

os.environ["OPENAI_API_KEY"] = os.getenv("OPENAI_API_KEY") \
    or getpass("Enter your OpenAI API key: ")

llm = ChatOpenAI(
    model_name="gpt-4o-mini",
    temperature=0.0,
)

from langchain_core.output_parsers import StrOutputParser

output_parser = StrOutputParser()

In [46]:
lcel_chain = prompt | llm | output_parser

In [47]:
result = lcel_chain.invoke("retrieval augmented generation")
result 

'### Report on Retrieval-Augmented Generation (RAG)\n\n#### Introduction\nRetrieval-Augmented Generation (RAG) is an innovative approach that combines the strengths of information retrieval and natural language generation. This method enhances the capabilities of language models by allowing them to access external knowledge bases or documents during the generation process, thereby improving the relevance and accuracy of the generated content.\n\n#### Key Concepts\n1. **Retrieval Mechanism**: RAG employs a retrieval system to fetch relevant documents or pieces of information from a large corpus based on the input query. This is typically done using techniques such as vector embeddings or traditional keyword-based search.\n\n2. **Generation Mechanism**: After retrieving relevant information, a generative model (often based on architectures like Transformers) synthesizes this information into coherent and contextually appropriate responses. The generative model can be fine-tuned to ensure

In [48]:
from IPython.display import display, Markdown

display(Markdown(result))

### Report on Retrieval-Augmented Generation (RAG)

#### Introduction
Retrieval-Augmented Generation (RAG) is an innovative approach that combines the strengths of information retrieval and natural language generation. This method enhances the capabilities of language models by allowing them to access external knowledge bases or documents during the generation process, thereby improving the relevance and accuracy of the generated content.

#### Key Concepts
1. **Retrieval Mechanism**: RAG employs a retrieval system to fetch relevant documents or pieces of information from a large corpus based on the input query. This is typically done using techniques such as vector embeddings or traditional keyword-based search.

2. **Generation Mechanism**: After retrieving relevant information, a generative model (often based on architectures like Transformers) synthesizes this information into coherent and contextually appropriate responses. The generative model can be fine-tuned to ensure that the output aligns with the desired style and tone.

3. **Hybrid Approach**: By integrating retrieval and generation, RAG leverages the vast amount of information available in external databases while maintaining the fluency and creativity of generative models. This hybrid approach allows for more informed and contextually rich outputs.

#### Applications
- **Question Answering**: RAG is particularly effective in question-answering systems where users seek specific information. By retrieving relevant documents, the model can provide accurate answers backed by credible sources.
- **Content Creation**: In content generation tasks, RAG can pull in facts, statistics, or quotes from external sources, enriching the generated text and making it more informative.
- **Conversational Agents**: RAG can enhance chatbots and virtual assistants by allowing them to access up-to-date information, improving their ability to handle diverse queries.

#### Advantages
- **Improved Accuracy**: By grounding responses in retrieved information, RAG reduces the likelihood of generating incorrect or misleading content.
- **Dynamic Knowledge**: RAG can adapt to new information without the need for retraining the entire model, as it can access updated external databases.
- **Contextual Relevance**: The retrieval process ensures that the generated content is relevant to the specific context of the query.

#### Challenges
- **Latency**: The retrieval process can introduce latency, which may affect the responsiveness of applications, especially in real-time systems.
- **Quality of Retrieved Information**: The effectiveness of RAG heavily depends on the quality and relevance of the retrieved documents. Poor retrieval can lead to suboptimal generation.
- **Complexity**: Implementing a RAG system requires expertise in both retrieval and generation techniques, making it more complex than traditional models.

#### Conclusion
Retrieval-Augmented Generation represents a significant advancement in the field of natural language processing, merging the capabilities of information retrieval with generative models. Its ability to produce accurate, contextually relevant, and informative content makes it a valuable tool across various applications, from customer support to content creation. As research and development in this area continue, RAG is poised to play a crucial role in the evolution of intelligent systems that require both knowledge and creativity.

### 4.1 How does the pipe operator work?

Before moving on to other LCEL features, let's take a moment to understand what the pipe operator `|` does and how it works.

Functionally, the pipe operator chains components: whatever the left-hand side produces is passed as input to the right-hand side. In the example `prompt | llm | output_parser`, the output of `prompt` is passed to `llm`, and the output of `llm` is then passed to `output_parser`.

To see how this works, we will create a basic class called `Runnable` that wraps a function in an executable object. In Python, `a | b` calls `a.__or__(b)`, so implementing `__or__` on the class lets us chain several of these wrapped functions with the pipe operator.

The cell below defines the class along with a few simple functions to chain:

In [49]:
class Runnable:
    """Wraps a function so it can be chained with the | operator."""
    def __init__(self, func):
        self.func = func
    def __or__(self, other):
        # self | other: run self first, then feed its output to other
        def chained_func(*args, **kwargs):
            return other.invoke(self.func(*args, **kwargs))
        return Runnable(chained_func)
    def invoke(self, *args, **kwargs):
        return self.func(*args, **kwargs)

def add_five(x):
    return x+5

def sub_five(x):
    return x-5

def mul_five(x):
    return x*5

add_five_runnable = Runnable(add_five)
sub_five_runnable = Runnable(sub_five)
mul_five_runnable = Runnable(mul_five)

In [50]:
# Build the chain by calling __or__ explicitly
chain = (add_five_runnable).__or__(sub_five_runnable).__or__(mul_five_runnable)

chain.invoke(3)

15

In [51]:
# The pipe operator is shorthand for the same __or__ calls
chain = add_five_runnable | sub_five_runnable | mul_five_runnable

chain.invoke(3)

15
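To make the order of execution visible, here is a small variation on the class above. It is only a sketch: `TracedRunnable` and the `calls` list are illustrative names introduced here, not part of LangChain. Each function records its own name when it runs, and each chained object keeps a label that shows how the `|` expressions were grouped.

```python
class TracedRunnable:
    """Toy variant of the Runnable class above that labels each chain."""

    def __init__(self, func, name=None):
        self.func = func
        self.name = name or func.__name__

    def __or__(self, other):
        # self | other: run self first, then pass its output to other
        def chained(x):
            return other.invoke(self.invoke(x))
        return TracedRunnable(chained, name=f"({self.name} | {other.name})")

    def invoke(self, x):
        return self.func(x)


calls = []  # records the order in which the wrapped functions run

def add_five(x):
    calls.append("add_five")
    return x + 5

def sub_five(x):
    calls.append("sub_five")
    return x - 5

def mul_five(x):
    calls.append("mul_five")
    return x * 5

chain = TracedRunnable(add_five) | TracedRunnable(sub_five) | TracedRunnable(mul_five)
result = chain.invoke(3)

print(chain.name)  # ((add_five | sub_five) | mul_five)
print(calls)       # ['add_five', 'sub_five', 'mul_five']
print(result)      # 15
```

The label shows that `|` groups from left to right, and the recorded calls confirm that the functions run in the same order they appear in the chain.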

### 4.2 LCEL `RunnableLambda`

The `RunnableLambda` class is LangChain's built-in way to construct a runnable object from a function. In other words, it does the same job as the custom `Runnable` class we created above. Let's try it with the same functions as before.

In [52]:
from langchain_core.runnables import RunnableLambda

add_five_runnable = RunnableLambda(add_five)
sub_five_runnable = RunnableLambda(sub_five)
mul_five_runnable = RunnableLambda(mul_five)

chain = add_five_runnable | sub_five_runnable | mul_five_runnable

chain.invoke(3)

15

In [53]:
prompt_str = "give me a small report about {topic}"
prompt = PromptTemplate(
    input_variables=["topic"],
    template=prompt_str
)

chain = prompt | llm | output_parser

result = chain.invoke("AI")
display(Markdown(result))

### Report on Artificial Intelligence (AI)

#### Introduction
Artificial Intelligence (AI) refers to the simulation of human intelligence in machines programmed to think and learn like humans. It encompasses a variety of technologies and methodologies, including machine learning, natural language processing, robotics, and computer vision. AI has rapidly evolved over the past few decades, becoming an integral part of various industries and daily life.

#### Current State of AI
As of 2023, AI technologies are being utilized across multiple sectors, including healthcare, finance, transportation, and entertainment. Key advancements include:

1. **Machine Learning (ML)**: Algorithms that enable computers to learn from and make predictions based on data. Deep learning, a subset of ML, has led to breakthroughs in image and speech recognition.

2. **Natural Language Processing (NLP)**: Technologies that allow machines to understand and respond to human language. Applications include chatbots, virtual assistants, and language translation services.

3. **Computer Vision**: The ability of machines to interpret and make decisions based on visual data. This technology is widely used in facial recognition, autonomous vehicles, and medical imaging.

4. **Robotics**: AI-driven robots are increasingly used in manufacturing, logistics, and even healthcare, performing tasks ranging from assembly line work to surgical assistance.

#### Applications of AI
- **Healthcare**: AI is used for diagnostics, personalized medicine, and predictive analytics, improving patient outcomes and operational efficiency.
- **Finance**: AI algorithms analyze market trends, detect fraud, and automate trading, enhancing decision-making and risk management.
- **Transportation**: Autonomous vehicles leverage AI for navigation and safety, while AI systems optimize traffic management and logistics.
- **Entertainment**: Streaming services use AI to recommend content based on user preferences, while video games employ AI for more realistic and adaptive gameplay.

#### Challenges and Ethical Considerations
Despite its potential, AI poses several challenges:
- **Bias and Fairness**: AI systems can perpetuate existing biases present in training data, leading to unfair outcomes.
- **Privacy Concerns**: The use of AI in data collection raises significant privacy issues, necessitating robust regulations.
- **Job Displacement**: Automation driven by AI may lead to job losses in certain sectors, prompting discussions about workforce retraining and economic impact.
- **Security Risks**: AI technologies can be exploited for malicious purposes, including cyberattacks and misinformation campaigns.

#### Future Outlook
The future of AI is promising, with ongoing research aimed at creating more advanced, ethical, and transparent AI systems. Innovations in explainable AI (XAI) seek to make AI decision-making processes more understandable to users. Additionally, interdisciplinary collaboration will be crucial in addressing the ethical implications and societal impacts of AI.

#### Conclusion
AI is transforming industries and reshaping the way we live and work. While it offers significant benefits, it also presents challenges that require careful consideration and proactive management. As AI continues to evolve, its integration into society will necessitate a balanced approach that maximizes its potential while minimizing risks.

---

This report provides a brief overview of the current landscape of AI, its applications, challenges, and future directions.

In [54]:
# Example with extra steps in the chain: drop the report title, then replace a word

def extract_fact(x):
    if "\n\n" in x:
        return "\n".join(x.split("\n\n")[1:])
    else:
        return x

old_word = "AI"
new_word = "skynet"

def replace_word(x):
    return x.replace(old_word, new_word)


extract_fact_runnable = RunnableLambda(extract_fact)
replace_word_runnable = RunnableLambda(replace_word)

chain = prompt | llm | output_parser | extract_fact_runnable | replace_word_runnable

result = chain.invoke("retrieval augmented generation")
display(Markdown(result))

#### Introduction
Retrieval-Augmented Generation (RAG) is an innovative approach that combines the strengths of information retrieval and natural language generation. This method enhances the capabilities of language models by allowing them to access external knowledge sources, thereby improving the accuracy and relevance of generated responses.
#### Concept Overview
RAG operates on the principle of integrating a retrieval mechanism with a generative model. The process typically involves two main components:
1. **Retrieval Component**: This part of the system retrieves relevant documents or pieces of information from a large corpus based on a given query. It utilizes techniques such as vector embeddings and similarity search to identify the most pertinent data.
2. **Generation Component**: After retrieving the relevant information, the generative model (often based on architectures like Transformers) synthesizes a coherent and contextually appropriate response. This model leverages the retrieved data to enhance its output, ensuring that the generated text is not only fluent but also factually grounded.
#### Advantages
- **Enhanced Knowledge Access**: By incorporating external knowledge, RAG can provide more accurate and contextually relevant information than standalone generative models.
- **Dynamic Updates**: The retrieval component allows the system to access up-to-date information, making it adaptable to new data without the need for retraining the entire model.
- **Improved Contextuality**: RAG can generate responses that are more aligned with user queries, as it bases its output on specific, relevant information retrieved in real-time.
#### Applications
RAG has a wide range of applications, including:
- **Question Answering**: Providing precise answers to user queries by retrieving relevant documents and generating responses based on them.
- **Chatbots and Virtual Assistants**: Enhancing conversational agents with the ability to pull in real-time information, making interactions more informative and engaging.
- **Content Creation**: Assisting writers and content creators by providing relevant data and context, thereby improving the quality of generated content.
#### Challenges
Despite its advantages, RAG faces several challenges:
- **Retrieval Quality**: The effectiveness of the system heavily relies on the quality of the retrieval component. Poor retrieval can lead to irrelevant or incorrect information being used in generation.
- **Complexity**: Integrating retrieval and generation components can increase the complexity of the system, requiring careful tuning and optimization.
- **Latency**: The retrieval process can introduce latency, which may affect the responsiveness of applications, particularly in real-time scenarios.
#### Conclusion
Retrieval-Augmented Generation represents a significant advancement in the field of natural language processing. By effectively combining retrieval and generation, RAG systems can produce more accurate, relevant, and contextually aware outputs. As research and development in this area continue to evolve, RAG is poised to play a crucial role in enhancing various applications, from conversational agents to content generation tools. Future work will likely focus on improving retrieval mechanisms, reducing latency, and addressing the challenges associated with integrating these two components.
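To see what the two post-processing steps do without calling the model, they can be applied to a short hand-written string. This is a sketch under assumptions: the `sample` text is made up for illustration, and `replace_word` takes the old and new words as parameters here instead of reading module-level variables.

```python
def extract_fact(text: str) -> str:
    # Drop everything before the first blank line (the report title),
    # then rejoin the remaining blocks with single newlines.
    if "\n\n" in text:
        return "\n".join(text.split("\n\n")[1:])
    return text


def replace_word(text: str, old: str = "AI", new: str = "skynet") -> str:
    # Case-sensitive replacement of every occurrence of `old`.
    return text.replace(old, new)


# Hypothetical stand-in for a model response
sample = (
    "### Report on AI\n\n"
    "#### Introduction\nAI is everywhere.\n\n"
    "#### Conclusion\nAI matters."
)

step_1 = extract_fact(sample)
step_2 = replace_word(step_1)
print(step_2)
```

Because `extract_fact` rejoins the blocks with a single newline, the blank lines between sections disappear, which is why the rendered report above has no spacing between its headings and paragraphs. The replacement is case-sensitive, so a text that never contains the exact string `AI` passes through `replace_word` unchanged.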

### 4.3 LCEL `RunnableParallel` and `RunnablePassthrough`

LCEL provides several Runnable classes that let us control the flow of data and the order of execution through our chains. Two of them are `RunnableParallel` and `RunnablePassthrough`.

`RunnableParallel`: runs multiple Runnable instances in parallel on the same input, acting almost like a Y-fork in the chain.

`RunnablePassthrough`: passes its input on to the next Runnable unchanged.
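The behaviour of these two classes can be sketched in plain Python. The classes below (`ToyPassthrough`, `ToyLambda`, `ToyParallel`) are simplified stand-ins written for illustration, not LangChain's implementation. The parallel step hands the same input to every branch and collects the results in a dictionary keyed by branch name.

```python
from concurrent.futures import ThreadPoolExecutor


class ToyPassthrough:
    """Returns its input unchanged."""
    def invoke(self, x):
        return x


class ToyLambda:
    """Wraps a plain function behind an invoke() method."""
    def __init__(self, func):
        self.func = func

    def invoke(self, x):
        return self.func(x)


class ToyParallel:
    """Runs every branch on the same input and returns a dict of results."""
    def __init__(self, steps):
        self.steps = steps

    def invoke(self, x):
        # Submit every branch to a thread pool, then gather results by key
        with ThreadPoolExecutor() as pool:
            futures = {key: pool.submit(step.invoke, x)
                       for key, step in self.steps.items()}
            return {key: future.result() for key, future in futures.items()}


fork = ToyParallel({
    "upper": ToyLambda(str.upper),
    "length": ToyLambda(len),
    "question": ToyPassthrough(),
})

out = fork.invoke("what is lcel?")
print(out)
# {'upper': 'WHAT IS LCEL?', 'length': 13, 'question': 'what is lcel?'}
```

The `question` key shows why a passthrough is useful inside a parallel step: it keeps the original input available to later steps alongside the values the other branches computed.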

In [55]:
from langchain_openai import OpenAIEmbeddings
from langchain_community.vectorstores import DocArrayInMemorySearch

embedding = OpenAIEmbeddings()

vecstore_a = DocArrayInMemorySearch.from_texts(
    [
        "half the info is here",
        "DeepSeek-V3 was released in December 2024"
    ],
    embedding=embedding
)
vecstore_b = DocArrayInMemorySearch.from_texts(
    [
        "the other half of the info is here",
        "the DeepSeek-V3 LLM is a mixture of experts model with 671B parameters"
    ],
    embedding=embedding
)


In [56]:
# The full prompt takes three inputs: the two context variables in the system
# message below, plus the question, which is added as a separate human message.

prompt_str = """Using the context provided, answer the user's question.
Context:
{context_a}
{context_b}
"""

In [57]:
from langchain_core.prompts import ChatPromptTemplate, SystemMessagePromptTemplate, HumanMessagePromptTemplate

prompt = ChatPromptTemplate.from_messages([
    SystemMessagePromptTemplate.from_template(prompt_str),
    HumanMessagePromptTemplate.from_template("{question}")
])

In [58]:
from langchain_core.runnables import RunnablePassthrough, RunnableParallel

retriever_a = vecstore_a.as_retriever()
retriever_b = vecstore_b.as_retriever()

retrieval = RunnableParallel(
    {
        "context_a": retriever_a, "context_b": retriever_b, "question": RunnablePassthrough()
    }
)

The chain we will build looks roughly like this:

![](https://github.com/aurelio-labs/langchain-course/blob/main/assets/lcel-flow.png?raw=1)

In [59]:
chain = retrieval | prompt | llm | output_parser

In [None]:
result = chain.invoke(
    "what architecture does the model DeepSeek released in december use?"
)
result

'The DeepSeek-V3 model, released in December 2024, is a mixture of experts model with 671 billion parameters.'