## Introduction to Creating RAGs (Retrieval-Augmented Generation) with OpenAI

## 1. - LangChain LLM Chain Tutorial

## 1.1 - Build a basic agent

In [None]:
%pip install -U langchain langchain-openai langchain-pinecone pinecone-client 

Collecting langchain
  Using cached langchain-1.2.10-py3-none-any.whl.metadata (5.7 kB)
Collecting langchain-community
  Using cached langchain_community-0.4.1-py3-none-any.whl.metadata (3.0 kB)
Collecting langchain-core<2.0.0,>=1.2.10 (from langchain)
  Using cached langchain_core-1.2.14-py3-none-any.whl.metadata (4.4 kB)
Collecting langgraph<1.1.0,>=1.0.8 (from langchain)
  Using cached langgraph-1.0.9-py3-none-any.whl.metadata (7.4 kB)
Collecting pydantic<3.0.0,>=2.7.4 (from langchain)
  Using cached pydantic-2.12.5-py3-none-any.whl.metadata (90 kB)
Collecting langchain-classic<2.0.0,>=1.0.0 (from langchain-community)
  Using cached langchain_classic-1.0.1-py3-none-any.whl.metadata (4.2 kB)
Collecting SQLAlchemy<3.0.0,>=1.4.0 (from langchain-community)
  Using cached sqlalchemy-2.0.46-cp312-cp312-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl.metadata (9.5 kB)
Collecting PyYAML<7.0.0,>=5.3.0 (from langchain-community)
  Using cached pyyaml-6.0.3-cp312-cp312-man

In [5]:
%pip install langchain langchain-text-splitters langchain-community bs4

Collecting bs4
  Using cached bs4-0.0.2-py2.py3-none-any.whl.metadata (411 bytes)
Collecting beautifulsoup4 (from bs4)
  Using cached beautifulsoup4-4.14.3-py3-none-any.whl.metadata (3.8 kB)
Collecting soupsieve>=1.6.1 (from beautifulsoup4->bs4)
  Using cached soupsieve-2.8.3-py3-none-any.whl.metadata (4.6 kB)
Using cached bs4-0.0.2-py2.py3-none-any.whl (1.2 kB)
Using cached beautifulsoup4-4.14.3-py3-none-any.whl (107 kB)
Using cached soupsieve-2.8.3-py3-none-any.whl (37 kB)
Installing collected packages: soupsieve, beautifulsoup4, bs4
Successfully installed beautifulsoup4-4.14.3 bs4-0.0.2 soupsieve-2.8.3

[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m A new release of pip is available: [0m[31;49m24.3.1[0m[39;49m -> [0m[32;49m26.0.1[0m
[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m To update, run: [0m[32;49mpip install --upgrade pip[0m
Note: you may need to restart the kernel to use updated packages.


In [6]:
%pip install -qU langchain-google-genai


[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m A new release of pip is available: [0m[31;49m24.3.1[0m[39;49m -> [0m[32;49m26.0.1[0m
[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m To update, run: [0m[32;49mpip install --upgrade pip[0m
Note: you may need to restart the kernel to use updated packages.


In [7]:
from langchain.agents import create_agent
import os
import getpass

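# Prompt for the key only if it is not already set in the environment.
# (Assigning the result of os.environ.get() directly fails when the variable is unset.)
if not os.environ.get("GOOGLE_API_KEY"):
    os.environ["GOOGLE_API_KEY"] = getpass.getpass("Enter your Google API key: ")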

After installing the packages LangChain needs to build a basic agent, we make sure the `GOOGLE_API_KEY` environment variable is set, because the Gemini model reads its API key from there.


Next, we define the tool the agent will use. In this case it is a stub that reports the weather for a given city (it always answers "sunny"). We then create the agent by selecting the model and passing it the tools and a system prompt, and finally invoke it to get a response.

In [8]:
def get_weather(city: str) -> str:
    """Get weather for a given city."""
    return f"It's always sunny in {city}!"

agent = create_agent(
    model="google_genai:gemini-2.5-flash-lite",
    tools=[get_weather],
    system_prompt="You are a helpful assistant",

)

agent.invoke(
    {"messages": [{"role": "user", "content": "what is the weather in sf"}]}
)

{'messages': [HumanMessage(content='what is the weather in sf', additional_kwargs={}, response_metadata={}, id='ebf68db7-1bf4-4aa1-8af4-a23a0311a60c'),
  AIMessage(content='', additional_kwargs={'function_call': {'name': 'get_weather', 'arguments': '{"city": "sf"}'}}, response_metadata={'finish_reason': 'STOP', 'model_name': 'gemini-2.5-flash-lite', 'safety_ratings': [], 'model_provider': 'google_genai'}, id='lc_run--019c859e-4094-73e0-ae63-80def96a9e89-0', tool_calls=[{'name': 'get_weather', 'args': {'city': 'sf'}, 'id': '8afd6c7c-c56a-40b6-b984-dd1e471d4676', 'type': 'tool_call'}], invalid_tool_calls=[], usage_metadata={'input_tokens': 51, 'output_tokens': 15, 'total_tokens': 66, 'input_token_details': {'cache_read': 0}}),
  ToolMessage(content="It's always sunny in sf!", name='get_weather', id='e5187e44-ccf5-4c45-8481-9829cf28efa5', tool_call_id='8afd6c7c-c56a-40b6-b984-dd1e471d4676'),
  AIMessage(content='', additional_kwargs={}, response_metadata={'finish_reason': 'STOP', 'model_n
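
In the trace above, the first `AIMessage` requests a `get_weather` call with the argument `{"city": "sf"}`, and a `ToolMessage` carries the result back to the model. The middle step can be pictured roughly as follows. This is a toy sketch, not how `create_agent` is implemented: `TOOLS` and `run_tool_call` are hypothetical names, and plain dicts stand in for LangChain's message objects.

```python
from typing import Callable


def get_weather(city: str) -> str:
    """Get weather for a given city."""
    return f"It's always sunny in {city}!"


# Hypothetical registry mapping tool names to Python callables.
TOOLS: dict[str, Callable[..., str]] = {"get_weather": get_weather}


def run_tool_call(tool_call: dict) -> dict:
    """Look up the requested tool, call it, and wrap the result as a dict."""
    func = TOOLS[tool_call["name"]]
    result = func(**tool_call["args"])
    return {
        "role": "tool",
        "name": tool_call["name"],
        "content": result,
        "tool_call_id": tool_call["id"],
    }


# Same shape as the tool call in the trace; the id is a placeholder.
call = {"name": "get_weather", "args": {"city": "sf"}, "id": "example-id"}
tool_message = run_tool_call(call)
print(tool_message["content"])  # It's always sunny in sf!
```

The model never runs Python itself: it only names a tool and its arguments, and the surrounding code executes the function and feeds the result back.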

## 1.2 - Build a real-world agent

In [9]:
SYSTEM_PROMPT = """You are an expert weather forecaster, who speaks in puns.

You have access to two tools:

- get_weather_for_location: use this to get the weather for a specific location
- get_user_location: use this to get the user's location

If a user asks you for the weather, make sure you know the location. 
If you can tell from the question that they mean wherever they are, use the get_user_location tool to find their location."""

The system prompt limits what the agent does so that it stays on task. In section 1.1 it was a generic "You are a helpful assistant"; here we reframe the agent as an expert weather forecaster and tell it which tools it has and when to use them.


We can use the `@dataclass` decorator, which generates a constructor (`__init__`), an equality method (`__eq__`), and a string representation (`__repr__`) from the class's annotated attributes.
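
As a small illustration of those generated methods, the sketch below uses only the standard library. The `Location` class and its fields are hypothetical and are not part of the tutorial's agent.

```python
from dataclasses import dataclass


@dataclass
class Location:
    """Hypothetical example class; the fields are chosen for illustration."""
    city: str
    country: str = "US"


a = Location(city="Miami")        # generated __init__, default applied
b = Location("Miami", "US")

print(a)        # generated __repr__: Location(city='Miami', country='US')
print(a == b)   # generated __eq__ compares field by field: True
```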

In [10]:
from dataclasses import dataclass
from langchain.tools import tool, ToolRuntime

@tool
def get_weather_for_location(city: str) -> str:
    """Get weather for a given city. """
    return f"It's always sunny in {city}!"

@dataclass
class Context:
    """Custom runtime context schema. """
    user_id: str


@tool
def get_user_location(runtime: ToolRuntime[Context]) -> str:
    """Retrieve user information based on user ID."""
    user_id = runtime.context.user_id
    return "Florida" if user_id == "1" else "SF"
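
The lookup that `get_user_location` performs on the runtime context can be tried without an agent. The sketch below repeats the same logic as a plain function and passes it a stand-in object. It does not use LangChain's `ToolRuntime`: `location_for` is a hypothetical name, and `types.SimpleNamespace` only mimics the `runtime.context.user_id` attribute path.

```python
from dataclasses import dataclass
from types import SimpleNamespace


@dataclass
class Context:
    """Custom runtime context schema."""
    user_id: str


def location_for(runtime) -> str:
    """Same lookup as get_user_location, without the @tool decorator."""
    user_id = runtime.context.user_id
    return "Florida" if user_id == "1" else "SF"


# Stand-in objects exposing only the .context attribute the function reads.
runtime_user_1 = SimpleNamespace(context=Context(user_id="1"))
runtime_user_2 = SimpleNamespace(context=Context(user_id="2"))

print(location_for(runtime_user_1))  # Florida
print(location_for(runtime_user_2))  # SF
```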

Next, we initialize the chat model explicitly, setting its temperature, timeout, and maximum number of tokens.

In [11]:
from langchain.chat_models import init_chat_model

model = init_chat_model(
    "google_genai:gemini-2.5-flash-lite",
    temperature=0.5,
    timeout=10,
    max_tokens=10000
)

We define a class that describes the structure of the response the agent returns.

In [12]:
@dataclass 
class ResponseFormat:
    """Response schema for the agent."""
    punny_response: str
    weather_conditions: str | None = None

We can use the `InMemorySaver` checkpointer to give the agent memory across calls that share the same thread ID.

In [13]:
from langgraph.checkpoint.memory import InMemorySaver

checkpointer = InMemorySaver()
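
The idea of memory keyed by a thread ID can be sketched with a plain dictionary. The class below is a toy stand-in written for illustration only. `ToyThreadMemory` and its methods are hypothetical and do not reflect how `InMemorySaver` is implemented or what it stores.

```python
class ToyThreadMemory:
    """Toy stand-in: keeps a separate message history per thread ID."""

    def __init__(self) -> None:
        self._threads: dict[str, list[dict]] = {}

    def append(self, thread_id: str, message: dict) -> None:
        """Add a message to the history of the given thread."""
        self._threads.setdefault(thread_id, []).append(message)

    def history(self, thread_id: str) -> list[dict]:
        """Return a copy of the thread's history (empty if unknown)."""
        return list(self._threads.get(thread_id, []))


memory = ToyThreadMemory()
memory.append("1", {"role": "user", "content": "what is the weather outside?"})
memory.append("1", {"role": "user", "content": "thank you"})

print(len(memory.history("1")))  # 2
print(memory.history("2"))       # []
```

Because the history lives only in the Python process, it is lost when the kernel restarts. The same holds for any checkpointer that stores its state in memory.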

Finally, we create the agent with the model, tools, context schema, response format, and checkpointer, invoke it twice on the same thread, and review the structured responses it returns.

In [14]:
from langchain.agents.structured_output import ToolStrategy 

agent = create_agent(
    model=model,
    system_prompt=SYSTEM_PROMPT,
    tools=[get_weather_for_location, get_user_location],
    context_schema=Context,
    response_format=ToolStrategy(ResponseFormat),
    checkpointer=checkpointer
)

config = {"configurable": {"thread_id": "1"}}

response = agent.invoke(
    {"messages": [{"role": "user", "content": "what is the weather outside?"}]},
    config=config,
    context=Context(user_id="1")
)

print(response['structured_response'])

response = agent.invoke(
    {"messages": [{"role": "user", "content": "thank you"}]},
    config=config,
    context=Context(user_id="1")
)

print(response['structured_response'])

ResponseFormat(punny_response='It is always sunny in Florida! I-4 corridor. What a "weather" you are in!', weather_conditions='Sunny')
ResponseFormat(punny_response="You're welcome! If you ever need more weather updates, don't hesitate to ask. I'm always here to shed some light on the situation!", weather_conditions='Sunny')
