### Installation

In [3]:
%pip install -qU langchain langchain-google-genai langchain-pinecone pinecone python-dotenv langchain-community bs4

Note: you may need to restart the kernel to use updated packages.





In [4]:
import os
from dotenv import load_dotenv

load_dotenv()

print("GOOGLE_API_KEY loaded:", "GOOGLE_API_KEY" in os.environ)
print("PINECONE_API_KEY loaded:", "PINECONE_API_KEY" in os.environ)

GOOGLE_API_KEY loaded: True
PINECONE_API_KEY loaded: True
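
The check above only reports whether each key is present. A minimal sketch of a fail-fast variant follows; the helper name `missing_keys` and the `REQUIRED_KEYS` tuple are hypothetical and not part of the original notebook.

```python
import os

# Assumption: these are the only two variables the notebook needs.
REQUIRED_KEYS = ("GOOGLE_API_KEY", "PINECONE_API_KEY")

def missing_keys(required=REQUIRED_KEYS, env=None):
    """Return the required variable names that are unset or empty."""
    env = os.environ if env is None else env
    return [key for key in required if not env.get(key)]

# In the notebook, one might stop early instead of failing later at an API call:
# if missing_keys():
#     raise RuntimeError(f"Missing environment variables: {missing_keys()}")
```

Unlike the `in os.environ` test, this version also treats an empty value as missing.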


### Load a document from the web

In [5]:
import bs4
from langchain_community.document_loaders import WebBaseLoader

loader = WebBaseLoader(
    web_paths=("https://lilianweng.github.io/posts/2023-06-23-agent/",),
    bs_kwargs=dict(
        parse_only=bs4.SoupStrainer(
            class_=("post-content", "post-title", "post-header")
        )
    ),
)

docs = loader.load()
print(f"Document loaded: {len(docs)} page(s)")
print(f"Total characters: {len(docs[0].page_content)}")

USER_AGENT environment variable not set, consider setting it to identify your requests.


Document loaded: 1 page(s)
Total characters: 43047
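
The `USER_AGENT` warning in the output is harmless, but it can usually be avoided by setting the variable before `langchain_community` is imported. A sketch, assuming the loader reads `USER_AGENT` from the environment at import time; the value below is a hypothetical placeholder.

```python
import os

# Hypothetical identifier; replace it with something that describes your project.
# setdefault keeps any value already provided by the shell or a .env file.
os.environ.setdefault("USER_AGENT", "rag-notebook/0.1")

# Run this before:
# from langchain_community.document_loaders import WebBaseLoader
print("USER_AGENT:", os.environ["USER_AGENT"])
```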


### Split into chunks

In [7]:
from langchain_text_splitters import RecursiveCharacterTextSplitter

text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000,
    chunk_overlap=200
)

splits = text_splitter.split_documents(docs)
print(f"Total chunks created: {len(splits)}")

Total chunks created: 63
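
To make the role of `chunk_size` and `chunk_overlap` concrete, here is a simplified fixed-window splitter. It is only an illustration: the helper `sliding_chunks` is hypothetical, and it is not the algorithm used by `RecursiveCharacterTextSplitter`, which prefers to break on separators such as paragraph and line boundaries and therefore tends to produce shorter, more numerous chunks.

```python
def sliding_chunks(text, chunk_size=1000, chunk_overlap=200):
    """Split text into fixed-size windows that share chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
        start += step
    return chunks

sample = "".join(str(i % 10) for i in range(2500))
pieces = sliding_chunks(sample)
# The last 200 characters of one chunk are the first 200 of the next.
print(len(pieces), pieces[0][-200:] == pieces[1][:200])
```

The overlap means a sentence that straddles a boundary still appears whole in at least one chunk, at the cost of storing some text twice.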


### Create Embeddings with Gemini

In [8]:
from google import genai

client = genai.Client(api_key=os.getenv("GOOGLE_API_KEY"))

for model in client.models.list():
    if "embedContent" in (model.supported_actions or []):
        print(model.name, model.supported_actions)

models/gemini-embedding-001 ['embedContent', 'countTextTokens', 'countTokens', 'asyncBatchEmbedContent']


In [9]:
from langchain_google_genai import GoogleGenerativeAIEmbeddings

embeddings = GoogleGenerativeAIEmbeddings(model="models/gemini-embedding-001")

# Test embedding
test = embeddings.embed_query("embedding test")
print(f"Embedding dimension: {len(test)}")

Embedding dimension: 3072


### Store in Pinecone

In [10]:
%pip install pyreadline3

Note: you may need to restart the kernel to use updated packages.





In [14]:
import time
from langchain_pinecone import PineconeVectorStore
from pinecone import Pinecone, ServerlessSpec

pc = Pinecone(api_key=os.environ["PINECONE_API_KEY"])

INDEX_NAME = "rag-index"
INDEX_DIMENSION = 3072  # must match the embedding length printed above

def _index_names(index_list):
    if hasattr(index_list, "names"):
        return list(index_list.names())
    names = []
    for item in index_list:
        if isinstance(item, dict) and "name" in item:
            names.append(item["name"])
        else:
            names.append(item)
    return names

# Reuse an existing index only if its dimension matches the embeddings.
# Warning: a mismatched index is deleted, along with everything stored in it.
existing_indexes = _index_names(pc.list_indexes())
if INDEX_NAME in existing_indexes:
    info = pc.describe_index(INDEX_NAME)
    existing_dim = getattr(info, "dimension", None)
    if existing_dim is None and isinstance(info, dict):
        existing_dim = info.get("dimension")
    if existing_dim != INDEX_DIMENSION:
        pc.delete_index(INDEX_NAME)
        # Wait until the deletion is visible before recreating the index.
        while INDEX_NAME in _index_names(pc.list_indexes()):
            time.sleep(1)

# Create the index if it is missing, then poll until it reports ready.
if INDEX_NAME not in _index_names(pc.list_indexes()):
    pc.create_index(
        name=INDEX_NAME,
        dimension=INDEX_DIMENSION,
        metric="cosine",
        spec=ServerlessSpec(cloud="aws", region="us-east-1"),
    )
    while not pc.describe_index(INDEX_NAME).status["ready"]:
        time.sleep(1)

index = pc.Index(INDEX_NAME)

vector_store = PineconeVectorStore(embedding=embeddings, index=index)
vector_store.add_documents(splits)

print("Documents stored in Pinecone successfully")

Documents stored in Pinecone successfully
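
Running the cell above more than once may store the same chunks again, since no IDs are passed to `add_documents`. One possible mitigation is to derive deterministic IDs from each chunk so that a re-run overwrites instead of duplicating. The sketch below is an assumption-laden illustration: `stable_ids` and the `agent-post` prefix are hypothetical, and the commented call assumes `add_documents` accepts an `ids` argument in the installed version.

```python
import hashlib

def stable_ids(texts, prefix="agent-post"):
    """Derive one deterministic ID per chunk from its position and content."""
    ids = []
    for position, text in enumerate(texts):
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()[:16]
        ids.append(f"{prefix}-{position}-{digest}")
    return ids

# Possible use in the notebook (assumes an `ids` keyword is supported):
# ids = stable_ids([doc.page_content for doc in splits])
# vector_store.add_documents(splits, ids=ids)
```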


### Create the retriever

In [15]:
retriever = vector_store.as_retriever(
    search_type="similarity",
    search_kwargs={"k": 3}
)

# Test
result = retriever.invoke("What is an AI agent?")
print(f"Chunks retrieved: {len(result)}")
print("\nFirst chunk:")
print(result[0].page_content[:300])

Chunks retrieved: 3

First chunk:
LLM Powered Autonomous Agents
    
Date: June 23, 2023  |  Estimated Reading Time: 31 min  |  Author: Lilian Weng


Building agents with LLM (large language model) as its core controller is a cool concept. Several proof-of-concepts demos, such as AutoGPT, GPT-Engineer and BabyAGI, serve as inspiring
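
Conceptually, `search_type="similarity"` with `k=3` on an index created with `metric="cosine"` returns the three stored vectors that score highest against the query embedding. The brute-force sketch below illustrates that idea on toy 2-dimensional vectors; the helper names are hypothetical, and a real vector index does not necessarily scan every vector this way.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_a * norm_b)

def top_k(query_vec, doc_vecs, k=3):
    """Return the indices of the k vectors most similar to the query."""
    scored = [(cosine_similarity(query_vec, vec), idx)
              for idx, vec in enumerate(doc_vecs)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [idx for _, idx in scored[:k]]

toy_docs = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7], [-1.0, 0.0]]
print(top_k([1.0, 0.1], toy_docs, k=3))  # prints [0, 2, 1]
```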


### Create the LLM

In [16]:
from langchain_google_genai import ChatGoogleGenerativeAI

llm = ChatGoogleGenerativeAI(model="gemini-2.5-flash")

### Build the RAG Chain

In [17]:
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough

prompt = ChatPromptTemplate.from_messages([
    ("system", """You are an assistant for question-answering tasks. 
Use the following retrieved context to answer the question. 
If you don't know the answer, say that you don't know.
Keep the answer concise and clear.

Context: {context}"""),
    ("human", "{question}")
])

def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)

rag_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)

print("RAG chain created successfully")

RAG chain created successfully
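
The first step of the chain sends the same question down two branches: one retrieves and formats context, the other passes the question through unchanged. The plain-Python sketch below mimics that data flow without LangChain; `Doc`, `fake_retrieve`, `build_messages`, and the shortened `SYSTEM_TEMPLATE` are hypothetical stand-ins, not the library's API.

```python
from dataclasses import dataclass

@dataclass
class Doc:
    """Hypothetical stand-in for a retrieved document."""
    page_content: str

# Shortened version of the system prompt used in the notebook.
SYSTEM_TEMPLATE = (
    "Use the following retrieved context to answer the question.\n\n"
    "Context: {context}"
)

def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)

def build_messages(question, retrieve):
    # The same question feeds both branches: retrieval and passthrough.
    inputs = {"context": format_docs(retrieve(question)), "question": question}
    return [
        ("system", SYSTEM_TEMPLATE.format(context=inputs["context"])),
        ("human", inputs["question"]),
    ]

def fake_retrieve(question):
    # Placeholder retriever that ignores the question.
    return [Doc("Agents plan."), Doc("Agents use tools.")]

messages = build_messages("What is an AI agent?", fake_retrieve)
for role, text in messages:
    print(f"[{role}] {text}")
```

In the real chain, the resulting messages go to the model and `StrOutputParser` extracts the answer text.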


### Ask the RAG chain questions

In [18]:
question1 = "What is an AI agent?"
answer1 = rag_chain.invoke(question1)
print(f"Q: {question1}")
print(f"A: {answer1}")
print()

question2 = "What are the main components of an AI agent?"
answer2 = rag_chain.invoke(question2)
print(f"Q: {question2}")
print(f"A: {answer2}")
print()

question3 = "What is chain of thought prompting?"
answer3 = rag_chain.invoke(question3)
print(f"Q: {question3}")
print(f"A: {answer3}")

Q: What is an AI agent?
A: An AI agent, as described in the context, is a system that uses a Large Language Model (LLM) as its core controller or "brain." This LLM is complemented by several key components:

*   **Planning:** To break down large tasks into subgoals and refine actions through self-criticism.
*   **Memory:** Including short-term in-context learning and long-term retention via external stores.
*   **Tool use:** To call external APIs for additional information, code execution, or access to proprietary data.

Q: What are the main components of an AI agent?
A: The main components of an AI agent are:

*   **Planning**: Decomposing complex tasks into smaller steps (e.g., Chain of Thought, Tree of Thoughts).
*   **Memory**: Includes short-term (in-context learning) and long-term memory (external vector store, memory stream, retrieval model).
*   **Tool use**: Calling external APIs for additional information or capabilities.
*   **Reflection mechanism**: Synthesizing memories in