# LangChain Chatbot – Minimal RAG Example

This notebook builds a small chatbot that combines:

1. **RAG workflow** – vector store + retriever + LLM chain.  
2. **Complementary tools** – webpage reader & pandas‑DataFrame agent.  
3. **One additional *custom* tool** (not from LangChain) – a simple currency‑converter.  
4. **Memory** – `ConversationBufferMemory`.  
5. **Pre‑built conversation demo** – shown at the end of the notebook.




## 0  Installation & imports

In [2]:
!pip -q install langchain langchain-openai langchain-community chromadb tiktoken duckduckgo-search pandas tabulate python-dotenv requests --upgrade




In [4]:
# Packages are installed by the cell above.
import os, json, getpass
import pandas as pd
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from langchain_community.document_loaders import TextLoader, WebBaseLoader
from langchain_community.vectorstores import Chroma
from langchain.memory import ConversationBufferMemory
from langchain.tools import BaseTool, Tool
from langchain.agents import initialize_agent, AgentType
from langchain.chains import RetrievalQA
print('✅ Packages imported')


✅ Packages imported


## 1  API key
Set your `OPENAI_API_KEY` – **never commit keys to git**.

In [5]:
if 'OPENAI_API_KEY' not in os.environ:
    os.environ['OPENAI_API_KEY'] = getpass.getpass('Do not enter anything')
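
A minimal sketch of two small helpers that fit the "never commit keys" advice: one fails early with a clear message when a variable is missing, the other masks a secret before it is printed or logged. Both `require_env` and `mask` are hypothetical names introduced here, not part of LangChain or the OpenAI client.

```python
import os

def require_env(name: str) -> str:
    """Return the variable's value, or fail early with a readable error."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"{name} is not set")
    return value

def mask(secret: str, visible: int = 4) -> str:
    """Hide all but the last few characters so logs never show the full key."""
    if len(secret) <= visible:
        return "*" * len(secret)
    return "*" * (len(secret) - visible) + secret[-visible:]

# Made-up value, only to show the masking.
print(mask("sk-abcdef1234"))
```

Printing the masked form is usually enough to confirm which key is loaded without exposing it in the notebook output.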


## 2  Create a tiny knowledge base for *OurCo* (fictional company)

In [7]:
docs = [
    ('mission.txt', 'OurCo is a fictional company that designs eco-friendly smart sneakers. '
                    'Founded in 2021, HQ in Porto Alegre, Brazil. Mission: merge technology '
                    'and sustainability for athletes of all levels.'),
    ('products.txt', 'Flagship products: EcoRun One, SolarStride and BioFoam sandals. '
                     'All shoes use 80% recycled materials and solar-charged smart insoles.'),
    ('faq.txt', 'Q: Are the shoes washable? A: Yes, cold cycle & air dry.\n'
                'Q: Battery life? A: 12 hours active usage, wireless charging pad included.')
]

import os
os.makedirs('ourco_docs', exist_ok=True)

for filename, text in docs:
    with open(f'ourco_docs/{filename}', 'w', encoding='utf-8') as f:  # 👈 encoding set explicitly
        f.write(text)


## 3  Embed & store documents

In [11]:
!pip install -q sentence-transformers




In [12]:
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import Chroma
from langchain_community.document_loaders import DirectoryLoader, TextLoader

# 1) Load documents (same code)
loader = DirectoryLoader(
    'ourco_docs',
    glob='**/*.txt',
    loader_cls=TextLoader,
    loader_kwargs={'encoding': 'utf-8'}  # match the encoding used when the files were written
)
documents = loader.load()

# 2) Local embeddings (lightweight, free model)
embeddings = HuggingFaceEmbeddings(model_name="all-MiniLM-L6-v2")

# 3) Vector store
vectordb = Chroma.from_documents(
    documents,
    embeddings,
    collection_name="ourco_knowledge"
)
retriever = vectordb.as_retriever(search_kwargs={"k": 3})

print("Stored", len(documents), "docs.")




Stored 3 docs.
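
To make the retrieval step concrete, here is a toy sketch of what a retriever does: embed the query, score it against every stored document by cosine similarity, and return the top `k`. It assumes a bag-of-words "embedding" and shortened copies of the OurCo texts so that it runs with the standard library only; it is not how Chroma or the MiniLM model work internally.

```python
import math
import re
from collections import Counter

# Shortened versions of the three OurCo documents (for illustration only).
corpus = {
    "mission.txt": "OurCo designs eco-friendly smart sneakers. Mission: merge technology and sustainability.",
    "products.txt": "Flagship products: EcoRun One, SolarStride and BioFoam sandals.",
    "faq.txt": "Are the shoes washable? Yes, cold cycle. Battery life? 12 hours active usage.",
}

def embed(text: str) -> Counter:
    """Toy 'embedding': lowercase word counts (NOT a neural embedding)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query: str, k: int = 3) -> list[str]:
    """Return the names of the k documents most similar to the query."""
    q = embed(query)
    ranked = sorted(corpus, key=lambda name: cosine(q, embed(corpus[name])), reverse=True)
    return ranked[:k]

print(retrieve("battery life", k=1))  # ['faq.txt']
```

A real embedding model also matches paraphrases that share no words with the document, which this word-count version cannot do.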


## 4  RAG QA chain

In [13]:
llm = ChatOpenAI(temperature=0)
rag_chain = RetrievalQA.from_chain_type(
    llm,
    chain_type='stuff',
    retriever=retriever,
    verbose=False
)
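
The `'stuff'` chain type places all retrieved passages into a single prompt. A minimal sketch of that idea follows; `build_stuff_prompt` and its wording are hypothetical and do not reproduce LangChain's actual prompt template.

```python
def build_stuff_prompt(question: str, passages: list[str]) -> str:
    """Concatenate every retrieved passage into one prompt (the 'stuff' idea)."""
    context = "\n\n".join(passages)
    return (
        "Use the following context to answer the question. "
        "If the answer is not in the context, say you don't know.\n\n"
        f"{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )

prompt = build_stuff_prompt(
    "Are the shoes washable?",
    [
        "Q: Are the shoes washable? A: Yes, cold cycle & air dry.",
        "All shoes use 80% recycled materials.",
    ],
)
print(prompt)
```

This approach suits a knowledge base as small as this one; with many or long passages the combined prompt can exceed the model's context window.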




## 5  Tools
### 5.1  Website reader (built‑in)

In [15]:
from langchain_community.document_loaders import WebBaseLoader

def read_url(url: str) -> str:
    """Download a webpage and return a short summary."""
    docs = WebBaseLoader(url).load()
    content = docs[0].page_content[:4000]          # limit to 4,000 characters
    prompt = (
        "Summarize the following web page in a concise paragraph:\n\n"
        f"{content}"
    )
    return llm.invoke(prompt).content

website_tool = Tool(
    name="WebReader",
    func=read_url,
    description="Summarizes a public webpage. Input must be a valid URL."
)
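
The tool description asks for a valid URL, but an agent may still pass free text. A small sketch of a pre-check using only the standard library; `is_http_url` is a hypothetical helper, not part of LangChain.

```python
from urllib.parse import urlparse

def is_http_url(text: str) -> bool:
    """Return True if text looks like an absolute http(s) URL."""
    parsed = urlparse(text.strip())
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)

print(is_http_url("https://example.com/page"))  # True
print(is_http_url("example.com"))               # False
```

One option is to call this at the top of `read_url` and return a short error string when it fails, so the agent can retry with a corrected input.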


### 5.2  Custom currency converter (non‑LangChain)

In [16]:
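import functools
import requests

@functools.lru_cache()
def fx_rate(pair: str) -> float:
    """Fetch the rate for a 'BASE/QUOTE' pair; cached per pair for the session."""
    base, quote = pair.upper().split('/')
    resp = requests.get(
        'https://api.exchangerate.host/latest',
        params={'base': base, 'symbols': quote},
        timeout=10,
    )
    resp.raise_for_status()
    return resp.json()['rates'][quote]

def convert_currency(query: str) -> str:
    # Expected form: '100 USD to BRL'. Drop the standalone word 'TO' instead of
    # replacing the substring, which would corrupt codes such as 'TOP'.
    parts = [p for p in query.upper().split() if p != 'TO']
    amount, base, quote = float(parts[0]), parts[1], parts[2]
    rate = fx_rate(f'{base}/{quote}')
    return f'{amount} {base} = {amount*rate:.2f} {quote} (rate {rate:.3f})'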

currency_tool = Tool(
    name='CurrencyConverter',
    func=convert_currency,
    description='Convert currency, input like "100 USD to EUR".'
)
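
The parsing and formatting logic can be exercised without a network call by passing the rate lookup in as a parameter. The sketch below assumes a stricter regex-based parser; `parse_conversion`, `convert_with_rate`, and the rate of 5.0 are all made up for illustration.

```python
import re

# Matches inputs such as "100 USD to BRL" or "12.5 eur to usd".
QUERY_RE = re.compile(
    r"^\s*(\d+(?:\.\d+)?)\s+([A-Za-z]{3})\s+to\s+([A-Za-z]{3})\s*$",
    re.IGNORECASE,
)

def parse_conversion(query: str) -> tuple[float, str, str]:
    """Split a query into (amount, base, quote) or raise ValueError."""
    match = QUERY_RE.match(query)
    if match is None:
        raise ValueError(f"Expected something like '100 USD to BRL', got {query!r}")
    amount, base, quote = match.groups()
    return float(amount), base.upper(), quote.upper()

def convert_with_rate(query: str, rate_lookup) -> str:
    """rate_lookup is any callable taking 'BASE/QUOTE' and returning a float."""
    amount, base, quote = parse_conversion(query)
    rate = rate_lookup(f"{base}/{quote}")
    return f"{amount} {base} = {amount * rate:.2f} {quote} (rate {rate:.3f})"

# Made-up rate, only to exercise the formatting offline.
print(convert_with_rate("100 USD to BRL", lambda pair: 5.0))
# 100.0 USD = 500.00 BRL (rate 5.000)
```

Raising `ValueError` on malformed input gives the agent a readable message instead of an `IndexError` from list indexing.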


### 5.3  DataFrame QA tool

In [18]:
# ─── Sales DataFrame ────────────────────────────────────────────────
sales = pd.DataFrame({
    'model': ['EcoRun One', 'SolarStride', 'BioFoam'],
    'q1':    [1200, 950, 760],
    'q2':    [1350, 990, 810]
})

# ─── Query function ─────────────────────────────────────────────────
def df_query(q: str) -> str:
    """Answer questions about the quarterly sales DataFrame."""
    table = sales.to_markdown()  # requires the `tabulate` package
    prompt = (
        f"Use the following sales table to answer the question.\n\n"
        f"{table}\n\n"
        f"Question: {q}\n"
        f"Answer:"
    )
    return llm.invoke(prompt).content

df_tool = Tool(
    name="SalesData",
    func=df_query,
    description='Ask about quarterly shoe sales, e.g. "total q1 pairs"'
)
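
Because this tool asks the LLM to read numbers off a rendered table, it helps to have reference figures to compare its answers against. A minimal sketch, assuming the same sales figures mirrored as plain dicts so it runs without third-party packages; `quarter_total` and `units_sold` are hypothetical helpers.

```python
# Same figures as the `sales` DataFrame, copied into plain dicts (an assumption
# made so the sketch has no dependencies).
sales_rows = [
    {"model": "EcoRun One", "q1": 1200, "q2": 1350},
    {"model": "SolarStride", "q1": 950, "q2": 990},
    {"model": "BioFoam", "q1": 760, "q2": 810},
]

def quarter_total(quarter: str) -> int:
    """Sum one quarter's column across all models."""
    return sum(row[quarter] for row in sales_rows)

def units_sold(model: str, quarter: str) -> int:
    """Look up one model's figure for one quarter (case-insensitive)."""
    for row in sales_rows:
        if row["model"].lower() == model.lower():
            return row[quarter]
    raise KeyError(model)

print(quarter_total("q1"))             # 2910
print(units_sold("EcoRun One", "q2"))  # 1350
```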


## 6  Assemble conversational agent

In [19]:
memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True)
tools = [website_tool, currency_tool, df_tool,
         Tool(name='RAG', func=lambda q: rag_chain.invoke({'query': q})['result'],
              description='Use to answer any question about OurCo company, products, or FAQs.')]
# The conversational agent type reads `chat_history`; the zero-shot ReAct prompt
# has no slot for it, so the memory would otherwise go unused.
agent = initialize_agent(
    tools, llm, agent=AgentType.CHAT_CONVERSATIONAL_REACT_DESCRIPTION,
    memory=memory, verbose=False
)
print('🤖 Agent ready.')


🤖 Agent ready.
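
Conceptually, a buffer memory stores every turn verbatim and replays the full history to the model on the next call. The toy class below illustrates that idea only; `ToyBufferMemory` is hypothetical and is not LangChain's `ConversationBufferMemory` implementation.

```python
class ToyBufferMemory:
    """Keeps every turn verbatim, loosely mimicking a buffer-style memory."""

    def __init__(self) -> None:
        self.messages: list[tuple[str, str]] = []

    def save_turn(self, user_text: str, ai_text: str) -> None:
        """Append one human/AI exchange to the history."""
        self.messages.append(("human", user_text))
        self.messages.append(("ai", ai_text))

    def as_prompt_prefix(self) -> str:
        """Render the whole history as text to prepend to the next prompt."""
        return "\n".join(f"{role}: {text}" for role, text in self.messages)

memory_demo = ToyBufferMemory()
memory_demo.save_turn("What is OurCo's mission?", "To merge technology and sustainability.")
memory_demo.save_turn("Where is it based?", "Porto Alegre, Brazil.")
print(memory_demo.as_prompt_prefix())
```

Since nothing is ever dropped, the history grows with each turn, which is acceptable for a short demo but may need trimming or summarizing in longer conversations.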




## 7  Demo conversation

In [20]:
questions = [
    'What is OurCo’s mission?',
    'How many EcoRun One shoes sold in Q2?',
    'Convert 199 USD to BRL',
]
for q in questions:
    print('\nUser:', q)
    print('Assistant:', agent.invoke({'input': q})['output'])



User: What is OurCo’s mission?




RateLimitError: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}
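
The run above stopped at the first question because the exception escaped the loop. A minimal sketch of a loop that records the error and moves on to the next question; `ask_all` and `fake_ask` are hypothetical, with `fake_ask` standing in for the agent call so the example runs offline.

```python
from typing import Callable

def ask_all(questions: list[str], ask: Callable[[str], str]) -> list[tuple[str, str]]:
    """Call ask() for each question; record an error string instead of aborting."""
    transcript = []
    for q in questions:
        try:
            answer = ask(q)
        except Exception as exc:  # broad on purpose: this is a demo loop
            answer = f"[error: {type(exc).__name__}]"
        transcript.append((q, answer))
    return transcript

# Stand-in for the agent call, used only so the sketch runs without an API key.
def fake_ask(q: str) -> str:
    if "Convert" in q:
        raise RuntimeError("simulated failure")
    return f"(answer to: {q})"

for q, a in ask_all(["What is OurCo's mission?", "Convert 199 USD to BRL"], fake_ask):
    print("User:", q)
    print("Assistant:", a)
```

This does not resolve an `insufficient_quota` error, since every call would still fail until billing is fixed, but it keeps one failing question from hiding the results of the others.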