# Stage 2 Usage Examples (LangChain Reasoning & Router)

This notebook demonstrates how to use the Stage 2 chains to query the synthetic payments dataset and obtain concise summaries.

- Routed Q&A via `router_chain.ask()`
- Direct code generation and execution via `query_chain`
- Summarization via `summary_chain`
- Optional: enabling an LLM through an API key such as `OPENAI_API_KEY` (the cells below use Groq via `LLM_PROVIDER` and `LLM_MODEL`); otherwise a heuristic fallback is used


In [1]:
# optional: load .env for local development
#%pip install -q python-dotenv
from dotenv import load_dotenv
load_dotenv()  # loads variables from .env into the process env


True

In [2]:
import logging
logging.basicConfig(level=logging.INFO, format='[%(name)s] %(levelname)s: %(message)s')

In [3]:
# Setup imports and paths
import os, sys
from pathlib import Path
import pandas as pd

# Ensure project root on path
PROJECT_ROOT = str(Path.cwd().parent)
if PROJECT_ROOT not in sys.path:
    sys.path.append(PROJECT_ROOT)

print('Project root:', PROJECT_ROOT)
print('OPENAI_API_KEY set:', bool(os.getenv('OPENAI_API_KEY')))

DATA_PATH = str(Path(PROJECT_ROOT) / 'data' / 'payments.csv')
print('Data path:', DATA_PATH)


Project root: /home/andres/Documents/chat-payments
OPENAI_API_KEY set: True
Data path: /home/andres/Documents/chat-payments/data/payments.csv


In [4]:
# print("LLM_PROVIDER =", os.getenv("LLM_PROVIDER"))
# print("LLM_MODEL    =", os.getenv("LLM_MODEL"))
# print("GROQ_API_KEY set?", bool(os.getenv("GROQ_API_KEY")))

In [5]:
# provider = "groq"
# model = "llama-3.3-70b-versatile"

os.environ["LLM_PROVIDER"] = "groq"
os.environ["LLM_MODEL"] = "llama-3.3-70b-versatile"


In [6]:
from src.chains.router_chain import ask
# resp = ask("Quick test?", provider=provider, model=model)
resp = ask("Quick test?")
print(resp["metrics"])

[src.chains.query_chain] INFO: QueryChain initialized with LLM provider=groq model=llama-3.3-70b-versatile
[src.chains.query_chain] INFO: QueryChain using LLM for question: Quick test?
[httpx] INFO: HTTP Request: POST https://api.groq.com/openai/v1/chat/completions "HTTP/1.1 200 OK"
[src.chains.summary_chain] INFO: SummaryChain initialized with LLM provider=groq model=llama-3.3-70b-versatile
[src.chains.summary_chain] INFO: SummaryChain using LLM for question: Quick test?
[httpx] INFO: HTTP Request: POST https://api.groq.com/openai/v1/chat/completions "HTTP/1.1 200 OK"


[router] llm_used=True provider=groq model=llama-3.3-70b-versatile
{'latency_ms': 2077, 'llm_used': True, 'llm_provider': 'groq', 'llm_model': 'llama-3.3-70b-versatile'}


In [6]:
# Load dataset preview
df = pd.read_csv(DATA_PATH, parse_dates=['timestamp'])
print(f"Rows: {len(df):,} | Columns: {len(df.columns)}")
df.head(3)


Rows: 10,000 | Columns: 11


|   | transaction_id | user_id | segment | country | merchant | category | amount | timestamp | device_type | is_refunded | is_fraudulent |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 879463009885 | 208 | SMB | AR | Microsoft | electronics | 126.02 | 2023-01-01 01:04:39+00:00 | tablet | False | False |
| 1 | 404737561214 | 183 | consumer | US | Telcel | telco | 48.31 | 2023-01-01 03:08:19+00:00 | tablet | False | False |
| 2 | 12503321040 | 155 | consumer | BR | Oxxo | retail | 10.56 | 2023-01-01 08:52:04+00:00 | mobile | False | False |


In [7]:
# Example 1: Routed data Q&A

q1 = "Which merchants had the highest total revenue last month?"
# resp1 = ask(q1, provider=provider, model=model)
resp1 = ask(q1)
print('Route:', resp1['route'])
print('Answer:', resp1['answer'])

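# Convert table (list of dicts) back into a DataFrame for display.
# A bare expression inside an `if` block is not auto-rendered, so call display().
from IPython.display import display
if resp1.get('table'):
    display(pd.DataFrame(resp1['table']).head(10))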


[src.chains.query_chain] INFO: QueryChain initialized with LLM provider=groq model=llama-3.3-70b-versatile
[src.chains.query_chain] INFO: QueryChain using LLM for question: Which merchants had the highest total revenue last month?
[httpx] INFO: HTTP Request: POST https://api.groq.com/openai/v1/chat/completions "HTTP/1.1 200 OK"
[src.chains.summary_chain] INFO: SummaryChain initialized with LLM provider=groq model=llama-3.3-70b-versatile
[src.chains.summary_chain] INFO: SummaryChain using LLM for question: Which merchants had the highest total revenue last month?
[httpx] INFO: HTTP Request: POST https://api.groq.com/openai/v1/chat/completions "HTTP/1.1 200 OK"


[router] llm_used=True provider=groq model=llama-3.3-70b-versatile
Route: data
Answer: BestBuy led last month with $1,719.16 in revenue, followed by Apple at $1,600.81, and Delta at $1,307.85.


In [8]:
# Example 2: Another routed question
q2 = "Which country has the highest average transaction amount?"
# resp2 = ask(q2, provider=provider, model=model)
resp2 = ask(q2)
print('Route:', resp2['route'])
print('Answer:', resp2['answer'])

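from IPython.display import display
if resp2.get('table'):
    # A bare expression inside an `if` block is not auto-rendered, so call display()
    display(pd.DataFrame(resp2['table']).head(10))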


[src.chains.query_chain] INFO: QueryChain initialized with LLM provider=groq model=llama-3.3-70b-versatile
[src.chains.query_chain] INFO: QueryChain using LLM for question: Which country has the highest average transaction amount?
[httpx] INFO: HTTP Request: POST https://api.groq.com/openai/v1/chat/completions "HTTP/1.1 200 OK"
[src.chains.summary_chain] INFO: SummaryChain initialized with LLM provider=groq model=llama-3.3-70b-versatile
[src.chains.summary_chain] INFO: SummaryChain using LLM for question: Which country has the highest average transaction amount?
[httpx] INFO: HTTP Request: POST https://api.groq.com/openai/v1/chat/completions "HTTP/1.1 200 OK"


[router] llm_used=True provider=groq model=llama-3.3-70b-versatile
Route: data
Answer: The US has the highest average transaction amount at $49.84.
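
For reference, the kind of aggregation behind a question like this can be sketched in plain pandas. This is only an illustration on the three preview rows shown earlier, not the code the query chain actually generates (which is not shown in this notebook); the variable names are made up for the example. On this tiny sample the top country is AR, which differs from the full-dataset answer above.

```python
import pandas as pd

# Three rows copied from the dataset preview; the real file has 10,000 rows.
sample = pd.DataFrame(
    {
        "country": ["AR", "US", "BR"],
        "amount": [126.02, 48.31, 10.56],
    }
)

# Mean transaction amount per country, largest first.
avg_by_country = (
    sample.groupby("country")["amount"]
    .mean()
    .sort_values(ascending=False)
    .reset_index()
)

# A list of dicts, the same shape the routed examples convert back with pd.DataFrame(...).
records = avg_by_country.to_dict(orient="records")
top_country = records[0]["country"]
print(avg_by_country)
```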


#### Should I use QueryChain instead of this?

In [9]:
# Example 3: Direct query chain usage
from src.chains.query_chain import run as run_query

q3 = "Top 10 merchants by total revenue"
res3 = run_query(q3)

print('Keys:', list(res3.keys()))

# Reconstruct DataFrame from 'split' orient
split = res3['table']
df3 = pd.DataFrame(split['data'], columns=split['columns'])
df3.head(10)


[src.chains.query_chain] INFO: QueryChain initialized with LLM provider=groq model=llama-3.3-70b-versatile
[src.chains.query_chain] INFO: QueryChain using LLM for question: Top 10 merchants by total revenue
[httpx] INFO: HTTP Request: POST https://api.groq.com/openai/v1/chat/completions "HTTP/1.1 200 OK"


Keys: ['answer', 'table']


|   | merchant | amount |
|---|---|---|
| 0 | Apple | 75567.62 |
| 1 | BestBuy | 57285.5 |
| 2 | Microsoft | 41248.78 |
| 3 | Delta | 34953.35 |
| 4 | Airbnb | 32691.88 |
| 5 | Amazon | 28881.65 |
| 6 | Walmart | 24177.42 |
| 7 | MercadoLibre | 19230.28 |
| 8 | Shell | 15710.95 |
| 9 | Alibaba | 14354.27 |
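
The 'split' layout used for `res3['table']` can be reproduced in isolation. The sketch below assumes the payload has the same shape as `DataFrame.to_dict(orient="split")` (keys `index`, `columns`, `data`); whether `query_chain` builds it exactly this way is an assumption. The rows are the top three from the table above.

```python
import pandas as pd

top3 = pd.DataFrame(
    {
        "merchant": ["Apple", "BestBuy", "Microsoft"],
        "amount": [75567.62, 57285.5, 41248.78],
    }
)

# Serialize to a plain dict with 'index', 'columns' and 'data' keys.
payload = top3.to_dict(orient="split")

# Rebuild the DataFrame the same way the cell above does.
rebuilt = pd.DataFrame(payload["data"], columns=payload["columns"])
print(rebuilt)
```

The default integer index is recreated automatically, which is why `payload["index"]` can be ignored here.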


In [10]:
# Example 4: Summarize a result explicitly
from src.chains.summary_chain import SummaryChain

summary = SummaryChain().run(q3, df3)
print(summary)


[src.chains.summary_chain] INFO: SummaryChain initialized with LLM provider=groq model=llama-3.3-70b-versatile
[src.chains.summary_chain] INFO: SummaryChain using LLM for question: Top 10 merchants by total revenue
[httpx] INFO: HTTP Request: POST https://api.groq.com/openai/v1/chat/completions "HTTP/1.1 200 OK"


Apple led with $75.6K in revenue, followed by BestBuy at $57.3K, and Microsoft at $41.2K, rounding out the top 3 of the 10 merchants listed.
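
When no LLM is configured, the intro mentions a heuristic fallback. Its implementation is not shown in this notebook, so the following is only a hypothetical sketch of what a template-based summary could look like; `format_amount` and `heuristic_summary` are illustrative names, not part of `summary_chain`.

```python
import pandas as pd


def format_amount(value: float) -> str:
    """Format a dollar amount, abbreviating thousands as K (hypothetical helper)."""
    if abs(value) >= 1000:
        return f"${value / 1000:.1f}K"
    return f"${value:,.2f}"


def heuristic_summary(df: pd.DataFrame, label_col: str, value_col: str, top_n: int = 3) -> str:
    """Build a one-sentence summary from the top rows without calling an LLM."""
    if df.empty:
        return "No rows to summarize."
    top = df.nlargest(top_n, value_col)
    parts = [
        f"{row[label_col]} ({format_amount(row[value_col])})"
        for _, row in top.iterrows()
    ]
    return f"Top {len(parts)} by {value_col}: " + ", ".join(parts) + "."


merchants = pd.DataFrame(
    {
        "merchant": ["Apple", "BestBuy", "Microsoft"],
        "amount": [75567.62, 57285.5, 41248.78],
    }
)
summary_text = heuristic_summary(merchants, "merchant", "amount")
print(summary_text)
```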


In [13]:
# Example 5 (optional): Async ask()
# asyncio.run() raises RuntimeError when an event loop is already running
# (as in Jupyter), so check for a loop first and fall back to the sync ask().
try:
    import asyncio
    from src.chains.router_chain import ask_async

    def run_async_example():
        q = "What was the total payment volume last week?"
        try:
            asyncio.get_running_loop()
        except RuntimeError:
            # No loop is running: drive the coroutine with asyncio.run()
            resp = asyncio.run(ask_async(q))
        else:
            # A loop is already running: use the sync API and avoid
            # creating a coroutine that would never be awaited
            resp = ask(q)
        print('Route:', resp['route'])
        print('Answer:', resp['answer'])
        return resp

    _ = run_async_example()
except Exception as e:
    print('Async example skipped:', e)


[src.chains.query_chain] INFO: QueryChain initialized with LLM provider=groq model=llama-3.3-70b-versatile
[src.chains.query_chain] INFO: QueryChain using LLM for question: What was the total payment volume last week?
[httpx] INFO: HTTP Request: POST https://api.groq.com/openai/v1/chat/completions "HTTP/1.1 200 OK"
[src.chains.query_chain] INFO: Retrying with stricter instruction due to exec error: 'numpy.float64' object has no attribute 'to_frame'
[httpx] INFO: HTTP Request: POST https://api.groq.com/openai/v1/chat/completions "HTTP/1.1 200 OK"
[src.chains.summary_chain] INFO: SummaryChain initialized with LLM provider=groq model=llama-3.3-70b-versatile
[src.chains.summary_chain] INFO: SummaryChain using LLM for question: What was the total payment volume last week?
[httpx] INFO: HTTP Request: POST https://api.groq.com/openai/v1/chat/completions "HTTP/1.1 200 OK"


[router] llm_used=True provider=groq model=llama-3.3-70b-versatile
Route: data
Answer: The total payment volume last week was approximately $2,993.80.
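
The sync fallback in Example 5 can be reproduced without the project code. The sketch below uses two hypothetical stand-ins, `fake_ask_async` and `fake_ask`, in place of `ask_async` and `ask`, and shows one way to pick between `asyncio.run()` and a sync call depending on whether an event loop is already running.

```python
import asyncio


async def fake_ask_async(question: str) -> dict:
    # Hypothetical stand-in for an async chain call.
    await asyncio.sleep(0)
    return {"route": "data", "answer": f"async: {question}"}


def fake_ask(question: str) -> dict:
    # Hypothetical stand-in for the sync chain call.
    return {"route": "data", "answer": f"sync: {question}"}


def ask_from_anywhere(question: str) -> dict:
    try:
        asyncio.get_running_loop()
    except RuntimeError:
        # No loop is running, so asyncio.run() may create one.
        return asyncio.run(fake_ask_async(question))
    # A loop is already running: use the sync path and never create the coroutine.
    return fake_ask(question)


# Called from plain synchronous code: takes the asyncio.run() path.
outside = ask_from_anywhere("total volume?")


async def call_inside_loop() -> dict:
    # Called while a loop is running: takes the sync fallback path.
    return ask_from_anywhere("total volume?")


inside = asyncio.run(call_inside_loop())
print(outside["answer"], "|", inside["answer"])
```

In a notebook cell, recent IPython versions also accept a top-level `await`, so `resp = await ask_async(q)` may be a simpler option than either branch.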


