# Dallas Agent Workshop - Colab Quickstart

This notebook runs the workshop repo in Google Colab.

Steps:
1. Clone the repo + install dependencies
2. Set `OPENROUTER_API_KEY` (and optionally `TAVILY_API_KEY`)
3. Run the preflight + demo calls


## Agenda

- 5:30 - Check-in, food, networking (30 min)
- 6:00 - Why Agents? (Motivation & Framing) (5 min)
- 6:05 - Core Concepts & Architectures (5 min)
- 6:10 - Setup & Environment (30 min)
- 6:40 - Live Code Walkthrough (20 min)
- 7:00 - Hands-On Build Session (40 min)
- 7:40 - Engineering Discipline for Agents (10 min)
- 7:50 - Next: Virtual sessions, submitting PRs (10 min)
- 8:00 - Curated Resources


## Why Agents? (Motivation & Framing)

An agent is a loop that can **plan**, **use tools**, and **iterate** toward a goal. In this workshop, the agent can:
- Write Python code
- Execute it in a controlled way
- Read the result (stdout/stderr) and try again

Why this matters: lots of real work is not a single prompt. It needs multi-step problem solving, verification, and guardrails.


## Core Concepts & Architectures

We will use a simple LangGraph workflow that looks like:

`plan -> exec -> (fix -> exec)* -> finish`

Key ideas:
- **State**: the data that flows between steps (task, generated code, last run result).
- **Tools**: the controlled actions the agent can take (here: running Python; for research: web search).
- **Guardrails**: constraints that keep the agent safe/reliable (timeouts, blocked imports, "must print" requirement).
- **Evaluation mindset**: make the agent produce observable outputs so you can debug quickly (stdout, logs, reproducible steps).

During the live walkthrough, we will inspect `agent_lib.py` and connect each node to what you see on screen.
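
To make the `plan -> exec -> (fix -> exec)* -> finish` flow concrete before the walkthrough, here is a plain-Python sketch of state passing and routing. It does not use LangGraph and is not the code in `agent_lib.py`; `AgentState`, `RunResult`, `plan`, `execute`, `fix`, `run_workflow`, and `max_fixes` are hypothetical names, and the "model" is a stub that returns canned code.

```python
import contextlib
import io
from typing import Optional, TypedDict


class RunResult(TypedDict):
    ok: bool
    stdout: str
    stderr: str


class AgentState(TypedDict):
    task: str
    code: str
    last_run: Optional[RunResult]
    fixes: int


def plan(state: AgentState) -> AgentState:
    # Stub for the model call: the first draft has a deliberate bug.
    return {**state, "code": "print(totl)"}


def execute(state: AgentState) -> AgentState:
    # Toy in-process runner for illustration only. It has no timeout or
    # import blocking, so it is NOT a sandbox.
    buf = io.StringIO()
    try:
        with contextlib.redirect_stdout(buf):
            exec(state["code"], {})
        result: RunResult = {"ok": True, "stdout": buf.getvalue(), "stderr": ""}
    except Exception as exc:
        result = {"ok": False, "stdout": buf.getvalue(), "stderr": repr(exc)}
    return {**state, "last_run": result}


def fix(state: AgentState) -> AgentState:
    # Stub for the model reading stderr and proposing a corrected draft.
    return {**state, "code": "totl = 2 + 2\nprint(totl)", "fixes": state["fixes"] + 1}


def run_workflow(task: str, max_fixes: int = 2) -> AgentState:
    state: AgentState = {"task": task, "code": "", "last_run": None, "fixes": 0}
    state = execute(plan(state))
    # Routing rule: loop through fix -> exec until success or the budget runs out.
    while not state["last_run"]["ok"] and state["fixes"] < max_fixes:
        state = execute(fix(state))
    return state


final = run_workflow("Print 2 + 2")
print(final["fixes"], final["last_run"])
```

The point of the sketch is that every node takes the whole state and returns a new one, so the routing decision only needs to look at `last_run` and the fix counter.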


In [1]:
!git clone https://github.com/jiankunliu-ai/dallas-ai-agent-workshop.git
%cd dallas-ai-agent-workshop
!pip -q install -r requirements.txt


Cloning into 'dallas-ai-agent-workshop'...
remote: Enumerating objects: 69, done.
remote: Counting objects: 100% (69/69), done.
remote: Compressing objects: 100% (45/45), done.
remote: Total 69 (delta 37), reused 53 (delta 24), pack-reused 0 (from 0)
Receiving objects: 100% (69/69), 38.23 KiB | 9.56 MiB/s, done.
Resolving deltas: 100% (37/37), done.
/content/dallas-ai-agent-workshop
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
blobfile 3.2.0 requires urllib3>=2, but you have urllib3 1.

## Set API Keys

Use Colab's prompt to avoid hardcoding secrets into the notebook.

If you get a 401 later, restart the runtime and rerun this cell.


In [None]:
import os
from getpass import getpass

# Prompt for the key so it is never hardcoded into the notebook.
openrouter_key = getpass('OPENROUTER_API_KEY: ').strip()
if not openrouter_key:
    raise RuntimeError('OPENROUTER_API_KEY is required')

os.environ['OPENROUTER_API_KEY'] = openrouter_key
os.environ['OPENROUTER_MODEL'] = 'arcee-ai/trinity-large-preview:free'

# The Tavily key is optional; it is used for web search in the research agent.
tavily_key = getpass('TAVILY_API_KEY (optional, press enter to skip): ').strip()
if tavily_key:
    os.environ['TAVILY_API_KEY'] = tavily_key


In [None]:
!python test_model.py
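
If the preflight fails with a 401, one quick thing to rule out is a key that is unset or blank in the environment. The helper below is a small sketch for that check, not part of the repo; `missing_keys` and `mask` are hypothetical names, and the key value in the example is fake.

```python
import os


def missing_keys(required=("OPENROUTER_API_KEY",), env=None):
    """Return the names in `required` that are unset or blank."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name, "").strip()]


def mask(value, visible=4):
    """Show only the last few characters so keys never land in cell output."""
    if len(value) <= visible:
        return "*" * len(value)
    return "*" * (len(value) - visible) + value[-visible:]


# Example with a fake environment; call missing_keys() alone to check os.environ.
fake_env = {"OPENROUTER_API_KEY": "sk-or-example-1234", "TAVILY_API_KEY": " "}
print(missing_keys(("OPENROUTER_API_KEY", "TAVILY_API_KEY"), fake_env))
print(mask(fake_env["OPENROUTER_API_KEY"]))
```

Printing a masked key rather than the key itself keeps the secret out of saved notebook output.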


## 1) Sanity check: run the Python execution tool

The workshop uses a controlled Python execution tool (timeouts + basic safety blocks).
This cell verifies that it runs and returns stdout/stderr as expected.
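
For intuition about what such a tool does, here is a minimal sketch that runs code in a separate interpreter with a timeout. It is not the repo's `run_python`; `run_python_sketch` and `timeout_s` are hypothetical, the `ok`/`stdout`/`stderr` keys simply mirror the fields read from `last_run` later in this notebook, and treating empty stdout as a failure is one assumed way to express a "must print" rule.

```python
import subprocess
import sys


def run_python_sketch(code: str, timeout_s: float = 5.0) -> dict:
    """Run `code` in a separate interpreter and capture its output."""
    try:
        proc = subprocess.run(
            [sys.executable, "-I", "-c", code],  # -I: isolated mode
            capture_output=True,
            text=True,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        return {"ok": False, "stdout": "", "stderr": f"timed out after {timeout_s}s"}

    ok = proc.returncode == 0
    stderr = proc.stderr
    # "Must print" rule: a silent success gives the agent nothing to verify.
    if ok and not proc.stdout.strip():
        ok, stderr = False, "no output: the code must print its result"
    return {"ok": ok, "stdout": proc.stdout, "stderr": stderr}


print(run_python_sketch("print('hello from sandbox')"))
print(run_python_sketch("x = 1"))
print(run_python_sketch("while True: pass", timeout_s=1.0))
```

A child process with a timeout limits runaway loops, but it is not a security boundary: the code still runs with the notebook's permissions.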


In [None]:
from tools import run_python

run_python("print('hello from sandbox')")


## 2) Run a single agent task

This is the core loop from the workflow above: `plan -> exec -> (fix -> exec)* -> finish`.


In [None]:
from agent_lib import run_task

task = "Write a Python function to compute Fibonacci(n) efficiently and print Fibonacci(35)."
result = run_task(task)

result['last_run']


## 3) Workshop exercises

Try a few tasks that require parsing, statistics, and basic data logic.


In [None]:
from agent_lib import run_task

tasks = [
    "Parse this CSV string and compute the average of the 'latency_ms' column:\n\nts,latency_ms\n1,120\n2,110\n3,130\n4,90\n",
    "Implement rolling z-score anomaly score for this list and print the top 3 most anomalous points: [10,11,9,10,10,200,11,10,9,10]",
    "Given a list of (user_id, event_time, event_type), compute per-user session counts (30-min gap) and print a dict."
]

for t in tasks:
    print('\n' + '='*80)
    print('TASK:', t)
    out = run_task(t)
    print('OK:', out['last_run']['ok'])
    print('STDOUT:\n', out['last_run']['stdout'])
    if not out['last_run']['ok']:
        print('STDERR:\n', out['last_run']['stderr'])
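
Agent output varies between runs, so it helps to have a hand-written reference to compare against. The sketch below computes the expected answer for the first task and shows one reading of the 30-minute session rule. `session_counts` is a hypothetical helper, `sample_events` is made-up data (the task text supplies none), and treating a gap strictly longer than 30 minutes as a new session is an assumption.

```python
import csv
import io
from collections import defaultdict
from datetime import datetime, timedelta

csv_text = "ts,latency_ms\n1,120\n2,110\n3,130\n4,90\n"
rows = list(csv.DictReader(io.StringIO(csv_text)))
average = sum(float(r["latency_ms"]) for r in rows) / len(rows)
print("average latency_ms:", average)


def session_counts(events, gap=timedelta(minutes=30)):
    """Count sessions per user; a gap longer than `gap` starts a new one."""
    times_by_user = defaultdict(list)
    for user_id, event_time, _event_type in events:
        times_by_user[user_id].append(event_time)
    counts = {}
    for user_id, times in times_by_user.items():
        times.sort()
        # The first event opens a session; each long gap opens another.
        counts[user_id] = 1 + sum(
            1 for prev, cur in zip(times, times[1:]) if cur - prev > gap
        )
    return counts


# Hypothetical sample data for illustration.
t0 = datetime(2024, 1, 1, 9, 0)
sample_events = [
    ("u1", t0, "view"),
    ("u1", t0 + timedelta(minutes=10), "click"),
    ("u1", t0 + timedelta(minutes=55), "view"),
    ("u2", t0, "view"),
]
print(session_counts(sample_events))
```

If the agent's printed values differ from a reference like this, check its generated code for a different interpretation of the task before assuming a bug.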


## 3a) Advanced: tighten/loosen execution policy

Optional: inspect `tools.py` to see what imports/calls are blocked, and discuss tradeoffs.
In a real product you would add stronger sandboxing, allowlists, and auditing.
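
As a starting point for that discussion, here is a sketch of an import allowlist built on the standard `ast` module. It is not how `tools.py` works; `ALLOWED_IMPORTS`, `BLOCKED_CALLS`, and `policy_violations` are hypothetical, and the specific module and call names are example choices.

```python
import ast

ALLOWED_IMPORTS = {"math", "statistics", "csv", "io", "json", "collections"}
BLOCKED_CALLS = {"eval", "exec", "open", "__import__"}


def policy_violations(code: str) -> list:
    """Return human-readable reasons why `code` breaks the policy."""
    try:
        tree = ast.parse(code)
    except SyntaxError as exc:
        return [f"syntax error: {exc.msg}"]

    problems = []
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            modules = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            modules = [node.module or ""]
        else:
            modules = []
        for name in modules:
            # Compare the top-level package, so "os.path" is judged as "os".
            if name.split(".")[0] not in ALLOWED_IMPORTS:
                problems.append(f"import not allowed: {name}")
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id in BLOCKED_CALLS
        ):
            problems.append(f"call not allowed: {node.func.id}")
    return problems


print(policy_violations("import math\nprint(math.sqrt(2))"))
print(policy_violations("import os.path\nopen('secrets.txt')"))
```

An allowlist fails closed (unknown modules are rejected), while a blocklist fails open. Static checks like this can be bypassed, for example through attribute lookups, so they complement process-level isolation rather than replace it.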


## 4) Applied Exercise: Research Agent

Unlike the code-execution agent, this agent:

- Plans multi-step searches
- Gathers information from the web via Tavily
- Synthesizes findings into a structured report

Use case: competitive intelligence, market research, due diligence
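
The plan, search, and synthesize steps can be sketched offline with a stubbed search tool. This is an illustration only and may not match how `research_agent.py` is structured; `plan_queries`, `fake_search`, and `research_sketch` are hypothetical, the search results are canned placeholders, and a real run would replace both stubs with model calls and a Tavily search.

```python
from typing import Callable, Dict, List


def plan_queries(question: str) -> List[str]:
    # Stub planner: a real agent would ask the model to decompose the question.
    return [question, f"{question} comparison", f"{question} recent news"]


def fake_search(query: str) -> List[Dict[str, str]]:
    # Stand-in for a web search tool; returns canned, clearly fake results.
    return [
        {
            "title": f"Result for: {query}",
            "url": "https://example.com",
            "content": "placeholder snippet",
        }
    ]


def research_sketch(
    question: str, search: Callable[[str], List[Dict[str, str]]]
) -> Dict[str, object]:
    findings = []
    seen_titles = set()
    for query in plan_queries(question):
        for hit in search(query):
            if hit["title"] in seen_titles:  # skip duplicate hits
                continue
            seen_titles.add(hit["title"])
            findings.append({"query": query, **hit})
    # Stub synthesis: a real agent would pass the findings to the model.
    lines = [f"# Report: {question}", ""]
    lines += [f"- {f['title']} ({f['url']})" for f in findings]
    return {"report": "\n".join(lines), "sources": findings}


sketch_result = research_sketch("Example question", fake_search)
print(sketch_result["report"])
```

Passing the search function in as an argument makes it easy to test the control flow without network access or an API key.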


In [None]:
from research_agent import run_research

question = "What are the top 3 AI chip companies in 2024 and what's their competitive advantage?"

print(f"RESEARCH QUESTION:\n{question}\n")

result = run_research(question)

print('\n' + '='*60)
print('FINAL REPORT:')
print('='*60)
print(result["report"])


## 5) Engineering Discipline for Agents

Agents feel "magical" until they fail. The fastest way to make them reliable is good engineering hygiene:

- **State management**: write down what the agent knows (inputs/outputs) and pass it explicitly.
- **Context management**: keep prompts short and structured; include only what is necessary; summarize when needed.
- **Memory (optional)**: decide what should persist across runs (none vs. per-session vs. long-term).
- **Token budgeting**: constrain output formats; avoid dumping large logs or huge documents into the model.
- **Governance & safety**: limit tools and permissions; log tool calls; treat credentials carefully; assume untrusted outputs.

In this repo, we keep things workshop-safe by forcing observable stdout, logging generated code, adding timeouts, and blocking risky Python calls.
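
One concrete token-budgeting habit is to clip tool output before it goes back into the prompt. The helper below is a sketch with a hypothetical name (`clip_output`); it uses character count as a rough stand-in for tokens, which is an assumption, and keeps more of the tail because tracebacks usually end with the useful line.

```python
def clip_output(
    text: str, max_chars: int = 400, marker: str = "\n...[clipped]...\n"
) -> str:
    """Keep the head and tail of long output and drop the middle."""
    if len(text) <= max_chars:
        return text
    keep = max_chars - len(marker)
    if keep <= 0:
        # Budget too small to fit the marker; fall back to a plain cut.
        return text[:max_chars]
    head = keep // 3
    tail = keep - head  # always >= 1 here, so text[-tail:] is safe
    return text[:head] + marker + text[-tail:]


long_log = "\n".join(f"line {i}" for i in range(1000)) + "\nValueError: bad input"
clipped = clip_output(long_log, max_chars=120)
print(len(clipped))
print(clipped)
```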


## 6) Next: Virtual sessions + Submitting PRs

Stretch goals (great for follow-up sessions):
- Add more tools (file I/O, data APIs) with careful safety boundaries
- Turn the workflow into multiple collaborating agents (planner + specialists)
- Add lightweight evaluations (golden tests, regression prompts, success criteria)

If you improve the workshop materials, please open a PR against this repo.


## 7) Curated Resources

- Prompting best practices (Claude): https://platform.claude.com/docs/en/build-with-claude/prompt-engineering/claude-prompting-best-practices
- LangGraph docs: https://langchain-ai.github.io/langgraph/
- OpenRouter docs: https://openrouter.ai/docs

Tip: when learning, keep a small set of repeatable test prompts (like `2+2`, Fibonacci, CSV parse) to validate changes quickly.
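
A small harness makes that habit repeatable. The sketch below assumes the agent returns a dict with `last_run['ok']` and `last_run['stdout']`, as in the cells above; `GOLDEN_CASES`, `run_golden`, and `fake_run_task` are hypothetical, the prompts are examples, and the Fibonacci expectation assumes the usual F(1) = F(2) = 1 indexing.

```python
from typing import Callable, Dict, List, Tuple

# Each case pairs a prompt with a substring the final stdout must contain.
GOLDEN_CASES: List[Tuple[str, str]] = [
    ("Print the result of 2+2.", "4"),
    ("Print Fibonacci(35).", "9227465"),
    ("Print the average of latency_ms for rows 120, 110, 130, 90.", "112.5"),
]


def run_golden(run: Callable[[str], Dict], cases=GOLDEN_CASES) -> List[Dict]:
    """Run every case and record pass/fail without stopping at the first miss."""
    results = []
    for prompt, expected in cases:
        try:
            last = run(prompt)["last_run"]
            passed = bool(last["ok"]) and expected in last["stdout"]
        except Exception:  # a crash counts as a failure, not an abort
            passed = False
        results.append({"prompt": prompt, "passed": passed})
    return results


def fake_run_task(prompt: str) -> Dict:
    # Stand-in for the agent so the harness can be tried offline.
    canned = {
        "Print the result of 2+2.": "4\n",
        "Print Fibonacci(35).": "9227465\n",
    }
    out = canned.get(prompt, "")
    return {"last_run": {"ok": bool(out), "stdout": out, "stderr": ""}}


report = run_golden(fake_run_task)
print(sum(r["passed"] for r in report), "of", len(report), "passed")
```

To use it against the real agent, pass `run_task` from `agent_lib` in place of `fake_run_task`. Substring matching is deliberately loose; tighten it if a case passes for the wrong reason.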
