feat(grounding): Tavily web search, opt-in via TAVILY_API_KEY by heitor-am · Pull Request #7 · heitor-am/python-tutor-chatbot

heitor-am · 2026-04-19T00:51:22Z

Summary

Adds the optional grounding upgrade outlined in ADR-006: a TavilySearch tool wired into the agent when TAVILY_API_KEY is set. Reduces hallucination on the things LLMs reliably get wrong — API signatures, version-specific behaviour, library APIs.

Changes

app/tools.py — build_tools() returns [TavilySearch(max_results=3, include_answer="basic", search_depth="basic")] when the env var is set; [] otherwise. The Tavily-summarised answer + 3 sources keep the LLM-side payload around ~300 tokens vs dumping raw snippets.
app/agent.py — no change required (already computed with_grounding=bool(tools) and passed tools=build_tools()).
app/prompts/tutor.py — no change required (rule 9 already toggleable via build_system_prompt(with_grounding=)).
tests/test_tools.py — locks both branches with a fake key; defensive "grounding disabled by default" test.
tests/test_agent.py — TestSystemPromptToggle class spies on build_system_prompt to assert with_grounding matches bool(build_tools()). Catches the silent regression where the prompt would instruct the model to call a tool that doesn't exist.
docs/adr/006-grounding-via-tavily-search.md — Tavily over DuckDuckGo (snippet quality, scraper stability) / over context7 (indexes libs, not Python core) / over custom RAG over docs.python.org (out of scope).
README + PRD updated: grounding status planned → delivered, with link to ADR-006.

Operator step (after merge, optional)

To enable grounding in production:

fly secrets set TAVILY_API_KEY=tvly-... --app python-tutor-chatbot

Without the secret, the deploy keeps current v1.0.0 behaviour exactly (tool list is empty, system prompt rule 9 is dropped). The agent never lies about what it can do.

Test plan

uv run ruff check . / format --check — passes
uv run mypy app — passes (strict)
uv run pytest — 34 passed, 100% coverage on app/
Post-merge with TAVILY_API_KEY set: smoke a version-specific question (e.g. asyncio.TaskGroup) → Chainlit Step shows tavily_search call
Smoke a basic question (e.g. "o que é uma lista?") → answers directly, no tool call

Reduces hallucination on the things LLMs reliably get wrong (API signatures, version-specific behaviour, library APIs). - app/tools.py — build_tools() returns [TavilySearch(max_results=3, include_answer="basic", search_depth="basic")] when TAVILY_API_KEY is set; [] otherwise. The 3 sources + Tavily-summarised answer keep the LLM-side payload around ~300 tokens vs raw snippet dumps. - app/agent.py — already passes tools=build_tools() and computes with_grounding=bool(tools); no change needed. - app/prompts/tutor.py — already wires rule 9 conditionally via build_system_prompt(with_grounding=); no change needed. Tests: - test_tools.py updated — locks both branches (key set / unset) + defensive default-disabled test. - test_agent.py adds TestSystemPromptToggle — spies on build_system_prompt to assert with_grounding flag matches bool(build_tools()). Catches the silent regression where the prompt would tell the model to call a tool that doesn't exist. Docs: - ADR-006 — Tavily over DuckDuckGo (snippet quality, lib stability) / over context7 (mismatch — indexes libs, not Python core) / over custom RAG over docs.python.org (out of scope). - README + PRD §2.3 reflect "delivered" status. 34 tests, 100% coverage on app/. Without TAVILY_API_KEY the v1.0.0 behaviour is preserved exactly; with it the agent picks up grounding + rule 9 automatically on next session.

heitor-am merged commit 16dfb13 into main Apr 19, 2026
4 checks passed

heitor-am deleted the feat/grounding branch April 19, 2026 00:58

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(grounding): Tavily web search, opt-in via TAVILY_API_KEY#7

feat(grounding): Tavily web search, opt-in via TAVILY_API_KEY#7
heitor-am merged 1 commit into
mainfrom
feat/grounding

heitor-am commented Apr 19, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

heitor-am commented Apr 19, 2026

Summary

Changes

Operator step (after merge, optional)

Test plan

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant