docs: add Enterprise Text-to-SQL and Search Agent recipes #395
dhruvnathawani merged 8 commits into main from
Conversation
Add two new recipes derived from dev notes:
- Enterprise Text-to-SQL: dialect-specific SQL generation with distractor table/column injection, dirty data handling, conditional sampling, and multi-judge scoring (from the text-to-sql dev note)
- Search Agent: multi-turn deep research trajectories using a BM25 retriever MCP server with search/open/find tools and LLM judge rejection sampling (from the deep-research-trajectories dev note)

Made-with: Cursor
Switch from LocalStdioMCPProvider with bm25s to Tavily's hosted MCP endpoint (streamable_http). Removes bm25s/PyStemmer/mcp dependencies from the recipe, simplifies the code, and matches the battle-tested pattern from the search agent dev note and GTC notebooks.
Greptile Summary

This PR adds two production-grade recipe scripts and their corresponding documentation pages: Enterprise Text-to-SQL (a five-stage pipeline with distractor injection, dirty data, dialect-specific SQL, and 5 LLM judges producing 15 score columns) and Search Agent (a Tavily-powered MCP pipeline generating multi-turn BrowseComp-style search trajectories from Wikidata seeds). Both recipes follow established repo conventions and integrate cleanly into the MkDocs navigation. Key observations:
| Filename | Overview |
|---|---|
| `docs/assets/recipes/code_generation/enterprise_text_to_sql.py` | New 929-line recipe for enterprise text-to-SQL; the five-stage pipeline (seed → prompt → schema → SQL → 5 LLM judges) is well-structured, the score extraction math is correct (15 columns across 5 judges), and the CLI follows existing conventions. No issues found. |
| `docs/assets/recipes/mcp_and_tooluse/search_agent.py` | New search-agent recipe using Tavily MCP; two issues: `TAVILY_API_KEY` is validated with `is None` instead of a truthiness check (misses the empty-string case), and `--artifact-path` is exposed in help output, contrary to the `argparse.SUPPRESS` convention used by all other MCP recipes. |
| `docs/recipes/code_generation/enterprise_text_to_sql.md` | Recipe doc page; references a dev note (`engineering-an-enterprise-grade-text-to-sql-dataset-with-nemo-data-designer`) that does not exist in the repo — will produce a 404 if deployed before that post is published. |
| `docs/recipes/mcp_and_tooluse/search_agent.md` | Recipe doc page; the previously missing H1 title is now fixed. References a dev note (`search-agent-sft-data-teaching-llms-to-browse-the-web`) that does not exist in the repo — same forward-link 404 risk as the enterprise_text_to_sql doc. |
Sequence Diagram
sequenceDiagram
participant S as Seed / Sampler
participant L as LLM
participant V as Validator
participant J as LLM Judges
participant T as Tavily MCP
rect rgb(30, 60, 90)
Note over S,J: Enterprise Text-to-SQL Pipeline
S->>S: Stage 1 — Category/Subcategory sampling<br/>(industry, topic, sql_complexity, sql_concept,<br/>sql_task_type, data_quality, knowledge, style)
S->>L: Stage 2 — Generate sql_prompt (NL request)
L->>L: Stage 3 — Generate sql_context (DDL + INSERT<br/>core tables + distractor tables + dirty data)
L->>L: Stage 4 — Generate sql (dialect-specific SQL)
L->>V: Stage 5a — SQL syntax validation (SQLite/MySQL/PG)
L->>J: Stage 5b — 5 LLM judges → 15 score columns
end
rect rgb(60, 30, 90)
Note over S,T: Search Agent Pipeline
S->>S: Stage 1 — Wikidata KG seed rows<br/>(seed_entity, final_answer_entity, readable_path)
S->>L: Stage 2a — Draft multi-hop search riddle
L->>L: Stage 2b — BrowseComp-style obfuscation
L->>T: Stage 3 — Agent loop: tavily_search calls<br/>(max 25 turns, 300 s timeout, ALL_MESSAGES trace)
T-->>L: Search results (observations)
L->>L: Stage 4 — Normalize raw output → AgentSolution JSON
end
Comments Outside Diff (3)
-
`docs/assets/recipes/mcp_and_tooluse/search_agent.py`, line 359 (link): `--artifact-path` should be suppressed from help output. The other MCP-based recipes — `pdf_qa.py` (line 528) and `basic_mcp.py` (line 199) — both pass `help=argparse.SUPPRESS` for `--artifact-path` because it's an internal argument used only for `make test-run-recipes` compatibility and not intended to be surfaced to end users. `search_agent.py` is inconsistent with this pattern by exposing the argument in the help output.
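The convention the reviewer references can be sketched as follows. The argument names come from the recipes themselves; the parser description and default values here are illustrative assumptions, not the recipes' actual ones:

```python
import argparse

# Minimal sketch of the pattern used by pdf_qa.py and basic_mcp.py:
# internal arguments keep working but are hidden from --help output.
parser = argparse.ArgumentParser(description="search_agent recipe (sketch)")
parser.add_argument("--model-alias", default="nemotron", help="Model alias to use")
parser.add_argument("--num-records", type=int, default=10, help="Records to generate")
# Hidden internal argument: still parsed and usable, but omitted from --help.
parser.add_argument("--artifact-path", default=None, help=argparse.SUPPRESS)

args = parser.parse_args(["--artifact-path", "/tmp/artifacts"])
print(args.artifact_path)  # the value is still captured normally
```

With `help=argparse.SUPPRESS`, argparse omits the entry from the generated help text entirely while leaving parsing behavior unchanged, which is what makes it suitable for test-harness-only flags.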
-
`docs/recipes/code_generation/enterprise_text_to_sql.md`, line 4 (link): referenced dev note does not exist yet. The link `../../devnotes/engineering-an-enterprise-grade-text-to-sql-dataset-with-nemo-data-designer/` points to a dev note that is not present in the repository. Searching `docs/devnotes/posts/` shows only four existing posts: `deep-research-trajectories.md`, `design-principles.md`, `rqa.md`, and `structured-outputs-from-nemotron.md`; there is no post with the slug `engineering-an-enterprise-grade-text-to-sql-dataset-with-nemo-data-designer`. If this recipe page is deployed before the corresponding dev note is merged, users who click the "Dev Note" link will hit a 404. The same issue applies to `search_agent.md` → `search-agent-sft-data-teaching-llms-to-browse-the-web`. Either create the dev note files in this PR, or replace the links with a forward-looking note until the dev notes are published.
-
`docs/assets/recipes/mcp_and_tooluse/search_agent.py`, lines 367-368 (link): `TAVILY_API_KEY` validation misses the empty-string case. `os.environ.get("TAVILY_API_KEY") is None` only catches a completely absent variable. If a user sets the variable to an empty string `""`, the `is None` check passes and `build_config()` constructs an MCP endpoint URL with an empty API key, which fails at the Tavily API level rather than with the clear `RuntimeError` here.
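A fix along the lines the reviewer suggests would treat the missing and empty-string cases the same. The environment variable name is from the recipe; the helper function and key value below are a hypothetical sketch:

```python
import os

def require_tavily_key() -> str:
    # Hypothetical helper: os.environ.get(...) returns None when the variable
    # is unset and "" when it is set but empty; a plain truthiness check
    # rejects both cases up front instead of failing later at the API.
    api_key = os.environ.get("TAVILY_API_KEY")
    if not api_key:  # catches None *and* ""
        raise RuntimeError("TAVILY_API_KEY must be set to a non-empty value")
    return api_key

os.environ["TAVILY_API_KEY"] = ""  # simulate the empty-string misconfiguration
try:
    require_tavily_key()
    failed_fast = False
except RuntimeError:
    failed_fast = True  # the check now fails here, not at the Tavily API level

os.environ["TAVILY_API_KEY"] = "tvly-example-key"  # simulate a valid key
valid_key = require_tavily_key()
```

The design point is simply that `if not api_key:` subsumes the `is None` check, so the clear `RuntimeError` fires for both misconfigurations.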
Last reviewed commit: 7676f8d
- Add ASCII pipeline diagram to docstring
- Add all 5 LLM judges (prompt, SQL, context, data quality, knowledge) with production rubrics (15 scoring dimensions)
- Expand samplers: 10 industries/50 topics, conditional task types, data quality concepts, knowledge dependency concepts
- Use dialect-specific prompts for schema and SQL generation
- Extract all 15 judge scores into flat columns
- Remove dev note references; recipe is fully standalone

- Rename recipes to "Nemotron Super Text to SQL" and "Nemotron Super Search Agent" across nav, cards, headings, and docstrings
- Add Nemotron Super training context to Python docstrings (BIRD benchmark results for text-to-sql, 7k trajectories for search agent)
- Add dev note links as admonition boxes in recipe markdown pages
- Add seed dataset guidance (required columns, generation process) to search agent recipe page
Tested both workflows after all the changes and they're working! Let me know if anything else is needed.
johnnygreco
left a comment
this is awesome, thanks @dhruvnathawani !!
Summary
Adds two new recipes that turn techniques from the dev notes into ready-to-run code:
Enterprise Text-to-SQL — A five-stage pipeline (seed → prompt → schema with distractors → dialect-specific SQL → validation + judges) based on the text-to-sql dev note. Demonstrates `SubcategorySamplerParams` for conditional sampling, distractor table/column injection, dirty data handling, per-dialect code validation (SQLite/MySQL/PostgreSQL), five LLM judges with score extraction, and prompt style diversification (instruction style × linguistic register × politeness level).
Search Agent — A Tavily-powered MCP pipeline for generating multi-turn search agent trajectories, based on the search agent SFT dev note. Seeds from Wikidata knowledge graph paths, generates BrowseComp-style obfuscated riddles through a two-stage LLM rewrite (draft → obfuscation), then runs a tool-using agent with live Tavily web search to produce full thought-action-observation trajectories captured via `with_trace=dd.TraceType.ALL_MESSAGES`. Uses `dd.MCPProvider` with Tavily's hosted `streamable_http` endpoint — no local server or extra dependencies needed.
Both recipes follow the existing conventions (PEP 723 script metadata, `build_config`/`create_dataset` or `serve`/`main` patterns, and `--model-alias`/`--num-records`/`--artifact-path` CLI args for `make test-run-recipes` compatibility).
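For reference, the PEP 723 inline script metadata mentioned above is a specially delimited comment block at the top of a script that tools like `uv run` read to resolve dependencies before execution. The header below is a generic illustration — the dependency list is not the recipes' actual one — with a small parser showing how the delimiters work:

```python
import re

# PEP 723 inline script metadata, as it would appear at the top of a recipe.
# The dependency list here is illustrative, not the recipes' actual list.
PEP723_HEADER = """\
# /// script
# requires-python = ">=3.10"
# dependencies = [
#     "requests",
# ]
# ///
"""

def parse_pep723(text: str) -> str:
    # Extract the TOML body between the `# /// script` and `# ///` delimiter
    # lines, stripping the leading "# " comment prefix from each line.
    match = re.search(r"^# /// script$(.+?)^# ///$", text, re.M | re.S)
    body = match.group(1)
    return "\n".join(line[2:] for line in body.strip().splitlines())

toml_body = parse_pep723(PEP723_HEADER)
print(toml_body)  # plain TOML: requires-python plus a dependencies array
```

Because the metadata lives in comments, the script stays a valid standalone Python file while still declaring everything a runner needs to set up its environment.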
Files changed