feat(docs-pipeline): source grounding, compiler gate, and pipeline fixes #2
…e fixes

Implements the Autonomous Self-Demonstrating Simulation Engine architecture for the DocumentationWriterBot. Key changes:

Source Code Grounding:
- Add GroundSourceCode action to extract verbatim API source from repos
- Fix Enum.find_value bug where `cat` error messages from wrong repo were accepted as valid content (test -f gate + text guard)
- Reduce @max_lines to 250 to avoid 414 URI Too Long from Sprites API

Compiler Gate (EvaluateLivebookDraft):
- New action that extracts ```elixir blocks and runs them in sprite sandbox
- Fix output capture: append `|| true` so non-zero exit codes don't lose the actual compilation error trace
- Add trace truncation (4KB tail) so Mix.install noise doesn't blow up the critic prompt payload
- Fix has_error_indicators? false positive on bare "failed" match
- Increase eval timeout to 120s for Mix.install + compilation

Pipeline Workflow:
- Wire ParseContentBrief, ResolveOutputPath, GroundSourceCode, EvaluateLivebookDraft (V1+V2), and EmbedInteractiveDemo into DAG
- Add content_metadata and prompt_overrides to build_intake
- Add docs-specific keys to shared @pipeline_keys in helpers.ex
- Fix ParseContentBrief to preserve intake metadata when brief has no frontmatter (already split by mix task)

Writer/Critic Quality:
- Enhanced writer prompt with NimblePublisher format enforcement, frontmatter template, NimbleOptions warnings, Jido.Agent.set/2 guidance
- Enhanced critic prompt with compiler trace section for deterministic error feedback
- Fix Observer.stop crash (rescue -> catch :exit for GenServer.stop)

Infrastructure:
- Add Dockerfile.docs for Linux-only erlexec builds
- Add mix jido_lib.github.docs.pipeline task with --max-revisions, --provider, --dry-run options
- Add ContentPlan module for Elixir map frontmatter parsing

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
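The `cat` / `head` exit-code problem behind the source grounding fix can be reproduced without the Sprites sandbox. The following is a minimal sketch, assuming a Unix `sh` on the path; `GroundingProbe` and its two functions are hypothetical names for illustration and are not part of the pipeline.

```elixir
defmodule GroundingProbe do
  # Hypothetical helper, not the pipeline's code.

  # Naive read: the status of a shell pipeline is the status of its last
  # command (`head`), so a failing `cat` still reports 0 and its error
  # message is returned as if it were file content.
  def naive_read(path) do
    System.cmd("sh", ["-c", "cat \"$1\" 2>&1 | head -n 1000", "sh", path])
  end

  # Gated read: `test -f` runs first, so a missing file short-circuits the
  # pipeline and yields a non-zero status with empty output.
  def gated_read(path) do
    System.cmd("sh", ["-c", "test -f \"$1\" && cat \"$1\" | head -n 1000", "sh", path])
  end
end

{naive_output, naive_status} = GroundingProbe.naive_read("/no/such/file.ex")
{gated_output, gated_status} = GroundingProbe.gated_read("/no/such/file.ex")
```

With a missing path, the naive variant returns status 0 together with the `cat` error text, while the gated variant returns a non-zero status, which is what lets a caller such as `Enum.find_value` move on to the next candidate repo.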
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 5924ae58d2
    [frontmatter_str, body] ->
      if String.starts_with?(String.trim(frontmatter_str), "%{") do
        try do
          {metadata, _binding} = Code.eval_string(frontmatter_str)
Parse frontmatter without evaluating arbitrary Elixir
ContentPlan.parse_string!/1 executes frontmatter via Code.eval_string/1, so any content plan processed by mix jido_lib.github.docs.pipeline, generate_all, or scan_drift can run arbitrary BEAM code during parsing (for example System.cmd/3) instead of being treated as data. This is a code-execution path triggered by file contents and is unsafe in automation/CI contexts that ingest untrusted branches or PR content.
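One way to treat the frontmatter as data is to parse it to an AST with `Code.string_to_quoted/1` and accept only literal nodes, never evaluating anything. This is a minimal sketch under that assumption; `SafeFrontmatter` is a hypothetical module, not the `ContentPlan` implementation, and it deliberately rejects structs, variables, calls, and interpolated strings.

```elixir
defmodule SafeFrontmatter do
  # Hypothetical sketch: converts an Elixir map literal to a term by walking
  # its AST. Nothing is evaluated, so calls like System.cmd/3 are rejected.
  # Note: parsing still creates atoms for unknown keys; the
  # `existing_atoms_only: true` option of Code.string_to_quoted/2 can limit that.

  def parse(source) when is_binary(source) do
    case Code.string_to_quoted(source) do
      {:ok, ast} -> to_term(ast)
      {:error, reason} -> {:error, {:syntax, reason}}
    end
  end

  # Map literal: %{k => v, ...}
  defp to_term({:%{}, _meta, pairs}) when is_list(pairs) do
    with {:ok, list} <- to_terms(pairs) do
      if Enum.all?(list, &match?({_, _}, &1)) do
        {:ok, Map.new(list)}
      else
        {:error, :not_a_literal}
      end
    end
  end

  # Tuples with a size other than two are quoted as {:{}, meta, elements}.
  defp to_term({:{}, _meta, elems}) when is_list(elems) do
    with {:ok, list} <- to_terms(elems), do: {:ok, List.to_tuple(list)}
  end

  # Two-element tuples (including keyword and map pairs) quote to themselves.
  defp to_term({left, right}) do
    with {:ok, l} <- to_term(left), {:ok, r} <- to_term(right), do: {:ok, {l, r}}
  end

  defp to_term(list) when is_list(list), do: to_terms(list)

  defp to_term(term) when is_atom(term) or is_number(term) or is_binary(term),
    do: {:ok, term}

  # Anything else (calls, variables, aliases, interpolation) is not data.
  defp to_term(_other), do: {:error, :not_a_literal}

  defp to_terms(list) do
    list
    |> Enum.reduce_while({:ok, []}, fn elem, {:ok, acc} ->
      case to_term(elem) do
        {:ok, value} -> {:cont, {:ok, [value | acc]}}
        error -> {:halt, error}
      end
    end)
    |> case do
      {:ok, reversed} -> {:ok, Enum.reverse(reversed)}
      error -> error
    end
  end
end

{:ok, metadata} = SafeFrontmatter.parse(~s|%{title: "Agents", tags: [:docs], order: 2}|)
rejected = SafeFrontmatter.parse(~s|%{title: System.cmd("id", [])}|)
```

The trade-off is that content plans can no longer compute values in frontmatter, which is usually acceptable for metadata ingested from untrusted branches.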
    """
    #{String.trim(draft)}
    #{fm_string}#{clean_draft}
Avoid prepending a second frontmatter block to final guides
This path always prepends build_frontmatter/2 to clean_draft, but RunWriterPass now instructs both v1 and v2 drafts to start with their own %{...} frontmatter. As a result, accepted outputs systematically contain two frontmatter sections, which breaks the expected single-header Livebook/NimblePublisher format and leaves one metadata map rendered as body content.
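A possible guard is to drop any frontmatter the writer already emitted before prepending the generated one. The sketch below assumes the `%{...}` then `---` layout described elsewhere in this PR; `FrontmatterDedup` and `strip_leading_frontmatter/1` are hypothetical names, not existing functions in the codebase.

```elixir
defmodule FrontmatterDedup do
  # Hypothetical helper: removes a leading `%{...}` block terminated by a
  # line containing only `---`, so one generated header can be prepended
  # without duplicating metadata.
  def strip_leading_frontmatter(draft) when is_binary(draft) do
    case String.split(draft, ~r/^---[ \t]*$/m, parts: 2) do
      [head, body] ->
        if String.starts_with?(String.trim_leading(head), "%{") do
          String.trim_leading(body)
        else
          # A `---` that is not preceded by a map is an ordinary rule.
          draft
        end

      _no_separator ->
        draft
    end
  end
end

draft = "%{title: \"Agents\"}\n---\n# Agents\n\nBody text\n"
clean = FrontmatterDedup.strip_leading_frontmatter(draft)
```

Drafts without a leading map are returned unchanged, so the helper is safe to apply to both V1 and V2 output.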
Collapse the LLM's latent space with a concrete, complete Livebook
example directly in the V1 writer prompt. The template shows:
- Exact %{ frontmatter } → --- → # Title structure
- Mix.install as first code cell
- Variable-before-use pattern (prevents undefined variable errors)
- DemoAgent = Module assignment as final code block
- ExUnit test block with autorun: false + ExUnit.run()
V2 prompt reinforced with 8 absolute rules including "return the
COMPLETE file, not a diff" to prevent the 654-byte stub issue.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
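For reference, the ExUnit portion of such a template might look like the sketch below. It assumes nothing beyond the standard library; the test module name and the values are illustrative and are not taken from the actual golden template.

```elixir
# Disable autorun so tests execute when ExUnit.run/0 is called, not at exit.
ExUnit.start(autorun: false)

defmodule CounterSketchTest do
  use ExUnit.Case, async: true

  test "count increases by the given amount" do
    # Variables are bound before they are used (variable-before-use pattern).
    count = 0
    amount = 2
    assert count + amount == 2
  end
end

# Runs the registered test modules and returns a summary map.
result = ExUnit.run()
```

Because the run is triggered explicitly, the same cell works in a Livebook notebook and in a concatenated `.exs` script.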
…late

Kino is 0.14.x not 2.x; the blanket "~> 2.0" for all ecosystem packages caused Mix.install to fail with "kino ~> 2.0 doesn't match any versions". Add a version_map with correct constraints.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
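A per-package lookup with a fallback is one way to express such a version map. This is a sketch only: `DepVersions` is a hypothetical module, the `kino` entry follows the constraint described in the commit, and `"0.14.2"` is used purely as an example version string.

```elixir
defmodule DepVersions do
  # Hypothetical sketch of a version map with a blanket default.
  @default_requirement "~> 2.0"
  @version_map %{kino: "~> 0.14"}

  # Returns the package-specific requirement, or the blanket default.
  def requirement(package) when is_atom(package) do
    Map.get(@version_map, package, @default_requirement)
  end

  # Builds a dependency tuple in the shape Mix.install/1 accepts.
  def dep(package), do: {package, requirement(package)}
end

blanket_matches? = Version.match?("0.14.2", "~> 2.0")
mapped_matches? = Version.match?("0.14.2", DepVersions.requirement(:kino))
```

`Version.match?/2` makes the original failure easy to see: a 0.14.x release cannot satisfy `~> 2.0`, while it does satisfy `~> 0.14`.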
…ace prefix

Two fixes to the compiler gate:
1. strip_mix_install_noise: Remove Mix.install dependency resolution and compilation output before running has_error_indicators?. Dep warnings (e.g., Timex struct update warning) were triggering false positives. Now only the user's code output is analyzed.
2. truncate_trace preserves prefix: The BEAM SUCCESS/FAILED first line is now kept through truncation so the critic knows the verdict even when the trace exceeds the 4KB limit.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
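The prefix-preserving truncation could be sketched as follows. This is not the pipeline's `truncate_trace`; `TraceTruncation`, the marker text, and the line-boundary trimming are assumptions made for illustration.

```elixir
defmodule TraceTruncation do
  # Hypothetical sketch: keep the first (verdict) line plus the last
  # `max_bytes` of the trace.
  @max_bytes 4096

  def truncate(trace, max_bytes \\ @max_bytes)

  def truncate(trace, max_bytes) when byte_size(trace) <= max_bytes, do: trace

  def truncate(trace, max_bytes) do
    [verdict | _rest] = String.split(trace, "\n", parts: 2)
    tail = binary_part(trace, byte_size(trace) - max_bytes, max_bytes)

    # Drop the partial first line of the tail so the cut never lands inside
    # a multi-byte character. If the tail has no newline it is kept as-is.
    tail =
      case String.split(tail, "\n", parts: 2) do
        [_partial, rest] -> rest
        [only] -> only
      end

    verdict <> "\n[... trace truncated ...]\n" <> tail
  end
end

trace =
  "BEAM FAILED\n" <>
    String.duplicate("Resolving Hex dependencies...\n", 500) <>
    "** (CompileError) undefined variable \"x\"\n"

short = TraceTruncation.truncate(trace)
```

Keeping the tail rather than the head reflects that the actual error usually appears at the end of the output, after the dependency noise.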
…do.Agent.new()

The `use Jido.Agent` macro generates new/0, cmd/2, set/2 on the child module. The previous golden template incorrectly taught the LLM to call Jido.Agent.new(Module) which triggers FunctionClauseError. Fixed both V1 and V2 writer prompts to use the correct module-based API pattern.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ctives}
The golden template incorrectly taught {:ok, updated, _directives} pattern
for cmd/2, causing MatchError at runtime since cmd/2 returns a plain
{agent, directives} 2-tuple. Fixed in grounded_clause, golden template
code examples, and both V1/V2 ABSOLUTE RULES.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The LLM was generating actions that returned absolute values instead of
computing from current state (e.g., %{count: amount} instead of
%{count: context.state.count + amount}). Added a Defining Actions section
to the golden template showing the correct context.state access pattern.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
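The difference between the two return shapes can be shown with plain functions. The module below is a stand-in only: it does not `use Jido.Action`, and the `run_absolute/2` and `run_relative/2` names and the `context` map shape are assumptions for illustration.

```elixir
defmodule IncrementSketch do
  # Stand-in for an action callback; deliberately not a real Jido.Action.

  # Incorrect pattern: ignores current state, so the count is overwritten.
  def run_absolute(%{amount: amount}, _context) do
    {:ok, %{count: amount}}
  end

  # Correct pattern: reads the current value from context.state first.
  def run_relative(%{amount: amount}, context) do
    {:ok, %{count: context.state.count + amount}}
  end
end

context = %{state: %{count: 5}}
absolute = IncrementSketch.run_absolute(%{amount: 2}, context)
relative = IncrementSketch.run_relative(%{amount: 2}, context)
```

Starting from a count of 5, the absolute version yields 2 and the relative version yields 7.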
In Elixir, DemoAgent = CounterAgent is a pattern match between two different atoms (not variable assignment), causing MatchError. These Livebook visualizer assignments are valid in Livebook cells but fail in concatenated .exs scripts. Strip them before evaluation.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
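Both the failure and a possible stripping step can be sketched in a few lines. `AliasAssignmentStripper` and its regex are hypothetical and may be narrower or broader than the filter the pipeline actually uses.

```elixir
defmodule AliasAssignmentStripper do
  # Hypothetical sketch: removes lines of the form `SomeAlias = Other.Alias`.
  def strip(script) when is_binary(script) do
    pattern = ~r/^\s*[A-Z][A-Za-z0-9_.]*\s*=\s*[A-Z][A-Za-z0-9_.]*\s*$/

    script
    |> String.split("\n")
    |> Enum.reject(&Regex.match?(pattern, &1))
    |> Enum.join("\n")
  end
end

# Both sides are aliases, i.e. atoms, so `=` is a match that cannot succeed.
outcome =
  try do
    Code.eval_string("DemoAgent = CounterAgent")
    :no_error
  rescue
    MatchError -> :match_error
  end

cleaned = AliasAssignmentStripper.strip("x = 1\nDemoAgent = CounterAgent\nIO.puts(x)")
```

Ordinary variable bindings such as `x = 1` and comparisons such as `A == B` are left alone because the pattern requires a capitalized name on both sides of a single `=`.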
The demo prompt included the full docs_brief (~21KB of grounded source context) which triggered 414 URI Too Long from the Sprites API. The error was silently swallowed. Fixed by:
1. Dropping docs_brief from demo prompt (only needs the draft)
2. Truncating draft to last 4KB if large
3. Adding Logger.warning on failure for diagnosis
4. Adding API pattern rules to demo prompt

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Summary
Implements the Autonomous Self-Demonstrating Simulation Engine for the DocumentationWriterBot. The pipeline now:
- `GroundSourceCode` reads `lib/jido/agent.ex` and `lib/jido_action.ex` from cloned repos
- `EvaluateLivebookDraft` extracts all ```elixir blocks, combines them into a `.exs` script, and executes it in the sprite sandbox. Real compilation errors feed into the critic for targeted revision instructions

Bug Fixes (8 total)
- `Enum.find_value` source grounding bug: `cat file | head -n 1000` exits 0 even when `cat` fails (head's exit code wins), so error messages like "cat: lib/jido_action.ex: No such file" were accepted as valid content. Fixed with a `test -f` gate + text guard.
- `elixir script.exs` returned `{:error, %Shell.Error{}}`, which didn't match the expected pattern, losing the actual compilation trace. Fixed with `|| true` to always capture output.
- `has_error_indicators?` matched bare "failed" in normal Jido runtime log lines during successful execution. Narrowed to specific Elixir error patterns.
- `rescue` doesn't catch OTP `:exit` signals from `GenServer.stop`. Changed to `catch :exit`.
- `@pipeline_keys` didn't include docs-specific keys, so shared actions dropped them. Added 10 keys to both parent and child helpers.
- `ParseContentBrief` reset intake metadata to `%{}` when the brief had no frontmatter. Now preserves existing metadata.

New Files (9)
- `GroundSourceCode`: Extracts source code from repos for semantic grounding
- `EvaluateLivebookDraft`: Compiler gate; runs extracted code in sprite sandbox
- `ParseContentBrief`: Parses Elixir map frontmatter from content plans
- `ResolveOutputPath`: Resolves output path from content metadata
- `EmbedInteractiveDemo`: Embeds interactive demo blocks in accepted drafts
- `ContentPlan`: Elixir map frontmatter parser module
- `Dockerfile.docs`: Docker image for Linux-only erlexec builds
- `mix jido_lib.github.docs.pipeline`: Single content plan pipeline task
- `mix jido_lib.github.docs.generate_all`: Batch pipeline runner

Pipeline Test Results
All 27 DAG nodes execute end-to-end. Source grounding successfully reads both `Jido.Agent` (9KB) and `Jido.Action` (9KB). Compiler gate correctly catches compilation errors and feeds traces to the critic. Writer/critic revision loop operates correctly (V1 → gate → critic → V2 → gate → final decision).

Test plan
- … `--provider claude` until writer produces accepted output

🤖 Generated with Claude Code