feat(docs-pipeline): source grounding, compiler gate, and pipeline fixes #2

Merged: mikehostetler merged 9 commits into agentjido:main from jmanhype:feat/docs-pipeline-grounding-and-compiler-gate on Feb 28, 2026
@jmanhype

Summary

Implements the Autonomous Self-Demonstrating Simulation Engine for the DocumentationWriterBot. The pipeline now:

  • Grounds writer in actual source code: GroundSourceCode reads lib/jido/agent.ex and lib/jido_action.ex from cloned repos
  • Runs a deterministic compiler gate: EvaluateLivebookDraft extracts all ```elixir blocks, combines them into a .exs script, and executes it in the sprite sandbox. Real compilation errors feed into the critic for targeted revision instructions
  • Propagates content_metadata through the full 23-node DAG without dropping docs-specific keys at shared action boundaries
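
The extraction step of the compiler gate can be sketched as follows. This is a minimal illustration, not the PR's EvaluateLivebookDraft code: the module and function names are hypothetical, and it assumes fenced blocks open with the exact language tag elixir followed by a newline.

```elixir
defmodule GateSketch do
  # Hypothetical helper names; not the PR's EvaluateLivebookDraft code.

  # Returns the body of every elixir-fenced block, in document order.
  def extract_blocks(markdown) do
    # Build the three-backtick fence at runtime so this example can itself
    # live inside a fenced block.
    fence = String.duplicate("`", 3)
    regex = Regex.compile!(fence <> "elixir\\n(.*?)" <> fence, "s")

    regex
    |> Regex.scan(markdown, capture: :all_but_first)
    |> Enum.map(fn [code] -> String.trim(code) end)
  end

  # Joins the blocks into one script body, suitable for writing to a .exs file.
  def to_script(markdown) do
    markdown |> extract_blocks() |> Enum.join("\n\n")
  end
end
```

Concatenating cells this way means later cells can see bindings from earlier ones, which matches how a reader would run the Livebook top to bottom.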

Bug Fixes (8 total)

  1. Enum.find_value source grounding bug: cat file | head -n 1000 exits 0 even when cat fails (head's exit code wins), so error messages like "cat: lib/jido_action.ex: No such file" were accepted as valid content. Fixed with a test -f gate plus a text guard.
  2. Compiler gate output capture: A non-zero exit from elixir script.exs returned {:error, %Shell.Error{}}, which didn't match the expected pattern, losing the actual compilation trace. Fixed with || true to always capture output.
  3. Trace truncation: Mix.install output (37KB of package resolution) blew up the critic prompt payload past Sprites API limits. The trace is now truncated to the last 4KB.
  4. Error detection false positive: has_error_indicators? matched bare "failed" in normal Jido runtime log lines during successful execution. Narrowed to specific Elixir error patterns.
  5. Observer stop crash: rescue doesn't catch OTP :exit signals from GenServer.stop. Changed to catch :exit.
  6. content_metadata propagation: Shared @pipeline_keys didn't include docs-specific keys, so shared actions dropped them. Added 10 keys to both parent and child helpers.
  7. ParseContentBrief metadata: Overwrote intake metadata with an empty %{} when the brief had no frontmatter. Now preserves existing metadata.
  8. @max_lines overflow: 1000 lines × 2 files = 58KB grounded context → 414 URI Too Long. Reduced to 250 lines.
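
Fix 1 is easy to reproduce locally. The sketch below assumes a POSIX sh on the PATH and uses System.cmd/2 rather than the sprite shell the pipeline actually runs in; the module and function names are hypothetical. It shows why the pipe hides cat's failure and how a test -f gate plus a non-empty text guard rejects the missing file.

```elixir
defmodule GroundingSketch do
  # Hypothetical names; the real pipeline runs commands in the sprite sandbox.

  # Naive read: the pipeline's exit status is head's, so a missing file
  # still reports exit 0 and cat's error text comes back as "content".
  def naive_read(path, max_lines \\ 250) do
    script = ~S(cat "$1" 2>&1 | head -n "$2")
    System.cmd("sh", ["-c", script, "sh", path, to_string(max_lines)])
  end

  # Guarded read: test -f short-circuits before anything is printed, and the
  # guard clause rejects empty output.
  def guarded_read(path, max_lines \\ 250) do
    script = ~S(test -f "$1" && head -n "$2" "$1")

    case System.cmd("sh", ["-c", script, "sh", path, to_string(max_lines)]) do
      {text, 0} when text != "" -> {:ok, text}
      _ -> :error
    end
  end
end
```

With the guarded version, a lookup such as Enum.find_value over candidate repos can skip :error results and fall through to the repo that actually contains the file.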

New Files (9)

  • GroundSourceCode — Extracts source code from repos for semantic grounding
  • EvaluateLivebookDraft — Compiler gate: runs extracted code in sprite sandbox
  • ParseContentBrief — Parses Elixir map frontmatter from content plans
  • ResolveOutputPath — Resolves output path from content metadata
  • EmbedInteractiveDemo — Embeds interactive demo blocks in accepted drafts
  • ContentPlan — Elixir map frontmatter parser module
  • Dockerfile.docs — Docker image for Linux-only erlexec builds
  • mix jido_lib.github.docs.pipeline — Single content plan pipeline task
  • mix jido_lib.github.docs.generate_all — Batch pipeline runner

Pipeline Test Results

All 27 DAG nodes execute end-to-end. Source grounding successfully reads both Jido.Agent (9KB) and Jido.Action (9KB). Compiler gate correctly catches compilation errors and feeds traces to critic. Writer/critic revision loop operates correctly (V1 → gate → critic → V2 → gate → final decision).

Test plan

  • Pipeline runs all 27 nodes without infrastructure errors
  • Source grounding reads files from correct repos (not cat error messages)
  • Compiler gate captures actual compilation errors (not SYSTEM ERROR)
  • Trace truncation keeps critic prompt under Sprites API limit
  • Error detection doesn't false-positive on runtime log lines
  • Run pipeline with --provider claude until writer produces accepted output
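
The error-detection check in the test plan can be illustrated with a narrowed matcher. The PR does not list the exact patterns it settled on, so the two regexes below are assumptions: one for runtime exception banners such as "** (MatchError)" and one for compiler diagnostics that start a line with "error: ". The sample log line is also made up.

```elixir
defmodule ErrorScanSketch do
  # Assumed patterns; the PR only says the match was narrowed to
  # "specific Elixir error patterns".
  def has_error_indicators?(output) do
    patterns = [
      # Exception banner, e.g. "** (CompileError) ..." or "** (Foo.BarError) ..."
      ~r/^\*\* \(\w+(\.\w+)*\)/m,
      # Compiler diagnostic at the start of a line
      ~r/^error: /m
    ]

    Enum.any?(patterns, &Regex.match?(&1, output))
  end
end
```

A bare substring check for "failed" would flag the first sample in the usage below, while the anchored patterns do not:

```elixir
ErrorScanSketch.has_error_indicators?("[info] retry failed, falling back")
ErrorScanSketch.has_error_indicators?("** (MatchError) no match of right hand side value: 1")
```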

🤖 Generated with Claude Code

…e fixes

Implements the Autonomous Self-Demonstrating Simulation Engine architecture
for the DocumentationWriterBot. Key changes:

Source Code Grounding:
- Add GroundSourceCode action to extract verbatim API source from repos
- Fix Enum.find_value bug where `cat` error messages from wrong repo were
  accepted as valid content (test -f gate + text guard)
- Reduce @max_lines to 250 to avoid 414 URI Too Long from Sprites API

Compiler Gate (EvaluateLivebookDraft):
- New action that extracts ```elixir blocks and runs them in sprite sandbox
- Fix output capture: append `|| true` so non-zero exit codes don't lose
  the actual compilation error trace
- Add trace truncation (4KB tail) so Mix.install noise doesn't blow up
  the critic prompt payload
- Fix has_error_indicators? false positive on bare "failed" match
- Increase eval timeout to 120s for Mix.install + compilation

Pipeline Workflow:
- Wire ParseContentBrief, ResolveOutputPath, GroundSourceCode,
  EvaluateLivebookDraft (V1+V2), and EmbedInteractiveDemo into DAG
- Add content_metadata and prompt_overrides to build_intake
- Add docs-specific keys to shared @pipeline_keys in helpers.ex
- Fix ParseContentBrief to preserve intake metadata when brief has
  no frontmatter (already split by mix task)

Writer/Critic Quality:
- Enhanced writer prompt with NimblePublisher format enforcement,
  frontmatter template, NimbleOptions warnings, Jido.Agent.set/2 guidance
- Enhanced critic prompt with compiler trace section for deterministic
  error feedback
- Fix Observer.stop crash (rescue -> catch :exit for GenServer.stop)

Infrastructure:
- Add Dockerfile.docs for Linux-only erlexec builds
- Add mix jido_lib.github.docs.pipeline task with --max-revisions,
  --provider, --dry-run options
- Add ContentPlan module for Elixir map frontmatter parsing

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
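
The rescue -> catch :exit change mentioned above follows from how Elixir separates exceptions from exits: rescue only handles raised exceptions, while GenServer.stop signals a missing or dead process with an exit. A minimal sketch, using a hypothetical wrapper name rather than the PR's Observer module:

```elixir
defmodule ObserverSketch do
  # Hypothetical wrapper; shows only the catch clause shape.
  def safe_stop(server) do
    try do
      GenServer.stop(server)
    catch
      # GenServer.stop exits (it does not raise) when the process is gone,
      # so a rescue clause here would never run.
      :exit, reason -> {:exit, reason}
    end
  end
end
```

Calling safe_stop on a name that is not registered returns an {:exit, {:noproc, _}} tuple instead of crashing the caller.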

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 5924ae58d2

[frontmatter_str, body] ->
  if String.starts_with?(String.trim(frontmatter_str), "%{") do
    try do
      {metadata, _binding} = Code.eval_string(frontmatter_str)
P1: Parse frontmatter without evaluating arbitrary Elixir

ContentPlan.parse_string!/1 executes frontmatter via Code.eval_string/1, so any content plan processed by mix jido_lib.github.docs.pipeline, generate_all, or scan_drift can run arbitrary BEAM code during parsing (for example System.cmd/3) instead of being treated as data. This is a code-execution path triggered by file contents and is unsafe in automation/CI contexts that ingest untrusted branches or PR content.
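
One way to address this, sketched under assumptions rather than taken from the PR: parse the frontmatter to an AST with Code.string_to_quoted/1 and accept only literal maps, lists, 2-tuples, atoms, strings, and numbers. The SafeFrontmatter module name and its error atom are hypothetical.

```elixir
defmodule SafeFrontmatter do
  # Hypothetical replacement for an eval-based parser: nothing is executed.
  def parse(source) when is_binary(source) do
    with {:ok, ast} <- Code.string_to_quoted(source),
         {:ok, map} when is_map(map) <- literal(ast) do
      {:ok, map}
    else
      _ -> {:error, :invalid_frontmatter}
    end
  end

  # Atoms, strings, and numbers appear in the AST as themselves.
  defp literal(term) when is_atom(term) or is_binary(term) or is_number(term),
    do: {:ok, term}

  defp literal(list) when is_list(list), do: literal_all(list)

  # Map literal: every entry must be a plain key/value pair.
  defp literal({:%{}, _meta, pairs}) when is_list(pairs) do
    with true <- Enum.all?(pairs, &match?({_, _}, &1)),
         {:ok, kvs} <- literal_all(pairs) do
      {:ok, Map.new(kvs)}
    else
      _ -> :error
    end
  end

  # 2-tuples are literals in the AST.
  defp literal({left, right}) do
    with {:ok, l} <- literal(left), {:ok, r} <- literal(right), do: {:ok, {l, r}}
  end

  # Calls, variables, aliases, interpolation, and so on are all rejected.
  defp literal(_other), do: :error

  defp literal_all(items) do
    items
    |> Enum.reduce_while({:ok, []}, fn item, {:ok, acc} ->
      case literal(item) do
        {:ok, value} -> {:cont, {:ok, [value | acc]}}
        :error -> {:halt, :error}
      end
    end)
    |> case do
      {:ok, acc} -> {:ok, Enum.reverse(acc)}
      :error -> :error
    end
  end
end
```

A frontmatter value such as System.cmd("id", []) parses to a call node and is rejected instead of run. Note that Code.string_to_quoted/1 still creates atoms for unseen keys; its existing_atoms_only option is available if that matters for untrusted input.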


"""
#{String.trim(draft)}
#{fm_string}#{clean_draft}
P1: Avoid prepending a second frontmatter block to final guides

This path always prepends build_frontmatter/2 to clean_draft, but RunWriterPass now instructs both v1 and v2 drafts to start with their own %{...} frontmatter. As a result, accepted outputs systematically contain two frontmatter sections, which breaks the expected single-header Livebook/NimblePublisher format and leaves one metadata map rendered as body content.
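
One possible resolution, shown only as a sketch: prepend the generated frontmatter only when the draft does not already open with a %{ map. The function name is hypothetical, and whether the writer's frontmatter or the generated one should win is a design decision the review does not settle; this version keeps the writer's.

```elixir
defmodule FrontmatterSketch do
  # Hypothetical helper; fm_string stands in for the build_frontmatter/2 output.
  def ensure_single(draft, fm_string) do
    clean = String.trim_leading(draft)

    if String.starts_with?(clean, "%{") do
      # The draft already carries a frontmatter map; do not add another.
      clean
    else
      fm_string <> clean
    end
  end
end
```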

Riley Test and others added 8 commits February 28, 2026 12:26
Collapse the LLM's latent space with a concrete, complete Livebook
example directly in the V1 writer prompt. The template shows:

- Exact %{ frontmatter } → --- → # Title structure
- Mix.install as first code cell
- Variable-before-use pattern (prevents undefined variable errors)
- DemoAgent = Module assignment as final code block
- ExUnit test block with autorun: false + ExUnit.run()

V2 prompt reinforced with 8 absolute rules including "return the
COMPLETE file, not a diff" to prevent the 654-byte stub issue.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…late

Kino is 0.14.x not 2.x — the blanket "~> 2.0" for all ecosystem
packages caused Mix.install to fail with "kino ~> 2.0 doesn't match
any versions". Add a version_map with correct constraints.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
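
A version map with a fallback can be as small as the sketch below. Only the kino constraint comes from the commit message; the module name, the :jido example package, and the use of "~> 2.0" as the default are illustrative assumptions.

```elixir
defmodule DepsSketch do
  # Fallback requirement for packages with no explicit entry.
  @default_requirement "~> 2.0"

  # Per-package overrides; kino is on 0.14.x per the commit message.
  @version_map %{kino: "~> 0.14"}

  def requirement(package), do: Map.get(@version_map, package, @default_requirement)

  # Builds a dependency list in the shape Mix.install/1 accepts.
  def deps(packages), do: Enum.map(packages, &{&1, requirement(&1)})
end
```

Version.match?/2 confirms why the blanket constraint failed: a 0.14.x release satisfies "~> 0.14" but not "~> 2.0".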
…ace prefix

Two fixes to the compiler gate:

1. strip_mix_install_noise: Remove Mix.install dependency resolution
   and compilation output before running has_error_indicators?. Dep
   warnings (e.g., Timex struct update warning) were triggering false
   positives. Now only the user's code output is analyzed.

2. truncate_trace preserves prefix: The BEAM SUCCESS/FAILED first line
   is now kept through truncation so the critic knows the verdict even
   when the trace exceeds the 4KB limit.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
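
Prefix-preserving truncation might look like the following sketch. The module name and the marker text are assumptions; the 4KB tail and the kept first line follow the commit message. The result can exceed the limit by the length of the first line plus the marker.

```elixir
defmodule TraceSketch do
  @max_bytes 4096
  @marker "\n...[truncated]...\n"

  # Keeps the first line (the BEAM SUCCESS/FAILED verdict) and the last
  # `max` bytes of the trace, where the actual error usually sits.
  def truncate(trace, max \\ @max_bytes) do
    if byte_size(trace) <= max do
      trace
    else
      [verdict | _] = String.split(trace, "\n", parts: 2)
      tail = trace |> binary_part(byte_size(trace) - max, max) |> valid_tail()
      verdict <> @marker <> tail
    end
  end

  # A byte-level cut can land inside a multi-byte character; drop leading
  # bytes until the tail is valid UTF-8 again.
  defp valid_tail(bin) do
    if String.valid?(bin) do
      bin
    else
      valid_tail(binary_part(bin, 1, byte_size(bin) - 1))
    end
  end
end
```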
…do.Agent.new()

The `use Jido.Agent` macro generates new/0, cmd/2, set/2 on the child
module. The previous golden template incorrectly taught the LLM to call
Jido.Agent.new(Module) which triggers FunctionClauseError. Fixed both
V1 and V2 writer prompts to use the correct module-based API pattern.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ctives}

The golden template incorrectly taught {:ok, updated, _directives} pattern
for cmd/2, causing MatchError at runtime since cmd/2 returns a plain
{agent, directives} 2-tuple. Fixed in grounded_clause, golden template
code examples, and both V1/V2 ABSOLUTE RULES.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The LLM was generating actions that returned absolute values instead of
computing from current state (e.g., %{count: amount} instead of
%{count: context.state.count + amount}). Added a Defining Actions section
to the golden template showing the correct context.state access pattern.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
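
The difference between the two return shapes is easiest to see in isolation. The sketch below is plain Elixir with no Jido dependency: the module name is hypothetical, and the run/2 signature and {:ok, map} return shape are assumptions standing in for an action callback.

```elixir
defmodule IncrementSketch do
  # Stand-in for an action's run function; not the Jido.Action API itself.
  # Computes the new count from the current state instead of returning
  # the raw amount.
  def run(%{amount: amount}, context) do
    {:ok, %{count: context.state.count + amount}}
  end
end
```

With a current count of 5 and an amount of 2, this yields a count of 7, whereas the %{count: amount} form would reset the count to 2.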
In Elixir, DemoAgent = CounterAgent is a pattern match between two
different atoms (not variable assignment), causing MatchError. These
Livebook visualizer assignments are valid in Livebook cells but fail in
concatenated .exs scripts. Strip them before evaluation.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
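
Stripping those lines before evaluation could be done with a line filter like the one below. The regex and names are assumptions, not the PR's implementation: it removes any line that consists solely of one module alias matched against another.

```elixir
defmodule StripSketch do
  # Drops lines of the form `SomeAlias = Other.Alias`, which raise
  # MatchError when the two aliases are different atoms.
  def strip_alias_assignments(script) do
    alias_part = "[A-Z]\\w*(\\.[A-Z]\\w*)*"
    pattern = Regex.compile!("^\\s*" <> alias_part <> "\\s*=\\s*" <> alias_part <> "\\s*$")

    script
    |> String.split("\n")
    |> Enum.reject(&Regex.match?(pattern, &1))
    |> Enum.join("\n")
  end
end
```

Ordinary variable bindings such as count = 1 start with a lowercase letter and are left alone.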
The demo prompt included the full docs_brief (~21KB of grounded source
context) which triggered 414 URI Too Long from the Sprites API. The
error was silently swallowed. Fixed by:
1. Dropping docs_brief from demo prompt (only needs the draft)
2. Truncating draft to last 4KB if large
3. Adding Logger.warning on failure for diagnosis
4. Adding API pattern rules to demo prompt

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@mikehostetler mikehostetler merged commit 8483e9f into agentjido:main Feb 28, 2026