Prepare Wardwright 0.0.11 release by bglusman · Pull Request #73 · bglusman/wardwright

bglusman · 2026-05-24T22:54:06Z

Scope

Prepares the next public Wardwright release as 0.0.11 instead of 0.1.0.

This branch now includes:

release/package/docs retargeting to 0.0.11
Jido framework dogfood smoke coverage and the local Gemma authoring dogfood recipe
the framework-adapter recipe/support documentation from the release-prep branch
a release smoke proving a real HTTP API plus MCP authoring/debugging loop without UI scraping
CI hardening for the secret-scan checkout range after GitHub Actions runner-token fetch failures

API/MCP proof run

The new smoke activates a canned Wardwright model through POST /v1/policy-authoring/wardwright-models, calls it through /v1/chat/completions, captures x-wardwright-receipt-id, discovers MCP tools through tools/list, and loads the resulting trace through load_control_debugger_trace.

This proves the core authoring/debugging loop can be driven by agents over API/MCP. It does not claim every HTTP scenario-management endpoint is exposed as MCP, and it does not claim native framework state or exact replay fidelity beyond the tested surfaces.

Validation

Local validation:

cd app && MIX_ENV=test mise exec -- mix test --no-compile test/mcp_authoring_test.exs test/agent_adapter_identity_test.exs test/agent_adapter_recording_test.exs test/local_gemma_authoring_recipe_test.exs test/jido_adapter_smoke_test.exs -> 26 passed
mise run check:docs -> passed
mise check -> 458 passed, 6 excluded; docs/map/style/type/browser checks passed
commit hooks and pre-push hooks -> app tests/docs/gitleaks/mise check passed
mise run package:smoke:darwin-arm64 -> Burrito binary built and printed 0.0.11

GitHub validation on the current head:

app -> passed
docs -> passed
packaged smoke -> passed
gitleaks -> passed
zizmor -> passed
aggregate test -> passed

Claim limits

Framework adapters remain recipes/smokes, not published native framework packages.
Jido is dogfooded in-app, but broader Jido native tool/stream/state fidelity is not claimed.
Some scenario-management operations remain HTTP-only; MCP covers the core authoring/debugging loop and exposed tools documented in the agent authoring guide.
The Darwin ARM64 artifact was built and smoked locally; release tags/Homebrew publication are not done by this PR itself.

sourcery-ai · 2026-05-24T22:54:13Z

Reviewer's Guide

Prepares the Wardwright 0.1.0 stable release by introducing a Wardwright-hosted server-tool framework for non-streaming Chat Completions (built-in policy cache status, trusted Dune functions, and trusted BEAM modules), enriching tool-context metadata with execution location/visibility, updating the OpenAPI contract and docs to the 0.1.0 line and honest framework claims, and wiring the new server-tools path into the completion pipeline alongside version and CI workflow updates.

Sequence diagram for non-streaming Chat Completions with Wardwright-hosted server tools

sequenceDiagram
  actor Client
  participant Router as Wardwright.Router
  participant ServerTools as Wardwright.ServerTools
  participant Core as Wardwright
  participant Provider

  Client->>Router: POST /v1/chat/completions
  Router->>ServerTools: complete_selected_model(selected_model, request, config)
  alt tools disabled or stream==true
    ServerTools->>Core: complete_selected_model(selected_model, request, config)
    Core-->>ServerTools: first_response
    ServerTools-->>Router: first_response
  else tools enabled and non-streaming
    ServerTools->>ServerTools: configured_tools(config)
    ServerTools->>ServerTools: inject_tools(request, tools)
    ServerTools->>Core: complete_selected_model(selected_model, request_with_tools, config)
    Core-->>ServerTools: first_response (tool_calls[])
    alt matching server tool requested
      ServerTools->>ServerTools: execute_tool(tool_call, tool, request, config)
      opt builtin tool
        ServerTools->>Wardwright.PolicyCache: status()
        Wardwright.PolicyCache-->>ServerTools: policy_cache_status
      end
      opt dune tool
        ServerTools->>Wardwright.PolicySandbox.Dune: eval_snippet(source, input, limits)
        Wardwright.PolicySandbox.Dune-->>ServerTools: result
      end
      opt beam_module tool
        ServerTools->>BeamModule: run(arguments, %{config, request})
        BeamModule-->>ServerTools: result | {:error, reason}
      end
      ServerTools->>Core: complete_selected_model(selected_model, followup_request, config)
      Core-->>ServerTools: second_response
      ServerTools->>ServerTools: add_server_tool_metadata(second_response, first_response, execution)
      ServerTools-->>Router: final_response_with_metadata
    else no matching server tool
      ServerTools-->>Router: first_response
    end
  end
  Router-->>Client: JSON completion with provider_metadata.wardwright_server_tools[]

File-Level Changes

Change	Details	Files
Introduce Wardwright-hosted server-tools framework and integrate it into non-streaming Chat Completions.	Add Wardwright.ServerTools module to inject configured server tools into chat-completions requests, run a single model-requested server-tool call, and merge execution metadata back into provider_metadata and latency accounting. Support three server-tool engines: a builtin wardwright_policy_cache_status tool, trusted local Dune function tools (inline source or snippet_id with limit options), and trusted BEAM module tools loaded from .ex/.exs/.erl/.beam paths via a common spec/0 and run/2 behaviour. Normalize server_tools configuration in Wardwright config normalization so tools can be declared as strings, objects with engines, or with implicit engine inference; accept elixir_module as an alias for beam_module and filter out disabled/invalid tools. Expose Wardwright.ServerTools.Behaviour for BEAM extensions and ensure BEAM tools are loaded safely (require_file, Erlang compile, or beam load) and module selection favors configured module names with run/2. Update router to call Wardwright.ServerTools.complete_selected_model so the server-tool loop wraps the existing completion path for non-streaming calls only.	`app/lib/wardwright/server_tools.ex` `app/lib/wardwright/server_tools/behaviour.ex` `app/lib/wardwright.ex` `app/lib/wardwright/router.ex`
Extend tool-context policy metadata with execution location and visibility level, and wire through core and reference implementations.	Add execution_location and visibility_level fields to tool_context contract and normalization, deriving them from tool namespace/source via new Gleam core functions. Update Wardwright.ToolContext.normalize to compute primary_tool, execution_location, and visibility_level for both caller-metadata and inferred contexts, delegating to :wardwright@tool_context_core. Implement execution_location/visibility_level in Gleam tool_context_core and keep Elixir reference implementation in sync; add tests to assert parity. Extend tool_context tests to assert new fields for client tools, and add scenarios covering future wardwright_hosted and provider-declared tools.	`contracts/tool-context-policy-contract.md` `app/lib/wardwright/tool_context.ex` `app/src/wardwright/tool_context_core.gleam` `app/src/wardwright/elixir_reference/tool_context_core_reference.exs` `app/test/tool_context_test.exs` `app/test/gleam_policy_core_test.exs`
Add test and test-support coverage for the new server-tool behavior against the OpenAI-compatible test provider.	Extend the streaming provider test harness to recognize server tool initiation/result loops, emitting tool_calls for wardwright_policy_cache_status, dune_echo_tool, and beam_reverse_tool, and returning distinct final assistant messages once tool results are sent back. Add integration tests that configure server_tools in unit_policy_config, seed the policy cache, and assert that non-streaming /v1/chat/completions calls trigger Wardwright-hosted server tools and record wardwright_server_tools metadata with expected fields. Add tests for Dune server tools that define an echo tool with inline source and for BEAM module tools that are written to a temporary .exs file implementing the Wardwright.ServerTools.Behaviour contract, verifying result_metadata echoes and reversals.	`app/test/stream_provider_transport_test.exs` `app/test_support/router_case.ex`
Update OpenAPI contract and documentation to describe server_tools, clarify framework adapter claims, and move from 0.1.0-rc.1 to the 0.1.0 release line.	Extend the OpenAPI config/test schemas with a server_tools array and a ServerTool schema (engine, name, parameters, source/snippet_id/input, module/path, enabled, execution_location, visibility_level), and add allowed_tools to GovernanceRule kinds. Update tool-context and feature-spikes documentation to describe the new Wardwright-hosted server-tool surface, including execution_location/visibility semantics and explicit limits (non-streaming only, no sandbox, no broad hidden tools). Refresh framework-adapter docs, tutorial, vision, provider-credentials, packaging, and site index to reference the 0.1.0 release instead of 0.0.10/0.1.0-rc.1; incorporate results from post-RC Proxmox/.NET and streaming checks, and keep claims conservative about streaming, tool fidelity, and state. Add a Ralph run log describing the post-RC framework gap mitigation (Proxmox LXC, NuGet smoke, streaming checks) and explicitly document remaining gaps and follow-ups.	`contracts/openapi.yaml` `docs/tool-context-policy.md` `docs/feature-spikes.md` `docs/framework-adapters.md` `docs/tutorial-news-monitor-agent.md` `docs/packaging.md` `docs/index.md` `docs/vision.md` `docs/provider-credentials.md` `docs/agent-authoring.md` `docs/agent-adapters.md` `docs/ralph-runs/framework-adapter-validation-loop-supervisor.md` `README.md`
Finalize 0.1.0 release wiring, versions, and CI workflow pinning.	Bump mix project version to 0.1.0 and update agent adapter identity tests and adapter pack modules to report adapter_version 0.1.0, including identity TTL tweaks in recording tests. Update the release GitHub Actions workflow to pin actions/checkout, upload-artifact, and download-artifact to specific SHAs for reproducibility and security. Adjust documentation and README install snippets, packaging notes, and agent adapter status sections to reference v0.1.0 as the stable release and describe its feature set and limits.	`app/mix.exs` `.github/workflows/wardwright-release.yml` `app/test/agent_adapter_identity_test.exs` `app/test/agent_adapter_recording_test.exs` `app/lib/wardwright/agent_adapters/claude_code_pack.ex` `app/lib/wardwright/agent_adapters/omp_pack.ex` `app/lib/wardwright/agent_adapters/pi_pack.ex` `README.md` `docs/packaging.md` `docs/agent-adapters.md`

Tips and commands

Interacting with Sourcery

Trigger a new review: Comment @sourcery-ai review on the pull request.
Continue discussions: Reply directly to Sourcery's review comments.
Generate a GitHub issue from a review comment: Ask Sourcery to create an
issue from a review comment by replying to it. You can also reply to a
review comment with @sourcery-ai issue to create an issue from it.
Generate a pull request title: Write @sourcery-ai anywhere in the pull
request title to generate a title at any time. You can also comment
@sourcery-ai title on the pull request to (re-)generate the title at any time.
Generate a pull request summary: Write @sourcery-ai summary anywhere in
the pull request body to generate a PR summary at any time exactly where you
want it. You can also comment @sourcery-ai summary on the pull request to
(re-)generate the summary at any time.
Generate reviewer's guide: Comment @sourcery-ai guide on the pull
request to (re-)generate the reviewer's guide at any time.
Resolve all Sourcery comments: Comment @sourcery-ai resolve on the
pull request to resolve all Sourcery comments. Useful if you've already
addressed all the comments and don't want to see them anymore.
Dismiss all Sourcery reviews: Comment @sourcery-ai dismiss on the pull
request to dismiss all existing Sourcery reviews. Especially useful if you
want to start fresh with a new review - don't forget to comment
@sourcery-ai review to trigger a new review!

Customizing Your Experience

Access your dashboard to:

Enable or disable review features such as the Sourcery-generated pull request
summary, the reviewer's guide, and others.
Change the review language.
Add, remove or edit custom review instructions.
Adjust other review settings.

Getting Help

Contact our support team for questions or feedback.
Visit our documentation for detailed guides and information.
Keep in touch with the Sourcery team by following us on X/Twitter, LinkedIn or GitHub.

sourcery-ai

Hey - I've left some high level feedback:

The server-tool normalization logic is split between Wardwright.normalize_server_tools/1 and Wardwright.ServerTools.configured_tools/1 (including separate present?/1 and engine inference branches); consider consolidating this into a single shared normalization path to avoid drift between config serialization and runtime behavior.
In Wardwright.ServerTools, the module loading paths (load_tool_path/1, select_tool_module/2, compile_erlang_tool/1, load_beam_tool/1) have several failure modes that are collapsed into generic {:error, reason} tuples; consider adding structured error tagging or logging so operators can more easily diagnose why a given BEAM server tool failed to load or was ignored.

Prompt for AI Agents

Please address the comments from this code review:

## Overall Comments
- The server-tool normalization logic is split between `Wardwright.normalize_server_tools/1` and `Wardwright.ServerTools.configured_tools/1` (including separate `present?/1` and engine inference branches); consider consolidating this into a single shared normalization path to avoid drift between config serialization and runtime behavior.
- In `Wardwright.ServerTools`, the module loading paths (`load_tool_path/1`, `select_tool_module/2`, `compile_erlang_tool/1`, `load_beam_tool/1`) have several failure modes that are collapsed into generic `{:error, reason}` tuples; consider adding structured error tagging or logging so operators can more easily diagnose why a given BEAM server tool failed to load or was ignored.

Sourcery is free for open source - if you like our reviews please consider sharing them ✨

_{Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.}

gemini-code-assist

Code Review

This pull request introduces a server-side tool framework to Wardwright, enabling models to execute built-in tools, Dune snippets, and custom BEAM modules. It also updates the project version to v0.1.0 and adds execution_location and visibility_level metadata to the tool context. Feedback on the new ServerTools module highlights several critical issues: the tool loop only handles the first tool call, violating the OpenAI API contract for multiple calls; the BEAM module loader suffers from performance bottlenecks and a bug where tools fail to load on subsequent requests due to Code.require_file behavior; and the use of JSON.encode! on tool results risks runtime crashes when encountering atoms.

Copilot

Pull request overview

Prepares the Wardwright 0.1.0 stable release line by introducing a minimal Wardwright-hosted server-tool execution surface for non-streaming Chat Completions, extending tool-context provenance metadata, and updating contracts/docs/workflows/version strings from RC/draft wording to 0.1.0.

Changes:

Add Wardwright-hosted server tools for non-streaming Chat Completions (builtin + trusted local Dune + trusted local BEAM modules) and wire the router to use this path.
Extend tool-context normalization to include execution_location and visibility_level, and update tests/contracts accordingly.
Update OpenAPI metadata, release docs, and workflow pins for the 0.1.0 release line.

Reviewed changes

Copilot reviewed 32 out of 32 changed files in this pull request and generated 6 comments.

Show a summary per file

File	Description
README.md	Updates install/version references to `v0.1.0`.
docs/vision.md	Updates status banner to `v0.1.0` wording.
docs/tutorial-news-monitor-agent.md	Updates validation notes and framework evidence wording for `0.1.0`.
docs/tool-context-policy.md	Documents Wardwright-hosted server tools and execution/visibility semantics.
docs/ralph-runs/framework-adapter-validation-loop-supervisor.md	Adds post-RC evidence log slice for .NET + streaming probes.
docs/provider-credentials.md	Updates release-line wording from RC to stable.
docs/packaging.md	Updates packaging/version guidance and documents server-tool slice.
docs/index.md	Updates docs landing status banner for `v0.1.0`.
docs/framework-adapters.md	Updates framework adapter evidence/limits wording for `0.1.0`.
docs/feature-spikes.md	Adds `Wardwright-hosted server tools` spike entry and guardrails.
docs/agent-authoring.md	Updates “release line” wording.
docs/agent-adapters.md	Updates adapter docs to `0.1.0` release line.
contracts/tool-context-policy-contract.md	Adds `execution_location` and `visibility_level` fields to the contract doc.
contracts/openapi.yaml	Bumps API version to `0.1.0` and adds `server_tools` / `ServerTool` schema.
app/test/tool_context_test.exs	Asserts new tool-context fields in normalized output.
app/test/stream_provider_transport_test.exs	Adds end-to-end tests for builtin/Dune/BEAM server tools.
app/test/gleam_policy_core_test.exs	Adds Gleam reference assertions for execution/visibility helpers.
app/test/agent_adapter_recording_test.exs	Updates adapter version assertions and TTL defaults in helper.
app/test/agent_adapter_identity_test.exs	Updates adapter version assertions to `0.1.0`.
app/test_support/router_case.ex	Extends test provider to simulate server-tool calls/results.
app/src/wardwright/tool_context_core.gleam	Implements execution/visibility derivation helpers in Gleam core.
app/src/wardwright/elixir_reference/tool_context_core_reference.exs	Mirrors Gleam helpers in Elixir reference implementation.
app/mix.exs	Sets application version to `0.1.0`.
app/lib/wardwright/tool_context.ex	Adds `execution_location` and `visibility_level` normalization.
app/lib/wardwright/server_tools/behaviour.ex	Introduces behaviour for trusted BEAM server tools (`spec/0`, `run/2`).
app/lib/wardwright/server_tools.ex	Implements server-tool registry, tool injection, one-loop execution, and receipt metadata.
app/lib/wardwright/router.ex	Routes non-streaming completions through `Wardwright.ServerTools.complete_selected_model/3`.
app/lib/wardwright/agent_adapters/pi_pack.ex	Updates adapter version constant to `0.1.0`.
app/lib/wardwright/agent_adapters/omp_pack.ex	Updates adapter version constant to `0.1.0`.
app/lib/wardwright/agent_adapters/claude_code_pack.ex	Updates adapter version constant to `0.1.0`.
app/lib/wardwright.ex	Normalizes `server_tools` config into the public model config surface.
.github/workflows/wardwright-release.yml	Pins GitHub Actions to specific SHAs for reproducible release builds.

Add request-side tool mediation controls

bglusman added 3 commits May 24, 2026 17:16

Record framework gap mitigation probes

f500e77

Prepare 0.1.0 server tool release slice

9cc0a36

Mark API contract as 0.1.0

6c322b2

Copilot AI review requested due to automatic review settings May 24, 2026 22:54

Copilot started reviewing on behalf of bglusman May 24, 2026 22:54 View session

sourcery-ai Bot reviewed May 24, 2026

View reviewed changes

gemini-code-assist Bot reviewed May 24, 2026

View reviewed changes

Comment thread app/lib/wardwright/server_tools.ex

Comment thread app/lib/wardwright/server_tools.ex Outdated

Comment thread app/lib/wardwright/server_tools.ex

Comment thread app/lib/wardwright/server_tools.ex

Copilot AI reviewed May 24, 2026

View reviewed changes

bglusman added 12 commits May 24, 2026 20:01

Add request-side tool mediation controls

ede3498

Expand dynamic tool mediation integration coverage

4180108

Address tool mediation review feedback

4b4bd6b

Document tool mediation review coverage

3f4cbd7

Merge pull request #74 from bglusman/tool-mediation-control-plane

c3966c4

Add request-side tool mediation controls

Address server tool release feedback

9af58bb

Expose server tool visibility in admin UI

db52be9

Improve server tool UI on mobile

3b7bff9

Expose server tool advertisement controls

40a49d2

Add admin model config design review

6532af0

Add freeform admin UX concepts

f386d4e

Add holistic admin UX concepts

b168d56

bglusman mentioned this pull request May 25, 2026

UX concept voting: Wardwright admin exploration #75

Open

bglusman added 8 commits May 25, 2026 10:26

Add admin UX exploration gallery

b0c25d8

Add app-native admin UX concept routes

d479ef5

Make admin UX exploration live

be89d05

Expand admin UX exploration concepts

d2f54f7

Make UX concepts standalone admin experiences

c507a7c

Fix UX concept anchor navigation

112d2cb

Fix UX exploration responsive layout bugs

f97924f

Differentiate admin UX exploration concepts

bba6f3a

bglusman added 4 commits May 25, 2026 23:29

Add Jido framework dogfood smoke

c81ae1d

Add local Gemma authoring dogfood recipe

06ba54a

Merge Jido dogfood into release prep

5c39de0

Retarget release prep to 0.0.11

972eaf6

bglusman changed the title ~~Prepare Wardwright 0.1.0 release~~ Prepare Wardwright 0.0.11 release May 26, 2026

Harden secret scan checkout range

7da2735

bglusman merged commit cfc56f5 into main May 26, 2026
7 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Prepare Wardwright 0.0.11 release#73

Prepare Wardwright 0.0.11 release#73
bglusman merged 28 commits into
mainfrom
release/docs-and-version-prep

bglusman commented May 24, 2026 •

edited

Loading

Uh oh!

sourcery-ai Bot commented May 24, 2026 •

edited

Loading

Interacting with Sourcery

Customizing Your Experience

Getting Help

Uh oh!

sourcery-ai Bot left a comment

Uh oh!

gemini-code-assist Bot left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Copilot AI left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

bglusman commented May 24, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Scope

API/MCP proof run

Validation

Claim limits

Uh oh!

sourcery-ai Bot commented May 24, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Reviewer's Guide

Sequence diagram for non-streaming Chat Completions with Wardwright-hosted server tools

File-Level Changes

Interacting with Sourcery

Customizing Your Experience

Getting Help

Uh oh!

sourcery-ai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

bglusman commented May 24, 2026 •

edited

Loading

sourcery-ai Bot commented May 24, 2026 •

edited

Loading