Add builtin qmd documentation search tool (experimental) #22183
- Add QmdToolConfig type with docs glob patterns field
- Add DefaultQmdVersion (0.0.16) and QmdArtifactName constants
- Add parseQmdTool parser function
- Register qmd as builtin tool in tools_parser, tools_types, mcp_config_validation
- Activation job: install @tobilu/qmd, build index, upload qmd-index artifact
- Agent job: download qmd-index artifact before MCP setup
- MCP renderer: RenderQmdMCP for JSON/TOML (Node.js + npx serve-mcp)
- MCP setup generator: qmd added to standard MCP tools list
- Claude tools: qmd handled as wildcard MCP server (like serena)
- Docker: add node:lts-alpine pre-pull when qmd is configured
- System prompt: qmd_prompt.md injected when qmd tool is active
- Dependabot: track @tobilu/qmd version in npm deps
- Shell injection prevention: single-quote glob patterns in activation steps
- JSON schema: add qmd tool schema definition

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Agent-Logs-Url: https://github.com/github/gh-aw/sessions/e5f827de-afbd-4b2e-98e0-7a7d87ed547a
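The "single-quote glob patterns" item in the commit above can be illustrated with a small sketch. It assumes POSIX shell quoting rules; the helper name `shellSingleQuote` is hypothetical and is not taken from the gh-aw source.

```go
package main

import (
	"fmt"
	"strings"
)

// shellSingleQuote wraps s in single quotes for a POSIX shell.
// Inside single quotes nothing is expanded, so the only character that
// needs care is the single quote itself, rewritten as '\'' (close the
// quote, emit an escaped quote, reopen the quote).
// Hypothetical helper for illustration only.
func shellSingleQuote(s string) string {
	return "'" + strings.ReplaceAll(s, "'", `'\''`) + "'"
}

func main() {
	patterns := []string{
		"docs/**/*.md",        // ordinary glob
		"docs/$(whoami)/*.md", // would run a command if left unquoted
		"it's/*.md",           // contains a single quote
	}
	for _, p := range patterns {
		// echo stands in for whatever command receives the pattern.
		fmt.Println("echo " + shellSingleQuote(p))
	}
}
```

Quoting at render time keeps the pattern a single literal argument, so the shell never performs glob or command expansion on user-supplied text.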
@copilot mark as experimental
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com> Agent-Logs-Url: https://github.com/github/gh-aw/sessions/1dc55c60-1437-4497-a351-d1b7b8d88bfa
Done in abf31d6. The qmd tool now emits …

Warning: firewall rules blocked me from connecting to one or more addresses.
@copilot Every documentation collection can be associated with a "checkout" (same syntax as the top-level checkout). The default is the current repo. The checkout is what the paths apply to. Make it configurable the same way as checkout.
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com> Agent-Logs-Url: https://github.com/github/gh-aw/sessions/e9a784be-084b-40e0-9bf4-eb76dc81df6d
Done in e09c697. Each documentation collection can now carry its own checkout:

    tools:
      qmd:
        collections:
          - name: current-docs
            docs:
              - docs/**/*.md
          - name: other-docs
            docs:
              - docs/**/*.md
            checkout:
              repository: owner/other-repo
              path: ./other-repo  # optional; defaults to /tmp/gh-aw/qmd-checkout-<name>

The activation job emits an …
@copilot the QMD object will be structured as a set of checkouts and a set of searches in GitHub. Take the existing structure and move those entries under a sub-identifier named checkouts. The searches will be executed and added to the collection one by one.

@copilot the QMD object may also have a cache key, which means it will be stored in an action cache. This allows you, for example, to declare QMD with only a cache key and omit the indexing, using it as a read-only database.
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com> Agent-Logs-Url: https://github.com/github/gh-aw/sessions/e854d5ea-603f-40f9-8672-434f0246af32
Done in 8e48ff6:

    checkouts:
      - name: local-docs
        docs: [docs/**/*.md]
      - name: other-docs
        docs: [docs/**/*.md]
        checkout:
          repository: owner/other-repo
          path: ./other-repo
    searches:
      - query: "repo:owner/repo language:Markdown path:docs/"
        min: 1   # fail if fewer results
        max: 30  # default 30
        github-token: ${{ secrets.GITHUB_TOKEN }}
        # OR github-app: { app-id: ..., private-key: ... }

Each search step uses …
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com> Agent-Logs-Url: https://github.com/github/gh-aw/sessions/9b048214-2eee-4a65-b213-30364c2769b7
Done in ef071a0.

Read-only mode (cache-key only, no sources; use a pre-built cache as a database):

    tools:
      qmd:
        cache-key: "qmd-index-${{ hashFiles('docs/**') }}"

→ activation job: restore from cache → upload as artifact; all indexing steps skipped.

Build + cache mode (sources + cache-key; build once, reuse on subsequent runs):

    tools:
      qmd:
        checkouts:
          - name: docs
            docs: [docs/**/*.md]
        cache-key: "qmd-index-${{ hashFiles('docs/**') }}"

→ activation job: restore from cache → on miss: build index + save to cache → upload as artifact; all build steps carry …
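The two cache modes described in the reply above amount to a small decision table. The sketch below restates it in Go under stated assumptions: `qmdConfig` and `activationPlan` are hypothetical names, and the no-cache-key branch follows the original activation flow from the first commit (build index, upload artifact).

```go
package main

import "fmt"

// qmdConfig is a hypothetical, simplified view of the qmd tool settings.
type qmdConfig struct {
	HasSources bool   // true when checkouts or searches are declared
	CacheKey   string // empty when no cache-key is set
}

// activationPlan lists, in order, the activation-job steps implied by
// the configuration. The step names are descriptive labels only.
func activationPlan(c qmdConfig) []string {
	switch {
	case c.CacheKey != "" && !c.HasSources:
		// Read-only mode: the cache is used as a pre-built database.
		return []string{"restore from cache", "upload as artifact"}
	case c.CacheKey != "" && c.HasSources:
		// Build + cache mode: build only when the cache misses.
		return []string{
			"restore from cache",
			"on miss: build index + save to cache",
			"upload as artifact",
		}
	default:
		// Assumption: without a cache key the index is rebuilt each run.
		return []string{"build index", "upload as artifact"}
	}
}

func main() {
	configs := []qmdConfig{
		{CacheKey: "qmd-index-abc"},
		{HasSources: true, CacheKey: "qmd-index-abc"},
		{HasSources: true},
	}
	for _, c := range configs {
		fmt.Printf("%+v -> %v\n", c, activationPlan(c))
	}
}
```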
@copilot review all changes and clean up. Add a new documentation page in the reference section.

@copilot review gateway logs and fix qmd MCP config https://github.com/github/gh-aw/actions/runs/23414845250/job/68108714933#step:20:1
The gateway schema requires env as an object (key-value pairs), not
env_vars as an array. This caused the validation error:
additionalProperties 'env_vars' not allowed
- INDEX_PATH and NODE_LLAMA_CPP_GPU use \${VAR} so heredoc does not
expand them; the gateway resolves them from its own environment
- HOME uses ${HOME} (no backslash) so heredoc expands to /home/runner,
since HOME is not passed to the gateway container
- Use sortedMapKeys for deterministic output and to keep keys/values in sync
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Agent-Logs-Url: https://github.com/github/gh-aw/sessions/4091c5b1-577e-47b1-9500-d16f21a79db9
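The commit above mentions rendering `env` as an object and using sorted keys for deterministic output. A minimal sketch of that idea follows; `renderEnvObject` is a hypothetical function, not the actual gh-aw renderer, and it writes values verbatim (no escaping) so that a `\${VAR}` placeholder stays exactly as typed for the heredoc.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// renderEnvObject renders env as an "env": {...} object with keys in
// sorted order. Go randomizes map iteration, so sorting the keys first
// makes the output identical from run to run and keeps each key paired
// with its own value.
// Values are written as-is; a real renderer would need JSON escaping.
func renderEnvObject(env map[string]string) string {
	keys := make([]string, 0, len(env))
	for k := range env {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	var b strings.Builder
	b.WriteString(`"env": {`)
	for i, k := range keys {
		if i > 0 {
			b.WriteString(", ")
		}
		fmt.Fprintf(&b, `"%s": "%s"`, k, env[k])
	}
	b.WriteString("}")
	return b.String()
}

func main() {
	fmt.Println(renderEnvObject(map[string]string{
		"NODE_LLAMA_CPP_GPU": `\${NODE_LLAMA_CPP_GPU}`, // left for the gateway to resolve
		"INDEX_PATH":         `\${INDEX_PATH}`,         // left for the gateway to resolve
		"HOME":               "${HOME}",                // expanded by the heredoc
	}))
}
```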
Fixed in 285c3d7. The gateway log showed the only actual validation failure was: …
@copilot merge main, recompile

…rt-for-collection

# Conflicts:
#   .github/workflows/dev.lock.yml
#   .github/workflows/smoke-codex.lock.yml
#   pkg/workflow/testdata/wasm_golden/TestWasmGolden_CompileFixtures/basic-copilot.golden
#   pkg/workflow/testdata/wasm_golden/TestWasmGolden_CompileFixtures/with-imports.golden
💥 Automated smoke test review - all systems nominal!
💥 [THE END] — Illustrated by Smoke Claude
    ---
    on:
      workflow_dispatch:
      label_command: dev
The label_command: dev trigger looks good — this enables the workflow to react to dev labels on issues, PRs, and discussions. Consider documenting the expected behavior when this label is applied to ensure contributors understand the automation.
    @@ -14,6 +15,20 @@ permissions:
      issues: read
      pull-requests: read
The qmd tool integration with both checkouts and searches is a powerful combination. The max: 500 for issues search is reasonable — ensure this doesn't cause timeouts on repos with many open issues. The github-token usage here is correct for accessing the GitHub API.
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com> Agent-Logs-Url: https://github.com/github/gh-aw/sessions/51228c75-5a27-4790-8a79-2df6819091bd
Done in 1606b63. Merged …

✅ Dev completed successfully!
@copilot make sure qmd is configured with the guard in the MCP config. Also review the gateway error logs in https://github.com/github/gh-aw/actions/runs/23417157963/job/68114923074#step:32:1

…P config

- Add NO_COLOR=1 to qmd container environment (both TOML and JSON renderers) to prevent ANSI escape codes from corrupting the JSON-RPC stream
- Add guard-policies support to qmd MCP config in both TOML and JSON formats, consistent with playwright, serena, mcp-scripts, and agentic-workflows

Fixes gateway error: invalid character '\x1b' looking for beginning of value

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Agent-Logs-Url: https://github.com/github/gh-aw/sessions/b99aded2-8a95-458f-a171-afa6b8debabf
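The gateway error quoted in the commit above is the kind of message a JSON decoder produces when an ANSI escape sequence precedes a JSON-RPC frame. The sketch below reproduces the failure mode with Go's standard `encoding/json`; the frame contents and the `decodeMessage` helper are illustrative assumptions, not code from the gateway.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// decodeMessage tries to parse one JSON-RPC frame as it would arrive
// over a stdio transport. Hypothetical helper for illustration.
func decodeMessage(frame []byte) error {
	var msg map[string]interface{}
	return json.Unmarshal(frame, &msg)
}

func main() {
	clean := []byte(`{"jsonrpc":"2.0","id":1,"result":{}}`)

	// Assumed pollution: an ANSI "erase line" sequence (ESC [ 2 K)
	// written to stdout just before the JSON payload.
	polluted := append([]byte("\x1b[2K"), clean...)

	fmt.Println("clean frame error:   ", decodeMessage(clean))
	fmt.Println("polluted frame error:", decodeMessage(polluted))
}
```

Because the decoder rejects the very first byte, a single stray escape code is enough to break the stream, which is why anything that writes to stdout has to be silenced or moved off the transport.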
Done in c90147f. Two changes: …

✅ Dev completed successfully! No action needed: A Daily Status Report for documentation quality (2026-03-23) was already created today as issue #22350 at 01:17 UTC. It covers 3 open documentation issues (#22226 broken link, #22147 missing engine comparison table, #22092 frontmatter validation gap) with cross-references to relevant doc files and recommended actions. Creating a duplicate would be redundant.
@copilot the qmd mcp server failed to start. Review qmd.log and update the configuration to fix the startup script. See https://github.com/github/gh-aw/actions/runs/23418447754/job/68118607721 |
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com> Agent-Logs-Url: https://github.com/github/gh-aw/sessions/de0be902-9461-4165-a6a1-33f4984e886c
Fixed in 3af7267. The qmd.log showed: …

✅ Dev completed successfully!
@copilot The qmd tool is disabled in CI. Fix the ASCII issues with spurious outputs without setting CI=1. Reference: https://github.com/github/gh-aw/actions/runs/23419565823/job/68121775227#step:32:1
…tion

The root cause was Llama._callLogger() in node-llama-cpp v3 writing dot-progress characters directly to process.stdout (bypassing log levels and NO_COLOR=1) during model loading on first tool call. This corrupted the stdio JSON-RPC stream.

Instead of CI=1 (which disabled qmd LLM operations via LlamaCpp._ciMode check), switch qmd to HTTP transport (qmd mcp --http). A new 'Start QMD MCP Server' step starts qmd in a Docker container before the gateway; the MCP protocol travels over TCP/HTTP, completely independent of qmd's stdout/stderr.

- Add DefaultQmdMCPPort = 8181 constant
- generateQmdStartStep(): Docker start + curl health-check (120s timeout)
- mcp_setup_generator: add qmd start step before gateway
- mcp_renderer_builtin: replace container/stdio config with HTTP URL
- Update RenderQmd signature to pass workflowData for host resolution
- Remove INDEX_PATH/NODE_LLAMA_CPP_GPU from gateway env (now in Docker step)
- Recompile all 177 lock files

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Agent-Logs-Url: https://github.com/github/gh-aw/sessions/1769cafa-6727-4bf8-b4ea-44a3f8eb589d
Fixed in 8a9cfee by switching qmd from the gateway-managed stdio container to HTTP transport (qmd mcp --http).
@copilot qmd cannot run in a container because it must run natively in the Actions job VM. Add a setup step that launches the qmd MCP HTTP server, and configure the MCP config to bind to that server.
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com> Agent-Logs-Url: https://github.com/github/gh-aw/sessions/0ce24d39-ceb1-447b-86ee-fa8fcce7a34b
Fixed in 16290ab. qmd now runs natively on the VM.

✅ Dev completed successfully!
Adds native support for tobi/qmd as a builtin experimental tool, enabling vector similarity search over configured documentation files without requiring contents: read in the agent job.

How it works

The index is split across three jobs:

- Activation job: … model and artifact_prefix
- Indexing job (activation, timeout-minutes: 60, runs-on: ubuntu-latest): installs @tobilu/qmd and @actions/github, registers collections from checkouts and/or GitHub searches/issue lists, builds the index using the @tobilu/qmd JavaScript SDK via a single actions/github-script step, and saves the index to actions/cache
- Agent job (activation and indexing): restores the qmd index from cache using the precise cache key, then starts the qmd MCP server natively on the VM as an HTTP server; the MCP gateway connects to it via HTTP transport

This means the agent never needs contents: read; the indexing job (which already has it) does all the file access.

The embedding models are cached in two separate actions/cache entries (both keyed with the qmd version for proper invalidation on upgrades):

- ~/.cache/qmd/models/: keyed by qmd version + OS, shared across runs
- ~/.cache/node-llama-cpp/: keyed by qmd version + OS + arch + runner image ID, so pre-built llama.cpp binaries are invalidated when the runner image or qmd version changes

The indexing job uses actions/cache (restore + post-save) to populate both caches; the agent job uses actions/cache/restore (read-only) to restore them without writing back.

Cache key and restore keys
All actions/cache keys include DefaultQmdVersion so that upgrading qmd automatically busts all caches. The qmd index default cache key is gh-aw-qmd<version>-${{ github.run_id }} (per-run ephemeral), but users can override it with cache-key: for cross-run persistence. The indexing job emits restore-keys derived by stripping the trailing ${{ ... }} expression from the cache key, allowing incremental updates:

- gh-aw-qmd-2.0.1-${{ github.run_id }} → restore-keys: gh-aw-qmd-2.0.1-
- qmd-index-${{ hashFiles('docs/**') }} → restore-keys: qmd-index-

The agent job restores using the precise cache key only (no restore-keys fallback), ensuring it always uses the exact index built by the current run's indexing job.
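The restore-key derivation described above (strip the trailing `${{ ... }}` expression and keep the prefix) can be sketched as follows. This is an illustration only: `restoreKeyFor` is a hypothetical name and makes no claim about how `resolveQmdRestoreKeys()` is written. It assumes the expression is the last thing in the key.

```go
package main

import (
	"fmt"
	"strings"
)

// restoreKeyFor returns the prefix left after removing a trailing
// ${{ ... }} expression from cacheKey. The boolean is false when the
// key does not end with an expression, or when nothing would remain.
func restoreKeyFor(cacheKey string) (string, bool) {
	key := strings.TrimSpace(cacheKey)
	if !strings.HasSuffix(key, "}}") {
		return "", false
	}
	start := strings.LastIndex(key, "${{")
	if start <= 0 {
		// No opening marker, or the key is nothing but the expression.
		return "", false
	}
	return key[:start], true
}

func main() {
	keys := []string{
		"gh-aw-qmd-2.0.1-${{ github.run_id }}",
		"qmd-index-${{ hashFiles('docs/**') }}",
		"static-key",
	}
	for _, k := range keys {
		if prefix, ok := restoreKeyFor(k); ok {
			fmt.Printf("%s -> restore-keys: %s\n", k, prefix)
		} else {
			fmt.Printf("%s -> no restore key\n", k)
		}
	}
}
```

A prefix restore key lets a later run start from the most recent index whose key shares that prefix, while the agent job still asks for the exact key.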
GPU and runner configuration

By default, NODE_LLAMA_CPP_GPU is set to "false" in the indexing step so node-llama-cpp skips GPU probing on CPU-only runners. Set gpu: true in the qmd config to re-enable auto GPU detection. The indexing job defaults to ubuntu-latest, but the runner image can be overridden independently via runs-on:.

qmd MCP server (native HTTP transport)
The qmd MCP server is started natively on the GitHub Actions VM (not in a Docker container) in a Start QMD MCP Server step before the gateway. Node.js is set up via actions/setup-node, then npx @tobilu/qmd qmd mcp --http is launched as a background process on port 8181. The gateway connects to it via HTTP (http://localhost:8181/mcp). Using HTTP transport instead of stdio completely avoids ANSI/progress output from node-llama-cpp corrupting the JSON-RPC stream. INDEX_PATH, NO_COLOR, and NODE_LLAMA_CPP_GPU are set directly as environment variables, and process output is redirected to /tmp/qmd-mcp.log for diagnostics. A curl health-check (120s timeout) waits for the server to be ready before the gateway starts.

Running natively (rather than in a container) is required because node-llama-cpp builds native llama.cpp binaries that must match the runner OS/arch and cannot be pre-built for a container image at compile time.
Usage

Checkouts form (multiple collections with per-collection checkout)

Searches form: code search (GitHub search queries)

Searches form: issue search (GitHub issues as indexed documents)

When type: issues, query is the repository slug (owner/repo); if omitted, ${{ github.repository }} is used. Each open issue is saved as a markdown file and added to the named collection.

Cache key (read-only mode: use a pre-built index without any indexing)

Build + cache mode (build once, reuse on subsequent runs)

Combined form

Each checkout collection can be associated with a checkout (same syntax as the top-level checkout) and an optional context hint for the agent. Each search entry downloads matching files or issues from GitHub and adds them to the index. Both sources can be combined.

Changes
- QmdToolConfig with checkouts []*QmdDocCollection, searches []*QmdSearchEntry, cache-key string, gpu bool, and runs-on string; QmdDocCollection with name, paths, context, and checkout *CheckoutConfig fields; QmdSearchEntry with name, type, query, min, max, github-token, and github-app fields; registered as a builtin across tools_parser, tools_types, mcp_config_validation, mcp_setup_generator, claude_tools, docker
- DefaultQmdVersion = "2.0.1", DefaultQmdMCPPort = 8181
- pkg/workflow/qmd.go: indexing job generator
  - serializes config as QMD_CONFIG_JSON, installs @tobilu/qmd and @actions/github locally via npm install --prefix, and runs a single actions/github-script step calling qmd_index.cjs
  - separate cache restore/save steps for the qmd index and qmd models (~/.cache/qmd/); all cache keys include DefaultQmdVersion
  - resolveQmdCacheKey() always emits a version-prefixed cache key (default: gh-aw-qmd<version>-${{ github.run_id }}); resolveQmdRestoreKeys() derives a prefix restore key for cross-run fallback
  - the indexing job has timeout-minutes: 60 and defaults to ubuntu-latest; gpu: false (default) injects NODE_LLAMA_CPP_GPU=false to skip GPU probing; runs-on overrides the indexing job's runner image
  - the agent job restores the qmd index using the precise cache key (no restore-keys fallback)
  - generateQmdStartStep() produces a native Node.js start step (using actions/setup-node + npx) with a curl health-check (120s timeout) that runs before the gateway
- pkg/workflow/mcp_renderer_builtin.go: qmd is registered as an HTTP MCP server at http://localhost:8181/mcp; the gateway connects over HTTP to the natively running qmd process
- pkg/workflow/mcp_setup_generator.go: qmd start step is injected before the gateway start step in the agent job
- actions/setup/js/qmd_index.cjs: JavaScript indexer using the @tobilu/qmd SDK; reads config from QMD_CONFIG_JSON, processes checkout collections (glob expansion) and search entries (GitHub code search or issue list via Octokit), calls store.update(), store.embed(), store.close(), and writes a Markdown step summary
- actions/setup/js/qmd_index.test.cjs: 20 vitest unit tests covering all code paths: missing config, missing SDK, checkout collections (${ENV_VAR} expansion, default pattern), issues search (valid repo, explicit repo field, invalid slug, empty slug, min-count), code search (file download, min-count, download errors, malformed full_name), combined checkouts + searches, finally block, and writeSummary
- .github/workflows/ci.yml: new js-qmd-index CI job that runs the qmd_index.test.cjs tests in isolation
- docs/src/content/docs/reference/qmd.md: new reference documentation page (sidebar order 730)
- docs/src/content/docs/reference/tools.md: updated with qmd: entry and link to new reference page
- pkg/cli/workflows/smoke-codex.md: updated with checkouts (docs collection) and searches (500 open issues) to exercise qmd end-to-end
- pkg/cli/workflows/dev.md: updated to trigger on label_command: dev, workflow_dispatch (no inputs), and daily schedule; added qmd tool configuration with a docs checkout collection and a 500-issue search; the agent prompt uses qmd to identify documentation problems reported in issues and then looks up answers in the docs collection
✨ PR Review Safe Output Test - Run 23416863680