Proposal: make Repo Mind Light integration more reusable in gh-aw #29063

@szabta89

Summary

I would like to discuss how GitHub Agentic Workflows should support Repo Mind Light in a way that is reusable across workflows.

The motivating example is a workflow in github/pull-requests that currently wires Repo Mind Light manually for issue triage. It works, but the setup is fairly bespoke and mixes together three different concerns:

  1. deterministic prep and indexing
  2. MCP server and agent-job setup
  3. prompt instructions telling the agent to use Repo Mind Light when available

The current shape makes sense for one workflow, but if multiple workflows want to use Repo Mind Light, the amount of hand-written YAML and prompt duplication grows quickly.

Motivating Example

In github/pull-requests, the current issue-triage.md includes:

  • a custom prepare_repo_mind job for incremental indexing, cache restore/save, and artifact upload
  • an inline mcp-servers.repo-mind declaration
  • inline agent-job steps to download the artifact, write config, start the MCP server, wait for preload readiness, and capture logs
  • inline prompt text that explicitly says "Use Repo Mind Light (Required)"

A simplified version of the current shape looks like this:

jobs:
  prepare_repo_mind:
    steps:
      - uses: actions/cache/restore@v5
      - run: docker run ... repo-mind-light index --result-json ...
      - uses: actions/cache/save@v5
      - uses: actions/upload-artifact@v7

mcp-servers:
  repo-mind:
    url: http://127.0.0.1:8000/mcp
    allowed:
      - query

steps:
  - uses: actions/download-artifact@v8
  - run: printf '%s\n' "$REPO_MIND_LIGHT_CONFIG_YAML" > .repo-mind-light.config.yml
  - run: docker run -d ... "$REPO_MIND_LIGHT_IMAGE"
  - run: curl --silent --fail http://127.0.0.1:8000/preload-status
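The single curl call in the last step stands in for the "wait for preload readiness" step. A minimal sketch of what such a wait could look like, assuming the /preload-status endpoint returns a non-success status (or refuses the connection) until preload has finished. The step name, retry count, and sleep interval are illustrative and not taken from the real workflow:

```yaml
steps:
  - name: Wait for Repo Mind Light preload   # hypothetical step name
    run: |
      # Poll the readiness endpoint for roughly 60 seconds (30 tries, 2s apart).
      # Assumption: curl --fail exits non-zero until preload is complete.
      for attempt in $(seq 1 30); do
        if curl --silent --fail http://127.0.0.1:8000/preload-status > /dev/null; then
          echo "Repo Mind Light ready after ${attempt} attempt(s)"
          exit 0
        fi
        sleep 2
      done
      echo "Repo Mind Light did not become ready in time" >&2
      exit 1
```

Failing the step on timeout keeps the agent job from starting against a server that cannot answer queries yet.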

And the workflow body separately contains instructions like:

### 1. Use Repo Mind Light (Required)

Use Repo Mind Light for repo-context retrieval. Make one focused request first,
then make at most one follow-up request only if the first result leaves a
specific gap that matters for routing.

The same repo already has a precedent for reusable setup:

imports:
  - copilot-setup-steps.yml

That pattern is attractive because it keeps shared setup out of each workflow.

Design Alternatives

Alternative A: Current primitives, documented pattern

Document Repo Mind Light as a composition pattern using existing gh-aw features:

  • shared .md import for prompt text + mcp-servers + common pre-agent-steps / post-steps
  • reusable GitHub Actions workflow or composite action for deterministic prep/indexing
  • consuming workflows import the shared component and call the reusable prep workflow

That would make the structure look more like:

imports:
  - copilot-setup-steps.yml
  - path: shared/repo-mind-light.md
    with:
      config-yaml: |
        slug: ${{ github.repository }}
        store_path: /var/lib/repo-mind-light/index

jobs:
  prepare_repo_mind:
    uses: ./.github/workflows/reusable-prepare-repo-mind.yml
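For the deterministic half, a rough sketch of what reusable-prepare-repo-mind.yml might contain, built from the steps in the simplified example earlier in this issue. The cache path, cache key, and artifact name are hypothetical placeholders, and the indexing command is left elided as it is above:

```yaml
# .github/workflows/reusable-prepare-repo-mind.yml (sketch only)
on:
  workflow_call:   # lets other workflows call this file via `uses:`

jobs:
  prepare_repo_mind:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/cache/restore@v5
        with:
          path: repo-mind-index                      # hypothetical index location
          key: repo-mind-light-${{ github.sha }}     # hypothetical key scheme
          restore-keys: |
            repo-mind-light-
      # Incremental indexing; arguments are elided here as in the original example.
      - run: docker run ... repo-mind-light index --result-json ...
      - uses: actions/cache/save@v5
        with:
          path: repo-mind-index
          key: repo-mind-light-${{ github.sha }}
      - uses: actions/upload-artifact@v7
        with:
          name: repo-mind-light-index                # hypothetical artifact name
          path: repo-mind-index
```

With this split, each consuming workflow only needs the two-line `uses:` job shown above, and the cache and artifact details live in one place.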

Pros:

  • works with existing gh-aw architecture
  • keeps compiler surface area smaller
  • shared import can inject both tool wiring and prompt text
  • reusable workflow handles the non-agent deterministic job cleanly

Cons:

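  • still requires users to understand two reuse mechanisms (gh-aw imports and reusable Actions workflows)
  • still somewhat verbose: every consuming workflow repeats the import block and the prep job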
  • Repo Mind Light remains a pattern rather than a first-class concept

Alternative B: Stronger shared-component guidance in gh-aw docs/examples

Keep the model the same, but provide a more explicit endorsed pattern and examples for:

  • MCP-backed shared imports with prompt injection
  • deterministic prep via reusable workflows
  • “tool exists + prompt instructs usage” as a first-class authoring guideline

This seems especially relevant because the architecture boundary is not obvious at first glance:

  • a reusable Actions workflow can prepare data, but does not affect the agent prompt
  • prompt text alone can instruct usage, but does not create the MCP tool
  • a shared gh-aw import is the one mechanism that can both declare the MCP tool and inject the prompt text
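To make the boundary concrete, here is a sketch of the frontmatter a shared shared/repo-mind-light.md component might carry. The mcp-servers entry and the steps are copied from the simplified github/pull-requests example; that an imported file's steps are merged into the consuming workflow is an assumption here, not something this issue confirms:

```yaml
# Frontmatter of shared/repo-mind-light.md (sketch only).
# The markdown body below the frontmatter would hold the
# "Use Repo Mind Light (Required)" prompt section.
mcp-servers:
  repo-mind:
    url: http://127.0.0.1:8000/mcp
    allowed:
      - query

# Assumption: steps declared in an import run before the agent starts.
steps:
  - uses: actions/download-artifact@v8
  - run: printf '%s\n' "$REPO_MIND_LIGHT_CONFIG_YAML" > .repo-mind-light.config.yml
  - run: docker run -d ... "$REPO_MIND_LIGHT_IMAGE"
  - run: curl --silent --fail http://127.0.0.1:8000/preload-status
```

The indexing job is deliberately absent: it runs outside the agent job, so it stays in a reusable Actions workflow.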

Pros:

  • low implementation cost
  • probably enough if the current primitives are considered sufficient

Cons:

  • does not reduce actual authoring boilerplate much

Alternative C: Compiler support for more seamless Repo Mind Light integration

Open question for discussion: should gh-aw grow first-class compiler support for Repo Mind Light, or a more general abstraction that would make this kind of MCP-backed repo-context service easier to adopt?

I am not proposing this as the preferred direction yet, but I think it is worth discussing.

Possible shapes:

  • a first-class repo-context: / repo-mind-light: block that expands to the MCP server wiring and common lifecycle steps
  • a special built-in import or template for Repo Mind Light
  • compiler support for bundling a shared MCP setup pattern plus prompt snippet with less manual boilerplate
  • a more general “MCP service lifecycle” abstraction for common start/wait/cleanup patterns
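Purely as a discussion aid, the first shape could look something like the sketch below. Nothing here exists in gh-aw today: the repo-context key, the provider field, and the nesting are all hypothetical, and only the two config values are taken from the Alternative A example:

```yaml
# Hypothetical frontmatter; no such block is supported by the compiler today.
repo-context:
  provider: repo-mind-light        # hypothetical field
  config:
    slug: ${{ github.repository }}
    store_path: /var/lib/repo-mind-light/index
# The compiler would be expected to expand this into the mcp-servers entry,
# the start/wait/log-capture steps, and a default prompt snippet.
```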

Potential upside:

  • lower friction for teams to adopt Repo Mind Light consistently
  • fewer chances to get cache/artifact/readiness/logging details wrong
  • clearer default prompt guidance around using the tool when present

Potential downside:

  • adds compiler/product surface area for something that may be adequately handled by current imports + reusable workflows
  • may set a precedent for first-class treatment of integrations that could otherwise stay as composition patterns

Questions

  1. Is the recommended direction simply to formalize a two-layer pattern:
    • shared gh-aw import for prompt + MCP wiring
    • reusable GitHub Actions workflow for deterministic prep/indexing?
  2. If yes, should gh-aw docs include an explicit Repo Mind Light example that demonstrates this pattern end to end?
  3. If no, does it make sense to explore compiler changes that make Repo Mind Light support more seamless?
  4. More generally, should gh-aw have a better abstraction for reusable MCP service lifecycle setup, beyond manually combining imports and reusable workflows?

Why I think this is worth discussing

The github/pull-requests example suggests that the current building blocks are close, but the reuse story is not obvious:

  • the author has to manually split deterministic preparation from agent-facing composition
  • prompt reuse and MCP reuse need to be designed together
  • the “agent should use Repo Mind Light if available” behavior depends on both runtime wiring and prompt injection

I think the current approach may already be the right one, but if so it would help to make that architectural pattern much more explicit.
