Add scripting interface/batch mode for Copilot Agent Mode? #254473

Open
@bartlettroscoe

There are a lot of AI agent frameworks out there that allow you to run an AI agent in batch mode to help solve software problems. One needs to be able to call an AI agent with a prompt and a set of MCP tools to execute a task (edit code, update documentation, add tests, etc.), with feedback provided by builds, tests, and other tools.

It would be great if the open-source Copilot could be extended to run in batch mode as well, where it is given a prompt, a context description, and (perhaps) a local MCP server, and then goes off and does the task. In this mode, the agent would run without user interaction and would automatically apply changes to any modified files. This would be equivalent to the user clicking "Okay" on every command request and "Keep" on all of the suggested changes at each iteration of the agent. It would return only when the agent had met the criteria provided in the prompt (e.g. code builds and tests pass), or had exhausted its max iterations or max wall-clock time.
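
To make the requested stopping behavior concrete, here is a minimal sketch of the control loop in Python. It is not an existing Copilot or VSCode API: `run_batch`, `BatchResult`, `step`, and `done` are hypothetical names used only for illustration. `step` stands in for one agent iteration with every command and edit auto-approved, and `done` stands in for the success criteria from the prompt (e.g. code builds and tests pass).

```python
import time
from dataclasses import dataclass
from typing import Callable


@dataclass
class BatchResult:
    status: str      # "success", "max_iterations", or "timeout"
    iterations: int  # number of agent iterations actually run


def run_batch(
    step: Callable[[], None],   # hypothetical: one auto-approved agent iteration
    done: Callable[[], bool],   # hypothetical: success criteria from the prompt
    max_iterations: int,
    max_seconds: float,
    clock: Callable[[], float] = time.monotonic,
) -> BatchResult:
    """Run agent iterations until success, the iteration cap, or the time cap."""
    start = clock()
    for i in range(1, max_iterations + 1):
        # Check the wall-clock budget before starting another iteration.
        if clock() - start >= max_seconds:
            return BatchResult("timeout", i - 1)
        step()
        if done():
            return BatchResult("success", i)
    return BatchResult("max_iterations", max_iterations)
```

The wall-clock check here happens only between iterations, so a single long-running iteration could overrun the budget; a real implementation would probably also need to interrupt an iteration in progress.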

Examples of tasks that could be automated across many different instances would be:

  • Refactor/factor out code in file <X> in the parts <a> and <b> so that it can be run in a unit test harness, and add a few unit tests that call the refactored code

  • Add unit tests for the factored-out code to match the coverage provided by <a>, <b>, <c>, ...

VSCode + Copilot offers a lot of infrastructure for a more general purpose AI agent including:

  • System for registering and calling MCP tools
  • Management of specialized prompts (which may mention specific MCP tools) and rules for when to apply which specialized prompt
  • Ability to gather context from a large code-base to add to the prompt
  • Ability to get back updates from the LLM and apply patches to the files in the project

It is a lot of work to set all of that up for a large/complex project. So it would be good if we did not need to do this twice: once for VSCode + Copilot to drive local development, and again for a different AI agent framework (e.g. OpenHands) to drive development of specialized prompts and MCP tools on custom benchmark/evaluation suites.

If open-source Copilot were to support such a batch/scripting mode, we could just go all-in on VSCode Copilot Agent Mode, both for our research and for deploying AI agents to end users in our institution. Specifically, we would use batch mode to run a suite of our own benchmark problems, which would allow us to experiment with different MCP tools, prompts, LLMs, max iterations and/or runtime, etc. And when we found combinations that worked well on our benchmarks, developers could immediately use them to do development with VSCode + Copilot Agent Mode.
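
As an illustration of the kind of experiment sweep described above, the following Python sketch enumerates every combination of prompt, MCP tool set, model, and iteration cap. The function name `build_sweep` and the dictionary keys are hypothetical; how each configuration would actually be handed to a batch-mode Copilot is exactly the interface this issue is asking for.

```python
from itertools import product
from typing import Any, Sequence


def build_sweep(
    prompts: Sequence[str],
    tool_sets: Sequence[Sequence[str]],
    models: Sequence[str],
    max_iterations: Sequence[int],
) -> list[dict[str, Any]]:
    """Return one run configuration per combination of the given options."""
    return [
        {"prompt": p, "tools": t, "model": m, "max_iterations": n}
        for p, t, m, n in product(prompts, tool_sets, models, max_iterations)
    ]
```

Each resulting configuration would be run against every benchmark problem, and the combinations that score well could then be shared with developers as-is.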

And if we are running local models (e.g. #246551 and #254463), there should be no worry about hammering the online LLMs that your GitHub Copilot subscription is paying for.

This would provide the basic foundation we need for our work with AI agents for software development research and application in our institution. We would not need any other AI agents to drive our research and development work. (Without this and #246551/#254463, we would have to look for other options.)
