Write test scenarios in Given/When/Then style — no step implementations required. GitHub Copilot SDK with MCP servers (Playwright, Android Emulator, curl) autonomously interprets and executes each step.
┌─────────────────────────────────────────────────┐
│ Your Tests │
│ feature("Login") │
│ .scenario("Successful login") │
│ .given("I am on the login page") │
│ .when("I enter valid credentials") │
│ .then("I see the dashboard") │
└─────────────────┬───────────────────────────────┘
│ DSL (src/dsl.ts)
▼
┌─────────────────────────────────────────────────┐
│ CopilotTest Runner │
│ configure() → test() → run() │
└─────────────────┬───────────────────────────────┘
│ src/runtime.ts
▼
┌─────────────────────────────────────────────────┐
│ GitHub Copilot SDK │
│ CopilotClient → Session → sendAndWait() │
└────────┬──────────────┬──────────────┬──────────┘
│ │ │
Playwright curl MCP Android MCP
MCP (REST APIs) (Mobile apps)
(Web browsers)
# Install globally or use npx
npm install -g copilot-test
# or
npx copilot-test <command>
# Initialize new project
copilot-test init
# Run tests
copilot-test run
# List all tests
copilot-test list
# Validate configuration
copilot-test validate
# Health check
copilot-test doctornpm install
npm run build
npm testThe CLI tool provides comprehensive commands for managing your test projects.
Interactive project scaffolding with templates and examples.
copilot-test init
# Prompts for:
# - Project name
# - Platforms (web, api, mobile)
# - AI Model
# - TypeScript or JavaScript
# - Install dependencies

Creates:
- copilot-test.config.ts - Configuration file
- tests/ directory with example tests
- package.json (if not exists)
- tsconfig.json (for TypeScript projects)
- .gitignore
- README.md
Execute tests with various options.
# Run all tests
copilot-test run
# Run specific file
copilot-test run tests/login.spec.ts
# Run with filters
copilot-test run --tag=@smoke
copilot-test run --filter="login"
copilot-test run --env=staging
# Run with options
copilot-test run --headless
copilot-test run --parallel
copilot-test run --debug

Display all features and scenarios in your test suite.
copilot-test list
# Output:
# Feature: User Login (tests/login.spec.ts)
# ✓ Scenario: Successful admin login [@smoke]
# ✓ Scenario: Invalid credentials [@negative]
# Total: 2 features, 6 scenarios

Open reports or compare test runs.
# Open latest report in browser
copilot-test report
copilot-test report open
# Compare two test runs
copilot-test report compare \
--baseline copilot-test-results/runs/baseline.json \
--current copilot-test-results/runs/current.json \
--output comparison.html

Check your configuration and environment setup.
copilot-test validate
# Checks:
# ✓ Configuration file exists and is valid
# ✓ Test files present
# ✓ Dependencies installed
# ✓ Node.js version compatible
# ⚠ Warnings and errors

Scaffold a new test file from templates.
copilot-test create test
# Prompts for:
# - Test type (web, api, mobile)
# - Feature name
# - Scenario name
# - File name

Comprehensive environment validation.
copilot-test doctor
# Checks:
# ✓ Node.js version
# ✓ TypeScript installed
# ✓ Dependencies present
# ✓ Config file valid
# ✓ API keys configured
# ⚠ Warnings and issues

Set and manage global CLI preferences.
# Set configuration
copilot-test config set model gpt-4o
copilot-test config set headless true
copilot-test config set parallel true
# Get configuration value
copilot-test config get model
# List all configuration
copilot-test config list
# Delete configuration
copilot-test config delete model

Global options available for commands:
- -v, --version - Show CLI version
- -h, --help - Show help information
- --env <name> - Set environment
- --tag <tag> - Filter by tag
- --parallel - Enable parallel execution
- --headless - Run in headless mode
- --debug - Enable debug output
import { configure, feature, test, run } from 'copilot-test';
import { webPlatform } from 'copilot-test';
configure({
model: 'gpt-4o',
platforms: { web: webPlatform({ browser: 'chromium' }) },
});
test(
feature('User Authentication')
.scenario('Successful login')
.given("I am on https://example.com/login")
.when("I enter username 'admin' and password 'secret'")
.and("I click the Login button")
.then("I should see the dashboard")
.done()
._build(),
'web'
);
await run();

import { configure, feature, test, run } from 'copilot-test';
import { apiPlatform } from 'copilot-test';
configure({
model: 'gpt-4o',
platforms: { api: apiPlatform({ baseUrl: 'https://api.example.com' }) },
});
test(
feature('Users API')
.scenario('Create a user')
.given("the Users API is available")
.when("I POST to /users")
.withDocString('{"name": "Alice", "email": "alice@example.com"}')
.then("the response status is 201")
.and("the response contains the new user's id")
.done()
._build(),
'api'
);
await run();

import { configure, feature, test, run } from 'copilot-test';
import { mobilePlatform } from 'copilot-test';
configure({
model: 'gpt-4o',
platforms: {
mobile: mobilePlatform({
device: 'emulator-5554',
appPackage: 'com.example.app',
}),
},
});
test(
feature('App Onboarding')
.scenario('New user completes onboarding')
.given("the app is launched for the first time")
.when("I tap 'Get Started'")
.and("I fill in my profile details")
.then("I see the home screen")
.done()
._build(),
'mobile'
);
await run();

configure({
model: 'gpt-4o', // AI model to use
reasoningEffort: 'high', // 'low' | 'medium' | 'high'
platforms: {
web: webPlatform({ ... }),
api: apiPlatform({ ... }),
mobile: mobilePlatform({ ... }),
},
baseUrl: 'https://example.com', // Default base URL
stepTimeout: 30000, // Timeout per step (ms)
retries: 2, // Retry failed scenarios
screenshotOnFailure: true, // Capture screenshots on failure
outputDir: 'copilot-test-results', // Report output directory
mcpServers: { // Additional MCP servers
database: { type: 'stdio', command: 'npx', args: ['my-db-mcp'] },
},
// Parallel execution options (NEW)
parallel: true, // Enable parallel scenario execution
maxWorkers: 4, // Number of concurrent workers (or 'auto' for CPU-based)
workerTimeout: 300000, // Max time per scenario (ms, default: 5 minutes)
failFast: false, // Stop all workers on first failure
// Watch mode options (NEW)
watch: {
enabled: true, // Enable watch mode
include: ['src/**/*.ts', 'tests/**/*.spec.ts'], // Files to watch
exclude: ['node_modules/**', 'dist/**'], // Files to exclude
debounce: 300, // Delay before re-running (ms)
runMode: 'all', // 'all' | 'related' | 'changed-files'
failedFirst: true, // Run failed tests first
clearConsole: false, // Clear console before each run
},
});

Run tests continuously during development with automatic re-execution on file changes:
npm run test:watch tests/login.spec.ts

Note: Watch mode CLI requires a test file path. The test file should call configure() and test() but NOT run() - watch mode handles test execution.
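A minimal watch-compatible test file might look like this (a sketch that reuses the copilot-test exports from the examples above; the key difference is that run() is never called):

```typescript
// tests/login.watch.spec.ts - a watch-mode test file: no run() call
import { configure, feature, test, webPlatform } from 'copilot-test';

configure({
  model: 'gpt-4o',
  platforms: { web: webPlatform({ browser: 'chromium' }) },
});

test(
  feature('User Login')
    .scenario('Successful login')
    .given('I am on the login page')
    .when('I enter valid credentials')
    .then('I see the dashboard')
    .done()
    ._build(),
  'web'
);
// Do NOT call run() here; the watch runner triggers execution on file changes.
```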
When running in a terminal, watch mode provides keyboard controls:
Interactive Commands:
a - Run all tests
f - Run only failed tests
q - Quit watch mode
Enter - Re-run tests
╔════════════════════════════════════════╗
║ COPILOT TEST - WATCH MODE ║
╚════════════════════════════════════════╝
📁 Watching 42 files...
============================================================
🔄 Running tests... (10:30:45 AM)
============================================================
📝 Changed files:
• src/login.ts
• tests/login.spec.ts
[Test execution output...]
╔════════════════════════════════════════╗
║ Status: ✓ All tests passed ║
║ Tests: 12 passed, 0 failed ║
║ Pass rate: 100% ║
║ Duration: 2345ms ║
╚════════════════════════════════════════╝
👀 Watching for file changes...
configure({
platforms: { web: webPlatform() },
watch: {
enabled: true, // Enable watch mode
include: ['src/**/*.ts', 'tests/**/*.spec.ts'], // Files to watch
exclude: ['node_modules/**', 'dist/**'], // Files to exclude
debounce: 300, // Delay before re-running (ms)
runMode: 'all', // 'all' | 'related' | 'changed-files'
failedFirst: true, // Run failed tests first
clearConsole: false, // Clear console before each run
maxWorkers: 2, // Limit workers in watch mode
},
});

See Watch Mode Documentation for more details.
Run scenarios in parallel for significantly faster test execution:
configure({
model: 'gpt-4o',
platforms: { web: webPlatform() },
parallel: true, // Enable parallel execution
maxWorkers: 4, // Run 4 scenarios concurrently
workerTimeout: 300000, // 5 minute timeout per worker
failFast: false, // Continue running even if one fails
});

- parallel: Enable/disable parallel execution (default: false)
- maxWorkers: Number of concurrent workers
  - Use a number (e.g., 4) for a fixed worker count
  - Use 'auto' to determine the count automatically based on CPU cores (CPU count - 1)
- workerTimeout: Maximum time a scenario can run before timing out (default: 300000 ms / 5 minutes)
- failFast: Stop all workers immediately when any scenario fails (default: false)
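The 'auto' setting can be pictured as a small helper like this (an illustrative sketch, not the library's actual internals; resolveMaxWorkers is a hypothetical name):

```typescript
import os from 'node:os';

// Resolve a maxWorkers value of the form number | 'auto' into a concrete
// worker count: 'auto' means CPU count - 1, clamped to at least 1 worker.
function resolveMaxWorkers(maxWorkers: number | 'auto'): number {
  if (maxWorkers === 'auto') {
    return Math.max(1, os.cpus().length - 1);
  }
  return Math.max(1, maxWorkers);
}

console.log(resolveMaxWorkers(4)); // 4
```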
- Faster execution: 50+ scenarios can run in minutes instead of tens of minutes
- Better resource utilization: Utilize multiple CPU cores effectively
- CI/CD optimization: Reduce pipeline execution time
- Proper isolation: Each scenario gets its own session and resources
⚡ Running 12 scenarios with 4 workers
[Worker 0] Starting scenario: User login
[Worker 1] Starting scenario: Password reset
[Worker 2] Starting scenario: Profile update
[Worker 3] Starting scenario: Logout flow
[Worker 0] ✅ User login (2341ms) [1/12]
[Worker 0] Starting scenario: Two-factor auth
[Worker 2] ✅ Profile update (2456ms) [2/12]
...
✨ Parallel execution complete: 11 passed, 1 failed
feature(name: string)
.tag(...tags)
.description(text)
.background()
.given(step)
.and(step)
.scenario(name) // ends background, starts scenario
.scenario(name)
.tag(...tags)
.given(step)
.when(step)
.then(step)
.and(step)
.but(step)
.withTable([[header1, header2], [val1, val2]])
.withDocString(text)
.scenario(nextScenario) // chain next scenario
.done() // end builder, returns FeatureBuilder
._build() // returns Feature object

- You write BDD scenarios with Given/When/Then steps — no implementation needed
- CopilotTest creates a GitHub Copilot SDK session per scenario
- The AI agent receives your step as a prompt with platform-specific tools available
- MCP tools allow the AI to actually interact with browsers, APIs, or mobile apps
- Results are collected, displayed in real-time, and saved as an HTML report
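To illustrate the less common builder calls from the reference above (background(), withTable(), .but()), here is a sketch assuming the same copilot-test exports used in the earlier examples:

```typescript
import { configure, feature, test, run, webPlatform } from 'copilot-test';

configure({ model: 'gpt-4o', platforms: { web: webPlatform() } });

test(
  feature('Checkout')
    .tag('@smoke')
    .background()
      .given('I am logged in as a customer') // runs before every scenario
    .scenario('Order totals are computed per item')
      .given('my cart contains the following items')
      .withTable([
        ['item', 'quantity', 'price'],
        ['Widget', '2', '9.99'],
        ['Gadget', '1', '24.50'],
      ])
      .when('I open the cart page')
      .then('I see a total of 44.48')
      .but('I do not see a shipping charge yet')
      .done()
      ._build(),
  'web'
);

await run();
```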
name: BDD Tests
on: [push, pull_request]
jobs:
test:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: actions/setup-node@v4
with:
node-version: 20
- run: npm ci
- run: npm run build
- run: npm test
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
- uses: actions/upload-artifact@v4
if: always()
with:
name: test-report
path: copilot-test-results/

| Principle | Description |
|---|---|
| Zero-implementation | Write intent, not code. The AI figures out how to execute it. |
| Platform agnostic | Same DSL for web, mobile, and API testing |
| AI-powered | GitHub Copilot SDK drives test execution via MCP tools |
| BDD-native | Given/When/Then syntax promotes collaboration |
| Transparent | AI reasoning is captured and included in reports |
| Extensible | Add custom MCP servers for any tool or platform |
src/
types.ts # Core TypeScript interfaces
dsl.ts # Fluent BDD builder (feature/scenario/step)
runtime.ts # CopilotTestRuntime — core AI execution engine
runner.ts # Test queue, configure/test/run functions
reporter.ts # HTML/JSON report generator
compare.ts # Test run comparison utilities
cli-compare.ts # CLI for comparing test runs
platforms/
web.ts # Playwright MCP platform config
api.ts # curl MCP platform config
mobile.ts # Android MCP platform config
index.ts # Public API exports
tests/
login.spec.ts # Web test example
api-users.spec.ts # API test example
mobile-app.spec.ts # Mobile test example
copilot-test.config.ts # Global config example
CopilotTest generates interactive HTML reports with advanced filtering, search, and historical tracking capabilities.
After running tests, reports are saved in the configured output directory (default: copilot-test-results/):
copilot-test-results/
├── index.html # Dashboard showing all test runs
├── report.html # Latest test run report
├── report.json # Latest test run data
├── trends.json # Historical trends data
└── runs/
├── 2024-01-15T10-30-00.html # Timestamped run report
├── 2024-01-15T10-30-00.json # Timestamped run data
├── 2024-01-15T14-20-00.html
└── 2024-01-15T14-20-00.json
The HTML report includes interactive controls:
- Status Filters: View All, Passed only, or Failed only scenarios
- Search: Filter scenarios by name in real-time
- Tag Filters: Click tags to filter scenarios by specific tags
- Export: Download test results as JSON
Reports automatically capture CI/CD metadata from environment variables:
{
"metadata": {
"timestamp": "2024-01-15T10:30:00Z",
"duration": 45000,
"environment": "staging",
"git": {
"branch": "main",
"commit": "abc123",
"author": "John Doe"
},
"ci": {
"buildNumber": "123",
"jobUrl": "https://github.com/owner/repo/actions/runs/123"
}
}
}

Supported environment variables:
- NODE_ENV, ENVIRONMENT → environment
- GITHUB_REF_NAME, GIT_BRANCH → git.branch
- GITHUB_SHA, GIT_COMMIT → git.commit
- GITHUB_ACTOR, GIT_AUTHOR → git.author
- GITHUB_RUN_NUMBER, BUILD_NUMBER → ci.buildNumber
- GitHub Actions URL auto-generated from GITHUB_SERVER_URL, GITHUB_REPOSITORY, GITHUB_RUN_ID
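The fallback chain above can be sketched as a small collector (illustrative only; collectMetadata is a hypothetical name and the reporter's actual internals may differ):

```typescript
// Resolve report metadata from environment variables, following the fallback
// order documented above (GitHub Actions names first, generic names second).
interface RunMetadata {
  environment?: string;
  git: { branch?: string; commit?: string; author?: string };
  ci: { buildNumber?: string; jobUrl?: string };
}

function collectMetadata(env: Record<string, string | undefined>): RunMetadata {
  // The job URL is only derivable when all three GitHub Actions variables exist.
  const jobUrl =
    env.GITHUB_SERVER_URL && env.GITHUB_REPOSITORY && env.GITHUB_RUN_ID
      ? `${env.GITHUB_SERVER_URL}/${env.GITHUB_REPOSITORY}/actions/runs/${env.GITHUB_RUN_ID}`
      : undefined;
  return {
    environment: env.NODE_ENV ?? env.ENVIRONMENT,
    git: {
      branch: env.GITHUB_REF_NAME ?? env.GIT_BRANCH,
      commit: env.GITHUB_SHA ?? env.GIT_COMMIT,
      author: env.GITHUB_ACTOR ?? env.GIT_AUTHOR,
    },
    ci: {
      buildNumber: env.GITHUB_RUN_NUMBER ?? env.BUILD_NUMBER,
      jobUrl,
    },
  };
}
```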
Open copilot-test-results/index.html to view:
- History of recent test runs (up to 20)
- Pass/fail trends over time
- Duration trends
- Quick access to individual run reports
- Download links for JSON data
AI reasoning is captured in collapsible sections (collapsed by default):
- Click "AI Reasoning" to expand/collapse
- Shows the AI's thought process for each step
- Helps debug why a step passed or failed
Compare two test runs to identify improvements, regressions, and performance changes.
import { compareTestRuns } from 'copilot-test';
const result = await compareTestRuns(
'copilot-test-results/runs/2024-01-15T10-30-00.json',
'copilot-test-results/runs/2024-01-15T14-20-00.json',
'comparison.html'
);
console.log('Improvements:', result.changes.improved.length);
console.log('Regressions:', result.changes.regressed.length);
console.log('Duration change:', result.performance.durationChange);

npx tsx src/cli-compare.ts \
--baseline copilot-test-results/runs/2024-01-15T10-30-00.json \
--current copilot-test-results/runs/2024-01-15T14-20-00.json \
--output comparison.html

The comparison report shows:
- Summary Cards: Pass rate change, duration change, improvements, regressions
- Improvements: Tests that were failing and now pass
- Regressions: Tests that were passing and now fail
- New Scenarios: Scenarios added since baseline
- Removed Scenarios: Scenarios removed since baseline
- Performance Changes: Top 10 scenarios with significant duration changes (>100ms)
The CLI exits with code 1 if regressions are detected, making it ideal for CI/CD pipelines.
name: BDD Tests with Comparison
on: [push, pull_request]
jobs:
test:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: actions/setup-node@v4
with:
node-version: 20
- run: npm ci
- run: npm run build
# Download previous run results
- uses: actions/download-artifact@v4
continue-on-error: true
with:
name: test-results-baseline
path: baseline/
# Run tests
- run: npm test
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
# Compare with baseline if it exists
- name: Compare with baseline
if: hashFiles('baseline/report.json') != ''
run: |
npx tsx src/cli-compare.ts \
--baseline baseline/report.json \
--current copilot-test-results/report.json \
--output copilot-test-results/comparison.html
continue-on-error: true
# Upload current run as next baseline
- uses: actions/upload-artifact@v4
if: always()
with:
name: test-results-baseline
path: copilot-test-results/report.json
# Upload full report
- uses: actions/upload-artifact@v4
if: always()
with:
name: test-reports
path: copilot-test-results/

The trends.json file tracks up to 50 recent test runs:
{
"runs": [
{
"timestamp": "2024-01-15T10:30:00Z",
"duration": 45000,
"total": 25,
"passed": 23,
"failed": 2,
"skipped": 0,
"passRate": 92
}
]
}

Use this data to:
- Track test suite stability over time
- Monitor test execution duration trends
- Identify flaky tests (tests with inconsistent results)
- Measure improvement or degradation in pass rates
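For example, a small script could consume the trends.json shape shown above and compute a recent pass-rate average (an illustrative sketch; recentPassRate and the window size are arbitrary choices, and in practice you would load the file from disk):

```typescript
interface TrendRun {
  timestamp: string;
  duration: number;
  total: number;
  passed: number;
  failed: number;
  skipped: number;
  passRate: number;
}

// Average pass rate over the most recent `window` runs, to spot degradation.
function recentPassRate(runs: TrendRun[], window = 10): number {
  const recent = runs.slice(-window);
  if (recent.length === 0) return 0;
  return recent.reduce((sum, r) => sum + r.passRate, 0) / recent.length;
}

// Sample data matching the documented trends.json shape:
const trends: { runs: TrendRun[] } = {
  runs: [
    { timestamp: '2024-01-15T10:30:00Z', duration: 45000, total: 25, passed: 23, failed: 2, skipped: 0, passRate: 92 },
    { timestamp: '2024-01-15T14:20:00Z', duration: 43000, total: 25, passed: 25, failed: 0, skipped: 0, passRate: 100 },
  ],
};

console.log(recentPassRate(trends.runs)); // 96
```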