Write test scenarios in Given/When/Then style — no step implementations required. GitHub Copilot SDK with MCP servers (Playwright, Android Emulator, curl) autonomously interprets and executes each step.
┌─────────────────────────────────────────────────┐
│ Your Tests │
│ feature("Login") │
│ .scenario("Successful login") │
│ .given("I am on the login page") │
│ .when("I enter valid credentials") │
│ .then("I see the dashboard") │
└─────────────────┬───────────────────────────────┘
│ DSL (src/dsl.ts)
▼
┌─────────────────────────────────────────────────┐
│ CopilotTest Runner │
│ configure() → test() → run() │
└─────────────────┬───────────────────────────────┘
│ src/runtime.ts
▼
┌─────────────────────────────────────────────────┐
│ GitHub Copilot SDK │
│ CopilotClient → Session → sendAndWait() │
└────────┬──────────────┬──────────────┬──────────┘
│ │ │
Playwright curl MCP Android MCP
MCP (REST APIs) (Mobile apps)
(Web browsers)
# Install globally or use npx
npm install -g copilot-test
# or
npx copilot-test <command>
# Initialize new project
copilot-test init
# Run tests
copilot-test run
# List all tests
copilot-test list
# Validate configuration
copilot-test validate
# Health check
copilot-test doctornpm install
npm run build
npm testThe CLI tool provides comprehensive commands for managing your test projects.
Interactive project scaffolding with templates and examples.
copilot-test init
# Prompts for:
# - Project name
# - Platforms (web, api, mobile)
# - AI Model
# - TypeScript or JavaScript
# - Install dependencies

Creates:
- copilot-test.config.ts - Configuration file
- tests/ directory with example tests
- package.json (if not exists)
- tsconfig.json (for TypeScript projects)
- .gitignore
- README.md
Execute tests with various options.
# Run all tests
copilot-test run
# Run specific file
copilot-test run tests/login.spec.ts
# Run with filters
copilot-test run --tag=@smoke
copilot-test run --filter="login"
copilot-test run --env=staging
# Run with options
copilot-test run --headless
copilot-test run --parallel
copilot-test run --debug

Display all features and scenarios in your test suite.
copilot-test list
# Output:
# Feature: User Login (tests/login.spec.ts)
# ✓ Scenario: Successful admin login [@smoke]
# ✓ Scenario: Invalid credentials [@negative]
# Total: 2 features, 6 scenarios

Open reports or compare test runs.
# Open latest report in browser
copilot-test report
copilot-test report open
# Compare two test runs
copilot-test report compare \
--baseline copilot-test-results/runs/baseline.json \
--current copilot-test-results/runs/current.json \
--output comparison.html

Check your configuration and environment setup.
copilot-test validate
# Checks:
# ✓ Configuration file exists and is valid
# ✓ Test files present
# ✓ Dependencies installed
# ✓ Node.js version compatible
# ⚠ Warnings and errors

Scaffold a new test file from templates.
copilot-test create test
# Prompts for:
# - Test type (web, api, mobile)
# - Feature name
# - Scenario name
# - File name

Comprehensive environment validation.
copilot-test doctor
# Checks:
# ✓ Node.js version
# ✓ TypeScript installed
# ✓ Dependencies present
# ✓ Config file valid
# ✓ API keys configured
# ⚠ Warnings and issues

Set and manage global CLI preferences.
# Set configuration
copilot-test config set model gpt-4o
copilot-test config set headless true
copilot-test config set parallel true
# Get configuration value
copilot-test config get model
# List all configuration
copilot-test config list
# Delete configuration
copilot-test config delete model

Global options available for commands:
- -v, --version - Show CLI version
- -h, --help - Show help information
- --env <name> - Set environment
- --tag <tag> - Filter by tag
- --parallel - Enable parallel execution
- --headless - Run in headless mode
- --debug - Enable debug output
import { configure, feature, test, run } from 'copilot-test';
import { webPlatform } from 'copilot-test';
configure({
model: 'gpt-4o',
platforms: { web: webPlatform({ browser: 'chromium' }) },
});
test(
feature('User Authentication')
.scenario('Successful login')
.given("I am on https://example.com/login")
.when("I enter username 'admin' and password 'secret'")
.and("I click the Login button")
.then("I should see the dashboard")
.done()
._build(),
'web'
);
await run();

import { configure, feature, test, run } from 'copilot-test';
import { apiPlatform } from 'copilot-test';
configure({
model: 'gpt-4o',
platforms: { api: apiPlatform({ baseUrl: 'https://api.example.com' }) },
});
test(
feature('Users API')
.scenario('Create a user')
.given("the Users API is available")
.when("I POST to /users")
.withDocString('{"name": "Alice", "email": "alice@example.com"}')
.then("the response status is 201")
.and("the response contains the new user's id")
.done()
._build(),
'api'
);
await run();

import { configure, feature, test, run } from 'copilot-test';
import { mobilePlatform } from 'copilot-test';
configure({
model: 'gpt-4o',
platforms: {
mobile: mobilePlatform({
device: 'emulator-5554',
appPackage: 'com.example.app',
}),
},
});
test(
feature('App Onboarding')
.scenario('New user completes onboarding')
.given("the app is launched for the first time")
.when("I tap 'Get Started'")
.and("I fill in my profile details")
.then("I see the home screen")
.done()
._build(),
'mobile'
);
await run();

configure({
model: 'gpt-4o', // AI model to use
reasoningEffort: 'high', // 'low' | 'medium' | 'high'
platforms: {
web: webPlatform({ ... }),
api: apiPlatform({ ... }),
mobile: mobilePlatform({ ... }),
},
baseUrl: 'https://example.com', // Default base URL
stepTimeout: 30000, // Timeout per step (ms)
retries: 2, // Retry failed scenarios
screenshotOnFailure: true, // Capture screenshots on failure
outputDir: 'copilot-test-results', // Report output directory
mcpServers: { // Additional MCP servers
database: { type: 'stdio', command: 'npx', args: ['my-db-mcp'] },
},
// Parallel execution options (NEW)
parallel: true, // Enable parallel scenario execution
maxWorkers: 4, // Number of concurrent workers (or 'auto' for CPU-based)
workerTimeout: 300000, // Max time per scenario (ms, default: 5 minutes)
failFast: false, // Stop all workers on first failure
// Watch mode options (NEW)
watch: {
enabled: true, // Enable watch mode
include: ['src/**/*.ts', 'tests/**/*.spec.ts'], // Files to watch
exclude: ['node_modules/**', 'dist/**'], // Files to exclude
debounce: 300, // Delay before re-running (ms)
runMode: 'all', // 'all' | 'related' | 'changed-files'
failedFirst: true, // Run failed tests first
clearConsole: false, // Clear console before each run
},
});

Run tests continuously during development with automatic re-execution on file changes:
npm run test:watch tests/login.spec.ts

Note: Watch mode CLI requires a test file path. The test file should call configure() and test() but NOT run() - watch mode handles test execution.
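A minimal watch-compatible test file might look like this (a sketch that reuses the copilot-test exports from the examples above; the key difference is that run() is never called):

```typescript
// tests/login.watch.spec.ts - a watch-mode test file: no run() call
import { configure, feature, test, webPlatform } from 'copilot-test';

configure({
  model: 'gpt-4o',
  platforms: { web: webPlatform({ browser: 'chromium' }) },
});

test(
  feature('User Login')
    .scenario('Successful login')
    .given('I am on the login page')
    .when('I enter valid credentials')
    .then('I see the dashboard')
    .done()
    ._build(),
  'web'
);
// Do NOT call run() here; the watch runner triggers execution on file changes.
```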
When running in a terminal, watch mode provides keyboard controls:
Interactive Commands:
a - Run all tests
f - Run only failed tests
q - Quit watch mode
Enter - Re-run tests
╔════════════════════════════════════════╗
║ COPILOT TEST - WATCH MODE ║
╚════════════════════════════════════════╝
📁 Watching 42 files...
============================================================
🔄 Running tests... (10:30:45 AM)
============================================================
📝 Changed files:
• src/login.ts
• tests/login.spec.ts
[Test execution output...]
╔════════════════════════════════════════╗
║ Status: ✓ All tests passed ║
║ Tests: 12 passed, 0 failed ║
║ Pass rate: 100% ║
║ Duration: 2345ms ║
╚════════════════════════════════════════╝
👀 Watching for file changes...
configure({
platforms: { web: webPlatform() },
watch: {
enabled: true, // Enable watch mode
include: ['src/**/*.ts', 'tests/**/*.spec.ts'], // Files to watch
exclude: ['node_modules/**', 'dist/**'], // Files to exclude
debounce: 300, // Delay before re-running (ms)
runMode: 'all', // 'all' | 'related' | 'changed-files'
failedFirst: true, // Run failed tests first
clearConsole: false, // Clear console before each run
maxWorkers: 2, // Limit workers in watch mode
},
});

See Watch Mode Documentation for more details.
Run scenarios in parallel for significantly faster test execution:
configure({
model: 'gpt-4o',
platforms: { web: webPlatform() },
parallel: true, // Enable parallel execution
maxWorkers: 4, // Run 4 scenarios concurrently
workerTimeout: 300000, // 5 minute timeout per worker
failFast: false, // Continue running even if one fails
});

- parallel: Enable/disable parallel execution (default: false)
- maxWorkers: Number of concurrent workers
  - Use a number (e.g., 4) for a fixed worker count
  - Use 'auto' to determine the count automatically based on CPU cores (CPU count - 1)
- workerTimeout: Maximum time a scenario can run before timing out (default: 300000 ms / 5 minutes)
- failFast: Stop all workers immediately when any scenario fails (default: false)
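The 'auto' setting can be pictured as a small helper like this (an illustrative sketch, not the library's actual internals; resolveMaxWorkers is a hypothetical name):

```typescript
import os from 'node:os';

// Resolve a maxWorkers value of the form number | 'auto' into a concrete
// worker count: 'auto' means CPU count - 1, clamped to at least 1 worker.
function resolveMaxWorkers(maxWorkers: number | 'auto'): number {
  if (maxWorkers === 'auto') {
    return Math.max(1, os.cpus().length - 1);
  }
  return Math.max(1, maxWorkers);
}

console.log(resolveMaxWorkers(4)); // 4
```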
- Faster execution: 50+ scenarios can run in minutes instead of tens of minutes
- Better resource utilization: Utilize multiple CPU cores effectively
- CI/CD optimization: Reduce pipeline execution time
- Proper isolation: Each scenario gets its own session and resources
⚡ Running 12 scenarios with 4 workers
[Worker 0] Starting scenario: User login
[Worker 1] Starting scenario: Password reset
[Worker 2] Starting scenario: Profile update
[Worker 3] Starting scenario: Logout flow
[Worker 0] ✅ User login (2341ms) [1/12]
[Worker 0] Starting scenario: Two-factor auth
[Worker 2] ✅ Profile update (2456ms) [2/12]
...
✨ Parallel execution complete: 11 passed, 1 failed
feature(name: string)
.tag(...tags)
.description(text)
.background()
.given(step)
.and(step)
.scenario(name) // ends background, starts scenario
.scenario(name)
.tag(...tags)
.given(step)
.when(step)
.then(step)
.and(step)
.but(step)
.withTable([[header1, header2], [val1, val2]])
.withDocString(text)
.scenario(nextScenario) // chain next scenario
.done() // end builder, returns FeatureBuilder
._build() // returns Feature object

- You write BDD scenarios with Given/When/Then steps — no implementation needed
- CopilotTest creates a GitHub Copilot SDK session per scenario
- The AI agent receives your step as a prompt with platform-specific tools available
- MCP tools allow the AI to actually interact with browsers, APIs, or mobile apps
- Results are collected, displayed in real-time, and saved as an HTML report
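To illustrate the less common builder calls from the reference above (background(), withTable(), .but()), here is a sketch assuming the same copilot-test exports used in the earlier examples:

```typescript
import { configure, feature, test, run, webPlatform } from 'copilot-test';

configure({ model: 'gpt-4o', platforms: { web: webPlatform() } });

test(
  feature('Checkout')
    .tag('@smoke')
    .background()
      .given('I am logged in as a customer') // runs before every scenario
    .scenario('Order totals are computed per item')
      .given('my cart contains the following items')
      .withTable([
        ['item', 'quantity', 'price'],
        ['Widget', '2', '9.99'],
        ['Gadget', '1', '24.50'],
      ])
      .when('I open the cart page')
      .then('I see a total of 44.48')
      .but('I do not see a shipping charge yet')
      .done()
      ._build(),
  'web'
);

await run();
```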
name: BDD Tests
on: [push, pull_request]
jobs:
test:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: actions/setup-node@v4
with:
node-version: 20
- run: npm ci
- run: npm run build
- run: npm test
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
- uses: actions/upload-artifact@v4
if: always()
with:
name: test-report
path: copilot-test-results/

| Principle | Description |
|---|---|
| Zero-implementation | Write intent, not code. The AI figures out how to execute it. |
| Platform agnostic | Same DSL for web, mobile, and API testing |
| AI-powered | GitHub Copilot SDK drives test execution via MCP tools |
| BDD-native | Given/When/Then syntax promotes collaboration |
| Transparent | AI reasoning is captured and included in reports |
| Extensible | Add custom MCP servers for any tool or platform |
src/
types.ts # Core TypeScript interfaces
dsl.ts # Fluent BDD builder (feature/scenario/step)
runtime.ts # CopilotTestRuntime — core AI execution engine
runner.ts # Test queue, configure/test/run functions
reporter.ts # HTML/JSON report generator
compare.ts # Test run comparison utilities
cli-compare.ts # CLI for comparing test runs
platforms/
web.ts # Playwright MCP platform config
api.ts # curl MCP platform config
mobile.ts # Android MCP platform config
index.ts # Public API exports
tests/
login.spec.ts # Web test example
api-users.spec.ts # API test example
mobile-app.spec.ts # Mobile test example
copilot-test.config.ts # Global config example
CopilotTest generates interactive HTML reports with advanced filtering, search, and historical tracking capabilities.
After running tests, reports are saved in the configured output directory (default: copilot-test-results/):
copilot-test-results/
├── index.html # Dashboard showing all test runs
├── report.html # Latest test run report
├── report.json # Latest test run data
├── trends.json # Historical trends data
└── runs/
├── 2024-01-15T10-30-00.html # Timestamped run report
├── 2024-01-15T10-30-00.json # Timestamped run data
├── 2024-01-15T14-20-00.html
└── 2024-01-15T14-20-00.json
The HTML report includes interactive controls:
- Status Filters: View All, Passed only, or Failed only scenarios
- Search: Filter scenarios by name in real-time
- Tag Filters: Click tags to filter scenarios by specific tags
- Export: Download test results as JSON
Reports automatically capture CI/CD metadata from environment variables:
{
"metadata": {
"timestamp": "2024-01-15T10:30:00Z",
"duration": 45000,
"environment": "staging",
"git": {
"branch": "main",
"commit": "abc123",
"author": "John Doe"
},
"ci": {
"buildNumber": "123",
"jobUrl": "https://github.com/owner/repo/actions/runs/123"
}
}
}

Supported environment variables:
- NODE_ENV, ENVIRONMENT → environment
- GITHUB_REF_NAME, GIT_BRANCH → git.branch
- GITHUB_SHA, GIT_COMMIT → git.commit
- GITHUB_ACTOR, GIT_AUTHOR → git.author
- GITHUB_RUN_NUMBER, BUILD_NUMBER → ci.buildNumber
- GitHub Actions URL auto-generated from GITHUB_SERVER_URL, GITHUB_REPOSITORY, GITHUB_RUN_ID
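The fallback chain above can be sketched as a small collector (illustrative only; collectMetadata is a hypothetical name and the reporter's actual internals may differ):

```typescript
// Resolve report metadata from environment variables, following the fallback
// order documented above (GitHub Actions names first, generic names second).
interface RunMetadata {
  environment?: string;
  git: { branch?: string; commit?: string; author?: string };
  ci: { buildNumber?: string; jobUrl?: string };
}

function collectMetadata(env: Record<string, string | undefined>): RunMetadata {
  // The job URL is only derivable when all three GitHub Actions variables exist.
  const jobUrl =
    env.GITHUB_SERVER_URL && env.GITHUB_REPOSITORY && env.GITHUB_RUN_ID
      ? `${env.GITHUB_SERVER_URL}/${env.GITHUB_REPOSITORY}/actions/runs/${env.GITHUB_RUN_ID}`
      : undefined;
  return {
    environment: env.NODE_ENV ?? env.ENVIRONMENT,
    git: {
      branch: env.GITHUB_REF_NAME ?? env.GIT_BRANCH,
      commit: env.GITHUB_SHA ?? env.GIT_COMMIT,
      author: env.GITHUB_ACTOR ?? env.GIT_AUTHOR,
    },
    ci: {
      buildNumber: env.GITHUB_RUN_NUMBER ?? env.BUILD_NUMBER,
      jobUrl,
    },
  };
}
```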
Open copilot-test-results/index.html to view:
- History of recent test runs (up to 20)
- Pass/fail trends over time
- Duration trends
- Quick access to individual run reports
- Download links for JSON data
AI reasoning is captured in collapsible sections (collapsed by default):
- Click "AI Reasoning" to expand/collapse
- Shows the AI's thought process for each step
- Helps debug why a step passed or failed
Compare two test runs to identify improvements, regressions, and performance changes.
import { compareTestRuns } from 'copilot-test';
const result = await compareTestRuns(
'copilot-test-results/runs/2024-01-15T10-30-00.json',
'copilot-test-results/runs/2024-01-15T14-20-00.json',
'comparison.html'
);
console.log('Improvements:', result.changes.improved.length);
console.log('Regressions:', result.changes.regressed.length);
console.log('Duration change:', result.performance.durationChange);

npx tsx src/cli-compare.ts \
--baseline copilot-test-results/runs/2024-01-15T10-30-00.json \
--current copilot-test-results/runs/2024-01-15T14-20-00.json \
--output comparison.html

The comparison report shows:
- Summary Cards: Pass rate change, duration change, improvements, regressions
- Improvements: Tests that were failing and now pass
- Regressions: Tests that were passing and now fail
- New Scenarios: Scenarios added since baseline
- Removed Scenarios: Scenarios removed since baseline
- Performance Changes: Top 10 scenarios with significant duration changes (>100ms)
The CLI exits with code 1 if regressions are detected, making it ideal for CI/CD pipelines.
name: BDD Tests with Comparison
on: [push, pull_request]
jobs:
test:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: actions/setup-node@v4
with:
node-version: 20
- run: npm ci
- run: npm run build
# Download previous run results
- uses: actions/download-artifact@v4
continue-on-error: true
with:
name: test-results-baseline
path: baseline/
# Run tests
- run: npm test
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
# Compare with baseline if it exists
- name: Compare with baseline
if: hashFiles('baseline/report.json') != ''
run: |
npx tsx src/cli-compare.ts \
--baseline baseline/report.json \
--current copilot-test-results/report.json \
--output copilot-test-results/comparison.html
continue-on-error: true
# Upload current run as next baseline
- uses: actions/upload-artifact@v4
if: always()
with:
name: test-results-baseline
path: copilot-test-results/report.json
# Upload full report
- uses: actions/upload-artifact@v4
if: always()
with:
name: test-reports
path: copilot-test-results/

The trends.json file tracks up to 50 recent test runs:
{
"runs": [
{
"timestamp": "2024-01-15T10:30:00Z",
"duration": 45000,
"total": 25,
"passed": 23,
"failed": 2,
"skipped": 0,
"passRate": 92
}
]
}

Use this data to:
- Track test suite stability over time
- Monitor test execution duration trends
- Identify flaky tests (tests with inconsistent results)
- Measure improvement or degradation in pass rates
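For example, a small script could consume the trends.json shape shown above and compute a recent pass-rate average (an illustrative sketch; recentPassRate and the window size are arbitrary choices, and in practice you would load the file from disk):

```typescript
interface TrendRun {
  timestamp: string;
  duration: number;
  total: number;
  passed: number;
  failed: number;
  skipped: number;
  passRate: number;
}

// Average pass rate over the most recent `window` runs, to spot degradation.
function recentPassRate(runs: TrendRun[], window = 10): number {
  const recent = runs.slice(-window);
  if (recent.length === 0) return 0;
  return recent.reduce((sum, r) => sum + r.passRate, 0) / recent.length;
}

// Sample data matching the documented trends.json shape:
const trends: { runs: TrendRun[] } = {
  runs: [
    { timestamp: '2024-01-15T10:30:00Z', duration: 45000, total: 25, passed: 23, failed: 2, skipped: 0, passRate: 92 },
    { timestamp: '2024-01-15T14:20:00Z', duration: 43000, total: 25, passed: 25, failed: 0, skipped: 0, passRate: 100 },
  ],
};

console.log(recentPassRate(trends.runs)); // 96
```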