Skip to content

Ktovoz/Skiritai

Repository files navigation

Skiritai

AI-Powered Test Automation Agent

Named after the Skiritai — Sparta's elite reconnaissance troops who scouted the path ahead of the main army.


Python 3.11+ Playwright License: MIT

Test Status Publish

English | 中文


What is Skiritai?

Skiritai is an AI-driven browser test automation framework that scouts automation paths before executing them.

Like the ancient Skiritai who reconnoitered the terrain before the Spartan army advanced, Skiritai's agent first * explores* the target application — navigating pages, discovering UI elements, and figuring out the correct sequence of actions — then generates replayable scripts that can execute the same path at 30x speed without any AI inference.

Explore Mode (Scout the path)
  AI Agent → analyze page → decide actions → generate scripts
         ↓
Replay Mode (Execute the proven path)
  Script → direct execution → no AI needed → 30x faster

Key Features

Feature Description
Explore → Replay Loop AI explores and generates scripts on first run; replays them instantly on subsequent runs
30x Performance Replay mode skips AI inference entirely — 74s → 2.5s
Flow API Functional, no-subclass API — async with flow() as ai:
YAML Cases Define test steps in YAML, run via CLI or run_yaml_case()
Python-native Cases Define test cases as Python classes with decorators
Auto-Solidification Successful explorations are automatically saved as replayable scripts
Multi-level Fallback fillclick_forceeval_js for resilient element interaction
Flexible LLM Supports OpenAI, Anthropic, Qwen, and any compatible API
Optional Web UI FastAPI backend with REST + WebSocket for external frontends
Visual Reports Standalone HTML report built with Vue 3 + Ant Design — screenshots, assertions, step details
CLI skiritai run/serve/list/browser commands

How It Works

from skiritai import BaseCase, step_mode


class SearchTest(BaseCase):
    async def setup(self):
        await self.launch_browser()

    async def teardown(self):
        await self.close_browser()

    async def open_site(self):
        await self.ai.action("Navigate to https://example.com")

    @step_mode("explore")  # Force AI exploration for this step
    async def search(self):
        await self.ai.action("Search for 'automation testing'")

    async def verify(self):
        await self.ai.action("Verify search results are displayed")

First run — AI explores each step, generates scripts:

[Step] open_site   (explore)  → 20s  → scripts/open_site.py   ✓
[Step] search      (explore)  → 30s  → scripts/search.py      ✓
[Step] verify      (explore)  → 24s  → scripts/verify.py      ✓
Total: 74s

Second run — scripts replay directly, no AI:

[Step] open_site   (replay)   → 0.8s → direct execution       ✓
[Step] search      (replay)   → 0.8s → direct execution       ✓
[Step] verify      (replay)   → 0.8s → direct execution       ✓
Total: 2.5s

Flow API (Functional, No Subclass)

from skiritai import flow

async with flow() as ai:
    await ai.action("Navigate to https://example.com")
    await ai.screenshot("homepage")
    await ai.verify("Page title contains 'Example'")

flow() is a functional context manager — no subclass, no decorators. Just ai.action(), ai.verify(), ai.screenshot(), ai.analyze_page(), and ai.get_page_info().

YAML Cases (No Code)

# case.yaml
name: Search Test
steps:
  - action: Open https://www.baidu.com
  - action: Search for "Playwright"
  - verify: Search results are displayed
  - screenshot: result
skiritai run examples/beginner/baidu_search/03_yaml

YAML cases are perfect for QA teams who want AI-driven testing without writing Python. Supports action, verify, screenshot, analyze, page_info steps, with per-step on_failure: skip | abort policies.

Quick Start

1. Install

pip install skiritai
playwright install chromium

2. Configure

# .env
OPENAI_API_KEY=your-api-key
OPENAI_BASE_URL=https://api.openai.com/v1
LLM_MODEL=gpt-4o

3. Run

# Run an example case
skiritai run examples/tutorial/minimal

# List available cases
skiritai list examples/

Or programmatically:

import asyncio
from pathlib import Path
from skiritai import run_case

report = asyncio.run(run_case(Path("examples/minimal")))
print(report)

4. (Optional) Start Web Server

pip install skiritai[web]
skiritai serve --host 0.0.0.0 --port 8000

Development

# Clone and install
git clone https://github.com/Ktovoz/Skiritai.git
cd Skiritai

# Install uv (one-time)
curl -LsSf https://astral.sh/uv/install.sh | sh

# Sync dependencies and set up dev environment
uv sync

# Run tests
uv run pytest

Project Structure

skiritai/
├── core/                      # Core engine (always installed)
│   ├── agent_loop.py          # LangGraph ReAct Agent
│   ├── ai_context.py          # Explore/Replay execution context
│   ├── base_case.py           # Test case base class
│   ├── runner.py              # Case discovery and execution
│   ├── flow.py                # Functional no-subclass API
│   ├── yaml_runner.py         # YAML case loader and runner
│   ├── tools.py               # Playwright tool set (14 tools)
│   ├── browser.py             # Browser lifecycle management
│   └── ...
├── llm/                       # LLM provider abstraction
│   ├── openai_provider.py
│   └── anthropic_provider.py
├── events/                    # Event bus
├── web/                       # [optional] FastAPI server (pip install skiritai[web])
│   ├── app.py                 # Application factory
│   ├── routers/               # REST + WebSocket endpoints
│   └── ws_manager.py          # Event → WebSocket bridge
└── cli.py                     # CLI entry point

report/                        # Visual report project (Vue 3 + Ant Design)
├── src/                       #   Components: ReportHeader, SummaryBar, StepCard, ScreenshotViewer
├── dist/                      #   Build output (single-file HTML, data injected by _render_html)
└── package.json               #   skiritai-report

examples/                      # Sample test cases
├── tutorial/                  # Teaching examples (learn framework features)
│   ├── minimal/               #   Pure Playwright, no AI needed
│   ├── step_modes/            #   auto/explore/replay execution modes
│   ├── failure_policies/      #   ABORT/SKIP/RETRY failure strategies
│   ├── hooks_demo/            #   before_step/after_step/on_step_error hooks
│   └── context_demo/          #   Cross-step context sharing via self.ctx
├── baidu_search/              # [First Try] Full E2E AI-driven test + replay scripts
├── beginner/                  # Beginner examples — 3 usage patterns for Baidu search
│   └── baidu_search/          #   01_basecase, 02_flow, 03_yaml
├── advanced/                  # Advanced examples — 3 usage patterns for ktovoz blog
│   └── ktovoz_blog/           #   01_basecase, 02_flow, 03_yaml
└── ktovoz_blog/               # [Advanced] 11-step long-range blog test

tests/                         # Framework tests
├── unit/
├── functional/
├── acceptance/
└── e2e/

Examples

Examples are organized into three tiers:

Beginner — Baidu Search (3 usage patterns × 3 LLM configs)

Example Description
beginner/baidu_search/01_basecase/ BaseCase class + .env auto-detect
beginner/baidu_search/02_flow/ flow() functional API + explicit OpenAIProvider
beginner/baidu_search/03_yaml/ YAML declarative + skiritai.toml config file

Advanced — ktovoz Blog (3 usage patterns × 3 LLM configs)

Example Description
advanced/ktovoz_blog/01_basecase/ BaseCase class + .env (full 11-step blog test)
advanced/ktovoz_blog/02_flow/ flow() functional API + explicit Provider
advanced/ktovoz_blog/03_yaml/ YAML declarative + skiritai.toml config file

Teaching (learn framework features)

Example What It Teaches
minimal/ BaseCase structure — pure Playwright, no LLM required
step_modes/ auto / explore / replay execution modes via @step_mode
failure_policies/ @on_failure(SKIP) / @on_failure(RETRY) error handling
hooks_demo/ before_step / after_step / on_step_error lifecycle hooks
context_demo/ Cross-step data sharing with self.ctx.store

First Try (real-world end-to-end)

Example Description
baidu_search/ Complete E2E: open Baidu → search → verify results. Demonstrates Explore→Replay loop in a real scenario.

Advanced (long-range testing)

Example Description
ktovoz_blog/ 11-step blog test: homepage, articles, tags, about, footer, search, summary. Demonstrates the framework's capability for complex multi-step scenarios.
# Beginner — Baidu search (3 patterns)
skiritai run examples/beginner/baidu_search/01_basecase
python examples/beginner/baidu_search/02_flow/demo.py
skiritai run examples/beginner/baidu_search/03_yaml

# Advanced — ktovoz blog (3 patterns)
skiritai run examples/advanced/ktovoz_blog/01_basecase
python examples/advanced/ktovoz_blog/02_flow/demo.py
skiritai run examples/advanced/ktovoz_blog/03_yaml

# Start with a teaching example (no AI needed)
skiritai run examples/tutorial/minimal

Roadmap

Vision Perception Layer

Current AI exploration relies on DOM analysis and CSS selectors. The next evolution adds visual perception — the agent will "see" the page like a human tester, enabling:

  • Visual-based AI exploration — interpret screenshots, identify UI elements by appearance, and interact with canvas/WebGL-based interfaces that lack accessible DOM
  • Multimodal model support — leverage vision-language models (GPT-4o, Claude 3.5 Sonnet, Gemini) and native multimodal models for richer page understanding
  • Visual regression detection — compare screenshots across runs to catch unexpected UI changes

Multi-Platform Testing

Skiritai currently supports Web (Playwright/Chromium). We plan to extend to:

Platform Planned Approach Status
Mobile (iOS/Android) Appium / browser-use mobile integration Planned
API Testing HTTP request tools for the AI agent Planned
Desktop (Electron, native) Playwright Electron / OS-level automation Under investigation

The goal is a unified test framework where the same Explore → Replay workflow works across Web, Mobile, and API — write once, test everywhere.


CLI Commands

skiritai run <case_dir>               # Run a test case
skiritai serve [--host] [--port]       # Start web server
skiritai list [cases_root]            # List available cases
skiritai browser status [case_dir]    # Check persistent browser session
skiritai browser cleanup [case_dir]   # Kill orphan browser process

Tool Set

14 Playwright tools available to the AI agent:

Tool Description
navigate Navigate to URL
click Click element
click_force Force click (for hidden elements)
fill Fill input field
type_text Type character by character
focus Focus on element
get_text Get element text content
get_page_info Get page title, URL, and text summary
wait_for Wait for element to appear
scroll Scroll page
eval_js Execute JavaScript
select_option Select dropdown option
hover Hover over element
screenshot Capture page screenshot

Execution Modes

Control how each step executes via ai.action() or the @step_mode decorator:

Mode Behavior Use Case
auto (default) Replay if script exists, otherwise explore Most steps
explore Always use AI, overwrite existing script New features, re-exploration
replay Always replay, error if no script CI/CD regression
# Via decorator
@step_mode("explore")
async def my_step(self, ai):
    await ai.action("...")


# Via parameter (overrides decorator)
await ai.action("...", mode="replay")

Author

Joe Shen

GitHub


License

MIT

Contributing

PRs Welcome

Contributions, issues, and feature requests are welcome!

About

AI-powered test automation framework that explores test paths with AI and generates replayable scripts for 30x faster execution.

Resources

License

Stars

Watchers

Forks

Packages

 
 
 

Contributors

Languages