AI Testing Agent Framework · Open-Source · Multi-LLM · 5-second setup
git clone https://github.com/Wool-xing/Test-Agent.git
cd Test-Agent && pip install -e .
tagent demo # 0 API key · 0 config · stub LLM · 30s end-to-end
Outputs: test cases (Excel + xmind + markmap + opml) + Word report + decision logs, all under workspace/.
Ready to run on your project?
tagent init --preset 国内-web # or: minimal / saas-web / mobile-android / security-pentest
# → produces .env + tagent.yml + STARTUP.md (5-step onboarding guide)
8640 config combinations from a single matrix.yaml: change a line in the YAML and the wizard picks it up. See 04-配置文件/templates/INDEX.md.
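To illustrate how a single matrix file can expand into many configurations, here is a minimal sketch in Python. The axis names and the `MATRIX` dictionary are assumptions made for illustration: the values are taken from lists elsewhere in this README, but the real matrix.yaml may define different or additional axes, so the count here (1080) is not the 8640 quoted above.

```python
from itertools import product

# Hypothetical axes. The real matrix.yaml may use other names and sizes.
MATRIX = {
    "preset": ["国内-web", "minimal", "saas-web", "mobile-android", "security-pentest"],
    "llm": ["claude", "openai", "gemini", "qwen", "deepseek", "ollama"],
    "bug_tracker": ["zentao", "jira", "github", "gitlab", "linear", "webhook"],
    "notify": ["wecom", "lark", "dingtalk", "slack", "email", "teams"],
}


def expand(matrix):
    """Yield one config dict per combination of axis values."""
    keys = list(matrix)
    for values in product(*(matrix[key] for key in keys)):
        yield dict(zip(keys, values))


def count(matrix):
    """Number of combinations: the product of the axis sizes."""
    total = 1
    for values in matrix.values():
        total *= len(values)
    return total


if __name__ == "__main__":
    print(count(MATRIX))  # 5 * 6 * 6 * 6 = 1080
```

Adding one value to any axis multiplies the total by that axis's new size over its old size, which is why a small YAML edit can change the number of combinations substantially.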
Test-Agent turns any software, EXE, APK, Docker image, or API into a fully tested project — autonomous from requirement parsing to PoC-validated bug reports. Built for QA teams, security researchers, automotive testers, and anyone who wants to use AI testing while learning the theory behind it.
- 16 expert agents — functional · security · mobile · desktop · AI model · automotive · pentest …
- 32+ reusable skills — TDD · E2E · regression · pentest · car-CAN-bus · eval-harness · …
- 49 production utils — pytest · Playwright · JMeter · Appium · Burp · Allure · OpenCV · …
- Multi-LLM — Claude / OpenAI / Gemini / Qwen / DeepSeek / Ollama (local, no vendor lock-in)
- 6 BugTracker adapters — Zentao · Jira · GitHub Issues · GitLab Issues · Linear · Webhook (main charter §37)
- 6 notify channels — WeChat Work · Lark/Feishu · DingTalk · Slack · Email · MS Teams (main charter §36)
- MCP-native — 6-server suite + 4-gate marketplace
- 4-layer self-test — L1 lint · L2 mock CI · L3 real-LLM pre-tag · L4 weekly cron (main charter §33)
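The adapter lists above suggest a common interface with one implementation per tracker. The following is a minimal sketch of that pattern, not Test-Agent's actual adapter API: the `Bug`, `BugTracker`, `ADAPTERS`, and `build_payload` names and the `bug.created` event name are hypothetical. The GitHub payload uses the `title`, `body`, and `labels` fields that GitHub's create-issue endpoint accepts.

```python
from dataclasses import asdict, dataclass
from typing import Dict, Protocol


@dataclass
class Bug:
    title: str
    severity: str
    steps: str


class BugTracker(Protocol):
    """Hypothetical interface; each adapter shapes a tracker-specific payload."""

    def payload(self, bug: Bug) -> dict: ...


class GitHubIssues:
    def payload(self, bug: Bug) -> dict:
        # Fields accepted by GitHub's "create an issue" REST endpoint.
        return {
            "title": bug.title,
            "body": bug.steps,
            "labels": [f"severity:{bug.severity}"],
        }


class Webhook:
    def payload(self, bug: Bug) -> dict:
        # Generic JSON envelope; the event name is an assumption.
        return {"event": "bug.created", "bug": asdict(bug)}


ADAPTERS: Dict[str, BugTracker] = {
    "github": GitHubIssues(),
    "webhook": Webhook(),
}


def build_payload(tracker: str, bug: Bug) -> dict:
    """Look up the adapter by name and build its payload."""
    try:
        adapter = ADAPTERS[tracker]
    except KeyError:
        raise ValueError(f"unknown tracker: {tracker!r}") from None
    return adapter.payload(bug)
```

With this shape, supporting another tracker means adding one class and one registry entry, and callers keep using `build_payload` unchanged.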
curl -fsSL https://raw.githubusercontent.com/Wool-xing/Test-Agent/main/install.sh | bash -s -- /path/to/your-test-project
Then run tagent init to scaffold .env, tagent.yml, and STARTUP.md instead of spending 30 minutes hand-editing them.
- All-platform — Web / API / Android / iOS / WeChat-miniprogram / Windows EXE / macOS / Linux / Electron / game / IoT / audio-video / AI/LLM / blockchain / in-vehicle
- All-protocol — HTTP(S) / gRPC / WebSocket / TCP / UDP / GraphQL / SOAP / MQTT / SSH / serial / Kafka / RabbitMQ / Modbus / CAN-bus / SOME-IP / DoIP / UDS
- Multi-LLM no lock-in — switch with tagent model between Claude / OpenAI / Gemini / Qwen / DeepSeek / Ollama
- Learn while using — --mode learn outputs every step with theory references (22 KB cards across 13 domains: tools / coding / foundation / strategy / methods / protocols / platforms / gates / security / AI testing / compliance / process / build-your-own)
- Safe-by-default — sandboxed exec / PII scrub / runtime prompt-injection scan / 4-gate marketplace verify / decisions audit trail
- Product types: Web · API · Mobile · Desktop · IoT · AI · Blockchain · Vehicle · Embedded · Serverless
- Test types: functional / performance / security / compatibility / weak-network / stability / reliability / accessibility / contract / visual / i18n / observability / chaos / mutation / AI-specific (hallucination / prompt-injection / drift / fairness) / compliance
- Test design methods: equivalence-partitioning · boundary-value · decision-table · state-transition · pairwise · orthogonal · exploratory SBTM · risk-based · TDD · BDD · ATDD
- Quality gates: smoke → regression → performance_ci_quick → performance_full → release (5-layer)
Total ≈ 95% coverage; the remaining 5% (DO-178C avionics / HIPAA medical / IEC 61508 industrial) must be added by your domain experts.
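As a concrete instance of one test design method listed above, boundary-value analysis, here is a small sketch. It is a textbook illustration rather than Test-Agent's own generator; `boundary_values`, `expected_valid`, and the age range are hypothetical.

```python
from typing import List, Tuple


def boundary_values(lo: int, hi: int) -> List[int]:
    """Boundary-value picks for an inclusive integer range [lo, hi]:
    just outside, on, and just inside each edge."""
    if lo > hi:
        raise ValueError("lo must not exceed hi")
    candidates = [lo - 1, lo, lo + 1, hi - 1, hi, hi + 1]
    # Deduplicate while keeping order, since narrow ranges repeat values.
    return list(dict.fromkeys(candidates))


def expected_valid(value: int, lo: int, hi: int) -> bool:
    """Oracle: a value is valid when it lies inside the range."""
    return lo <= value <= hi


def cases_for(lo: int, hi: int) -> List[Tuple[int, bool]]:
    """Pair each boundary input with its expected accept/reject outcome."""
    return [(v, expected_valid(v, lo, hi)) for v in boundary_values(lo, hi)]


if __name__ == "__main__":
    # Hypothetical rule: ages 18 to 65 inclusive are accepted.
    for value, should_pass in cases_for(18, 65):
        print(value, "accept" if should_pass else "reject")
```

Equivalence partitioning complements this: the two out-of-range values stand in for the invalid partitions, and the four in-range values stand in for the valid one.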
Test-Agent ships with a 31-section charter (CHARTER.md-equivalent) covering:
- §10–§12 · Soul (3 axioms + 5 inscriptions + 16 key terms)
- §13–§17 · Architecture (experts / skills / installs / darwin self-evolution / AgentChat / MCP)
- §18–§21 · Methodology (9-cluster map / test pyramid 2024 / 18 closed-loop rules / 9-industry adapter / 50+ test types / 4 depth levels)
- §22 · Hermes-inspired (scheduler / subagent / learning-loop / 7 backends / 8 platforms)
- §23 · Teaching layer (KB 13 categories + anti-hallucination 3 layers + bilingual)
- §24 · GBrain-inspired (KB self-wiring graph + eval replay + PII scrub)
- §25–§26 · Pentest & Automotive verticals
- §27 · Karpathy 4 principles (think-before / simplicity-first / surgical / goal-driven)
- §28 · ECC test hardening (tdd-workflow / verification-loop / e2e / eval-harness / security-review)
- §29 · Essence watcher (auto-track upstream OSS for delta extraction)
- §30 · Marketplace 4-lane (4-gate security)
- §31 · Build-your-own-X learning layer
Test-Agent/
├── 00-项目导航.md ← 5-dimension category guide
├── 01-快速开始/ ← user manual / deploy / config / deliverables
├── 02-专家定义/ ← 16 expert agents
├── 03-技能定义/ ← 34 skills (incl. darwin-skill / karpathy-guidelines upstream)
├── 04-配置文件/ ← conftest / pytest.ini / .env / .mcp.json
├── 05-代码示例/ ← 49 production utils
├── 06-CICD集成/ ← GitHub Actions + Jenkins
├── runtime/ ← V1.x runtime layer (router / orchestrator / MCP / web / scheduler / subagent / learning_loop / backends / gateway / tutor / essence_watcher / marketplace)
├── docs/theory/ ← 22 teaching KB cards across 13 categories
├── profiles/compliance/ ← 10 industry compliance YAML profiles
├── marketplace/ ← Community skills / agents / mcp / hooks (4 lanes, 4-gate verify)
├── install.sh ← one-line deploy
├── README.md ← This file
├── FULL_GUIDE.md ← Full engineering guide
├── CHANGELOG.md ← Version log
└── LICENSE / SECURITY.md / CONTRIBUTING.md / CODE_OF_CONDUCT.md
| Audience | Read |
|---|---|
| First-time user | Quick start → Deploy |
| QA engineer | User manual → Skill catalog |
| Architect / SRE | Architecture deep-dive → Runtime |
| Security researcher | Pentest expert → pentest-coordinator |
| Automotive tester | Automotive expert → ASIL workflow |
| Contributor | CONTRIBUTING.md → Marketplace |
pytest 8.3 · Playwright 1.59 · Appium 5.3 · pywinauto · JMeter 5.6 · Allure · Airtest · OpenCV · Faker · SQLAlchemy 2.0 · MCP 1.0 · LiteLLM · Prefect · FastAPI · React 18 · Tailwind · Postgres+pgvector · MinIO · OpenTelemetry · Loguru · Docker Compose · GitHub Actions / Jenkins
See CONTRIBUTING.md for the full workflow (sync rules + RACI matrix + 6-layer dependency policy + Karpathy 4 principles).
Community marketplace contributions (marketplace/) go through 4 safety gates: signature → injection scan → docker sandbox → darwin-skill scoring.
MIT License — see LICENSE.
Upstream components retain their own licenses; see NOTICE.md for attributions.
- hermes-agent — closed learning loop + 7 backends + multi-platform gateway
- gbrain — self-wiring KB graph + eval replay + safe-by-default
- andrej-karpathy-skills — 4 LLM-coding principles
- everything-claude-code — TDD / verification / harness-first
- pentagi + shannon — pentest agent black-box + white-box
- build-your-own-x — deep-dive learning path
Made for testers · Built with testers · Tested by testers