A sophisticated multi-agent system built with LangGraph and Python that analyzes software requirements and automatically generates high-quality Gherkin BDD features and Playwright automation.
This system takes requirements from multiple sources (Jira, files, or raw text), orchestrates an intelligent multi-agent pipeline, and produces:
- ✅ High-quality Gherkin feature files with proper BDD structure and semantic tags
- ✅ Test coverage matrices in Markdown format with manual test identification
- ✅ Playwright BDD automation with Page Object Model (for frontend requirements)
- ✅ Semantic change detection to avoid unnecessary regeneration (wording-only vs. real changes)
- ✅ Persistent state & traceability via SQLite (MCP-based)
- ✅ LangSmith integration for full pipeline observability
```
Input (Jira/File/Text)
        ↓
[Ingestion Agent] ── fetch via MCP HTTP/Filesystem
        ↓
[Change Detection Agent] ── semantic diff via Groq Mixtral
        ↓ (skip if wording-only)
[Classification Agent] ── OpenAI classification
        ↓
[Feature Gen Agent] ── generate Gherkin via OpenAI
        ↓
[Coverage Eval Agent] ── Groq Mixtral evaluation
        ↓
[Router Agent] ── decide specialist path
   ├── frontend? → [Playwright Agent]
   └── other    → [Post-Eval Agent]
        ↓
[Post-Eval Agent] ── final QA via Groq Mixtral
        ↓
Output (Features, Coverage, Automation)
```
- All CLI and Gradio runs create a root pipeline span plus child spans for every stage (ingestion, classification, feature generation, etc.) using `services/observability.py`.
- Set these environment variables (preferably in `.env`) to enable tracing:

  ```
  LANGSMITH_API_KEY=lsv2_...
  LANGCHAIN_PROJECT=qa-e2e   # or any project name you prefer
  LANGCHAIN_TRACING_V2=true
  ```

- Every run now appears inside the configured LangSmith project with:
  - Inputs (source type, reference)
  - Nested spans per agent with status/errors
  - Final outputs (feature path, coverage report path, specialist notes)
- The same observability hooks are reused by `python main.py ...` and the Gradio UI because both call the shared pipeline entry point.
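The root-span/child-span layout can be sketched with a minimal recorder. This is illustrative only: the names (`SpanRecorder`, `span`) are hypothetical stand-ins, while the real `services/observability.py` wraps LangSmith rather than an in-memory list.

```python
import contextlib
import time

class SpanRecorder:
    """Hypothetical sketch of a nested-span recorder; the real
    observability module reports spans to LangSmith instead."""
    def __init__(self):
        self.spans = []   # (name, parent, duration) tuples, in close order
        self._stack = []  # current span ancestry

    @contextlib.contextmanager
    def span(self, name):
        parent = self._stack[-1] if self._stack else None
        self._stack.append(name)
        start = time.monotonic()
        try:
            yield
        finally:
            self._stack.pop()
            self.spans.append((name, parent, time.monotonic() - start))

rec = SpanRecorder()
with rec.span("pipeline"):          # root span for the whole run
    with rec.span("ingestion"):     # child span per stage
        pass
    with rec.span("classification"):
        pass
```

Because the CLI and Gradio UI both call the shared pipeline entry point, one recorder like this covers both without duplicated instrumentation.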
All IO operations delegate to MCP tools (not direct calls):
| Operation | MCP Tool | Usage |
|---|---|---|
| File I/O | `filesystem` | Create/read/write feature files, markdown, raw requirements |
| Database | `sqlite` | All SQL operations on `db/context.db` for persistence |
| Jira API | `http` / `jira` | Fetch requirements from Jira via REST API |
| Shell commands | `shell` / `terminal` | Bootstrap Playwright, run `npm install` |
| PDF/Docx extraction | `pdf_extract` / `docx_extract` | Parse document requirements |
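The delegation pattern behind this table can be sketched as a thin dispatcher: agents call a named tool rather than touching files, the database, or the network directly. Everything here (`MCPRouter`, `register`, `call`) is a hypothetical illustration, not the actual MCP client API.

```python
from typing import Any, Callable, Dict

class MCPRouter:
    """Hypothetical sketch: route every IO operation through a
    registered MCP tool so agents never perform direct IO."""
    def __init__(self) -> None:
        self._tools: Dict[str, Callable[..., Any]] = {}

    def register(self, name: str, handler: Callable[..., Any]) -> None:
        self._tools[name] = handler

    def call(self, tool: str, *args: Any, **kwargs: Any) -> Any:
        if tool not in self._tools:
            raise KeyError(f"MCP tool not registered: {tool}")
        return self._tools[tool](*args, **kwargs)

router = MCPRouter()
# A stub handler standing in for the real filesystem MCP tool:
router.register("filesystem", lambda op, path, data=None: f"{op}:{path}")
result = router.call("filesystem", "write", "coverage/demo.md", "# report")
```

The benefit of the indirection is testability: in unit tests each tool can be swapped for a stub, exactly as done above.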
- Fetches requirements from Jira, file, or raw text
- Normalizes and stores in SQLite via MCP
- Creates initial RunState
- Compares current vs. previous requirement text
- Uses semantic diff + Groq to determine if change is meaningful
- Skips pipeline if wording-only (idempotency)
- Records change type in SQLite
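The decision the change-detection step makes can be sketched as a cheap normalization check with an LLM fallback. This is a simplified illustration: `classify_change` and its `llm_judge` parameter are hypothetical names, and the real agent uses a semantic diff plus Groq Mixtral as the judge.

```python
import re

def classify_change(old: str, new: str, llm_judge) -> str:
    """Sketch: return "wording_only" or "semantic_change".
    llm_judge(old, new) -> bool stands in for the Groq call."""
    canon = lambda t: re.sub(r"\s+", " ", t.lower().strip(".! "))
    if canon(old) == canon(new):
        return "wording_only"  # trivial rewording: skip the LLM entirely
    return "semantic_change" if llm_judge(old, new) else "wording_only"

# llm_judge stubbed here for illustration
verdict = classify_change(
    "Users can log in.",
    "users   can log in",
    llm_judge=lambda a, b: True,
)
```

A `wording_only` verdict short-circuits the rest of the pipeline, which is what makes re-runs idempotent.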
- Classifies work area: `frontend | api | data | load | security | other`
- Updates RunState and SQLite
- Generates high-quality Gherkin with:
  - Multiple scenarios (happy, edge, negative paths)
  - Proper tags (`@area-*`, `@priority-*`, `@req-*`, `@edge`, `@negative`)
  - Clear Given/When/Then steps
- Saves to `playwright/ui/features/{requirement_id}.feature`
- Evaluates test coverage against requirements
- Generates coverage matrix in Markdown
- Identifies manual test gaps
- Saves to `coverage/{requirement_id}-coverage.md`
- Routes frontend → Playwright Agent
- Routes others → Post-Eval Agent (stub for future)
- Bootstraps Playwright + Cucumber project structure
- Generates TypeScript step definitions
- Generates Page Object Model classes
- Saves files to `playwright/ui/step-definitions/` and `playwright/ui/pages/`
- Final QA: checks alignment between requirements, Gherkin, and automation
- Appends findings to coverage markdown
You can run a lightweight Gradio UI to interact with the pipeline locally. The UI supports:
- Paste requirement text
- Upload a requirement file (pdf/docx/txt)
- Provide a Jira key
- Run pipeline and stream logs
- Preview generated feature + coverage artifact paths
Run the UI:
```
# Activate the project's venv (.venv if you used uv)
source .venv/bin/activate

# Install Gradio if not already installed
uv sync   # or: pip install gradio

# Start the Gradio app
python ui/gradio_app.py
```

Visit http://localhost:7860 in your browser.
- Updates final status in SQLite
All database operations go through the MCP SQLite tool.
`requirements`
- `id` (PK): Unique identifier
- `requirement_id`: Jira key or synthetic ID
- `source_type`: `"jira" | "file" | "text"`
- `source_reference`: Jira key / file path / NULL
- `raw_text`: Original requirement text
- `normalized_text`: Normalized/cleaned text
- `project_area`: Classification
- `last_run_id`: Most recent run
- `created_at`, `updated_at`: Timestamps

`runs`
- `run_id` (PK): Unique run identifier
- `requirement_id` (FK): Links to requirements
- `status`: Pipeline status
- `started_at`, `finished_at`: Timestamps
- `notes`: Human-readable notes

`artifacts`
- `id` (PK): Artifact identifier
- `run_id` (FK): Links to run
- `type`: `"feature" | "coverage" | "playwright"`
- `path`: File system path
- `hash`: Content hash for integrity
- `created_at`: Timestamp

`locks`
- `requirement_id` (PK): Prevents concurrent modification
- `locked_at`: Lock acquisition time
- `locked_by`: Who locked it

`changes`
- `id` (PK): Change identifier
- `requirement_id` (FK): Links to requirements
- `old_text`: Previous normalized text
- `new_text`: New normalized text
- `change_type`: `"semantic_change" | "wording_only"`
- `created_at`: Timestamp
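The `hash` column in `artifacts` enables integrity checks before reusing a generated file. A minimal sketch, assuming SHA-256 (the schema only says "content hash", so the algorithm is an assumption):

```python
import hashlib

def content_hash(data: bytes) -> str:
    """Sketch: compute the value stored in artifacts.hash.
    SHA-256 is assumed; the schema does not name the algorithm."""
    return hashlib.sha256(data).hexdigest()

def artifact_unchanged(data: bytes, stored_hash: str) -> bool:
    """Compare current file bytes against the recorded hash."""
    return content_hash(data) == stored_hash

h = content_hash(b"Feature: Google SSO Login")
```

On a re-run, a matching hash means the artifact on disk is exactly what the pipeline last wrote and can be trusted as-is.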
- Uses the `locks` table to prevent concurrent modification of the same requirement
- Acquires the lock before processing, releases it after completion
- Supports idempotent re-runs (wording-only changes are skipped)
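The locking scheme can be sketched with stdlib `sqlite3` (the real system performs these statements through the MCP SQLite tool; the helper names are illustrative). The `PRIMARY KEY` on `requirement_id` makes acquisition atomic: a second concurrent `INSERT` fails and backs off.

```python
import sqlite3
import time

con = sqlite3.connect(":memory:")
con.execute("""CREATE TABLE locks (
    requirement_id TEXT PRIMARY KEY,
    locked_at REAL,
    locked_by TEXT)""")

def acquire_lock(con, req_id: str, owner: str) -> bool:
    """Atomic acquire: the PK constraint rejects a second holder."""
    try:
        con.execute("INSERT INTO locks VALUES (?, ?, ?)",
                    (req_id, time.time(), owner))
        con.commit()
        return True
    except sqlite3.IntegrityError:
        return False  # someone else holds the lock

def release_lock(con, req_id: str) -> None:
    con.execute("DELETE FROM locks WHERE requirement_id = ?", (req_id,))
    con.commit()

got = acquire_lock(con, "JIRA-123", "run-1")      # succeeds
blocked = acquire_lock(con, "JIRA-123", "run-2")  # rejected
release_lock(con, "JIRA-123")
```

Relying on the database constraint rather than an application-level check avoids the check-then-insert race between two pipeline runs.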
```
QAe2e/
├── main.py                  # CLI entrypoint (Typer)
├── graph.py                 # LangGraph workflow definition
├── state.py                 # Pydantic state models
│
├── agents/                  # Agent nodes
│   ├── __init__.py
│   ├── ingestion_agent.py
│   ├── change_detection_agent.py
│   ├── classification_agent.py
│   ├── feature_gen_agent.py
│   ├── coverage_eval_agent.py
│   ├── router_agent.py
│   ├── playwright_agent.py
│   └── post_eval_agent.py
│
├── services/                # Service layer (MCP abstractions)
│   ├── __init__.py
│   ├── db.py                # MCP SQLite wrapper
│   ├── jira_client.py       # MCP HTTP/Jira wrapper
│   ├── nlp_utils.py         # Text processing
│   ├── diff_utils.py        # Semantic diff
│   ├── logging_utils.py     # Structured logging + LangSmith env hooks
│   └── observability.py     # LangSmith span manager for CLI + Gradio runs
│
├── playwright/
│   └── ui/
│       ├── features/          # Generated .feature files
│       ├── step-definitions/  # Generated .ts step defs
│       ├── pages/             # Page object models
│       ├── support/           # Playwright hooks/support code
│       └── config/            # Playwright config
│
├── coverage/                # Generated coverage reports (.md)
├── requirements_raw/        # Raw requirement snapshots
├── parsed/                  # Parsed/processed requirements
├── db/                      # SQLite database
│   └── context.db           # Created/managed via MCP
├── logs/                    # Application logs
│   └── app.log
│
├── README.md                # This file
├── README-usage.md          # Usage guide
├── pyproject.toml           # Python dependencies
├── .env.example             # Environment template
└── .github/
    └── copilot-instructions.md # VS Code Copilot context
```
- Python 3.10+
- Node.js 16+ (for Playwright)
- `uv` (recommended, fast Python package installer); install from https://docs.astral.sh/uv/
- API Keys:
- OpenAI (GPT-4)
- Groq Mixtral
- (Optional) Jira API token
- (Optional) LangSmith API key
1. Clone/create workspace:

   ```
   cd QAe2e
   ```

2. Set up the Python environment with `uv` (recommended):

   ```
   uv venv .venv
   source .venv/bin/activate    # macOS/Linux
   # or: .venv\Scripts\activate   (Windows)
   ```

   Or with standard pip:

   ```
   python -m venv venv
   source venv/bin/activate     # macOS/Linux
   # or: venv\Scripts\activate    (Windows)
   ```

3. Install dependencies:

   ```
   uv sync               # Recommended (fast)
   # or: pip install -e .  # Alternative
   ```

4. Configure environment:

   ```
   cp .env.example .env
   # Edit .env with your API keys
   ```

5. Initialize the system:

   ```
   python main.py init
   ```
```
# Analyze a requirement from text
python main.py run --text "As a user I want to log in with Google SSO"

# Analyze from Jira
python main.py run --jira-key JIRA-123

# Analyze from file
python main.py run --file requirements.pdf

# Check status
python main.py status --requirement-id JIRA-123

# Verbose mode (debug logging)
python main.py run --jira-key JIRA-123 --verbose
```

See README-usage.md for detailed usage documentation, including:
- CLI commands and examples
- Interpreting coverage reports
- Understanding manual test notes
- Extending for new specialist agents
- Troubleshooting
Enable LangSmith tracing to inspect all agent executions:
```
# Set in .env
LANGSMITH_ENABLED=true
LANGCHAIN_API_KEY=ls_...
```

Then inspect runs at https://smith.langchain.com. Each agent call appears as a span with metadata:
- `requirement_id`
- `run_id`
- `project_area`
- `source_type`
Structured JSON logging goes to `logs/app.log`. Each log entry includes:
- `timestamp`, `level`, `logger`
- `run_id`, `requirement_id` (if available)
- `agent_name`, `metadata` (structured context)
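A formatter producing entries of that shape might look like the sketch below. The class name and exact field handling are assumptions; the real implementation lives in `services/logging_utils.py`.

```python
import json
import logging

class JsonFormatter(logging.Formatter):
    """Sketch: emit one JSON object per log record with the
    contextual fields the README lists (names assumed)."""
    def format(self, record: logging.LogRecord) -> str:
        entry = {
            "timestamp": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Contextual fields attached via logging's `extra=` kwarg
        for key in ("run_id", "requirement_id", "agent_name", "metadata"):
            if hasattr(record, key):
                entry[key] = getattr(record, key)
        return json.dumps(entry)

logger = logging.getLogger("qa-e2e")
handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
logger.addHandler(handler)
logger.warning("feature generated",
               extra={"run_id": "r-42", "agent_name": "feature_gen"})
```

Using `extra=` keeps call sites terse while every entry stays machine-parseable for later analysis.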
✅ MCP-Native IO
- All file operations use the `filesystem` MCP tool
- All database ops use the `sqlite` MCP tool
- All external API calls use the `http` MCP tool
- All shell commands use the `shell` MCP tool
✅ Semantic Change Detection
- Avoids unnecessary regeneration of tests when requirements are just reworded
- Uses Groq Mixtral for intelligent semantic understanding
- Records change history in SQLite
✅ Idempotency
- Wording-only changes skip regeneration
- Locks prevent concurrent modification
- Previous versions stored for comparison
✅ High-Quality Gherkin
- Multiple scenarios per feature (happy, edge, negative)
- Proper BDD language (Given/When/Then)
- Comprehensive tagging system
- Generated from requirements using LLM
✅ Test Automation Coverage
- Coverage matrix with happy/edge/negative/manual assessment
- Playwright BDD with Page Object Model (for frontend)
- Step definitions and page objects generated automatically
- Ready to run: `npx playwright test`
✅ LLM Flexibility
- OpenAI GPT-4 for main agents (thinking, generation)
- Groq Mixtral for evaluators (cost-effective critic)
- Easy to swap LLM providers
✅ Observability & Tracing
- LangSmith integration for full pipeline visibility
- Structured logging with context
- Human-readable CLI with progress updates
1. Create an agent module in `agents/new_area_agent.py`
2. Implement processing logic with MCP tool calls
3. Add the node to `graph.py`
4. Update router logic in `router_agent.py`
5. Document it in README-usage.md
Edit `state.py` and the agent implementations:

```python
OPENAI_MODEL = "gpt-4-turbo"        # or "gpt-3.5-turbo"
GROQ_MODEL = "mixtral-8x7b-32768"   # or another Groq model
```

Edit `coverage_eval_agent.py` to enhance the manual test detection prompt.
```gherkin
@area-frontend @priority-p0 @req-JIRA-123 @smoke
Feature: Google SSO Login Integration

  @happy-path
  Scenario: Successful login with Google SSO
    Given I am on the login page
    When I click the "Sign in with Google" button
    And I authenticate with valid Google credentials
    Then I should be redirected to the dashboard
    And my user profile should be loaded

  @edge-case
  Scenario: User already logged in via SSO
    Given I am already logged in via Google SSO
    When I navigate to the login page
    Then I should be redirected to the dashboard

  @negative
  Scenario: Login cancelled by user
    Given I am on the login page
    When I click the "Sign in with Google" button
    And I cancel the Google authentication dialog
    Then I should remain on the login page
    And see a message "Login cancelled"
```

```markdown
# Coverage Report: JIRA-123

## Requirement Summary
"As a user I want to log in with Google SSO so that I can access without managing passwords"

## Gherkin Scenarios
Total scenarios: 3

| Scenario ID | Covers Happy Path | Covers Edge Cases | Covers Negative | Automated | Manual Needed | Notes |
|-------------|-------------------|-------------------|-----------------|-----------|---------------|-------|
| Scenario 1 | Yes | No | No | Yes | No | - |
| Scenario 2 | No | Yes | No | Yes | Yes | Test Google UI flow |
| Scenario 3 | No | No | Yes | Yes | No | - |

## Coverage Analysis
- Happy path: 100% covered
- Edge cases: 67% covered
- Negative cases: 100% covered
- Overall: 89% coverage

### Manual Tests Needed
1. Test with Google account email not registered in system
2. Test SSO flow on mobile browsers
3. Test error handling for Google API outages

## Post-Generation Review
- ✅ Gherkin accurately represents requirements
- ✅ Acceptance criteria fully covered
- ⚠️ Consider adding scenario for permission denial
```

See README-usage.md for common issues and solutions.
MIT License - See LICENSE file for details
Contributions welcome! Please ensure:
- All new agents follow the LoggingMixin pattern
- MCP tools are used for all IO (no direct file/db access)
- Type hints are complete
- Docstrings are comprehensive
- Tests are included
For issues, questions, or feedback:
- Check logs in `logs/app.log`
- Enable verbose mode: `--verbose`
- Review LangSmith traces at https://smith.langchain.com
- Check README-usage.md for common patterns
Built with ❤️ using LangGraph, OpenAI, Groq, and MCP tools