Skip to content

prithvi471/opensoirceor

Aegis PM — PRD Analyzer

The PRD review tool for product managers who don't have time to wait for a senior PM to read their doc.

Paste a PRD. Get a structured 100-point score across 10 dimensions, the single biggest gap, three specific rewrite suggestions, and the questions a reviewer would actually ask. Profile-aware: a startup PRD, a TCS engagement doc, and a Big Tech design doc are scored differently.

This is the production product. The repository also contains experimental modules (task breakdown, assignment, sprint tracking) — see Roadmap for current status.


What it does (Phase 1)

Two modes for the analyzer:

Mode Latency Cost What it tells you
scan <100ms $0 What sections exist, what's missing, structural completeness %
analyze ~3s ~$0.03 Full 100-point score, dimension breakdown, rewrite suggestions, reviewer questions

Profile-aware scoring for:

  • startup — falsifiability, lean PRD conventions
  • it_services — SOW alignment, client-side acceptance, KT plan
  • big_tech — design-doc conventions, alternatives mandatory, OKRs
  • financial_services — regulatory considerations (see Compliance Caveat)

Quickstart — no API key needed

Three modes, in order of friction. Pick whichever fits.

Zero-setup (rules-only) — instant, no API, no model download

pip install -r requirements.txt
python -m eval.runner --mode rules_only --max 3   # see it working in 1 second

Score any PRD with the deterministic regex + heuristic engine (82% band-pass on our 50-fixture eval — see LEADERBOARD.md). Free, offline, private. Useful as a CI gate or a fast pre-check.

from engine.module3_analyzer.prd_analyzer import PRDAnalyzer
analyzer = PRDAnalyzer()                       # no LLM client
result = analyzer.analyze(open("my_prd.md").read(), mode="rules_only")
print(f"{result['total_score']}/100 — {result['rating']}")

Local LLM (your laptop) — no key, no cloud, ~5GB download

# Install Ollama from https://ollama.com, then:
ollama pull llama3.1:8b
export AEGIS_LOCAL_BASE_URL=http://localhost:11434/v1
export AEGIS_LOCAL_MODEL=llama3.1:8b
python -m engine.verify --live                # confirms local model responds

Same JSON output shape as the cloud path, but runs entirely on your hardware. Slower (~30-60s per PRD on CPU) but unlimited and fully private.

Cloud LLM (free tier) — fastest + highest quality

Pick either of these free providers — no credit card:

# Option A: Groq (recommended — fast, generous daily limit)
echo "GROQ_API_KEY=gsk_xxxxxxxxxxxx" >> .env       # get one at console.groq.com

# Option B: OpenRouter (free models like deepseek-chat-v3:free)
echo "OPENROUTER_API_KEY=sk-or-xxxxxxxxxxxx" >> .env  # openrouter.ai

python -m engine.verify --live

Aegis auto-detects whichever key is present. Set AEGIS_REDACT=1 if you handle PII or regulated data — see SECURITY.md. For detailed signup steps and per-provider rate limits, see BUDGET.md.

Common usage

from engine.ai_engine import AegisEngine

eng = AegisEngine()

# Free, instant: structural scan (no LLM)
scan = eng.scan_prd(open("my_prd.md").read())
print(f"Completeness: {scan['completeness']}%")
print(f"Missing: {scan['missing_sections']}")

# Full analysis (uses whatever's configured: rules, local, or cloud)
result = eng.analyze_prd(open("my_prd.md").read())
print(f"Score: {result['total_score']}/100 — {result['rating']}")
print(f"Critical gap: {result['critical_gap']}")
for tip in result['top_improvements']:
    print(f"  - {tip}")

Architecture (Phase 1)

       ┌──────────────────┐
       │  PRD text input  │
       └────────┬─────────┘
                │
     ┌──────────┴──────────┐
     │                     │
     ▼                     ▼
┌──────────┐         ┌──────────────┐
│  Layer 1 │         │  Layer 1+2+3 │
│  scan()  │         │   analyze()  │
│  regex   │         │   regex →    │
│  <100ms  │         │   LLM →      │
│  free    │         │   RAG ground │
└──────────┘         │   ~3s, $0.03 │
                     └──────┬───────┘
                            │
                            ▼
                     ┌──────────────┐
                     │ 100-pt score │
                     │ 10 dims      │
                     │ rewrites     │
                     │ Q&A          │
                     └──────────────┘

Underneath:

  • Module 1 — RAG Knowledge Base. Hybrid BM25 + dense vector + Reciprocal Rank Fusion + cross-encoder re-ranking, over a 99K-word curated PM corpus.
  • Module 3 — Analyzer. 10-dimension rubric (problem statement, target user, goals, metrics, solution, risks, stakeholders, open questions, launch, writing). Profile-specific scoring overlays.
  • LLMClient. Provider-agnostic (OpenAI / Anthropic / Azure / local OpenAI-compatible). Built-in redaction, audit logging, and local-model fallback. See SECURITY.md.

Eval

Every prompt change is gated by an eval set:

# Structural eval (no API key, runs in CI)
pytest eval/test_eval.py -v

# Full LLM eval (10 anchors × 5 perturbations = 50 fixtures)
# Costs $0.00 if you use Groq / OpenRouter / Gemini free tier — see BUDGET.md
python -m eval.runner

See eval/README.md for acceptance thresholds.


Roadmap

Module Status Notes
1 — RAG knowledge base ✅ Production 99K-word corpus, hybrid retrieval
2 — PRD generator ✅ Production Profile + template aware
3 — PRD analyzer Production — the wedge 100-pt rubric, 50-fixture eval
4 — Task breakdown 🧪 Experimental See module4_tasks/EXPERIMENTAL.md
5 — Task assignment 🧪 Experimental Greedy bin-packing, naive skill match
6 — Resource planner 🧪 Experimental Hardcoded focus factor, magic-number defaults
7 — Sprint tracker 🧪 Experimental In-memory state only, no persistence

Experimental modules are gated behind the env var AEGIS_ENABLE_EXPERIMENTAL=1. Do not deploy experimental modules to production.


Compliance Caveat

The financial_services profile enriches scoring with finserv-specific heuristics — it does not make this system finserv-compliant.

In particular, the system as shipped:

  • Sends PRD content to a third-party LLM unless redaction is enabled.
  • Has no guaranteed data residency.
  • Does not produce a regulator-grade audit trail.
  • Has not been independently security-reviewed or SOC2-attested.

If you operate under SEC, FINRA, MiFID, GDPR-with-finserv-overlay, or similar regimes, do not use the LLM-backed analyze path on material non-public information without:

  1. Setting AEGIS_REDACT=1 (see SECURITY.md).
  2. Configuring a local model fallback (AEGIS_LOCAL_BASE_URL).
  3. Independent legal sign-off on the data flow.

For a one-page customer-facing summary aimed at compliance officers, see TRUST.md. For the full technical detail (redaction patterns, audit-log schema, env vars, every guarantee + every non-guarantee), see SECURITY.md.


Contributing / development

# Run all tests (no API key required for unit tests)
pytest tests/ eval/test_eval.py -v

# Eval the analyzer end-to-end
python -m eval.runner --scan-only       # CI mode, no API
python -m eval.runner                    # Full mode, costs ~$0.15 with gpt-4o-mini

When you change a prompt:

  1. Run the full eval.
  2. Commit the resulting eval/results.json.
  3. Diff band_pass_rate and dim_* counters against the previous baseline.
  4. Don't merge a regression without an explicit reason in the PR.

About

No description, website, or topics provided.

Resources

License

Code of conduct

Contributing

Security policy

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors

Languages