GitHub - Ata-ux/hypolab: Hypothesis Lab for Claude Code — one command runs the full product validation cycle: hypothesis → research → synthetic filter → deployed landing → outreach. Every step cites a published framework.

  ██╗  ██╗██╗   ██╗██████╗  ██████╗ ██╗      █████╗ ██████╗
  ██║  ██║╚██╗ ██╔╝██╔══██╗██╔═══██╗██║     ██╔══██╗██╔══██╗
  ███████║ ╚████╔╝ ██████╔╝██║   ██║██║     ███████║██████╔╝
  ██╔══██║  ╚██╔╝  ██╔═══╝ ██║   ██║██║     ██╔══██║██╔══██╗
  ██║  ██║   ██║   ██║     ╚██████╔╝███████╗██║  ██║██████╔╝
  ╚═╝  ╚═╝   ╚═╝   ╚═╝      ╚═════╝ ╚══════╝╚═╝  ╚═╝╚═════╝

       hypothesis lab for agent coding CLIs
                   v 0 · 2

One command. Full product validation pipeline. Every step cites a published framework. Works with any agent CLI that reads SKILL.md: Claude Code · Claude Cowork · Codex · Gemini CLI · Cursor · OpenCode · Antigravity

Classical validation path: ~2–3 weeks. With hypolab: 30–90 minutes. Same rigor. Real frameworks. Verifiable artifacts.

The one-liner

/hypolab AI assistant for Russian HR managers hiring developers

That's it. hypolab runs the full pipeline and stops at 3 decision points for your approval. At the end you have:

- a structured hypothesis with falsifiable kill criteria
- 6–8 research items with real URLs and verbatim quotes
- 10 synthetic personas (as a filter, honest about limits)
- a working Next.js landing page with lead capture
- a live Vercel URL
- 6 outreach templates grounded in your audience's actual language
- LAB_REPORT.md: full summary with a weekly kill-criteria dashboard

How the pipeline flows

flowchart LR
  idea([Raw idea]) --> outline["① Frame<br/>hypothesis"]
  outline --> deep["② Deep research<br/>3 parallel agents"]
  deep --> validate["③ Synthetic<br/>filter · 10 personas"]
  validate --> cp1{CHECKPOINT 1<br/>approve?}
  cp1 -->|yes| landing["④ Build landing<br/>Next.js 15"]
  landing --> cp2{CHECKPOINT 2<br/>deploy?}
  cp2 -->|yes| deploy["⑤ Deploy<br/>Vercel / YC"]
  deploy --> outreach["⑥ Outreach<br/>Josh Braun 4-T"]
  outreach --> cp3{CHECKPOINT 3<br/>send?}
  cp3 -->|yes| report["⑦ Lab report"]
  report --> monitor["⑧ Weekly monitor<br/>kill criteria"]

  classDef phase fill:#f7f5f2,stroke:#111110,stroke-width:1.5px,color:#111110
  classDef checkpoint fill:#c75b39,stroke:#111110,stroke-width:1.5px,color:#fff
  class outline,deep,validate,landing,deploy,outreach,report,monitor phase
  class cp1,cp2,cp3 checkpoint

Every phase is idempotent and resumable. You can run any phase manually via /hypolab:lab-outline, /hypolab:lab-deep, etc. — the master /hypolab command is just an orchestrator that chains them.
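To make "idempotent and resumable" concrete, here is a minimal sketch of how an orchestrator could chain phases and skip the ones already finished. It is not hypolab's actual implementation: the state file name, the `run_phase` callback, and the `stop_after` parameter (standing in for a declined checkpoint) are assumptions for illustration. Only the phase names come from this README.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable, List, Optional

# Phase names mirror the skills listed in this README. The weekly lab-monitor
# phase is left out because it is invoked separately after the run.
PHASES = ["lab-outline", "lab-deep", "lab-validate", "lab-landing",
          "lab-deploy", "lab-outreach", "lab-report"]
STATE_FILE = ".hypolab_state.json"  # hypothetical marker file


def load_done(workdir: Path) -> List[str]:
    """Return the phases already recorded as complete, in order."""
    path = workdir / STATE_FILE
    return json.loads(path.read_text()) if path.exists() else []


def run_pipeline(workdir: Path, run_phase: Callable[[str], object],
                 stop_after: Optional[str] = None) -> List[str]:
    """Run every phase not yet recorded as done; return the phases that ran."""
    done = load_done(workdir)
    ran = []
    for phase in PHASES:
        if phase in done:
            continue  # idempotent: finished phases are skipped
        run_phase(phase)
        done.append(phase)
        # Persist progress after each phase so an interrupted run can resume.
        (workdir / STATE_FILE).write_text(json.dumps(done))
        ran.append(phase)
        if phase == stop_after:
            break  # stands in for a checkpoint where the user stops
    return ran


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        work = Path(tmp)
        first = run_pipeline(work, print, stop_after="lab-validate")
        second = run_pipeline(work, print)  # resumes at lab-landing
        print(first, second)
```

Writing the state file after every phase, rather than once at the end, is what lets a second invocation pick up where the first one stopped.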

Install

Claude Code (plugin marketplace)

/plugin marketplace add \
  https://github.com/Ata-ux/hypolab
/plugin install hypolab@ata-plugins

Claude Cowork (desktop UI)

Plugin UI at claude.com/plugins → search hypolab.

Same package. Zero translation.

Codex / Gemini CLI / Cursor / OpenCode / Antigravity (agents directory)

git clone \
  https://github.com/Ata-ux/hypolab \
  ~/.agents/skills/hypolab

These CLIs read skills from ~/.agents/skills/ or .agents/skills/ in the project root (naming convention shared by Codex, Gemini CLI, and OpenCode). hypolab's skills/ folder works directly.

Local dev (for hacking)

git clone \
  https://github.com/Ata-ux/hypolab

Then load the plugin locally in whichever CLI you use (--plugin-dir, --agents-dir, or equivalent).

Compatibility note. hypolab's skills follow the SKILL.md frontmatter convention that Claude Code pioneered and that Codex, Gemini CLI, Cursor, OpenCode, and Google's Antigravity have since adopted. The hypothesis framing, deep research, synthetic validation, landing composition, outreach scripts, and lab-monitor phases all run the same way regardless of which agent CLI hosts them. The only CLI-specific paths are the plugin installation location and the .mcp.json bundled server loader — for those, see docs/INSTALL.md.
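As an illustration of the SKILL.md convention mentioned above, the sketch below scans a skills directory and reads the frontmatter of each SKILL.md. It is a simplified stand-in, not the loader any of these CLIs actually uses: it handles only flat `key: value` lines between `---` delimiters, and the helper names `parse_frontmatter` and `discover_skills` are hypothetical.

```python
import tempfile
from pathlib import Path
from typing import Dict


def parse_frontmatter(text: str) -> Dict[str, str]:
    """Parse a leading '---' delimited block of flat 'key: value' lines.

    Deliberately minimal: nested YAML is not handled. A real loader would
    use a YAML library.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    meta: Dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return meta
        key, sep, value = line.partition(":")
        if sep and key.strip():
            meta[key.strip()] = value.strip()
    return {}  # no closing delimiter: treat as having no frontmatter


def discover_skills(root: Path) -> Dict[str, Dict[str, str]]:
    """Map each skill directory name under root to its SKILL.md metadata."""
    return {
        skill_file.parent.name: parse_frontmatter(
            skill_file.read_text(encoding="utf-8"))
        for skill_file in sorted(root.glob("*/SKILL.md"))
    }


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        skill_dir = Path(tmp) / "lab-outline"
        skill_dir.mkdir()
        (skill_dir / "SKILL.md").write_text(
            "---\nname: lab-outline\ndescription: frame hypothesis\n---\n# Body\n",
            encoding="utf-8")
        print(discover_skills(Path(tmp)))
```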

First-time setup

On the first session, SessionStart preflight auto-installs everything hypolab needs:

━━━ hypolab preflight ━━━
Ready:
  ✓ Node.js v18+
  ✓ Python 3.9+
  ✓ PyYAML                (auto-installed)
  ✓ Vercel CLI            (auto-installed)
  ✓ uv                    (auto-installed)
  ✓ Playwright chromium   (auto-installed)
All systems ready. Start with: /hypolab [your idea]
━━━━━━━━━━━━━━━━━━━━━━━━━
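The actual preflight lives in scripts/preflight.sh. As a rough illustration of the kind of check it reports, the Python sketch below compares version strings against a minimum and tests whether executables are on PATH. The function names are hypothetical, and the executable names in the demo (`node`, `vercel`, `uv`) are assumed to be how those tools appear on PATH.

```python
import re
import shutil
import sys
from typing import Dict, List, Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Extract the first dotted version number, e.g. 'v18.19.0' -> (18, 19, 0)."""
    match = re.search(r"(\d+(?:\.\d+)*)", text)
    if not match:
        return ()
    return tuple(int(part) for part in match.group(1).split("."))


def meets_minimum(found: str, minimum: Tuple[int, ...]) -> bool:
    """True when the version embedded in `found` is at least `minimum`."""
    version = parse_version(found)
    return bool(version) and version >= minimum


def preflight_report(tools: List[str]) -> Dict[str, bool]:
    """Report which executables can be found on PATH."""
    return {tool: shutil.which(tool) is not None for tool in tools}


if __name__ == "__main__":
    print("Python 3.9+:", sys.version_info[:2] >= (3, 9))
    print(preflight_report(["node", "vercel", "uv"]))
```

This sketch only detects what is present. The auto-install step shown in the output above is outside its scope.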

You configure credentials once via the plugin's userConfig — Vercel token, Telegram API id/hash, Yandex credentials. Full details in docs/INSTALL.md.

Why this isn't "just ask ChatGPT"

capability	plain ChatGPT	hypolab
Research depth	1 agent, serial	3 parallel agents
Real data access	hallucinates URLs	Playwright + Telegram MTProto + Yandex MCP
Framework application	from memory	mandatory Read of cited file
Output validation	none	YAML schema + python validator
Artifacts	chat messages	structured files on disk
Landing page	HTML in chat	real Next.js project with lead capture
Deploy	none	one command, live URL
Cross-session memory	none	LAB_REPORT + MONITOR_LOG
Anti-sycophancy	no	required 2/10 critical personas
Kill criteria enforcement	no	pre-committed, checked weekly
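The "Output validation" and "Anti-sycophancy" rows can be pictured as a structural check that runs on the persona file after it is loaded. The sketch below is an assumption about what such a check might look like, not the contents of hypolab's validate.py: the field names `name` and `stance` and the value `"critical"` are hypothetical, while the 10-persona total and the 2-critical minimum come from this README.

```python
from typing import Dict, List


def validate_personas(personas: List[Dict[str, str]],
                      total: int = 10,
                      min_critical: int = 2) -> List[str]:
    """Return human-readable problems; an empty list means the set is valid."""
    problems = []
    if len(personas) != total:
        problems.append(f"expected {total} personas, found {len(personas)}")
    for index, persona in enumerate(personas):
        for field in ("name", "stance"):  # hypothetical required fields
            if not persona.get(field):
                problems.append(f"persona {index}: missing '{field}'")
    critical = sum(1 for p in personas if p.get("stance") == "critical")
    if critical < min_critical:
        problems.append(
            f"need at least {min_critical} critical personas, found {critical}")
    return problems


if __name__ == "__main__":
    sample = [{"name": f"p{i}", "stance": "critical" if i < 2 else "neutral"}
              for i in range(10)]
    print(validate_personas(sample))  # no problems for this sample
```

Returning a list of problems instead of raising on the first one lets the agent fix everything in a single pass.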

What's inside

Full plugin structure:

hypolab/
│
├─ .claude-plugin/          →  plugin.json (manifest) + marketplace.json
├─ .mcp.json                →  6 bundled MCP servers
├─ hooks/hooks.json         →  SessionStart preflight hook
├─ scripts/preflight.sh     →  auto-installs Vercel CLI, uv, Playwright, PyYAML
│
├─ skills/                  →  11 skills
│  ├─ hypolab/              →    master orchestrator (one command)
│  ├─ lab-outline/          →    phase 1 — frame hypothesis
│  ├─ lab-deep/             →    phase 2 — parallel research
│  ├─ lab-validate/         →    phase 3 — synthetic filter
│  ├─ design-studio/        →    commit a bold aesthetic direction (no defaults)
│  ├─ lab-landing/          →    phase 4 — landing generation
│  ├─ lab-deploy/           →    phase 5 — vercel / yc deploy
│  ├─ lab-outreach/         →    phase 6 — cold outreach scripts
│  ├─ lab-report/           →    phase 7 — final report
│  ├─ lab-monitor/          →    phase 8 — weekly kill-criteria check
│  └─ setup-telegram/       →    one-time MTProto auth
│
├─ servers/                 →  forked MCP servers (MIT, preserved)
│  ├─ telegram-mtproto/     →    sparfenyuk/mcp-telegram
│  └─ yandex-tools/         →    altrr2/yandex-tools-mcp (search + wordstat + metrika + webmaster)
│
├─ templates/
│  └─ base/                 →    single blank-canvas Next.js scaffold
│                                (CSS variables, no hardcoded fonts / colors / sections —
│                                 the generator composes the entire page per committed
│                                 design direction, so two projects never look alike)
│
└─ references/
   ├─ schemas/              →  4 YAML schemas + python validate.py
   ├─ strategy-modules/     →  9 research tactics (how to search which source)
   └─ frameworks/           →  11 framework reference files with citations

Under 150 files total. No external npm/pip dependencies beyond what preflight installs automatically.

Honest limitations

No marketing fluff. The real boundaries:

Synthetic users are a filter, not validation. Peer-reviewed research (Lin 2025 AMPPS · arXiv 2512.00461) shows LLM personas fail construct-validity tests. hypolab forces 2 of 10 personas to push back and never frames output as proof. Real user interviews are still required.
Deep research depth is limited by public sources. Paid Gartner/Statista reports are not accessible unless you provide credentials. Reddit/G2/Habr work well; closed databases do not.
Cold outreach is generated, not sent. Automating sends would be spam. You send manually after reviewing.
Enterprise admins can disable local MCP. If your Claude deployment is locked down, Telegram and Yandex MCP may be unavailable. hypolab degrades gracefully via WebFetch fallbacks.
Kill criteria monitoring is manual. lab-monitor runs when invoked, not as a daemon. Use a calendar reminder or cron.
First-time Telegram and Vercel auth are interactive. One-time friction — then it's seamless.
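Since kill-criteria monitoring runs only when invoked, it may help to see what a single weekly check amounts to. The sketch below is illustrative and not lab-monitor's actual logic: the `KillCriterion` fields, the metric names, and the three status labels are all assumptions.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class KillCriterion:
    metric: str      # hypothetical metric name, e.g. "signups"
    minimum: float   # threshold committed to before launch
    by_week: int     # week by which the threshold must be met


def evaluate(criteria: List[KillCriterion], week: int,
             observed: Dict[str, float]) -> Dict[str, str]:
    """Classify each criterion as 'pass', 'pending', or 'kill' for this week."""
    status = {}
    for criterion in criteria:
        value = observed.get(criterion.metric, 0.0)
        if value >= criterion.minimum:
            status[criterion.metric] = "pass"
        elif week < criterion.by_week:
            status[criterion.metric] = "pending"  # deadline not reached yet
        else:
            status[criterion.metric] = "kill"     # deadline hit, threshold missed
    return status


if __name__ == "__main__":
    criteria = [KillCriterion("signups", 20, 2), KillCriterion("replies", 5, 4)]
    print(evaluate(criteria, week=2, observed={"signups": 12, "replies": 1}))
```

Because the thresholds are fixed in advance, the check stays a plain comparison and leaves no room for moving the goalposts after the data comes in.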

License

hypolab plugin: PolyForm Noncommercial 1.0.0

You can: use for personal projects, hobby work, learning, research. Modify. Share forks. Use inside nonprofits / universities / government.

You cannot: sell hypolab as a product, integrate it into commercial SaaS, or use for paid consulting without a commercial license.

Commercial use inquiries → @ai_vdel

Forked code in servers/ (sparfenyuk/mcp-telegram and altrr2/yandex-tools-mcp) retains its original MIT license.

Built by Ata · product manager · author of @ai_vdel — AI for business and product

Made with Claude Code using hypolab's own frameworks.
