GhostShip

Phantom users for every pull request.

Every Vercel preview is already an A/B test. It just has zero users.

ghostship.sh | Demo Video | Problem Statement

What is GhostShip?

AI coding tools made building features 10x faster. But learning whether a change is good for users still takes 2-4 weeks of A/B testing. GhostShip closes that gap.

It sends AI-generated "phantom users" to evaluate your pages in 30 seconds — not 3 weeks. Lighthouse for UX.

Three Capabilities

1. Generate Personas on the Fly

Paste any URL. Gemini analyzes the page and generates 5 user personas specific to that page — not generic templates. A cooking site gets food personas. A B2B pricing page gets buyer personas.
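
As a rough illustration of what "5 personas specific to that page" could look like once the model replies, here is a minimal sketch that validates a JSON reply. The `Persona` fields and the `parsePersonas` helper are assumptions made for this example. The project itself lists zod schemas in its stack, and this sketch uses a hand-written check only to stay dependency-free.

```typescript
// Hypothetical persona shape; the real definitions live in lib/personas.ts
// and may use different fields.
interface Persona {
  name: string;        // e.g. "Budget-Conscious Buyer"
  description: string; // who this user is and what they care about
  goals: string[];     // what they are trying to accomplish on the page
}

const PERSONA_COUNT = 5; // the README states five personas per page

// Type guard: true only for objects with all three fields well-formed.
function isPersona(value: unknown): value is Persona {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v.name === "string" &&
    v.name.trim().length > 0 &&
    typeof v.description === "string" &&
    Array.isArray(v.goals) &&
    v.goals.every((g) => typeof g === "string")
  );
}

// Parse the model's JSON reply and reject anything that is not exactly
// five well-formed personas.
function parsePersonas(json: string): Persona[] {
  const data: unknown = JSON.parse(json);
  if (!Array.isArray(data) || data.length !== PERSONA_COUNT) {
    throw new Error(`expected an array of ${PERSONA_COUNT} personas`);
  }
  if (!data.every(isPersona)) {
    throw new Error("malformed persona in model output");
  }
  return data;
}
```

Rejecting a malformed reply early keeps a bad generation from reaching the review step.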

2. Review a Page from Each Persona's Perspective

Each persona evaluates the page using Gemini's multimodal vision: first impressions, scores, strengths, weaknesses, and suggestions — all from their unique point of view.
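
To make the review output concrete, the sketch below models one persona's review with the five parts listed above and rolls several reviews into a short summary. The field names, the 1-10 score scale, and the `summarizeReviews` helper are assumptions for illustration and are not taken from the project's code.

```typescript
// Hypothetical review shape mirroring the five parts named in the README.
interface PageReview {
  persona: string;
  firstImpression: string;
  score: number;          // assumed scale: 1 (poor) to 10 (excellent)
  strengths: string[];
  weaknesses: string[];
  suggestions: string[];
}

interface ReviewSummary {
  averageScore: number;    // mean score, rounded to one decimal place
  harshestPersona: string; // persona that gave the lowest score
  allSuggestions: string[];
}

function summarizeReviews(reviews: PageReview[]): ReviewSummary {
  if (reviews.length === 0) throw new Error("no reviews to summarize");
  const total = reviews.reduce((sum, r) => sum + r.score, 0);
  // Keep the first review seen when two share the lowest score.
  const harshest = reviews.reduce((min, r) => (r.score < min.score ? r : min));
  return {
    averageScore: Math.round((total / reviews.length) * 10) / 10,
    harshestPersona: harshest.persona,
    allSuggestions: reviews.flatMap((r) => r.suggestions),
  };
}
```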

3. Compare Pages Across Revisions via PR Reviews

When a developer opens a PR, mention @ghostship in Slack or GitHub and it compares the Vercel preview against production. Five personas vote, and you get a confidence-scored recommendation in 30 seconds.

@ghostship https://my-app-git-feature.vercel.app/pricing

👻 Ghostship Report: /pricing
Preview wins 4-1 · Confidence: 82%

🛍️ Budget-Conscious Buyer — Prefers Preview (high confidence)
   "The pricing tiers are much clearer. I can immediately see what I get at each level."

💻 Power User — Prefers Preview (high confidence)
   "The CTA stands out more. I don't have to hunt for the signup button."

💼 Executive — Prefers Production (medium confidence)
   "The new layout feels busier. I preferred the simpler presentation."

👀 First-Time Visitor — Prefers Preview (high confidence)
   "The comparison table makes it easy to decide."

♿ Accessibility User — Prefers Preview (high confidence)
   "Better contrast on the CTA button. The heading hierarchy is more logical."

Summary: Preview wins 4-1. Ship with confidence.
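
The README does not document how the vote tally and the confidence percentage are derived, so the following is only a sketch of one plausible approach: count votes per variant, then weight each vote by its stated confidence. The weights, the tie rule, and all names here are assumptions, and this formula is not expected to reproduce the 82% shown in the sample report.

```typescript
type Variant = "preview" | "production";
type Confidence = "high" | "medium" | "low";

interface Vote {
  persona: string;
  prefers: Variant;
  confidence: Confidence;
}

// Assumed weights; the project's real scoring is not documented here.
const WEIGHT: Record<Confidence, number> = { high: 1, medium: 0.6, low: 0.3 };

function aggregate(votes: Vote[]) {
  if (votes.length === 0) throw new Error("no votes to aggregate");
  const tally = { preview: 0, production: 0 };
  const weighted = { preview: 0, production: 0 };
  for (const v of votes) {
    tally[v.prefers] += 1;
    weighted[v.prefers] += WEIGHT[v.confidence];
  }
  // Assumed tie rule: keep production (what is already live) unless the
  // preview has strictly more votes.
  const winner: Variant = tally.preview > tally.production ? "preview" : "production";
  const loser: Variant = winner === "preview" ? "production" : "preview";
  const totalWeight = weighted.preview + weighted.production;
  return {
    winner,
    score: `${tally[winner]}-${tally[loser]}`,
    // Share of confidence-weighted votes held by the winner, as a percentage.
    confidencePct: Math.round((100 * weighted[winner]) / totalWeight),
  };
}
```

Weighting by confidence means a 4-1 split made of hesitant votes scores lower than a 4-1 split made of firm ones.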

Tech Stack

Layer	Technology
AI	Gemini 2.5 Flash (multimodal vision + structured output)
AI SDK	Vercel AI SDK (`generateText`, `Output.object`, zod schemas)
Bot Framework	Vercel Chat SDK (Slack + GitHub adapters)
Screenshots	Puppeteer (`puppeteer-core` + `@sparticuz/chromium`)
Framework	Next.js 16 (App Router)
Deployment	Vercel

How It Works

URL mentioned in Slack/GitHub
  │
  ├── Screenshot both URLs (Puppeteer, parallel)
  │
  ├── 5 Persona Evaluations (Gemini, parallel)
  │   ├── Budget-Conscious Buyer
  │   ├── Power User / Developer
  │   ├── Non-Technical Executive
  │   ├── First-Time Visitor
  │   └── Accessibility-Focused User
  │
  ├── Aggregate: votes, confidence, summary
  │
  └── Post report card to Slack thread / GitHub PR comment
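
The flow above can be sketched as a small orchestrator. This is a minimal illustration, not the code in lib/agent.ts: the `PipelineDeps` interface, `runPipeline`, and the report string are hypothetical, and the screenshot, evaluation, and posting steps are injected so the sketch runs without Puppeteer, Gemini, or Slack.

```typescript
type Screenshot = Uint8Array;

interface PersonaVerdict {
  persona: string;
  prefers: "preview" | "production";
}

// Hypothetical seams standing in for Puppeteer, Gemini, and the chat adapters.
interface PipelineDeps {
  screenshot(url: string): Promise<Screenshot>;
  evaluate(persona: string, preview: Screenshot, production: Screenshot): Promise<PersonaVerdict>;
  post(report: string): Promise<void>;
}

// Persona names as listed in the diagram above.
const PERSONAS = [
  "Budget-Conscious Buyer",
  "Power User / Developer",
  "Non-Technical Executive",
  "First-Time Visitor",
  "Accessibility-Focused User",
];

async function runPipeline(
  previewUrl: string,
  productionUrl: string,
  deps: PipelineDeps,
): Promise<string> {
  // Step 1: capture both pages in parallel.
  const [preview, production] = await Promise.all([
    deps.screenshot(previewUrl),
    deps.screenshot(productionUrl),
  ]);
  // Step 2: run all persona evaluations in parallel.
  const verdicts = await Promise.all(
    PERSONAS.map((persona) => deps.evaluate(persona, preview, production)),
  );
  // Step 3: aggregate the votes into a one-line summary.
  const previewVotes = verdicts.filter((v) => v.prefers === "preview").length;
  const productionVotes = verdicts.length - previewVotes;
  const report =
    previewVotes > productionVotes
      ? `Preview wins ${previewVotes}-${productionVotes}`
      : `Production wins ${productionVotes}-${previewVotes}`;
  // Step 4: post the report back to the thread or PR.
  await deps.post(report);
  return report;
}
```

Because both stages use `Promise.all`, total latency is roughly one screenshot plus one model call rather than the sum of all of them, and a single failed step rejects the whole run.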

Project Structure

src/
├── app/
│   ├── page.tsx                          # Landing page
│   ├── pricing/page.tsx                  # Demo pricing page
│   └── api/
│       └── webhooks/[platform]/route.ts  # Slack + GitHub webhook handler
└── lib/
    ├── agent.ts                          # Orchestrator: runGhostship(), reviewPage(), runGhostshipForPR()
    ├── bot.tsx                           # Chat SDK bot handlers
    ├── evaluate.ts                       # Gemini multimodal evaluation (single-page + A/B)
    ├── personas.ts                       # 5 persona definitions + types
    ├── screenshot.ts                     # Puppeteer screenshot service
    └── adapters.ts                       # Slack + GitHub adapter setup
scripts/
└── evaluate-page.ts                      # CLI: evaluate any URL with all 5 personas

Quick Start

pnpm install
cp .env.example .env.local
# Fill in: GEMINI_API_KEY, GOOGLE_GENERATIVE_AI_API_KEY,
#          SLACK_BOT_TOKEN, SLACK_SIGNING_SECRET
pnpm dev

Try the CLI

npx tsx scripts/evaluate-page.ts https://vercel.com

Environment Variables

Variable	Description
`GOOGLE_GENERATIVE_AI_API_KEY`	Gemini API key (from Google AI Studio)
`SLACK_BOT_TOKEN`	Slack bot token
`SLACK_SIGNING_SECRET`	Slack request verification
`GITHUB_WEBHOOK_SECRET`	GitHub webhook secret (for PR reviews)
`GITHUB_APP_ID`	GitHub App ID
`GITHUB_PRIVATE_KEY`	GitHub App private key
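
A startup check for these variables could look like the sketch below. The `missingEnv` helper is hypothetical, and so is the split between always-required and GitHub-only variables, which is inferred from the table's "for PR reviews" note rather than confirmed by the project.

```typescript
// Assumed to be needed for every run (Gemini plus Slack).
const REQUIRED = [
  "GOOGLE_GENERATIVE_AI_API_KEY",
  "SLACK_BOT_TOKEN",
  "SLACK_SIGNING_SECRET",
] as const;

// Assumed to be needed only when GitHub PR reviews are enabled.
const GITHUB_ONLY = [
  "GITHUB_WEBHOOK_SECRET",
  "GITHUB_APP_ID",
  "GITHUB_PRIVATE_KEY",
] as const;

// Returns the names of variables that are unset or blank.
function missingEnv(
  env: Record<string, string | undefined>,
  withGitHub = false,
): string[] {
  const names: string[] = withGitHub ? [...REQUIRED, ...GITHUB_ONLY] : [...REQUIRED];
  return names.filter((name) => !env[name]?.trim());
}
```

In a Node process this would be called as `missingEnv(process.env)`, failing fast with the list of missing names before any webhook is served.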

Dogfooding

GhostShip evaluates its own PRs. We created variant pricing pages and used GhostShip to compare them.

Why This Matters

70-90% of A/B tests show no statistically significant winner. Most wait time is wasted.
SimAB research (2026) showed LLM-based simulation achieves 67% accuracy overall, 83% on high-confidence predictions — vs 50% (coin flip) for teams shipping without testing.
GhostShip is a pre-filter, not a replacement for real A/B testing. It tells you which experiments are worth running.

Built for Zero to Agent: Vercel x DeepMind Hackathon SF (March 2026)
