opensigma

A free, open-source replacement for Stripe Sigma and Stripe Data Pipeline that mirrors your Stripe account into a Cloudflare D1 database. Query your own data, build dashboards, run analytics — no extra Stripe fees, no Snowflake/Redshift contract required.

Replicates the core behavior of stripe/sync-engine, but runs entirely on Cloudflare Workers + D1 + Queues instead of Postgres + a long-lived server.

opensigma vs Sigma vs Data Pipeline

	opensigma	Stripe Sigma	Stripe Data Pipeline
Price	$0 (Cloudflare free tier covers most accounts)	~CA$21/mo (250 charges) → CA$621+/mo (25k+ charges)	~CA$69/mo → CA$759+/mo, plus what you already pay for the warehouse
Where the data lives	Your D1 database (SQLite, you own it)	Stripe's servers — dashboard-only access	Snowflake / Redshift / Databricks / S3 / GCS / Azure (you provision and pay)
Query interface	Any SQLite client, any Worker, plus a natural-language skill that works with any LLM (Claude, GPT, Gemini, local)	Stripe's dashboard SQL editor + Stripe-managed AI	Whatever your warehouse supports
Join with your own app data	Yes — it's just SQLite, ship a Worker	No — Stripe data only	Yes, once it lands in your warehouse
Freshness	Seconds (webhook-driven)	Real-time (Stripe-side)	Batched, multi-hour latency
Build apps & dashboards on top	Yes — Workers bindings, any framework, any host	Limited — schedule reports, export, embed	Yes — whatever your warehouse supports
Vendor lock-in	None — MIT, runs on your Cloudflare account	Total — Stripe dashboard only	Partial — warehouse-portable but Stripe-gated
Source code	This repo, MIT-licensed	Closed	Closed
Cost at 100k charges/mo	$0	~CA$2,700+/mo	~CA$2,300+/mo + warehouse

Sigma and Data Pipeline pricing pulled from Stripe's Sigma pricing and Data Pipeline pricing pages.

Why this exists

If you just want to JOIN your invoices against your customers, rank top products by revenue, or build a dashboard, you shouldn't have to pay Stripe a second time for data you already gave them. With this project you get:

Every billing-core object as a queryable SQLite row in D1.
Webhook-driven updates, so the database stays fresh in seconds.
A historical backload via the Stripe REST API on a cron.
Natural-language queries via the bundled stripe-schema-query skill — pick any LLM.
Free tier–friendly: D1, Queues, and Workers cover most accounts at $0/mo.

Architecture

                                    ┌──────────────────────────┐
                                    │  Cloudflare Worker       │
   Stripe webhook  ─────────────▶   │  POST /webhooks/stripe   │──┐
                                    └──────────────────────────┘  │
                                                                  │ upsert row
                                    ┌──────────────────────────┐  │ (last_event_at guard)
   Cron (hourly)   ─────────────▶   │  scheduled() handler     │  │
                                    │  enqueues backload jobs  │  ▼
                                    └──────────┬───────────────┘  ┌──────────────┐
                                               │                  │   D1         │
                                               ▼                  │  stripe_sync │
                                    ┌──────────────────────────┐  │              │
                                    │  Queue consumer          │──┤  27 tables   │
                                    │  pages Stripe REST API   │  └──────────────┘
                                    │  with restricted key     │
                                    └──────────────────────────┘

Webhooks keep rows fresh in real time. Out-of-order deliveries are dropped via a per-row last_event_at guard.
Cron + Queue backfills historical data and retries anything webhooks missed.
D1 is the only datastore. Query it from any Worker, or via wrangler d1 execute.

Stripe restricted API key

Create a read-only restricted key at dashboard.stripe.com/apikeys/create?name=opensigma.

You want Read access on every resource, not just Billing Core — this project syncs charges, payouts, balance transactions, disputes, checkout sessions, payment methods, radar, etc. The fastest way:

Click Select all permissions at the top of the permissions list.
Switch the bulk selector from None → Read.
Scroll through and confirm everything is Read (leave Write unchecked).
Create the key and copy the rk_live_… (or rk_test_…) value.

A read-only key means a compromised Worker can't move money, refund anyone, or mutate your Stripe account — it can only read.

Then set it as a Worker secret:

wrangler secret put STRIPE_API_KEY

First-time setup

pnpm install

# Copy the example config (wrangler.jsonc is gitignored)
cp wrangler.jsonc.example wrangler.jsonc

# Create the D1 database
wrangler d1 create stripe_sync
# Paste the printed database_id into wrangler.jsonc

# Create the queues
wrangler queues create stripe-sync-backload
wrangler queues create stripe-sync-dlq

# Set secrets
wrangler secret put STRIPE_API_KEY        # the read-only restricted key from above
wrangler secret put STRIPE_WEBHOOK_SECRET # from the Stripe webhook endpoint

# Apply migrations
pnpm db:migrate

Point your Stripe webhook at https://<your-worker-subdomain>/webhooks/stripe and subscribe to the event types listed in src/webhooks/dispatch.ts.

Local dev

pnpm db:migrate:local
pnpm dev

Backload

The cron handler runs hourly and enqueues a job per resource that isn't done. To force a full re-backload:

pnpm backload:reset

The next cron tick will re-fetch from the start. Webhook-updated rows are protected by last_event_at.

Tables

27 tables in total. Bookkeeping: stripe_events, backload_state, backload_parent_progress. Synced Stripe resources: customers, products, prices, subscriptions, subscription_items, subscription_schedules, invoices, invoice_line_items, charges, balance_transactions, payment_intents, payment_methods, setup_intents, refunds, disputes, payouts, credit_notes, credit_note_line_items, checkout_sessions, checkout_session_line_items, coupons, promotion_codes, tax_ids, reviews, early_fraud_warnings. Each resource mirrors sync-engine's Postgres column set, ported to SQLite.

Querying your data

Once synced, query D1 like any SQLite database:

wrangler d1 execute stripe_sync --command \
  "SELECT COUNT(*) FROM invoices WHERE status = 'paid' AND created > strftime('%s', '2026-01-01')"

Or wire it into a Worker / dashboard / BI tool of your choice. Nothing here phones home to Stripe for analytics — once the data is in D1, it's yours.

Natural-language queries

This repo ships with a Claude Code skill at .claude/skills/stripe-schema-query/ that lets you ask analytics questions in plain English — it reads the Drizzle schema, generates the SQL, runs it through wrangler, and renders a table (with an optional chart). Works with any agent that supports skills.

/stripe-schema-query top 10 products by revenue last quarter
/stripe-schema-query MRR by month for the last year
/stripe-schema-query churn rate by plan, in-store vs online

Bring whichever model you like — Claude, GPT, Gemini, a local model. The skill is just a prompt + workflow, so the AI itself is your choice.

Tests

pnpm test

Uses @cloudflare/vitest-pool-workers so D1 and Queues are real bindings in tests.

Name		Name	Last commit message	Last commit date
Latest commit History 61 Commits
.claude/skills/stripe-schema-query		.claude/skills/stripe-schema-query
.github/workflows		.github/workflows
drizzle		drizzle
src		src
test		test
.gitignore		.gitignore
.nvmrc		.nvmrc
README.md		README.md
drizzle.config.ts		drizzle.config.ts
package.json		package.json
pnpm-lock.yaml		pnpm-lock.yaml
tsconfig.json		tsconfig.json
vitest.config.ts		vitest.config.ts
wrangler.jsonc.ci		wrangler.jsonc.ci
wrangler.jsonc.example		wrangler.jsonc.example

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

opensigma

opensigma vs Sigma vs Data Pipeline

Why this exists

Architecture

Stripe restricted API key

First-time setup

Local dev

Backload

Tables

Querying your data

Natural-language queries

Tests

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

opensigma

opensigma vs Sigma vs Data Pipeline

Why this exists

Architecture

Stripe restricted API key

First-time setup

Local dev

Backload

Tables

Querying your data

Natural-language queries

Tests

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages