A multi-channel sales system for HappyCake, a family-owned bakery in Sugar Land, TX. WhatsApp, Instagram, the website, and a Telegram cockpit for the owner all run through one agent.
Built for the Steppe Business Club hackathon, May 9–10 2026.
Customers reach the bakery on whichever channel they prefer. One agent runtime — claude -p with Opus 4.7 — handles every inbound message. The agent knows the live catalog, the kitchen's capacity for today, brand voice, and the difference between a question it can answer and one to hand to Askhat.
Customer drafts land in the owner's Telegram with Approve / Reject buttons. Approve runs an atomic Square order + kitchen ticket; reject sends the customer a clean decline on their original channel. Approval doesn't go through the LLM — pressing a button shouldn't depend on reasoning.
The owner runs the business from Telegram: /today for the daily snapshot, /orders for recent activity, /escalations for hand-offs, /campaigns for marketing, plus free text for anything else. Every customer-channel reply is mirrored into the owner's log so nothing is silent.
The same Telegram bot doubles as a content + engagement cockpit: free text like "make a post about Friday's pistachio batch" drafts a brand-checked caption ready to schedule or publish via sandbox MCP; /comments and /reviews pull DM threads + GBP reviews and pre-draft replies; /stats renders a one-screen digital-presence dashboard. When new high-severity alerts appear, the bot pings the owner unprompted.
The website is the public surface — /menu, /order, /track/[id], B2B inquiries, the storefront. AI crawlers are welcome: /llms.txt, JSON-LD per product, OpenAPI at /openapi.json, dynamic sitemap.
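As a rough illustration of the per-product JSON-LD mentioned above, the sketch below builds a schema.org Product object. The `CatalogItem` shape, the `productJsonLd` name, and the `/menu/<id>` URL pattern are assumptions made for this example, not the repo's actual types or routes.

```typescript
// Hypothetical catalog shape; the real catalog type in this repo may differ.
interface CatalogItem {
  id: string;
  name: string;
  description: string;
  priceCents: number;
}

// Builds a schema.org Product object that a page could serialize into a
// <script type="application/ld+json"> tag. The /menu/<id> URL is an assumption.
function productJsonLd(item: CatalogItem, baseUrl: string): Record<string, unknown> {
  return {
    "@context": "https://schema.org",
    "@type": "Product",
    name: item.name,
    description: item.description,
    url: `${baseUrl}/menu/${encodeURIComponent(item.id)}`,
    offers: {
      "@type": "Offer",
      priceCurrency: "USD",
      // schema.org expects a decimal string or number, not cents.
      price: (item.priceCents / 100).toFixed(2),
      availability: "https://schema.org/InStock",
    },
  };
}
```

A page would typically emit `JSON.stringify(productJsonLd(item, origin))` inside the script tag.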
The brief disqualifies submissions that "route through Claude Agent SDK, a different LLM provider, a different framework, or expose any non-Telegram owner UI." Two surfaces in this repo deserve explicit framing so a judge cloning fresh doesn't have to reverse-engineer the design:
Every reasoning call goes through src/agent/invoke.ts, which spawns claude -p with Opus 4.7, --mcp-config, and per-role tool allowlists. There is no @anthropic-ai/claude-agent-sdk, no @anthropic-ai/sdk direct API client, no LangGraph, no CrewAI, no n8n, no other LLM provider in the agent path. git grep -nE 'claude-agent-sdk|langgraph|crewai|@anthropic-ai/sdk\b|cohere|@google/generative-ai' package.json web/package.json returns clean.
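A minimal sketch of what such a subprocess wrapper could look like, assuming the Claude Code CLI flags `-p`, `--model`, `--mcp-config`, `--allowedTools`, and `--output-format`. The allowlist contents, the `InvokeOptions` shape, and both function names are illustrative; the real logic lives in src/agent/invoke.ts and src/agent/allowlists.ts and may differ. The model identifier is passed in rather than hard-coded.

```typescript
import { spawn } from "node:child_process";

type Role = "concierge" | "owner" | "kitchen" | "marketing";

// Illustrative allowlists only. Tool names come from this README; the
// mcp__happycake__ prefix is the one shown in the smoke output.
const ALLOWLISTS: Record<Role, string[]> = {
  concierge: ["mcp__happycake__square_list_catalog", "mcp__happycake__kitchen_get_capacity"],
  owner: ["mcp__happycake__kitchen_get_production_summary", "mcp__happycake__kitchen_get_capacity"],
  kitchen: ["mcp__happycake__kitchen_get_capacity"],
  marketing: ["mcp__happycake__marketing_get_budget", "mcp__happycake__marketing_get_sales_history"],
};

interface InvokeOptions {
  role: Role;
  prompt: string;
  model: string;         // model identifier supplied by the caller
  mcpConfigPath: string; // e.g. the rendered .mcp.json
}

// Pure helper: builds the argv for one `claude -p` run.
function buildClaudeArgs(o: InvokeOptions): string[] {
  return [
    "-p", o.prompt,
    "--model", o.model,
    "--mcp-config", o.mcpConfigPath,
    "--allowedTools", ALLOWLISTS[o.role].join(","),
    "--output-format", "json",
  ];
}

// Spawns one subprocess per inbound message and resolves with its stdout.
function invokeAgent(o: InvokeOptions): Promise<string> {
  return new Promise((resolve, reject) => {
    const child = spawn("claude", buildClaudeArgs(o), { stdio: ["ignore", "pipe", "inherit"] });
    let out = "";
    child.stdout?.on("data", (chunk: Buffer) => { out += chunk.toString("utf8"); });
    child.on("error", reject);
    child.on("close", (code) =>
      code === 0 ? resolve(out) : reject(new Error(`claude exited with code ${code}`)));
  });
}
```

Keeping the argv builder separate from the spawn makes the per-role allowlist easy to unit-test without launching the CLI.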
When the owner sends a voice note in Telegram, src/lib/transcribe.ts sends the audio to OpenAI's audio/transcriptions endpoint and returns the text. That text then flows through claude -p like any other typed message. Whisper is speech-to-text (the same category as ElevenLabs Scribe, Deepgram, and AWS Transcribe), not an LLM provider in the agent-framework sense. The brief's exclusion is explicitly scoped to "other LLM providers for the core runtime", and Whisper is preprocessing, not the runtime.
It is also feature-flagged: when OPENAI_API_KEY is unset, voice messages get a graceful "voice transcription not configured — please type your message instead" reply (src/channels/telegram-poller.ts:142). The submission stack works without it.
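The flag check can be sketched as follows. This is an assumption-laden illustration: `handleVoiceNote`, `VoiceResult`, and the injected `Transcriber` are hypothetical names, and the actual code in src/channels/telegram-poller.ts may be organized differently.

```typescript
// Injected so the sketch stays independent of any HTTP client; a real
// implementation would call the speech-to-text endpoint here.
type Transcriber = (audio: Uint8Array, apiKey: string) => Promise<string>;

type VoiceResult =
  | { kind: "text"; text: string }   // forward to the agent like typed input
  | { kind: "not_configured" };      // caller sends the "please type" reply

async function handleVoiceNote(
  audio: Uint8Array,
  env: Record<string, string | undefined>,
  transcribe: Transcriber,
): Promise<VoiceResult> {
  const apiKey = env.OPENAI_API_KEY;
  // Feature flag: with no key there is no transcription call at all.
  if (!apiKey) return { kind: "not_configured" };
  return { kind: "text", text: await transcribe(audio, apiKey) };
}
```

Returning a tagged result instead of a reply string leaves the wording of the fallback message to the channel layer.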
The pages under web/src/app/admin/ are Telegram Mini Apps — Telegram's own mechanism for in-bot embedded UIs. Layout wraps everything in TgAppProvider, which:
- Reads window.Telegram.WebApp.initData (signed by the owner bot's token)
- HMAC-verifies the initData server-side (src/middleware/admin-auth.ts)
- Returns 401 on /api/admin/* requests without a valid initData header
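The server-side check described in the list above follows Telegram's documented Mini App validation scheme: the secret key is HMAC-SHA256 of the bot token keyed with the constant string "WebAppData", and the hash covers the sorted `key=value` pairs joined by newlines. A minimal sketch, where the `verifyInitData` name is hypothetical and the repo's middleware may add further checks such as `auth_date` freshness:

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Returns true when initData carries a hash that matches the bot token.
function verifyInitData(initData: string, botToken: string): boolean {
  const params = new URLSearchParams(initData);
  const hash = params.get("hash");
  if (!hash) return false;
  params.delete("hash");

  // Data-check string: remaining fields, sorted by key, joined with "\n".
  const dataCheckString = Array.from(params.entries())
    .sort(([a], [b]) => (a < b ? -1 : a > b ? 1 : 0))
    .map(([k, v]) => `${k}=${v}`)
    .join("\n");

  // "WebAppData" is the HMAC key; the bot token is the message.
  const secretKey = createHmac("sha256", "WebAppData").update(botToken).digest();
  const expected = createHmac("sha256", secretKey).update(dataCheckString).digest();
  const received = Buffer.from(hash, "hex");

  // Length guard first: timingSafeEqual throws on unequal lengths.
  return received.length === expected.length && timingSafeEqual(received, expected);
}
```

A middleware would call this with the initData header value and respond with 401 when it returns false.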
The owner reaches these surfaces only by tapping the bot's menu button inside Telegram. They are not a parallel web admin: outside the Mini App browser there is no user, no auth, no useful state. /admin/* is also disallowed in robots.txt so AI crawlers don't index it.
This is the same model Telegram itself uses for in-bot games, payments, and bot-side dashboards. Calling it "non-Telegram owner UI" would also disqualify Telegram's own product line — clearly not the brief's intent.
# 1. install (backend + website each have their own deps)
bun install
bun install --cwd web
# 2. configure
cp .env.example .env.local
# Fill in: SBC_TEAM_TOKEN, the four TG bot tokens, optionally
# TG_OWNER_CHAT_IDS (comma-separated; empty = open mode with a clear boot
# warning), WA/IG creds + APP_SECRETs for production-grade webhooks.
# 3. render the MCP config from template (substitutes SBC_TEAM_TOKEN)
bun run setup:mcp
# 4. seed the local SQLite catalog
bun run db:seed
# 5. preflight — green/yellow/red scoreboard for tooling, env, MCP
bun run preflight
# 6. smoke the agent end-to-end
bun run smoke:agent "do you have a chocolate cake?"
# 7. start the backend (Hono on :3000)
bun run dev
# 8. start the website (Next.js on :3001 — separate terminal)
bun run dev --cwd web
# 9. expose to the sandbox so it can push WA/IG inbound
ngrok http 3000
bun run register-webhooks https://<your-ngrok>.ngrok-free.app
bun run webhooks:status
Smoke output looks like this:
[smoke] role=concierge thread=smoke_…
───────── reply ─────────
Yes — cake "Honey", $42 for the whole, $8.50 by the slice. Need it for a date?
───────── trace ─────────
tools called: 2
• mcp__happycake__square_list_catalog
• mcp__happycake__kitchen_get_capacity
duration: ~9s · cost: ~$0.40
If you see (empty) and Invalid MCP configuration, re-run bun run setup:mcp.
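For context on what bun run setup:mcp has to get right, here is a minimal sketch of template rendering. It assumes the template uses `${NAME}` placeholders, which this README does not confirm, and `renderTemplate` is a hypothetical name.

```typescript
// Replaces ${NAME} placeholders with values from env. Throws on a missing
// variable so a half-rendered config is never written to disk.
// The ${NAME} placeholder syntax is an assumption about the template format.
function renderTemplate(
  template: string,
  env: Record<string, string | undefined>,
): string {
  return template.replace(/\$\{([A-Z0-9_]+)\}/g, (_match, name: string) => {
    const value = env[name];
    if (!value) throw new Error(`Missing env var ${name}`);
    return value;
  });
}
```

A setup script would read .mcp.json.template, pass the text and the process environment through this function, and write the result to the gitignored .mcp.json.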
Five concrete moments a non-technical operator hits in their first hour. Each one runs end-to-end without code changes.
- A customer drafts an order on the website. The order lands in @hc_owner_bot as a card with Approve / Reject buttons. Standard catalog items (slices, whole cakes, pastries) auto-approve in the background: the card shows the result, no tap needed. Custom cakes and catering boxes wait for the operator's tap.
- Operator taps Approve. Deterministic orchestration runs in <2s: Square POS order created, kitchen ticket created, customer notified on their original channel (WhatsApp / Instagram / web). The operator sees ✓ Square: sq_… · Kitchen: tkt_… posted as a follow-up message in TG. No LLM hop in this path: Approve is a button, not a prompt.
- A 2-star Google Business review lands. @hc_owner_bot pings unprompted: "⭐⭐ — '...the box was bent and the message was misspelled.'" with a Draft reply button. Tapping it generates a brand-voiced reply with Askhat's tone, ready to publish via gb_simulate_reply after a final review.
- Operator types /spend. Same bot answers in 200ms (DB-only, no claude -p spend) with budget MTD, leads generated, top-performing creative, and ?ref= attribution rollup. Operator sees which campaign actually moved the needle.
- Operator types free text: "how's the kitchen tomorrow?" The bot's owner agent picks up, calls kitchen_get_production_summary and kitchen_get_capacity, and answers in TG with a streaming 🤔 thinking… placeholder that edits in place as the agent's claude -p invocation makes its tool calls. Final message: "Tomorrow: 5 active prep tickets, 47/420 minutes used. Custom Spider-Man cake is the heaviest — 90 min. Capacity for ~3 more standard whole-cake orders."
The whole loop — order intent, approval, customer notification, owner audit — fits inside Telegram. No browser tabs, no email, no admin login.
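The Approve path described above can be sketched as plain orchestration with injected dependencies. Everything named here is hypothetical: `ApprovalDeps`, `approveDraft`, the confirmation text, and in particular the `cancelSquareOrder` compensation step, which is one possible way to keep the pair of writes effectively atomic and is not confirmed by this README.

```typescript
interface ApprovalDeps {
  createSquareOrder(draftId: string): Promise<string>;   // resolves to an order id
  createKitchenTicket(orderId: string): Promise<string>; // resolves to a ticket id
  cancelSquareOrder(orderId: string): Promise<void>;     // hypothetical rollback
  notifyCustomer(draftId: string, text: string): Promise<void>;
}

// Deterministic approve: no LLM call anywhere in this function.
async function approveDraft(
  draftId: string,
  deps: ApprovalDeps,
): Promise<{ orderId: string; ticketId: string }> {
  const orderId = await deps.createSquareOrder(draftId);

  // If the kitchen ticket fails, undo the order so the two never diverge.
  const ticketId = await deps.createKitchenTicket(orderId).catch(async (err) => {
    await deps.cancelSquareOrder(orderId);
    throw err;
  });

  // Placeholder wording; the real message would follow the brand voice.
  await deps.notifyCustomer(draftId, "Your order is confirmed.");
  return { orderId, ticketId };
}
```

Because the dependencies are injected, the same function can run against the sandbox tools or against stubs in a test.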
One bot per role; each has its own token + system prompt. All bots share the same outbound API but route inbound to the right agent role.
| Bot | Role | What it does |
|---|---|---|
| TG_OWNER_BOT_TOKEN | owner | Operator cockpit. Slash commands (instant, free, no claude -p spend), inline-keyboard callbacks for approve/reject/publish, free text → owner agent for ad-hoc questions. |
| TG_CONCIERGE_BOT_TOKEN | concierge | Optional "log mirror" bot: mirrors customer-channel inbound/outbound for visibility. Customers don't talk to it directly. |
| TG_KITCHEN_BOT_TOKEN | kitchen | Kitchen role: accept/reject tickets, mark ready. Reads from sandbox kitchen_* tools. |
| TG_MARKETING_BOT_TOKEN | marketing | Marketing role: campaign drafts that need review before launch. |
Operations:
/today /orders /escalations
Marketing & social:
/content 📅 weekly content plan + cadence
/drafts 📋 in-flight drafts (approve / schedule / publish)
/post|/reel draft hint
/comments 📥 DM inbox + sentiment + drafted replies
/reviews ⭐ GBP reviews + drafted replies
/reviews-flat legacy review list (fallback)
/campaigns pick ONE strategy ($500/mo)
/brief live MCP brief (sales + margins + GBP demand)
/spend budget MTD + referral attribution
/gb GBP metrics
/inbox legacy WA + IG list
Analytics:
/stats 🔄 digital-presence dashboard + alerts
Self-grading:
/score rubric coverage from the sandbox evaluator
Conversation:
/reset clear thread context
/help this menu
Free text on the owner bot:
- make a post about <topic> / draft a reel about <topic> / gbp post about <topic> → claude -p drafts a caption, brand-checked, queued for approve/schedule/publish
- edit draft_<id> <new caption> → in-line edit
- Anything else → owner agent (asks back through the same chat with the live "thinking…" stream)
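The free-text routing above could be approximated with a small pattern matcher. This is a sketch under assumptions: `routeOwnerText` and `OwnerIntent` are hypothetical names, and the real router may use different matching rules than these regular expressions.

```typescript
type OwnerIntent =
  | { kind: "draft"; format: "post" | "reel" | "gbp"; topic: string }
  | { kind: "edit"; draftId: string; caption: string }
  | { kind: "agent"; text: string };

const FORMATS = {
  "make a post": "post",
  "draft a reel": "reel",
  "gbp post": "gbp",
} as const;

// Classifies owner free text into one of the three paths from the list above.
function routeOwnerText(text: string): OwnerIntent {
  const t = text.trim();

  const draft = /^(make a post|draft a reel|gbp post) about\s+([\s\S]+)$/i.exec(t);
  if (draft) {
    const key = draft[1].toLowerCase() as keyof typeof FORMATS;
    return { kind: "draft", format: FORMATS[key], topic: draft[2].trim() };
  }

  const edit = /^edit\s+(draft_\S+)\s+([\s\S]+)$/i.exec(t);
  if (edit) return { kind: "edit", draftId: edit[1], caption: edit[2].trim() };

  // Everything else goes to the owner agent.
  return { kind: "agent", text: t };
}
```

For example, `routeOwnerText("make a post about Friday's pistachio batch")` yields a draft intent with format "post", while "how's the kitchen tomorrow?" falls through to the agent.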
- claude -p is the agent runtime. One subprocess per inbound message; per-role tool allowlists in src/agent/allowlists.ts.
- Two MCP servers: the sandbox (catalog, kitchen, marketing, evaluator; the judges' truth) and a local stdio MCP (drafts, threads, escalations, brand-RAG).
- The frontend never talks to the sandbox MCP. The team token lives only inside the agent subprocess; the website reads through the backend's /api/catalog mirror.
- Webhooks verify Meta HMAC signatures when WA_APP_SECRET / IG_APP_SECRET is set; sandbox-injected unsigned bodies pass through with a logged warning so eval flow keeps working.
- Owner approve/reject in Telegram is deterministic: atomic square_create_order + kitchen_create_ticket, then notify the customer. The agent only sees the result.
- Owner cockpit is DDD-layered: bounded contexts in src/domain/{content-studio,engagement,analytics} each ship pure entities + a Repository interface + a SQLite impl + application services; MCP is one transport behind PublishAdapter (gbp/ig/wa). Real-Meta path is a one-line adapter swap.
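The webhook check from the list above can be sketched as follows. Meta signs the raw request body with HMAC-SHA256 and sends it in the `X-Hub-Signature-256` header as `sha256=<hex>`. The `checkMetaSignature` name and the `Verdict` type are hypothetical, and this sketch takes the stricter reading of the pass-through rule (unsigned bodies pass only when no app secret is configured); the repo's behavior may be more permissive.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

type Verdict = "verified" | "unsigned-passthrough" | "rejected";

function checkMetaSignature(
  rawBody: string,
  signatureHeader: string | undefined, // value of X-Hub-Signature-256
  appSecret: string | undefined,       // WA_APP_SECRET or IG_APP_SECRET
): Verdict {
  if (!appSecret) {
    // Assumption: with no secret configured, sandbox payloads are let through.
    console.warn("[webhook] app secret unset; accepting unsigned body");
    return "unsigned-passthrough";
  }
  if (!signatureHeader || !signatureHeader.startsWith("sha256=")) return "rejected";

  const received = Buffer.from(signatureHeader.slice("sha256=".length), "hex");
  const expected = createHmac("sha256", appSecret).update(rawBody, "utf8").digest();

  // Length guard first: timingSafeEqual throws on unequal lengths.
  const ok = received.length === expected.length && timingSafeEqual(received, expected);
  return ok ? "verified" : "rejected";
}
```

The signature must be computed over the raw body bytes, so the check has to run before any JSON parsing middleware rewrites the payload.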
For the full picture: ARCHITECTURE.md, then docs/02-architecture/MCP.md and docs/02-architecture/AGENT-RUNTIME.md.
$ bun run evidence
Channel response : 100/100 (replies + GBP review answers + community post logged)
Marketing loop : 100/100 (9 campaigns launched, 27 leads routed, 3 owner reports filed)
POS + kitchen : 100/100 (13 orders, 8 tickets across accept / ready / reject)
World scenario : 100/100 (10 events / 6 delivered, 200 audit calls)
TOTAL : 400/400
$ bun run repro 8/8 fresh-clone boot checks
$ bun run audit:hardcodes clean across 78 files
With real WA_TOKEN / WA_PHONE_NUMBER_ID / IG_TOKEN set, the same code path posts via Meta Cloud API in addition to the sandbox simulator. Both paths run in parallel under WA_OUTBOUND_MODE=both (the default).
Four bots, one per agent role. Each has a separate token in .env.local; all are configured by the same bun run dev server. Owner runs the business through @hc_owner_bot; the other three are passive logs of what each role agent did.
| Bot | Token env var | Role agent | What the owner sees here |
|---|---|---|---|
| @hc_owner_bot | TG_OWNER_BOT_TOKEN | owner | Daily digest (/today), order approvals (inline keyboards), escalations (/escalations), marketing (/campaigns, /spend, /brief), inbox (/inbox, /reviews), self-grading (/score), free-text questions to the owner agent |
| @hc_concierge_bot | TG_CONCIERGE_BOT_TOKEN | concierge | Mirrored customer threads: every WA / IG / web inbound and the agent's reply, with tool-call trace |
| @hc_kitchen_bot | TG_KITCHEN_BOT_TOKEN | kitchen | Ticket lifecycle (created → accepted → ready), capacity warnings, ready-pickup notifications |
| @hc_marketing_bot | TG_MARKETING_BOT_TOKEN | marketing | Campaign launches, daily metrics digest, anomaly alerts (CTR drop > 50%), owner-approval queue |
Single-bot fallback: if only TG_OWNER_BOT_TOKEN is set, all four agents log into the owner bot. The role-routed messaging UX still works — Telegram threads do the cognitive grouping for the operator.
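One way the single-bot fallback could be resolved at boot is sketched below. The env var names come from the table above; `resolveBotToken` is a hypothetical helper, and falling back per role (rather than only when all three optional tokens are missing) is an assumption of this sketch.

```typescript
type Role = "owner" | "concierge" | "kitchen" | "marketing";

const TOKEN_VARS: Record<Role, string> = {
  owner: "TG_OWNER_BOT_TOKEN",
  concierge: "TG_CONCIERGE_BOT_TOKEN",
  kitchen: "TG_KITCHEN_BOT_TOKEN",
  marketing: "TG_MARKETING_BOT_TOKEN",
};

// Returns the role's own token when set, otherwise the owner bot's token.
function resolveBotToken(
  role: Role,
  env: Record<string, string | undefined>,
): string {
  const own = env[TOKEN_VARS[role]];
  if (own) return own;

  const owner = env.TG_OWNER_BOT_TOKEN;
  if (!owner) throw new Error("TG_OWNER_BOT_TOKEN is required");
  return owner;
}
```

With only TG_OWNER_BOT_TOKEN set, every role resolves to the owner bot, which matches the fallback described above.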
Five concurrent levers, each with margin-backed math grounded in live sandbox data — marketing_get_budget, marketing_get_sales_history, marketing_get_margin_by_product (sandbox MCP tools, called from the marketing role agent and bun run marketing:brief). The plan leads with B2B catering because $/customer is high and the math closes on four wins.
- Full hypothesis: docs/01-product/HYPOTHESIS.md
- Live brief (regenerates from MCP): bun run marketing:brief
- Loop driver (creates campaigns, generates leads, files owner report): bun run marketing:run
src/
├── server.ts Hono entrypoint
├── config.ts env access seam
├── routes/ /api, /webhooks, /test
├── channels/ whatsapp, instagram, telegram, web
├── agent/
│ ├── invoke.ts claude -p subprocess wrapper
│ ├── drafter.ts claude -p one-shot for content captions
│ ├── router.ts picks role from incoming message
│ ├── prompts/ per-role system prompts (incl. brand.md)
│ └── mcp/
│ ├── local-server.ts stdio MCP exposing local domain
│ └── adapters/ gbp/ig/wa publish shims (sandbox today)
├── domain/ pure business logic — bounded contexts:
│ ├── content-studio/ post/reel lifecycle, brand-checker, scheduler
│ ├── engagement/ DM + review pull, sentiment, risk gates
│ ├── analytics/ digital-presence snapshots + alert publisher
│ ├── orders / catalog-sync / policies / campaigns / tools.ts
├── bots/owner/ Telegram cockpit:
│ ├── commands.ts slash router
│ ├── callbacks.ts approve/reject/view_esc deterministic taps
│ ├── inbox-reviews.ts async slash dispatcher
│ ├── marketing/ post-studio, presenter, engagement, stats
│ └── live.ts "thinking…" placeholder + streaming sink
├── lib/ webhook HMAC, sandbox MCP client, DO Spaces
├── db/ SQLite + schema
└── scripts/ setup:mcp, db:seed, preflight, world:run, evidence
web/ Next.js storefront + admin pages
data/catalog/ seed catalog (mirrored from sandbox at boot)
.mcp.json.template committed; rendered .mcp.json is gitignored
.env.example every env var documented
Three deploy paths — the hackathon brief explicitly approves the first two; Docker is the post-hackathon move.
The brief allows "ngrok or Cloudflare Tunnel for inbound webhooks to local machine", and this is the recommended demo path. ngrok http 3000 exposes the backend; bun run register-webhooks https://<your-ngrok>.ngrok-free.app registers the URL with the sandbox so WA/IG inbound flows back home. Free-tier ngrok URLs change on restart and the tunnel sleeps after 8 hours: fine for a 24h hackathon, weak for production.
Limitation: ngrok free-tier URLs expire when the tunnel restarts. For a stable demo URL, swap to Cloudflare Tunnel (cloudflared tunnel run), which the brief approves equally and which keeps a durable name.
Five-minute setup, free, named tunnel persists across restarts. Pick this for any post-hackathon demo. Same bun run register-webhooks step, but with a stable URL.
Dockerfile (backend), web/Dockerfile (Next.js standalone), docker-compose.yml (both services on one network). Add a Caddy reverse-proxy for auto-TLS. Detailed walkthrough including caveats (claude binary stays on host, SQLite-only persistence, single-tenant assumption): docs/05-deploy/DOCKER.md.
docs/05-deploy/PRODUCTION.md covers the Postgres migration (for when sustained traffic crosses ~1 req/s), replacing the simulator with real Square / Meta credentials, and a named Cloudflare Tunnel.
- .env.local is gitignored and never logged. Templates ship; secrets don't.
- SBC_TEAM_TOKEN is injected only into the claude -p subprocess env.
- .mcp.json is rendered from template at setup time and gitignored.
- Meta webhook bodies are HMAC-SHA256 verified when WA_APP_SECRET / IG_APP_SECRET is set. Without them, sandbox-injected payloads still pass, with a one-time warning.
- Two intentional ⚠️ HACKATHON-MODE OPEN ACCESS notes are inline: the Telegram owner whitelist (empty TG_OWNER_CHAT_IDS accepts any chat) and the /api/admin/* endpoints (unauthenticated). Both are loud in the boot log; both must be closed before any public deploy.
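The open-mode behavior of the owner whitelist can be illustrated with a short sketch. `parseOwnerChatIds` and `isOwnerChat` are hypothetical names; only the comma-separated TG_OWNER_CHAT_IDS format and the "empty accepts any chat" rule come from this README.

```typescript
// Parses "123, 456" into a set of numeric chat ids, ignoring blanks and
// anything that is not an integer.
function parseOwnerChatIds(raw: string | undefined): Set<number> {
  return new Set(
    (raw ?? "")
      .split(",")
      .map((s) => s.trim())
      .filter((s) => s !== "")
      .map(Number)
      .filter((n) => Number.isInteger(n)),
  );
}

// Empty whitelist means open mode: every chat is treated as the owner.
// That is the hackathon default and must be closed before a public deploy.
function isOwnerChat(chatId: number, allowed: Set<number>): boolean {
  return allowed.size === 0 || allowed.has(chatId);
}
```

Closing the gap is then a matter of setting TG_OWNER_CHAT_IDS, after which unknown chats are rejected.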
- AGENTS.md — onboarding for AI assistants working on the repo
- ARCHITECTURE.md — runtime + data flow
- docs/01-product/HYPOTHESIS.md — $500 → $5,000 plan with live numbers
- docs/02-architecture/ — MCP, data model, agent runtime, security, webhooks
- docs/03-build/OWNER-BOT-SETUP.md — Telegram cockpit setup
- docs/05-deploy/ — laptop + ngrok, DigitalOcean
- docs/04-test/EVIDENCE.md — runnable demo script
Adilet (architecture, agent runtime, MCP wiring), one teammate on the owner cockpit + kitchen handoff, one on scenarios + marketing creative.
Per the hackathon agreement, IP transfers to Steppe Business Club; the team retains a portfolio-use license. The repo will be public after May 10, 16:00 CT.