
feat: deploy v0 to Fly.io with /version + per-IP rate limit on /generate#46

Merged
ditvor merged 2 commits into develop from claude/cranky-perlman-532e6f
May 1, 2026

Conversation

@ditvor
Owner

@ditvor ditvor commented May 1, 2026

Summary

  • Containerises the web builder for Fly.io: Dockerfile, fly.toml, .dockerignore, make deploy. Editable install on purpose so the renderer's Path(__file__).parents[2] / "templates" keeps resolving.
  • Adds GET /version returning the git SHA stamped into the image at build time so deployed bugs can be tied back to a commit.
  • Adds a sliding-window per-IP rate limit on POST /generate (10/hour). Caps abuse cost at the only route that triggers an Anthropic narrative call (~$0.10–$0.30 each); Fly's pay-as-you-go tier exposes no hard spend cap on most accounts, so the defense lives in the app. Keys on Fly-Client-IP → first hop of X-Forwarded-For → socket peer, because request.client.host alone is one of Fly's edge IPs and useless for rate limiting.
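
For reviewers who want the key precedence from the last bullet spelled out, here is a minimal sketch. It is not the code in web/ratelimit.py: the signature (a plain header mapping plus the socket peer) and the "unknown" fallback are assumptions chosen to keep the example free of framework imports.

```python
from typing import Mapping, Optional


def client_ip(headers: Mapping[str, str], peer: Optional[str]) -> str:
    """Pick the rate-limit key: Fly-Client-IP, then X-Forwarded-For, then the socket peer."""
    # HTTP header names are case-insensitive, so normalise before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}

    fly_ip = lowered.get("fly-client-ip", "").strip()
    if fly_ip:
        return fly_ip

    # X-Forwarded-For is a comma-separated chain; the first hop is the original client.
    first_hop = lowered.get("x-forwarded-for", "").split(",")[0].strip()
    if first_hop:
        return first_hop

    # Last resort: the TCP peer, which behind Fly is an edge proxy address.
    # The "unknown" fallback is an assumption made for this sketch.
    return peer or "unknown"


print(client_ip({"Fly-Client-IP": "203.0.113.7"}, "10.0.0.1"))
print(client_ip({"X-Forwarded-For": "198.51.100.1, 10.0.0.2"}, "10.0.0.1"))
```

Both headers are client-controllable unless the platform edge overwrites them, which is why the platform-specific header is consulted first.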

Already deployed: https://trailstory.fly.dev/version returns the head of this branch.

Test plan

  • make ci passes locally (ruff lint, ruff format, mypy strict, pytest with 80% coverage gate). 301 tests, 95.30% coverage; new web/ratelimit.py at 100%.
  • Unit tests for RateLimiter (window expiry, key separation, max-keys eviction, retry-after computation) and client_ip (header precedence, fallbacks).
  • Integration tests: POST /generate returns 429 with Retry-After past the limit; distinct Fly-Client-IP headers get independent buckets.
  • New /version endpoint tested with and without GIT_SHA env set.
  • Smoke-tested in production: /healthz 200, /version returns the commit SHA, landing + privacy pages render, full browser flow (upload GPX + photos → SSE stream → memory page) verified end-to-end against the real Anthropic key.

Notes for reviewer

  • The .github/workflows/fly-deploy.yml was auto-generated by flyctl launch and targets main, not develop. It is harmless dead code as committed (will never trigger) — wiring up GitHub auto-deploy properly is a separate follow-up that requires FLY_API_TOKEN in repo secrets and a branch tweak.
  • No persistent volume on Fly. Workspaces live in /tmp, so a restart wipes them, which is stricter than the published 30-min retention promise. Adding a volume would keep workspaces across restarts and pin the app to one region, both undesirable for v0.
  • Anthropic spend cap should be set in parallel via the Anthropic console (the rate limit bounds per-IP burn, but a botnet sending one request each from many IPs is bounded only by the Anthropic-side cap, not by this limit).

🤖 Generated with Claude Code

ditvor and others added 2 commits April 30, 2026 17:19
Containerises the web builder for Fly.io deployment. The image is
python:3.12-slim with the project installed in editable mode so the
HTML renderer can keep resolving the top-level templates/ directory
via Path(__file__).parents[2] — a non-editable install would relocate
trailstory/ into site-packages and break that path. fly.toml uses a
512 MB shared-cpu-1x VM in fra, /healthz check, force_https,
auto-stop on idle, and no persistent volume; workspaces live under
/tmp and restart-wipes are stronger than the published 30-min
retention promise.
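
The path argument above can be checked with a small sketch. The module location (trailstory/render/html.py) and the /app checkout directory are assumptions made for illustration; only the Path(__file__).parents[2] / "templates" expression comes from this PR.

```python
from pathlib import PurePosixPath


def templates_dir(module_file: str) -> PurePosixPath:
    """Mirror the renderer's lookup: two directories above the module's folder, plus templates/."""
    return PurePosixPath(module_file).parents[2] / "templates"


# Editable install: the module stays inside the source checkout (hypothetical layout).
editable = templates_dir("/app/trailstory/render/html.py")

# Regular install: the package is copied into site-packages, and templates/ is not beside it.
installed = templates_dir(
    "/usr/local/lib/python3.12/site-packages/trailstory/render/html.py"
)

print(editable)   # /app/templates
print(installed)  # /usr/local/lib/python3.12/site-packages/templates
```

With the regular install the computed directory points into site-packages, where no templates/ directory is shipped, so template loading would fail.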

Adds GET /version returning the git SHA stamped into the image at
build time (GIT_SHA build arg, falls back to "unknown" locally) so
deployed bugs can be tied back to a commit without log archaeology.
make deploy refuses dirty trees and forwards the SHA to flyctl
deploy --build-arg.
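
A minimal sketch of the fallback behaviour described above, kept framework-free so it runs on its own. The function name, the "git_sha" response key, and treating an empty value like an unset one are assumptions; the PR only states that GIT_SHA is read and that "unknown" is the local fallback.

```python
import os
from typing import Mapping


def version_payload(env: Mapping[str, str] = os.environ) -> dict[str, str]:
    """Build a /version response body from the GIT_SHA variable baked into the image."""
    # An unset or empty GIT_SHA (for example a local run) falls back to "unknown".
    sha = env.get("GIT_SHA", "").strip() or "unknown"
    return {"git_sha": sha}  # key name is hypothetical


print(version_payload({"GIT_SHA": "abc1234"}))
print(version_payload({}))
```

Passing the environment as a parameter keeps the with-and-without-GIT_SHA cases easy to test without patching os.environ.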

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Adds a sliding-window per-IP cap (10/hour) on POST /generate, the
only route that triggers an Anthropic narrative call. Without it, a
single abusive client could burn through the published Anthropic
spend cap in minutes — Fly's pay-as-you-go tier exposes no hard
spending limit on most accounts, so the defense lives in the app.

Keying lives in web.ratelimit.client_ip: Fly-Client-IP → first hop of
X-Forwarded-For → socket peer. request.client.host alone would be
one of Fly's edge IPs and useless for rate limiting. The limiter is
bounded to 10k tracked IPs with LRU-by-insert eviction so a flood of
unique sources cannot OOM the process. State is in-process; a
restart wipes counters, which is fine because the window is one
hour and Fly's HA pair leaks at most 2x across replicas.
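
The limiter described above could look roughly like the sketch below. This is an illustration under stated assumptions, not the RateLimiter in web/ratelimit.py: the method name check, the injectable clock, and the choice to return the retry delay directly are all hypothetical.

```python
import time
from collections import OrderedDict, deque
from typing import Callable, Deque, Optional


class RateLimiter:
    """Sliding-window limiter keyed by an arbitrary string such as a client IP."""

    def __init__(self, limit: int = 10, window_s: float = 3600.0,
                 max_keys: int = 10_000,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.limit = limit
        self.window_s = window_s
        self.max_keys = max_keys
        self.clock = clock
        # Insertion-ordered map: the oldest-inserted key is evicted first.
        self._hits: "OrderedDict[str, Deque[float]]" = OrderedDict()

    def check(self, key: str) -> Optional[float]:
        """Record a hit and return None if allowed, else seconds until a slot frees up."""
        now = self.clock()
        hits = self._hits.get(key)
        if hits is None:
            if len(self._hits) >= self.max_keys:
                self._hits.popitem(last=False)  # drop the oldest-inserted key
            hits = self._hits[key] = deque()
        # Forget timestamps that have slid out of the window.
        while hits and now - hits[0] >= self.window_s:
            hits.popleft()
        if len(hits) >= self.limit:
            # The oldest remaining hit leaves the window at hits[0] + window_s.
            return hits[0] + self.window_s - now
        hits.append(now)
        return None
```

A caller would turn a non-None result into an HTTP 429 and round the value up for the Retry-After header. Rejected requests are not recorded, so a client that keeps hammering the endpoint does not push its own retry time further out.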

Over-quota responses are HTTP 429 with Retry-After. The check fires
before multipart body parsing (a FastAPI dependency taking only
Request), so an over-quota client cannot waste upload bandwidth or
trigger the LLM call.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@ditvor ditvor merged commit ff83e92 into develop May 1, 2026
5 checks passed
@ditvor ditvor deleted the claude/cranky-perlman-532e6f branch May 1, 2026 05:59