feat: deploy v0 to Fly.io with /version + per-IP rate limit on /generate #46
Merged
Containerises the web builder for Fly.io deployment. The image is python:3.12-slim with the project installed in editable mode so the HTML renderer can keep resolving the top-level templates/ directory via Path(__file__).parents[2]; a non-editable install would relocate trailstory/ into site-packages and break that path.

fly.toml uses a 512 MB shared-cpu-1x VM in fra, a /healthz check, force_https, auto-stop on idle, and no persistent volume. Workspaces live under /tmp, so a restart wipes them, which is stricter than the published 30-min retention promise.

Adds GET /version returning the git SHA stamped into the image at build time (GIT_SHA build arg, falls back to "unknown" locally) so deployed bugs can be tied back to a commit without log archaeology. make deploy refuses dirty trees and forwards the SHA to flyctl deploy --build-arg.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
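A minimal sketch of the two mechanisms this commit describes, for reviewers who want to see why the editable install matters. The module path trailstory/render/html.py, the helper names templates_dir and version_payload, and the git_sha response key are all hypothetical; only Path(__file__).parents[2], the GIT_SHA variable, and the "unknown" fallback come from the commit message. Treating an empty GIT_SHA as unset is also an assumption.

```python
import os
from pathlib import PurePosixPath
from typing import Mapping


def templates_dir(module_file: str) -> PurePosixPath:
    """Mirror the renderer's lookup: parents[2] of the module file, plus templates/."""
    return PurePosixPath(module_file).parents[2] / "templates"


# Hypothetical layouts; the real module path inside trailstory/ may differ.
editable = templates_dir("/app/trailstory/render/html.py")
installed = templates_dir(
    "/usr/local/lib/python3.12/site-packages/trailstory/render/html.py"
)

print(editable)   # /app/templates
print(installed)  # /usr/local/lib/python3.12/site-packages/templates


def version_payload(environ: Mapping[str, str] = os.environ) -> dict:
    """Body for GET /version: the stamped SHA, or "unknown" when GIT_SHA is absent.

    Assumption: an empty GIT_SHA (build arg not passed) is treated like a missing one.
    """
    return {"git_sha": environ.get("GIT_SHA") or "unknown"}
```

With the editable install the lookup lands next to the checked-out templates/ directory; with a regular install it points at a templates/ directory inside site-packages that does not exist.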
Adds a sliding-window per-IP cap (10/hour) on POST /generate, the only route that triggers an Anthropic narrative call. Without it, a single abusive client could burn through the published Anthropic spend cap in minutes. Fly's pay-as-you-go tier exposes no hard spending limit on most accounts, so the defense lives in the app.

Keying lives in web.ratelimit.client_ip: Fly-Client-IP → first hop of X-Forwarded-For → socket peer. request.client.host alone would be one of Fly's edge IPs and useless for rate limiting.

The limiter is bounded to 10k tracked IPs with LRU-by-insert eviction so a flood of unique sources cannot OOM the process. State is in-process; a restart wipes counters, which is fine because the window is one hour. Each replica in Fly's HA pair keeps its own counters, so a client gets at most 2x the limit across replicas.

Over-quota responses are HTTP 429 with Retry-After. The check fires before multipart body parsing (a FastAPI dependency taking only Request), so an over-quota client cannot consume upload bandwidth or trigger an LLM call.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
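The behaviour described in this commit can be sketched roughly as follows. This is an illustration under assumptions, not the code in web/ratelimit.py: the class name SlidingWindowLimiter, the check method, the injectable clock, and the plain dict of lowercase header names are all hypothetical. Only the header precedence, the 10/hour limit, the 10k key bound, and the insert-order eviction come from the commit message.

```python
import time
from collections import OrderedDict, deque
from typing import Callable, Mapping, Optional


def client_ip(headers: Mapping[str, str], peer: Optional[str]) -> str:
    """Pick the rate-limit key: Fly-Client-IP, then first X-Forwarded-For hop, then peer.

    Assumes header names are already lowercased (a plain dict stands in for the
    framework's case-insensitive headers object).
    """
    fly = headers.get("fly-client-ip", "").strip()
    if fly:
        return fly
    first_hop = headers.get("x-forwarded-for", "").split(",")[0].strip()
    if first_hop:
        return first_hop
    return peer or "unknown"


class SlidingWindowLimiter:
    """Per-key sliding window with a hard bound on the number of tracked keys."""

    def __init__(
        self,
        limit: int = 10,
        window: float = 3600.0,
        max_keys: int = 10_000,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._limit = limit
        self._window = window
        self._max_keys = max_keys
        self._clock = clock
        self._hits: OrderedDict[str, deque[float]] = OrderedDict()

    def check(self, key: str) -> Optional[float]:
        """Record a hit and return None, or return seconds to wait if over quota."""
        now = self._clock()
        hits = self._hits.get(key)
        if hits is None:
            if len(self._hits) >= self._max_keys:
                # Evict the key that was inserted first, regardless of recent use.
                self._hits.popitem(last=False)
            hits = self._hits[key] = deque()
        # Drop timestamps that have aged out of the window.
        cutoff = now - self._window
        while hits and hits[0] <= cutoff:
            hits.popleft()
        if len(hits) >= self._limit:
            # The oldest remaining hit decides when a slot frees up.
            return hits[0] + self._window - now
        hits.append(now)
        return None


limiter = SlidingWindowLimiter()
key = client_ip({"x-forwarded-for": "203.0.113.7, 10.0.0.1"}, peer="10.0.0.1")
print(key)                 # 203.0.113.7
print(limiter.check(key))  # None (first request is allowed)
```

A route dependency would call check with the derived key and, when it returns a number, respond with 429 and a Retry-After header rounded up to whole seconds (for example with math.ceil). Evicting by insert order keeps memory bounded at the cost of occasionally forgetting a long-lived key early.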
Summary

- Dockerfile, fly.toml, .dockerignore, make deploy. The install is editable on purpose so the renderer's Path(__file__).parents[2] / "templates" keeps resolving.
- GET /version returns the git SHA stamped into the image at build time, so deployed bugs can be tied back to a commit.
- Per-IP rate limit on POST /generate (10/hour). Caps abuse cost at the only route that triggers an Anthropic narrative call (~$0.10–$0.30 each); Fly's pay-as-you-go tier exposes no hard spend cap on most accounts, so the defense lives in the app. Keys on Fly-Client-IP → X-Forwarded-For → socket peer, because request.client.host alone is one of Fly's edge IPs and useless for rate limiting.

Already deployed to https://trailstory.fly.dev; /version returns the head of this branch.

Test plan

- make ci passes locally (ruff lint, ruff format, mypy strict, pytest with 80% coverage gate). 301 tests, 95.30% coverage; new web/ratelimit.py at 100%.
- Tests cover RateLimiter (window expiry, key separation, max-keys eviction, retry-after computation) and client_ip (header precedence, fallbacks).
- POST /generate returns 429 with Retry-After past the limit; distinct Fly-Client-IP headers get independent buckets.
- /version endpoint tested with and without GIT_SHA env set.
- /healthz returns 200, /version returns the commit SHA, landing and privacy pages render, and the full browser flow (upload GPX + photos → SSE stream → memory page) was verified end-to-end against the real Anthropic key.

Notes for reviewer

- .github/workflows/fly-deploy.yml was auto-generated by flyctl launch and targets main, not develop. It is harmless dead code as committed (will never trigger); wiring up GitHub auto-deploy properly is a separate follow-up that requires FLY_API_TOKEN in repo secrets and a branch tweak.
- No persistent volume: workspaces live under /tmp, so a restart wipes them, which is stricter than the published 30-min retention promise. Adding a volume would survive restarts and pin to one region, both undesirable for v0.

🤖 Generated with Claude Code