Multi-tenant library catalog service backed by Open Library. FastAPI + SQLAlchemy 2.0 async + SQLite + Huey.
┌──────────────┐     ┌──────────────┐     ┌──────────────┐
│   FastAPI    │────▶│  SQLAlchemy  │────▶│  SQLite DB   │
│  (uvicorn)   │     │    async     │     │              │
└──────┬───────┘     └──────────────┘     └──────────────┘
       │
       │ HTTP
       ▼
┌──────────────┐
│ Open Library │
│     API      │
└──────────────┘

┌──────────────┐     ┌──────────────┐     ┌──────────────┐
│     Huey     │────▶│  SQLAlchemy  │────▶│  SQLite DB   │
│  (consumer)  │     │     sync     │     │   (shared)   │
└──────────────┘     └──────────────┘     └──────────────┘
- Multi-tenancy: Every table has a `tenant_id` FK. All endpoints are scoped under `/tenants/{slug}/...`. Tenant slugs are validated as lowercase alphanumeric with hyphens.
- PII-safe storage: Patron emails are HMAC-SHA256 hashed (deterministic for dedup), names are SHA-256 hashed (one-way), and emails are masked for display (`j***@example.com`). No plaintext PII is ever stored or returned.
- Versioned books: Re-ingesting the same work creates a new `BookVersion` record instead of overwriting. A diff endpoint shows what changed between versions.
- Rate limiting: Fixed-window, per-tenant. 60 req/min for API endpoints, 10 req/hour for ingestion. Returns 429 with a `Retry-After` header.
- Background ingestion: A Huey task queue handles Open Library API calls asynchronously. Jobs track status (pending → running → completed/failed).
- Scheduled re-sync: A Huey periodic task re-ingests all previously ingested queries every 6 hours (configurable via `RE_SYNC_INTERVAL_HOURS`).
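The PII scheme above can be sketched in a few lines. This is an illustration, not the service's actual code: `hash_email`, `hash_name`, and `mask_email` are hypothetical names, and the key would come from `HMAC_SECRET_KEY` in a real deployment.

```python
import hashlib
import hmac

# Hypothetical sketch of the PII scheme; the project's actual module may differ.
SECRET_KEY = b"dev-secret-key-change-in-production"  # HMAC_SECRET_KEY in production

def hash_email(email: str) -> str:
    # HMAC-SHA256: deterministic, so the same email always hashes identically
    # (enables dedup), but cannot be brute-forced without the secret key.
    # Lowercasing before hashing is an assumed normalization step.
    return hmac.new(SECRET_KEY, email.lower().encode(), hashlib.sha256).hexdigest()

def hash_name(name: str) -> str:
    # Plain SHA-256: one-way, unkeyed.
    return hashlib.sha256(name.encode()).hexdigest()

def mask_email(email: str) -> str:
    # Keep only the first character of the local part plus the domain.
    local, _, domain = email.partition("@")
    return f"{local[:1]}***@{domain}"
```

For example, `mask_email("jane@example.com")` yields `j***@example.com`, which is the only email-shaped value a response ever contains.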
# Clone and install
git clone https://github.com/jhult/olcs-python-api.git
cd olcs-python-api
uv sync
# Create the database and run migrations
uv run alembic upgrade head
# Create a tenant
uv run python scripts/create_tenant.py --name "My Library" --slug my-library

# Start the API server
uv run uvicorn olcs.main:app --reload
# Start the Huey consumer (for background ingestion)
uv run huey_consumer olcs.huey_config.huey

The API server runs on http://localhost:8000 by default. Interactive docs at http://localhost:8000/docs.
FastAPI auto-generates an OpenAPI schema. With the server running:
curl http://localhost:8000/openapi.json

Or without starting the server:
from olcs.main import app
import json
print(json.dumps(app.openapi(), indent=2))

All tenant-scoped endpoints use the prefix /tenants/{slug}/.... Replace {slug} with your tenant's slug.
# Health check (includes DB connectivity check)
curl http://localhost:8000/health
# Service info
curl http://localhost:8000/

# Create a tenant
curl -X POST http://localhost:8000/tenants/ \
-H 'Content-Type: application/json' \
-d '{"name": "Springfield Library", "slug": "springfield"}'
# List tenants
curl http://localhost:8000/tenants/
# Get a tenant
curl http://localhost:8000/tenants/springfield

# List books (paginated)
curl http://localhost:8000/tenants/springfield/books/
# Search books (min 2 characters)
curl "http://localhost:8000/tenants/springfield/books/search?q=tolkien"
# Get book detail with version history
curl http://localhost:8000/tenants/springfield/books/{book_id}
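The version diff can be pictured as a field-by-field comparison of two version snapshots. This is a simplified illustration of the idea, not the service's actual implementation (which operates on `BookVersion` records):

```python
# Simplified sketch: report every field that differs between two snapshots.
def diff_versions(v1: dict, v2: dict) -> dict:
    changed = {}
    for field in sorted(set(v1) | set(v2)):
        old, new = v1.get(field), v2.get(field)
        if old != new:
            changed[field] = {"from": old, "to": new}
    return changed

v1 = {"title": "The Hobbit", "page_count": 310}
v2 = {"title": "The Hobbit", "page_count": 366, "cover_id": 123}
# diff_versions(v1, v2) reports page_count changed and cover_id added
```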
# Get version diff
curl http://localhost:8000/tenants/springfield/books/{book_id}/versions/{v1_id}/diff/{v2_id}

# Start ingestion by author
curl -X POST http://localhost:8000/tenants/springfield/ingestion/ \
-H 'Content-Type: application/json' \
-d '{"query": "author:Tolkien"}'
# Start ingestion by subject
curl -X POST http://localhost:8000/tenants/springfield/ingestion/ \
-H 'Content-Type: application/json' \
-d '{"query": "subject:fantasy"}'
# Start ingestion by OL keys
curl -X POST http://localhost:8000/tenants/springfield/ingestion/ \
-H 'Content-Type: application/json' \
-d '{"open_library_keys": ["/works/OL18295W"]}'
# List ingestion jobs
curl http://localhost:8000/tenants/springfield/ingestion/jobs
# Get a specific job
curl http://localhost:8000/tenants/springfield/ingestion/jobs/{job_id}
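A job fetched here moves through pending → running → completed/failed. One way to picture that lifecycle as a tiny state machine (a hypothetical sketch, not the service's code):

```python
# Hypothetical sketch of the ingestion job lifecycle described above.
ALLOWED_TRANSITIONS = {
    "pending": {"running"},
    "running": {"completed", "failed"},
    "completed": set(),  # terminal
    "failed": set(),     # terminal
}

def advance(status: str, new_status: str) -> str:
    # Reject illegal jumps, e.g. pending -> completed.
    if new_status not in ALLOWED_TRANSITIONS[status]:
        raise ValueError(f"illegal transition {status} -> {new_status}")
    return new_status
```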
# List ingestion logs
curl http://localhost:8000/tenants/springfield/ingestion/logs
# Check re-sync status
curl http://localhost:8000/tenants/springfield/ingestion/resync

# Submit a reading list with an OL key
curl -X POST http://localhost:8000/tenants/springfield/submissions/ \
-H 'Content-Type: application/json' \
-d '{
"email": "reader@example.com",
"name": "Jane Doe",
"books": [{"open_library_key": "OL18295W"}]
}'
# Submit with an ISBN
curl -X POST http://localhost:8000/tenants/springfield/submissions/ \
-H 'Content-Type: application/json' \
-d '{
"email": "reader@example.com",
"name": "Jane Doe",
"books": [{"isbn": "9780618640157"}]
}'
# Submit with mixed identifiers (1-50 books)
curl -X POST http://localhost:8000/tenants/springfield/submissions/ \
-H 'Content-Type: application/json' \
-d '{
"email": "reader@example.com",
"name": "Jane Doe",
"books": [
{"open_library_key": "OL18295W"},
{"isbn": "9780618640157"}
]
}'
# List submissions (no plaintext PII in responses)
curl http://localhost:8000/tenants/springfield/submissions/

# Check rate limit usage (API and ingestion)
curl http://localhost:8000/tenants/springfield/rate-limits/

Response shows both API (60/min) and ingestion (10/hour) limits with window start/end times.
When rate-limited, responses include a Retry-After header:
HTTP/1.1 429 Too Many Requests
Retry-After: 45
{"detail": "Rate limit exceeded"}
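A client can honor the header by sleeping before retrying. A minimal stdlib-only sketch (the function names are illustrative, and it handles only the delta-seconds form of `Retry-After` shown above, not the HTTP-date form):

```python
import time
import urllib.error
import urllib.request

def retry_delay(headers) -> int:
    # Parse the advertised wait in seconds, defaulting to 1 if absent.
    return int(headers.get("Retry-After", "1"))

def get_with_retry(url: str, max_attempts: int = 3) -> bytes:
    # Hypothetical helper: retry on 429, sleeping as long as the server asks.
    for attempt in range(max_attempts):
        try:
            with urllib.request.urlopen(url) as resp:
                return resp.read()
        except urllib.error.HTTPError as err:
            if err.code != 429 or attempt == max_attempts - 1:
                raise  # non-rate-limit error, or out of attempts
            time.sleep(retry_delay(err.headers))
    raise RuntimeError("unreachable")
```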
Each tenant sees only their own data. A book created under tenant A is invisible to tenant B:
# Create two tenants
curl -X POST http://localhost:8000/tenants/ \
-H 'Content-Type: application/json' \
-d '{"name": "Library A", "slug": "lib-a"}'
curl -X POST http://localhost:8000/tenants/ \
-H 'Content-Type: application/json' \
-d '{"name": "Library B", "slug": "lib-b"}'
# Library B sees zero books even after Library A ingests
curl http://localhost:8000/tenants/lib-b/books/
# {"items":[],"total":0}
# Accessing Library A's book under Library B returns 404
curl http://localhost:8000/tenants/lib-b/books/{lib-a-book-id}
# {"detail":"Book not found"}

Environment variables (or .env file):
| Variable | Default | Description |
|---|---|---|
| `DATABASE_URL` | `sqlite+aiosqlite:///olcs.db` | Database connection string |
| `HMAC_SECRET_KEY` | `dev-secret-key-change-in-production` | Key for email HMAC hashing |
| `API_RATE_LIMIT_REQUESTS` | `60` | API requests per window |
| `API_RATE_LIMIT_WINDOW_SECONDS` | `60` | API rate limit window (seconds) |
| `INGESTION_RATE_LIMIT_REQUESTS` | `10` | Ingestion requests per window |
| `INGESTION_RATE_LIMIT_WINDOW_SECONDS` | `3600` | Ingestion rate limit window (seconds) |
| `RE_SYNC_INTERVAL_HOURS` | `6` | How often to re-sync ingested works |
| `TESTING` | `false` | Bypass rate limits (for tests) |
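One way such variables are typically loaded is a small settings object with per-field defaults. The sketch below uses only `os.environ` and mirrors a few rows of the table; it is illustrative, and the project's actual config module (e.g. pydantic-settings) may differ.

```python
import os
from dataclasses import dataclass, field

# Illustrative settings loader mirroring the defaults in the table above.
@dataclass
class Settings:
    database_url: str = field(default_factory=lambda: os.environ.get(
        "DATABASE_URL", "sqlite+aiosqlite:///olcs.db"))
    api_rate_limit_requests: int = field(default_factory=lambda: int(
        os.environ.get("API_RATE_LIMIT_REQUESTS", "60")))
    re_sync_interval_hours: int = field(default_factory=lambda: int(
        os.environ.get("RE_SYNC_INTERVAL_HOURS", "6")))
    testing: bool = field(default_factory=lambda: os.environ.get(
        "TESTING", "false").lower() == "true")
```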
# Run all tests (unit + live OL API tests)
uv run pytest tests/ -v
# Run only unit tests (no network calls)
uv run pytest tests/ -v --ignore=tests/test_open_library.py
# Lint
uv run ruff check src/ tests/
uv run ruff format src/ tests/

- SQLite to PostgreSQL: SQLite doesn't handle concurrent writes well. With multiple uvicorn workers or the Huey consumer writing simultaneously, we'd hit `database is locked` errors under load. The trade-off: SQLite is zero-config for dev/prototyping, but real concurrent traffic needs PostgreSQL's row-level locking and connection pooling.
- Sliding-window rate limiting: Fixed windows have a known burst vulnerability: a tenant can send 60 requests at second 59 of one window and 60 more at second 1 of the next, effectively getting 120 requests in 2 seconds. A sliding window or token bucket would be fairer. The trade-off: fixed windows are simple and the current table schema works, but a determined client can double their rate at window boundaries.
- Async task queue instead of Huey + SQLite: The Huey consumer runs `asyncio.run()` with its own SQLite connection, and the scheduled re-sync does the same. This means no connection pooling, no shared async session, and SQLite locks between the web server and the worker. A proper async task system (e.g., arq with Redis, or Dramatiq with PostgreSQL) would share the same database connection pool and avoid SQLite file-lock contention.
- Async ISBN resolution in submissions: When a patron submits a reading list with ISBNs, the endpoint blocks on `get_edition_by_isbn()` calls for each unknown ISBN. With 50 books (our max), that's up to 50 sequential HTTP requests at 0.33 s each, roughly 16 seconds worst case. This should use `asyncio.gather()` for parallel resolution, or move resolution to a background task entirely.
- Batch ingestion: Currently each ingestion job handles a single query. A batch endpoint would let tenants kick off multiple ingestion jobs in one request, reducing round trips.
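The sliding-window alternative mentioned above can be sketched in a few lines. This is an illustrative in-memory version, not the service's table-backed limiter:

```python
import time
from collections import deque

class SlidingWindowLimiter:
    # Illustrative sketch: allow at most `limit` requests in any trailing
    # `window` seconds, which removes the fixed-window boundary burst.
    def __init__(self, limit: int = 60, window: float = 60.0):
        self.limit = limit
        self.window = window
        self.hits = deque()  # timestamps of accepted requests

    def allow(self, now=None) -> bool:
        now = time.monotonic() if now is None else now
        # Drop hits that have aged out of the trailing window.
        while self.hits and now - self.hits[0] >= self.window:
            self.hits.popleft()
        if len(self.hits) >= self.limit:
            return False
        self.hits.append(now)
        return True
```

Unlike a fixed window, 60 requests at second 59 exhaust the budget for the *next* 60 seconds as well, so the boundary double-burst cannot happen.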
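The parallel ISBN resolution suggested above can be sketched with `asyncio.gather`; `resolve_isbn` here is a stand-in for the real `get_edition_by_isbn()` client call, with the network latency simulated by a short sleep:

```python
import asyncio

async def resolve_isbn(isbn: str) -> dict:
    # Stand-in for the Open Library client call; a short sleep simulates
    # the per-request network latency.
    await asyncio.sleep(0.05)
    return {"isbn": isbn}

async def resolve_all(isbns: list) -> list:
    # All lookups run concurrently, so 50 ISBNs take roughly one
    # round-trip time instead of 50 sequential ones.
    return await asyncio.gather(*(resolve_isbn(i) for i in isbns))

results = asyncio.run(resolve_all(["9780618640157", "9780547928227"]))
```

`gather` preserves input order, so results line up with the submitted book list.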