This module demonstrates how to build production-ready API servers for AI models using FastAPI.
While Modules 1-2 focused on consuming APIs, Module 3 teaches you to produce them: serving your own AI models through professional-grade endpoints, with:
- FastAPI framework for high-performance async APIs
- Authentication & Authorization using Bearer tokens
- Rate limiting to prevent abuse
- Database persistence for user management and analytics
- API versioning for backward compatibility
- Async model loading for efficient resource usage
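The last item, async model loading, is what keeps the model in memory across requests instead of reloading it per call; FastAPI's lifespan hook handles this. A minimal sketch of the pattern, assuming the ResNet-18 checkpoint is `microsoft/resnet-18` (server.py is the authoritative implementation and may differ):

```python
# Sketch: load the model once at startup via FastAPI's lifespan hook.
from contextlib import asynccontextmanager

from fastapi import FastAPI
from transformers import pipeline

resources = {}  # shared objects created at startup, reused by every request

@asynccontextmanager
async def lifespan(app: FastAPI):
    # Startup: download/load the classifier once, not per request.
    resources["classifier"] = pipeline("image-classification", model="microsoft/resnet-18")
    yield
    # Shutdown: release resources.
    resources.clear()

app = FastAPI(lifespan=lifespan)
```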
The overall architecture:

```
Client (client.py) → API Server (server.py) → AI Model (ResNet-18)
                            ↓
                     SQLite Database
                     (users, requests)
```
Installation with uv (recommended):

```bash
uv sync
```

Installation with pip:

```bash
pip install fastapi uvicorn sqlalchemy pillow torch transformers python-dotenv
```

The SQLite database (ai_api.db) is created automatically on first run. For your convenience, I included the database file with one user entry (with API key `your-secret-api-key`).
If the database was newly created, you can add a test user manually:

```
sqlite3 ai_api.db
INSERT INTO users (api_key) VALUES ('your-secret-api-key');
```
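If you prefer doing the same from Python, the standard library's sqlite3 module works too (this assumes the `users` table already exists, i.e. the server has run at least once):

```python
# Seed a test user from Python instead of the sqlite3 CLI.
# Assumes ai_api.db and its users table already exist.
import sqlite3

conn = sqlite3.connect("ai_api.db")
conn.execute("INSERT INTO users (api_key) VALUES (?)", ("your-secret-api-key",))
conn.commit()
conn.close()
```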
Create a `.env` file for the client:

```
API_KEY=your-secret-api-key
```

Start the server:

```bash
python server.py
```

All endpoints are versioned under `/v1`:
Check model status without authentication:

```bash
curl http://localhost:8000/v1/model/info
```
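The same check from Python, if you have the requests package installed:

```python
# Unauthenticated status check, equivalent to the curl call above.
import requests

print(requests.get("http://localhost:8000/v1/model/info").json())
```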
Classify an image (requires authentication):

```bash
curl -X POST http://localhost:8000/v1/classify \
  -H "Authorization: Bearer your-secret-api-key" \
  -H "Content-Type: application/json" \
  -d '{"image": "base64_encoded_image_here"}'
```
Get usage statistics (requires authentication):

```bash
curl http://localhost:8000/v1/usage \
  -H "Authorization: Bearer your-secret-api-key"
```

The included client.py is a modified version of Module 1's image_analyzer.py, adapted to work with our server:
```bash
python client.py meal.png
```

Key concepts demonstrated in this module:

- Dependency Injection: `Depends()` for auth and database sessions (see the sketch after this list)
- Pydantic Models: Automatic request/response validation
- Async/Await: Non-blocking I/O for better performance
- Lifespan Events: Startup/shutdown resource management
- Authentication: Bearer tokens (same pattern as OpenAI/Anthropic)
- Rate Limiting: Prevent abuse with per-minute request limits
- Error Handling: Proper HTTP status codes and error messages
- Usage Tracking: Database logging for analytics and billing
- Versioning: `/v1` prefix allows future updates without breaking clients
- RESTful Routes: Logical endpoint naming and HTTP methods
- Response Models: Consistent, documented response structures
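To make the auth, versioning, and response-model patterns concrete, here is a small self-contained sketch. The helper names (`get_current_user`, `UsageResponse`) are illustrative, not the ones in server.py, and the hardcoded key check stands in for the real database lookup:

```python
# Sketch: bearer auth via Depends(), a /v1-prefixed router, and a Pydantic response model.
from fastapi import APIRouter, Depends, FastAPI, HTTPException
from fastapi.security import HTTPAuthorizationCredentials, HTTPBearer
from pydantic import BaseModel

bearer = HTTPBearer()

def get_current_user(creds: HTTPAuthorizationCredentials = Depends(bearer)) -> str:
    # Real code would look the key up in the users table instead.
    if creds.credentials != "your-secret-api-key":
        raise HTTPException(status_code=401, detail="Invalid API key")
    return creds.credentials

class UsageResponse(BaseModel):
    total_requests: int  # response fields are validated and documented automatically

router = APIRouter(prefix="/v1")  # versioning: one prefix for every route

@router.get("/usage", response_model=UsageResponse)
async def usage(api_key: str = Depends(get_current_user)) -> UsageResponse:
    return UsageResponse(total_requests=0)  # real code queries the database

app = FastAPI()
app.include_router(router)
```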
Troubleshooting common errors:

- Cause: Server not running
- Solution: Start the server with `python server.py`
- Cause: Invalid or missing API key
- Solution: Check the API key in the database and in the `.env` file
- Cause: Too many requests (more than 5 per minute)
- Solution: Wait 60 seconds or increase the limit in `check_rate_limit()` (see the sketch at the end of this section)
- Cause: First-time model download from Hugging Face
- Solution: Wait for the download to finish; the model is cached after the first run
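For reference, a per-key, per-minute limiter like the one `check_rate_limit()` enforces can be sketched as below. This is an assumption-laden illustration (single process, in-memory state), not server.py's actual code:

```python
# Sketch: sliding-window rate limiter, keyed by API key.
import time
from collections import defaultdict, deque

from fastapi import HTTPException

RATE_LIMIT = 5        # requests allowed...
WINDOW_SECONDS = 60   # ...per 60-second window

_history: dict[str, deque] = defaultdict(deque)

def check_rate_limit(api_key: str) -> None:
    now = time.monotonic()
    window = _history[api_key]
    # Drop timestamps that have aged out of the window.
    while window and now - window[0] > WINDOW_SECONDS:
        window.popleft()
    if len(window) >= RATE_LIMIT:
        raise HTTPException(status_code=429, detail="Rate limit exceeded")
    window.append(now)
```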
