A simple AI Inference Backend built with FastAPI, PostgreSQL, Redis, and Docker. It registers models, accepts synchronous and asynchronous inference requests, enforces RBAC via API keys, handles async tasks with Celery and Redis, exposes metrics, and applies rate limiting.
Follow these steps to get the project running locally using Docker.
- **Docker:** Install Docker
- **Docker Compose:** Install Docker Compose
- **Clone the Repository:**

  ```bash
  git clone <your-repo-url>
  cd <your-repo-name>
  ```
- **Create Environment File:** Copy the example environment file. The default values should work for local setup.

  ```bash
  cp .env.example .env
  ```
- **Build and Run Containers:** This command builds the images and starts the `backend`, `db`, `redis`, and `worker` services in detached mode.

  ```bash
  docker-compose up --build -d
  ```

  Wait a moment for the containers, especially the database, to initialize.
- **Apply Database Migrations:** Run Alembic migrations inside the `backend` container to create the database tables.

  ```bash
  docker-compose exec backend alembic upgrade head
  ```

- **Create API Keys:** Generate the necessary API keys using the provided script. You'll need at least one admin key to manage models.

  - To create an Admin key:

    ```bash
    docker-compose exec backend python create_key.py --admin
    ```

  - To create a regular User key:

    ```bash
    docker-compose exec backend python create_key.py
    ```

  Copy the generated key(s) from the output. You will need them for the `X-API-Key` header.
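The internals of `create_key.py` aren't shown in this README, but API-key issuance generally amounts to generating a random token and persisting only a hash of it, so a database leak doesn't leak usable keys. A minimal sketch assuming that pattern (`generate_api_key` and `verify_api_key` are illustrative names, not the project's actual functions):

```python
import hashlib
import secrets


def generate_api_key() -> tuple[str, str]:
    """Return (plaintext_key, sha256_hex). Only the hash should be stored;
    the plaintext is shown to the user once and never persisted."""
    key = secrets.token_urlsafe(32)  # ~43-char URL-safe random token
    digest = hashlib.sha256(key.encode()).hexdigest()
    return key, digest


def verify_api_key(candidate: str, stored_hash: str) -> bool:
    """Hash the candidate from the X-API-Key header and compare in
    constant time against the stored hash."""
    candidate_hash = hashlib.sha256(candidate.encode()).hexdigest()
    return secrets.compare_digest(candidate_hash, stored_hash)
```

An admin flag (as toggled by `--admin` above) would simply be an extra column stored alongside the hash and checked by the RBAC dependency.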
- **Access the Application:**
  - The API is running at `http://localhost:8000`.
  - Interactive API documentation (Swagger UI) is available at `http://localhost:8000/docs`.
The project includes unit and integration tests written using pytest.
- **Run all tests:** Execute the following command:

  ```bash
  docker-compose exec backend pytest
  ```

- **Test Location:** Six meaningful tests covering model management, inference requests (sync/async), RBAC, and rate limiting are located in the `tests/` directory.
You can interactively explore and test the API endpoints using the auto-generated Swagger documentation at `http://localhost:8000/docs`.
Replace `<your_admin_key>` and `<your_user_key>` with the actual keys you generated.
- **Create a Model (Admin Only):**

  ```bash
  curl -X 'POST' \
    'http://localhost:8000/models/' \
    -H 'accept: application/json' \
    -H 'X-API-Key: <your_admin_key>' \
    -H 'Content-Type: application/json' \
    -d '{
      "name": "My Sentiment Model",
      "type": "classification",
      "provider": "Local",
      "config": {}
    }'
  ```
- **List Models (Any Active User):**

  ```bash
  curl -X 'GET' \
    'http://localhost:8000/models/' \
    -H 'accept: application/json' \
    -H 'X-API-Key: <your_user_key>'
  ```
- **Run Synchronous Inference (Any Active User):** (Assumes a model with ID=1 exists.)

  ```bash
  curl -X 'POST' \
    'http://localhost:8000/inferences/' \
    -H 'accept: application/json' \
    -H 'X-API-Key: <your_user_key>' \
    -H 'Content-Type: application/json' \
    -d '{
      "model_id": 1,
      "mode": "sync",
      "input_data": { "prompt": "FastAPI is great!" }
    }'
  ```
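The same call can be made from Python with only the standard library. `build_inference_request` and `post_inference` below are illustrative helpers, not part of the project:

```python
import json
import urllib.request


def build_inference_request(model_id: int, mode: str, input_data: dict) -> dict:
    """Assemble the JSON body expected by POST /inferences/."""
    return {"model_id": model_id, "mode": mode, "input_data": input_data}


def post_inference(api_key: str, payload: dict,
                   base_url: str = "http://localhost:8000") -> dict:
    """POST the payload with the X-API-Key header and return the JSON reply."""
    req = urllib.request.Request(
        f"{base_url}/inferences/",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json", "X-API-Key": api_key},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


# Requires the stack from this README to be running:
# body = build_inference_request(1, "sync", {"prompt": "FastAPI is great!"})
# result = post_inference("<your_user_key>", body)
```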
- **Run Asynchronous Inference with Webhook (Any Active User):** (Assumes a model with ID=1 exists.)

  ```bash
  curl -X 'POST' \
    'http://localhost:8000/inferences/' \
    -H 'accept: application/json' \
    -H 'X-API-Key: <your_user_key>' \
    -H 'Content-Type: application/json' \
    -d '{
      "model_id": 1,
      "mode": "async",
      "input_data": { "prompt": "Tell me a story." },
      "webhook_url": "https://httpbin.org/post"
    }'
  ```

  (The result will be sent to the webhook URL upon completion.)
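On the worker side, webhook delivery is conceptually just a POST of the finished result back to the caller's URL once the Celery task completes. A minimal sketch under that assumption — the function names and payload shape are illustrative, not the project's actual code:

```python
import json
import urllib.request
from urllib.parse import urlparse


def validate_webhook_url(url: str) -> bool:
    """Accept only absolute http(s) URLs with a host, so the worker
    never tries to POST to a malformed or unsupported address."""
    parts = urlparse(url)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def deliver_result(webhook_url: str, inference_id: int, output: dict) -> None:
    """POST the finished inference result to the caller's webhook."""
    if not validate_webhook_url(webhook_url):
        raise ValueError(f"invalid webhook URL: {webhook_url!r}")
    body = json.dumps({"inference_id": inference_id, "output": output}).encode()
    req = urllib.request.Request(
        webhook_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    urllib.request.urlopen(req, timeout=10)
```

A production worker would also retry failed deliveries with backoff; `https://httpbin.org/post` in the example above is handy because it simply echoes whatever it receives.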
- **Get Metrics Summary (Any Active User):**

  ```bash
  curl -X 'GET' \
    'http://localhost:8000/metrics/summary' \
    -H 'accept: application/json' \
    -H 'X-API-Key: <your_user_key>'
  ```
Thank you,