A production-ready FastAPI application that provides AI text generation with response caching and comprehensive monitoring.
## Features

- 🤖 AI Text Generation - Uses Google's Flan-T5-Small model
- ⚡ Response Caching - 1000x faster for repeat queries
- 📊 Metrics Tracking - Real-time performance monitoring
- 🔍 Comprehensive Logging - Detailed request/response tracking
- ✅ Input Validation - Prevents malformed requests
- 💚 Health Checks - Production monitoring ready
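The response-caching idea can be sketched as an in-memory dictionary keyed by the prompt and generation parameters. This is a minimal illustration, not the app's actual implementation; `model_fn` stands in for the real Flan-T5 inference call:

```python
import time

# In-memory cache: (prompt, max_length) -> generated text
_cache: dict[tuple[str, int], str] = {}

def generate_with_cache(prompt: str, max_length: int, model_fn) -> dict:
    """Return a cached response when available, otherwise run the model."""
    key = (prompt, max_length)
    start = time.perf_counter()
    if key in _cache:
        text, cached = _cache[key], True
    else:
        text = model_fn(prompt, max_length)  # slow path: real inference
        _cache[key] = text
        cached = False
    return {
        "response": text,
        "cached": cached,
        "inference_time_seconds": round(time.perf_counter() - start, 4),
    }
```

The first call for a given `(prompt, max_length)` pair pays the full inference cost; repeats are a dictionary lookup.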
## Tech Stack

- FastAPI - Modern Python web framework
- Transformers (Hugging Face) - AI model integration
- Flan-T5-Small - 80M parameter text generation model
- Python 3.12
## Requirements

- Python 3.8+
- 8GB+ RAM (for model)
## Installation

1. Clone the repository:

   ```bash
   git clone <your-repo-url>
   cd ml-api-project
   ```

2. Create a virtual environment:

   ```bash
   python3 -m venv venv
   source venv/bin/activate  # On Mac/Linux
   ```

3. Install dependencies:

   ```bash
   pip install fastapi uvicorn transformers torch
   ```

4. Run the API:

   ```bash
   uvicorn main:app --reload
   ```

5. Open your browser at http://127.0.0.1:8000/docs
## API Endpoints

- `GET /` - Root endpoint; confirms the API is running.
- Health check endpoint for production monitoring.
- Metrics endpoint - returns API usage statistics:
- Total requests
- Cache hit rate
- Average inference time
- And more...
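The statistics above could be tracked with a small accumulator class along these lines (a sketch; the field names and rounding are assumptions, not necessarily what `main.py` does):

```python
class Metrics:
    """Accumulate request counts, cache hits, and inference times."""

    def __init__(self):
        self.total_requests = 0
        self.cache_hits = 0
        self.inference_times = []  # seconds, uncached requests only

    def record(self, cached: bool, inference_time: float = 0.0) -> None:
        """Call once per request, after the response is produced."""
        self.total_requests += 1
        if cached:
            self.cache_hits += 1
        else:
            self.inference_times.append(inference_time)

    def snapshot(self) -> dict:
        """Shape of the payload a metrics endpoint might return."""
        return {
            "total_requests": self.total_requests,
            "cache_hit_rate": (
                self.cache_hits / self.total_requests if self.total_requests else 0.0
            ),
            "avg_inference_time_seconds": (
                sum(self.inference_times) / len(self.inference_times)
                if self.inference_times else 0.0
            ),
        }
```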
### `POST /generate`

Main text generation endpoint.
**Request body:**

```json
{
  "prompt": "What is machine learning?",
  "max_length": 100
}
```

**Response:**

```json
{
  "prompt": "What is machine learning?",
  "response": "Machine learning is a method of data analysis...",
  "model": "flan-t5-small",
  "inference_time_seconds": 5.59,
  "cached": false
}
```

## Performance

- First request: ~5-10 seconds (runs the AI model)
- Cached requests: ~0.01 seconds (near-instant)
- Cache hit rate: typically 60%+ in production
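The `/generate` request body is validated before inference (FastAPI does this through a Pydantic request model). In spirit, the checks amount to something like this stdlib sketch; the bounds shown are illustrative assumptions, not the app's actual limits:

```python
def validate_request(body: dict) -> tuple[str, int]:
    """Reject malformed /generate requests before running the model."""
    prompt = body.get("prompt")
    if not isinstance(prompt, str) or not prompt.strip():
        raise ValueError("prompt must be a non-empty string")
    max_length = body.get("max_length", 100)  # default mirrors the example
    if not isinstance(max_length, int) or not (1 <= max_length <= 512):
        raise ValueError("max_length must be an integer between 1 and 512")
    return prompt, max_length
```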
## Example Usage

**cURL:**

```bash
curl -X POST "http://127.0.0.1:8000/generate" \
  -H "Content-Type: application/json" \
  -d '{"prompt": "What is Python?", "max_length": 100}'
```

**Python:**

```python
import requests

response = requests.post(
    "http://127.0.0.1:8000/generate",
    json={"prompt": "What is Python?", "max_length": 100},
)
print(response.json())
```

## Logging

The API uses comprehensive logging with emoji markers:
- 🔵 New request received
- 📝 Prompt details
- 💾 Cache hit/miss
- 🤖 AI model running
- ✅ Response generated
- ⚠️ Warnings/errors
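The emoji markers are plain Python logging with decorated messages. A minimal sketch of the per-request trace (the exact messages in `main.py` may differ):

```python
import logging

logging.basicConfig(level=logging.INFO, format="%(asctime)s %(message)s")
logger = logging.getLogger("ml-api")

def log_request(prompt: str, cached: bool) -> str:
    """Emit the emoji-marked trace for one request; returns the status line."""
    logger.info("🔵 New request received")
    logger.info("📝 Prompt: %r", prompt)
    status = "💾 Cache hit" if cached else "🤖 Running AI model"
    logger.info(status)
    logger.info("✅ Response generated")
    return status
```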
## Future Improvements

- Deploy to AWS ECS/Lambda
- Add Redis for persistent caching
- Implement rate limiting
- Add authentication
- Support for multiple models
- Streaming responses
- OpenAPI/Swagger customization
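Of the items above, rate limiting is easy to prototype in-process with a token bucket; a sketch (capacity and refill rate are illustrative configuration choices, and a multi-worker deployment would want a shared store such as the Redis mentioned above instead):

```python
import time

class TokenBucket:
    """Allow bursts up to `capacity` requests, refilled at `rate` tokens/sec."""

    def __init__(self, capacity: int, rate: float):
        self.capacity = capacity
        self.rate = rate
        self.tokens = float(capacity)
        self.last = time.monotonic()

    def allow(self) -> bool:
        """Return True if this request fits within the rate limit."""
        now = time.monotonic()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```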
## Project Structure

```
ml-api-project/
├── main.py            # Main application code
├── README.md          # This file
├── requirements.txt   # Python dependencies
└── venv/              # Virtual environment
```
Built as part of an AI engineering portfolio project.
## License

MIT License