Production-Ready Document Intelligence API
Secure document ingestion, AI-powered analysis, caching, rate limiting, and cloud storage —
deployed on AWS EC2 behind Nginx with SSL.
| Resource | URL |
|---|---|
| Interactive API Docs (Swagger) | https://smartdocapi.duckdns.org/api-docs |
| API Base URL | https://smartdocapi.duckdns.org |
| Health Check | https://smartdocapi.duckdns.org/health |
The API runs on an AWS EC2 instance (Ubuntu) with Nginx terminating TLS via Let's Encrypt and reverse-proxying to a Docker Compose stack.
Smart Doc API is a scalable backend system that ingests documents (PDF, DOCX, TXT), extracts text, and performs AI-powered analysis using OpenAI.
Built using real-world backend engineering practices:
- Layered architecture (controllers → services → data layer)
- JWT-based authentication
- Redis response caching
- Cloud file storage (AWS S3)
- Background job processing (BullMQ + Redis)
- Real-time updates (Socket.io)
- Webhooks with HMAC-signed delivery
- Structured logging (Winston)
- Rate limiting (global, auth-specific, AI-specific)
- Integration & unit testing (Jest)
- Swagger interactive documentation
- Docker, Nginx + SSL, CI/CD with auto-deploy to AWS EC2
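
The three rate-limit tiers listed above (global, auth-specific, AI-specific) can be pictured as a fixed-window counter keyed by tier and client. The sketch below is a dependency-free illustration only: the limits, the window length, and the `createLimiter` helper are hypothetical, and the project's actual middleware may rely on a library and different numbers.

```javascript
// Fixed-window rate limiter sketch. All limits below are illustrative
// assumptions, not the project's real configuration.
const TIERS = {
  global: { windowMs: 60_000, max: 100 },
  auth: { windowMs: 60_000, max: 10 },
  ai: { windowMs: 60_000, max: 5 },
};

function createLimiter(tiers, now = () => Date.now()) {
  const counters = new Map(); // "tier:clientId" -> { windowStart, count }

  // Returns true if the request is allowed, false once the tier's limit is hit.
  return function allow(tier, clientId) {
    const { windowMs, max } = tiers[tier];
    const key = `${tier}:${clientId}`;
    const t = now();
    const entry = counters.get(key);

    // Start a fresh window when none exists or the previous one has expired.
    if (!entry || t - entry.windowStart >= windowMs) {
      counters.set(key, { windowStart: t, count: 1 });
      return true;
    }
    if (entry.count >= max) return false;
    entry.count += 1;
    return true;
  };
}

// Usage with a fake clock so the behaviour is deterministic.
let clock = 0;
const allow = createLimiter(TIERS, () => clock);
const results = [];
for (let i = 0; i < 6; i++) results.push(allow("ai", "user-1"));
clock += 60_000; // the next window begins
const afterReset = allow("ai", "user-1");
console.log(results, afterReset);
```

Injecting the clock keeps the example deterministic. With more than one API instance, the counters would need to live in shared storage such as Redis rather than in process memory.
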
                           ┌─────────────┐
                           │   Browser   │
                           │  / Client   │
                           └──────┬──────┘
                                  │ HTTPS (443)
                                  ▼
┌────────────────────────────────────────────────────────────────────┐
│                          AWS EC2 (Ubuntu)                          │
│                                                                    │
│   ┌────────────────────────────────────────────────────────────┐   │
│   │                        Nginx (host)                        │   │
│   │  - Terminates SSL (Let's Encrypt / Certbot)                │   │
│   │  - Reverse proxy → 127.0.0.1:3000                          │   │
│   └─────────────────────────────┬──────────────────────────────┘   │
│                                 │                                  │
│                                 ▼                                  │
│   ┌────────────────────────────────────────────────────────────┐   │
│   │                Docker Compose (app-network)                │   │
│   │   ┌────────────────────┐      ┌────────────────────┐       │   │
│   │   │    Node.js API     │ ───► │       Redis        │       │   │
│   │   │     (Express)      │      │  (cache + BullMQ)  │       │   │
│   │   │     Port 3000      │      └────────────────────┘       │   │
│   │   └─────────┬──────────┘                                   │   │
│   └─────────────┼──────────────────────────────────────────────┘   │
└─────────────────┼──────────────────────────────────────────────────┘
                  │
        ┌─────────┼──────────────────────┐
        │         │                      │
        ▼         ▼                      ▼
  ┌──────────┐ ┌──────────────┐  ┌──────────────┐
  │  AWS S3  │ │     Neon     │  │  OpenAI API  │
  │ (Files)  │ │ (PostgreSQL) │  │  (Analysis)  │
  └──────────┘ └──────────────┘  └──────────────┘
How a request flows:
- Browser → Nginx (HTTPS on port 443; Nginx terminates TLS with a Let's Encrypt certificate)
- Nginx → Node.js API (HTTP reverse proxy to `127.0.0.1:3000`; Docker publishes the port only to localhost)
- API → Redis (cache lookups, BullMQ job queue), running in the same Docker network
- API → AWS S3 for document uploads and presigned downloads
- API → Neon (managed PostgreSQL) for users, documents, analyses, webhooks
- API → OpenAI for AI analysis (queued via BullMQ, processed by a worker)
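
The cache lookup in the flow above follows the usual cache-aside pattern: derive a key from the document and prompt, return the cached result if one exists, otherwise run the analysis and store it. This is a rough sketch, assuming a `Map` in place of Redis and a hypothetical `runAnalysis` function in place of the queued OpenAI call. The key format and the `getAnalysis` helper are likewise assumptions.

```javascript
const { createHash } = require("node:crypto");

// Hypothetical key scheme: one entry per (document, prompt) pair.
function cacheKey(documentId, prompt) {
  const digest = createHash("sha256").update(prompt).digest("hex");
  return `analysis:${documentId}:${digest}`;
}

// Cache-aside: read the cache first, fall back to the expensive call on a miss.
async function getAnalysis(cache, documentId, prompt, runAnalysis) {
  const key = cacheKey(documentId, prompt);
  const hit = await cache.get(key); // Map returns undefined on a miss
  if (hit != null) return { result: hit, cached: true };

  const result = await runAnalysis(documentId, prompt);
  await cache.set(key, result);
  return { result, cached: false };
}

// Usage with a counting stand-in for the AI call.
let calls = 0;
const fakeAnalysis = async (id) => {
  calls += 1;
  return `summary of ${id}`;
};
const cache = new Map();
const demo = (async () => {
  const first = await getAnalysis(cache, "doc-1", "Summarize", fakeAnalysis);
  const second = await getAnalysis(cache, "doc-1", "Summarize", fakeAnalysis);
  return { first, second, calls };
})();
demo.then((out) => console.log(out));
```

With a real Redis client the stored value would usually be serialized and given a TTL; both details are left out here.
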
| Feature | Description |
|---|---|
| JWT Authentication | Secure register/login with hashed passwords |
| File Upload | PDF, DOCX, TXT — stored in AWS S3 |
| AI Analysis | Summary, key points, sentiment, custom prompts via OpenAI |
| Background Processing | BullMQ workers process jobs asynchronously |
| Real-time Updates | Socket.io notifies clients on job completion |
| Webhooks | HMAC-signed HTTP callbacks on events |
| Caching | Upstash Redis caching to reduce AI costs & latency |
| Rate Limiting | Global, auth-specific, and AI-specific limits |
| Structured Logging | Winston with JSON (production) and color (development) |
| API Docs | Interactive Swagger UI at /api-docs |
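
HMAC-signed delivery generally means the sender computes a keyed hash of the request body and the receiver recomputes it to check authenticity. Here is a minimal sketch using Node's built-in `crypto` module. The SHA-256 choice, the hex encoding, the example payload, and the `signPayload`/`verifySignature` helper names are assumptions, since the exact signature scheme and header name are not stated here.

```javascript
const { createHmac, timingSafeEqual } = require("node:crypto");

// Sender side: hex-encoded HMAC-SHA256 of the raw request body.
function signPayload(secret, body) {
  return createHmac("sha256", secret).update(body).digest("hex");
}

// Receiver side: recompute the signature and compare in constant time.
function verifySignature(secret, body, signature) {
  const expected = Buffer.from(signPayload(secret, body), "hex");
  const received = Buffer.from(signature, "hex");
  // timingSafeEqual throws on a length mismatch, so check lengths first.
  return expected.length === received.length && timingSafeEqual(expected, received);
}

const secret = "whsec_example"; // hypothetical shared secret
const body = JSON.stringify({ event: "analysis.completed", documentId: "doc-1" }); // hypothetical payload
const signature = signPayload(secret, body);

console.log(verifySignature(secret, body, signature)); // true
console.log(verifySignature(secret, body + " ", signature)); // false (body was altered)
```

Comparing with `timingSafeEqual` rather than `===` avoids leaking how much of the signature matched through response timing.
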
git clone https://github.com/theboylexis/smart-doc-api.git
cd smart-doc-api
cp .env.example .env # Edit with your API keys
docker compose up --build
The API will be running at http://localhost:3000.
Prerequisites: Node.js 18+, PostgreSQL, Redis
git clone https://github.com/theboylexis/smart-doc-api.git
cd smart-doc-api
npm install
cp .env.example .env # Edit with your credentials
# Generate Prisma client & run migrations
npx prisma generate
npx prisma migrate dev
# Start the server
npm run dev

| Variable | Description | Required |
|---|---|---|
| `NODE_ENV` | `development`, `test`, or `production` | ✅ |
| `PORT` | Server port (default: 3000) | ✅ |
| `DATABASE_URL` | PostgreSQL connection string (Neon in production) | ✅ |
| `JWT_SECRET` | Secret key for signing JWTs | ✅ |
| `OPENAI_API_KEY` | OpenAI API key | ✅ |
| `UPSTASH_REDIS_REST_URL` | Upstash Redis REST endpoint | ✅ |
| `UPSTASH_REDIS_REST_TOKEN` | Upstash Redis REST token | ✅ |
| `REDIS_URL` | Redis TCP URL (for BullMQ) | ✅ |
| `AWS_ACCESS_KEY_ID` | IAM access key for S3 | ✅ |
| `AWS_SECRET_ACCESS_KEY` | IAM secret key for S3 | ✅ |
| `AWS_REGION` | S3 bucket region (e.g. `eu-west-1`) | ✅ |
| `S3_BUCKET_NAME` | Name of the S3 bucket for document storage | ✅ |
| `CORS_ORIGIN` | Allowed CORS origin (defaults to `*`) | ❌ |
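
Because most of these variables are required, it can help to fail fast at startup when any are missing. The following is a small sketch of such a check. The `missingEnv` helper is hypothetical and not part of the repository, and the list simply mirrors the variables marked as required above.

```javascript
// Variables marked as required in the environment table.
const REQUIRED = [
  "NODE_ENV", "PORT", "DATABASE_URL", "JWT_SECRET", "OPENAI_API_KEY",
  "UPSTASH_REDIS_REST_URL", "UPSTASH_REDIS_REST_TOKEN", "REDIS_URL",
  "AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY", "AWS_REGION", "S3_BUCKET_NAME",
];

// Returns the names that are unset or contain only whitespace.
function missingEnv(env, required = REQUIRED) {
  return required.filter((name) => !env[name] || env[name].trim() === "");
}

// Example against a partial, made-up environment object.
const sample = { NODE_ENV: "development", PORT: "3000", JWT_SECRET: "  " };
const missing = missingEnv(sample);
console.log(missing);

// At startup one might run something like:
// const absent = missingEnv(process.env);
// if (absent.length > 0) throw new Error(`Missing env vars: ${absent.join(", ")}`);
```

Passing the environment in as an argument, instead of reading `process.env` inside the helper, keeps the check easy to exercise in tests.
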
| Method | Endpoint | Description | Auth |
|---|---|---|---|
| `POST` | `/api/auth/register` | Register a new user | ❌ |
| `POST` | `/api/auth/login` | Login and get JWT token | ❌ |
| `POST` | `/api/documents/upload` | Upload a document | ✅ |
| `GET` | `/api/documents` | List all user documents | ✅ |
| `GET` | `/api/documents/:id` | Get a single document | ✅ |
| `POST` | `/api/ai/analyze/:documentId` | Queue AI analysis | ✅ |
| `GET` | `/api/ai/analyses/:documentId` | Get analyses for a document | ✅ |
| `POST` | `/api/webhooks` | Register a webhook URL | ✅ |
| `GET` | `/api/webhooks` | List user webhooks | ✅ |
| `DELETE` | `/api/webhooks/:id` | Delete a webhook | ✅ |
| `GET` | `/api-docs` | Swagger UI documentation | ❌ |
| `GET` | `/health` | Health check | ❌ |
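
To show how the endpoints fit together, here is a sketch of a client that logs in and then queues an analysis. The request and response field names (`email`, `password`, `token`), the `Bearer` scheme, and the helper names are assumptions that the endpoint list does not confirm; the Swagger docs are the authority on the real shapes. `fetch` is the global available in Node.js 18+.

```javascript
const BASE_URL = "http://localhost:3000";

// Builds the arguments for fetch() without sending anything.
function buildRequest(method, path, { token, body } = {}) {
  const headers = {};
  if (body !== undefined) headers["Content-Type"] = "application/json";
  if (token) headers.Authorization = `Bearer ${token}`; // assumed auth scheme
  return {
    url: `${BASE_URL}${path}`,
    options: {
      method,
      headers,
      body: body === undefined ? undefined : JSON.stringify(body),
    },
  };
}

// Sends a prepared request and parses the JSON response.
async function send({ url, options }) {
  const res = await fetch(url, options);
  if (!res.ok) throw new Error(`${options.method} ${url} failed with ${res.status}`);
  return res.json();
}

// Login, then queue an analysis with the returned token (field name assumed).
async function loginAndAnalyze(email, password, documentId) {
  const login = await send(
    buildRequest("POST", "/api/auth/login", { body: { email, password } })
  );
  return send(
    buildRequest("POST", `/api/ai/analyze/${documentId}`, { token: login.token })
  );
}

// Inspect a request without touching the network.
const preview = buildRequest("POST", "/api/ai/analyze/doc-1", { token: "abc" });
console.log(preview.url, preview.options.headers);
```

Separating request construction from sending makes the auth header logic checkable without a running server.
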
The production stack runs on a single EC2 instance: Nginx on the host terminates SSL, and Docker Compose runs the Node.js API plus Redis on a private bridge network. Postgres is hosted on Neon and files live in AWS S3 — neither runs on the box.
# SSH into the EC2 instance
ssh -i your-key.pem ubuntu@<EC2_HOST>
# Install Docker, Compose plugin, Nginx, Certbot
sudo apt update && sudo apt install -y docker.io docker-compose-plugin nginx certbot python3-certbot-nginx
sudo usermod -aG docker ubuntu
# Log out and back in afterwards so the docker group membership takes effect
# Clone the repo into /home/ubuntu/smart-doc-api
cd /home/ubuntu
git clone https://github.com/theboylexis/smart-doc-api.git
cd smart-doc-api
# Create production .env (fill in real values)
cp .env.example .env
nano .env
# Issue the SSL certificate (Nginx config must already point smartdocapi.duckdns.org → 127.0.0.1:3000)
sudo certbot --nginx -d smartdocapi.duckdns.org
# First boot
docker compose -f docker-compose.prod.yml up -d --build
Pushing to main triggers GitHub Actions, which runs the test suite and then SSHes into EC2 to run:
cd /home/ubuntu/smart-doc-api
git pull origin main
docker-compose -f docker-compose.prod.yml down
docker-compose -f docker-compose.prod.yml up -d --build
To deploy manually, run those same four commands on the server.
| Secret | Description |
|---|---|
| `EC2_HOST` | Public IP or DNS of the EC2 instance |
| `EC2_USER` | SSH user (typically `ubuntu`) |
| `EC2_SSH_KEY` | Private key matching the instance's authorized key |
A pre-configured Postman collection is included at smart-doc-api.postman.json.
How to use:
- Import the file into Postman
- Set `base_url` to `https://smartdocapi.duckdns.org` for the live API, or `http://localhost:3000` for local testing
- Run Register → Login. The Login request has a test script that auto-saves the JWT to the `auth_token` collection variable
- All subsequent requests use the saved token automatically
# Run all tests
npm test
# Run with verbose output
npx jest --forceExit --verbose
Tests use mocked dependencies (Prisma, Redis, OpenAI, S3, BullMQ), so no real services are needed.
Test coverage: 32 tests across 4 suites (auth, documents, AI, webhooks).
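
Mocking of this kind is easiest when a service receives its collaborators instead of importing them directly. Below is a framework-free sketch of that idea using Node's `assert` module. The project's suites use Jest mocks rather than this style, and `createDocumentService`, `storage`, and `db` are hypothetical names chosen for illustration.

```javascript
const { strictEqual, deepStrictEqual } = require("node:assert");

// A service that depends only on the interfaces passed in.
function createDocumentService({ storage, db }) {
  return {
    async upload(userId, file) {
      const key = `${userId}/${file.name}`;
      await storage.put(key, file.data);
      return db.insert({ userId, key, size: file.data.length });
    },
  };
}

// Hand-written fakes stand in for S3 and the database and record their calls.
const calls = [];
const fakeStorage = {
  put: async (key) => {
    calls.push(["put", key]);
  },
};
const fakeDb = { insert: async (row) => ({ id: 1, ...row }) };

const service = createDocumentService({ storage: fakeStorage, db: fakeDb });
const uploaded = service.upload("u1", { name: "a.txt", data: Buffer.from("hello") });

uploaded.then((doc) => {
  deepStrictEqual(calls, [["put", "u1/a.txt"]]);
  strictEqual(doc.key, "u1/a.txt");
  strictEqual(doc.size, 5);
  console.log("upload test passed");
});
```

The same structure carries over to Jest, where the fakes would typically be replaced by `jest.fn()` mocks.
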
Every push to main and every pull request automatically runs the test suite via GitHub Actions. On a successful push to main, a deploy job SSHes into EC2 and rebuilds the production Docker stack. See .github/workflows/ci.yml.
smart-doc-api/
├── .github/workflows/ # CI + deploy pipeline
├── prisma/ # Database schema & migrations
├── src/
│ ├── config/ # App config, logger, Redis, BullMQ, Swagger
│ ├── controllers/ # Request handlers
│ ├── jobs/ # BullMQ queue & worker
│ ├── middleware/ # Auth, error handler, rate limiter, logger
│ ├── routes/ # Express route definitions
│ ├── services/ # Business logic layer
│ ├── app.js # Express app setup
│ └── server.js # HTTP server entry point
├── tests/
│ ├── __mocks__/ # Redis & S3 mocks
│ ├── integration/ # Integration test suites
│ ├── mocks.js # Shared test mocks
│ └── setup.js # Test environment setup
├── Dockerfile # Container build
├── docker-compose.yml # Local dev stack
├── docker-compose.prod.yml # Production stack (EC2)
└── package.json
MIT © Alex Marfo Appiah