A production-grade, LiteParse-compatible OCR microservice written in Swift 6 for macOS. Uses Apple's Vision framework for on-device text recognition with no external API dependencies.
- 100% LiteParse API compatible — `POST /ocr`, `GET /health`, `GET /metrics`
- Vision framework — native macOS OCR, Apple Silicon optimized
- Actor-based concurrency — bounded worker pool with backpressure (HTTP 429)
- Correct coordinates — Vision bottom-left origin converted to spec-required top-left pixel coords
- Structured logging — OSLog with request IDs and duration tracking
- Graceful shutdown — SIGTERM/SIGINT handled; prints status to stderr before exit
- Helpful errors — port-in-use and permission errors shown clearly
- macOS 15+ (Sequoia)
- Swift 6.0+
- Apple Silicon or Intel Mac
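
The bounded worker pool with backpressure listed among the features can be pictured with a small actor that hands out queue slots. This is a minimal sketch under stated assumptions, not the project's actual `WorkerPool`: the names `AdmissionGate`, `tryAdmit`, `release`, and `handleRequest` are hypothetical, and the real server may queue and dispatch work differently.

```swift
// Hypothetical sketch: these names are illustrative, not the project's real types.
actor AdmissionGate {
    private let maxQueueSize: Int
    private var queued = 0

    init(maxQueueSize: Int) {
        self.maxQueueSize = maxQueueSize
    }

    /// Reserves a queue slot. Returns false when the queue is full,
    /// in which case the caller would answer HTTP 429.
    func tryAdmit() -> Bool {
        guard queued < maxQueueSize else { return false }
        queued += 1
        return true
    }

    /// Releases a slot once a job has finished or failed.
    func release() {
        if queued > 0 { queued -= 1 }
    }

    /// Number of slots currently reserved.
    var depth: Int { queued }
}

/// Returns 429 when no slot is free; otherwise runs the work and returns 200.
func handleRequest(gate: AdmissionGate, work: @Sendable () async -> Void) async -> Int {
    guard await gate.tryAdmit() else { return 429 }
    await work()
    await gate.release()
    return 200
}
```

Because the counter lives inside an actor, concurrent requests cannot race on it, and rejecting early keeps the queue bounded instead of letting latency grow without limit.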

```bash
# Build (release)
make release

# Run
./.build/release/vision-ocr
# Server starts on http://0.0.0.0:8000

# Test
curl -X POST http://localhost:8000/ocr \
  -F "file=@image.png" \
  -F "language=en"
```

```bash
make build          # Debug build → .build/debug/vision-ocr
make release        # Release build → .build/release/vision-ocr
make test           # Run test suite
make install        # Install to /usr/local/bin
make install-local  # Install to ~/bin
make run            # Build debug and run
```

Request: `multipart/form-data`

| Field | Type | Required | Description |
|---|---|---|---|
| `file` | binary | Yes | Image file (PNG, JPG, TIFF, WebP, BMP, GIF) |
| `language` | string | No | ISO 639-1 code, default `en` |
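
As a rough illustration of how the optional `language` field could be resolved, the sketch below applies the default from the table and rejects values that do not have the shape of an ISO 639-1 code. The function name `resolveLanguage` is hypothetical, and the check covers shape only (two lowercase ASCII letters); it does not consult the ISO 639-1 registry or the set of languages the service actually supports.

```swift
/// Returns the language code to use, or nil when the supplied value does not
/// look like an ISO 639-1 code. A missing field falls back to "en".
/// Hypothetical helper: shape check only, not a registry lookup.
func resolveLanguage(_ raw: String?) -> String? {
    guard let raw else { return "en" }  // default from the request table
    let bytes = Array(raw.utf8)
    let isTwoLowercaseLetters = bytes.count == 2
        && bytes.allSatisfy { $0 >= UInt8(ascii: "a") && $0 <= UInt8(ascii: "z") }
    return isTwoLowercaseLetters ? raw : nil
}
```

A `nil` result would correspond to the "invalid language code" case in the error table.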
Response 200 OK:

```json
{
  "results": [
    {
      "text": "Hello World",
      "bbox": [718.6, 749.0, 2396.5, 885.4],
      "confidence": 1.0
    }
  ]
}
```

`bbox` is `[x1, y1, x2, y2]` in pixels, top-left origin, with `x2 > x1` and `y2 > y1`.

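
Vision reports bounding boxes as normalized rectangles (values in 0...1) with a bottom-left origin, so producing the `bbox` above requires scaling to pixels and flipping the Y axis. The sketch below shows that arithmetic with plain `Double` values. `NormalizedRect` and `pixelBBox` are hypothetical names used for illustration; the project's `VisionCoordinateConverter` may be structured differently.

```swift
/// A rectangle in Vision's convention: normalized to 0...1, origin at bottom-left.
/// Hypothetical stand-in for the CGRect that Vision returns.
struct NormalizedRect {
    var x: Double
    var y: Double
    var width: Double
    var height: Double
}

/// Converts a normalized bottom-left rectangle to [x1, y1, x2, y2]
/// in pixels with a top-left origin.
func pixelBBox(from rect: NormalizedRect, imageWidth: Double, imageHeight: Double) -> [Double] {
    let x1 = rect.x * imageWidth
    let x2 = (rect.x + rect.width) * imageWidth
    // The top edge in Vision space (y + height) becomes the smaller y after the flip.
    let y1 = (1.0 - (rect.y + rect.height)) * imageHeight
    let y2 = (1.0 - rect.y) * imageHeight
    return [x1, y1, x2, y2]
}

// A box covering the middle half of the width, sitting in the third quarter of the height.
let box = pixelBBox(
    from: NormalizedRect(x: 0.25, y: 0.5, width: 0.5, height: 0.25),
    imageWidth: 1000,
    imageHeight: 800
)
// box == [250.0, 200.0, 750.0, 400.0]
```

Flipping with the rectangle's top edge (rather than its origin) is what keeps `y2 > y1` in the output.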
Error responses:
| Status | Trigger |
|---|---|
| 400 | Missing `file` field, invalid language code, unsupported image format |
| 429 | Worker queue full (retry with backoff) |
| 500 | OCR processing or internal failure |
| 504 | OCR timed out (> 60 s) |
Error body: {"error": "description"}
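
Since a 429 means the worker queue is full, clients are expected to retry with backoff. The following is one possible client-side sketch, assuming capped exponential backoff; the base delay, the cap, and the names `backoffDelay` and `sendWithRetry` are illustrative choices, not values or APIs defined by this service. The `send` closure stands in for whatever code performs the actual `POST /ocr` and returns the HTTP status.

```swift
import Foundation

/// Capped exponential backoff in seconds: base, 2*base, 4*base, ... up to cap.
/// The defaults are assumptions made for illustration only.
func backoffDelay(attempt: Int, base: Double = 0.5, cap: Double = 8.0) -> Double {
    min(cap, base * pow(2.0, Double(attempt)))
}

/// Calls `send` until it returns something other than 429 or attempts run out.
/// Returns the last status code observed.
func sendWithRetry(
    maxAttempts: Int,
    send: @Sendable () async throws -> Int
) async throws -> Int {
    let attempts = max(1, maxAttempts)
    var status = 0
    for attempt in 0..<attempts {
        status = try await send()
        // Only a full queue (429) is retried; every other status is returned as-is.
        guard status == 429, attempt < attempts - 1 else { return status }
        let seconds = backoffDelay(attempt: attempt)
        try await Task.sleep(nanoseconds: UInt64(seconds * 1_000_000_000))
    }
    return status
}
```

Only 429 is retried here; 400 signals a request the client must fix, and retrying 500 or 504 is a policy decision left to the caller.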

```bash
curl http://localhost:8000/health
```

```json
{
  "status": "healthy",
  "timestamp": 802300420.0,
  "poolStats": {
    "workerCount": 4,
    "activeWorkers": 0,
    "queueDepth": 0,
    "maxQueueSize": 100,
    "totalProcessed": 42
  }
}
```

```bash
curl http://localhost:8000/metrics
```

```json
{
  "timestamp": 802300420.0,
  "metrics": {
    "totalRequests": 42,
    "successRequests": 41,
    "errorRequests": 1,
    "successRate": 0.976,
    "errorRate": 0.024,
    "averageLatency": 0.24,
    "p50Latency": 0.21,
    "p95Latency": 0.44,
    "p99Latency": 0.58,
    "throughput": 42
  },
  "pool": { ... }
}
```

```bash
# CLI flags (all optional)
vision-ocr --port 8000            # default: 8000
vision-ocr --workers 4            # default: 4 (one per CPU performance core recommended)
vision-ocr --max-queue-size 100   # default: 100
```

```
Sources/
├── CLI/           Entry point, argument parsing, signal handling
├── OCRServer/     HTTP server actor, route handlers, multipart parsing
├── HTTP/          RequestValidator, ResponseHandler, ErrorHandler
├── OCR/           VisionOCREngine, OCRPipeline, LanguageMapper
├── Image/         ImageProcessor, ImageValidator, format detection
├── Coordinates/   BoundingBox, VisionCoordinateConverter (Y-flip)
├── Results/       TextOrderingEngine, ConfidenceNormalizer
├── Concurrency/   WorkerPool, AsyncQueue, ResultStore, OCRWorker
├── Metrics/       MetricsCollector (P50/P95/P99)
├── Health/        HealthStatus, MetricsResponse
└── Utilities/     ServerError, ServerConfiguration, logging
```

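
The P50/P95/P99 figures attributed to `MetricsCollector` are latency percentiles. One common way to compute them is the nearest-rank method, sketched below. This is an assumption for illustration: the source does not say which percentile definition the collector uses, and the function name `percentile` is hypothetical.

```swift
/// Nearest-rank percentile: the smallest sample such that at least p percent
/// of all samples are less than or equal to it. Returns nil for empty input
/// or for p outside 0...100.
func percentile(_ p: Double, of samples: [Double]) -> Double? {
    guard !samples.isEmpty, (0.0...100.0).contains(p) else { return nil }
    let sorted = samples.sorted()
    // Multiply before dividing to limit floating-point rounding in the rank.
    let rank = Int((p * Double(sorted.count) / 100.0).rounded(.up))
    return sorted[max(rank, 1) - 1]
}

// Ten hypothetical latency samples, 1.0 through 10.0.
let latencies: [Double] = (1...10).map(Double.init)
let p50 = percentile(50, of: latencies)  // 5.0
let p95 = percentile(95, of: latencies)  // 10.0
```

Nearest-rank always returns an observed sample, which avoids interpolation but makes high percentiles coarse when few requests have been recorded.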
| Package | Source | Purpose |
|---|---|---|
| hummingbird 2.0+ | hummingbird-project | HTTP server |
| multipart-kit 4.0+ | vapor | Multipart form-data parser |
| swift-async-algorithms 1.0+ | Apple | Async utilities |
| swift-metrics 2.4+ | Apple | Metrics interface |

See DEPLOYMENT.md for launchd setup, log management, and production tuning.
See ARCHITECTURE.md for design decisions, concurrency model, and component details.
See API_COMPATIBILITY.md for the full spec compliance matrix.

Status: Production-ready | Platform: macOS 15+ (Apple Silicon & Intel) | Updated: June 5, 2026