Production-ready document cleanup API for real-world camera-captured forms.
Frame correction → shadow removal → template-handle dewarping → adaptive histogram normalization → single final image output.
WarpLess Docs turns a noisy phone photo of a document into a clean, scanner-like output.
It is built as a modular computer-vision pipeline with an HTTP API, progress tracking, and a single final image result.
The pipeline handles:
- camera perspective and document frame correction
- shadow removal with an ONNX document model
- template feature matching and handle-based dewarping
- adaptive histogram normalization for a brighter, more readable final scan
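To make the last step more concrete, here is a minimal pure-Python sketch of global histogram equalization on an 8-bit grayscale image. It is not the project's implementation: the enhancement stage is described as adaptive, while this sketch shows only the simpler global form of the same idea, and the function name `equalize_histogram` is hypothetical.

```python
def equalize_histogram(gray):
    """Globally equalize an 8-bit grayscale image given as a list of rows.

    Illustrative only: the real enhancement stage is adaptive and is not shown here.
    """
    pixels = [value for row in gray for value in row]
    if not pixels:
        return []

    # Count how often each intensity (0..255) occurs.
    histogram = [0] * 256
    for value in pixels:
        histogram[value] += 1

    # Cumulative distribution: number of pixels at or below each intensity.
    cdf = []
    running = 0
    for count in histogram:
        running += count
        cdf.append(running)

    total = len(pixels)
    cdf_min = next(c for c in cdf if c > 0)
    if total == cdf_min:
        # A flat image has nothing to stretch; return an unchanged copy.
        return [list(row) for row in gray]

    # Map the darkest occupied intensity to 0 and the brightest to 255.
    scale = 255 / (total - cdf_min)
    lookup = [max(0, round((c - cdf_min) * scale)) for c in cdf]
    return [[lookup[value] for value in row] for row in gray]
```

An adaptive variant applies the same remapping per image tile rather than once for the whole page, which is what helps with uneven lighting on camera photos.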
| Input | Frame | Deshadow | Template Handles | Dewarp | Final |
|---|---|---|---|---|---|
| ![]() | ![]() | ![]() | ![]() | ![]() | ![]() |
input image
↓
1. Frame correction
↓
2. Shadow removal
↓
3. Template-handle dewarping
↓
4. Adaptive histogram normalization
↓
single final PNG result
The API returns one final output file. Intermediate images are only used internally.
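The flow above can be sketched as a list of stages run in order, with a progress callback fired before each one. This is an illustration under stated assumptions, not the project's code: `run_stages`, `placeholder`, and `STAGES` are hypothetical names, and the stage functions are stand-ins that only record that they ran. The percentages and labels follow the progress stages this README reports.

```python
def placeholder(name):
    """Return a stand-in stage that only records that it ran (hypothetical)."""
    def step(image):
        return image + [name]
    return step


# (percent, label, stage function); values follow the README's progress stages.
STAGES = [
    (10, "Frame correction", placeholder("framed")),
    (30, "Shadow removal", placeholder("deshadowed")),
    (55, "Template feature matching", placeholder("matched")),
    (70, "Handle-based dewarping", placeholder("dewarped")),
    (90, "Adaptive normalization", placeholder("normalized")),
]


def run_stages(image, stages=STAGES, progress=None):
    """Feed the output of each stage into the next and report progress."""
    for percent, label, step in stages:
        if progress is not None:
            progress(percent, label)
        image = step(image)
    if progress is not None:
        progress(100, "Completed")
    # Only the last stage's output is returned; intermediates are discarded.
    return image
```

Keeping each stage as an independent callable is what lets one stage be replaced without touching the others.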
WarpLess-Docs/
├── api.py
├── run_pipeline.py
├── scan_document_frame.py
├── main.py
├── dewarp_with_template_handles.py
├── enhance_document.py
├── input/
│   ├── samples/
│   └── template/
├── models/
│   └── docshadow_sd7k.onnx
├── outputs/
│   ├── api_jobs/
│   ├── framed/
│   ├── deshadowed/
│   ├── template_rectified/
│   ├── template_dewarped/
│   └── final/
└── src/
    └── warpless_docs/
        ├── document_frame/
        ├── shadow_removal/
        ├── template_rectification/
        ├── enhancement/
        └── pipeline/
git clone https://github.com/ehsanwwe/WarpLess-Docs.git
cd WarpLess-Docs
python -m venv .venv
Windows PowerShell:
.\.venv\Scripts\Activate.ps1
pip install -r requirements.txt
Linux / macOS:
source .venv/bin/activate
pip install -r requirements.txt
Recommended Python:
Python 3.11 or 3.12
The ONNX model is not committed because it is a large binary file.
mkdir -p models
curl -L "https://github.com/fabio-sim/DocShadow-ONNX-TensorRT/releases/download/v1.0.0/docshadow_sd7k.onnx" -o "models/docshadow_sd7k.onnx"
Windows PowerShell:
Invoke-WebRequest `
-Uri "https://github.com/fabio-sim/DocShadow-ONNX-TensorRT/releases/download/v1.0.0/docshadow_sd7k.onnx" `
-OutFile "models\docshadow_sd7k.onnx"
Expected path:
models/docshadow_sd7k.onnx
Put the clean flat template image here:
input/template/template_page_1.png
The template must have the same form layout as the uploaded document.
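Before starting the server, it can help to confirm that both files are where the pipeline expects them. The check below is a small sketch using only the two paths named above; `missing_assets` and `REQUIRED_FILES` are hypothetical names, not part of the project.

```python
from pathlib import Path

# Paths taken from this README, relative to the repository root.
REQUIRED_FILES = (
    "models/docshadow_sd7k.onnx",
    "input/template/template_page_1.png",
)


def missing_assets(project_root="."):
    """Return the required files that do not exist under project_root."""
    root = Path(project_root)
    return [rel for rel in REQUIRED_FILES if not (root / rel).is_file()]
```

An empty list means both the model and the template are in place; anything else lists what still needs to be downloaded or copied.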
Start the server:
uvicorn api:app --reload
Open the interactive docs:
http://127.0.0.1:8000/docs
Health check:
curl http://127.0.0.1:8000/health

Upload an image to start an asynchronous job:
curl -X POST \
-F "file=@input/samples/Untitled5.jpg" \
http://127.0.0.1:8000/api/v1/process
Response:
{
"job_id": "2f4c...",
"status": "queued",
"progress": 0,
"stage": "Queued",
"status_url": "/api/v1/jobs/2f4c...",
"result_url": "/api/v1/jobs/2f4c.../result"
}
Check job status and progress:
curl http://127.0.0.1:8000/api/v1/jobs/YOUR_JOB_ID
Example progress response:
{
"status": "running",
"progress": 70,
"stage": "Applying handle-based template dewarping"
}
Progress stages:
10% Frame correction
30% Shadow removal
55% Template feature matching
70% Handle-based dewarping
90% Adaptive normalization
100% Completed
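A client can follow these stages by polling the status endpoint until the job leaves the `queued` and `running` states. The sketch below uses only the standard library. `wait_for_job` and `fetch_json` are hypothetical helper names, and because this README documents only the `queued` and `running` status values, the code assumes any other value is terminal.

```python
import json
import time
from urllib.request import urlopen


def fetch_json(url):
    """GET a URL and decode the JSON body."""
    with urlopen(url) as response:
        return json.load(response)


def wait_for_job(status_url, fetch=fetch_json, interval=1.0,
                 max_polls=600, on_progress=None, sleep=time.sleep):
    """Poll status_url until the job is no longer queued or running.

    Assumption: statuses other than "queued" and "running" mean the job ended.
    """
    for _ in range(max_polls):
        job = fetch(status_url)
        if on_progress is not None:
            on_progress(job.get("progress"), job.get("stage"))
        if job.get("status") not in ("queued", "running"):
            return job
        sleep(interval)
    raise TimeoutError(f"job did not finish after {max_polls} polls")
```

The `status_url` returned by the upload call is a relative path, so join it with the server address first, for example with `urllib.parse.urljoin("http://127.0.0.1:8000", status_url)`.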
Download the final image:
curl -L \
-o result.png \
http://127.0.0.1:8000/api/v1/jobs/YOUR_JOB_ID/result
The result endpoint returns only one file:
result.png
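The same download can be done from Python. This sketch fetches the result and checks the standard 8-byte PNG signature before writing the file, so an error response is not saved as an image by mistake. `download_result` and `is_png` are hypothetical helpers, not part of the project.

```python
from urllib.request import urlopen

# Every PNG file starts with these eight bytes.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def is_png(data: bytes) -> bool:
    """Return True if the bytes begin with the PNG file signature."""
    return data.startswith(PNG_SIGNATURE)


def download_result(result_url, destination="result.png"):
    """Download the final image and save it only if it looks like a PNG."""
    # urlopen follows redirects by default, like curl -L.
    with urlopen(result_url) as response:
        data = response.read()
    if not is_png(data):
        raise ValueError("result endpoint did not return PNG data")
    with open(destination, "wb") as handle:
        handle.write(data)
    return destination
```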
For demos or scripts where progress is not needed:
curl -X POST \
-F "file=@input/samples/Untitled5.jpg" \
-o result.png \
http://127.0.0.1:8000/api/v1/process-sync

| Method | Path | Description |
|---|---|---|
| GET | /health | Service health check |
| POST | /api/v1/process | Upload image and start async processing |
| GET | /api/v1/jobs/{job_id} | Read status and progress |
| GET | /api/v1/jobs/{job_id}/result | Download final PNG |
| GET | /api/v1/jobs/{job_id}/metadata | Get technical stage metadata |
| DELETE | /api/v1/jobs/{job_id} | Delete job files |
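For clients that cannot shell out to curl, the upload to `POST /api/v1/process` can be sketched with the standard library alone. The form field name `file` comes from the curl examples in this README; `build_multipart` and `submit_image` are hypothetical helpers, and the content type is guessed from the file name.

```python
import json
import mimetypes
import uuid
from pathlib import Path
from urllib.request import Request, urlopen


def build_multipart(field, filename, payload, content_type="application/octet-stream"):
    """Encode one file as a multipart/form-data body.

    Returns (body_bytes, content_type_header).
    """
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + payload + tail, f"multipart/form-data; boundary={boundary}"


def submit_image(path, base_url="http://127.0.0.1:8000"):
    """Upload an image and return the decoded JSON job description."""
    image = Path(path)
    guessed = mimetypes.guess_type(image.name)[0] or "application/octet-stream"
    body, header = build_multipart("file", image.name, image.read_bytes(), guessed)
    request = Request(
        f"{base_url}/api/v1/process",
        data=body,
        headers={"Content-Type": header},
        method="POST",
    )
    with urlopen(request) as response:
        return json.load(response)
```

Calling `submit_image("input/samples/Untitled5.jpg")` against a running server should return the job description shown earlier, including `job_id`, `status_url`, and `result_url`.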
WARPLESS_MODEL_PATH defaults to models/docshadow_sd7k.onnx
WARPLESS_TEMPLATE_PATH optional explicit template image
WARPLESS_API_OUTPUT_DIR defaults to outputs/api_jobs
WARPLESS_CPU set to 1 to force CPU
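As an illustration of how these variables and their defaults fit together, the sketch below reads them into one settings object. It is an assumption about how such loading could look, not the project's actual code; `Settings` and `load_settings` are hypothetical names.

```python
import os
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Settings:
    model_path: str
    template_path: Optional[str]
    api_output_dir: str
    force_cpu: bool


def load_settings(env=None):
    """Build Settings from a mapping of environment variables.

    Defaults follow the values listed in this README.
    """
    env = os.environ if env is None else env
    return Settings(
        model_path=env.get("WARPLESS_MODEL_PATH", "models/docshadow_sd7k.onnx"),
        # Optional: an unset or empty value means no explicit template.
        template_path=env.get("WARPLESS_TEMPLATE_PATH") or None,
        api_output_dir=env.get("WARPLESS_API_OUTPUT_DIR", "outputs/api_jobs"),
        # Only the exact value "1" forces CPU execution.
        force_cpu=env.get("WARPLESS_CPU") == "1",
    )
```

Passing a plain dictionary instead of `os.environ` makes the defaults easy to check in tests.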
Example:
WARPLESS_CPU=1 uvicorn api:app --reload
Windows PowerShell:
$env:WARPLESS_CPU="1"
uvicorn api:app --reload

The API uses the same pipeline as the CLI scripts.
python run_pipeline.py
Manual execution:
python scan_document_frame.py
python main.py
python dewarp_with_template_handles.py
python enhance_document.py

Use the pipeline from Python:
from pathlib import Path
from warpless_docs.pipeline import DocumentPipeline
pipeline = DocumentPipeline()
image_bytes = Path("input/samples/Untitled5.jpg").read_bytes()
final_image, metadata = pipeline.process_bytes(
image_bytes=image_bytes,
progress=lambda percent, stage: print(percent, stage),
)
final_image.save("result.png")

WarpLess Docs is designed as a production preprocessing layer for document AI:
- it accepts imperfect camera photos
- it reports progress for UI integration
- it returns a single clean output
- it keeps heavy ML inference behind a simple API
- it exposes metadata for debugging and evaluation
- it is modular enough to replace or improve each stage independently
- Frame correction and perspective cleanup
- ML-based document shadow removal
- Template feature matching
- Handle-based template dewarping
- Adaptive histogram normalization
- FastAPI service with progress tracking
- Single final image output endpoint
- OCR-ready cell extraction
- Template region mapping and JSON field export
- Docker deployment
- Simple web UI
MIT License
Built by Ehsan Moradi.





