Compact document layout detection model for German documents
- Compact: 6 MB PyTorch / 12 MB ONNX
- Fast: <100 ms inference, 453 FPS with TensorRT FP16 (RTX 4000 SFF Ada)
- Accurate: 99.5% mAP50 on validation data
- 10 classes: text_block, heading, table, list, image, footer, header, signature, stamp, logo
- Multi-format: PyTorch, ONNX, OpenVINO, TensorRT
# Clone the repository
git clone https://github.com/Keyvanhardani/German-Layout.git
cd German-Layout
# Create a virtual environment
python -m venv venv
source venv/bin/activate # Linux/Mac
# or: venv\Scripts\activate # Windows
# Install dependencies
pip install ultralytics torch opencv-python pillow onnx fastapi uvicorn

# Single image
python src/inference.py document.png --model models/german_layout_yolov8n.pt --output result.png
# Directory
python src/inference.py documents/ --model models/german_layout_yolov8n.pt --output results/

# Start the server
uvicorn src.api:app --host 0.0.0.0 --port 8000
# Health check
curl http://localhost:8000/health
# Layout detection
curl -X POST http://localhost:8000/detect -F "file=@document.png"
# With visualization
curl -X POST http://localhost:8000/detect/visualize -F "file=@document.png" -o result.png

from ultralytics import YOLO
# Load the model
model = YOLO("models/german_layout_yolov8n.pt")
# Run inference
results = model("document.png")
# Iterate over the detections of each result
for r in results:
    for box in r.boxes:
        cls = int(box.cls[0])  # class index
        conf = float(box.conf[0])  # confidence score
        x1, y1, x2, y2 = box.xyxy[0].tolist()  # box corners in pixels
        print(f"Class: {model.names[cls]}, Confidence: {conf:.2f}")

german-layout/
├── config/
│   └── dataset.yaml  # Dataset configuration
├── data/
│   ├── train/  # Training data
│   └── val/  # Validation data
├── models/
│   ├── german_layout_yolov8n.pt  # PyTorch
│   ├── german_layout_yolov8n.onnx  # ONNX
│   ├── german_layout_yolov8n_openvino_model/  # OpenVINO
│   └── german_layout_yolov8n.engine  # TensorRT
├── src/
│   ├── api.py  # FastAPI REST API
│   ├── train.py  # Training script
│   ├── evaluate.py  # Evaluation
│   ├── inference.py  # Inference script
│   ├── export_onnx.py  # ONNX export
│   ├── export_openvino.py  # OpenVINO export
│   ├── export_tensorrt.py  # TensorRT export
│   └── generate_synthetic_data.py
├── MODEL_CARD.md
└── README.md
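
The tree lists `config/dataset.yaml` but does not show its contents. As a rough illustration only, the sketch below renders a dataset file in the layout Ultralytics expects (`path`, `train`, `val`, `names`). The class order is assumed to match the feature list, and the `data/train` and `data/val` paths are taken from the tree; the file actually shipped in the repository may differ.

```python
# Class names in the order of the feature list. ASSUMPTION: the real model's
# class indices follow this order; check the shipped dataset.yaml or
# model.names before relying on it.
CLASS_NAMES = [
    "text_block", "heading", "table", "list", "image",
    "footer", "header", "signature", "stamp", "logo",
]


def render_dataset_yaml(root: str = "data", train: str = "train", val: str = "val") -> str:
    """Render an Ultralytics-style dataset YAML as plain text."""
    lines = [f"path: {root}", f"train: {train}", f"val: {val}", "names:"]
    # One "index: name" entry per class, indented under "names:".
    lines += [f"  {index}: {name}" for index, name in enumerate(CLASS_NAMES)]
    return "\n".join(lines) + "\n"


print(render_dataset_yaml())
```

Writing the result of `render_dataset_yaml()` to a file gives a starting point that can be compared against the repository's own configuration.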

# Generate synthetic data
python src/generate_synthetic_data.py --train 500 --val 100
# Start training
python src/train.py --epochs 50 --batch 16 --device 0
# Evaluation
python src/evaluate.py --model models/german_layout_yolov8n.pt

# ONNX
python src/export_onnx.py --model models/german_layout_yolov8n.pt
# OpenVINO
python src/export_openvino.py --model models/german_layout_yolov8n.pt
# TensorRT (NVIDIA only)
python src/export_tensorrt.py --model models/german_layout_yolov8n.pt --fp16

| Endpoint | Method | Description |
|---|---|---|
| /health | GET | Health check |
| /classes | GET | Available classes |
| /detect | POST | Layout detection (JSON) |
| /detect/visualize | POST | Detection with visualization |
| /detect/base64 | POST | Base64 input |
| /detect/batch | POST | Batch processing |
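
The `/detect` endpoint can also be called from Python. The following is a minimal sketch using only the standard library; it assumes the server runs on `http://localhost:8000` and accepts the upload under the form field `file`, as in the curl examples. The helper names `build_multipart` and `detect` are hypothetical, and the structure of the JSON response is not specified in this README.

```python
import json
import mimetypes
import uuid
from pathlib import Path
from urllib import request


def build_multipart(field: str, filename: str, payload: bytes) -> tuple[bytes, str]:
    """Encode one file as multipart/form-data; returns (body, content_type)."""
    boundary = uuid.uuid4().hex
    mime = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        f"Content-Type: {mime}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + payload + tail, f"multipart/form-data; boundary={boundary}"


def detect(image_path: str, base_url: str = "http://localhost:8000"):
    """POST an image to /detect and return the decoded JSON response."""
    path = Path(image_path)
    # "file" is the form field name used in the curl examples.
    body, content_type = build_multipart("file", path.name, path.read_bytes())
    req = request.Request(
        f"{base_url}/detect",
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )
    with request.urlopen(req, timeout=30) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With the server running, `detect("document.png")` returns the parsed JSON. Inspect the result before relying on specific keys, since its fields are not documented here.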
| Hardware | Inference Time | FPS |
|---|---|---|
| RTX 4000 SFF Ada (PyTorch) | ~12 ms | 83 |
| RTX 4000 SFF Ada (TensorRT FP16) | 2.2 ms | 453 |
| Intel CPU (OpenVINO) | ~50 ms | 20 |
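
The FPS column is consistent with the simple relation FPS ≈ 1000 / latency in milliseconds. This assumes one image per inference call (batch size 1), which the table does not state explicitly. A short sketch to check the figures:

```python
def fps_from_latency(latency_ms: float) -> float:
    """Throughput in frames per second for a given per-image latency."""
    if latency_ms <= 0:
        raise ValueError("latency_ms must be positive")
    return 1000.0 / latency_ms


# Latencies taken from the benchmark table above.
benchmarks = {
    "RTX 4000 SFF Ada (PyTorch)": 12.0,
    "RTX 4000 SFF Ada (TensorRT FP16)": 2.2,
    "Intel CPU (OpenVINO)": 50.0,
}

for name, latency_ms in benchmarks.items():
    print(f"{name}: {fps_from_latency(latency_ms):.1f} FPS")
```

The rounded 2.2 ms gives about 455 FPS, while the reported 453 FPS corresponds to roughly 2.21 ms, so the two TensorRT figures agree within rounding.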
Apache License 2.0
- YOLOv8: Ultralytics
- System: AEGIS Supervisor
- Architect: Keyvan.ai
Created by AEGIS Supervisor - 2025-12-30