Master's Thesis Project · Universität Passau · 2025
Author: Asadullah Rahoo
TARA is a desktop annotation tool that automates the labelling of Person Re-Identification (ReID) datasets. It integrates object detection, multi-object tracking, and ReID feature matching into a single Qt-based GUI built on top of LabelMe, reducing annotation time from 120–180 s/frame (manual) to ~10 s/frame (automated).
| Feature | Details |
|---|---|
| Detection models | YOLOv8, YOLOv11, Faster R-CNN, Hybrid (YOLO + SAM) |
| ReID models | OSNet, FastReID, TransReID |
| Tracking | EnhancedDeepSORT (cosine metric, Kalman filter) |
| Live preview | Real-time annotated frame display during video processing |
| Annotation export | LabelMe JSON, MOT format, Identity-Aware JSON |
| Review mode | Frame-by-frame manual correction after automated pass |
| Evaluation | MOTA, MOTP, IDF1, Precision, Recall, mAP via evaluate_pipeline.py |
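At its core, the ReID feature matching listed above compares appearance embeddings by cosine distance and accepts a match only below a gating threshold. A minimal sketch of that idea (illustrative only, not TARA's actual tracker code; the 0.25 threshold mirrors the default `max_cosine_distance`):

```python
import numpy as np

def cosine_distance(a, b):
    """1 - cosine similarity between two embeddings."""
    a = a / np.linalg.norm(a)
    b = b / np.linalg.norm(b)
    return 1.0 - float(np.dot(a, b))

def match_identity(query, gallery, threshold=0.25):
    """Return the gallery track ID closest to `query`,
    or None if every distance exceeds the gating threshold."""
    best_id, best_dist = None, threshold
    for track_id, emb in gallery.items():
        d = cosine_distance(query, emb)
        if d < best_dist:
            best_id, best_dist = track_id, d
    return best_id
```

A tighter threshold trades missed re-identifications for fewer false identity merges, which is exactly the trade-off the `max_cosine_distance` tunable controls.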
- Python 3.9–3.12
- NVIDIA GPU with ≥ 4 GB VRAM (recommended) or CPU fallback
- CUDA 11.8 / 12.1 (for GPU inference)
- Windows 10/11 or Ubuntu 20.04+
```bash
git clone <your-repo-url>
cd labelme
python -m venv venv

# Windows
venv\Scripts\activate
# Linux / macOS
source venv/bin/activate
```

Visit https://pytorch.org/get-started/locally/ and choose your CUDA version. Example for CUDA 11.8:

```bash
pip install torch==2.0.1+cu118 torchvision==0.15.2+cu118 --extra-index-url https://download.pytorch.org/whl/cu118
pip install -r requirements.txt
```

Optional ReID and segmentation backends:

```bash
# OSNet / AGW / SBS (torchreid)
pip install git+https://github.com/KaiyangZhou/deep-person-reid.git
# SAM (Segment Anything)
pip install git+https://github.com/facebookresearch/segment-anything.git
# MobileSAM (lightweight alternative, already included under MobileSam/)
pip install git+https://github.com/ChaoningZhang/MobileSAM.git
```

Place model weights in the following locations (or update paths in `tara_config.py`):
```
labelme/
├── yolov8n.pt                     ← YOLOv8n detector (required)
├── MobileSam/MobileSAM/weights/
│   └── mobile_sam.pt              ← MobileSAM (already in repo)
└── pretraine-models/
    ├── sam_vit_b_01ec64.pth       ← SAM ViT-B (optional, for hybrid mode)
    ├── market_bot_R50.pth         ← FastReID Market-1501 weights
    └── ...
```
Download links:
- YOLOv8n: auto-downloaded on first use via `pip install ultralytics && python -c "from ultralytics import YOLO; YOLO('yolov8n.pt')"`
- SAM ViT-B: https://dl.fbaipublicfiles.com/segment_anything/sam_vit_b_01ec64.pth
- FastReID Market-1501: https://github.com/JDAI-CV/fast-reid/releases
```bash
python model_validator.py
```

This prints a checklist of which models are available and which pipeline modes are enabled.
All tunable parameters live in tara_config.py — edit this file to adapt the tool without touching any other source file.
```python
DETECTION = {
    "conf_threshold": 0.20,       # Lower → higher recall (thesis §4.7 fix)
    "iou_threshold": 0.40,
    "max_det": 100,               # Raised for crowded scenes
}
TRACKING = {
    "max_cosine_distance": 0.25,  # Tighter → fewer false re-ID merges
    "max_age": 60,                # Doubled → survives longer occlusions
    "nn_budget": 200,             # Larger appearance gallery
}
PREVIEW_EVERY_N_FRAMES = 2        # Live preview throttle (1 = every frame)
```

```bash
python main.py
# or
python app.py
```

On startup TARA validates all model weights and shows a warning if any critical weights are missing.
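Because the tunables are plain Python values, they are easy to range-check before committing to a long annotation run. A minimal sketch (the dicts are inlined here so the example is self-contained; in the repo they live in `tara_config.py`):

```python
# Inlined copies of the tara_config.py values shown above.
DETECTION = {"conf_threshold": 0.20, "iou_threshold": 0.40, "max_det": 100}
TRACKING = {"max_cosine_distance": 0.25, "max_age": 60, "nn_budget": 200}

def validate_config(detection, tracking):
    """Raise ValueError on out-of-range tunables instead of failing mid-run."""
    if not 0.0 < detection["conf_threshold"] < 1.0:
        raise ValueError("conf_threshold must be in (0, 1)")
    if not 0.0 < detection["iou_threshold"] < 1.0:
        raise ValueError("iou_threshold must be in (0, 1)")
    if detection["max_det"] < 1:
        raise ValueError("max_det must be >= 1")
    if not 0.0 < tracking["max_cosine_distance"] <= 2.0:
        raise ValueError("max_cosine_distance must be in (0, 2]")
    if tracking["max_age"] < 1 or tracking["nn_budget"] < 1:
        raise ValueError("max_age and nn_budget must be >= 1")
    return True
```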
```bash
python evaluate_pipeline.py \
    --gt_dir MOT17DET/train/MOT17-02/gt/gt.txt \
    --pred_dir output/annotations.json \
    --output evaluation_results/
```

Outputs: MOTA, MOTP, IDF1, Precision, Recall, mAP, ID switches.
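For reference, MOTA aggregates the three tracking error types over the whole sequence: MOTA = 1 − (FN + FP + IDSW) / GT. A minimal sketch of the formula (not the actual `evaluate_pipeline.py` implementation):

```python
def mota(false_negatives, false_positives, id_switches, num_gt):
    """Multi-Object Tracking Accuracy over an entire sequence.

    num_gt is the total number of ground-truth boxes across all frames;
    MOTA can go negative when errors outnumber ground-truth objects.
    """
    return 1.0 - (false_negatives + false_positives + id_switches) / num_gt

# Example: 300 GT boxes, 60 misses, 30 false alarms, 3 identity switches
score = mota(60, 30, 3, 300)  # ≈ 0.69
```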
```bash
python -m pytest tests/ -v
```

Tests cover: config parameter validation, tracker output format, annotation worker helpers (`preprocess_image`, `_draw_preview`), and the model validator schema.
```
Video / Image Input
        │
        ▼
[preprocess_image]  ← CLAHE + Gaussian (low-light fix)
        │
        ▼
[Detection]
  YOLOv8 / YOLOv11 / Faster R-CNN / Hybrid (YOLO+SAM)
        │
        ▼
[Feature Extraction]
  OSNet (512-d) / FastReID (2048-d) / TransReID (768-d)
        │
        ▼
[EnhancedDeepSORT Tracking]
  cosine distance + Kalman filter + IoU-based NMS
        │
        ├──► [Live Preview]  ← frame_preview signal → Qt canvas
        │
        ▼
[LabelMe JSON Export]
  + ReID identity embeddings per person
        │
        ▼
[Review Mode]
  Frame-by-frame manual correction
```
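In code, these stages reduce to a per-frame loop: detect, extract features, update the tracker, collect annotations for export. A structural sketch with stubbed-out models (function names and shapes are illustrative, not TARA's actual API):

```python
def detect(frame):
    """Stub detector: person boxes as (x1, y1, x2, y2, conf)."""
    return [(10, 20, 50, 120, 0.9)]

def extract_features(frame, boxes):
    """Stub ReID backbone: one fixed-length embedding per box."""
    return [[0.0] * 512 for _ in boxes]

class Tracker:
    """Stub for EnhancedDeepSORT: assigns a fresh ID per detection."""
    def __init__(self):
        self.next_id = 1

    def update(self, boxes, features):
        tracks = [(self.next_id + i, box) for i, box in enumerate(boxes)]
        self.next_id += len(boxes)
        return tracks

def annotate_video(frames):
    tracker = Tracker()
    annotations = []
    for frame_idx, frame in enumerate(frames):
        boxes = detect(frame)                    # [Detection]
        feats = extract_features(frame, boxes)   # [Feature Extraction]
        tracks = tracker.update(boxes, feats)    # [Tracking]
        annotations.append({"frame": frame_idx, "tracks": tracks})
    return annotations                           # → exported as JSON
```

The real pipeline additionally emits a preview signal per frame and attaches the ReID embeddings to each exported shape.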
| Scenario | Detection | ReID | Speed | Memory |
|---|---|---|---|---|
| Real-time / edge | YOLOv11 | OSNet | 3.91 fps | 8 MB |
| Balanced quality | YOLOv11 | FastReID | 3.75 fps | 8 MB |
| Max accuracy | Hybrid (YOLO+SAM) | TransReID | 0.23 fps | N/A |
| Mid-range GPU | YOLOv8 | FastReID | 1.01 fps | 1 GB |
- Recall gap: Even with optimised thresholds, recall peaks at ~0.37 on MOT17DET. Domain-specific fine-tuning of the ReID models would close this gap.
- Hybrid model memory: SAM adds significant VRAM overhead. Memory headroom is guarded via `GPU_MIN_FREE_GB` in `tara_config.py`.
- Low-light / occlusion: CLAHE preprocessing reduces the reported ~8% low-light drop. Dedicated domain adaptation (thermal, depth modalities) is a future direction.
- Batch image parallelism: `annotateImage` currently processes images sequentially. `ThreadPoolExecutor` integration is the next planned optimisation.
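The planned batching could look like the following sketch (`annotate_image` here is a hypothetical stand-in for the sequential per-image routine; name and signature are assumptions):

```python
from concurrent.futures import ThreadPoolExecutor

def annotate_image(path):
    """Hypothetical stand-in for the per-image annotation routine."""
    return {"image": path, "shapes": []}

def annotate_batch(paths, max_workers=4):
    """Annotate several images in parallel; results keep input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(annotate_image, paths))
```

Threads only help when inference releases the GIL (GPU-bound calls in PyTorch typically do); for CPU-bound preprocessing a process pool may be the better fit.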
```
labelme/
├── app.py                  ← Main Qt application (12 000+ lines)
├── annotation_worker.py    ← QThread worker: detection → tracking → ReID
├── tracker.py              ← EnhancedDeepSORT wrapper
├── hybirdAnnotation.py     ← Hybrid YOLO+SAM pipeline
├── evaluate_pipeline.py    ← MOTA/MOTP evaluation
├── tara_config.py          ← Central configuration (edit me!)
├── model_validator.py      ← Startup model checker
├── requirements.txt        ← Pinned dependencies
├── tests/                  ← Unit test suite
│   ├── conftest.py
│   ├── test_config.py
│   ├── test_tracker.py
│   └── test_annotation_worker.py
├── MobileSam/              ← MobileSAM weights (included)
├── fastreid/               ← FastReID source
└── MOT17DET/               ← Evaluation dataset
```
If you use TARA in your research, please cite:
Rahoo, A. (2025). Automatic Annotation Tool for ReID Dataset (TARA).
Master's Thesis, Universität Passau.
This project is released for academic and research use. See the individual model licences (YOLOv8, SAM, FastReID, TransReID) for commercial usage terms.