Official implementation of the SSDC paper on text-based person anomaly search.
Accepted by ACL 2026.
SSDC addresses the pose-semantic gap in text-based person anomaly search with a two-stage pipeline:
- Structure-aware coarse retrieval (SSDC stage-1 retriever).
- Multi-round semantic verification and re-ranking with MLLMs.
This repository includes training, evaluation, and vLLM-based semantic refinement scripts.
- Coarse-to-fine retrieval for large-scale anomaly search.
- Pose-aware cross-modal backbone for fast candidate filtering.
- LLM-based detective-style multi-round semantic verification.
- Supports standard and variant scripts for SSDC + vLLM evaluation.
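The coarse-to-fine idea described above can be sketched in a few lines of Python. This is an illustrative outline under stated assumptions, not the repository's implementation: `coarse_to_fine`, `coarse_scores`, `verify`, and `top_k` are hypothetical names, and `verify` stands in for the MLLM-based verification step.

```python
from typing import Callable, Dict, List


def coarse_to_fine(
    coarse_scores: Dict[str, float],
    verify: Callable[[str], float],
    top_k: int,
) -> List[str]:
    """Re-rank the top_k stage-1 candidates with a slower verifier.

    coarse_scores maps candidate id -> stage-1 similarity (higher is better).
    verify is a stand-in for the expensive semantic check (hypothetical).
    """
    # Stage 1: cheap ranking over the whole gallery.
    ranked = sorted(coarse_scores, key=coarse_scores.get, reverse=True)
    head, tail = ranked[:top_k], ranked[top_k:]
    # Stage 2: only the short list is passed to the expensive verifier.
    # sorted() is stable, so ties keep their stage-1 order.
    fine_scores = {cid: verify(cid) for cid in head}
    head = sorted(head, key=fine_scores.get, reverse=True)
    return head + tail
```

Only `top_k` candidates reach the verifier, so the cost of the second stage stays fixed no matter how large the gallery is; candidates outside the short list keep their stage-1 order.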
- Search.py: main training and evaluation entry for the SSDC stage-1 retriever.
- run.py: distributed launch wrapper for Search.py.
- train.py: training loop.
- eval.py: retrieval metrics and ITC/ITM evaluation.
- evaluate.py: standalone evaluation script.
- vllm_infer_SSDC.py: stage-2 semantic verification and re-ranking.
- configs/ssdc.yaml: SSDC stage-1 retriever configuration.
- run.sh, evaluate.sh, run_stage2.sh: example shell launch scripts.
conda create -n ssdc python=3.10 -y
conda activate ssdc
pip install -r requirements.txt
For vLLM-based stage-2 inference, additionally install:
pip install vllm easydict ftfy qwen_vl_utils
The default config file is configs/ssdc.yaml. Before running, update these paths to your local environment:
- image_root
- train_file
- test_file
- vision_config
- text_config
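Before launching, it can help to confirm that the paths listed above exist on your machine. The helper below is a minimal sketch, not part of the repository: `find_missing_paths` is a hypothetical name, and it assumes the config has already been loaded into a flat mapping whose values are a path string or a list of path strings.

```python
import os
from typing import Iterable, List, Mapping

# Keys listed in this README; the helper itself is illustrative.
PATH_KEYS = ("image_root", "train_file", "test_file", "vision_config", "text_config")


def find_missing_paths(
    config: Mapping[str, object], keys: Iterable[str] = PATH_KEYS
) -> List[str]:
    """Return the keys that are absent or whose path does not exist locally."""
    missing = []
    for key in keys:
        value = config.get(key)
        # Some entries may be lists of files; normalise to a list (assumption).
        paths = value if isinstance(value, (list, tuple)) else [value]
        if not value or not all(
            isinstance(p, str) and os.path.exists(p) for p in paths
        ):
            missing.append(key)
    return missing
```

With PyYAML installed, the mapping could come from `yaml.safe_load(open("configs/ssdc.yaml"))`; an empty result from `find_missing_paths` would then mean every listed path resolves locally.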
Dataset source note: the PAB dataset setup and related resources in this project are adapted with reference to the CMP repository: https://github.com/Shuyu-XJTU/CMP.
Current defaults in configs/ssdc.yaml point to local absolute paths from the original training machine.
Use run.py as the launcher:
python run.py \
--task ssdc \
--dist f4 \
--output_dir output/ssdc
Notes:
- --dist options are defined in run.py (f4, f2, or gpuX).
- The default checkpoint in run.py is checkpoint/16m_base_model_state_step_199999.th.
- Training logs and checkpoints are saved under --output_dir.
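How a --dist value might translate into GPU selection can be sketched as follows. This is a guess at the convention, not a copy of run.py: it assumes f4 and f2 mean the first four or two GPUs and gpuX means the single GPU with index X, and `parse_dist` is a hypothetical helper.

```python
import re
from typing import Tuple


def parse_dist(dist: str) -> Tuple[str, int]:
    """Map a --dist value to (CUDA_VISIBLE_DEVICES string, number of processes).

    Assumed convention (check run.py for the real one):
      f4 / f2 -> first four / two GPUs, gpuX -> only GPU X.
    """
    presets = {"f4": 4, "f2": 2}
    if dist in presets:
        n = presets[dist]
        return ",".join(str(i) for i in range(n)), n
    match = re.fullmatch(r"gpu(\d+)", dist)
    if match:
        return match.group(1), 1
    raise ValueError(f"unsupported --dist value: {dist!r}")
```

Under these assumptions, `parse_dist("gpu0")` selects device `0` with a single process, which matches the single-GPU evaluation command used in this README.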
Note: old cmp arguments are still accepted for backward compatibility.
python run.py \
--task ssdc \
--dist gpu0 \
--output_dir evaluation_results/ssdc_eval \
--checkpoint checkpoint/best.pth \
--evaluate
Alternatively, run the standalone evaluation script:
python evaluate.py \
--config configs/ssdc.yaml \
--task ssdc \
--output_dir evaluation_results/run_01 \
--checkpoint checkpoint/best.pth
Evaluation metrics include R1, R5, R10, mAP, and mINP.
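For readers unfamiliar with these metrics, the sketch below computes them for a single query using the definitions common in the re-identification literature. It is not taken from eval.py, which may differ in details such as tie handling; `rank_metrics` is a hypothetical name. mAP and mINP are the means of AP and INP over all queries.

```python
from typing import Dict, Sequence


def rank_metrics(
    relevant: Sequence[bool], ks: Sequence[int] = (1, 5, 10)
) -> Dict[str, float]:
    """Metrics for one query from its ranked gallery (best match first).

    relevant[i] is True when the gallery item at rank i + 1 is a correct match.
    """
    hit_ranks = [i + 1 for i, ok in enumerate(relevant) if ok]
    if not hit_ranks:
        raise ValueError("query has no correct match in the gallery")
    # R@k: 1.0 if any correct match appears within the top k, else 0.0.
    out = {f"R{k}": float(any(r <= k for r in hit_ranks)) for k in ks}
    # AP: mean of the precision measured at each correct match.
    out["AP"] = sum((n + 1) / r for n, r in enumerate(hit_ranks)) / len(hit_ranks)
    # INP: number of matches divided by the rank of the hardest (last) match.
    out["INP"] = len(hit_ranks) / hit_ranks[-1]
    return out
```

For a ranking whose correct matches sit at ranks 2 and 4, this gives R1 = 0, R5 = 1, AP = (1/2 + 2/4) / 2 = 0.5, and INP = 2/4 = 0.5.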
Once the SSDC stage-1 retriever is trained, run vLLM-based re-ranking:
python vllm_infer_SSDC.py \
--xi 0.1 \
--lambda 0.4 \
--embed_model_path checkpoint \
--source PAB \
--target PAB \
--ssdc_checkpoint checkpoint/best.pth \
--ssdc_config configs/ssdc.yaml \
--model_dir /path/to/Qwen3-VL-8B-sft \
--base_model RDE \
--root_dir /path/to/data \
--tag test
You can also use the stage-2 shell template, run_stage2.sh.
- Framework: assets/framework.jpg
- Dataset examples: assets/dataset.jpg
- Qualitative examples: assets/example.jpg
- Comparison figure: assets/comparison.JPG
The dataset organization and baseline preparation in this repository reference the public CMP project: https://github.com/Shuyu-XJTU/CMP.
- Set all dataset/model paths before running.
- Make sure GPU visibility matches your scripts (CUDA_VISIBLE_DEVICES).
- For distributed launch, ensure MASTER_PORT in run.py is available.
- Stage-2 scripts may cache intermediate CMP features to speed up repeated runs.
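One way to check that a port is available before a distributed launch is to try binding to it. The snippet is a minimal sketch using only the Python standard library; `is_port_free` is a hypothetical helper, and the port number to test is whatever MASTER_PORT is set to in your copy of run.py.

```python
import socket


def is_port_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if a TCP socket can currently be bound to host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            # Typically "address already in use": another process holds the port.
            return False
    return True
```

The check is only a snapshot: another process can still claim the port between the check and the actual launch, so treat a `True` result as a hint, not a guarantee.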
If you use this repository, please cite the corresponding paper:
@inproceedings{ssdc2026,
title={Bridging the Pose-Semantic Gap: A Cascade Framework for Text-Based Person Anomaly Search},
author={Anonymous},
booktitle={Findings of the Association for Computational Linguistics: ACL 2026},
year={2026}
}
This project is released under the license in LICENSE.