This repository contains the PyTorch implementation of our paper: [Rethinking Open Vocabulary Video Anomaly Detection - Normality Matters]
Please set up the environment by following the requirements.txt file.
To reproduce the inference results:
- Change the test list path in src/configs_base2novel.py to the all/base/novel test set. The 'All' option is set by default in configs_base2novel.py.
- Download and move ckpt/ to your own path, and set the ckpt path in src/configs_base2novel.py.
- Inference:
cd src
python main.py --mode infer --dataset ucf --test best_ckpt --device cuda:0
To train from scratch:
- Official dataset download: the original datasets for UCF-Crime, ShanghaiTech, XD-Violence, and UBnormal can be obtained from their official sources.
- Extract the CLIP features: the extracted CLIP features for the UCF-Crime, ShanghaiTech, and XD-Violence datasets can be obtained from CLIP-TSA. You can also use the CLIP model to extract features yourself by referring to the scripts under ./scripts/feature_extract.
The following files need to be modified in order to run the code on your own machine:
- Change the file paths to the CLIP features of the datasets above in src/list/, and feel free to change the hyperparameters in configs_base2novel.py.
- Run the training command:
cd src
python main.py --mode train --dataset ucf --test best_ckpt --device cuda:0
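
Updating the feature paths in src/list/ by hand can be tedious when there are many entries. The following is a minimal sketch, not part of this repository, that assumes the list files are UTF-8 text files (for example CSV) whose entries share a common path prefix. The function name `rewrite_feature_paths` and both prefix arguments are hypothetical.

```python
from pathlib import Path


def rewrite_feature_paths(list_dir, old_prefix, new_prefix):
    """Replace old_prefix with new_prefix in every file directly inside list_dir.

    Assumes the files are UTF-8 text. Returns the names of the files
    that were actually modified.
    """
    changed = []
    for path in sorted(Path(list_dir).iterdir()):
        if not path.is_file():
            continue  # skip sub-directories
        text = path.read_text(encoding="utf-8")
        updated = text.replace(old_prefix, new_prefix)
        if updated != text:
            path.write_text(updated, encoding="utf-8")
            changed.append(path.name)
    return changed
```

A call such as `rewrite_feature_paths("src/list", "/old/feature/root/", "/your/feature/root/")` would then update all list files in place; both prefixes here are placeholders, so check one list file first to see which prefix it actually contains.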
The --dataset option can be ucf, sh, xd, or ub, referring to UCF-Crime, ShanghaiTech, XD-Violence, or UBnormal.
The --test option creates a new folder for the training run.
The --device option assigns the GPU.
You can add more options, such as --seed and --lamda2, to change the training settings. Default parameters can be found in main.py.
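
To run the same command over several datasets, the options described above can be assembled from Python. This is a minimal sketch rather than part of the repository: the helper names `build_command` and `run` are hypothetical, the default values simply mirror the example commands in this README, and only the flags mentioned above are used.

```python
import subprocess

# Dataset keys listed in this README:
# UCF-Crime, ShanghaiTech, XD-Violence, UBnormal.
DATASETS = ("ucf", "sh", "xd", "ub")
MODES = ("train", "infer")


def build_command(mode, dataset, test="best_ckpt", device="cuda:0", extra=()):
    """Return the argument list for the main.py call shown in this README.

    `extra` holds further options as a flat sequence, e.g. ["--seed", "0"].
    """
    if mode not in MODES:
        raise ValueError(f"unknown mode: {mode!r}")
    if dataset not in DATASETS:
        raise ValueError(f"unknown dataset: {dataset!r}")
    return [
        "python", "main.py",
        "--mode", mode,
        "--dataset", dataset,
        "--test", test,
        "--device", device,
        *extra,
    ]


def run(mode, dataset, **kwargs):
    """Run main.py from the src directory; raises CalledProcessError on failure."""
    subprocess.run(build_command(mode, dataset, **kwargs), cwd="src", check=True)
```

For example, `run("train", "xd", test="my_run", extra=["--seed", "0"])` would launch training on XD-Violence from the repository root; the folder name `my_run` and the seed value are placeholders.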
