Implementation for Collaborative Spatial-Temporal Modeling for Language-Queried Video Actor Segmentation, CVPR 2021. PDF.
- Python 3.8
- PyTorch 1.9
- CUDA 11
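A quick way to compare a local setup against the versions listed above is sketched below. It is only an illustration and not part of the repository: the helper names `version_tuple` and `describe_environment` are hypothetical, and the PyTorch fields are left as `None` when `torch` is not installed.

```python
import importlib
import re
import sys


def version_tuple(version):
    """Turn a version string such as '1.9.0+cu111' into (1, 9)."""
    parts = []
    for piece in version.split("+")[0].split(".")[:2]:
        match = re.match(r"\d+", piece)  # keep only the leading digits
        parts.append(int(match.group()) if match else 0)
    return tuple(parts)


def describe_environment():
    """Collect Python, PyTorch and CUDA versions; PyTorch fields stay None without torch."""
    info = {"python": sys.version_info[:2], "torch": None, "cuda": None}
    try:
        torch = importlib.import_module("torch")
    except ImportError:
        return info
    info["torch"] = version_tuple(torch.__version__)
    info["cuda"] = torch.version.cuda  # None for CPU-only builds
    return info


if __name__ == "__main__":
    # The README lists Python 3.8, PyTorch 1.9 and CUDA 11.
    print(describe_environment())
```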
Download the A2D Sentences dataset from this link and modify the dataset paths in ./datasets/a2d_dataset.py. The GloVe word embeddings can be downloaded from this link; put glove.840B.300d.zip in ./word_embedding.
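To confirm that the GloVe archive is readable once it is in ./word_embedding, a minimal sketch like the following may help. It is not the loader used by this repository. It assumes the archive's first member is the plain-text embedding file, with one token followed by 300 space-separated values per line; `parse_glove_line` and `iter_glove` are hypothetical helper names.

```python
import io
import zipfile


def parse_glove_line(line, dim=300):
    """Split one GloVe text line into (token, vector).

    Splitting from the right keeps tokens that themselves contain spaces intact.
    """
    fields = line.rstrip("\n").rsplit(" ", dim)
    if len(fields) != dim + 1:
        raise ValueError(f"expected a token and {dim} values, got {len(fields)} fields")
    return fields[0], [float(value) for value in fields[1:]]


def iter_glove(zip_path, dim=300, limit=None):
    """Yield (token, vector) pairs from the first member of the archive (assumed text)."""
    with zipfile.ZipFile(zip_path) as archive:
        member = archive.namelist()[0]
        with archive.open(member) as raw:
            text = io.TextIOWrapper(raw, encoding="utf-8")
            for index, line in enumerate(text):
                if limit is not None and index >= limit:
                    break
                yield parse_glove_line(line, dim)
```

For example, `next(iter_glove("./word_embedding/glove.840B.300d.zip"))` would return the first token and its vector without unpacking the whole archive.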
We provide the trained checkpoint of our model on Baidu Drive (password: fs58).
To test on the A2D Sentences dataset, use the following command:
python test.py \
--data_margin 2 --batch_size 8 \
--gpu_id 0 --resize 320 --skip single \
--dim_semantic 512 \
--dataset A2D --model_root cstm_checkpoints \
--checkpoint checkpoint.pth.tar
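The same command can also be launched from Python, which makes it easy to check for the checkpoint first. The sketch below copies the flags from the command above. The helper names `build_test_command` and `run_test` are hypothetical, and the location `<model_root>/<checkpoint>` is an assumption about how test.py resolves the file, not something confirmed here.

```python
import subprocess
import sys
from pathlib import Path


def build_test_command(model_root="cstm_checkpoints", checkpoint="checkpoint.pth.tar", gpu_id=0):
    """Return the test command as an argument list (flags copied from the README)."""
    return [
        sys.executable, "test.py",
        "--data_margin", "2", "--batch_size", "8",
        "--gpu_id", str(gpu_id), "--resize", "320", "--skip", "single",
        "--dim_semantic", "512",
        "--dataset", "A2D", "--model_root", model_root,
        "--checkpoint", checkpoint,
    ]


def run_test(model_root="cstm_checkpoints", checkpoint="checkpoint.pth.tar", gpu_id=0):
    """Run test.py after checking that the checkpoint file is present."""
    # Assumption: test.py looks for the checkpoint at <model_root>/<checkpoint>.
    path = Path(model_root) / checkpoint
    if not path.is_file():
        raise FileNotFoundError(f"checkpoint not found: {path}")
    return subprocess.run(build_test_command(model_root, checkpoint, gpu_id), check=True)
```

Calling `run_test()` from the repository root fails early with a clear message if the downloaded checkpoint has not been placed where this sketch expects it.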