dadadidodi/video-caption.pytorch

PyTorch implementation of video captioning

We recommend installing PyTorch and the Python packages using Anaconda.

Requirements

  • cuda
  • pytorch 0.3.1
  • python3
  • ffmpeg (can install using anaconda)

Python packages

  • tqdm
  • pillow
  • pretrainedmodels
  • nltk

Data

MSR-VTT. The test videos don't have captions, so I split the training videos into train/val/test. Extract the data and put it in the ./data/ directory.
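The custom split can be sketched as below; the video ID list, the 80/10/10 ratios, and the seed are illustrative assumptions, not the repo's actual split:

```python
import random

# Illustrative video IDs; the real MSR-VTT training set is larger (assumption)
video_ids = [f"video{i}" for i in range(100)]

random.seed(42)  # fix the shuffle so the split is reproducible
random.shuffle(video_ids)

# 80/10/10 train/val/test split (ratios are an assumption)
n = len(video_ids)
train = video_ids[: int(0.8 * n)]
val = video_ids[int(0.8 * n) : int(0.9 * n)]
test = video_ids[int(0.9 * n) :]
```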

Options

All default options are defined in opt.py or the corresponding code file; change them as you like.

Usage

(Optional) c3d features

You can use video-classification-3d-cnn-pytorch to extract features from videos, then mean-pool over the frames to get a single 2048-dim feature for each video.
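The mean-pooling step is just an average over the frame axis; a minimal sketch, assuming per-frame features of shape (num_frames, 2048):

```python
import numpy as np

# Hypothetical per-frame features: 40 sampled frames, 2048-dim each
frame_feats = np.ones((40, 2048), dtype=np.float32)

# Mean-pool over the frame axis to get one 2048-dim feature per video
video_feat = frame_feats.mean(axis=0)
```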

Steps

  1. Preprocess videos and labels

    This step takes about 3 hours for the MSR-VTT dataset using one Titan Xp GPU.

python prepro_feats.py --output_dir data/feats/resnet152 --model resnet152 --n_frame_steps 40  --gpu 4,5

python prepro_vocab.py

  2. Train a model

python train.py --gpu 5,6,7 --epochs 9001 --batch_size 450 --checkpoint_path data/save --feats_dir data/feats/resnet152 --dim_vid 2048 --model S2VTAttModel

  3. Test

    opt_info.json will be in the same directory as the saved model.

python eval.py --recover_opt data/save/opt_info.json --saved_model data/save/model_1000.pth --batch_size 100 --gpu 1,0
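The --recover_opt flag restores the options saved at training time before loading the model. A minimal sketch of that round-trip; the file name comes from the command above, but the keys shown are assumptions:

```python
import json
import os
import tempfile

# Hypothetical opt_info.json contents; the real file is written at training time
opt = {"model": "S2VTAttModel", "dim_vid": 2048, "batch_size": 450}

path = os.path.join(tempfile.mkdtemp(), "opt_info.json")
with open(path, "w") as f:
    json.dump(opt, f)

# eval.py-style recovery: load the saved options before restoring the checkpoint
with open(path) as f:
    recovered = json.load(f)
```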

Metrics

I use a fork of coco-caption by XgDuan. Thanks for porting it to Python 3.

TODO

  • lstm
  • beam search
  • reinforcement learning

Note

This repository is not maintained; please see my other repository, video-caption-openNMT.py, which has higher performance and better test scores.
