Caption-Generation

Automatically generates a reasonable caption for an input video, using an attention-based seq2seq model with LSTM cells.

Model Prediction: "a small dog is playing with a ball"
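
The decoder attends over per-frame video features at each word step. As a rough orientation, here is a minimal NumPy sketch of an additive attention step; the function name, the parameter names (W_q, W_k, v), and the shapes are illustrative assumptions, not the repository's actual code.

import numpy as np

def attention_context(decoder_state, encoder_outputs, W_q, W_k, v):
    """Additive attention over encoder outputs (illustrative shapes).

    decoder_state:   (hidden,)          current LSTM decoder state
    encoder_outputs: (num_frames, enc)  per-frame video features
    W_q, W_k, v:     learned projections (hidden x attn, enc x attn, attn)
    """
    # Score every frame against the current decoder state.
    scores = np.tanh(encoder_outputs @ W_k + decoder_state @ W_q) @ v   # (num_frames,)
    # Softmax the scores into attention weights.
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    # Context vector: attention-weighted sum of the frame features.
    return weights @ encoder_outputs                                    # (enc,)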

Environment

Python 3
TensorFlow 1.0

Data

Source link
Google Drive link

Usage

Download the hw2 data from Kaggle and the 300-dimensional GloVe embeddings, and place them as follows:

./Caption-Generation/MLDS_hw2_data/*
./Caption-Generation/MLDS_hw2_data/glove/glove.6B.300d.txt
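
For reference, a small sketch of loading the GloVe file into a word-to-vector dict, assuming the standard whitespace-separated glove.6B.300d.txt format; the load_glove helper name and its default path are illustrative, not the repository's code.

import numpy as np

def load_glove(path="./MLDS_hw2_data/glove/glove.6B.300d.txt"):
    """Load GloVe vectors into a {word: 300-dim np.ndarray} dict."""
    embeddings = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            word, *values = line.rstrip().split(" ")
            embeddings[word] = np.asarray(values, dtype=np.float32)
    return embeddings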

Train

For first-time use, you need to run the preprocessing:

$ python3 caption_gen.py --prepro 1

If you have already done the preprocessing:

$ python3 caption_gen.py --prepro 0
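
For illustration only, the flags used above could be parsed roughly like this; this is an assumed sketch of typical argparse wiring, not the repository's actual argument handling.

import argparse

parser = argparse.ArgumentParser()
# 1 = run preprocessing before training, 0 = reuse existing preprocessed data
parser.add_argument("--prepro", type=int, choices=[0, 1], default=1)
parser.add_argument("--model_type", default="CaptionGeneratorBasic")
args = parser.parse_args()

if args.prepro == 1:
    pass  # hypothetical spot for preprocessing: build vocab, tokenize captions, cache features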

Model

There are three different models available.

  1. CaptionGeneratorBasic
     • greedy inference
  2. CaptionGeneratorMyBasic
     • beam search
     • greedy inference
  3. CaptionGeneratorSS
     • scheduled sampling (see the sketch after this list)
     • beam search
     • greedy inference

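A minimal sketch of the scheduled sampling idea behind CaptionGeneratorSS: during training, the next decoder input is sometimes the model's own previous prediction instead of the ground-truth token. The helper names and the linear ramp schedule are assumptions for illustration, not the repository's code.

import numpy as np

def next_decoder_input(ground_truth_token, predicted_token, sampling_prob, rng=np.random):
    """With probability sampling_prob, feed the model's own prediction back in;
    otherwise fall back to teacher forcing (the ground-truth token)."""
    if rng.random() < sampling_prob:
        return predicted_token
    return ground_truth_token

def sampling_prob_for_epoch(epoch, max_prob=0.25, ramp_epochs=20):
    """One possible schedule: ramp the sampling probability up linearly per epoch."""
    return min(max_prob, max_prob * epoch / ramp_epochs)
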
You can set --model_type to choose a different model, e.g.

$ python3 caption_gen.py --prepro [1/0] --model_type=CaptionGeneratorSS

Inference

This code provides two inference methods, greedy search and beam search.
Beam search inference is not available in the CaptionGeneratorBasic model.
(The default beam width k is 5.)
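
For reference, a framework-agnostic sketch of beam search decoding with the default beam width of 5; step_fn, the BOS/EOS token ids, and max_len are hypothetical stand-ins for the model's actual decoder step, not the repository's implementation.

import numpy as np

def beam_search(step_fn, init_state, bos_id, eos_id, beam_width=5, max_len=20):
    """step_fn(last_token_id, state) -> (log_probs over vocab, new_state).

    Keeps the beam_width best partial captions at every step and returns
    the highest-scoring completed sequence of token ids."""
    beams = [([bos_id], 0.0, init_state)]   # (tokens, total log prob, decoder state)
    completed = []
    for _ in range(max_len):
        candidates = []
        for tokens, score, state in beams:
            log_probs, new_state = step_fn(tokens[-1], state)
            # Expand each beam with its beam_width most likely next tokens.
            for tok in np.argsort(log_probs)[-beam_width:]:
                candidates.append((tokens + [int(tok)], score + float(log_probs[tok]), new_state))
        # Keep only the overall top beam_width candidates.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = []
        for tokens, score, state in candidates[:beam_width]:
            if tokens[-1] == eos_id:
                completed.append((tokens, score))
            else:
                beams.append((tokens, score, state))
        if not beams:
            break
    if not completed:   # nothing ended with EOS within max_len
        completed = [(tokens, score) for tokens, score, _ in beams]
    return max(completed, key=lambda c: c[1])[0]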