Mindspore Implementation of "Show, Attend and Tell: Neural Image Caption Generation with Visual Attention"
- Convolutional Neural Networks
- Long Short Term Memory Cells
- Attention Mechanism (a minimal MindSpore sketch follows the requirements below)
- Ascend910 or RTX 3090
- Mindspore=2.0.0
- Python=3.8.0
- mode=ms.PYNATIVE_MODE
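The attention component can be summarized with the sketch below. It illustrates the soft (additive) attention described in the paper and is not the exact code in this repository; the class name, layer names, and dimensions are illustrative assumptions.

```python
# Minimal soft-attention sketch in MindSpore (illustrative, not the repo's exact code).
import numpy as np
import mindspore as ms
import mindspore.nn as nn
import mindspore.ops as ops

class SoftAttention(nn.Cell):
    """Additive attention over the encoder's spatial feature vectors."""
    def __init__(self, encoder_dim, decoder_dim, attention_dim):
        super().__init__()
        self.encoder_att = nn.Dense(encoder_dim, attention_dim)  # project image features
        self.decoder_att = nn.Dense(decoder_dim, attention_dim)  # project decoder hidden state
        self.full_att = nn.Dense(attention_dim, 1)               # scalar attention score per location

    def construct(self, encoder_out, decoder_hidden):
        # encoder_out: (batch, num_pixels, encoder_dim), decoder_hidden: (batch, decoder_dim)
        att1 = self.encoder_att(encoder_out)                      # (batch, num_pixels, attention_dim)
        att2 = self.decoder_att(decoder_hidden).expand_dims(1)    # (batch, 1, attention_dim)
        scores = self.full_att(ops.tanh(att1 + att2)).squeeze(2)  # (batch, num_pixels)
        alpha = ops.softmax(scores, axis=-1)                      # attention weights over locations
        context = (encoder_out * alpha.expand_dims(2)).sum(axis=1)  # (batch, encoder_dim)
        return context, alpha

# Quick shape check with random features (e.g. a 14x14x2048 CNN feature map flattened to 196 vectors).
feats = ms.Tensor(np.random.randn(2, 196, 2048).astype(np.float32))
hidden = ms.Tensor(np.random.randn(2, 512).astype(np.float32))
context, alpha = SoftAttention(2048, 512, 512)(feats, hidden)
print(context.shape, alpha.shape)  # (2, 2048) (2, 196)
```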
Clone the repo:
git clone https://github.com/NicholasKX/ShowAttendTell.git
- Prepare the dataset (Flickr8k).
- Extract the archive, move the images into a folder named Images, and put the caption text in captions.txt.
- Place the Images folder and captions.txt inside a folder named flickr8k (expected layout and a small loading sketch below).
- Use Andrej Karpathy's training, validation, and test splits (train.csv, val.csv, test.csv).
flickr8k
|-- Images
|   |-- 1000268201_693b08cb0e.jpg
|   |-- ......
|-- captions.txt
|-- train.csv
|-- val.csv
|-- test.csv
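The sketch below shows one way this layout can be read into (image path, caption) pairs. It assumes captions.txt is a CSV with an `image,caption` header (the common Kaggle Flickr8k format); adjust the parsing if your file differs. It is not the repository's own data-loading code.

```python
# Minimal sketch: read the flickr8k layout above into (image_path, caption) pairs.
import csv
from pathlib import Path

def load_pairs(flickr8k_root):
    root = Path(flickr8k_root)
    pairs = []
    with open(root / "captions.txt", newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f)  # expects "image,caption" columns (assumption)
        for row in reader:
            image_path = root / "Images" / row["image"]
            pairs.append((image_path, row["caption"].strip()))
    return pairs

pairs = load_pairs("flickr8k")
print(len(pairs), pairs[0])
```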
- Run the following command:
python train.py
- Specify the checkpoint file path in train.py.
- You can also change the hyperparameters in train.py.
- The trained model is saved in the model_saved folder (see the checkpoint sketch below).
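The snippet below is a minimal sketch of the MindSpore checkpoint calls involved in saving to and loading from model_saved; the placeholder network and file name are illustrative, not the exact code in train.py.

```python
# Minimal MindSpore checkpoint sketch (illustrative placeholder network and path).
import os
import mindspore as ms
import mindspore.nn as nn

os.makedirs("model_saved", exist_ok=True)
net = nn.Dense(10, 10)  # stands in for the encoder/decoder built in train.py

# Saving: write the network parameters into the model_saved folder.
ms.save_checkpoint(net, "model_saved/example.ckpt")

# Loading: restore the parameters into a freshly built network before inference.
param_dict = ms.load_checkpoint("model_saved/example.ckpt")
ms.load_param_into_net(net, param_dict)
```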
- Download the checkpoint file and put it in the model_saved folder.
- Run the following command:
python caption.py --img <path_to_image> --beam_size <beam_size>
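For example, with an image from the Flickr8k Images folder and a beam size of 5 (both values are illustrative):

python caption.py --img flickr8k/Images/1000268201_693b08cb0e.jpg --beam_size 5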
- Run the following command:
python evaluation.py
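BLEU is the standard metric for this task (the original paper reports BLEU-1 through BLEU-4 and METEOR). The snippet below is a minimal sketch of computing corpus-level BLEU-4 with NLTK on toy data; it is not the code inside evaluation.py.

```python
# Minimal BLEU-4 sketch with NLTK (toy data, illustrative only).
from nltk.translate.bleu_score import corpus_bleu

# Each hypothesis is scored against all reference captions of its image.
references = [[["a", "dog", "runs", "on", "the", "beach"],
               ["a", "dog", "is", "running", "along", "the", "shore"]]]
hypotheses = [["a", "dog", "is", "running", "on", "the", "beach"]]

bleu4 = corpus_bleu(references, hypotheses, weights=(0.25, 0.25, 0.25, 0.25))
print(f"BLEU-4: {bleu4:.4f}")
```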
Some of the results obtained are shown below:
Caption: a dog is running on the beach.
Caption: a man is standing on top of a mountain.
Bad Case:
Caption: a man rides a motorcycle.