This is an image captioning app based on the DeepRNN source code (TensorFlow) and Flask. Flask serves a local API (http://127.0.0.1:5000) that runs DeepRNN inference.
This neural system for image captioning is roughly based on the paper "Show, Attend and Tell: Neural Image Caption Generation with Visual Attention" by Xu et al. (ICML 2015). The input is an image, and the output is a sentence describing the content of the image. It uses a convolutional neural network to extract visual features from the image, and an LSTM recurrent neural network to decode these features into a sentence. A soft attention mechanism is incorporated to improve the quality of the caption. This project is implemented with the TensorFlow library, and allows end-to-end training of both the CNN and RNN parts.
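The soft attention step described above can be sketched in plain NumPy. This is an illustrative sketch only: the shapes and parameter names below are assumptions for exposition, not the project's actual variables, and the real model learns these weights jointly with the CNN and LSTM.

```python
import numpy as np

def soft_attention(features, hidden, W_f, W_h, w_a):
    """Compute a soft-attention context vector over image features.

    features: (L, D) -- L spatial locations, each a D-dim CNN feature
    hidden:   (H,)   -- current LSTM hidden state
    W_f: (D, K), W_h: (H, K), w_a: (K,) -- attention parameters
    (all names/shapes here are illustrative assumptions)
    """
    # Score each spatial location by combining its feature vector
    # with the current hidden state.
    scores = np.tanh(features @ W_f + hidden @ W_h) @ w_a  # (L,)
    # Softmax over locations yields the attention weights.
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    # The context vector is the attention-weighted sum of features.
    context = weights @ features                           # (D,)
    return context, weights

# Toy example: 196 locations (a 14x14 feature map), 512-dim features.
rng = np.random.default_rng(0)
feats = rng.standard_normal((196, 512))
h = rng.standard_normal(256)
ctx, w = soft_attention(
    feats, h,
    rng.standard_normal((512, 128)) * 0.1,
    rng.standard_normal((256, 128)) * 0.1,
    rng.standard_normal(128) * 0.1,
)
```

At each decoding step the LSTM conditions on a fresh context vector, so the model can "look at" different image regions while emitting different words.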
## Requirements

- Linux (Python 2/3) or macOS (Python 3 only)
- Python 2.7 / 3.5
- nltk==3.3
- numpy==1.15.4
- scikit_image==0.14.0
- tqdm==4.26.0
- matplotlib==2.2.3
- tensorflow_gpu==1.12.0
- pandas==0.23.4
- opencv_python==4.1.0.25
- tensorflow==1.13.1
Install the requirements:

```shell
pip install -r requirements.txt
```
To speed pip up, you can point it at the Aliyun mirror. Create the config file:

```shell
cd ~
mkdir .pip
cd .pip
touch pip.conf
```

Then add the following to `pip.conf`:

```
[global]
index-url = https://mirrors.aliyun.com/pypi/simple/

[install]
trusted-host = mirrors.aliyun.com
```
Download the pretrained model file:

- Option 1: Box
- Option 2: Google Drive
- Option 3: BaiDuYun, extraction code: `nubk`

Put the `289999.npy` file into the `models` folder.
## Inference

- First, start the local API:
  ```shell
  python main.py
  ```
- Then start Jupyter:
  ```shell
  jupyter notebook
  ```
- Copy some images into the `test/images` folder for testing.
- Finally, open `run.ipynb`:
  - Run the first cell to start inference.
  - Run the second cell to visualize the result.
  - Alternatively, run `run.py` instead.
- The generated captions will be saved in the `test/results` folder.
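Since the caption server is plain HTTP, any client can talk to it, not just the notebook. A minimal stdlib sketch is below; note that the endpoint path (`/`) and the raw-bytes payload format are assumptions for illustration, so check `main.py` for the route the Flask app actually registers.

```python
import urllib.request

# Base URL of the local Flask API started by `python main.py`.
API_URL = "http://127.0.0.1:5000"

def caption_request(image_bytes, endpoint="/"):
    """Build an HTTP request carrying raw image bytes.

    NOTE: the endpoint path and payload format are assumptions;
    see main.py for the actual Flask route and expected input.
    """
    return urllib.request.Request(
        API_URL + endpoint,
        data=image_bytes,
        headers={"Content-Type": "application/octet-stream"},
    )

# Usage (with the API running and an image in test/images), e.g.:
#   with open("test/images/example.jpg", "rb") as f:
#       resp = urllib.request.urlopen(caption_request(f.read()))
#   print(resp.read().decode())
```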
## References

- DeepRNN
- Show, Attend and Tell: Neural Image Caption Generation with Visual Attention. Kelvin Xu, Jimmy Ba, Ryan Kiros, Kyunghyun Cho, Aaron Courville, Ruslan Salakhutdinov, Richard Zemel, Yoshua Bengio. ICML 2015.
- The original implementation in Theano
- An earlier implementation in TensorFlow
- Microsoft COCO dataset