
Image_Captioning

This project contains files from the Udacity Computer Vision Nanodegree.

In this project we combine a CNN (as the encoder) and an RNN (as the decoder) to produce captions for images from the COCO dataset - Common Objects in Context.
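A minimal sketch of this encoder-decoder pairing in PyTorch, roughly the shape used in this kind of project (the ResNet-50 backbone, LSTM decoder, and the `embed_size`/`hidden_size`/`vocab_size` hyperparameters are assumptions for illustration, not the exact project code):

```python
import torch
import torch.nn as nn
import torchvision.models as models

class EncoderCNN(nn.Module):
    """Pre-trained CNN that maps an image to a fixed-length feature vector."""
    def __init__(self, embed_size):
        super().__init__()
        # Older torchvision uses models.resnet50(pretrained=True) instead.
        resnet = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
        # Drop the classification head; keep the convolutional feature extractor.
        self.backbone = nn.Sequential(*list(resnet.children())[:-1])
        self.embed = nn.Linear(resnet.fc.in_features, embed_size)

    def forward(self, images):
        with torch.no_grad():  # keep the pre-trained backbone frozen
            features = self.backbone(images)
        return self.embed(features.flatten(1))

class DecoderRNN(nn.Module):
    """LSTM that unrolls from the image feature to a caption (teacher forcing)."""
    def __init__(self, embed_size, hidden_size, vocab_size):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_size)
        self.lstm = nn.LSTM(embed_size, hidden_size, batch_first=True)
        self.fc = nn.Linear(hidden_size, vocab_size)

    def forward(self, features, captions):
        # Prepend the image feature as the first step of the input sequence.
        inputs = torch.cat([features.unsqueeze(1),
                            self.embed(captions[:, :-1])], dim=1)
        hiddens, _ = self.lstm(inputs)
        return self.fc(hiddens)  # (batch, seq_len, vocab_size) scores
```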


Dataset


To set up the COCOAPI and use the dataset, follow the instructions in this readme file.
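Once the COCOAPI is installed, a quick sanity check confirms the annotations load correctly (the annotation path below is an assumption; adjust it to wherever you cloned the API):

```python
from pycocotools.coco import COCO

# Hypothetical path, per the usual COCOAPI layout.
coco = COCO('cocoapi/annotations/captions_train2014.json')

# Pick an arbitrary image and print its reference captions.
img_id = coco.getImgIds()[0]
ann_ids = coco.getAnnIds(imgIds=img_id)
for ann in coco.loadAnns(ann_ids):
    print(ann['caption'])
```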

Project Structure

The project is structured as a series of Jupyter notebooks that are designed to be completed in sequential order:

Notebook 0 : Explore the Microsoft Common Objects in Context (MS COCO) dataset;

Notebook 1 : Load and pre-process data from the COCO dataset (see the tokenization sketch after this list);

Notebook 2 : Train the CNN-RNN model;

Notebook 3 : Load the trained model and generate predictions.
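The pre-processing step in Notebook 1 boils down to turning each caption string into a sequence of vocabulary indices. A minimal sketch, assuming NLTK tokenization and `<start>`/`<end>`/`<unk>` special tokens (the toy vocabulary below is hypothetical):

```python
import nltk
nltk.download('punkt', quiet=True)

def caption_to_ids(caption, word2idx, start='<start>', end='<end>', unk='<unk>'):
    """Lowercase, tokenize, and map a caption to vocabulary indices."""
    tokens = nltk.tokenize.word_tokenize(caption.lower())
    words = [start] + tokens + [end]
    return [word2idx.get(w, word2idx[unk]) for w in words]

# Hypothetical toy vocabulary for illustration.
vocab = {'<start>': 0, '<end>': 1, '<unk>': 2, 'a': 3, 'dog': 4, 'on': 5, 'grass': 6}
print(caption_to_ids('A dog on the grass.', vocab))
# -> [0, 3, 4, 5, 2, 6, 2, 1]   ('the' and '.' fall back to <unk>)
```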

Installation

$ git clone https://github.com/kenkai21/Image_Captioning.git
$ pip3 install -r requirements.txt

References

Vinyals et al., Show and Tell: A Neural Image Caption Generator, arXiv:1411.4555v2 [cs.CV], 20 Apr 2015

Xu et al., Show, Attend and Tell: Neural Image Caption Generation with Visual Attention, arXiv:1502.03044v3 [cs.LG], 19 Apr 2016

License

This project is licensed under the terms of the MIT License.