Image Captioning Project from Coursera. In this project we define and train an image-to-caption model that can produce descriptions for real-world images!
Model architecture: CNN encoder and RNN decoder. (https://research.googleblog.com/2014/11/a-picture-is-worth-thousand-coherent.html)
Encoder: We use the pre-trained InceptionV3 model as the CNN encoder (https://research.googleblog.com/2016/03/train-your-own-image-classifier-with.html) and extract its last hidden layer as an image embedding.
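The encoder step above can be sketched with Keras as follows. This is a minimal sketch, not the project's exact code: dropping the classification head and applying global average pooling yields a 2048-dimensional embedding per image.

```python
import numpy as np
from tensorflow.keras.applications import InceptionV3
from tensorflow.keras.applications.inception_v3 import preprocess_input

# InceptionV3 without its classification head; global average pooling
# collapses the final feature map into a 2048-d image embedding.
# Pass weights="imagenet" to get the pre-trained filters (downloads them);
# weights=None keeps this sketch runnable offline.
encoder = InceptionV3(include_top=False, weights=None, pooling="avg")

# InceptionV3 expects 299x299 RGB inputs scaled by preprocess_input.
images = np.random.randint(0, 256, size=(2, 299, 299, 3)).astype("float32")
embeddings = encoder.predict(preprocess_input(images))
print(embeddings.shape)  # (2, 2048)
```

Each row of `embeddings` is the fixed-length image representation fed to the decoder.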

Decoder: The decoder part of the model uses a recurrent neural network with LSTM cells to generate the captions.
Since our problem is to generate image captions, the RNN text generator should be conditioned on the image. The idea is to use the image features as the initial state of the RNN instead of zeros. During training we feed the ground-truth tokens into the LSTM to get predictions of the next tokens (teacher forcing). (http://cs.stanford.edu/people/karpathy/)
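A minimal Keras sketch of this conditioning scheme is shown below. The sizes (`vocab_size`, `embed_dim`, `lstm_units`) are illustrative assumptions, not the project's actual hyperparameters; the key point is projecting the image embedding into the LSTM's initial state and predicting the next token at every position.

```python
import tensorflow as tf
from tensorflow.keras import layers

# Illustrative sizes (assumptions, not the project's actual settings).
vocab_size, embed_dim, lstm_units, img_dim = 10000, 256, 512, 2048

img_features = layers.Input(shape=(img_dim,))             # InceptionV3 embedding
caption_in = layers.Input(shape=(None,), dtype="int32")   # ground-truth token ids

# Project the image embedding to the LSTM state size and use it as the
# initial hidden/cell state instead of zeros (conditioning on the image).
init_state = layers.Dense(lstm_units, activation="tanh")(img_features)

x = layers.Embedding(vocab_size, embed_dim)(caption_in)
x = layers.LSTM(lstm_units, return_sequences=True)(
    x, initial_state=[init_state, init_state])

# One next-token distribution per position (teacher forcing during training).
logits = layers.Dense(vocab_size)(x)
model = tf.keras.Model([img_features, caption_in], logits)
```

At inference time the same decoder is unrolled one token at a time, feeding each predicted token back in as the next input.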
The dataset is a collection of images and captions; here it is the COCO dataset. For each image, a set of sentences (captions) is used as a label describing the scene.

Relevant links:
- train images: http://msvocds.blob.core.windows.net/coco2014/train2014.zip
- validation images: http://msvocds.blob.core.windows.net/coco2014/val2014.zip
- captions for both train and validation: http://msvocds.blob.core.windows.net/annotations-1-0-3/captions_train-val2014.zip
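Once downloaded, the captions can be paired with their images. The sketch below assumes the standard COCO annotation layout (a JSON file with an `"annotations"` list of `{"image_id", "caption"}` records, e.g. `captions_train2014.json` from the zip above):

```python
import json

def load_captions(path):
    """Map each COCO image_id to its list of caption strings."""
    with open(path) as f:
        data = json.load(f)
    captions = {}
    for ann in data["annotations"]:
        captions.setdefault(ann["image_id"], []).append(ann["caption"])
    return captions
```

Each image typically has around five captions, so one image contributes several (image, caption) training pairs.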
Following are a few results obtained after training the model for 12 epochs.


