
This repository contains a program that processes images and generates captions (sentences) describing them.


anand-371/image_captioning


IMAGE CAPTIONING

The problem of generating captions for a given image can be solved effectively with deep neural networks. This program uses a CNN + RNN architecture.
A convolutional neural network (CNN) extracts features from the image before they are passed down the pipeline.
We use a VGG16 model that is pre-trained for image classification, but instead of using the last classification layer,
we redirect the output of the previous layer. This gives us a vector with 4096 elements that summarizes the image contents.
We use this vector as the initial state of the Gated Recurrent Units (GRU). However, the internal state size of the GRU is only 512,
so an intermediate fully connected (dense) layer maps the 4096 elements down to a vector with 512 elements.
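
The architecture described above can be sketched in Keras roughly as follows. This is a minimal sketch, not the repository's actual code: the function names, `vocab_size`, `embedding_size`, the `tanh` activation on the mapping layer, and the use of a single GRU layer are all assumptions made for illustration. Only the VGG16 transfer layer (4096 elements), the dense mapping, and the GRU state size (512) come from the description.

```python
import tensorflow as tf

STATE_SIZE = 512      # GRU internal state size, as stated in the description
FEATURE_SIZE = 4096   # length of the VGG16 output before the classifier


def build_image_encoder(weights="imagenet"):
    """VGG16 cut off at 'fc2', the 4096-unit layer before the classification layer."""
    vgg = tf.keras.applications.VGG16(include_top=True, weights=weights)
    return tf.keras.Model(inputs=vgg.input, outputs=vgg.get_layer("fc2").output)


def build_decoder(vocab_size=10000, embedding_size=128):
    """GRU decoder whose initial state is derived from the image features.

    vocab_size and embedding_size are hypothetical values.
    """
    features = tf.keras.Input(shape=(FEATURE_SIZE,), name="image_features")
    tokens = tf.keras.Input(shape=(None,), dtype="int32", name="caption_tokens")

    # Intermediate dense layer: maps 4096 elements down to the 512-element GRU state.
    initial_state = tf.keras.layers.Dense(
        STATE_SIZE, activation="tanh", name="state_map"
    )(features)

    x = tf.keras.layers.Embedding(vocab_size, embedding_size)(tokens)
    # One GRU layer is an assumption; the number of layers is not stated.
    x = tf.keras.layers.GRU(STATE_SIZE, return_sequences=True)(
        x, initial_state=initial_state
    )
    # One score (logit) per vocabulary token at every time step.
    logits = tf.keras.layers.Dense(vocab_size, name="token_logits")(x)
    return tf.keras.Model(inputs=[features, tokens], outputs=logits)
```

In this sketch the encoder would be run once per image to obtain the 4096-element vector, which is then fed to the decoder together with the caption tokens.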

INPUT: RGB image of size (224, 224)
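
As an illustration of the input format, an arbitrary RGB image could be brought to the (224, 224) input size like this. The helper name `prepare_image` is hypothetical, and the use of the standard Keras VGG16 `preprocess_input` is an assumption; the repository's own preprocessing is not described here and may differ.

```python
import tensorflow as tf


def prepare_image(image):
    """Turn an RGB array of shape (H, W, 3), values 0-255, into a VGG16 input batch."""
    image = tf.convert_to_tensor(image, dtype=tf.float32)
    batch = tf.expand_dims(image, axis=0)          # (1, H, W, 3)
    batch = tf.image.resize(batch, (224, 224))     # (1, 224, 224, 3)
    # Standard Keras VGG16 preprocessing (assumed, not confirmed by the source).
    return tf.keras.applications.vgg16.preprocess_input(batch)
```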

OUTPUT: A complete caption describing the image

DATASET: We use the Flickr30k dataset to train the model.

LOSS FUNCTION: We use sparse cross-entropy as the loss function.
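
A small worked example of sparse cross-entropy on caption tokens, assuming the decoder outputs raw scores (logits) per vocabulary token. The toy sizes below are hypothetical. "Sparse" means the targets are integer token ids, not one-hot vectors.

```python
import math
import tensorflow as tf

# from_logits=True assumes the model outputs unnormalized scores.
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)

# Toy batch: 1 caption, 3 time steps, vocabulary of 5 tokens.
target_tokens = tf.constant([[1, 4, 2]])   # integer token ids
logits = tf.zeros((1, 3, 5))               # uniform scores over the 5 tokens

loss = float(loss_fn(target_tokens, logits))

# With uniform scores every token has probability 1/5, so the mean loss is ln(5).
print(loss, math.log(5))
```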

OPTIMIZER: We chose RMSprop over Adam, as Adam seems to diverge in some cases with recurrent neural networks.
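
For illustration, this is how RMSprop could be attached to a small recurrent model in Keras. The learning rate and the toy model are assumptions; the values used in the repository are not stated.

```python
import tensorflow as tf

# Learning rate is a hypothetical value, not taken from the repository.
optimizer = tf.keras.optimizers.RMSprop(learning_rate=1e-3)

# Tiny stand-in recurrent model (hypothetical sizes), just to show compile().
model = tf.keras.Sequential([
    tf.keras.Input(shape=(None,), dtype="int32"),
    tf.keras.layers.Embedding(20, 8),
    tf.keras.layers.GRU(16, return_sequences=True),
    tf.keras.layers.Dense(20),
])

model.compile(
    optimizer=optimizer,
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
)
```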

Implemented using: TensorFlow, Keras

Model Summary:

The following is the summary of the VGG16 model:

(Figures: VGG16, VGG16-2)

The following is the summary of the recurrent layer:

(Figure: Rnn)

The processed TensorBoard graphs are as follows:

(Figures: TensorBoard graph, adversial)
