Combines a CNN and an RNN into a deep learning model that produces captions for a given input image. The CNN transforms the input image into a set of features, and the RNN turns those features into rich, descriptive language.

iKhushPatel/Image-Captioning

Image Captioning Project

Introduction

This repository contains a neural network that automatically generates captions for images.

Network Architecture

The solution architecture consists of:

  1. A CNN encoder, which encodes each image into an embedded feature vector.

  2. A decoder, a sequential neural network built from LSTM units, which translates the feature vector into a sequence of tokens.
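
The two components above can be sketched in PyTorch as follows. This is a minimal illustration under stated assumptions, not the repository's actual code: the class names `EncoderCNN` and `DecoderRNN`, the tiny convolutional stack, the layer sizes, and the greedy `sample` method are all hypothetical choices made for the example. A real encoder would more likely be a pretrained image-classification backbone.

```python
import torch
import torch.nn as nn


class EncoderCNN(nn.Module):
    """Toy CNN encoder: image batch -> one embedded feature vector per image."""

    def __init__(self, embed_size):
        super().__init__()
        # Small stand-in convolutional stack (an assumption for this sketch,
        # not the backbone used by the repository).
        self.conv = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, stride=2, padding=1),
            nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=3, stride=2, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),  # -> (batch, 32, 1, 1)
        )
        self.fc = nn.Linear(32, embed_size)

    def forward(self, images):
        features = self.conv(images).flatten(1)  # (batch, 32)
        return self.fc(features)                 # (batch, embed_size)


class DecoderRNN(nn.Module):
    """LSTM decoder: feature vector (+ caption tokens) -> token scores."""

    def __init__(self, embed_size, hidden_size, vocab_size):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_size)
        self.lstm = nn.LSTM(embed_size, hidden_size, batch_first=True)
        self.fc = nn.Linear(hidden_size, vocab_size)

    def forward(self, features, captions):
        # Training-time pass: the image feature is the first input step,
        # followed by the embedded caption tokens (all but the last).
        tokens = self.embed(captions[:, :-1])
        inputs = torch.cat([features.unsqueeze(1), tokens], dim=1)
        hidden, _ = self.lstm(inputs)
        return self.fc(hidden)  # (batch, caption_len, vocab_size)

    @torch.no_grad()
    def sample(self, features, max_len=20):
        # Greedy decoding: feed the most likely token back in at each step.
        inputs, state, token_ids = features.unsqueeze(1), None, []
        for _ in range(max_len):
            hidden, state = self.lstm(inputs, state)
            predicted = self.fc(hidden.squeeze(1)).argmax(dim=1)
            token_ids.append(predicted)
            inputs = self.embed(predicted).unsqueeze(1)
        return torch.stack(token_ids, dim=1)  # (batch, max_len)


# Usage with random data (sizes are arbitrary illustration values).
encoder = EncoderCNN(embed_size=64)
decoder = DecoderRNN(embed_size=64, hidden_size=128, vocab_size=100)
images = torch.randn(2, 3, 64, 64)
captions = torch.randint(0, 100, (2, 12))
logits = decoder(encoder(images), captions)   # shape: (2, 12, 100)
generated = decoder.sample(encoder(images), max_len=5)  # shape: (2, 5)
```

Feeding the feature vector as the first LSTM input is one common way to condition the decoder on the image. Initialising the LSTM hidden state from the feature vector is another, and the source does not say which the repository uses.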

Results

These are some of the outputs generated by the network on the COCO dataset:
