Udacity Computer Vision Nanodegree project for automatic image captioning. The dataset used is Microsoft COCO (Common Objects in Context). In these notebooks, a network is trained to generate descriptions of what is shown in an image.
This project is built entirely in Jupyter notebooks. The four main ones are:
- 0_Dataset.ipynb: Shows the dataset used
- 1_Preliminaries.ipynb: Checks the dataloader and how the word embeddings are set up
- 2_Training.ipynb: Sets up the main hyperparameters and trains the network
- 3_Inference.ipynb: Tests the trained network on some examples
I recommend downloading everything and following along with the notebooks in order.
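
To give a rough idea of what the preliminaries notebook deals with, here is a minimal sketch of how captions can be turned into integer indices before they reach an embedding layer. It is an illustration only, not the code from the notebooks: the function names `build_vocab` and `encode`, the special tokens `<start>`, `<end>` and `<unk>`, and the two toy captions are all assumptions made for this example.

```python
from collections import Counter

# Hypothetical special tokens; the notebooks may use different names.
START, END, UNK = "<start>", "<end>", "<unk>"


def build_vocab(captions, min_count=1):
    """Map every word seen at least `min_count` times to an integer index."""
    counts = Counter(word for c in captions for word in c.lower().split())
    words = sorted(w for w, n in counts.items() if n >= min_count)
    # Special tokens take the first indices, followed by the sorted words.
    return {w: i for i, w in enumerate([START, END, UNK] + words)}


def encode(caption, vocab):
    """Wrap a caption in start/end tokens and replace each token by its index."""
    tokens = [START] + caption.lower().split() + [END]
    # Words missing from the vocabulary fall back to the unknown token.
    return [vocab.get(t, vocab[UNK]) for t in tokens]


# Toy captions standing in for COCO annotations.
captions = ["a dog runs", "a cat sleeps"]
vocab = build_vocab(captions)
ids = encode("a dog sleeps quietly", vocab)
print(ids)
```

A sequence like `ids` is what an embedding layer would look up, one vector per index. The word "quietly" never appears in the toy captions, so it maps to the unknown token.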