Captionly

Captionly describes images as text: given an image, it generates a caption.

Dataset & files: Drive link
Documented with: Colab Notebook
For more details: PPT link

Model Architecture:

Visual Attention:

Loss Graph:

Model Predictions: