The task is to design a deep learning and NLP based toolkit (a Python library) that takes an image as input and generates a textual description of it. Possible add-ons include a web application where the user uploads an image and receives a caption for it, or an Android/iOS application where the user captures or uploads an image and gets a caption generated by this toolkit.
Skills required: Python, object-oriented programming, NLP, and popular deep learning libraries such as TensorFlow or Keras. Optional skills: full-stack web development and Android/iOS app development.
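To make the intended shape of the toolkit concrete, here is a minimal sketch of what its public interface might look like. Everything in it is an assumption for illustration: the class name `CaptionGenerator`, the `encode_image` and `next_word` callables, and the toy stand-ins are hypothetical, not an existing library API. In a real implementation the encoder would typically be a CNN and the decoder an RNN/LSTM trained on a dataset such as COCO; here both are replaced by trivial pure-Python functions so the greedy decoding loop can be run as-is.

```python
from typing import Any, Callable, List

START, END = "<start>", "<end>"  # hypothetical special tokens


class CaptionGenerator:
    """Hypothetical toolkit entry point: image encoder + word-by-word decoder."""

    def __init__(
        self,
        encode_image: Callable[[Any], List[float]],
        next_word: Callable[[List[float], List[str]], str],
        max_len: int = 20,
    ) -> None:
        self.encode_image = encode_image  # stands in for a CNN feature extractor
        self.next_word = next_word        # stands in for an RNN/LSTM decoder step
        self.max_len = max_len            # upper bound on generated words

    def caption(self, image: Any) -> str:
        """Greedy decoding: ask for one word at a time until END or max_len."""
        features = self.encode_image(image)
        words = [START]
        for _ in range(self.max_len):
            word = self.next_word(features, words)
            if word == END:
                break
            words.append(word)
        return " ".join(words[1:])  # drop the START token


# Toy stand-ins (assumptions, not trained models).
def toy_encoder(image: List[List[float]]) -> List[float]:
    """Reduce a 2D grid of pixel intensities in [0, 1] to its mean."""
    flat = [p for row in image for p in row]
    return [sum(flat) / len(flat)]


def toy_decoder(features: List[float], words: List[str]) -> str:
    """Emit a fixed script chosen by brightness, one word per call."""
    adjective = "bright" if features[0] > 0.5 else "dark"
    script = ["a", adjective, "image", END]
    return script[len(words) - 1]


if __name__ == "__main__":
    generator = CaptionGenerator(toy_encoder, toy_decoder)
    print(generator.caption([[0.9, 0.8], [0.7, 1.0]]))
```

Keeping the encoder and decoder as injected callables is one possible design choice: it would let the same `caption` method back a command-line tool, a web application, or a mobile backend, and would let a TensorFlow, Keras, or PyTorch model be swapped in without changing calling code.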
- COCO Dataset
- Flickr 8k Dataset
- Flickr 30k Dataset
- Implementation in PyTorch
- Keras Implementation
- Another Keras Implementation
- Another Implementation using PyTorch
- Understanding Project at Root Level
- Research Paper on Convolutional Image Captioning
- Intro to RNN
- Understanding RNN & LSTM
- Understanding CNN (Part 2 is linked in the article)