
paluchnuggets/ImageCaptioning


Udacity Computer Vision Nanodegree Part 2: Advanced Computer Vision & Deep Learning

This repository covers 5 topics:

  1. Advanced CNN Architectures

    Credits: Udacity Computer Vision Nanodegree

Descriptions of various neural network architectures for object recognition and detection: R-CNN, Fast R-CNN and Faster R-CNN. Enjoy 😄

  2. YOLO
    In progress

  3. RNNs


Credits: Udacity Computer Vision Nanodegree

Explanation of the Recurrent Neural Network structure. Here you can find a description of the Unfolded model as well as Backpropagation Through Time.
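
To make the unfolded model and Backpropagation Through Time more concrete, here is a minimal sketch in plain Python. It is not taken from the notebooks: it assumes a scalar RNN with a tanh activation and a squared-error loss on the last hidden state, and the names `rnn_forward` and `bptt_grad_w_h` are made up for illustration.

```python
import math

def rnn_forward(xs, w_x, w_h, b, h0=0.0):
    """Unfold a scalar RNN over the sequence xs and return every hidden state."""
    hs = [h0]
    for x in xs:
        # The same weights (w_x, w_h, b) are reused at every time step.
        hs.append(math.tanh(w_x * x + w_h * hs[-1] + b))
    return hs

def bptt_grad_w_h(xs, w_x, w_h, b, target):
    """Gradient of 0.5 * (h_T - target)^2 with respect to w_h, computed by BPTT."""
    hs = rnn_forward(xs, w_x, w_h, b)
    grad = 0.0
    dh = hs[-1] - target                 # dL/dh_T
    for t in range(len(xs), 0, -1):      # walk the unfolded graph backwards
        da = dh * (1.0 - hs[t] ** 2)     # back through tanh
        grad += da * hs[t - 1]           # shared weight: sum contributions from all steps
        dh = da * w_h                    # pass the gradient on to h_{t-1}
    return grad
```

Because `w_h` is shared across time steps, its gradient is the sum of the contributions from every step of the unfolded network, which is the core idea behind BPTT.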

  4. Long Short-Term Memory Networks (LSTMs)


Credits: Udacity Computer Vision Nanodegree

If you wonder how exactly an LSTM cell receives input, processes it, and returns output, click 👉 here 👈
However, if you are interested in the implementation of LSTM models, you may check Part of Speech Tagging 💬 or Character-Level LSTM: generate another chapter of Anna Karenina 📕
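
As a rough illustration of what happens inside one LSTM cell, the sketch below runs a single time step using the standard gate equations. It is simplified on purpose: the input, hidden state, and cell state are scalars, and the `lstm_step` function and the parameter dictionary layout are assumptions for this example, not code from the linked notebooks.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def lstm_step(x, h_prev, c_prev, params):
    """One step of a scalar LSTM cell; params maps gate name -> (w_x, w_h, b)."""
    def pre_activation(gate):
        w_x, w_h, b = params[gate]
        return w_x * x + w_h * h_prev + b

    f = sigmoid(pre_activation("forget"))       # how much of the old cell state to keep
    i = sigmoid(pre_activation("input"))        # how much new information to write
    g = math.tanh(pre_activation("candidate"))  # the new candidate values
    o = sigmoid(pre_activation("output"))       # how much of the cell state to expose

    c = f * c_prev + i * g                      # updated long-term memory
    h = o * math.tanh(c)                        # updated short-term memory / output
    return h, c
```

The cell state `c` carries long-term memory from step to step, while the hidden state `h` is what the cell returns as output.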

  5. Attention Mechanisms ❤️ 🐥


Credits: Udacity Computer Vision Nanodegree

Overview of how Attention can be used to deal with problems in sequence-to-sequence models. You can also find here a detailed description of how the Encoder and Decoder work in such models and how they connect with Attention Mechanisms.
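
To show how a decoder can use Attention over the encoder's hidden states, here is a small sketch. It assumes dot-product (multiplicative) scoring, which is only one of several possible scoring functions, and the `attend` helper is a hypothetical name used for illustration.

```python
import math

def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(decoder_state, encoder_states):
    """Return attention weights and the context vector for one decoder step."""
    # Score each encoder hidden state against the current decoder state.
    scores = [sum(d * e for d, e in zip(decoder_state, enc)) for enc in encoder_states]
    weights = softmax(scores)
    # The context vector is the weighted sum of the encoder hidden states.
    dim = len(encoder_states[0])
    context = [sum(w * enc[k] for w, enc in zip(weights, encoder_states)) for k in range(dim)]
    return weights, context
```

Instead of squeezing the whole input sequence into a single fixed vector, the decoder gets a fresh context vector at every step, weighted towards the most relevant encoder states.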

The Final Project is to implement an effective RNN decoder for a CNN encoder that predicts captions for a given image. It can be found in this repository.
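
The caption generation loop of such an encoder-decoder model can be sketched as follows. This is not the project's code: it assumes greedy decoding, and `greedy_decode` and `step_fn` are hypothetical names, where `step_fn` stands in for one step of a trained RNN decoder and `image_features` for the CNN encoder's output.

```python
def greedy_decode(image_features, step_fn, start_token, end_token, max_len=20):
    """Generate a caption token by token, always picking the most likely next token."""
    tokens = [start_token]
    state = image_features  # assumption: the encoder output initialises the decoder state
    for _ in range(max_len):
        # step_fn returns a dict {token: score} and the updated decoder state.
        scores, state = step_fn(tokens[-1], state)
        next_token = max(scores, key=scores.get)
        if next_token == end_token:
            break
        tokens.append(next_token)
    return tokens[1:]  # drop the start token
```

Greedy decoding is the simplest choice; beam search is a common alternative that keeps several candidate captions at each step.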

About

Materials connected with Image Captioning techniques
