This repository contains my implementations of the major projects from Udacity's Deep Learning Nanodegree.
-
Mathematical demonstrations:
-
- Implemented using TensorFlow.
- Impact of different weight initializations on the cost function and gradient descent.
- Reviewed:
- Ones initialization.
- Uniform distribution and scaled uniform.
- Normal and truncated normal distributions.
- Comparison to Xavier initialization.
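The effect these schemes have on activation variance can be sketched in plain NumPy (the notebooks use TensorFlow; the layer sizes, seed, and bounds below are illustrative, not taken from the notebooks):

```python
import numpy as np

rng = np.random.default_rng(0)
fan_in, fan_out = 256, 128
x = rng.normal(size=(32, fan_in))  # a batch of roughly unit-variance inputs

# Ones initialization: every unit computes the same sum, so variance explodes.
w_ones = np.ones((fan_in, fan_out))

# Scaled uniform: U(-1/sqrt(fan_in), 1/sqrt(fan_in)).
limit = 1.0 / np.sqrt(fan_in)
w_uniform = rng.uniform(-limit, limit, size=(fan_in, fan_out))

# Xavier/Glorot: zero-mean normal with variance 2 / (fan_in + fan_out).
std = np.sqrt(2.0 / (fan_in + fan_out))
w_xavier = rng.normal(0.0, std, size=(fan_in, fan_out))

for name, w in [("ones", w_ones), ("uniform", w_uniform), ("xavier", w_xavier)]:
    print(name, np.var(x @ w))
```

With the ones matrix the pre-activation variance is on the order of fan_in, while the scaled schemes keep it near 1, which is what keeps gradient descent well-conditioned.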
-
- Implemented in TensorFlow.
- Used in fully connected and convolutional layers.
- Two levels of implementation:
- Higher level of abstraction, tf.layers.batch_normalization: TensorFlow handles the normalization for both training and inference; update ops are wired in through tf.control_dependencies() and tf.GraphKeys.UPDATE_OPS.
- Lower level, tf.nn.batch_normalization: explicit implementation instantiating gamma and beta and computing the batch/population mean and variance; training vs. inference is controlled through tf.cond().
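The math behind the lower-level path can be sketched in NumPy (a minimal forward pass only; the TensorFlow version additionally tracks population statistics for inference):

```python
import numpy as np

def batch_norm_forward(x, gamma, beta, eps=1e-5):
    """Normalize each feature over the batch, then scale by gamma and shift by beta."""
    mean = x.mean(axis=0)
    var = x.var(axis=0)
    x_hat = (x - mean) / np.sqrt(var + eps)
    return gamma * x_hat + beta

rng = np.random.default_rng(1)
x = rng.normal(loc=5.0, scale=3.0, size=(64, 4))  # batch of 64, 4 features
out = batch_norm_forward(x, gamma=np.ones(4), beta=np.zeros(4))
# Each feature of `out` now has ~zero mean and ~unit variance.
```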
-
Sentiment Analysis using MLPs:
- Implemented in NumPy/Python.
- Predict Positive/Negative sentiment over movie reviews.
- Preprocessed the data:
- Created the vocabulary and word frequencies.
- Analyzed the word-frequency-to-sentiment ratio across reviews.
- Bit-encoded each word (presence rather than count).
- Built the neural network.
- Reviewed the limitations of raw word frequency compared with the word-sentiment relationship; the change yielded a 10% validation accuracy improvement.
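A minimal sketch of the vocabulary and bit-encoding steps, on a hypothetical two-review corpus (the reviews and names below are illustrative, not from the project data):

```python
from collections import Counter

reviews = ["great great movie", "terrible boring movie"]

# Build the vocabulary and word frequencies.
counts = Counter(word for review in reviews for word in review.split())
vocab = sorted(counts)
word2index = {word: i for i, word in enumerate(vocab)}

def encode(review):
    """Bit-encode a review: 1 if the word is present, ignoring its frequency."""
    vec = [0] * len(vocab)
    for word in review.split():
        vec[word2index[word]] = 1  # presence bit, not a count
    return vec

print(encode("great great movie"))
```

Encoding presence bits instead of counts is exactly the change that removes the noise from highly repeated words.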
-
- Implemented in NumPy/Python.
- Loaded & prepared the data:
- Normalized features.
- Created training, validation, and test sets.
- Implemented forward and backward propagation.
- Trained and tested accuracy.
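The forward/backward passes for a one-hidden-layer network like this can be sketched in NumPy, assuming a sigmoid hidden layer with a linear output unit (the toy data, sizes, and learning rate below are illustrative):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(2)
# Toy regression problem: learn y = x1 + x2 + x3.
X = rng.normal(size=(200, 3))
y = X.sum(axis=1, keepdims=True)

W1 = rng.normal(scale=0.5, size=(3, 8))   # input -> hidden
W2 = rng.normal(scale=0.5, size=(8, 1))   # hidden -> output
lr = 0.05

mse_before = float(np.mean((sigmoid(X @ W1) @ W2 - y) ** 2))
for _ in range(500):
    # Forward pass: sigmoid hidden layer, linear output.
    h = sigmoid(X @ W1)
    y_hat = h @ W2
    # Backward pass for mean squared error.
    err = y_hat - y
    grad_W2 = h.T @ err / len(X)
    grad_h = (err @ W2.T) * h * (1 - h)   # chain rule through the sigmoid
    grad_W1 = X.T @ grad_h / len(X)
    W2 -= lr * grad_W2
    W1 -= lr * grad_W1
mse_after = float(np.mean((sigmoid(X @ W1) @ W2 - y) ** 2))
```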
-
- Implemented using Keras.
- Used CNNs for encoding and decoding.
- Denoising images.
-
Data Augmentation & Transfer Learning:
- Implemented using Keras.
- Explored data augmentation of CIFAR-10 with Keras's ImageDataGenerator and its impact on training.
- Reviewed transfer learning with VGG-16: bottleneck feature extraction and new fully connected layers on top.
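One of the simplest transforms ImageDataGenerator applies is a random horizontal flip; a NumPy sketch of that single augmentation, on CIFAR-10-shaped batches (the random data here stands in for real images):

```python
import numpy as np

rng = np.random.default_rng(4)

def augment(batch):
    """Randomly mirror ~half the images left-right, as horizontal_flip=True does.

    `batch` is NHWC: (n_images, height, width, channels).
    """
    flip = rng.random(len(batch)) < 0.5
    batch = batch.copy()
    batch[flip] = batch[flip, :, ::-1, :]  # reverse the width axis
    return batch

images = rng.random((8, 32, 32, 3))  # stand-in for a CIFAR-10 batch
augmented = augment(images)
```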
-
- Implemented using Keras.
- Created a CNN model from scratch and achieved at least 5% test accuracy within the first 5 epochs using data augmentation.
- Used transfer learning with the Xception model and data augmentation to achieve 83% test accuracy.
- Xception paper: Xception: Deep Learning with Depthwise Separable Convolutions
-
-
- Implemented in TensorFlow.
- Developed a character-wise RNN sequence predictor: a two-layer LSTM with sequence length Tx = 50, a 128-dimensional LSTM memory cell, and a vocabulary of size 83.
- Steps:
- Data processing for minibatches.
- Built LSTM model.
- Optimizer & Gradient clipping.
- Checkpoint training.
- Sequence generation with output sampling.
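The minibatch step above can be sketched as follows: reshape the integer-encoded text into batch_size rows, then slide a seq_len window across them, with targets shifted one character ahead (the toy array stands in for the encoded text):

```python
import numpy as np

def get_batches(arr, batch_size, seq_len):
    """Yield (input, target) minibatches; the target is the input shifted by one."""
    chars_per_batch = batch_size * seq_len
    n_batches = len(arr) // chars_per_batch
    # Drop the remainder and lay the text out as batch_size parallel streams.
    arr = arr[: n_batches * chars_per_batch].reshape((batch_size, -1))
    for i in range(0, arr.shape[1], seq_len):
        x = arr[:, i : i + seq_len]
        y = np.zeros_like(x)
        y[:, :-1] = x[:, 1:]
        # Last target comes from the next window (wrapping at the end).
        y[:, -1] = arr[:, (i + seq_len) % arr.shape[1]]
        yield x, y

encoded = np.arange(40)  # stand-in for the integer-encoded characters
x, y = next(get_batches(encoded, batch_size=2, seq_len=5))
```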
-
- Implemented in TensorFlow.
- Implemented and trained a Skip-gram Word Embedding matrix.
- Used subsampling and negative sampling.
- Visualized word vectors using t-SNE.
- Based on papers:
-
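Subsampling follows Mikolov et al.: word w is dropped with probability 1 - sqrt(t / freq(w)), so frequent words like "the" are discarded most often. A toy sketch (the threshold t is enlarged here for the tiny corpus; the paper suggests around 1e-5):

```python
import numpy as np
from collections import Counter

# Integer-encoded toy corpus: word 0 is very frequent, like "the".
words = [0] * 900 + [1] * 90 + [2] * 10
counts = Counter(words)
total = len(words)
t = 1e-2  # subsampling threshold, enlarged for this toy corpus

# Drop probability per word: frequent words drop often, rare words never.
freqs = {w: c / total for w, c in counts.items()}
p_drop = {w: max(0.0, 1 - np.sqrt(t / f)) for w, f in freqs.items()}

rng = np.random.default_rng(3)
train_words = [w for w in words if rng.random() > p_drop[w]]
```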
- Implemented in TensorFlow.
- Sentiment prediction using Word Embedding on LSTM.
-
The Simpsons Script Generation:
- Implemented in TensorFlow.
- Language sequence generation with an LSTM network using word embeddings.
-
- Implemented in TensorFlow.
- GAN implementation for the MNIST dataset.
- Based on the Generative Adversarial Networks paper (Goodfellow et al., 2014).
-
DCGAN: Deep Convolutional GAN:
- Implemented in TensorFlow.
- DCGAN implementation for the Street View House Numbers (SVHN) dataset.
-
DCGAN for Face image generation: Deep Convolutional GAN:
- Implemented in TensorFlow.
- DCGAN implementation to generate faces, trained over CelebFaces Attributes Dataset (CelebA).
- Based on papers:
-
- Implemented on the FrozenLake environment.
- Reinforcement Learning: An Introduction by Richard S. Sutton and Andrew G. Barto: Chapters 3 & 4.
- Covers finite Markov decision processes and dynamic programming:
- Policy Evaluation.
- Policy Improvement.
- Policy Iteration.
- Truncated Policy Evaluation.
- Value Iteration.
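Value iteration, the last item above, can be sketched on a hypothetical two-state MDP written in gym's P[s][a] transition format of (prob, next_state, reward, done) tuples:

```python
import numpy as np

# Toy 2-state MDP: from state 0, action 1 usually reaches the terminal state 1
# with reward 1; action 0 just stays put.
P = {
    0: {0: [(1.0, 0, 0.0, False)],
        1: [(0.8, 1, 1.0, True), (0.2, 0, 0.0, False)]},
    1: {0: [(1.0, 1, 0.0, True)],
        1: [(1.0, 1, 0.0, True)]},
}
gamma, theta = 0.9, 1e-8
V = np.zeros(2)

# Sweep until the largest update falls below theta.
while True:
    delta = 0.0
    for s in P:
        q = [sum(p * (r + gamma * V[ns] * (not done)) for p, ns, r, done in P[s][a])
             for a in P[s]]
        v_new = max(q)  # greedy backup over actions
        delta = max(delta, abs(v_new - V[s]))
        V[s] = v_new
    if delta < theta:
        break
```

The fixed point satisfies V(0) = 0.8 + 0.18 V(0), i.e. V(0) = 0.8/0.82, while the terminal state stays at 0.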
-
- Implemented on the Blackjack environment.
- Reinforcement Learning: An Introduction by Richard S. Sutton and Andrew G. Barto: Chapter 5.
- Monte Carlo Methods:
- Monte Carlo Predictions: State-value and Action-value functions.
- Monte Carlo Control.
- GLIE MC Control (Greedy in the Limit with Infinite Exploration).
- Constant-alpha GLIE MC Control.
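The constant-alpha variant replaces the sample-average rule with a fixed step size toward the observed return G; an every-visit sketch on a hypothetical two-step episode (state names and values are illustrative):

```python
# Constant-alpha MC update: Q(s, a) <- Q(s, a) + alpha * (G - Q(s, a)).
alpha, gamma = 0.1, 1.0
Q = {("s1", 1): 0.0, ("s2", 0): 0.0}

# A toy episode of (state, action, reward) triples.
episode = [("s1", 1, 0.0), ("s2", 0, 1.0)]

G = 0.0
for state, action, reward in reversed(episode):
    G = reward + gamma * G                       # return from this step onward
    Q[(state, action)] += alpha * (G - Q[(state, action)])
```

A constant alpha keeps the estimates responsive to recent episodes, which matters once the policy itself is changing.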
-
- Implemented on the CliffWalking environment.
- Reinforcement Learning: An Introduction by Richard S. Sutton and Andrew G. Barto: Chapter 6.
- Temporal-Difference Methods:
- Temporal-Difference Predictions: State-value and Action-value functions.
- Sarsa.
- Q-Learning (Sarsamax).
- Expected Sarsa.
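The three control methods differ only in how the TD target bootstraps from the next state; a sketch (the Q-table values are illustrative):

```python
import numpy as np

def td_target(Q, next_state, reward, gamma, method, next_action=None, eps=0.1):
    """One-step TD target for the three control methods above."""
    n_actions = Q.shape[1]
    if method == "sarsa":               # bootstrap on the action actually taken
        bootstrap = Q[next_state, next_action]
    elif method == "q_learning":        # bootstrap on the greedy action (Sarsamax)
        bootstrap = Q[next_state].max()
    else:                               # expected_sarsa: expectation under eps-greedy
        probs = np.full(n_actions, eps / n_actions)
        probs[Q[next_state].argmax()] += 1 - eps
        bootstrap = probs @ Q[next_state]
    return reward + gamma * bootstrap

Q = np.array([[0.0, 0.0],
              [1.0, 3.0]])  # illustrative action values, 2 states x 2 actions
```

With reward 1 and gamma 0.9 from next state 1, Q-Learning bootstraps on max(1, 3), Sarsa on whichever action was actually taken, and Expected Sarsa on the eps-greedy mixture of the two.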
-
- Implemented an agent to solve the OpenAI Gym Taxi environment.
- Tested Q-Learning, Sarsa, and Expected Sarsa.
- Best score over a 100-episode average reward: 9.359 with Q-Learning.
-
Reinforcement Learning in continuous spaces:
-
- Based on V. Mnih et al., "Playing Atari with Deep Reinforcement Learning", 2013.
- Deep Q-Learning implementation.
- Implemented a neural network action-value approximator in TensorFlow.
- Implemented experience replay memory and fixed Q-targets.
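Experience replay can be sketched as a fixed-size deque sampled uniformly, which breaks the temporal correlation between consecutive transitions (field and class names below are illustrative):

```python
import random
from collections import deque, namedtuple

Experience = namedtuple("Experience", ["state", "action", "reward", "next_state", "done"])

class ReplayBuffer:
    """Fixed-size memory; uniform sampling decorrelates training batches."""

    def __init__(self, capacity):
        self.memory = deque(maxlen=capacity)  # oldest experiences are evicted

    def add(self, *args):
        self.memory.append(Experience(*args))

    def sample(self, batch_size):
        return random.sample(self.memory, batch_size)

    def __len__(self):
        return len(self.memory)

buffer = ReplayBuffer(capacity=100)
for t in range(150):  # the first 50 transitions fall out of memory
    buffer.add(t, 0, 0.0, t + 1, False)
batch = buffer.sample(32)
```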
-
- Based on H. van Hasselt et al., "Deep Reinforcement Learning with Double Q-learning", 2015.
- Double Deep Q-Learning implementation.
- Implemented experience replay memory and fixed Q targets.
- Implemented two action-value neural network approximators: one for action selection and one as the fixed target.
-
Deep Deterministic Policy Gradient:
- Based on T. Lillicrap et al., "Continuous control with deep reinforcement learning", 2016.
- Deep Deterministic Policy Gradient implementation.
- Implemented action repeat, experience replay memory, and fixed targets for the actor/critic networks with soft updates.
- MountainCarContinuous-v0 solved after 70 episodes.
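The soft update blends the local network's weights into the target network a little each step, theta_target <- tau * theta_local + (1 - tau) * theta_target; a NumPy sketch over a list of weight arrays (tau and the shapes are illustrative):

```python
import numpy as np

def soft_update(local_weights, target_weights, tau=0.01):
    """Move each target array a fraction tau toward its local counterpart."""
    return [tau * lw + (1 - tau) * tw for lw, tw in zip(local_weights, target_weights)]

local = [np.ones((2, 2))]    # stand-in for the local network's weights
target = [np.zeros((2, 2))]  # stand-in for the target network's weights
target = soft_update(local, target, tau=0.1)
```

Keeping tau small makes the target network drift slowly, which is what keeps the critic's bootstrap targets stable.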
-
- Deep Deterministic Policy Gradient implementation based on T. Lillicrap et al., "Continuous control with deep reinforcement learning", 2016.
- Implemented action repeat, experience replay memory, and fixed targets for the actor/critic networks with soft updates.
- Defined a 'take off' task for the drone agent to solve, implementing its reward function.
- The drone learns to take off after 55 episodes.