This repo includes

  • slides of the basicRL tutorials made by Judy
  • reading notes and some Python implementations from Reinforcement Learning: An Introduction (2nd Edition) by Sutton and Barto
  • machine learning basics and their NumPy examples
  • some interesting extensions and papers encountered during the reading

Here are some detailed notes of basicRL:

Here are some classic implementations of RL in control:

Let me know if you find any typos or mistakes :)

Contents of Slides

basicRL_1 - explanations of basic RL concepts: agent-environment interactions, (discrete/continuous) states, (discrete/continuous) actions, (the art of) rewards, (deterministic/stochastic) policies, (deterministic/stochastic) dynamics, and MDP

basicRL_2 - explanations of the goal of RL: episode, history, trajectory, return, discount factor, value functions, optimal value functions and optimal policy
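
As a small illustration of the return and the discount factor listed for basicRL_2, here is a minimal sketch of the discounted return of a single episode. It is not taken from the slides; the function name `discounted_return` and the toy reward list are assumptions made for this example.

```python
def discounted_return(rewards, gamma):
    """Return G_0 = r_1 + gamma * r_2 + gamma^2 * r_3 + ... for one episode."""
    g = 0.0
    # Work backwards so each step applies G_t = r_{t+1} + gamma * G_{t+1}.
    for r in reversed(rewards):
        g = r + gamma * g
    return g


# Hypothetical three-step episode with reward 1 at every step.
print(discounted_return([1.0, 1.0, 1.0], gamma=0.5))  # 1 + 0.5 + 0.25 = 1.75
```

With `gamma` close to 0 the agent values almost only the next reward; with `gamma` close to 1 later rewards count nearly as much as immediate ones.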

basicRL_3 - explanations of the learning mechanism of RL: Bellman equations, one-step update rules, generalized policy iteration, Q-learning and Cliff Walking example
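
The one-step update rule behind Q-learning, listed for basicRL_3, can be sketched roughly as follows. This is a tabular sketch under stated assumptions, not the repository's Cliff Walking code: the function name `q_learning_update`, the dictionary-based table, and the state and action labels are all made up for illustration.

```python
from collections import defaultdict


def q_learning_update(Q, state, action, reward, next_state, actions,
                      alpha, gamma, done):
    """Apply one tabular Q-learning step in place and return the TD error."""
    # Bootstrap from the greedy value of the next state; a terminal
    # next state contributes no future value.
    best_next = 0.0 if done else max(Q[(next_state, a)] for a in actions)
    td_error = reward + gamma * best_next - Q[(state, action)]
    Q[(state, action)] += alpha * td_error
    return td_error


# Hypothetical two-action problem; unseen entries default to 0.
Q = defaultdict(float)
actions = ["left", "right"]
Q[("s1", "right")] = 2.0
q_learning_update(Q, "s0", "right", -1.0, "s1", actions,
                  alpha=0.5, gamma=0.9, done=False)
print(Q[("s0", "right")])
```

The target uses the maximum over next actions regardless of which action the behaviour policy actually takes next, which is what makes Q-learning off-policy.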

basicRL_4 - explanations of policy-based RL: the concepts of objective function and gradient descent using a linear regression example, REINFORCE, and REINFORCE with baseline in the Short Corridor and Gym CartPole examples

basicRL_5 - preparations for deep RL: logistic regression, Bernoulli distribution, maximum likelihood estimation (MLE), cross-entropy loss, batch, mini-batch and stochastic gradient descent and their implementations with the two-feature Iris dataset
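
To connect the basicRL_5 topics, the sketch below trains a logistic regression model with stochastic gradient descent on the binary cross-entropy loss, which is the negative mean Bernoulli log-likelihood. It is only an illustration: the helper names and the four-point toy dataset are assumptions, and the data is made up rather than taken from Iris.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(w, b, x):
    """Probability that example x belongs to class 1."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def bce_loss(w, b, xs, ys):
    """Mean binary cross-entropy over the dataset."""
    total = 0.0
    for x, y in zip(xs, ys):
        p = predict(w, b, x)
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(xs)


def sgd_step(w, b, x, y, lr):
    """One stochastic gradient step on a single example (x, y)."""
    err = predict(w, b, x) - y  # derivative of the loss w.r.t. the logit
    new_w = [wi - lr * err * xi for wi, xi in zip(w, x)]
    return new_w, b - lr * err


# Made-up, linearly separable toy data with two features per example.
xs = [[1.0, 0.2], [1.2, 0.3], [4.5, 1.5], [5.0, 1.7]]
ys = [0, 0, 1, 1]

w, b = [0.0, 0.0], 0.0
loss_before = bce_loss(w, b, xs, ys)
for epoch in range(100):
    for x, y in zip(xs, ys):
        w, b = sgd_step(w, b, x, y, lr=0.05)
loss_after = bce_loss(w, b, xs, ys)
print(loss_before, loss_after)
```

Batch gradient descent would average `err * xi` over all examples before each update, and mini-batch gradient descent would average over a small subset; the per-example gradient stays the same.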

basicRL_6 - preparations for deep RL: softmax (multiclass logistic) regression, multinomial distribution, MLE for softmax regression, feedforward neural networks (multilayer perceptron), activation functions, loss functions, optimizers for neural networks, back-propagation, Keras MNIST example, weights visualization

basicRL_7 - preparations for deep RL: convolution of two functions, convolution in images, convolution layer and max-pooling layer, convolutional neural networks (CNN), CNN in Keras on MNIST, visualizations of activations, filter visualizations of AlexNet, VGG, etc.

Interesting Papers

Dulac-Arnold, Gabriel, et al. "Deep reinforcement learning in large discrete action spaces." arXiv preprint arXiv:1512.07679 (2015).

Haarnoja, Tuomas, et al. "Learning to walk via deep reinforcement learning." arXiv preprint arXiv:1812.11103 (2018).

Frémaux, Nicolas, Henning Sprekeler, and Wulfram Gerstner. "Reinforcement learning using a continuous time actor-critic framework with spiking neurons." PLoS Computational Biology 9.4 (2013): e1003024.

Parisi, Simone, et al. "Long-Term Visitation Value for Deep Exploration in Sparse-Reward Reinforcement Learning." Algorithms 15.3 (2022): 81.

Sugiyama, Masashi, et al. "Geodesic Gaussian kernels for value function approximation." Autonomous Robots 25.3 (2008): 287-304.

Osogami, Takayuki, and Makoto Otsuka. "Seven neurons memorizing sequences of alphabetical images via spike-timing dependent plasticity." Scientific Reports 5.1 (2015): 1-13.

Deisenroth, Marc Peter, Gerhard Neumann, and Jan Peters. "A survey on policy search for robotics." Foundations and Trends in Robotics 2.1-2 (2013): 388-403.

Kober, Jens, J. Andrew Bagnell, and Jan Peters. "Reinforcement learning in robotics: A survey." The International Journal of Robotics Research 32.11 (2013): 1238-1274.

Schulman, John, et al. "High-dimensional continuous control using generalized advantage estimation." arXiv preprint arXiv:1506.02438 (2015).

Sutton, Richard S., et al. "Policy gradient methods for reinforcement learning with function approximation." Advances in Neural Information Processing Systems 12 (1999).

Further Reading

Neuro-Dynamic Programming by Dimitri Bertsekas

Algorithms for Reinforcement Learning by Csaba Szepesvári

Markov Decision Processes in Artificial Intelligence by Olivier Sigaud and Olivier Buffet

Dynamic Programming and Optimal Control by Dimitri Bertsekas

Pattern Recognition and Machine Learning by Christopher Bishop

The Elements of Statistical Learning by Jerome H. Friedman, Robert Tibshirani, and Trevor Hastie

Machine Learning: A Probabilistic Perspective by Kevin Patrick Murphy

Deep Learning by Aaron Courville, Ian Goodfellow, and Yoshua Bengio

Optimization for Machine Learning by Suvrit Sra, Sebastian Nowozin, and Stephen J. Wright

Convex Optimization by Stephen Boyd and Lieven Vandenberghe

Introduction to Autonomous Mobile Robots by Roland Siegwart, Illah Reza Nourbakhsh and Davide Scaramuzza

Probabilistic Robotics by Dieter Fox, Sebastian Thrun, and Wolfram Burgard

Useful Resources

[RL Course by David Silver]

[Denny Britz's GitHub repository]

[The Deep RL Bootcamp]

[RL Udacity]

[Deep Reinforcement Learning Hands-On 1st Edition]

[Deep Reinforcement Learning Hands-On 2nd Edition]

[Dive into Deep Learning]

[OpenAI Baselines]

[OpenAI Spinning Up]

[Berkeley AI Research Blog]

[Mathematics For Machine Learning]

[The Mathematical Engineering of Deep Learning]

[Marc Toussaint's teaching page]

[Ashwin Rao's teaching page]

[Stanford cs231n]

[Stanford cs229]

[Stanford AI courses]

[ML Glossary]

[Python Programming And Numerical Methods: A Guide For Engineers And Scientists]

[Wolfram Alpha]

Useful Resources for understanding CNN

[Student Notes: Convolutional Neural Networks (CNN) Introduction]

[CS231n CNN for Visual Recognition]

[How convolutional neural networks see the world]

[Visualizing what convnets learn]

[Applied Deep Learning - Part 4: Convolutional Neural Networks]

[Feature Visualization - How neural networks build up their understanding of images]

[Understanding Neural Networks Through Deep Visualization]

[Convolutional Neural Network Visualizations]

[Inceptionism: Going Deeper into Neural Networks]

[Understanding your Convolution network with Visualizations]

