CS 885 Spring 2018 at University of Waterloo - Reinforcement Learning
The course introduces students to the design of algorithms that enable machines to learn from reinforcements. In contrast to supervised learning, where machines learn from examples that include the correct decision, and unsupervised learning, where machines discover patterns in the data, reinforcement learning allows machines to learn from partial, implicit, and delayed feedback. This is particularly useful in sequential decision-making tasks where a machine repeatedly interacts with the environment or with users. Applications of reinforcement learning include robotic control, autonomous vehicles, game playing, conversational agents, assistive technologies, computational finance, and operations research.
- Markov decision processes
- Bandits
- Model-free reinforcement learning
- Model-based reinforcement learning
- Partially observable reinforcement learning
- Deep reinforcement learning
- Hierarchical reinforcement learning
- Robotic control
- Game playing
- Conversational agents
- Operations research
- Assistive technologies
- Intelligent tutoring systems
- Computational finance
- Autonomous vehicles
- Course introduction (slides) (video)
- Markov Processes (slides) (video)
- Markov Decision Processes (slides) (video)
- Value Iteration (slides) (video)
- Policy Iteration (slides) (video)
- Introduction to RL (slides) (video)
- Deep neural networks (slides) (video)
- Deep Q-Networks (slides) (video)
- Guest lecture by Nabiha Asghar on RL for dialog systems (slides) (video)
- Guest lecture by Mike Rudd on OpenAI environments (slides) (video)
- Guest lecture by Timmy Tse on DQN and TensorFlow (slides) (video)
- Policy Gradient (slides) (video)
- Actor Critic (slides) (video)
- Multi-armed bandits (slides) (video)
- Bayesian and contextual bandits (slides) (video)
- Model-based RL (slides) (video)
- Bayesian RL (slides) (video)
- Hidden Markov models (slides) (video)
- Partially observable RL (slides) (video)
- Deep recurrent Q-networks (slides) (video)
- RL for video games. Playing FPS Games with Deep RL (slides) (video)
- RL for video games. A Deep Hierarchical Approach to Lifelong Learning in Minecraft (slides) (video)
- Adversarial Search (slides) (video)
- RL for computer Go (slides) (video)
- RL for board games (slides) (video)
- Trust Region Methods (slides) (video)
- Policy optimization. Trust region policy optimization (slides) (video)
- Policy optimization. Proximal policy optimization (slides) (video)
- Semi-Markov Decision Processes (slides) (video)
- Hierarchical RL. The Option-Critic Architecture (slides) (video)
- Hierarchical RL. FeUdal Networks for Hierarchical RL (slides) (video)
- RL for robotics. Target-driven Visual Navigation in Indoor Scenes using Deep RL (slides) (video)
- RL for robotics. Control of a Quadrotor with Reinforcement Learning (slides) (video)
- Inverse Reinforcement Learning (slides) (video)
- RL for autonomous vehicles. Safe, Multi-Agent, RL for Autonomous Driving (slides) (video)
- RL for autonomous vehicles. Learning Driving Styles for Autonomous Vehicles from Demonstration (slides) (video)
- RL for conversational agents. E2E LSTM-based dialog control optimized with supervised and RL (slides) (video)
- RL for conversational agents. Learning Cooperative Visual Dialog Agents with Deep RL (slides) (video)
- Memory Augmented Networks (slides) (video)
- Memory-based RL. Neural Map: Structured Memory for Deep RL (slides) (video)
- Memory-based RL. Memory Augmented Control Networks (slides) (video)