
Introduction to Reinforcement Learning at KDD 2018

Reinforcement Learning is the computational approach to learning from interaction (Sutton & Barto). Autonomous agents performing goal-oriented learning from experience is the holy grail of AI. Recent successes of reinforcement learning algorithms include human-level performance on many Atari games, beating the world's best Go player, and robots learning dexterity and grasping. Despite these successes, industrial applications of RL remain sparse outside organizations with easy access to large-scale compute and software infrastructure. Widespread application of RL requires more sample-efficient algorithms and new software tools for distributed computing. Our goal with this tutorial is two-fold:

Provide a solid conceptual foundation for understanding and evaluating the existing state of the art RL methods (both strengths and weaknesses).

Showcase Ray, an emerging distributed execution framework, for experimenting with and deploying large-scale Reinforcement Learning algorithms.

We believe that RL works well on applied problems where: (1) gathering simulations and experience is cheap, e.g. games and narrow-domain robotics; or (2) modellers have access to a compact representation of the environment dynamics and can apply approximate dynamic programming. During the tutorial, we will also deep-dive into practical applications of RL, such as optimizing tax collections with constrained value iteration, predicting market microstructure for better trade execution, and other similar problems.
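When a compact model of the dynamics is available, exact value iteration is only a few lines. The sketch below runs it on a hypothetical two-state, two-action MDP (an illustrative toy, not the tax-collection model from the case study):

```python
# Hypothetical toy MDP: P[s][a] is a list of (probability, next_state, reward).
P = {
    0: {0: [(1.0, 0, 0.0)], 1: [(0.8, 1, 1.0), (0.2, 0, 0.0)]},
    1: {0: [(1.0, 0, 0.0)], 1: [(1.0, 1, 2.0)]},
}
GAMMA = 0.9

def value_iteration(P, gamma, tol=1e-10):
    """Apply the Bellman optimality backup until the value function converges."""
    V = {s: 0.0 for s in P}
    while True:
        V_new = {
            s: max(
                sum(p * (r + gamma * V[s2]) for p, s2, r in outcomes)
                for outcomes in P[s].values()
            )
            for s in P
        }
        if max(abs(V_new[s] - V[s]) for s in P) < tol:
            return V_new
        V = V_new

V = value_iteration(P, GAMMA)
# Extract the greedy policy: pick the action maximizing the one-step backup.
policy = {
    s: max(P[s], key=lambda a: sum(p * (r + GAMMA * V[s2]) for p, s2, r in P[s][a]))
    for s in P
}
```

Because the backup is a gamma-contraction, the loop converges geometrically regardless of initialization.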


While portions of the tutorial will focus on conceptual foundations and applied case studies, we will do hands-on implementations of the algorithms on standard Reinforcement Learning environments for debugging and pedagogical purposes.


| Topic / Slides | Code Along | Case Study | Estimated Time |
| --- | --- | --- | --- |
| Installation | - | - | 8.30AM-9.00AM |
| Markov Decision Processes & Planning Algorithms | Implementing Value and Policy Iteration | Optimizing Tax Collections to Save NY State Government $120-$150M over 3 Years | 9.00AM-9.30AM |
| Markov Decision Processes & Planning Algorithms (continued) | Implementing Value and Policy Iteration | Optimizing Tax Collections to Save NY State Government $120-$150M over 3 Years | 10.00AM-11.00AM |
| Model-Free Methods: Q-Learning | Q-Learning Walkthrough | Deep Q-Learning for Supply Chain Optimization; Exciting Up-and-Coming Applications: Reinforcement Learning for Skip Lists (A Case Study in Building Simulators from the Ground Up) & Learning to Optimize Database Joins | 11.00AM-12.00PM |
| Lunch | - | - | 12.30PM-1.30PM |
| Model-Free Methods: Policy Gradients, REINFORCE, TRPO, PPO, and Actor-Critic | Implementing Policy Gradients and PPO | Deep Reinforcement Learning for De Novo Drug Design | 1.30PM-3.30PM |
| Introduction to Ray & RLlib | Ray Exercises and Tutorial | - | 3.30PM-5.30PM (may conflict with Closing Ceremony) |
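As a taste of the Q-learning walkthrough, here is a minimal tabular Q-learning loop. The environment (a hypothetical five-state chain where the agent starts at the left end and a reward of 1 waits at the right end), the hyperparameters, and the seed are illustrative, not from the tutorial materials:

```python
import random

# Hypothetical 5-state chain: action 0 moves left, action 1 moves right;
# reaching the rightmost state pays reward 1 and ends the episode.
N_STATES = 5
ACTIONS = (0, 1)

def step(s, a):
    """One environment transition: returns (next_state, reward, done)."""
    s2 = max(0, s - 1) if a == 0 else min(N_STATES - 1, s + 1)
    done = (s2 == N_STATES - 1)
    return s2, (1.0 if done else 0.0), done

def q_learning(episodes=2000, alpha=0.5, gamma=0.9, eps=0.3, seed=0):
    rng = random.Random(seed)
    Q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        s, done = 0, False
        while not done:
            # Epsilon-greedy action selection.
            if rng.random() < eps:
                a = rng.choice(ACTIONS)
            else:
                a = max(ACTIONS, key=lambda act: Q[s][act])
            s2, r, done = step(s, a)
            # Q-learning target: bootstrap from the greedy next-state value.
            target = r if done else r + gamma * max(Q[s2])
            Q[s][a] += alpha * (target - Q[s][a])
            s = s2
    return Q

Q = q_learning()
# The learned greedy policy should move right in every non-terminal state.
policy = [max(ACTIONS, key=lambda act: Q[s][act]) for s in range(N_STATES)]
```

Note that Q-learning is off-policy: the update bootstraps from the greedy action even though behavior is epsilon-greedy.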
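The policy-gradient session builds up to TRPO and PPO, but the core REINFORCE update fits in a few lines. The sketch below applies it to a degenerate one-step episode (a hypothetical two-armed bandit, purely illustrative): the policy is a softmax over two logits, and the score function for a softmax is indicator(i = a) minus the action probability:

```python
import math
import random

# Illustrative two-armed bandit: arm 1 pays reward 1.0, arm 0 pays 0.2.
def reinforce(episodes=3000, lr=0.1, seed=0):
    rng = random.Random(seed)
    theta = [0.0, 0.0]  # one logit per arm
    for _ in range(episodes):
        # Softmax policy probabilities from the current logits.
        z = [math.exp(t) for t in theta]
        total = sum(z)
        probs = [x / total for x in z]
        a = 0 if rng.random() < probs[0] else 1  # sample an action from the policy
        r = 1.0 if a == 1 else 0.2               # observe the bandit reward
        # REINFORCE: theta += lr * reward * grad log pi(a | theta).
        for i in range(2):
            theta[i] += lr * r * ((1.0 if i == a else 0.0) - probs[i])
    return theta

theta = reinforce()
```

Gradient ascent on expected reward drifts probability mass toward the higher-paying arm, so the logit for arm 1 pulls ahead. In the full episodic setting, the per-step reward is replaced by the return from that step onward, usually with a baseline subtracted to reduce variance.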

Installation & Prerequisites


Option 1: Installation on Your Own Computer

  • Create a directory for this workshop with `mkdir rl-kdd`.
  • `cd` into your `rl-kdd` directory and create a virtualenv for the tutorial. Check out more detailed instructions for OS-specific virtualenv creation here.
  • Activate the newly created virtualenv.
  • Clone this repo.
  • `cd` into the cloned repo. While the virtualenv is active, run `pip install -r` with the repo's requirements file to install the associated packages.
    • If you are having issues with the gym[atari] installation, make sure you have the following dependencies.
  • Test that the installation works by running the scripts under `tests/`.

Side Note: Make sure you are using Python 3.6.4.

Option 2: JupyterHub

If your personal computer is not working for you for any reason, try JupyterHub. Ray does not support Windows, so if you are on Windows, JupyterHub is the default option.

Log in to your JupyterHub instance. Notes:

  • On first login, wait a bit (around 5-10 minutes); the UI doesn't give instant feedback.
  • Don't edit the solution notebooks; edit the starter-code notebooks instead. If you do want to edit the solutions, copy them to your home directory first. The git puller will not update a locally modified file, so if you modify a notebook, you will have to copy your changes elsewhere and delete the modified file to get a newer version. This shouldn't be that big of a deal :)

Once you are in the Hub, open a terminal and run `pip install matplotlib` and `pip install seaborn`.


This tutorial requires a solid foundation in machine learning and Python programming, and basic familiarity with popular deep learning frameworks like TensorFlow and Torch.


