DRL codes

Collection of exercises and minimal DRL implementations used for teaching. All algorithms are implemented in PyTorch. These implementations are intended for pedagogical purposes and deliberately include minimal implementation tricks: the goal is a clear view of how DRL algorithms are coded, so their performance may be low compared to state-of-the-art implementations such as Stable Baselines 3.

Basic methods

The following examples and exercises cover the foundational ideas needed to understand RL.

Classic RL

The following examples and exercises correspond to classic RL algorithms, including iterative methods, tabular methods, and linear function approximation methods.
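
To give a flavour of the tabular methods, here is a minimal tabular Q-learning sketch. It is illustrative only: the environment (FrozenLake, via Gymnasium) and the hyperparameters are arbitrary choices made for this sketch, not taken from the repository.

    import numpy as np
    import gymnasium as gym

    # Minimal tabular Q-learning sketch (illustrative; not the repository's code)
    env = gym.make("FrozenLake-v1", is_slippery=False)
    Q = np.zeros((env.observation_space.n, env.action_space.n))
    alpha, gamma, eps = 0.1, 0.99, 0.1  # learning rate, discount, exploration

    for episode in range(5000):
        state, _ = env.reset()
        done = False
        while not done:
            # Epsilon-greedy action selection
            if np.random.rand() < eps:
                action = env.action_space.sample()
            else:
                action = int(np.argmax(Q[state]))
            next_state, reward, terminated, truncated, _ = env.step(action)
            done = terminated or truncated
            # Q-learning update: move Q(s, a) towards the bootstrapped target
            target = reward + gamma * np.max(Q[next_state]) * (not terminated)
            Q[state, action] += alpha * (target - Q[state, action])
            state = next_state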

Model-free DRL

The following examples implement model-free DRL algorithms, all tested on the CartPole problem (a minimal sketch in the same spirit follows the list):

  • DDQN (Double Deep Q-Networks)
  • VPG (Vanilla Policy Gradient)
  • A2C (Advantage Actor Critic)
  • TRPO (Trust Region Policy Optimization; here we use the Stable Baselines 3 implementation rather than providing our own, to showcase a state-of-the-art library)
  • DDPG (Deep Deterministic Policy Gradient)
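
To illustrate the implementation style of these minimal codes, here is a short vanilla policy gradient (REINFORCE) sketch on CartPole, written in PyTorch. It is a sketch with arbitrary network sizes and hyperparameters, not the repository's implementation.

    import torch
    import torch.nn as nn
    import gymnasium as gym

    # Minimal REINFORCE sketch on CartPole (illustrative; not the repository's code)
    env = gym.make("CartPole-v1")
    policy = nn.Sequential(nn.Linear(4, 64), nn.Tanh(), nn.Linear(64, 2))
    optimizer = torch.optim.Adam(policy.parameters(), lr=1e-2)
    gamma = 0.99

    for episode in range(500):
        obs, _ = env.reset()
        log_probs, rewards, done = [], [], False
        while not done:
            dist = torch.distributions.Categorical(
                logits=policy(torch.as_tensor(obs, dtype=torch.float32)))
            action = dist.sample()
            log_probs.append(dist.log_prob(action))
            obs, reward, terminated, truncated, _ = env.step(action.item())
            rewards.append(reward)
            done = terminated or truncated
        # Discounted returns (reward-to-go) for each step of the episode
        returns, G = [], 0.0
        for r in reversed(rewards):
            G = r + gamma * G
            returns.insert(0, G)
        # Policy gradient loss: -sum_t log pi(a_t | s_t) * G_t
        loss = -(torch.stack(log_probs) * torch.as_tensor(returns)).sum()
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()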

Model-based DRL

For model-based DRL, the only implemented example is AlphaZero, tested on tic-tac-toe.
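
At the heart of AlphaZero is a Monte Carlo tree search guided by a policy/value network, where actions inside the tree are chosen with the PUCT rule. Below is a minimal sketch of that rule; the node structure (children, visit_count, value_sum, prior) is hypothetical, and the repository's implementation may differ.

    import math

    def puct_select(node, c_puct=1.5):
        """Pick the child maximizing Q(s, a) + c_puct * P(s, a) * sqrt(N(s)) / (1 + N(s, a)).

        Assumes a hypothetical node with a `children` dict mapping actions to
        child nodes carrying `visit_count`, `value_sum`, and `prior` (the
        policy network's probability for that action).
        """
        total_visits = sum(c.visit_count for c in node.children.values())
        best_score, best_action = -float("inf"), None
        for action, child in node.children.items():
            # Mean value of the child so far (0 if never visited)
            q = child.value_sum / child.visit_count if child.visit_count else 0.0
            # Exploration bonus, weighted by the network's prior
            u = c_puct * child.prior * math.sqrt(total_visits) / (1 + child.visit_count)
            if q + u > best_score:
                best_score, best_action = q + u, action
        return best_action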

PASD student guide

Link            Observations
Example 7.6     Code for the example in the slides
Example 8.1     Code for the example in the slides
Example 8.3     Code for the example in the slides
Example 8.5     Homework
Example 8.6     Homework
Example 8.7     Homework
Example 9.1     Code for the example in the slides
Example 9.2     Code for the example in the slides
Example 9.3     Code for the example in the slides
Example 9.4     Homework
Example 9.5     Homework
Example 9.6     Homework
Example 9.7     Homework
Example 9.8     Homework
Example 9.9     Code for the example in the slides
Example 9.10    Code for the example in the slides
Example 9.11    Code for the example in the slides
Example 9.12    Code for the example in the slides

REIL student guide

Link            Observations
Exercise 2.1    Exercise to be completed by the student
Exercise 3.2    Exercise to be completed by the student
Exercise 3.3    Exercise to be completed by the student
Exercise 3.4    Exercise to be completed by the student
Exercise 3.5    Exercise to be completed by the student
Exercise 3.6    Exercise to be completed by the student
Exercise 3.7    Exercise to be completed by the student
Exercise 4.1    Exercise to be completed by the student
Exercise 4.2    Exercise to be completed by the student
Exercise 4.3    Exercise to be completed by the student
Exercise 4.4    Exercise to be completed by the student
Exercise 5.1    Exercise to be completed by the student
Exercise 5.2    Exercise to be completed by the student
Exercise 5.3    Exercise to be completed by the student
Exercise 5.4    Exercise to be completed by the student
Exercise 5.5    Exercise to be completed by the student
Exercise 6.1    Exercise to be completed by the student
Exercise 6.2    Exercise to be completed by the student
Exercise 6.3    Exercise to be completed by the student
Exercise 6.4    Exercise to be completed by the student
Example 7.1     Code for the example in the slides
Example 7.2     Code for the example in the slides
Example 7.3     Code for the example in the slides
Example 7.4     Code for the example in the slides
Example 7.5     Code for the example in the slides
Example 7.6     Code for the example in the slides
Example 7.7     Code for the example in the slides

Execution in Google Colab

The recommended way to run these codes is Google Colab. The simplest approach is to navigate to the notebook you want to run and replace github.com in its URL with githubtocolab.com.
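
For instance, a notebook at https://github.com/jparras/drl_classes/blob/main/<notebook>.ipynb (placeholder path) would open in Colab at https://githubtocolab.com/jparras/drl_classes/blob/main/<notebook>.ipynb.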

A second option is to open Colab, select GitHub in the Open dialog, and point it at this repository.

Finally, you can download the code and run it on your own machine after installing all required dependencies.
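
A minimal local setup, assuming a typical stack for this material (PyTorch, Gymnasium, and Stable Baselines 3; check the notebooks' imports for the exact requirements), might look like:

    pip install torch gymnasium stable-baselines3

Note that TRPO, if run through Stable Baselines 3, lives in the separate sb3-contrib package.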

More websites

If you are interested in DRL and want to keep learning, the following resources are worth checking:

  • Spinning Up in Deep RL is an OpenAI webpage that gives an in-depth introduction to DRL, along with a reading list of papers on more advanced topics. The documentation is well written, and the accompanying code is available and worth studying.
  • CleanRL is a project that provides single-file implementations of many DRL algorithms to make them easier to understand. Its documentation is good, and the repository is worth checking.
  • Stable Baselines 3 provides high-quality implementations of the most popular DRL algorithms and is highly recommended when you need state-of-the-art performance. It is a solid alternative to OpenAI Baselines.
