This seminar was created to get familiar with modern RL

mipt_course

Contains solutions to the tasks from the RL course at MIPT, which is based on David Silver's course. All tasks use OpenAI Gym environments.

deephack

Contains our attempts to solve the Skiing game, the problem for the qualification round of the DeepHackLab hackathon. The core of the model is a convolutional autoencoder with dense layers in the bottleneck. Before training, we convert the images from RGB to grayscale and downscale them to 60x60. The autoencoder gives us low-dimensional features for each image (64 in the basic configuration). The code is in autoencoder_simple_features.ipynb.
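The preprocessing step described above (RGB to grayscale, then downscaling to 60x60) could look roughly like the sketch below. It is not taken from the notebook: the function name `preprocess`, the luminance weights, and the nearest-neighbour downsampling are all assumptions chosen for illustration, and the autoencoder itself is not shown.

```python
import numpy as np

def preprocess(frame, size=60):
    """Convert an RGB frame of shape (H, W, 3) to a size x size grayscale array."""
    frame = np.asarray(frame, dtype=np.float32)
    # Standard luminance weights (assumption: the notebook may convert differently).
    gray = frame @ np.array([0.299, 0.587, 0.114], dtype=np.float32)
    h, w = gray.shape
    # Nearest-neighbour downsampling: pick `size` evenly spaced rows and columns.
    rows = np.arange(size) * h // size
    cols = np.arange(size) * w // size
    small = gray[np.ix_(rows, cols)]
    # Scale pixel values from [0, 255] to roughly [0, 1].
    return small / 255.0

# A random array stands in for a real 210x160 Atari frame.
rng = np.random.default_rng(0)
frame = rng.integers(0, 256, size=(210, 160, 3), dtype=np.uint8)
obs = preprocess(frame)
print(obs.shape)
```

The resulting 60x60 array is what would be fed to the autoencoder, whose bottleneck then yields the low-dimensional feature vector.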

From there, we see three main directions for further development:

1. Parametrize the agent's policy and use policy gradient algorithms, e.g. Monte-Carlo Policy Gradient (REINFORCE). The code is in Skiing.ipynb.
2. Approximate the value or action-value function and use an epsilon-greedy policy. The code is in linear_fa.ipynb.
3. Collect more features, e.g. via object detection. NB: due to the competition rules, features must NOT be environment-specific. The code is in features_demo.ipynb.
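To make the first two directions concrete, here is a minimal sketch of both over the same feature vector. It is an illustration under assumptions, not the code from Skiing.ipynb or linear_fa.ipynb: the linear-softmax policy, the linear action-value function, the helper names, the step size, and the action count of 3 are all hypothetical choices.

```python
import numpy as np

N_FEATURES = 64  # size of the autoencoder feature vector
N_ACTIONS = 3    # assumption: Skiing exposes three actions (noop, right, left)

def softmax_policy(theta, phi):
    """Action probabilities of a linear-softmax policy; theta is (N_ACTIONS, N_FEATURES)."""
    logits = theta @ phi
    logits = logits - logits.max()  # subtract the max for numerical stability
    p = np.exp(logits)
    return p / p.sum()

def reinforce_update(theta, episode, alpha=0.01, gamma=0.99):
    """One Monte-Carlo policy-gradient (REINFORCE) step over a finished episode.

    `episode` is a list of (phi, action, reward) tuples. The gamma**t weighting
    of the textbook form is omitted here, as is common in practice.
    """
    grad_sum = np.zeros_like(theta)
    g = 0.0
    # Walk backwards so the return G_t accumulates incrementally.
    for phi, action, reward in reversed(episode):
        g = reward + gamma * g
        probs = softmax_policy(theta, phi)
        # Gradient of log pi(a|s) for a linear softmax: (onehot(a) - probs) outer phi.
        grad = -np.outer(probs, phi)
        grad[action] += phi
        grad_sum += g * grad
    return theta + alpha * grad_sum

def epsilon_greedy(w, phi, epsilon, rng):
    """Pick an action from linear action values q(s, a) = w[a] . phi."""
    if rng.random() < epsilon:
        return int(rng.integers(len(w)))  # explore: uniform random action
    return int(np.argmax(w @ phi))        # exploit: greedy action

rng = np.random.default_rng(0)
phi = rng.normal(size=N_FEATURES)  # stand-in for real autoencoder features
theta = np.zeros((N_ACTIONS, N_FEATURES))
theta = reinforce_update(theta, [(phi, 1, 1.0)])
print(softmax_policy(theta, phi))  # action 1 now has the highest probability
```

Both functions consume the same feature vector, so the two directions can be compared on identical inputs.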

izmailovpavel/rl-seminar
