RLforDummy: Reinforcement Learning Algorithms Implementation

Project Overview

This project is a collection of my implementations of several key Reinforcement Learning (RL) algorithms. It serves as a practical exploration into the field of RL and demonstrates the application of these algorithms in solving complex environments.

Implemented Algorithms

In this repository, you will find my implementations of the following algorithms:

Semi-RainbowDQN: An adaptation of the Rainbow DQN algorithm that incorporates several, but not all, enhancements over the standard DQN.
- Reference: M. Hessel et al., "Rainbow: Combining Improvements in Deep Reinforcement Learning," in AAAI Conference on Artificial Intelligence, 2018. Link
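Advantage Actor-Critic (A2C): This algorithm combines policy-based and value-based RL. An actor (the policy) is updated using advantage estimates from a learned critic (the value function). A2C is the synchronous variant of the A3C algorithm introduced in the reference below.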
- Reference: V. Mnih et al., "Asynchronous Methods for Deep Reinforcement Learning," in International Conference on Machine Learning, 2016. Link
Proximal Policy Optimization (PPO): PPO is an on-policy algorithm that optimizes a clipped surrogate objective, which keeps each policy update close to the policy that collected the data. The original paper reports strong results on both Atari and MuJoCo benchmarks.
- Reference: J. Schulman et al., "Proximal Policy Optimization Algorithms," arXiv preprint arXiv:1707.06347, 2017. Link
Soft Actor-Critic (SAC): SAC is an off-policy deep reinforcement learning algorithm designed for continuous action spaces. It adds an entropy term to the objective to encourage exploration, and it reuses past experience from a replay buffer, which makes it sample efficient.
- Reference: T. Haarnoja, A. Zhou, P. Abbeel, and S. Levine, "Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor," arXiv preprint arXiv:1801.01290, 2018. Link
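
To make the clipped surrogate objective mentioned for PPO more concrete, here is a minimal sketch in plain Python. It is not the code from this repository: the function name `clipped_surrogate`, its parameters, and the default `clip_eps=0.2` are assumptions chosen for illustration, and a real implementation would operate on batched tensors and add value-loss and entropy terms.

```python
import math


def clipped_surrogate(new_logp, old_logp, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate objective (to be maximized) over a batch.

    new_logp / old_logp: log-probabilities of the taken actions under the
    current policy and under the policy that collected the data.
    advantages: advantage estimates for the same actions.
    clip_eps: clipping range (hypothetical default for this sketch).
    """
    total = 0.0
    for lp_new, lp_old, adv in zip(new_logp, old_logp, advantages):
        # Probability ratio pi_new(a|s) / pi_old(a|s), computed in log space.
        ratio = math.exp(lp_new - lp_old)
        # Restrict the ratio to [1 - clip_eps, 1 + clip_eps].
        clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Take the smaller of the two terms, a pessimistic bound that removes
        # the incentive to move the ratio far outside the clipping range.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)


# Example: two samples whose ratios (1.8 and 0.2) fall outside the range.
old = [math.log(0.5), math.log(0.5)]
new = [math.log(0.9), math.log(0.1)]
adv = [1.0, -1.0]
print(clipped_surrogate(new, old, adv))
```

In the example, the first sample's term is capped at 1.2 instead of 1.8, and the second sample's term is -0.8 instead of -0.2, so the printed value is roughly 0.2. When the two policies are identical, the ratio is 1 and the objective reduces to the mean advantage.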

Solved Environment

Each implemented algorithm has been run on benchmark environments, and the results are available in the wandb report linked below. Due to limited computing resources, I could not run every algorithm on every Atari and MuJoCo benchmark. Contributions are welcome.

https://api.wandb.ai/links/phdminh01/t99sr00t
