Explorations and implementations of Deep Reinforcement Learning techniques from the Hugging Face course, featuring DQN and PPO applied to classic and VizDoom environments.

Tikhon-Radkevich/HuggingFaceDeepRL

Deep Reinforcement Learning

This repository is dedicated to the Deep Reinforcement Learning Course offered by Hugging Face. Explore the implementations and find all models hosted on my Hugging Face Hub profile: TikhonRadkevich.

Featured Units:

  • Unit 3: DQN - Delve into the application of Deep Q-Networks.
  • Unit 8: PPO - Explore Proximal Policy Optimization techniques.
  • Doom: PPO - Implement PPO in VizDoom environments.

Unit 3: DQN

Instead of relying on a Q-table, Deep Q-Learning uses a neural network to approximate the Q-value of each action given the current state. I trained the model to play Space Invaders, among other Atari games, using RL-Zoo (RL Baselines3 Zoo), a reinforcement learning training framework built on Stable-Baselines3. The framework provides scripts for training and evaluating agents, tuning hyperparameters, plotting results, and recording gameplay.
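To make the idea concrete, here is a minimal pure-Python sketch of two pieces that a DQN agent typically combines: the one-step Q-learning target the network is trained toward, and epsilon-greedy action selection over the network's outputs. The function names (`td_target`, `epsilon_greedy`) and the example Q-values are hypothetical illustrations, not code from RL-Zoo or from this repository.

```python
import random


def td_target(reward, next_q_values, done, gamma=0.99):
    """One-step Q-learning target: r + gamma * max_a' Q(s', a').

    At the end of an episode there is no next state, so the target is just r.
    """
    if done:
        return reward
    return reward + gamma * max(next_q_values)


def epsilon_greedy(q_values, epsilon, rng=random):
    """With probability epsilon pick a random action, else the highest-valued one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


# Hypothetical Q-values that a network might output for a state with 3 actions.
q_next = [0.5, 2.0, 1.0]

# Target for a transition with reward 1.0 that does not end the episode.
target = td_target(reward=1.0, next_q_values=q_next, done=False, gamma=0.9)

# With epsilon = 0 the choice is purely greedy.
greedy_action = epsilon_greedy(q_next, epsilon=0.0)
```

In a full agent, the network's prediction for the action actually taken would be regressed toward `target`, and epsilon would usually be decayed over training so the agent explores less as its estimates improve.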

DQN Replay

Unit 8: PPO

In this unit, I explored Proximal Policy Optimization (PPO), an algorithm that improves training stability by limiting how far each update can move the policy. It does this with the probability ratio between the current and old policies, which is clipped to the range [1−ϵ, 1+ϵ] so that no single update is overly drastic. I first studied the theory behind PPO and then implemented a PPO agent from scratch, using the CleanRL framework, and tested it on LunarLander-v2.
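The clipping described above can be sketched for a single sample in a few lines of pure Python. This is an illustrative version of the standard clipped surrogate objective, not the repository's or CleanRL's implementation; the function name `clipped_surrogate` and the example probabilities are assumptions made for the example.

```python
import math


def clipped_surrogate(new_logp, old_logp, advantage, eps=0.2):
    """Per-sample PPO clipped objective (to be maximized).

    ratio = pi_new(a|s) / pi_old(a|s), computed from log-probabilities.
    The objective is the minimum of the unclipped and clipped terms, so a
    large change in the policy cannot increase the objective beyond the clip.
    """
    ratio = math.exp(new_logp - old_logp)
    clipped_ratio = max(1.0 - eps, min(ratio, 1.0 + eps))
    return min(ratio * advantage, clipped_ratio * advantage)


# The action became 1.5x more likely (0.4 -> 0.6) and had a positive advantage:
# the gain is capped at (1 + eps) * advantage.
capped = clipped_surrogate(math.log(0.6), math.log(0.4), advantage=1.0)

# Same ratio with a negative advantage: the min keeps the unclipped,
# more pessimistic term, so the penalty is not softened.
penalized = clipped_surrogate(math.log(0.6), math.log(0.4), advantage=-1.0)
```

A training loop would average this quantity over a batch and minimize its negative, usually alongside a value-function loss and an entropy bonus.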

PPO Replay

Doom: PPO

I went further with PPO by using Sample Factory, an asynchronous PPO implementation, to train an agent in VizDoom, a community-based open source Doom game. My first project trained the agent on the Health Gathering scenario, where the objective is to collect health packs to prolong survival. I then moved on to more complex scenarios such as Deathmatch.

Doom PPO Replay
