This project implements two AI agents that learn to solve the Cart-Pole game. It also includes an agent that learns to play the Lunar-Lander game using Prioritized Experience Replay (PER).

Myk72/RL-Algorithm

RL Algorithms Project

This repository contains implementations of Reinforcement Learning (RL) algorithms. I implemented two agents (tabular Q-Learning and DQN) to solve the CartPole environment and compared their learning behavior. I also trained an agent with Prioritized Experience Replay (PER) on LunarLander-v2.

Features

1. Q-Learning Agent

  • Implemented a classic Q-Learning agent.
  • Uses a Q-table for value storage.
  • Applies state discretization using bins to handle continuous observations.
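The binning and Q-table update described above can be sketched as follows. This is a minimal, dependency-free illustration and not the notebook's code: the bin ranges, the bin count, and the helper names (`make_edges`, `discretize`, `q_update`) are assumptions chosen for the example, and the notebook itself likely stores the table as a NumPy array (it saves `q_table.npy`).

```python
import bisect
from collections import defaultdict

N_ACTIONS = 2  # CartPole has two actions: push left, push right


def make_edges(low, high, n_bins):
    """Return the n_bins - 1 evenly spaced interior edges between low and high."""
    step = (high - low) / n_bins
    return [low + step * i for i in range(1, n_bins)]


# Assumed ranges for (cart position, cart velocity, pole angle, pole angular
# velocity). Velocities are unbounded in CartPole, so they are clipped by binning.
EDGES = [
    make_edges(-2.4, 2.4, 6),
    make_edges(-3.0, 3.0, 6),
    make_edges(-0.21, 0.21, 6),
    make_edges(-3.0, 3.0, 6),
]


def discretize(obs):
    """Map a continuous observation to a tuple of bin indices (one per dimension)."""
    return tuple(bisect.bisect_right(edges, x) for edges, x in zip(EDGES, obs))


# Q-table: discrete state -> list of action values, zero-initialized on first access.
q_table = defaultdict(lambda: [0.0] * N_ACTIONS)


def q_update(state, action, reward, next_state, done, alpha=0.1, gamma=0.99):
    """Apply one tabular Q-learning update and return the new Q(state, action)."""
    best_next = 0.0 if done else max(q_table[next_state])
    td_target = reward + gamma * best_next
    q_table[state][action] += alpha * (td_target - q_table[state][action])
    return q_table[state][action]
```

Values outside the assumed ranges fall into the first or last bin, so every observation maps to a valid table key.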

2. DQN Agent

  • Implemented a Deep Q-Network (DQN) using PyTorch.
  • Uses a neural network to approximate Q-values.
  • Includes Experience Replay and a Target Network for improved learning stability.
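To illustrate how Experience Replay and a Target Network fit together, here is a small framework-free sketch. It is not the notebook's PyTorch code: `ReplayBuffer`, `td_targets`, and `hard_update` are hypothetical names, the target network is stood in for by any callable `target_q(state)` returning a list of action values, and parameters are represented as a plain dict.

```python
import copy
import random
from collections import deque, namedtuple

Transition = namedtuple("Transition", "state action reward next_state done")


class ReplayBuffer:
    """Fixed-size FIFO store of transitions, sampled uniformly at random."""

    def __init__(self, capacity):
        self.memory = deque(maxlen=capacity)  # oldest entries are dropped first

    def push(self, state, action, reward, next_state, done):
        self.memory.append(Transition(state, action, reward, next_state, done))

    def sample(self, batch_size):
        return random.sample(list(self.memory), batch_size)

    def __len__(self):
        return len(self.memory)


def td_targets(batch, target_q, gamma=0.99):
    """Compute y = r + gamma * max_a' Q_target(s', a'); no bootstrap on terminal steps."""
    return [
        t.reward + (0.0 if t.done else gamma * max(target_q(t.next_state)))
        for t in batch
    ]


def hard_update(online_params, target_params):
    """Copy the online parameters into the target in place (a periodic hard sync)."""
    target_params.clear()
    target_params.update(copy.deepcopy(online_params))
```

Sampling old transitions at random breaks the correlation between consecutive steps, and computing targets from a slowly changing copy keeps the regression target from shifting on every gradient step. In PyTorch, the hard sync is commonly written as `target_net.load_state_dict(online_net.state_dict())`.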

3. Comparison

  • Trained both Q-Learning and DQN agents.
  • Created plots showing average reward and learning curves.
  • Compared stability, speed, and performance of both algorithms.
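One simple way to compare learning curves is to smooth per-episode rewards with a trailing average and record when each agent first crosses a reward threshold. The sketch below assumes a list of episode rewards per agent; the function names, the window size, and any threshold passed in are illustrative choices, not values taken from the notebooks.

```python
def moving_average(rewards, window=100):
    """Trailing mean over the last `window` episodes (shorter windows at the start)."""
    averaged = []
    running_sum = 0.0
    for i, r in enumerate(rewards):
        running_sum += r
        if i >= window:
            running_sum -= rewards[i - window]  # drop the episode that left the window
        averaged.append(running_sum / min(i + 1, window))
    return averaged


def episodes_to_threshold(rewards, threshold, window=100):
    """Index of the first episode whose moving average reaches threshold, else None."""
    for i, avg in enumerate(moving_average(rewards, window)):
        if avg >= threshold:
            return i
    return None
```

Plotting `moving_average(...)` for both agents on the same axes gives the comparison curve, and `episodes_to_threshold(...)` gives a single number for learning speed.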

4. Additional Task: PER

  • Implemented Prioritized Experience Replay (PER).
  • Trained a PER-based agent on the LunarLander-v2 environment.
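As a rough sketch of proportional PER, the buffer below samples transition i with probability p_i^alpha / sum_k p_k^alpha and returns importance-sampling weights (N * P(i))^(-beta) to correct the resulting bias. It is a simplified assumption-laden version, not the notebook's implementation: it uses O(N) list sampling instead of a sum-tree, samples with replacement, normalizes weights by the batch maximum, and the class name and default hyperparameters are hypothetical.

```python
import random


class PrioritizedReplayBuffer:
    """Proportional prioritized replay with list-based O(N) sampling."""

    def __init__(self, capacity, alpha=0.6, eps=1e-5):
        self.capacity = capacity
        self.alpha = alpha  # 0 gives uniform sampling, 1 gives fully prioritized
        self.eps = eps      # keeps every priority strictly positive
        self.data = []
        self.priorities = []
        self.pos = 0        # next slot to overwrite once the buffer is full

    def push(self, transition):
        # New transitions get the current max priority so they are seen at least once.
        max_p = max(self.priorities, default=1.0)
        if len(self.data) < self.capacity:
            self.data.append(transition)
            self.priorities.append(max_p)
        else:
            self.data[self.pos] = transition
            self.priorities[self.pos] = max_p
        self.pos = (self.pos + 1) % self.capacity

    def sample(self, batch_size, beta=0.4):
        scaled = [p ** self.alpha for p in self.priorities]
        total = sum(scaled)
        probs = [s / total for s in scaled]
        indices = random.choices(range(len(self.data)), weights=probs, k=batch_size)
        n = len(self.data)
        weights = [(n * probs[i]) ** (-beta) for i in indices]
        max_w = max(weights)  # batch-level normalization (a common simplification)
        weights = [w / max_w for w in weights]
        return [self.data[i] for i in indices], indices, weights

    def update_priorities(self, indices, td_errors):
        # Priority is the absolute TD error of the transition, plus a small epsilon.
        for i, err in zip(indices, td_errors):
            self.priorities[i] = abs(err) + self.eps
```

In a training loop, the returned weights would multiply the per-sample loss, and `update_priorities` would be called with the new TD errors after each learning step.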

Folder Structure

RL_Algorithm/
│
├── DQN/
│   ├── dqn.ipynb                 # Code for the DQN agent
│   ├── dqn_learning_curve.png    # Plot for DQN training
│   ├── dqn_model.pth             # Saved model file
│   └── final_comparison_plot.png # Plot comparing DQN vs Q-Learning
│
├── PER/
│   └── per.ipynb                 # PER on LunarLander-v3
│
├── Q_Learning/
│   ├── Q_Learning.ipynb          # Code for the Q-Learning agent
│   ├── q_table.npy               # Saved Q-Table file
│   ├── learning_curve.png        # Plot for Q-Learning training
│   ├── rewards_log.npy           # Data file for rewards
│   └── avg_rewards_log.npy       # Data file for average rewards
│
└── README.md