A comprehensive implementation of Proximal Policy Optimization (PPO) algorithms in PyTorch, featuring both theoretical foundations and practical demonstrations.
- Clean, modular PyTorch implementation of PPO
- Support for continuous and discrete action spaces
- Implementations of key PPO components:
  - Clipped surrogate objective
  - Value function estimation
  - Generalized Advantage Estimation (GAE) (sketched below)
  - Policy and value function updates
- Multiple environment demonstrations:
  - CartPole-v1
  - LunarLander-v2
- Real-time visualization of agent performance
- Training progress tracking and plotting
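Of the components listed above, GAE is the least self-explanatory, so a minimal sketch of the computation is shown here. The function name, signature, and tensor layout are illustrative assumptions, not the API exposed by `ppo.py`:

```python
import torch

def compute_gae(rewards, values, dones, last_value, gamma=0.99, lam=0.95):
    """Compute advantages and value targets for one rollout of length T.

    rewards, values, dones: 1-D float tensors of length T from the rollout.
    last_value: critic estimate for the state after the final step (bootstrap).
    """
    T = rewards.shape[0]
    advantages = torch.zeros(T)
    gae = 0.0
    for t in reversed(range(T)):
        next_value = last_value if t == T - 1 else values[t + 1]
        not_done = 1.0 - dones[t]
        # TD error: r_t + gamma * V(s_{t+1}) - V(s_t)
        delta = rewards[t] + gamma * next_value * not_done - values[t]
        # Recursive GAE: delta_t + gamma * lambda * A_{t+1}
        gae = delta + gamma * lam * not_done * gae
        advantages[t] = gae
    returns = advantages + values  # regression targets for the critic
    return advantages, returns
```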
- Clone the repository:

  ```bash
  git clone https://github.com/ai-in-pm/Proximal-Policy-Optimization-Algorithms.git
  cd Proximal-Policy-Optimization-Algorithms
  ```

- Install dependencies:

  ```bash
  pip install -r requirements.txt
  ```

- Run a demo:

  ```bash
  # Run CartPole demo
  python demonstrations/cartpole_demo.py

  # Run LunarLander demo
  python demonstrations/lunar_lander_demo.py
  ```
```
.
├── ppo.py              # Core PPO implementation
├── demonstrations/     # Example implementations
│   ├── cartpole_demo.py
│   ├── lunar_lander_demo.py
│   └── README.md
├── requirements.txt    # Project dependencies
└── README.md           # This file
```
- Actor-Critic Architecture (a minimal sketch follows this list)
  - Actor (Policy) network outputs action distributions
  - Critic (Value) network estimates state values
- PPO Algorithm (the loss terms are sketched below)
  - Clipped surrogate objective for stable updates
  - Value function loss with clipping
  - Entropy bonus for exploration
  - Generalized Advantage Estimation (GAE)
- Key Features
  - Modular design for easy extension
  - Configurable hyperparameters
  - Support for different environments
  - Training progress visualization
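A minimal discrete-action actor-critic along these lines might look as follows. This is an illustrative sketch only; the class name, layer sizes, and method signatures are assumptions rather than the structure used in `ppo.py`:

```python
import torch
import torch.nn as nn
from torch.distributions import Categorical

class ActorCritic(nn.Module):
    """Separate actor and critic MLPs operating on the same observation."""

    def __init__(self, obs_dim, n_actions, hidden=64):
        super().__init__()
        # Actor: observation -> logits of a categorical action distribution
        self.actor = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.Tanh(),
            nn.Linear(hidden, n_actions),
        )
        # Critic: observation -> scalar state-value estimate
        self.critic = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.Tanh(),
            nn.Linear(hidden, 1),
        )

    def forward(self, obs):
        dist = Categorical(logits=self.actor(obs))
        value = self.critic(obs).squeeze(-1)
        return dist, value
```

For a continuous-action environment, the categorical head would be replaced by a parameterized Gaussian (e.g. `torch.distributions.Normal`) over actions.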
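The loss terms listed under "PPO Algorithm" combine into a single objective. The sketch below shows one common way to assemble them; the function and argument names are assumptions (the default coefficients mirror the hyperparameters listed below), not the exact code in `ppo.py`:

```python
import torch

def ppo_loss(new_log_probs, old_log_probs, advantages,
             new_values, old_values, returns, entropy,
             clip_eps=0.2, value_coef=1.0, entropy_coef=0.01):
    # Probability ratio pi_theta(a|s) / pi_theta_old(a|s)
    ratio = torch.exp(new_log_probs - old_log_probs)

    # Clipped surrogate objective (negated because we minimize)
    unclipped = ratio * advantages
    clipped = torch.clamp(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * advantages
    policy_loss = -torch.min(unclipped, clipped).mean()

    # Value loss, clipped around the value estimates recorded at rollout time
    values_clipped = old_values + torch.clamp(new_values - old_values,
                                              -clip_eps, clip_eps)
    value_loss = torch.max((new_values - returns) ** 2,
                           (values_clipped - returns) ** 2).mean()

    # Entropy bonus: subtracting it rewards higher-entropy (more exploratory) policies
    return policy_loss + value_coef * value_loss - entropy_coef * entropy.mean()
```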
- Learning rate: 3e-4
- Discount factor (gamma): 0.99
- GAE parameter (lambda): 0.95
- Clipping parameter (epsilon): 0.2
- Value function coefficient: 1.0
- Entropy coefficient: 0.01
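Collected in one place, these defaults might be expressed as a plain dictionary. The key names below are illustrative and may not match the argument names actually used by `ppo.py` or the demo scripts:

```python
ppo_hyperparams = {
    "learning_rate": 3e-4,
    "gamma": 0.99,         # discount factor
    "gae_lambda": 0.95,    # GAE parameter
    "clip_epsilon": 0.2,   # clipping parameter
    "value_coef": 1.0,     # value function coefficient
    "entropy_coef": 0.01,  # entropy coefficient
}
```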
The implementation has been tested on the following environments:

- CartPole-v1
  - Achieves optimal performance (500 steps) within 500 episodes
  - Stable learning across different random seeds
- LunarLander-v2
  - Achieves landing within 1000 episodes
  - Demonstrates stable control and smooth landing
Contributions are welcome! Please feel free to submit a Pull Request. See CONTRIBUTING.md for guidelines.
This project is licensed under the MIT License - see the LICENSE file for details.
- Schulman, J., Wolski, F., Dhariwal, P., Radford, A., & Klimov, O. (2017). Proximal Policy Optimization Algorithms. arXiv preprint arXiv:1707.06347.
- Gymnasium Documentation: https://gymnasium.farama.org/
- PyTorch Documentation: https://pytorch.org/docs/