ray075hl/PPO_super_mario

PPO_super_mario

Play Super Mario using the Proximal Policy Optimization (PPO) method.
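For readers new to PPO, here is a minimal sketch of the clipped surrogate loss that the method is built around. It is written in plain Python for illustration and is not taken from this repository; the function name `ppo_clip_loss` and the default `clip_eps=0.2` are assumptions.

```python
import math


def ppo_clip_loss(new_log_probs, old_log_probs, advantages, clip_eps=0.2):
    """Return the mean negative clipped surrogate objective for a batch.

    Hypothetical helper for illustration; the repository's own loss
    (implemented with PyTorch tensors) may differ in detail.
    """
    total = 0.0
    for new_lp, old_lp, adv in zip(new_log_probs, old_log_probs, advantages):
        # Probability ratio between the new policy and the old policy.
        ratio = math.exp(new_lp - old_lp)
        # Keep the ratio inside [1 - clip_eps, 1 + clip_eps].
        clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Pessimistic bound: take the smaller of the two objectives.
        total += min(ratio * adv, clipped * adv)
    # Negate because optimizers minimize, while PPO maximizes the objective.
    return -total / len(advantages)
```

When the new and old policies agree, the ratio is 1 and the loss reduces to the negative mean advantage. Clipping only takes effect once the policy has moved more than `clip_eps` away from the old one.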

Setup

Tested on Windows 8.1, Windows 10, and Ubuntu 16.04.

Python 3.6, PyTorch >= 0.4.0.

Install the other required packages:

pip install -r requirements.txt

Saving video requires ffmpeg to be installed.
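A quick way to check whether ffmpeg is visible on your PATH before recording is sketched below. The helper name `ffmpeg_available` is hypothetical and not part of this repository.

```python
import shutil


def ffmpeg_available():
    """Return True if an `ffmpeg` executable can be found on PATH."""
    return shutil.which("ffmpeg") is not None


if __name__ == "__main__":
    if ffmpeg_available():
        print("ffmpeg found; video saving should work.")
    else:
        print("ffmpeg not found; install it before saving video.")
```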

Usage

# Train an agent from scratch
python run.py train

Download the pre-trained model from here.

# Play game with a trained model
python run.py play ./pre_trained_model/mario_10000-best.dat

Training takes about 5 hours on an NVIDIA V100 (1 GPU, 16 parallel game environments). Rewards reach about 200.0, with a game length of about 275 steps. Once the model converges, play looks like the image below.

(Image: gameplay of the converged model)
