Lab 5 Programming Assignment
Classic Atari game Pong
pong.py - trains for 1200 episodes and tests for 100 episodes [render_mode = "ansi"]
Results after 100 episodes:
Average timesteps per episode: 163.69
Average penalties per episode: 0.0
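The source of pong.py is not reproduced in this report. Since the tuned hyperparameters are alpha, gamma, and epsilon, the sketch below assumes a tabular Q-learning agent with epsilon-greedy exploration, and shows how the two reported averages could be computed. All function names (make_q_table, choose_action, q_update, summarize) are hypothetical, and how a "penalty" is counted is left to the caller because the report does not define it.

```python
import random
from collections import defaultdict


def make_q_table(n_actions):
    # Hypothetical helper: unseen states start with all-zero action values.
    return defaultdict(lambda: [0.0] * n_actions)


def choose_action(q_table, state, n_actions, epsilon, rng=random):
    # Epsilon-greedy: explore with probability epsilon, otherwise exploit.
    if rng.random() < epsilon:
        return rng.randrange(n_actions)
    values = q_table[state]
    return max(range(n_actions), key=values.__getitem__)


def q_update(q_table, state, action, reward, next_state, alpha, gamma):
    # Tabular Q-learning update:
    # Q(s, a) <- Q(s, a) + alpha * (r + gamma * max_a' Q(s', a') - Q(s, a))
    best_next = max(q_table[next_state])
    old = q_table[state][action]
    q_table[state][action] = old + alpha * (reward + gamma * best_next - old)


def summarize(episodes):
    # episodes: list of (timesteps, penalties) pairs, one per test episode.
    n = len(episodes)
    avg_steps = sum(t for t, _ in episodes) / n
    avg_penalties = sum(p for _, p in episodes) / n
    return avg_steps, avg_penalties
```

In a test run, each episode would append its (timesteps, penalties) pair to a list, and summarize would be called once on the 100 collected pairs.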
hyper.py - Hyperparameter Tuning Results:
Alpha | Gamma | Epsilon | Avg Timesteps | Avg Penalties
0.1 | 0.5 | 0.1 | 182.82 | 0.00
0.1 | 0.5 | 0.3 | 175.10 | 0.00
0.1 | 0.5 | 0.5 | 186.64 | 0.00
0.1 | 0.8 | 0.1 | 131.75 | 0.00
0.1 | 0.8 | 0.3 | 169.63 | 0.00
0.1 | 0.8 | 0.5 | 160.26 | 0.00
0.1 | 0.99 | 0.1 | 113.32 | 5.99
0.1 | 0.99 | 0.3 | 111.45 | 0.00
0.1 | 0.99 | 0.5 | 126.25 | 0.00
0.5 | 0.5 | 0.1 | 70.59 | 0.00
0.5 | 0.5 | 0.3 | 55.52 | 0.00
0.5 | 0.5 | 0.5 | 30.90 | 0.00
0.5 | 0.8 | 0.1 | 22.42 | 0.00
0.5 | 0.8 | 0.3 | 25.80 | 0.00
0.5 | 0.8 | 0.5 | 20.45 | 0.00
0.5 | 0.99 | 0.1 | 16.78 | 0.00
0.5 | 0.99 | 0.3 | 14.47 | 0.00
0.5 | 0.99 | 0.5 | 13.06 | 0.00
0.9 | 0.5 | 0.1 | 18.61 | 0.00
0.9 | 0.5 | 0.3 | 16.75 | 0.00
0.9 | 0.5 | 0.5 | 18.51 | 0.00
0.9 | 0.8 | 0.1 | 12.98 | 0.00
0.9 | 0.8 | 0.3 | 12.91 | 0.00
0.9 | 0.8 | 0.5 | 14.93 | 0.00
0.9 | 0.99 | 0.1 | 22.99 | 0.00
0.9 | 0.99 | 0.3 | 14.32 | 0.00
0.9 | 0.99 | 0.5 | 16.80 | 0.00
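The 27 rows above correspond to every combination of three alpha, three gamma, and three epsilon values. A grid like this could be produced roughly as follows. This is only a sketch: the actual hyper.py is not shown in this report, and train_and_evaluate is a hypothetical callback that would train an agent with the given settings and return (average timesteps, average penalties).

```python
from itertools import product

# Values taken from the results table.
ALPHAS = [0.1, 0.5, 0.9]
GAMMAS = [0.5, 0.8, 0.99]
EPSILONS = [0.1, 0.3, 0.5]


def grid():
    # Cartesian product in the same order as the table:
    # alpha varies slowest, epsilon fastest.
    return list(product(ALPHAS, GAMMAS, EPSILONS))


def run_grid(train_and_evaluate):
    # train_and_evaluate(alpha, gamma, epsilon) is a hypothetical callback
    # returning (avg_timesteps, avg_penalties) for one configuration.
    rows = []
    for alpha, gamma, epsilon in grid():
        avg_steps, avg_penalties = train_and_evaluate(alpha, gamma, epsilon)
        rows.append((alpha, gamma, epsilon, avg_steps, avg_penalties))
    return rows


def format_row(row):
    # Render one result in the pipe-separated layout used in the table.
    return "{} | {} | {} | {:.2f} | {:.2f}".format(*row)
```

Sorting the returned rows by average timesteps, for example with sorted(rows, key=lambda r: r[3]), would put the fastest configurations first.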
Here is the video demonstration, recorded with render_mode = "human"