- Install Python version 3.9.0
- Install Visual C++ 14.0 or greater from https://visualstudio.microsoft.com/visual-cpp-build-tools/
- Run
pip install setuptools==65.5.0 pip==21
as gym 0.21 installation is broken with more recent versions
- You can now install the packages manually or using the requirements.txt file. To do the latter, run
pip3 install -r requirements.txt
You will likely see errors during installation here. For the most part these can be ignored: try running things after installing, then work out manually what is broken. Unfortunately, running rlgym involves a lot of deprecated or broken packages, as it does not support the newer version 2 of stable baselines 3, which uses gymnasium instead of gym.
- Run
pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu117
- Run
pip3 install "stable-baselines3[extra]==1.8.0"
- Run
pip3 install "gym[box2d]"
- Run
tensorboard --logdir=out/logs --bind_all
in a new terminal to load the web UI for tracking agent training. The --bind_all flag is optional and exposes the UI on the network so you can monitor it from another device.
- Run
pip3 freeze > requirements.txt
to save the currently installed packages into an updated requirements.txt file.
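Since the install step is expected to throw errors, it can help to check afterwards that the pinned versions actually ended up installed. The script below is a minimal sketch of such a check. `PINS` and `find_mismatches` are illustrative names, not part of this repo, and the gym entry is inferred from the note about gym 0.21 rather than from an explicit pin.

```python
from importlib.metadata import PackageNotFoundError, version

# Pins taken from the install steps above; extend as needed.
PINS = {
    "setuptools": "65.5.0",
    "pip": "21",
    "stable-baselines3": "1.8.0",
    "gym": "0.21",  # inferred from the gym 0.21 note, not an explicit pin
}


def find_mismatches(pins, get_version=version):
    """Return {package: (wanted, found)} for every pin that is not satisfied."""
    problems = {}
    for name, wanted in pins.items():
        try:
            found = get_version(name)
        except PackageNotFoundError:
            found = None
        # Prefix match so that a pin of "0.21" accepts "0.21.0".
        ok = found is not None and (found == wanted or found.startswith(wanted + "."))
        if not ok:
            problems[name] = (wanted, found)
    return problems


if __name__ == "__main__":
    for name, (wanted, found) in find_mismatches(PINS).items():
        print(f"{name}: wanted {wanted}, found {found or 'not installed'}")
```

Anything it prints is a candidate for the "work out manually what is broken" step.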
Run the python file rl-training.py
Within the bot folder, run the python file run_gui.py
This bot is configured to play any Rocket League game mode. The training plan is below. Training is done by playing games in as many simultaneous instances of Rocket League as possible.
Learn the basic mechanics as fast as possible. To do this I'm using 3v3, so that there are 6 agents training for each instance of the game. This will make the logs noisier, but should give the bot more exposure to the basics of the game.
Reward functions at this point are as below, along with their scale:
- VelocityPlayerToBallReward - 0.1
- VelocityBallToGoalReward - 1
- LiuDistanceBallToGoalReward - 1
- EventReward - 1
  - team_goal=100.0
  - concede=-100.0
  - shot=5.0
  - save=30.0
  - demo=10.0
- BallYCoordinateReward - 0.1
- RewardIfClosestToBall - 0.2
- LiuDistancePlayerToBallReward - 0.1
- FaceBallReward - 0.2
- TouchBallReward - 1
- AlignBallGoal - 0.1
- KickoffReward - 10
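To make the scales concrete: in a weighted-sum setup, each reward's raw output for a step is multiplied by its scale and the results are added up. The sketch below shows that arithmetic in plain Python, without importing rlgym. `INITIAL_WEIGHTS` and `combined_reward` are illustrative names, and the raw values in the usage example are made up.

```python
# Initial reward scales from the list above (reward name -> scale).
INITIAL_WEIGHTS = {
    "VelocityPlayerToBallReward": 0.1,
    "VelocityBallToGoalReward": 1,
    "LiuDistanceBallToGoalReward": 1,
    "EventReward": 1,
    "BallYCoordinateReward": 0.1,
    "RewardIfClosestToBall": 0.2,
    "LiuDistancePlayerToBallReward": 0.1,
    "FaceBallReward": 0.2,
    "TouchBallReward": 1,
    "AlignBallGoal": 0.1,
    "KickoffReward": 10,
}


def combined_reward(raw_rewards, weights):
    """Weighted sum of raw per-step reward outputs; missing rewards count as 0."""
    return sum(w * raw_rewards.get(name, 0.0) for name, w in weights.items())


# Hypothetical raw outputs for a single step, purely for illustration.
step = {"VelocityPlayerToBallReward": 0.5, "TouchBallReward": 1.0}
print(combined_reward(step, INITIAL_WEIGHTS))
```

With these scales, a single ball touch outweighs driving towards the ball by a wide margin, which fits the early focus on making contact with the ball.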
The primary goal is for the model to learn to take kickoffs and to chase the ball in general, with the aim of getting the ball into the opposing goal.
Terminal conditions for the first 265 million steps:
- TimeoutCondition(fps * 30)
- NoTouchTimeoutCondition(fps * 10)
- GoalScoredCondition()
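The `fps * 30` style arguments are step counts, so what they mean in seconds depends on how many environment steps run per second. A minimal sketch of the conversion, assuming `fps` means environment steps per second, a 120 Hz physics tick and a frame skip of 8 (the last two are assumptions; the real values are whatever the training script sets):

```python
PHYSICS_TICKS_PER_SECOND = 120  # assumed physics tick rate
FRAME_SKIP = 8                  # assumed frame skip; check the training script

# Environment steps per second under the assumptions above.
fps = PHYSICS_TICKS_PER_SECOND // FRAME_SKIP


def seconds_to_steps(seconds, steps_per_second=fps):
    """Convert a duration in seconds to a number of environment steps."""
    return int(seconds * steps_per_second)


print("TimeoutCondition steps:", seconds_to_steps(30))
print("NoTouchTimeoutCondition steps:", seconds_to_steps(10))
```

Under these assumptions an episode ends after 30 seconds of play, or after 10 seconds without a touch, whichever comes first.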
After 265 million steps I changed the terminal conditions and reward weightings.
- TimeoutCondition -> fps * 300
- VelocityPlayerToBallReward -> 1
- TouchBallReward -> 10
- LiuDistanceBallToGoalReward -> 50
This is to try to get the bot to gain experience in other areas of the game now that it has vaguely got the hang of what a kickoff is.
After 528 million steps I changed things again, changing the rewards as below:
- VelocityBallToGoalReward -> 10
- EventReward -> 10
- RewardIfClosestToBall -> 1
- LiuDistancePlayerToBallReward -> 1
- AlignBallGoal -> 20
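The staged changes up to this point can be written down as a schedule keyed by step count. This is only a sketch of the bookkeeping, not how the training script applies the changes. `SCHEDULE` and `weights_at` are hypothetical names, and only reward scales are tracked (the TimeoutCondition change at 265 million steps is left out).

```python
# Starting reward scales (reward name -> scale).
INITIAL = {
    "VelocityPlayerToBallReward": 0.1,
    "VelocityBallToGoalReward": 1,
    "LiuDistanceBallToGoalReward": 1,
    "EventReward": 1,
    "BallYCoordinateReward": 0.1,
    "RewardIfClosestToBall": 0.2,
    "LiuDistancePlayerToBallReward": 0.1,
    "FaceBallReward": 0.2,
    "TouchBallReward": 1,
    "AlignBallGoal": 0.1,
    "KickoffReward": 10,
}

# (step threshold, scales that change at that threshold), in ascending order.
SCHEDULE = [
    (265_000_000, {
        "VelocityPlayerToBallReward": 1,
        "TouchBallReward": 10,
        "LiuDistanceBallToGoalReward": 50,
    }),
    (528_000_000, {
        "VelocityBallToGoalReward": 10,
        "EventReward": 10,
        "RewardIfClosestToBall": 1,
        "LiuDistancePlayerToBallReward": 1,
        "AlignBallGoal": 20,
    }),
]


def weights_at(step):
    """Return the reward scales in effect once `step` training steps are done."""
    weights = dict(INITIAL)
    for threshold, overrides in SCHEDULE:
        if step >= threshold:
            weights.update(overrides)
    return weights


print(weights_at(600_000_000))
```

Keeping the changes in one structure like this makes it easier to see which scales have drifted furthest from their starting values.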
The bot should now have acceptable mechanics and a vague understanding of 3v3 strategies.
However, I noticed that there's a lot of dead time, where it's not necessarily driving towards the ball or preparing to defend or receive a pass, so I changed things again, changing the rewards as below:
- VelocityPlayerToBallReward -> 50
- LiuDistanceBallToGoalReward -> 20
- RewardIfClosestToBall -> 10
The bot can hold its own against the Psyonix Rookie bot, but only just beats it. Its fundamental weakness seems to be that it ignores actually scoring goals in favour of doing other things. On inspection, the scaling for rewards for goals is well behind everything else, so the rewards are changed as below:
- EventReward -> 1000
  - team_goal -> 10000
  - concede -> -10000
  - shot -> 10
  - save -> 60
  - demo -> 20
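One way to read these numbers: if the EventReward scale multiplies the per-event values, as it would in a weighted sum, then the effective reward for a goal is the product of the two. The quick check below shows how that product moved across the stages described so far. It is a sketch under that assumption, and `STAGES` and `effective_goal_reward` are illustrative names.

```python
# (EventReward scale, team_goal value) at each stage described above.
STAGES = {
    "initial": (1, 100.0),
    "after 528 million steps": (10, 100.0),
    "after this change": (1000, 10000.0),
}


def effective_goal_reward(event_scale, team_goal):
    """Reward contributed by one goal, assuming the scale multiplies the event value."""
    return event_scale * team_goal


for name, (scale, goal) in STAGES.items():
    print(f"{name}: {effective_goal_reward(scale, goal):,.0f}")
```

If the assumption holds, changing both the scale and the event values at once multiplies the goal reward by 10,000 in a single step, which may be a bigger jump than intended.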
The bot is now worse, mostly just sitting around in the middle of the pitch and moving very slowly. Given that it has learned the basic mechanics, I'm trying a complete change of rewards at this point. The new rewards are below, all equally weighted:
- VelocityPlayerToBallReward
- VelocityBallToGoalReward
- EventReward
  - team_goal=1000.0
  - concede=-100.0
  - shot=10.0
  - save=60.0
  - demo=20.0
- KickoffReward
- SaveBoostReward
My hope is that this encourages the bot to move faster, pick up more boost, and prioritise hitting the ball quickly towards the enemy goal.
Learn the strategies for playing each game mode. To do this I will make the bot train in each game mode in series, or possibly in parallel if I can configure it to run different game modes in different instances of Rocket League.
Reward functions?
Learn more advanced mechanics.
Test it.
Test it in a bot tournament.