This project builds on an OpenAI Gym environment for Super Mario Bros. and uses the Stable-Baselines3 library to train a reinforcement learning agent with PPO (Proximal Policy Optimization) to play the game.
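As a rough illustration of what such a training setup can look like, here is a minimal sketch. It assumes the `gym-super-mario-bros`, `nes-py` and `stable-baselines3` packages in mutually compatible versions (depending on the installed versions, a Gym compatibility wrapper may be needed). The environment ID, the `CnnPolicy` choice, the step count, the output path, and the helper names `make_env` and `train` are illustrative assumptions, not this project's actual configuration.

```python
"""Minimal PPO training sketch; all defaults below are illustrative assumptions."""

# Hypothetical defaults, not taken from the project.
ENV_ID = "SuperMarioBros-v0"
TOTAL_TIMESTEPS = 10_000
MODEL_PATH = "ppo_mario"


def make_env(env_id=ENV_ID):
    """Create the Mario environment with a reduced action set."""
    # Imported lazily so this module loads even without the game packages.
    import gym_super_mario_bros
    from gym_super_mario_bros.actions import SIMPLE_MOVEMENT
    from nes_py.wrappers import JoypadSpace

    env = gym_super_mario_bros.make(env_id)
    # Restrict the NES controller to a small set of button combinations.
    return JoypadSpace(env, SIMPLE_MOVEMENT)


def train(total_timesteps=TOTAL_TIMESTEPS, model_path=MODEL_PATH):
    """Train a PPO agent on raw frames and save it; returns the save path."""
    from stable_baselines3 import PPO

    env = make_env()
    # "CnnPolicy" is the image-based policy; an assumed, common choice here.
    model = PPO("CnnPolicy", env, verbose=1)
    model.learn(total_timesteps=total_timesteps)
    model.save(model_path)
    env.close()
    return model_path
```

Calling `train()` starts a short run with the assumed defaults; a real run would typically use far more timesteps and add frame preprocessing.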
A new training run can be started through a GitHub Action. A parameter decides whether the training runs locally (inside the GitHub Action itself) or on the Google Cloud Platform. This way, new ideas can be tested quickly in local mode, while the final training takes place in Vertex AI. There, a custom job pulls the current Docker image with all code from the Artifact Registry (the image is also built and uploaded via a GitHub Action) and creates a new Cloud Storage bucket in which all model artifacts are stored.
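The local-versus-cloud switch described above could be sketched as follows. This is a hedged example, not the project's actual entry point: the `--target` flag, the function names, the machine type, the job name and the bucket layout are all hypothetical. The Vertex AI part assumes the `google-cloud-aiplatform` and `google-cloud-storage` client libraries, and for simplicity it creates the bucket before submitting the job, which may differ from how the project does it.

```python
import argparse


def parse_args(argv=None):
    """Parse the (hypothetical) --target flag that selects where training runs."""
    parser = argparse.ArgumentParser(description="Start a Mario PPO training run")
    parser.add_argument("--target", choices=["local", "gcp"], default="local")
    return parser.parse_args(argv)


def artifact_registry_uri(region, project, repository, image, tag="latest"):
    """Build an image URI in the standard Artifact Registry Docker format."""
    return f"{region}-docker.pkg.dev/{project}/{repository}/{image}:{tag}"


def build_worker_pool_specs(image_uri, machine_type="n1-standard-4"):
    """Describe a single-replica worker pool that runs the training container."""
    return [{
        "machine_spec": {"machine_type": machine_type},  # assumed machine type
        "replica_count": 1,
        "container_spec": {"image_uri": image_uri},
    }]


def run_on_vertex_ai(project, region, image_uri, bucket_name):
    """Create a bucket and submit a Vertex AI custom job (needs GCP credentials)."""
    # Imported lazily so the local path works without the GCP libraries.
    from google.cloud import aiplatform, storage

    storage.Client(project=project).create_bucket(bucket_name, location=region)
    aiplatform.init(project=project, location=region,
                    staging_bucket=f"gs://{bucket_name}")
    job = aiplatform.CustomJob(
        display_name="mario-ppo-training",  # hypothetical job name
        worker_pool_specs=build_worker_pool_specs(image_uri),
        base_output_dir=f"gs://{bucket_name}/models",
    )
    job.run()


def dispatch(target, runners):
    """Call the runner registered for the chosen target and return its result."""
    return runners[target]()
```

A workflow step could then call something like `dispatch(parse_args().target, {"local": run_locally, "gcp": run_in_cloud})`, where `run_locally` and `run_in_cloud` are placeholder callables wrapping the local training and `run_on_vertex_ai` respectively.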