# Training an SB3 Agent

This notebook demonstrates how to use PrimAITE to create and train a Stable-Baselines3 (SB3) PPO agent, using a predefined configuration file.

#### First, we import the initial packages and read in our configuration file.

In [None]:
from primaite.game.game import PrimaiteGame
from primaite.session.environment import PrimaiteGymEnv
import yaml

In [None]:
from primaite.config.load import data_manipulation_config_path

In [None]:
with open(data_manipulation_config_path(), 'r') as f:
    cfg = yaml.safe_load(f)
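
Before building the environment, it can help to see what the loaded configuration contains. The helper below is a minimal sketch: `summarise_config` is a hypothetical name, and `example_cfg` is a made-up stand-in whose keys are not taken from the real PrimAITE configuration. In the notebook you would pass `cfg` instead.

```python
def summarise_config(cfg):
    """Return one line per top-level key: name, value type, and size for containers."""
    lines = []
    for key, value in cfg.items():
        # Only dicts and lists get a size; scalars are reported by type alone.
        size = f", {len(value)} entries" if isinstance(value, (dict, list)) else ""
        lines.append(f"{key}: {type(value).__name__}{size}")
    return lines


# Hypothetical stand-in for the loaded config (illustrative keys only).
example_cfg = {"section_a": {"x": 1, "y": 2}, "section_b": [1, 2, 3], "flag": True}

for line in summarise_config(example_cfg):
    print(line)
```

Calling `summarise_config(cfg)` on the real configuration gives a quick overview without printing the whole nested structure.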

Using the given configuration, we generate the environment our agent will train in.

In [None]:
gym = PrimaiteGymEnv(game_config=cfg)

Let's define the training parameters for the agent.

In [None]:
from stable_baselines3 import PPO

EPISODE_LEN = 128
NUM_EPISODES = 10
NO_STEPS = EPISODE_LEN * NUM_EPISODES
BATCH_SIZE = 32
LEARNING_RATE = 3e-4

In [None]:
model = PPO('MlpPolicy', gym, learning_rate=LEARNING_RATE, n_steps=NO_STEPS, batch_size=BATCH_SIZE, verbose=0, tensorboard_log="./PPO_UC2/")
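
To see what these parameters imply for training, the sketch below works through the arithmetic. In PPO, `n_steps` is the number of environment steps collected before each policy update, and each update sweeps the collected data in minibatches of `batch_size` for several epochs. The function name `ppo_update_budget` is hypothetical, and `N_EPOCHS = 10` is an assumption based on SB3's default `n_epochs`, which this notebook does not set. The count also assumes a single environment and no early stopping.

```python
import math

# Values copied from the parameter cell above.
EPISODE_LEN = 128
NUM_EPISODES = 10
NO_STEPS = EPISODE_LEN * NUM_EPISODES  # used as both n_steps and total_timesteps
BATCH_SIZE = 32
N_EPOCHS = 10  # assumption: SB3's default PPO n_epochs, not set in this notebook


def ppo_update_budget(total_timesteps, n_steps, batch_size, n_epochs, n_envs=1):
    """Estimate rollouts and gradient steps for a PPO run (no early stopping)."""
    rollout_size = n_steps * n_envs
    # Rollouts are collected until at least total_timesteps steps have been seen.
    rollouts = math.ceil(total_timesteps / rollout_size)
    # Each epoch sweeps the whole rollout buffer once, in minibatches.
    minibatches_per_epoch = math.ceil(rollout_size / batch_size)
    return {
        "rollouts": rollouts,
        "minibatches_per_epoch": minibatches_per_epoch,
        "gradient_steps": rollouts * n_epochs * minibatches_per_epoch,
    }


budget = ppo_update_budget(NO_STEPS, NO_STEPS, BATCH_SIZE, N_EPOCHS)
print(budget)
```

With these settings the whole run is a single rollout of 1280 steps followed by one policy update. A smaller `n_steps` (for example `EPISODE_LEN`) would give more frequent updates for the same number of timesteps.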

With the agent configured, let's train for our defined number of episodes.

In [None]:
model.learn(total_timesteps=NO_STEPS)

Next, let's save the agent to a zip file that can be used in future evaluation.

In [None]:
model.save("PrimAITE-PPO-example-agent")

Now, we load the saved agent so that it can be evaluated.

In [None]:
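# PPO.load is a class method that rebuilds the model from the saved file,
# so there is no need to construct a fresh PPO instance first.
eval_model = PPO.load("PrimAITE-PPO-example-agent", env=gym)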

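Finally, we evaluate the agent over 10 episodes. By default, `evaluate_policy` returns the mean and standard deviation of the per-episode rewards.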

In [None]:
from stable_baselines3.common.evaluation import evaluate_policy

evaluate_policy(eval_model, gym, n_eval_episodes=10)
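
The two numbers returned by `evaluate_policy` summarise the total reward of each evaluation episode. The sketch below shows that summary in plain Python. The function name `summarise_returns` and the values in `example_returns` are hypothetical and are not results from this agent. It assumes the standard deviation is the population form, which is NumPy's default.

```python
from statistics import fmean, pstdev


def summarise_returns(episode_returns):
    """Return (mean, population standard deviation) of per-episode returns."""
    return fmean(episode_returns), pstdev(episode_returns)


# Hypothetical per-episode returns, for illustration only.
example_returns = [1.0, 2.0, 3.0, 4.0]

mean_reward, std_reward = summarise_returns(example_returns)
print(f"mean reward: {mean_reward:.2f} +/- {std_reward:.2f}")
```

In the notebook, the same pair can be captured directly with `mean_reward, std_reward = evaluate_policy(eval_model, gym, n_eval_episodes=10)`.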