# Train a single-agent system using RLlib

© Crown-owned copyright 2025, Defence Science and Technology Laboratory UK

This notebook demonstrates how to use ``PrimaiteRayEnv`` to train a basic PPO agent on the [UC2 scenario](./Data-Manipulation-E2E-Demonstration.ipynb) with Ray RLlib.

In [1]:
!primaite setup

2025-03-24 09:52:58,122: Performing the PrimAITE first-time setup...
2025-03-24 09:52:58,122: Building the PrimAITE app directories...
2025-03-24 09:52:58,123: Building primaite_config.yaml...
2025-03-24 09:52:58,123: Rebuilding the demo notebooks...
2025-03-24 09:52:58,146: Rebuilding the example notebooks...
2025-03-24 09:52:58,148: PrimAITE setup complete!


In [2]:
import yaml
import ray
from primaite.config.load import data_manipulation_config_path
from primaite.session.ray_envs import PrimaiteRayEnv
from ray.rllib.algorithms.ppo import PPOConfig

# If you get an error saying this config file doesn't exist, you may need to run `primaite setup` in your command line
# to copy the files to your user data path.
with open(data_manipulation_config_path(), 'r') as f:
    cfg = yaml.safe_load(f)

ray.init(local_mode=True)


E0000 00:00:1742809979.025500    7087 cuda_dnn.cc:8579] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
E0000 00:00:1742809979.030316    7087 cuda_blas.cc:1407] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
W0000 00:00:1742809979.043028    7087 computation_placer.cc:177] computation placer already registered. Please check linkage and avoid linking the same target more than once.

2025-03-24 09:53:08,025	INFO worker.py:1788 -- Started a local Ray instance.


| | |
|---|---|
| Python version: | 3.10.16 |
| Ray version: | 2.32.0 |


#### Create a Ray algorithm and pass it our config.

In [3]:
for agent in cfg['agents']:
    if agent["ref"] == "defender":
        agent['agent_settings']['flatten_obs'] = True
env_config = cfg

config = (
    PPOConfig()
    .environment(env=PrimaiteRayEnv, env_config=env_config)
    .env_runners(num_env_runners=0)
    .training(train_batch_size=128)
    .evaluation(evaluation_duration=1)
)
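The loop in the cell above edits the loaded config in place: it sets `flatten_obs` only on the agent whose `ref` is `defender`. As a minimal sketch, the same step can be tried on a stand-in dictionary without PrimAITE installed. The two agent entries and the helper name `enable_flat_obs` are hypothetical and only for illustration; the real YAML has many more fields.

```python
# Stand-in for the loaded YAML config (hypothetical, heavily simplified).
cfg = {
    "agents": [
        {"ref": "attacker", "agent_settings": {}},
        {"ref": "defender", "agent_settings": {"flatten_obs": False}},
    ]
}


def enable_flat_obs(config, agent_ref="defender"):
    """Set flatten_obs=True on every agent whose ref equals agent_ref.

    Returns how many agents were updated, so a mistyped ref shows up
    as 0 instead of silently doing nothing.
    """
    updated = 0
    for agent in config["agents"]:
        if agent["ref"] == agent_ref:
            # setdefault guards against an agent with no agent_settings section
            agent.setdefault("agent_settings", {})["flatten_obs"] = True
            updated += 1
    return updated


count = enable_flat_obs(cfg)
print(count, cfg["agents"][1]["agent_settings"])
```

Returning a count is a design choice of this sketch: the original loop would pass silently if no agent were named `defender`, whereas checking the count makes that case visible.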


#### Start the training

In [4]:
algo = config.build()
results = algo.train()


`UnifiedLogger` will be removed in Ray 2.7.
  return UnifiedLogger(config, logdir, loggers=None)
The `JsonLogger interface is deprecated in favor of the `ray.tune.json.JsonLoggerCallback` interface and will be removed in Ray 2.7.
  self._loggers.append(cls(self.config, self.logdir, self.trial))
The `CSVLogger interface is deprecated in favor of the `ray.tune.csv.CSVLoggerCallback` interface and will be removed in Ray 2.7.
  self._loggers.append(cls(self.config, self.logdir, self.trial))
The `TBXLogger interface is deprecated in favor of the `ray.tune.tensorboardx.TBXLoggerCallback` interface and will be removed in Ray 2.7.
  self._loggers.append(cls(self.config, self.logdir, self.trial))
2025-03-24 09:53:09,122: PrimaiteGymEnv RNG seed = None


2025-03-24 09:53:10,472: Resetting environment, episode 0, avg. reward: 0.0


2025-03-24 09:53:10,474: Saving agent action log to /home/runner/primaite/4.0.0/sessions/2025-03-24/09-53-03/agent_actions/episode_0.json


2025-03-24 09:53:11,745: Resetting environment, episode 1, avg. reward: -43.500000000000085


2025-03-24 09:53:11,746: Saving agent action log to /home/runner/primaite/4.0.0/sessions/2025-03-24/09-53-03/agent_actions/episode_1.json
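`algo.train()` returns a nested dictionary of metrics, and the key layout can differ between Ray releases. A small helper that walks a key path and falls back to a default may make metric lookups less brittle. In the sketch below, the helper `get_metric`, the sample dictionary, its values, and the key names `env_runners` and `episode_reward_mean` are all assumptions for illustration; inspect `results.keys()` in your own session to find the real layout.

```python
def get_metric(results, path, default=None):
    """Follow a sequence of keys into a nested dict, returning default if any key is missing."""
    node = results
    for key in path:
        if not isinstance(node, dict) or key not in node:
            return default
        node = node[key]
    return node


# Illustrative stand-in for a training result; not real output.
sample_results = {
    "training_iteration": 1,
    "env_runners": {"episode_reward_mean": -1.0},
}

mean_reward = get_metric(sample_results, ("env_runners", "episode_reward_mean"))
missing = get_metric(sample_results, ("evaluation", "episode_reward_mean"), default=None)
print(mean_reward, missing)
```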




#### Evaluate the results

In [5]:
# Named eval_results to avoid shadowing Python's built-in eval()
eval_results = algo.evaluate()

2025-03-24 09:53:13,562: Resetting environment, episode 2, avg. reward: -41.20000000000006


2025-03-24 09:53:13,564: Saving agent action log to /home/runner/primaite/4.0.0/sessions/2025-03-24/09-53-03/agent_actions/episode_2.json
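The environment logs a running average reward each time it resets. If those lines are captured, a regular expression can extract the episode number and reward for later plotting or comparison. The sketch below assumes the message format seen in the output of this notebook; the pattern and the helper `parse_rewards` are illustrative and not part of the PrimAITE API.

```python
import re

# Log lines copied from the cell outputs in this notebook.
LOG_LINES = [
    "2025-03-24 09:53:10,472: Resetting environment, episode 0, avg. reward: 0.0",
    "2025-03-24 09:53:11,745: Resetting environment, episode 1, avg. reward: -43.500000000000085",
    "2025-03-24 09:53:13,562: Resetting environment, episode 2, avg. reward: -41.20000000000006",
]

# Captures the episode index and a signed decimal (optionally with an exponent).
PATTERN = re.compile(
    r"episode (\d+), avg\. reward: (-?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?)"
)


def parse_rewards(lines):
    """Return a dict mapping episode number to logged average reward."""
    rewards = {}
    for line in lines:
        match = PATTERN.search(line)
        if match:  # lines in any other format are skipped
            rewards[int(match.group(1))] = float(match.group(2))
    return rewards


rewards = parse_rewards(LOG_LINES)
for episode, reward in sorted(rewards.items()):
    print(f"episode {episode}: {reward:.2f}")
```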
