Reinforcement Learning Guide

SLIM is, in principle, a game in which different agents optimise their profits. We support this by encapsulating the simulator state within an RL environment. However, because Gym's API was not designed for multi-agent reinforcement learning (MARL) scenarios, we adopted PettingZoo.

AEC and PettingZoo

Agent Environment Cycle (AEC) is a type of MARL scheme in which multiple agents act in turns, each exactly once, before the cycle is over. The order in which they act is irrelevant.

Note

Currently, we adopt a simple AEC scheme which does not allow for parallel agent execution.

Each agent performs two steps, in the given order:

  • samples from the observation space
  • performs an action
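
The observe-then-act cycle can be sketched with a toy environment. This is a minimal illustration, not SLIM's or PettingZoo's actual API: ``ToyFarmEnv``, ``run_cycle``, the ``"treat"``/``"wait"`` actions and the lice numbers are all hypothetical. It assumes that state changes are buffered and applied only once every agent has moved.

```python
class ToyFarmEnv:
    """Toy stand-in for an AEC-style environment (hypothetical, not SLIM's class)."""

    def __init__(self, agents):
        self.agents = list(agents)
        self.lice = {name: 10 for name in self.agents}  # made-up per-farm state
        self.pending = {}  # actions buffered until the cycle ends

    def observe(self, agent):
        # Each agent only sees its own farm.
        return {"lice": self.lice[agent]}

    def step(self, agent, action):
        self.pending[agent] = action
        # Apply all buffered actions only once every agent has acted.
        if len(self.pending) == len(self.agents):
            for name, act in self.pending.items():
                if act == "treat":
                    self.lice[name] = max(0, self.lice[name] - 5)
                else:
                    self.lice[name] += 1
            self.pending.clear()


def run_cycle(env, policy):
    """Let every agent act exactly once: observe first, then act."""
    for agent in env.agents:
        obs = env.observe(agent)      # step 1: sample the observation
        env.step(agent, policy(obs))  # step 2: perform an action


env = ToyFarmEnv(["farm_0", "farm_1"])
run_cycle(env, lambda obs: "treat" if obs["lice"] >= 10 else "wait")
print(env.lice)  # {'farm_0': 5, 'farm_1': 5}
```

Because updates are deferred, ``farm_1`` still observes the initial state even though ``farm_0`` has already chosen its action, which is why the acting order does not matter in this sketch.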

Farms' spaces are updated only after all agents have performed an action. Each agent can only reason about its own space, and only has access to a limited subset of what the simulator models. In particular, the simulator exposes the following to an agent:

  • current lice aggregation;
  • fish population;
  • which treatments are being used;
  • how many treatments can still be used within the year;
  • whether the organisation has asked to treat.
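
For illustration, the observation listed above could be modelled as a small record. This is only a sketch: ``FarmObservation``, its field names and types, and the sample values are assumptions made here and may not match the structures SLIM actually uses.

```python
from dataclasses import dataclass, asdict
from typing import Tuple


@dataclass(frozen=True)
class FarmObservation:
    """Hypothetical per-agent observation; names and types are illustrative."""

    lice_aggregation: float              # current lice aggregation
    fish_population: int                 # fish population on the farm
    treatments_in_use: Tuple[bool, ...]  # one flag per treatment type
    treatments_left_this_year: int       # treatments still allowed this year
    asked_to_treat: bool                 # whether the organisation asked to treat


# Made-up values, only to show the shape of one observation.
obs = FarmObservation(
    lice_aggregation=1.8,
    fish_population=40_000,
    treatments_in_use=(True, False),
    treatments_left_this_year=3,
    asked_to_treat=True,
)
print(asdict(obs))
```

Freezing the dataclass mirrors the idea that an agent reads its observation but does not modify the simulator state through it.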

The action space consists of T+2 actions, where T is the number of available treatments. The two extra actions are fallowing and inaction.
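
A short sketch of how such a discrete action space could be enumerated. The helper ``build_actions``, the placeholder treatment names and the ordering (treatments first, then fallowing, then inaction) are assumptions for illustration, not necessarily the layout used by the simulator.

```python
def build_actions(treatments):
    """Return T + 2 action labels: one per treatment, then fallowing and inaction."""
    return list(treatments) + ["fallow", "no_action"]


treatments = ["treatment_a", "treatment_b", "treatment_c"]  # placeholder names
actions = build_actions(treatments)

# Each action is addressed by its integer index, as in a discrete action space.
for index, label in enumerate(actions):
    print(index, label)
```

With T = 3 placeholder treatments this yields five actions, indexed 0 to 4.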

The main logic is implemented in :class:`slim.simulation.simulator.SimulatorPZEnv`.

Policies

A number of policies are defined in :mod:`slim.simulation.simulator`. These are namely:

Additionally, any policy from the stable-baselines package should be supported, although none has been tested yet.

The main policy prediction loop is performed inside :class:`slim.simulation.simulator.Simulator`.
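
Conceptually, a policy maps an agent's observation to an action. The sketch below shows one possible reading of a Bernoulli-style policy, which treats with a fixed probability and otherwise does nothing. The function ``make_bernoulli_policy``, its parameters and the action labels are hypothetical, and the policy of the same name shipped with SLIM may behave differently.

```python
import random

NO_ACTION = "no_action"  # hypothetical label for inaction


def make_bernoulli_policy(p_treat, treatment, seed=None):
    """Build a policy that picks `treatment` with probability `p_treat`."""
    rng = random.Random(seed)  # private generator, so runs can be reproduced

    def policy(observation):
        # This toy policy ignores the observation entirely.
        if rng.random() < p_treat:
            return treatment
        return NO_ACTION

    return policy


policy = make_bernoulli_policy(p_treat=0.3, treatment="treatment_a", seed=0)
choices = [policy(None) for _ in range(10)]
print(choices)
```

Passing a seed keeps the draws reproducible, which helps when comparing strategies across simulation runs.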

To select a policy, set the ``treatment_strategy`` option in the configuration.

For example:

.. tabs::
    .. group-tab:: Command Line

        .. code-block:: bash

            python -m slim.SeaLiceMgmt \
                output_folder/Loch_Fyne \
                config_data/Fyne \
                --treatment-strategy=bernoulli

    .. group-tab:: Python

        .. code-block:: python

            from slim.simulation.config import Config
            from slim.simulation.simulator import Simulator

            # Build the configuration from the config file and the Fyne data folder.
            cfg = Config("config_data/config.json", "config_data/Fyne")
            # Select the policy by name.
            cfg.treatment_strategy = "bernoulli"
            sim = Simulator("output", "Fyne_foobar", cfg)
            sim.run_model()

See :ref:`Environment Config` for details.