# Intro to this package
---

This notebook introduces the features of our package and the dynamical models it uses.

# 1. Setup
---
Uncomment the following line to install `rl4greencrab`.
After installation, restart the Jupyter kernel before using the package and running this notebook.

In [5]:
# %pip install -e ..

# 2. Integral projection model
---

Here we conceptually describe our integral projection model (IPM) of the crab population dynamics.

Our model describes the process in which an agent observes green crab counts by laying traps to catch the crabs. 
Each time-step corresponds to a year's worth of data and has two phases:
First, for nine months, crabs are caught using traps laid by the agent.
Second, for the last three months of the year, the crabs undergo a gestation period, during which no crabs are caught.
This is the timeline of a time-step:
1. The agent receives an observation (nine months' worth of catch data).
2. The agent decides a density of traps to lay for the next year.
3. The population dynamics model evolves somatically for nine months, each month producing an overall catch count observation. This observation is sampled from a distribution that depends on the size-structure of the crab population.
4. New crabs are spawned and grow for three months. The number of newborns is determined by a logistic function (plus a random term). During this period, new crabs also immigrate at a fixed rate.
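
The timeline above can be sketched with a deliberately simplified toy model. This is not the IPM used by `rl4greencrab`: it tracks a single population count instead of a size structure, and the function name `toy_time_step`, all parameter values, the per-trap catch probability, and the logistic-growth form of recruitment are assumptions made only for illustration.

```python
import random


def toy_time_step(population, n_traps, rng,
                  catch_prob_per_trap=0.0005,  # hypothetical value
                  growth_rate=0.5,             # hypothetical value
                  carrying_capacity=5000.0,    # hypothetical value
                  noise_sd=20.0,               # hypothetical value
                  immigrants=50):              # hypothetical value
    """Advance a scalar toy population by one year.

    Returns (monthly_catches, new_population).
    """
    # Chance that a given crab is caught in a given month, assuming each
    # trap catches it independently with probability catch_prob_per_trap.
    p_caught = 1.0 - (1.0 - catch_prob_per_trap) ** n_traps

    monthly_catches = []
    for _ in range(9):  # nine months of trapping
        catch = sum(1 for _ in range(population) if rng.random() < p_caught)
        population -= catch
        monthly_catches.append(catch)

    # Three months without trapping: logistic recruitment plus a random
    # term (assumed form), followed by a fixed number of immigrants.
    recruits = growth_rate * population * (1.0 - population / carrying_capacity)
    recruits += rng.gauss(0.0, noise_sd)
    recruits = max(0, round(recruits))
    population += recruits + immigrants
    return monthly_catches, population


rng = random.Random(0)
catches, population = toy_time_step(population=1000, n_traps=100, rng=rng)
print(catches, population)
```

With `n_traps=0` the catch probability is zero, so the toy model returns nine zero catches, which matches the behaviour one would expect when no traps are laid.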

The following code block shows the input and output of this model.
Our model is implemented as a `gymnasium` environment class so that existing RL algorithms can be applied to it.

In [2]:
from rl4greencrab import greenCrabEnv

gce = greenCrabEnv()

After `gce` is created, the `.reset()` method sets the state of the crab population to its initial value and returns two outputs:

`initial observation`, `info`.

For simplicity, `initial observation` is currently a random sequence of numbers.

In [10]:
observation, info = gce.reset()
observation

array([17., 20., 52., 45., 78., 74., 65., 25., 40.], dtype=float32)

A dynamical step of the IPM is produced by the `.step(action)` method.
Here, the `action` argument takes values in [0, 2000] and corresponds to the number of traps laid; the cells below pass it as a one-element list, for example `gce.step([0])`.
The output is

`observation`, `reward`, `terminated`, `truncated`, `info`

The last three outputs are beyond the scope of this notebook.
The `reward` output is used to train the agent, which looks for strategies that lead to high average rewards over 100 time-steps.

The following call of the step function returns an all-zero observation because no traps are laid.

In [14]:
observation, reward, terminated, truncated, info = gce.step([0])
observation

array([0., 0., 0., 0., 0., 0., 0., 0., 0.], dtype=float32)
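
To make the "average reward over 100 time-steps" idea concrete, here is a minimal sketch of an episode loop. It relies only on the `reset()` and `step(action)` signatures shown in this notebook. The names `average_reward`, `StubEnv`, and `random_policy` are hypothetical, and `StubEnv` with its made-up reward is only a stand-in so the example runs without the package; it is not the crab model.

```python
import random


def average_reward(env, policy, n_steps=100):
    """Run one episode of at most n_steps and return the mean reward per step."""
    observation, info = env.reset()
    total, steps = 0.0, 0
    for _ in range(n_steps):
        action = policy(observation)
        observation, reward, terminated, truncated, info = env.step(action)
        total += reward
        steps += 1
        if terminated or truncated:
            break
    return total / max(steps, 1)


class StubEnv:
    """Hypothetical stand-in with the same reset/step signatures."""

    def reset(self):
        return [0.0] * 9, {}

    def step(self, action):
        n_traps = action[0]
        observation = [float(n_traps)] * 9
        reward = -0.001 * n_traps  # made-up reward: a trapping cost only
        return observation, reward, False, False, {}


def random_policy(observation, rng=random.Random(0)):
    """Ignore the observation and pick a trap count in [0, 2000]."""
    return [rng.uniform(0, 2000)]


print(average_reward(StubEnv(), random_policy))
```

With the package installed, calling `average_reward(gce, random_policy)` should follow the same pattern, assuming the environment accepts a one-element list as its action in the way `gce.step([0])` does above.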

Let's try some other action values and see what observations we obtain:

In [18]:
# 1. Lay a single trap.
_ = gce.reset()
observation, reward, terminated, truncated, info = gce.step([1])
observation

array([0., 1., 0., 2., 1., 1., 0., 2., 1.], dtype=float32)

In [19]:
# 2. Lay 10 traps.
_ = gce.reset()
observation, reward, terminated, truncated, info = gce.step([10])
observation

array([ 7.,  4., 11.,  5.,  8.,  9.,  7.,  3.,  6.], dtype=float32)

In [20]:
# 3. Lay 100 traps.
_ = gce.reset()
observation, reward, terminated, truncated, info = gce.step([100])
observation

array([87., 76., 58., 58., 43., 49., 61., 55., 63.], dtype=float32)
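
The three observations can be compared by summarizing them numerically. The sketch below copies the monthly catches printed above and computes the total catch, the mean monthly catch, and the catch per trap for each action. Each observation is a single random draw after a fresh `reset()`, so these numbers are only indicative; the variable names are illustrative.

```python
# Monthly catches copied from the outputs of cells In [18], In [19], In [20].
catches_by_traps = {
    1: [0, 1, 0, 2, 1, 1, 0, 2, 1],
    10: [7, 4, 11, 5, 8, 9, 7, 3, 6],
    100: [87, 76, 58, 58, 43, 49, 61, 55, 63],
}

summary = {}
for n_traps, catches in catches_by_traps.items():
    total = sum(catches)
    mean_per_month = total / len(catches)
    per_trap = total / n_traps
    summary[n_traps] = (total, mean_per_month, per_trap)
    print(f"{n_traps:>4} traps: total={total:>3}, "
          f"mean/month={mean_per_month:.2f}, per trap={per_trap:.2f}")
```

Averaging over many repeated `reset()` and `step()` calls per action would give a more reliable picture than these single samples.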