# devlog 2023-08-23

_author: Tyler Coles_

IPM parameter broadcasting enables flexibility in the way parameters are specified and interpreted.

## Mo' Better Betas

Imagine a modeler who wishes to run an experiment with a simple SIRS model. This model requires them to provide a value for $\beta$. Depending on what they are trying to do, how they wish to specify $\beta$ may vary widely.

$\beta$ could be a static, scalar value for very simple experiments. Or it could vary by location, vary over time, or vary by time and location simultaneously. These four ways of specifying values all have their purposes, and, rather than write a different IPM for each possible combination, it would be nice if a single IPM could support each of them. Compartment model IPMs in epymorph now have this ability, as long as an IPM's parameters are defined to allow it.

This is how our SIRS model defines its `beta` parameter:

---
```python
# ...
param('beta', shape=Shapes.TxN, dtype=float, allow_broadcasting=True),
# ...
```
---

- `shape=Shapes.TxN` expresses that this parameter should be interpreted by the model as varying by Time (simulation day) and Node (geo location). The shape of the underlying value array will have at least T items in its first axis, and N items in its second axis.
- `allow_broadcasting=True` expresses that the user is allowed to provide any of the shapes which are broadcastable to the specified `shape`; the system will do the necessary translation.
- `dtype=float` expresses that this parameter is interpreted as a floating point value.

_Note: the actual SIRS source code omits the unnecessary arguments: `float` is the default `dtype`, and `allow_broadcasting` defaults to `True`. Be wary: the default shape is `S` (scalar)._

In general, IPMs should declare their parameters as the largest shape they can use. By allowing broadcasting, the user has the option to provide a smaller shape.
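As a rough illustration (plain numpy, not epymorph's API), these are the four kinds of value a user might want to supply for a `TxN` parameter such as `beta`. The sizes `T` and `N` and the numbers themselves are made up for the example.

```python
import numpy as np

# Hypothetical simulation dimensions (assumptions for this example only).
T, N = 5, 3  # 5 simulation days, 3 geo nodes

beta_scalar = 0.4                         # one value for all days and nodes
beta_by_node = np.array([0.3, 0.4, 0.5])  # one value per node: shape (N,)
beta_by_time = np.linspace(0.3, 0.5, T)   # one value per day: shape (T,)
beta_full = np.full((T, N), 0.4)          # one value per day and node: shape (T, N)

for name, value in [
    ("scalar", beta_scalar),
    ("by node", beta_by_node),
    ("by time", beta_by_time),
    ("full", beta_full),
]:
    print(name, np.shape(value))
```

Declaring the parameter as `TxN` with broadcasting allowed is what lets all four inputs feed the same IPM.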

## Available shapes

epymorph supports the following basic shapes, which are relative to the simulation being run:

- `S`: a single, scalar value
- `T`: time-varying
- `N`: node-varying
- `TxN`: time-and-node-varying

And then there are special "arbitrary" shapes that build on the basic shapes.

Imagine a data attribute which defines an age breakdown for a number of locations: that is, counts of individuals divided up by age category. If we had three age categories, each location would have three data points. The shape of this data would be N-by-3. epymorph refers to that "3" as an arbitrary axis, because its length is independent of the simulation context. (It doesn't matter how many geo nodes we have or how many days our simulation runs for; this axis always has 3 values.)

Each of the basic shapes can be "extended" by adding an arbitrary axis to the end:

- `A`: a one-dimensional, arbitrary-length value
- `TxA`: time-varying first axis, arbitrary second axis
- `NxA`: node-varying first axis, arbitrary second axis
- `TxNxA`: time-varying first axis, node-varying second axis, arbitrary third axis

Internally, however, each IPM attribute must reduce to a single value for calculations, so for each attribute the IPM must declare which index (zero-based) of the arbitrary axis it's interested in. It does that like this (example from the sparsemod model):

---
```python
# ...
# These parameters have the same attribute name,
# so they refer to the same value in the input .toml file.
param(symbol_name='omega_1', attribute_name='omega', shape=Shapes.TxNxA(0)),
param(symbol_name='omega_2', attribute_name='omega', shape=Shapes.TxNxA(1)),
# ...
```
---

This sparsemod model is defined to use two subscripted `omega` values. We interpret `A` as being of length (at least) two.

Then in the .toml file `[params]` section we specify something like: `omega = [0.55, 0.05]`

Of course, we could make the user specify those parameters separately (`omega_1 = ...; omega_2 = ...`) but this way is a little nicer to look at.
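To make the index selection concrete, here is a minimal numpy sketch of what `TxNxA(0)` and `TxNxA(1)` conceptually pick out of that `omega` value. This is not epymorph's internal code; `T`, `N`, and the variable names are assumptions for illustration.

```python
import numpy as np

# Hypothetical simulation dimensions (assumptions for this example only).
T, N = 4, 3

# The value from the .toml file: a one-dimensional, arbitrary-length array.
omega = np.array([0.55, 0.05])

# Conceptually, the T and N axes are filled in by broadcasting,
# while the arbitrary axis keeps the values exactly as given.
omega_txna = np.broadcast_to(omega, (T, N, omega.shape[0]))

# Each IPM parameter then reads one index of the arbitrary axis.
omega_1 = omega_txna[:, :, 0]  # what TxNxA(0) would select
omega_2 = omega_txna[:, :, 1]  # what TxNxA(1) would select

print(omega_1.shape, omega_2.shape)
```

Both results are ordinary time-by-node arrays, which is why the IPM can treat `omega_1` and `omega_2` like any other `TxN` parameter.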

## Broadcasting rules

What makes a shape broadcastable from our input to the shape needed by the IPM? (Our rules are not exactly the same as numpy's broadcasting rules!)

The basic intuition is that, when an array of values is needed, you can either provide all array values or we can copy a single value to fill the needed array. A scalar `42` becomes a length-four array: `[42, 42, 42, 42]`.

At higher dimensions, this copy operation applies recursively through the dimensions. A scalar `42` becomes a length-2 array, which then becomes a 2x2 array:

```
[[42, 42],
 [42, 42]]
```

And a length-two array (`[42, 84]`) could become a 2x3 array:

```
[[42, 42, 42],
 [84, 84, 84]]
```

or a 3x2 array:

```
[[42, 84],
 [42, 84],
 [42, 84]]
```

However, a length-four array cannot be broadcast as a 2x3 array: it doesn't match any of the required dimensions!
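The examples above can be reproduced with plain numpy, as a sketch of the intuition rather than of epymorph's own logic. Note that numpy aligns trailing axes, so spreading a length-two array along the first axis of a 2x3 array needs an explicit `np.newaxis`; this is one place where numpy's rules and ours differ.

```python
import numpy as np

# A scalar fills a 2x2 array.
a = np.broadcast_to(42, (2, 2))

v = np.array([42, 84])

# Length-two array matched to the first axis of a 2x3 array.
# numpy needs the extra axis to know which dimension `v` lines up with.
b = np.broadcast_to(v[:, np.newaxis], (2, 3))

# Length-two array matched to the second axis of a 3x2 array.
c = np.broadcast_to(v, (3, 2))

# A length-four array matches neither dimension of a 2x3 array.
try:
    np.broadcast_to(np.arange(4), (2, 3))
    rejected = False
except ValueError:
    rejected = True

print(a, b, c, rejected, sep="\n")
```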

To put this into practice with epymorph's shapes:

- `S`: only a scalar value matches `S` (no extra data allowed)
- `N`: you can provide a scalar value or `N` values (nothing else)
- `T`: you can provide a scalar value or at least `T` values (extra time-series data is allowed!)
- `TxN`: you can provide a scalar value, exactly `N` values, at least `T` values, or a two-dimensional array with at least `T` rows and exactly `N` columns

Arbitrary axes will not be broadcast: you must provide enough values to cover the index the IPM asks for. Otherwise, the above broadcasting rules apply to the other axes. For example, `[4, 5, 6]` will be successfully broadcast to a parameter of shape `TxNxA(2)`: index 2 implies there must be values at indices 0, 1, and 2, so at least three values are required.
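The `TxN` rules can be sketched as a small function. This is a hypothetical helper, not epymorph's implementation: the name `adapt_txn` is invented here, and checking the `N` case before the `T` case is an assumption chosen so that node-varying input takes priority when a one-dimensional value could be read either way. Extra time-series rows are kept rather than trimmed.

```python
import numpy as np


def adapt_txn(value, T, N):
    """Hypothetical sketch: adapt `value` to a time-by-node array."""
    arr = np.asarray(value, dtype=float)
    if arr.ndim == 0:
        # Scalar: the same value for every day and every node.
        return np.broadcast_to(arr, (T, N))
    if arr.ndim == 1 and arr.shape[0] == N:
        # Exactly N values: node-varying, constant over time.
        return np.broadcast_to(arr, (T, N))
    if arr.ndim == 1 and arr.shape[0] >= T:
        # At least T values: time-varying, the same for every node.
        return np.broadcast_to(arr[:, np.newaxis], (arr.shape[0], N))
    if arr.ndim == 2 and arr.shape[0] >= T and arr.shape[1] == N:
        # Already time-by-node; extra rows of time-series data are fine.
        return arr
    raise ValueError(f"shape {arr.shape} cannot be broadcast to TxN = ({T}, {N})")


# Usage: 6 days and 3 nodes (made-up sizes).
print(adapt_txn(0.4, T=6, N=3).shape)
print(adapt_txn([0.3, 0.4, 0.5], T=6, N=3).shape)
print(adapt_txn(np.linspace(0.3, 0.5, 8), T=6, N=3).shape)
```

Because the node check comes first, a six-value input in a 6-day, 6-node simulation is treated as node-varying in this sketch.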

A few notes:

- Both the Time and Arbitrary axes allow extra values, but the Node axis will throw an error if extra values are provided.
- In the ambiguous case where `N` and `T` are equal, `N`-broadcasting "wins". So in a simulation with 6 nodes running for 6 days, a `TxN` parameter input value of `[1,2,3,4,5,6]` will be interpreted as node-varying: the first node having a value of 1 for all time, the second node having a value of 2 for all time, etc.
- On memory efficiency: just because an IPM allows a giant array for each parameter doesn't mean it will use that much memory at runtime. numpy broadcasting uses views, so rather than literally copying values to fill out entire arrays, it merely provides an access layer that adapts the larger view-indexing to the true in-memory indexing.
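
The memory point can be checked directly in numpy. The sketch below uses `np.broadcast_to` with made-up sizes; it shows numpy's view behavior in general, not a measurement of epymorph itself.

```python
import numpy as np

# Hypothetical simulation dimensions (assumptions for this example only).
T, N = 365, 1000

beta = np.array(0.4)                   # a single stored value
view = np.broadcast_to(beta, (T, N))   # looks like a full T-by-N array

print(view.shape)                      # the logical shape seen by the model
print(view.strides)                    # zero strides: every index maps to the same element
print(np.shares_memory(view, beta))    # the view reuses beta's memory
print(view.flags.writeable)            # broadcast views are read-only
```

Zero strides mean that stepping along either axis never moves through memory, so the 365-by-1000 view is backed by the one stored value.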