## Working with FlodymArrays

### Initializing arrays

- A `FlodymArray` is a `numpy` array with dimension management
- It is defined over a `DimensionSet`

In [None]:
import flodym as fd

# Create a dimension set
dims = fd.DimensionSet(
    dim_list=[
        fd.Dimension(name="Region", letter="r", items=["EU", "US", "MEX"]),
        fd.Dimension(name="Product", letter="p", items=["Product A", "Product B"]),
        fd.Dimension(name="Time", letter="t", items=[2020]),
    ]
)

In [None]:
production = fd.FlodymArray(dims=dims)
print(production)

Optional parameters:
- `name` (default `"unnamed"`)
- `values` (default: all zeros)

In [None]:
import numpy as np

flow_a = fd.FlodymArray(name="Flow a", dims=dims["t", "p"], values=0.2 * np.ones((1, 2)))
print(flow_a)

### Math operations

Typical operations in MFA, and their dimensions:

- Summation: $FlowC = FlowA + FlowB$
  - Example: $FlowA$ is differentiated by product type, but $FlowB$ is not
    - There is no information on how to split $FlowB$ over product types
    - => Only keep dimensions that are in both arrays, sum over the others
  - Same for subtraction

- Multiplication: $FlowD = FlowC * ShareD$
  - $ShareD$ can introduce a new dimension that $FlowC$ is split by (e.g. product type)
    - => Keep all dimensions present in either array
  - Performed with `np.einsum`: tile both arrays to the common dimensions, then multiply element-wise
  - Same for division

- Always possible to reduce information in the result, by summing over some dimensions (preserves total mass flow)
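
The two rules above can be illustrated with plain `numpy`, independent of flodym. This is a minimal sketch, not flodym's implementation: the helper names `add_on_shared_dims` and `multiply_on_all_dims` are hypothetical, and each array's dimensions are passed as a string of letters.

```python
import numpy as np


def add_on_shared_dims(a, a_dims, b, b_dims):
    """Sum each array over its non-shared dimensions, then add (hypothetical helper)."""
    shared = "".join(d for d in a_dims if d in b_dims)
    a_reduced = np.einsum(f"{a_dims}->{shared}", a)
    b_reduced = np.einsum(f"{b_dims}->{shared}", b)
    return a_reduced + b_reduced, shared


def multiply_on_all_dims(a, a_dims, b, b_dims):
    """Expand both arrays to the union of their dimensions, then multiply."""
    union = a_dims + "".join(d for d in b_dims if d not in a_dims)
    return np.einsum(f"{a_dims},{b_dims}->{union}", a, b), union


a_tp = 0.2 * np.ones((1, 2))      # dims "tp": 1 time step, 2 products
b_rt = 0.1 * np.ones((3, 1))      # dims "rt": 3 regions, 1 time step
share_rp = 0.5 * np.ones((3, 2))  # dims "rp": 3 regions, 2 products

total, total_dims = add_on_shared_dims(a_tp, "tp", b_rt, "rt")
scaled, scaled_dims = multiply_on_all_dims(a_tp, "tp", share_rp, "rp")

print(total_dims, total.shape)    # only the shared dimension "t" survives
print(scaled_dims, scaled.shape)  # all of "t", "p", "r" are kept
```

Addition keeps only `t`, because neither array says how to split the other over `p` or `r`. Multiplication keeps every dimension, because the second factor supplies the split.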

#### Examples

Preparation: Create two more arrays

In [None]:
flow_b = fd.FlodymArray(dims=dims["r", "t"], values=0.1 * np.ones((3, 1)))
parameter_a = fd.FlodymArray(dims=dims["r", "p"], values=0.5 * np.ones((3, 2)))

Reminder:

In [None]:
print(f"flow a dims: {str(flow_a.dims.letters)}; flow b dims: {str(flow_b.dims.letters)}")

Summation

In [None]:
print(flow_a + flow_b)

What happened here?

- Dimensions present in both arrays (set intersection) are preserved
- All other dimensions are summed over before the addition
- The result is a `FlodymArray`

Same for subtraction:

In [None]:
print(flow_a - flow_b)

Multiplication and division are different:

Keep all dimensions in either of the inputs, i.e. the set union.

In [None]:
# recall:
print(f"flow a dims: {str(flow_a.dims.letters)}; parameter a dims: {str(parameter_a.dims.letters)}")


In [None]:
print(flow_a * parameter_a)

In [None]:
print(flow_a / parameter_a)

### Caveat

- This rule (addition -> dims intersection; multiplication -> dims union) yields the right behavior in *almost* all cases.
- There are exceptions! (e.g. adding two dimensionless parameters) 
- Stay vigilant: Prescribe result dimensions (next slide)

### Control result dimensions 

- pre-define array
- use the ellipsis slice `[...]`:

In [None]:
product = fd.FlodymArray(name="product", dims=dims["r", "p"])
product[...] = flow_a * parameter_a
print(product)

This only works for reducing dimensions. It throws an error if dimensions would have to be added.
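
Why reduction is possible but extension is not can be sketched in plain `numpy`. The helper `project_to` below is hypothetical and only mimics the behavior described here; it is not flodym code. Summing over a dimension is always well defined, whereas adding one would require information on how to split the values.

```python
import numpy as np


def project_to(values, dims, target_dims):
    """Sum `values` over every dimension not in `target_dims` (hypothetical helper)."""
    missing = [d for d in target_dims if d not in dims]
    if missing:
        raise ValueError(f"cannot add dimensions {missing} to a result with dims '{dims}'")
    return np.einsum(f"{dims}->{target_dims}", values)


result_tpr = 0.1 * np.ones((1, 2, 3))             # dims "tpr", e.g. a product of two arrays
reduced_rp = project_to(result_tpr, "tpr", "rp")  # sums over t, reorders axes to (r, p)
print(reduced_rp.shape)

try:
    project_to(reduced_rp, "rp", "rpt")           # "t" would have to be added
    extension_failed = False
except ValueError as err:
    extension_failed = True
    print("error:", err)
```

The total over all entries is the same before and after the projection, which is the mass-preserving property mentioned earlier.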

### Look-ahead: MFASystems 

- Always use the ellipsis slice on the left-hand side of an assignment!
- Why? Flows are stored in a dictionary, and a plain assignment replaces the dictionary entry.

Dictionaries roughly look like this: 

In [None]:
flows = {
    "flow_a": flow_a,
    "flow_b": flow_b,
    "product": product,
}
parameters = {
    "parameter_a": parameter_a,
}

Correct operation:

In [None]:
flows["product"][...] = flows["flow_a"] * parameters["parameter_a"]
print(flows["product"])

Only changes the values of the `FlodymArray` object!

__Wrong__: without the ellipsis slice

=> Whole object overwritten, with uncontrolled outcome (changed name and dimensions)

In [None]:
flows["product"] = flows["flow_a"] * parameters["parameter_a"]
print(flows["product"])
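
The difference between the two assignments is ordinary Python behavior and can be reproduced without flodym. In the sketch below, `NamedValues` is a hypothetical stand-in for an array object that carries metadata; it only supports `[...]` on the left-hand side.

```python
class NamedValues:
    """Hypothetical stand-in for an array object with a name and values."""

    def __init__(self, name, values):
        self.name = name
        self.values = list(values)

    def __setitem__(self, key, other):
        if key is not Ellipsis:
            raise KeyError("this sketch only supports [...]")
        self.values[:] = other.values  # overwrite the values, keep the name


store = {"product": NamedValues("product", [0.0, 0.0])}
result = NamedValues("unnamed", [0.1, 0.2])
original = store["product"]

# With the ellipsis slice: __setitem__ is called on the existing object.
store["product"][...] = result
in_place_same_object = store["product"] is original
in_place_name = store["product"].name

# Without it: the dictionary entry is rebound to the new object.
store["product"] = result
rebound_same_object = store["product"] is original
rebound_name = store["product"].name

print(in_place_same_object, in_place_name)
print(rebound_same_object, rebound_name)
```

In the first case the stored object keeps its identity and name and only receives new values. In the second case the entry now points to the unnamed result object.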

### Slicing

- Access parts of the array with indexing
- replaces numpy indexing
- independent of number and order of dims and their entries

Example:

In [None]:
# recall
print("flow_a dimensions: ", flow_a.dims.letters)

slice_a1 = flow_a["Product A"]
print(slice_a1)

Further capabilities: 
- Slice along several dimensions
- Several items along one dimension
- Alternative dictionary notation to avoid ambiguities


In [None]:
print(flow_a["Product A", 2020])

print(flow_a[{"t": 2020}])

print(flow_a[{"t": 2020, "p": "Product A"}])

Numpy-style positional indexing of the whole object, such as `flow_a[0, :]`, is not supported:
flodym could not tell whether a value like `2020` is an index or an item.
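
How label-based slicing can stay independent of the dimension order is sketched below in plain `numpy`. The lookup table `dim_items` and the helper `slice_by_item` are hypothetical and do not reflect flodym internals. An item is searched in all dimensions, and a dimension letter resolves cases where the item alone would be ambiguous.

```python
import numpy as np

# Hypothetical dimension table: letter -> items, in axis order.
dim_items = {"t": [2020], "p": ["Product A", "Product B"]}
values_tp = np.array([[0.2, 0.3]])  # shape (1, 2)


def slice_by_item(values, dim_items, item, letter=None):
    """Select one item by label; `letter` disambiguates (hypothetical helper)."""
    letters = list(dim_items)
    if letter is None:
        matches = [key for key in letters if item in dim_items[key]]
        if len(matches) != 1:
            raise KeyError(f"item {item!r} matches dimensions {matches}; pass a letter")
        letter = matches[0]
    axis = letters.index(letter)
    index = dim_items[letter].index(item)
    return np.take(values, index, axis=axis)  # drops the sliced axis


by_label = slice_by_item(values_tp, dim_items, "Product B")
by_letter = slice_by_item(values_tp, dim_items, 2020, letter="t")
print(by_label, by_letter)
```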

### Access several items along one dimension

Problem: 
- The dimension can't be dropped, but the slice no longer contains all of its items
- Dimensions should be unique, so the reduced item list can't reuse the original name and letter

Solution:
- create new dimension object with new name and letter
- pass new dimension object to the slice, along with the dimension letter to replace

In [None]:
regions_na = fd.Dimension(name="RegionsNA", letter="n", items=["US", "MEX"])

slice_b1 = flow_b[{"r": regions_na}]
print(slice_b1)

### Indexing on left-hand side of an assignment

Set only part of the values:

In [None]:
flow_b["EU"] = flow_a["Product A"]
print("flow_b.values:\n", flow_b.values)

Several items along one dimension:

In [None]:
flow_b[{"r": regions_na}] = flow_b[{"r": regions_na}] * 3
print(flow_b)

### Read-in

- Values from pandas dataframe
- Performs checks on data, sorts entries
- Switches to allow missing or excess data

In [None]:
import pandas as pd

df = pd.DataFrame.from_dict({
    "Region": ["MEX", "US", "EU"],
    "Time": [2020, 2020, 2020],
    "Value": [1., 2., 3.]
})

In [None]:
print(df)

In [None]:
array_from_df = fd.FlodymArray.from_df(dims=dims["r", "t"], name="Some Dataframe Data", df=df)
print(array_from_df)
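
What "checks and sorting" can mean for such a read-in is sketched below in pure Python. The helper `rows_to_grid` and its `allow_missing` switch are hypothetical, and filling missing entries with `0.0` is an assumption of this sketch rather than documented flodym behavior.

```python
def rows_to_grid(rows, region_items, time_items, allow_missing=False):
    """Arrange (region, time, value) rows in dimension order (hypothetical helper)."""
    lookup = {}
    for region, time, value in rows:
        if region not in region_items or time not in time_items:
            raise ValueError(f"unexpected entry: {(region, time)}")
        if (region, time) in lookup:
            raise ValueError(f"duplicate entry: {(region, time)}")
        lookup[(region, time)] = value

    grid = []
    for region in region_items:
        row = []
        for time in time_items:
            if (region, time) not in lookup and not allow_missing:
                raise ValueError(f"missing entry: {(region, time)}")
            row.append(lookup.get((region, time), 0.0))  # assumed fill value
        grid.append(row)
    return grid


rows = [("MEX", 2020, 1.0), ("US", 2020, 2.0), ("EU", 2020, 3.0)]
grid = rows_to_grid(rows, region_items=["EU", "US", "MEX"], time_items=[2020])
print(grid)  # rows follow the dimension order EU, US, MEX, not the input order
```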

### Plotting and export

#### Plotting: 

- `ArrayPlotter` class (example omitted here)
  - Can create plotly and pyplot plots
  - Makes it quick to map several dimensions to different subplots, line styles, and colors
- Alternative: Export to df, use plotly yourself 

In [None]:
# create a fancy array to plot
fancy_array = array_from_df.cast_to(dims)
fancy_array["Product A"] *= 1.3

In [None]:
import flodym.export as fde

# named `plotter` rather than `plt` to avoid confusion with the usual matplotlib alias
plotter = fde.PlotlyArrayPlotter(
    array=fancy_array,
    intra_line_dim="Region",
    linecolor_dim="Time",
    subplot_dim="Product",
    chart_type="scatter"
)
fig = plotter.plot()
fig.update_traces(marker=dict(size=12))
fig.show()

#### Export: to DataFrames

In [None]:
df_out = array_from_df.to_df()
print(df_out)
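
Conceptually, such an export flattens the array into one row per combination of dimension items ("long" format). The sketch below shows that idea in pure Python; the exact column layout of `to_df` is not shown in this section, so the tuple layout and the names `grid_rt` and `long_rows` are assumptions.

```python
import itertools

region_items = ["EU", "US", "MEX"]
time_items = [2020]
grid_rt = [[3.0], [2.0], [1.0]]  # indexed as grid_rt[region_index][time_index]

# One (region, time, value) row per combination of dimension items.
long_rows = [
    (region, time, grid_rt[i][j])
    for (i, region), (j, time) in itertools.product(
        enumerate(region_items), enumerate(time_items)
    )
]
print(long_rows)
```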

### Subclasses 

- `Parameter` and `StockArray`
  - Behave like `FlodymArray`; the separate names are just for clarity
- `Flow`
  - Links to the processes it connects: `to_process` and `from_process`