## Working with FlodymArrays

### Initializing arrays

- A `FlodymArray` wraps a `numpy` array and adds dimension management
- It is defined over a `DimensionSet`

In [None]:
import flodym as fd

# Create a dimension set
dims = fd.DimensionSet(
    dim_list=[
        fd.Dimension(name="Time", letter="t", items=[2020, 2030, 2050]),
        fd.Dimension(name="Region", letter="r", items=["EU", "US"]),
        fd.Dimension(name="Product", letter="p", items=["Vehicles", "Buildings"]),
        fd.Dimension(name="Grade", letter="g", items=["Carbon Steel", "Alloy Steel"]),
    ]
)

In [None]:
production = fd.FlodymArray(dims=dims["t", "r", "g"])
print(production)

Optional parameters:
- `name` (default `"unnamed"`)
- `values` (default: all zeros)

In [None]:
import numpy as np

values = 0.2 * np.ones(dims["t", "r", "g"].shape)
flow_production = fd.FlodymArray(name="production", dims=dims["t", "r", "g"], values=values)
print(flow_production)

### Math operations

Typical operations in MFA, and their dimensions:

- Addition / Subtraction:
  - Keep only the dimensions present in both arrays; sum over all others

- Multiplication / Division:
  - Keep all dimensions present in either array
  - Broadcast both arrays to full dimension set
  - Multiply element-wise

- The result can always be reduced further by summing over some dimensions (this preserves the total mass flow)
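
The rules above can be sketched in plain `numpy`, without flodym. This is only an illustration of the dimension logic, assuming axes ordered as (t, r, g); the array names and values are made up for the example.

```python
import numpy as np

# Illustrative arrays: production over (t, r, g), losses over (t, r), loss rate over (t,)
production = 0.2 * np.ones((3, 2, 2))
losses = 0.04 * np.ones((3, 2))
loss_rate = 0.15 * np.ones(3)

# Addition / subtraction: sum over the dimension missing in the other array (g),
# so the result lives on the intersection (t, r).
difference = production.sum(axis=2) - losses

# Multiplication / division: broadcast to the union (t, r, g), then multiply element-wise.
product = production * loss_rate[:, np.newaxis, np.newaxis]

# Reducing the result by summing over g leaves the grand total unchanged.
product_tr = product.sum(axis=2)

print(difference.shape)  # (3, 2)
print(product.shape)     # (3, 2, 2)
```

In plain `numpy`, the axis bookkeeping (`axis=2`, `np.newaxis`) has to be written by hand; this is the part that the dimension management takes over.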

#### Examples

Preparation: Create two more arrays

In [None]:
flow_losses = fd.FlodymArray(dims=dims["t", "r"], values=0.04 * np.ones(dims["t", "r"].shape))
prm_loss_rate = fd.FlodymArray(dims=dims["t",], values=0.15 * np.ones(dims["t",].shape))

Reminder:

In [None]:
print(f"production dims: {flow_production.dims.letters}; losses dims: {flow_losses.dims.letters}")

Addition / Subtraction

In [None]:
print(flow_production - flow_losses)

What happened here?

- Dimensions present in both arrays (set intersection) are preserved
- All other dimensions are summed over before the subtraction
- The result is a `FlodymArray`

#### Perform calculation manually 

(subtract numpy arrays)

In [None]:
flow_fabrication = fd.FlodymArray(dims=dims["t", "r"])
flow_fabrication.values[...] = flow_production.sum_to(("t", "r")).values - flow_losses.values
print(flow_fabrication)

Same for Addition:

In [None]:
print(flow_production + flow_losses)

Multiplication and division are different:

Keep all dimensions in either of the inputs, i.e. the set union.

In [None]:
# recall:
print(f"production dims: {flow_production.dims.letters}; loss rate dims: {prm_loss_rate.dims.letters}")


In [None]:
print(flow_production * prm_loss_rate)

Do it manually (multiply numpy arrays element-wise):

In [None]:
numpy_losses = flow_production.values * prm_loss_rate.cast_to(dims["t", "r", "g"]).values
print(numpy_losses)

Same for division (does not make physical sense in this example):

In [None]:
print(flow_production / prm_loss_rate)

Math operations also work with scalars. 
Those are broadcast to full dimensionality.

In [None]:
flow_losses = 0.04 * flow_production
print(flow_losses)

In [None]:
flow_fabrication = (1 - prm_loss_rate) * flow_production
print(flow_fabrication)

Math operations between a `FlodymArray` and a `numpy` array do not work:
the axes of a numpy array have no names, so the dimensions cannot be matched.

In [None]:
try:
    print(prm_loss_rate * flow_production.values)
except Exception as e:
    print(e)

Remedies: 
- use flodym arrays for both
- use numpy arrays for both (use `values` attribute)
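
When working with `numpy` arrays on both sides, the axis alignment becomes your own responsibility. A minimal sketch, assuming axes ordered as (t, r, g) and using made-up values:

```python
import numpy as np

production = 0.2 * np.ones((3, 2, 2))  # axes assumed to be (t, r, g)
loss_rate = 0.15 * np.ones(3)          # axis assumed to be (t,)

# numpy aligns trailing axes, so it tries to match t (length 3) with g (length 2).
try:
    production * loss_rate
    broadcast_failed = False
except ValueError:
    broadcast_failed = True

# The alignment has to be spelled out by hand: t goes first, r and g are broadcast.
losses = production * loss_rate[:, np.newaxis, np.newaxis]
print(broadcast_failed, losses.shape)
```

Here the mismatch at least raises an error. If two dimensions happen to have the same length, `numpy` multiplies along the wrong axis without any warning.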

EXERCISES 1B1 and 1B2

### Caveat

- The dimension handling (addition -> dims intersection; multiplication -> dims union) yields the right behavior in *almost* all cases.
- There are exceptions! (E.g.: adding two shares)
- Stay vigilant: prescribe the result dimensions (next section)
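
To see why adding two shares can go wrong, consider a plain `numpy` sketch. The shares and their values are hypothetical; which result is the intended one depends on what the shares mean in your model.

```python
import numpy as np

# Hypothetical shares (fractions, not mass flows)
share_rg = np.array([[0.3, 0.5], [0.2, 0.4]])  # axes assumed to be (r, g)
share_r = np.array([0.1, 0.1])                 # axis assumed to be (r,)

# Intersection rule: sum over g first, then add. Summing fractions over grades
# has no obvious physical meaning.
by_intersection_rule = share_rg.sum(axis=1) + share_r

# Possible intent: add the regional share to every grade, keeping (r, g).
per_grade = share_rg + share_r[:, np.newaxis]

print(by_intersection_rule)
print(per_grade)
```

Summing over a dimension is safe for flows because mass adds up; for shares and other intensive quantities it usually is not.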

### Control result dimensions 

- pre-define array
- use the ellipsis slice `[...]`:

In [None]:
flow_losses = fd.FlodymArray(name="losses", dims=dims["r", "t"])
flow_losses[...] = flow_production * prm_loss_rate
print(flow_losses)

This only works for reducing dimensions; an error is raised if dimensions would have to be added.

### Look-ahead: MFASystems 

- Important: use the ellipsis slice on the left-hand side!
- Why? Flows are stored in a dictionary.

The dictionaries look roughly like this:

In [None]:
flows = {
    "production": flow_production,
    "losses": flow_losses,
}
parameters = {
    "loss_rate": prm_loss_rate,
}

Correct operation:

In [None]:
flows["losses"][...] = flows["production"] * parameters["loss_rate"]
print(flows["losses"])

Only changes the values of the `FlodymArray` object!

__Wrong__: without the ellipsis slice

=> Whole object overwritten, with uncontrolled outcome (changed name and dimensions)

In [None]:
flows["losses"] = flows["production"] * parameters["loss_rate"]
print(flows["losses"])
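
The difference between the two assignments is ordinary Python behavior and can be reproduced without flodym. A minimal sketch, using plain `numpy` arrays as stand-ins for `FlodymArray` objects (an assumption made only for illustration):

```python
import numpy as np

losses = np.zeros((2, 3))
flows = {"losses": losses}

# In-place assignment: the values are written into the existing object.
flows["losses"][...] = np.ones((2, 3))
same_object = flows["losses"] is losses

# Plain assignment: the dictionary entry is rebound to a brand-new object,
# here with a different shape, and nothing complains.
flows["losses"] = np.full((3, 2), 2.0)
rebound = flows["losses"] is not losses

print(same_object, rebound, losses.shape, flows["losses"].shape)
```

After the plain assignment, anything that still holds a reference to the original object no longer sees the values stored in the dictionary.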

EXERCISES 1B3 to 1B5

### Slicing

- Access parts of the array by indexing with item names
- Replaces numpy indexing
- Independent of the number and order of dims and their entries

Example:

In [None]:
# recall
print("production dimensions: ", flow_production.dims.letters)

slice_a1 = flow_production["EU"]
print(slice_a1)
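
For comparison, the following sketch shows what item-based slicing has to do on a plain `numpy` array. It is not how flodym implements slicing; `dim_items` and `slice_by_item` are hypothetical names introduced only for this illustration.

```python
import numpy as np

# Hypothetical stand-in for a dimension set: letter -> items, in axis order
dim_items = {
    "t": [2020, 2030, 2050],
    "r": ["EU", "US"],
    "g": ["Carbon Steel", "Alloy Steel"],
}
values = np.arange(12, dtype=float).reshape(3, 2, 2)


def slice_by_item(values, dim_items, item):
    """Select `item` along the one dimension that contains it."""
    matches = [letter for letter, items in dim_items.items() if item in items]
    if len(matches) != 1:
        raise KeyError(f"{item!r} found in {len(matches)} dimensions")
    axis = list(dim_items).index(matches[0])
    index = dim_items[matches[0]].index(item)
    return np.take(values, index, axis=axis)


eu = slice_by_item(values, dim_items, "EU")
print(eu.shape)  # (3, 2)
```

The caller never states the axis number or the position of `"EU"`; both are looked up from the item name. The `KeyError` branch covers items that appear in no dimension or in more than one.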

### Further capabilities

Slice along several dimensions:

In [None]:
print(flow_production["EU", 2020])

Alternative: dictionary notation, which avoids ambiguities
(`2020` could be an item of several dimensions; which one should be sliced?)

In [None]:
print(flow_production[{"t": 2020}])

Selecting several items along one dimension requires a new `Dimension` object:

In [None]:
future_time = fd.Dimension(name="Future Time", letter="f", items=[2030, 2050])
print(flow_production[{"t": future_time}])

# if future_time was part of dims:
# flow_production[{"t": dims["f"]}]

### Indexing on left-hand side of an assignment

Only set a part of the values

In [None]:
flow_losses["EU"] = flow_losses["EU"] * 0.8
print(flow_losses.to_df())

EXERCISES 1B6 and 1B7

### Read-in

- Values from pandas dataframe
- Performs checks on data, sorts entries
- Switches to allow missing or excess data

In [None]:
import pandas as pd

df = pd.DataFrame.from_dict({
    "Region": ["US", "EU", "US", "EU", "US", "EU"],
    "Time": [2020, 2020, 2030, 2030, 2050, 2050],
    "Value": [2.6, 1.6, 2.3, 1.3, 2.0, 1.0]
})

In [None]:
print(df)

In [None]:
losses_from_df = fd.FlodymArray.from_df(dims=dims["r", "t"], name="Prescribed Losses", df=df)
print(losses_from_df)
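
To make concrete what sorting the entries involves, here is a plain-Python sketch that places the rows of the table above into an array over (r, t). It does not reproduce how `from_df` works internally; it only illustrates the idea, with the table rows written out as tuples.

```python
import numpy as np

regions = ["EU", "US"]
years = [2020, 2030, 2050]

# Long-format rows (region, year, value), in the same arbitrary order as the table
rows = [
    ("US", 2020, 2.6), ("EU", 2020, 1.6),
    ("US", 2030, 2.3), ("EU", 2030, 1.3),
    ("US", 2050, 2.0), ("EU", 2050, 1.0),
]

# Fill an (r, t) array by looking up each row's position along every dimension.
values = np.full((len(regions), len(years)), np.nan)
for region, year, value in rows:
    values[regions.index(region), years.index(year)] = value

# A remaining NaN would indicate a missing (region, year) combination.
if np.isnan(values).any():
    raise ValueError("missing data for some (region, year) combinations")

print(values)
```

The row order of the input does not matter: each value ends up at the position given by the item order of the dimensions.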

### Plotting and export

#### Plotting: 

- `ArrayPlotter` class (example omitted here)
  - Can create plotly and pyplot plots
  - Can speed up plotting multiple dimensions to different subplots, line styles, colors
- Alternative: Export to df, use plotly yourself 

In [None]:
# create a fancy array to plot
fancy_array = losses_from_df.cast_to(dims["r", "t", "p"])
fancy_array["Vehicles"] *= 1.3

In [None]:
import flodym.export as fde

# named `plotter` rather than `plt` to avoid confusion with the usual matplotlib alias
plotter = fde.PlotlyArrayPlotter(
    array=fancy_array,
    intra_line_dim="Time",
    linecolor_dim="Region",
    subplot_dim="Product"
)
fig = plotter.plot()
fig.show()

#### Export: to DataFrames

In [None]:
df_out = losses_from_df.to_df()
print(df_out)

### Subclasses 

- `Parameter` and `StockArray`
  - No changes in behavior; they exist only for clarity
- `Flow`
  - Contains links to the processes it connects: `to_process` and `from_process`

EXERCISES 1B8 and 1B9