# Array measures

atoti is optimized to handle array data.

## Loading arrays from CSV

atoti can load arrays from CSV files. The separator used between array elements must be passed to the `read_csv` method with `array_sep`, and the CSV columns must use a different separator. All the arrays should have the same length.

In [None]:
import atoti as tt

session = tt.create_session()

In [None]:
store = session.read_csv(
    "data/arrays.csv", keys=["TradeId"], store_name="Store With Arrays", array_sep=";"
)
store.head()
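
To make the expected file layout concrete, here is a minimal sketch in plain Python (not using atoti) of a CSV whose columns are separated by `,` and whose array elements are separated by `;`. The sample rows and the `Continent` values are made up for illustration; only the column names `TradeId`, `Continent` and `PnlArray` come from this notebook.

```python
import csv
import io

# Hypothetical sample content: "," separates columns, ";" separates array elements.
raw = (
    "TradeId,Continent,PnlArray\n"
    "1,Europe,1.5;-2.0;3.25\n"
    "2,Asia,0.5;4.0;-1.0\n"
)

rows = []
for record in csv.DictReader(io.StringIO(raw)):
    # Split the array column on the array separator and parse each element.
    record["PnlArray"] = [float(x) for x in record["PnlArray"].split(";")]
    rows.append(record)

# All the arrays should have the same length.
lengths = {len(row["PnlArray"]) for row in rows}

print(rows[0]["PnlArray"])  # [1.5, -2.0, 3.25]
print(lengths)  # {3}
```

Using the same character for both separators would make the array column impossible to tell apart from regular columns, which is why the two must differ.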

In [None]:
cube = session.create_cube(store, "Cube")

## Default array aggregations

atoti provides default aggregation functions on arrays: `SUM` and `MEAN`. They are applied element by element on the array.

In [None]:
lvl = cube.levels
m = cube.measures
cube.query(m["PnlArray.SUM"], levels=lvl["Continent"])

In [None]:
cube.query(m["PnlArray.MEAN"], levels=lvl["Continent"])
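
As an illustration of what "element by element" means, the following plain Python sketch aggregates several arrays of equal length position by position. The helper names `elementwise_sum` and `elementwise_mean` and the sample values are hypothetical; this is not how atoti computes the aggregation internally.

```python
def elementwise_sum(arrays):
    """Sum arrays of equal length position by position."""
    return [sum(column) for column in zip(*arrays)]


def elementwise_mean(arrays):
    """Average arrays of equal length position by position."""
    return [sum(column) / len(column) for column in zip(*arrays)]


# Two hypothetical PnL arrays belonging to the same continent.
pnl_arrays = [[1.0, 2.0, 3.0], [3.0, 4.0, 5.0]]

print(elementwise_sum(pnl_arrays))  # [4.0, 6.0, 8.0]
print(elementwise_mean(pnl_arrays))  # [2.0, 3.0, 4.0]
```

The result of each aggregation is itself an array with the same length as the inputs.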

## Additional array functions

### Sum, Average, Min or Max of all the array elements

In [None]:
m["sum vect"] = tt.array.sum(m["PnlArray.SUM"])
cube.query(m["sum vect"], levels=lvl["Continent"])

In [None]:
m["mean vect"] = tt.array.mean(m["PnlArray.SUM"])
cube.query(m["mean vect"], levels=lvl["Continent"])

In [None]:
m["min vect"] = tt.array.min(m["PnlArray.SUM"])
cube.query(m["min vect"], levels=lvl["Continent"])

In [None]:
m["max vect"] = tt.array.max(m["PnlArray.SUM"])
cube.query(m["max vect"], levels=lvl["Continent"])

### Length

In [None]:
m["length"] = tt.array.len(m["PnlArray.SUM"])
cube.query(m["length"])

### Variance and Standard Deviation

In [None]:
m["variance"] = tt.array.var(m["PnlArray.SUM"])
m["standard deviation"] = tt.array.std(m["PnlArray.SUM"])
cube.query(m["variance"], m["standard deviation"], levels=lvl["Continent"])

### Sort

In [None]:
m["sort"] = tt.array.sort(m["PnlArray.SUM"])
cube.query(m["sort"], levels=lvl["Continent"])

### Quantile

In [None]:
m["95 quantile"] = tt.array.quantile(m["PnlArray.SUM"], 0.95, mode="simple")
m["95 exc quantile"] = tt.array.quantile(m["PnlArray.SUM"], 0.95, mode="exc")
m["95 inc quantile"] = tt.array.quantile(m["PnlArray.SUM"], 0.95, mode="inc")
m["95 centered quantile"] = tt.array.quantile(m["PnlArray.SUM"], 0.95, mode="centered")
cube.query(
    m["95 quantile"],
    m["95 exc quantile"],
    m["95 inc quantile"],
    m["95 centered quantile"],
    levels=[lvl["Continent"], lvl["Country"]],
)

In [None]:
m["95 linear"] = tt.array.quantile(
    m["PnlArray.SUM"], 0.95, mode="inc", interpolation="linear"
)
m["95 lower"] = tt.array.quantile(
    m["PnlArray.SUM"], 0.95, mode="inc", interpolation="lower"
)
m["95 higher"] = tt.array.quantile(
    m["PnlArray.SUM"], 0.95, mode="inc", interpolation="higher"
)
m["95 nearest"] = tt.array.quantile(
    m["PnlArray.SUM"], 0.95, mode="inc", interpolation="nearest"
)
m["95 midpoint"] = tt.array.quantile(
    m["PnlArray.SUM"], 0.95, mode="inc", interpolation="midpoint"
)
cube.query(
    m["95 linear"], m["95 lower"], m["95 higher"], m["95 nearest"], m["95 midpoint"]
)
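
The interpolation options only matter when the quantile position falls between two elements of the sorted array. The sketch below shows one common reading of the five options in plain Python. It assumes a 0-based position of `(n - 1) * q` (the convention NumPy uses by default), which may not match the position atoti computes for each `mode`; the function `quantile_sketch` is hypothetical, and its tie-breaking rule for `"nearest"` is an arbitrary choice.

```python
import math


def quantile_sketch(values, q, interpolation="linear"):
    """Quantile of `values`, assuming the 0-based position (n - 1) * q."""
    xs = sorted(values)
    k = (len(xs) - 1) * q  # assumed fractional position in the sorted array
    i, j = math.floor(k), math.ceil(k)
    if i == j:
        return xs[i]  # the position is an integer: no interpolation needed
    if interpolation == "linear":
        return xs[i] + (xs[j] - xs[i]) * (k - i)
    if interpolation == "lower":
        return xs[i]
    if interpolation == "higher":
        return xs[j]
    if interpolation == "nearest":
        # Ties go to the lower neighbour here; this is an arbitrary choice.
        return xs[i] if (k - i) <= (j - k) else xs[j]
    if interpolation == "midpoint":
        return (xs[i] + xs[j]) / 2
    raise ValueError(f"unknown interpolation: {interpolation}")


data = [10.0, 20.0, 30.0, 40.0]
# Position is (4 - 1) * 0.25 = 0.75, between 10.0 and 20.0.
for name in ["linear", "lower", "higher", "nearest", "midpoint"]:
    print(name, quantile_sketch(data, 0.25, name))
# linear 17.5, lower 10.0, higher 20.0, nearest 20.0, midpoint 15.0
```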

### n greatest / n lowest

Returns an array with the n greatest/lowest values of another array.

In [None]:
m["Top 3"] = tt.array.n_greatest(m["PnlArray.SUM"], 3)
cube.query(m["Top 3"])

In [None]:
m["Bottom 2"] = tt.array.n_lowest(m["PnlArray.SUM"], 2)
cube.query(m["Bottom 2"])
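
For intuition, the same selection can be sketched on a plain Python list with the standard `heapq` module. The sample values are made up, and the order in which atoti returns the selected elements is not stated here, so the ordering shown is only that of `heapq`.

```python
import heapq

# Hypothetical array of PnL values.
values = [4.0, -1.0, 7.5, 2.0, 9.0]

top_3 = heapq.nlargest(3, values)  # the 3 greatest values, largest first
bottom_2 = heapq.nsmallest(2, values)  # the 2 lowest values, smallest first

print(top_3)  # [9.0, 7.5, 4.0]
print(bottom_2)  # [-1.0, 2.0]
```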

### nth greatest value / nth lowest value

Returns the nth greatest or lowest value of an array.

In [None]:
m["Third largest value"] = tt.array.nth_greatest(m["PnlArray.SUM"], 3)
cube.query(m["Third largest value"])

In [None]:
m["Second smallest value"] = tt.array.nth_lowest(m["PnlArray.SUM"], 2)
cube.query(m["Second smallest value"])
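
A minimal plain Python equivalent, assuming `n` is 1-based (as the measure names "Third largest value" and "Second smallest value" suggest). The helpers `nth_greatest_sketch` and `nth_lowest_sketch` and the sample values are hypothetical.

```python
def nth_greatest_sketch(values, n):
    """Return the nth greatest value, with n assumed to start at 1."""
    return sorted(values, reverse=True)[n - 1]


def nth_lowest_sketch(values, n):
    """Return the nth lowest value, with n assumed to start at 1."""
    return sorted(values)[n - 1]


values = [4.0, -1.0, 7.5, 2.0, 9.0]

print(nth_greatest_sketch(values, 3))  # 4.0
print(nth_lowest_sketch(values, 2))  # 2.0
```

Unlike the n greatest / n lowest functions, the result is a single value rather than an array.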

### Scale

In [None]:
m["scale x10"] = m["PnlArray.SUM"] * 10.0
cube.query(m["scale x10"])

### Element at index

Extract the element at a given index.

In [None]:
m["first element"] = m["PnlArray.SUM"][0]
cube.query(m["first element"], m["PnlArray.SUM"])

With the `create_parameter_hierarchy` function, it is possible to create a hierarchy whose members correspond to the indices of the array. This hierarchy can then be used to "slice" the array and create a measure whose value depends on the selected index.

In [None]:
cube.create_parameter_hierarchy("index", list(range(0, 10)))
m["PnL at index"] = m["PnlArray.SUM"][lvl["index"]]
cube.query(m["PnL at index"], levels=lvl["index"])

You can also build non-integer hierarchies and map each member to its index in the hierarchy using the `index_measure` argument:

In [None]:
from datetime import date, timedelta

cube.create_parameter_hierarchy(
    "Vector Dates",
    [date(2020, 1, 1) + timedelta(days=x) for x in range(0, 10)],
    index_measure="Date Index",
)
m["PnL at date"] = m["PnlArray.SUM"][m["Date Index"]]
cube.query(m["Date Index"], m["PnL at date"], levels=lvl["Vector Dates"])

If the indices need to follow an arbitrary order or range, you can also provide them manually as a list.

In [None]:
cube.create_parameter_hierarchy(
    "Custom Vector Dates",
    [date(2020, 1, 1) + timedelta(days=x) for x in range(0, 10)],
    indices=[9, 8, 7, 6, 5, 0, 1, 2, 3, 4],
    index_measure="Custom Date Index",
)
m["PnL at custom date"] = m["PnlArray.SUM"][m["Custom Date Index"]]
cube.query(
    m["Custom Date Index"], m["PnL at custom date"], levels=lvl["Custom Vector Dates"]
)
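
The mapping between hierarchy members and array positions can be pictured as a simple lookup table. The sketch below reproduces it in plain Python with the same dates and indices as above; the array `pnl` holds made-up values, and the dictionaries are only a mental model, not atoti's implementation.

```python
from datetime import date, timedelta

dates = [date(2020, 1, 1) + timedelta(days=x) for x in range(10)]
indices = [9, 8, 7, 6, 5, 0, 1, 2, 3, 4]

# Each member of the hierarchy is paired with the array position it selects.
index_of = dict(zip(dates, indices))

# Hypothetical aggregated array with ten elements.
pnl = [float(i * 100) for i in range(10)]

# Value of the measure for each date: the array element at the mapped index.
pnl_at_date = {member: pnl[index] for member, index in index_of.items()}

print(index_of[date(2020, 1, 1)], pnl_at_date[date(2020, 1, 1)])  # 9 900.0
print(index_of[date(2020, 1, 6)], pnl_at_date[date(2020, 1, 6)])  # 0 0.0
```

Without the `indices` argument, the members would simply map to their own position in the list: 0 for the first date, 1 for the second, and so on.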

### Sub-arrays

Extract a slice of the array.

In [None]:
m["first 2 elements"] = m["PnlArray.SUM"][0:2]
cube.query(m["first 2 elements"], m["PnlArray.SUM"])

## Load DataFrame with lists

atoti can load a pandas DataFrame whose columns contain NumPy arrays or Python lists.

In [None]:
import pandas as pd
import numpy as np

df = pd.DataFrame(
    {
        "index": [0, 1, 2],
        "python list": [
            [3.2, 1.0, 8, 9, 4.5, 7, 6, 18],
            [4.2, 4.0, 4, 9, 4.5, 8, 7, 8],
            [12, 1.0, 8, 9, 4.5, 7, 6, 18],
        ],
        "numpy array": [
            np.array([3.2, 1.0, 8, 9, 4.5, 7, 6, 18]),
            np.array([4.2, 4.0, 4, 9, 4.5, 8, 7, 8]),
            np.array([12, 1.0, 8, 9, 4.5, 7, 6, 18]),
        ],
    }
)

In [None]:
pd_store = session.read_pandas(df, "Pandas")
pd_store

In [None]:
pd_store.head()