# Manipulating trees

This tutorial demonstrates the manipulations supported by the implemented `TreeClass`.

In [32]:
# Standard Libraries
import pandas as pd
import numpy as np
# Tree package
import hierarchicalearning.TreeClass as treepkg

# Defining a simple (A, (B, C, D)) hierarchy
leaves = ['B', 'C', 'D']
periods = 100
random_data = np.random.rand(periods, 3)
tidx = pd.date_range('2022-01-01', periods=periods, freq='H')
df = pd.DataFrame(random_data, columns=leaves, index=tidx)

df_tree = dict()
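for leaf_id in leaves:
    # Keep each leaf as a single-column DataFrame with a common column name.
    # Renaming without inplace=True avoids modifying a slice of `df`.
    df_tree[leaf_id] = df[[leaf_id]].rename(columns={leaf_id: 'node_timeseries1'})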

## Creating hierarchies

### Spatial trees
A spatial tree can be created from two different structures:
1. a `set` object composed of `Node` class objects representing the elements of the tree
2. a `tuple` object composed of the summation matrix `S` and the identifiers of its associated `y` vector, in that order: `(S, y_ID)`

In [33]:
# Creating the set object `nodes` aggregating all considered tree elements, with their respective parent/child relationships
root = treepkg.Node(id='A',
                    parent=None,
                    children=leaves)
nodes = {root}
for leaf_id in leaves:
    leaf = treepkg.Node(id=leaf_id,
                        parent='A',
                        children=None)
    nodes.add(leaf)

# Creating spatial tree
tree_S = treepkg.Tree(nodes, dimension='spatial')

In [34]:
tree_S.print()

'A'
		'B'
		'C'
		'D'


In [35]:
tree_S.print_flat()

node:  C leaf - parent: A - children: None
node:  D leaf - parent: A - children: None
node:  B leaf - parent: A - children: None
node:  A root - parent: None - children: ['B', 'C', 'D']


When the tree is created from its node elements, the package automatically computes the summation matrix `S`, the identifiers of the `y` vector (`y_ID`), and the k-level mapping (`k_level_map`).

In [36]:
tree_S.S

array([[1, 1, 1],
       [1, 0, 0],
       [0, 1, 0],
       [0, 0, 1]])
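
Each row of `S` marks the leaves that a node sums over, so stacking the leaf values into a vector `b` gives every node value at once as `y = S b`. A minimal NumPy sketch, reusing the matrix printed above; the values in `b` are made up for illustration and are not the tutorial data:

```python
import numpy as np

# Summation matrix of the (A, (B, C, D)) hierarchy, copied from the output above.
S = np.array([[1, 1, 1],
              [1, 0, 0],
              [0, 1, 0],
              [0, 0, 1]])

# Illustrative leaf values, one per column of S.
b = np.array([2.0, 3.0, 5.0])

# All node values at once: the root first, then the three leaves.
y = S @ b
print(y)
```

The first entry of `y` is the root total (10.0 here), and the remaining entries repeat the leaf values unchanged.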

In [37]:
tree_S.y_ID

['A', 'C', 'D', 'B']

In [38]:
tree_S.k_level_map

{3: ['A'], 1: ['C', 'D', 'B']}
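
In the outputs shown here, the keys of `k_level_map` coincide with the row sums of `S`, that is, with the number of leaves each node aggregates. The sketch below rebuilds the mapping under that assumption; `k_level_map_from_S` is a hypothetical helper written for this illustration, not a function of the package.

```python
import numpy as np

def k_level_map_from_S(S, y_ID):
    """Group node IDs by the number of leaves they aggregate (row sums of S)."""
    mapping = {}
    for node_id, k in zip(y_ID, S.sum(axis=1)):
        mapping.setdefault(int(k), []).append(node_id)
    return mapping

S = np.array([[1, 1, 1],
              [1, 0, 0],
              [0, 1, 0],
              [0, 0, 1]])
y_ID = ['A', 'C', 'D', 'B']

k_map = k_level_map_from_S(S, y_ID)
print(k_map)
```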

We can now create the time series hierarchy from a dictionary of `{leaf_id: pandas.DataFrame}` objects, which is aggregated across the tree using the `create_spatial_hierarchy` method.

In [39]:
tree_S.create_spatial_hierarchy(df_tree, columns2aggr=['node_timeseries1'])
tree_S.head(3)

node:  C
                     node_timeseries1
2022-01-01 00:00:00          0.822418
2022-01-01 01:00:00          0.648446
2022-01-01 02:00:00          0.361801
node:  D
                     node_timeseries1
2022-01-01 00:00:00          0.727178
2022-01-01 01:00:00          0.475632
2022-01-01 02:00:00          0.771693
node:  B
                     node_timeseries1
2022-01-01 00:00:00          0.456913
2022-01-01 01:00:00          0.299870
2022-01-01 02:00:00          0.298897
node:  A
                     node_timeseries1
2022-01-01 00:00:00          2.006509
2022-01-01 01:00:00          1.423949
2022-01-01 02:00:00          1.432392
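
In the output above, the root `A` equals the sum of its leaves at each timestamp (for example 0.822418 + 0.727178 + 0.456913 = 2.006509 at midnight). The same aggregation can be sketched with plain pandas on synthetic data. This only illustrates the idea and is not the package's implementation; the seed and the number of periods are arbitrary.

```python
import numpy as np
import pandas as pd

leaves = ['B', 'C', 'D']
tidx = pd.date_range('2022-01-01', periods=4, freq=pd.Timedelta(hours=1))
rng = np.random.default_rng(0)

# One single-column DataFrame per leaf, mirroring the layout of `df_tree`.
df_tree = {
    leaf: pd.DataFrame({'node_timeseries1': rng.random(len(tidx))}, index=tidx)
    for leaf in leaves
}

# The root series is the element-wise sum of its children.
root = sum(df_tree[leaf] for leaf in leaves)
print(root)
```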


### Temporal trees
A temporal tree can be created from two structural input objects:
1. a `dict` object representing the k-level sampling frequencies of the temporal tree, e.g. `{1: '1D', 2: '6H', 3: '1H'}`
2. a `tuple` object composed of the summation matrix `S` and the identifiers of its associated `y` vector, in that order: `(S, y_ID)`

In [40]:
tree_input_structure = {1: '6H', 2: '2H', 3: '1H'}
tree_T = treepkg.Tree(tree_input_structure, dimension='temporal')

In [41]:
tree_T.S

array([[1, 1, 1, 1, 1, 1],
       [1, 1, 0, 0, 0, 0],
       [0, 0, 1, 1, 0, 0],
       [0, 0, 0, 0, 1, 1],
       [0, 0, 1, 0, 0, 0],
       [0, 0, 0, 0, 1, 0],
       [0, 0, 0, 0, 0, 1],
       [0, 0, 0, 1, 0, 0],
       [0, 1, 0, 0, 0, 0],
       [1, 0, 0, 0, 0, 0]])
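
A matrix of this kind can be assembled level by level with Kronecker products: a level whose nodes each span `size` hours contributes `kron(I, ones(1, size))`. The sketch below builds the 6H/2H/1H structure in natural row order. The package orders its rows according to `y_ID`, so the hourly rows printed above appear in a different order; the variable names here are illustrative.

```python
import numpy as np

# Number of hourly (bottom-level) steps inside one cycle of the coarsest level.
n_bottom = 6
# Hours spanned by one node at each level: 6H, 2H, 1H.
block_sizes = [6, 2, 1]

# For each level, kron(I, ones) places a run of `size` ones per node.
blocks = [
    np.kron(np.eye(n_bottom // size, dtype=int), np.ones((1, size), dtype=int))
    for size in block_sizes
]
S_natural = np.vstack(blocks)
print(S_natural)
```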

In [42]:
tree_T.y_ID

['6H_0',
 '2H_0',
 '2H_1',
 '2H_2',
 '1H_2',
 '1H_4',
 '1H_5',
 '1H_3',
 '1H_1',
 '1H_0']

In [43]:
tree_T.k_level_map

{6: ['6H_0'],
 2: ['2H_0', '2H_1', '2H_2'],
 1: ['1H_0', '1H_1', '1H_2', '1H_3', '1H_4', '1H_5']}

The temporal time series hierarchy is created from a single input `pandas.DataFrame` using the `create_temporal_hierarchy` method.

In [44]:
tree_T.create_temporal_hierarchy(df_tree[list(df_tree.keys())[1]], columns2aggr=['node_timeseries1'])
tree_T.head(3)

node:  6H_0
                     node_timeseries1
2022-01-01 00:00:00          2.416109
2022-01-01 06:00:00          2.698158
2022-01-01 12:00:00          4.480967
node:  2H_0
                     node_timeseries1
2022-01-01 00:00:00          1.470864
2022-01-01 06:00:00          1.108819
2022-01-01 12:00:00          1.552218
node:  2H_1
                     node_timeseries1
2022-01-01 02:00:00          0.871142
2022-01-01 08:00:00          0.637602
2022-01-01 14:00:00          1.882327
node:  2H_2
                     node_timeseries1
2022-01-01 04:00:00          0.074103
2022-01-01 10:00:00          0.951738
2022-01-01 16:00:00          1.046422
node:  1H_0
                     node_timeseries1
2022-01-01 00:00:00          0.822418
2022-01-01 06:00:00          0.482615
2022-01-01 12:00:00          0.626919
node:  1H_1
                     node_timeseries1
2022-01-01 01:00:00          0.648446
2022-01-01 07:00:00          0.626204
2022-01-01 13:00:00          0.925299
node:  1H_2
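
The node labels in this output index a position inside each 6-hour cycle: `2H_1` is the second 2-hour block of every cycle (02:00, 08:00, 14:00, ...) and `1H_1` is the second hour (01:00, 07:00, ...). A sketch of that aggregation with NumPy and pandas, assuming the series starts on a cycle boundary and its length is a multiple of 6; it is not the package's implementation and the input values are synthetic.

```python
import numpy as np
import pandas as pd

tidx = pd.date_range('2022-01-01', periods=12, freq=pd.Timedelta(hours=1))
hourly = pd.Series(np.arange(12, dtype=float), index=tidx)

# One row per 6-hour cycle, one column per hour inside the cycle.
cycles = hourly.to_numpy().reshape(-1, 6)

nodes = {'6H_0': pd.Series(cycles.sum(axis=1), index=tidx[0::6])}
for j in range(3):
    # j-th 2-hour block of every cycle, stamped with the block's first hour.
    nodes[f'2H_{j}'] = pd.Series(cycles[:, 2 * j:2 * j + 2].sum(axis=1),
                                 index=tidx[2 * j::6])
for j in range(6):
    # j-th hour of every cycle.
    nodes[f'1H_{j}'] = pd.Series(cycles[:, j], index=tidx[j::6])

print(nodes['2H_1'])
```

Within each cycle, the three `2H_*` values add up to the `6H_0` value, as do the six `1H_*` values.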
    

### Spatiotemporal trees
Spatiotemporal trees are defined using `Multi_Tree`, a parent class of the uni-dimensional `Tree` class. A spatiotemporal tree is created from two previously defined `Tree` objects, one spatial and one temporal.

In [45]:
# Creating the spatiotemporal tree
tree_ST = treepkg.Multi_Tree(tree_S, tree_T)

In [46]:
tree_ST.S

array([[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
       [1, 1, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0],
       [0, 0, 1, 1, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 1, 1, 0, 0],
       [0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 1, 1],
       [0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0],
       [0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0],
       [0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1],
       [0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0],
       [0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0],
       [1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0],
       [1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
       [1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

In [47]:
tree_ST.y_ID

[('A', '6H_0'),
 ('A', '2H_0'),
 ('A', '2H_1'),
 ('A', '2H_2'),
 ('A', '1H_2'),
 ('A', '1H_4'),
 ('A', '1H_5'),
 ('A', '1H_3'),
 ('A', '1H_1'),
 ('A', '1H_0'),
 ('C', '6H_0'),
 ('C', '2H_0'),
 ('C', '2H_1'),
 ('C', '2H_2'),
 ('C', '1H_2'),
 ('C', '1H_4'),
 ('C', '1H_5'),
 ('C', '1H_3'),
 ('C', '1H_1'),
 ('C', '1H_0'),
 ('D', '6H_0'),
 ('D', '2H_0'),
 ('D', '2H_1'),
 ('D', '2H_2'),
 ('D', '1H_2'),
 ('D', '1H_4'),
 ('D', '1H_5'),
 ('D', '1H_3'),
 ('D', '1H_1'),
 ('D', '1H_0'),
 ('B', '6H_0'),
 ('B', '2H_0'),
 ('B', '2H_1'),
 ('B', '2H_2'),
 ('B', '1H_2'),
 ('B', '1H_4'),
 ('B', '1H_5'),
 ('B', '1H_3'),
 ('B', '1H_1'),
 ('B', '1H_0')]
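
The rows and IDs displayed above are consistent with a Kronecker product of the two uni-dimensional structures: the visible rows of `S` match `kron(S_spatial, S_temporal)`, and `y_ID` matches the Cartesian product of the spatial and temporal IDs. The sketch below reproduces this with NumPy from the matrices printed earlier. Whether `Multi_Tree` computes it this way internally is an assumption.

```python
from itertools import product
import numpy as np

S_spatial = np.array([[1, 1, 1],
                      [1, 0, 0],
                      [0, 1, 0],
                      [0, 0, 1]])
y_spatial = ['A', 'C', 'D', 'B']

S_temporal = np.array([[1, 1, 1, 1, 1, 1],
                       [1, 1, 0, 0, 0, 0],
                       [0, 0, 1, 1, 0, 0],
                       [0, 0, 0, 0, 1, 1],
                       [0, 0, 1, 0, 0, 0],
                       [0, 0, 0, 0, 1, 0],
                       [0, 0, 0, 0, 0, 1],
                       [0, 0, 0, 1, 0, 0],
                       [0, 1, 0, 0, 0, 0],
                       [1, 0, 0, 0, 0, 0]])
y_temporal = ['6H_0', '2H_0', '2H_1', '2H_2', '1H_2',
              '1H_4', '1H_5', '1H_3', '1H_1', '1H_0']

# Spatial index varies slowest, temporal index fastest.
S_ST = np.kron(S_spatial, S_temporal)
y_ST = list(product(y_spatial, y_temporal))
print(S_ST.shape, y_ST[:3])
```

The result has 40 rows (4 spatial nodes times 10 temporal nodes) and 18 columns (3 leaves times 6 hourly steps).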

In [48]:
tree_ST.k_level_map

{18: [('A', '6H_0')],
 6: [('A', '2H_0'),
  ('C', '6H_0'),
  ('A', '2H_1'),
  ('A', '2H_2'),
  ('D', '6H_0'),
  ('B', '6H_0')],
 3: [('A', '1H_0'),
  ('A', '1H_1'),
  ('A', '1H_2'),
  ('A', '1H_3'),
  ('A', '1H_4'),
  ('A', '1H_5')],
 2: [('C', '2H_0'),
  ('C', '2H_1'),
  ('C', '2H_2'),
  ('D', '2H_0'),
  ('D', '2H_1'),
  ('D', '2H_2'),
  ('B', '2H_0'),
  ('B', '2H_1'),
  ('B', '2H_2')],
 1: [('C', '1H_0'),
  ('C', '1H_1'),
  ('C', '1H_2'),
  ('C', '1H_3'),
  ('C', '1H_4'),
  ('C', '1H_5'),
  ('D', '1H_0'),
  ('D', '1H_1'),
  ('D', '1H_2'),
  ('D', '1H_3'),
  ('D', '1H_4'),
  ('D', '1H_5'),
  ('B', '1H_0'),
  ('B', '1H_1'),
  ('B', '1H_2'),
  ('B', '1H_3'),
  ('B', '1H_4'),
  ('B', '1H_5')]}
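
The keys of this mapping are the products of a spatial and a temporal k-level (18 = 3 x 6, 6 = 3 x 2 or 1 x 6, and so on). A minimal sketch that combines the two uni-dimensional mappings printed earlier under that observation; the ordering inside each list differs from the package's output, and the variable names are illustrative.

```python
k_spatial = {3: ['A'], 1: ['C', 'D', 'B']}
k_temporal = {6: ['6H_0'],
              2: ['2H_0', '2H_1', '2H_2'],
              1: ['1H_0', '1H_1', '1H_2', '1H_3', '1H_4', '1H_5']}

k_ST = {}
for k_s, spatial_nodes in k_spatial.items():
    for k_t, temporal_nodes in k_temporal.items():
        # A spatiotemporal node aggregates k_s leaves over k_t hourly steps.
        pairs = [(s, t) for s in spatial_nodes for t in temporal_nodes]
        k_ST.setdefault(k_s * k_t, []).extend(pairs)

print({k: len(v) for k, v in k_ST.items()})
```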

The time series hierarchy is then created in a similar fashion to the uni-dimensional trees, here using the `create_ST_hierarchy` method:

In [49]:
tree_ST.create_ST_hierarchy(df_tree, columns2aggr=['node_timeseries1'])
tree_ST.head(2)

node:  ('A', '6H_0')
                     node_timeseries1
2022-01-01 00:00:00          9.167365
2022-01-01 06:00:00          9.493111
node:  ('A', '2H_0')
                     node_timeseries1
2022-01-01 00:00:00          3.430457
2022-01-01 06:00:00          4.202178
node:  ('A', '2H_1')
                     node_timeseries1
2022-01-01 02:00:00          3.190420
2022-01-01 08:00:00          2.732963
node:  ('A', '2H_2')
                     node_timeseries1
2022-01-01 04:00:00          2.546488
2022-01-01 10:00:00          2.557969
node:  ('A', '1H_2')
                     node_timeseries1
2022-01-01 02:00:00          1.432392
2022-01-01 08:00:00          0.785643
node:  ('A', '1H_4')
                     node_timeseries1
2022-01-01 04:00:00          0.940151
2022-01-01 10:00:00          1.768399
node:  ('A', '1H_5')
                     node_timeseries1
2022-01-01 05:00:00          1.606337
2022-01-01 11:00:00          0.789570
node:  ('A', '1H_3')
                     node_timeseri