# State and Mode of the Tensor: Main part of meta information

**Note:** this tutorial assumes that you are familiar with the notion of N-dimensional arrays and basic definitions. The related material can be found in our previous tutorials: [tutorial_1](https://github.com/hottbox/hottbox-tutorials/blob/master/1_N-dimensional_arrays_and_Tensor_class.ipynb) and [tutorial_4](https://github.com/hottbox/hottbox-tutorials/blob/master/4_Ecosystem_of_Tensor_class.ipynb).


**Requirements:** ``hottbox==0.1.3``

**Authors:** 
Ilya Kisil (ilyakisil@gmail.com); 

Meta information about the tensor is represented by the **State** and **Mode** classes.

1. **State** keeps track of the transformations applied to the underlying data array and can be seen as a link between the current form of the data array and the current interpretation of its original modes.
2. **Mode** brings interpretability to the values of the underlying data array.

Without the data array, both of them are standalone classes. But within the ecosystem of the **Tensor** class, they interact with each other and with the data array itself.

Any tensor created using **hottbox** is assigned a default state, which depends on the data array. Each mode of the tensor always has an associated name.
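To make the idea of a default state more tangible, here is a minimal sketch in plain NumPy. The helper `default_meta` is hypothetical and is not part of **hottbox**; it only mirrors the kind of information this tutorial prints for a freshly created tensor (normal shape, default mode names and a single initial transformation).

```python
import numpy as np


def default_meta(data):
    """Hypothetical helper (not a hottbox function).

    Builds the default meta information that one would expect
    for a data array that has not been transformed yet.
    """
    order = data.ndim
    # Every original mode occupies its own position: ([0], [1], ...)
    mode_order = tuple([i] for i in range(order))
    return {
        "normal_shape": data.shape,
        "mode_names": ["mode-{}".format(i) for i in range(order)],
        # Assumption: the initial entry is labelled "Init"
        "transformations": [("Init", mode_order)],
    }


meta = default_meta(np.arange(24).reshape(2, 12))
print(meta["mode_names"])
print(meta["normal_shape"])
print(meta["transformations"])
```

The point of the sketch is that all default meta information can be derived from the shape of the data array alone.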

In [1]:
import numpy as np
import pandas as pd
from hottbox.core import Tensor
from hottbox.pdtools import tensor_to_pd, pd_to_tensor


def print_tensor_state(tensor, data=True, modes=True, transforms=True):
    """Quick util for showing relevant information for this tutorial

    Parameters
    ----------
    tensor : Tensor
    data : bool
        If True, show data array
    modes : bool
        If True, show mode information
    transforms : bool
        If True, show the transformations applied to the tensor
    """
    state = tensor._state
    
    if data:
        print("\tUnderlying data array:")
        print(tensor.data)                
    
    if modes:
        print("\n\tInformation about its modes:")
        for i, tensor_mode in enumerate(tensor.modes):
            print("#{}: {}".format(i, tensor_mode))        
            
    print("\nProperties described by modes: {}".format(tensor.mode_names))
    print("Associated normal shape: {}".format(state.normal_shape))    
    
    if transforms:
        print("\n\t\tApplied transformations:")
        for i, transformation in enumerate(state.transformations):
            print("\tTransformation #{}:".format(i))
            print("Reshaping type: {}".format(transformation[0]))
            print("New mode order: {}\n".format(transformation[1]))


def print_sep_line():
    print("\n==========================="
          "============================="
          "===========================\n")

## Tensor state: Default VS Custom

The same data values can be characterised by different states. By specifying a custom state, we implicitly apply a transformation to the state of the tensor during its creation.
Each transformation is represented by the reshaping type that was used and the resulting order of the modes. The list of **modes** of the tensor is created at tensor initialisation. It depends on the normal shape if a custom state is provided; otherwise, it depends on the shape of the data array.

In [2]:
I, J, K, L = 2, 3, 2, 2

data = np.arange(I*J*K*L).reshape(I, (J*K*L))


custom_state_1 = dict(mode_order=([0], [1, 2]),
                      normal_shape=(I, J, K*L),
                      rtype="T"
                     )
custom_state_2 = dict(mode_order=([0], [1, 2, 3]),
                      normal_shape=(I, J, K, L),
                      rtype="T"
                     )

tensor = Tensor(data)
tensor_1 = Tensor(data, custom_state_1)
tensor_2 = Tensor(data, custom_state_2)

print("\t\t2-D array as a tensor")
print_tensor_state(tensor)

print_sep_line()

print("\t\t3-D array as an unfolded tensor")
print_tensor_state(tensor_1)

print_sep_line()

print("\t\t4-D array as an unfolded tensor")
print_tensor_state(tensor_2)

		2-D array as a tensor
	Underlying data array:
[[ 0  1  2  3  4  5  6  7  8  9 10 11]
 [12 13 14 15 16 17 18 19 20 21 22 23]]

	Information about its modes:
#0: Mode(name='mode-0', index=None)
#1: Mode(name='mode-1', index=None)

Properties described by modes: ['mode-0', 'mode-1']
Associated normal shape: (2, 12)

		Applied transformations:
	Transformation #0:
Reshaping type: Init
New mode order: ([0], [1])



		3-D array as an unfolded tensor
	Underlying data array:
[[ 0  1  2  3  4  5  6  7  8  9 10 11]
 [12 13 14 15 16 17 18 19 20 21 22 23]]

	Information about its modes:
#0: Mode(name='mode-0', index=None)
#1: Mode(name='mode-1', index=None)
#2: Mode(name='mode-2', index=None)

Properties described by modes: ['mode-0', 'mode-1_mode-2']
Associated normal shape: (2, 3, 4)

		Applied transformations:
	Transformation #0:
Reshaping type: Init
New mode order: ([0], [1], [2])

	Transformation #1:
Reshaping type: T
New mode order: ([0], [1, 2])



		4-D array as an unfolded tensor
	Underl

Here we can see that tensors with the same data values are actually in different states and have a different number of modes. These modes have default names, but they can be changed during object creation or by calling **set_mode_names()**, the designated method of the **Tensor** class for changing their names.
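The link between a custom state and the reported properties can be sketched without **hottbox**. The helper `describe_custom_state` below is hypothetical: it assumes that modes grouped together in `mode_order` are merged into one dimension whose size is the product of the grouped sizes, and that their names are joined with an underscore, as in the `'mode-1_mode-2'` entry printed for the 3-D case.

```python
import numpy as np


def describe_custom_state(mode_order, normal_shape, mode_names=None):
    """Hypothetical helper (not a hottbox function).

    Returns the shape and the joined mode names implied by a custom state.
    """
    if mode_names is None:
        mode_names = ["mode-{}".format(i) for i in range(len(normal_shape))]
    # Size of each current dimension is the product of the grouped mode sizes
    current_shape = tuple(
        int(np.prod([normal_shape[m] for m in group])) for group in mode_order
    )
    # Names of grouped modes are joined with an underscore (assumption)
    current_names = ["_".join(mode_names[m] for m in group) for group in mode_order]
    return current_shape, current_names


I, J, K, L = 2, 3, 2, 2
shape_1, names_1 = describe_custom_state(([0], [1, 2]), (I, J, K * L))
shape_2, names_2 = describe_custom_state(([0], [1, 2, 3]), (I, J, K, L))
print(shape_1, names_1)
print(shape_2, names_2)
```

Both states describe a data array of the same shape, which is why the same `(2, 12)` array can be passed with either of them.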

Next, we can bring the tensors for which we specified a **custom state** to their normal form by calling the **fold()** method.

In [3]:
tensor_1.fold()
tensor_2.fold()

print_tensor_state(tensor_1)

print_sep_line()

print_tensor_state(tensor_2)

	Underlying data array:
[[[ 0  1  2  3]
  [ 4  5  6  7]
  [ 8  9 10 11]]

 [[12 13 14 15]
  [16 17 18 19]
  [20 21 22 23]]]

	Information about its modes:
#0: Mode(name='mode-0', index=None)
#1: Mode(name='mode-1', index=None)
#2: Mode(name='mode-2', index=None)

Properties described by modes: ['mode-0', 'mode-1', 'mode-2']
Associated normal shape: (2, 3, 4)

		Applied transformations:
	Transformation #0:
Reshaping type: Init
New mode order: ([0], [1], [2])



	Underlying data array:
[[[[ 0  1]
   [ 2  3]]

  [[ 4  5]
   [ 6  7]]

  [[ 8  9]
   [10 11]]]


 [[[12 13]
   [14 15]]

  [[16 17]
   [18 19]]

  [[20 21]
   [22 23]]]]

	Information about its modes:
#0: Mode(name='mode-0', index=None)
#1: Mode(name='mode-1', index=None)
#2: Mode(name='mode-2', index=None)
#3: Mode(name='mode-3', index=None)

Properties described by modes: ['mode-0', 'mode-1', 'mode-2', 'mode-3']
Associated normal shape: (2, 3, 2, 2)

		Applied transformations:
	Transformation #0:
Reshaping type: Init
New mode 

**Note:** at the moment, only one transformation can be applied to a tensor. This will be generalised in the future. 
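For intuition, folding and unfolding can be mimicked in plain NumPy. The sketch below assumes that the reshaping type `"T"` denotes a row-major (C-order) unfolding, which is consistent with the folded arrays printed above. The functions `unfold` and `fold` here are standalone illustrations, not the **Tensor** methods.

```python
import numpy as np


def unfold(array, mode):
    """Row-major unfolding: move `mode` to the front, flatten the rest."""
    return np.reshape(np.moveaxis(array, mode, 0), (array.shape[mode], -1))


def fold(matrix, mode, normal_shape):
    """Inverse of `unfold`: restore an array of shape `normal_shape`."""
    full_shape = list(normal_shape)
    mode_dim = full_shape.pop(mode)
    full_shape.insert(0, mode_dim)
    return np.moveaxis(np.reshape(matrix, full_shape), 0, mode)


normal_shape = (2, 3, 4)
original = np.arange(24).reshape(normal_shape)

unfolded = unfold(original, mode=0)
restored = fold(unfolded, mode=0, normal_shape=normal_shape)

print(unfolded.shape)
print(np.array_equal(restored, original))
```

Folding needs the normal shape in addition to the matrix itself, which is exactly the information that the state keeps track of.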

## Tensor modes: integration with pandas library

**Hottbox** is equipped with tools to convert multi-index pandas dataframes to tensors and vice versa. You can keep all meta information, keep only the mode names, or drop all of it.

### Multi-index dataframe to Tensor

In [4]:
data = {'Year': [2005, 2005, 2005, 2005, 2010, 2010, 2010, 2010],
        'Month': ['Jan', 'Jan', 'Feb', 'Feb', 'Jan', 'Jan', 'Feb', 'Feb'],
        'Day': ['Mon', 'Wed', 'Mon', 'Wed', 'Mon', 'Wed', 'Mon', 'Wed'],
        'Population': np.arange(8)
       }
df = pd.DataFrame.from_dict(data)
df.set_index(["Year", "Month", "Day"], inplace=True)
df

| Year | Month | Day | Population |
|------|-------|-----|------------|
| 2005 | Jan | Mon | 0 |
| 2005 | Jan | Wed | 1 |
| 2005 | Feb | Mon | 2 |
| 2005 | Feb | Wed | 3 |
| 2010 | Jan | Mon | 4 |
| 2010 | Jan | Wed | 5 |
| 2010 | Feb | Mon | 6 |
| 2010 | Feb | Wed | 7 |


In [5]:
tensor_1 = pd_to_tensor(df)
print_tensor_state(tensor_1, transforms=False)

	Underlying data array:
[[[0 1]
  [2 3]]

 [[4 5]
  [6 7]]]

	Information about its modes:
#0: Mode(name='Year', index=[2005, 2010])
#1: Mode(name='Month', index=['Jan', 'Feb'])
#2: Mode(name='Day', index=['Mon', 'Wed'])

Properties described by modes: ['Year', 'Month', 'Day']
Associated normal shape: (2, 2, 2)


In [6]:
tensor_2 = pd_to_tensor(df, keep_index=False)
print_tensor_state(tensor_2, transforms=False)

	Underlying data array:
[[[0 1]
  [2 3]]

 [[4 5]
  [6 7]]]

	Information about its modes:
#0: Mode(name='Year', index=None)
#1: Mode(name='Month', index=None)
#2: Mode(name='Day', index=None)

Properties described by modes: ['Year', 'Month', 'Day']
Associated normal shape: (2, 2, 2)
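
Conceptually, this conversion amounts to reading the mode names and index values from the levels of the dataframe index and reshaping the column of values accordingly. The following is a rough sketch in plain pandas and NumPy, not the implementation of `pd_to_tensor`. It assumes that the rows enumerate every combination of index values exactly once and in row-major order, as is the case for the dataframe used here.

```python
import numpy as np
import pandas as pd

records = {
    "Year": [2005, 2005, 2005, 2005, 2010, 2010, 2010, 2010],
    "Month": ["Jan", "Jan", "Feb", "Feb", "Jan", "Jan", "Feb", "Feb"],
    "Day": ["Mon", "Wed", "Mon", "Wed", "Mon", "Wed", "Mon", "Wed"],
    "Population": np.arange(8),
}
df = pd.DataFrame.from_dict(records).set_index(["Year", "Month", "Day"])

# Names of the index levels play the role of mode names
mode_names = list(df.index.names)

# Labels of each level, kept in order of first appearance
mode_index = [
    list(df.index.get_level_values(name).unique()) for name in mode_names
]

# One dimension per index level; relies on the row-major assumption above
shape = tuple(len(labels) for labels in mode_index)
values = df["Population"].to_numpy().reshape(shape)

print(mode_names)
print(mode_index)
print(values.shape)
```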


### Tensor to Multi-index dataframe

When a tensor is converted to a multi-index dataframe, the information about its modes is extracted and then used for the column name and index values of the resulting dataframe. Next, we show various ways of specifying names/indices for the modes of the tensor and how this affects the result of the conversion.

In [7]:
# Default meta information
data = np.arange(8).reshape(2, 2, 2)
tensor = Tensor(data)
df = tensor_to_pd(tensor)

print_tensor_state(tensor, data=False, transforms=False)
df


	Information about its modes:
#0: Mode(name='mode-0', index=None)
#1: Mode(name='mode-1', index=None)
#2: Mode(name='mode-2', index=None)

Properties described by modes: ['mode-0', 'mode-1', 'mode-2']
Associated normal shape: (2, 2, 2)


| mode-0 | mode-1 | mode-2 | Values |
|--------|--------|--------|--------|
| 0 | 0 | 0 | 0 |
| 0 | 0 | 1 | 1 |
| 0 | 1 | 0 | 2 |
| 0 | 1 | 1 | 3 |
| 1 | 0 | 0 | 4 |
| 1 | 0 | 1 | 5 |
| 1 | 1 | 0 | 6 |
| 1 | 1 | 1 | 7 |


In [8]:
# Custom mode names
# Can also be passed as a list of names during creation of the tensor
data = np.arange(8).reshape(2, 2, 2)
new_mode_names = {0: "Year",
                  1: "Month",
                  2: "Day"
                 }
tensor = Tensor(data)
tensor.set_mode_names(new_mode_names)
df = tensor_to_pd(tensor)

print_tensor_state(tensor, data=False, transforms=False)
df


	Information about its modes:
#0: Mode(name='Year', index=None)
#1: Mode(name='Month', index=None)
#2: Mode(name='Day', index=None)

Properties described by modes: ['Year', 'Month', 'Day']
Associated normal shape: (2, 2, 2)


| Year | Month | Day | Values |
|------|-------|-----|--------|
| 0 | 0 | 0 | 0 |
| 0 | 0 | 1 | 1 |
| 0 | 1 | 0 | 2 |
| 0 | 1 | 1 | 3 |
| 1 | 0 | 0 | 4 |
| 1 | 0 | 1 | 5 |
| 1 | 1 | 0 | 6 |
| 1 | 1 | 1 | 7 |


In [9]:
# Custom mode index
data = np.arange(8).reshape(2, 2, 2)
tensor = Tensor(data)
new_mode_index = {0: [2005, 2010],
                  1: ["Jan", "Feb"],
                  2: ["Mon", "Wed"],
                 }
tensor.set_mode_index(new_mode_index)
df = tensor_to_pd(tensor)

print_tensor_state(tensor, data=False, transforms=False)
df


	Information about its modes:
#0: Mode(name='mode-0', index=[2005, 2010])
#1: Mode(name='mode-1', index=['Jan', 'Feb'])
#2: Mode(name='mode-2', index=['Mon', 'Wed'])

Properties described by modes: ['mode-0', 'mode-1', 'mode-2']
Associated normal shape: (2, 2, 2)


| mode-0 | mode-1 | mode-2 | Values |
|--------|--------|--------|--------|
| 2005 | Jan | Mon | 0 |
| 2005 | Jan | Wed | 1 |
| 2005 | Feb | Mon | 2 |
| 2005 | Feb | Wed | 3 |
| 2010 | Jan | Mon | 4 |
| 2010 | Jan | Wed | 5 |
| 2010 | Feb | Mon | 6 |
| 2010 | Feb | Wed | 7 |


In [10]:
# Custom mode names, mode index and column name for dataframe
data = np.arange(8).reshape(2, 2, 2)
new_mode_index = {0: [2005, 2010],
                  1: ["Jan", "Feb"],
                  2: ["Mon", "Wed"],
                 }
tensor = Tensor(data, mode_names=["Year", "Month", "Day"])
tensor.set_mode_index(new_mode_index)
df = tensor_to_pd(tensor, col_name="Population")

print_tensor_state(tensor, data=False, transforms=False)
df


	Information about its modes:
#0: Mode(name='Year', index=[2005, 2010])
#1: Mode(name='Month', index=['Jan', 'Feb'])
#2: Mode(name='Day', index=['Mon', 'Wed'])

Properties described by modes: ['Year', 'Month', 'Day']
Associated normal shape: (2, 2, 2)


| Year | Month | Day | Population |
|------|-------|-----|------------|
| 2005 | Jan | Mon | 0 |
| 2005 | Jan | Wed | 1 |
| 2005 | Feb | Mon | 2 |
| 2005 | Feb | Wed | 3 |
| 2010 | Jan | Mon | 4 |
| 2010 | Jan | Wed | 5 |
| 2010 | Feb | Mon | 6 |
| 2010 | Feb | Wed | 7 |
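
The conversions in this section can be approximated in plain pandas, which may help to see where each piece of meta information ends up. The helper `array_to_multiindex_df` below is hypothetical and is not how `tensor_to_pd` is implemented. It assumes that a missing mode index falls back to integer positions and that the default column name is `"Values"`, as in the outputs shown in this tutorial.

```python
import numpy as np
import pandas as pd


def array_to_multiindex_df(data, mode_names, mode_index=None, col_name="Values"):
    """Hypothetical helper (not a hottbox function).

    Flattens `data` in row-major order and labels every value with
    the combination of index values of the modes it belongs to.
    """
    if mode_index is None:
        # Fall back to integer positions along each mode
        mode_index = [list(range(size)) for size in data.shape]
    index = pd.MultiIndex.from_product(mode_index, names=mode_names)
    return pd.DataFrame(data.ravel(), index=index, columns=[col_name])


data = np.arange(8).reshape(2, 2, 2)
df_default = array_to_multiindex_df(data, ["mode-0", "mode-1", "mode-2"])
df_custom = array_to_multiindex_df(
    data,
    mode_names=["Year", "Month", "Day"],
    mode_index=[[2005, 2010], ["Jan", "Feb"], ["Mon", "Wed"]],
    col_name="Population",
)
print(df_custom)
```

Mode names become the names of the index levels, the mode index supplies the labels, and the column name is the only piece of information that does not come from the tensor.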
