## Basic Usage

### Regression Data

The BKTR algorithm only runs on datasets that meet specific criteria. The dataset should contain four dataframes:

  * A dataframe for the covariates (`covariates_df`) with dimensions $ST$ x $c$ and two row index levels: `location` and `time`.
  * A dataframe for the response variable $y$ (`y_df`) with dimensions $S$ x $T$.
  * A dataframe for the spatial point location coordinates (`x_spatial_df`) with dimensions $S$ x $l_s$.
  * A dataframe for the temporal point location coordinates (`x_temporal_df`) with dimensions $T$ x $l_t$.

Where:

  * $S$ is the number of spatial points
  * $T$ is the number of temporal points
  * $l_s$ and $l_t$ are the number of dimensions used to represent the location of spatial and temporal points (respectively).
  * $c$ is the number of covariates (features) used through space and time.

Note: If the data provided does not carry matching labels (via columns and indexes) for the dimensions mentioned above, the `BKTRRegressor` raises a validation error.
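
To make the expected layout concrete, here is a minimal sketch that builds four toy dataframes with matching labels using only pandas and NumPy. The sizes, station names, and column names (`cov_a`, `cov_b`, `time_index`) are hypothetical, and the final check only mimics the kind of label validation described above; it is not the package's own validation code.

```python
import numpy as np
import pandas as pd

# Toy sizes (hypothetical): S spatial points, T temporal points, c covariates
S, T, c = 3, 4, 2
locations = [f'station_{i}' for i in range(S)]
times = pd.date_range('2019-04-15', periods=T).strftime('%Y-%m-%d').tolist()
rng = np.random.default_rng(0)

# Covariates: (S*T) x c, with two row index levels (location, time)
cov_index = pd.MultiIndex.from_product([locations, times], names=['location', 'time'])
covariates_df = pd.DataFrame(
    rng.normal(size=(S * T, c)), index=cov_index, columns=['cov_a', 'cov_b']
)

# Response variable: S x T (locations as rows, time points as columns)
y_df = pd.DataFrame(rng.normal(size=(S, T)), index=locations, columns=times)

# Spatial coordinates: S x l_s (here l_s = 2)
x_spatial_df = pd.DataFrame(
    rng.uniform(size=(S, 2)), index=locations, columns=['longitude', 'latitude']
)

# Temporal coordinates: T x l_t (here l_t = 1)
x_temporal_df = pd.DataFrame({'time_index': np.arange(T)}, index=times)

# The labels must agree across the dataframes on their shared axes
cov_locations = covariates_df.index.get_level_values('location').unique()
cov_times = covariates_df.index.get_level_values('time').unique()
labels_match = (
    set(cov_locations) == set(y_df.index) == set(x_spatial_df.index)
    and set(cov_times) == set(y_df.columns) == set(x_temporal_df.index)
)
print(covariates_df.shape, y_df.shape, x_spatial_df.shape, x_temporal_df.shape)
print('Labels match:', labels_match)
```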

### Data Example

This package ships the same *BIXI* dataset presented in the BKTR article (section 5). We can explore its dataframes to check that they meet the dimension criteria.

##### Let's start by analyzing the dimensions of the BIXI dataset

In [2]:
from pyBKTR.examples.bixi import BixiData

# Load the BIXI data example
bixi_data = BixiData()

print('Lets Explore the BIXI data dimensions: \n')

print('Departure data (y):')
s, t = bixi_data.departure_df.shape
print(f'\t S={s} & T={t}')

print('Covariates')
s, cs = bixi_data.covariates_df.shape
print(f'\t S*T={s} & c={cs}')

print('Spatial points coordinates:')
s, ls = bixi_data.spatial_positions_df.shape
print(f'\t S={s} & ls={ls}')


print('Temporal points coordinates:')
t, lt = bixi_data.temporal_positions_df.shape
print(f'\t T={t} & lt={lt} \n\n')


Lets Explore the BIXI data dimensions: 

Departure data (y):
	 S=587 & T=196
Covariates
	 S*T=115052 & c=18
Spatial points coordinates:
	 S=587 & ls=2
Temporal points coordinates:
	 T=196 & lt=1 




##### Bixi Data - Spatial Labels

In [4]:
bixi_data.covariates_df.index.get_level_values('location').unique()

Index(['7149 - 16e avenue / Jean-Talon', '7148 - Papineau / Émile-Journault',
       '7147 - Métro de Castelnau (de Castelnau / Clark)',
       '7146 - Lusignan / St-Jacques', '7145 - Argyle / Bannantyne',
       '7144 - Hickson / Wellington', '7143 - LaSalle / Godin',
       '7142 - Elgar / de l'Île-des-Sœurs', '7141 - Turgeon / Notre-Dame',
       '7140 - St-Jacques / des Seigneurs',
       ...
       '6004 - du Champ-de-Mars / Gosford', '6003 - Clark / Evans',
       '6002 - Ste-Catherine / Dezery',
       '6001 - Métro Champ-de-Mars (Viger / Sanguinet)',
       '5007 - Métro Longueuil - Université de Sherbrooke',
       '5006 - Collège Édouard-Montpetit (de Gentilly / de Normandie)',
       '4002 - Graham / Wicksteed', '4001 - Graham / Brookfield',
       '4000 - Jeanne-d'Arc / Ontario',
       '10002 - Métro Charlevoix (Centre / Charlevoix)'],
      dtype='object', name='location', length=587)

In [7]:
print('Here is a list of the first 5 spatial points labels:')
print('\t', '\n\t '.join(bixi_data.departure_df.index.to_list()[:5]))

print()

print('The spatial labels should be identical in their corresponding axis.')
has_same_spatial_points_labels = (
    set(bixi_data.departure_df.index.to_list())
    == set(bixi_data.covariates_df.index.get_level_values('location').to_list())
    == set(bixi_data.spatial_positions_df.index.to_list())
)
print(f'And they are {"" if has_same_spatial_points_labels else "not "}identical.')

Here is a list of the first 5 spatial points labels:
	 10002 - Métro Charlevoix (Centre / Charlevoix)
	 4000 - Jeanne-d'Arc / Ontario
	 4001 - Graham / Brookfield
	 4002 - Graham / Wicksteed
	 5006 - Collège Édouard-Montpetit (de Gentilly / de Normandie)

The spatial labels should be identical in their corresponding axis.
And they are identical.


##### Bixi Data - Temporal Labels

In [3]:
print('Here is a list of the first 5 temporal points labels:')
print('\t', '\n\t '.join(bixi_data.departure_df.columns.to_list()[:5]))

print()

print('The temporal labels should be identical in their corresponding axis.')
has_same_temporal_points_labels = (
    bixi_data.departure_df.columns.to_list()
    == bixi_data.temporal_features_df.index.to_list()
    == bixi_data.temporal_positions_df.index.to_list()
)
print(f'And they are {"" if has_same_temporal_points_labels else "not "}identical.')

Here is a list of the first 5 temporal points labels:
	 2019-04-15
	 2019-04-16
	 2019-04-17
	 2019-04-18
	 2019-04-19

The temporal labels should be identical in their corresponding axis.
And they are identical.


#### Bixi Data - Note
The BIXI data used in the BKTR article was originally structured differently from the datasets presented above: the covariates were split into a spatial dataset and a temporal dataset. The spatial covariates did not vary through time (e.g., the population around a station did not change with time) and the temporal covariates did not vary through space (e.g., the temperature was taken to be the same for all the stations studied). We therefore needed to merge the covariates into a single long dataframe. Since this operation might be needed for other datasets, we present it in the *utility* documentation.
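
As an illustration of that merge, here is a minimal pandas-only sketch. It is not the package's utility function: the toy inputs (`spatial_cov`, `temporal_cov`) and their values are hypothetical. Each spatial covariate is repeated for every time point and each temporal covariate for every location.

```python
import pandas as pd

# Hypothetical toy inputs: spatial covariates (S x c_s) and temporal covariates (T x c_t)
spatial_cov = pd.DataFrame(
    {'population': [1200, 800]},
    index=pd.Index(['station_a', 'station_b'], name='location'),
)
temporal_cov = pd.DataFrame(
    {'temperature': [11.5, 14.0, 9.0]},
    index=pd.Index(['2019-04-15', '2019-04-16', '2019-04-17'], name='time'),
)

# Every (location, time) pair, in location-major order
full_index = pd.MultiIndex.from_product(
    [spatial_cov.index, temporal_cov.index], names=['location', 'time']
)

# Repeat the spatial rows over time and the temporal rows over space
spatial_long = spatial_cov.loc[full_index.get_level_values('location')]
temporal_long = temporal_cov.loc[full_index.get_level_values('time')]
spatial_long.index = full_index
temporal_long.index = full_index

# Single long dataframe with dimensions (S*T) x (c_s + c_t)
covariates_df = pd.concat([spatial_long, temporal_long], axis=1)
print(covariates_df.shape)
```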

### Running BKTR
Once the dataframes have been loaded with the right dimensions, we can easily run the BKTR algorithm on our dataset.

In [4]:
from pyBKTR.bktr import BKTRRegressor

bktr_regressor = BKTRRegressor(
    spatial_covariates_df=bixi_data.spatial_features_df,
    temporal_covariates_df=bixi_data.temporal_features_df,
    y_df=bixi_data.departure_df,
    rank_decomp=6,
    burn_in_iter=5,
    sampling_iter=5,
    spatial_positions_df=bixi_data.spatial_positions_df,
    temporal_positions_df=bixi_data.temporal_positions_df,
)
bktr_regressor.mcmc_sampling();

** Result for iter 1    : elapsed time is 0.7909 || total sq error is 2092.6414 || mae is 0.1006 || rmse is 0.1446 **
** Result for iter 2    : elapsed time is 0.4751 || total sq error is 823.4394 || mae is 0.0665 || rmse is 0.0907 **
** Result for iter 3    : elapsed time is 0.9002 || total sq error is 664.3007 || mae is 0.0598 || rmse is 0.0815 **
** Result for iter 4    : elapsed time is 0.3751 || total sq error is 648.9187 || mae is 0.0590 || rmse is 0.0805 **
** Result for iter 5    : elapsed time is 0.6862 || total sq error is 631.2584 || mae is 0.0582 || rmse is 0.0794 **
** Result for iter 6    : elapsed time is 0.5136 || total sq error is 608.4890 || mae is 0.0573 || rmse is 0.0780 **
** Result for iter 7    : elapsed time is 0.5314 || total sq error is 601.3555 || mae is 0.0569 || rmse is 0.0775 **
** Result for iter 8    : elapsed time is 0.6055 || total sq error is 598.6233 || mae is 0.0568 || rmse is 0.0773 **
** Result for iter 9    : elapsed time is 2.1064 || total sq er

It is worth noting that the package uses default kernels and distance calculation methods that we expect most users will want. By default, the `BKTRRegressor` uses a Matérn $\frac{3}{2}$ kernel with the haversine distance for the spatial coordinates, meaning that the `spatial_kernel_x` should have dimensions $S$ x $2$ for (longitude, latitude). For the temporal coordinates, it uses a squared exponential (SE) kernel with linear distance, meaning that the `temporal_kernel_x` should have dimensions $T$ x $1$.
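
For intuition about these defaults, the following NumPy sketch computes a Matérn 3/2 covariance from haversine distances and an SE covariance from linear distances, using the textbook forms of both kernels. It is not the package's implementation: the function names, the lengthscale values, the Earth radius constant, and the coordinates are all assumptions made for illustration.

```python
import numpy as np

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius


def haversine_distance(lon_lat_deg):
    """Pairwise great-circle distances (km) for an (S, 2) array of (longitude, latitude) in degrees."""
    lon, lat = np.radians(lon_lat_deg).T
    dlon = lon[:, None] - lon[None, :]
    dlat = lat[:, None] - lat[None, :]
    a = (
        np.sin(dlat / 2) ** 2
        + np.cos(lat)[:, None] * np.cos(lat)[None, :] * np.sin(dlon / 2) ** 2
    )
    return 2 * EARTH_RADIUS_KM * np.arcsin(np.sqrt(np.clip(a, 0.0, 1.0)))


def matern_32(dist, lengthscale):
    """Matern 3/2 kernel: (1 + sqrt(3) d / l) * exp(-sqrt(3) d / l)."""
    scaled = np.sqrt(3.0) * dist / lengthscale
    return (1.0 + scaled) * np.exp(-scaled)


def squared_exponential(dist, lengthscale):
    """SE kernel: exp(-d^2 / (2 l^2))."""
    return np.exp(-0.5 * (dist / lengthscale) ** 2)


# Hypothetical coordinates: S x 2 (longitude, latitude) and T x 1 (time index)
spatial_x = np.array([[-73.57, 45.50], [-73.60, 45.53], [-73.55, 45.52]])
temporal_x = np.arange(4, dtype=float)[:, None]

spatial_cov = matern_32(haversine_distance(spatial_x), lengthscale=2.0)
temporal_cov = squared_exponential(np.abs(temporal_x - temporal_x.T), lengthscale=1.5)
print(spatial_cov.shape, temporal_cov.shape)
```

Both results are symmetric matrices with ones on the diagonal, of size $S$ x $S$ and $T$ x $T$ respectively.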

Also, the `rank_decomp` parameter represents the rank of the decomposition used in the BKTR algorithm. In general, a higher rank can give more precise results at the cost of computation time. The `burn_in_iter` and `sampling_iter` parameters determine the number of iterations of the algorithm: `burn_in_iter` is the number of iterations run before sampling starts (which helps the parameters converge before the sampling phase), and `sampling_iter` is the number of iterations used for sampling.
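
The split between burn-in and sampling iterations can be illustrated with a toy Markov chain. This is only a sketch of the general idea and not the BKTR sampler: the `run_chain` function and its update rule are hypothetical stand-ins. The chain starts far from its stationary region, the burn-in draws are discarded, and only the later draws are kept.

```python
import numpy as np


def run_chain(burn_in_iter, sampling_iter, seed=0):
    """Toy Markov chain: discard the burn-in draws, keep the sampling draws."""
    rng = np.random.default_rng(seed)
    state = 5.0  # deliberately poor starting value
    kept = []
    for it in range(burn_in_iter + sampling_iter):
        # Hypothetical update: pull the state toward 0 and add a little noise
        state = 0.5 * state + rng.normal(scale=0.1)
        if it >= burn_in_iter:
            kept.append(state)  # only the sampling phase contributes
    return np.array(kept)


samples = run_chain(burn_in_iter=5, sampling_iter=5)
print(len(samples), samples.mean())
```

With `burn_in_iter=0`, the early draws, still influenced by the starting value, would be included in the estimate.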

In the next section, we will demonstrate how to use different kernels and distance matrices to better fit your own data.