## This notebook is for ***testing*** purposes while I work on recreating the model proposed in [this paper](https://ieeexplore.ieee.org/document/8751972) with Jupyter Lab/Notebook, Python, and scikit-learn.

In [None]:
from sklearn import preprocessing
import numpy as np

### Testing out preprocessing techniques as described [here](https://scikit-learn.org/stable/modules/preprocessing.html) to get a feel for them before I apply them to the dataset:

In [None]:
X_train = np.array([[ 1., -1.,  2.],
                   [ 2.,  0.,  0.],
                   [ 0.,  1., -1.]])
scaler = preprocessing.StandardScaler().fit(X_train)
scaler

In [None]:
scaler.mean_

array([1.        , 0.        , 0.33333333])

In [None]:
scaler.scale_

array([0.81649658, 0.81649658, 1.24721913])

In [None]:
X_scaled = scaler.transform(X_train)
X_scaled

array([[ 0.        , -1.22474487,  1.33630621],
       [ 1.22474487,  0.        , -0.26726124],
       [-1.22474487,  1.22474487, -1.06904497]])

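After fitting the `StandardScaler` utility class from the `preprocessing` module on the "array-like" dataset above and calling `transform`, each column (feature) has been standardized: its mean (`scaler.mean_`) is subtracted and the result is divided by its standard deviation (`scaler.scale_`).
Also, "Scaled data has zero mean and unit variance"; keep this in mind.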
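A quick check of the "zero mean and unit variance" claim, using the same toy array as above. This is only a sanity-check sketch; the names `col_means`, `col_stds`, and `X_manual` are mine and not from the documentation.

```python
import numpy as np
from sklearn import preprocessing

# Same toy dataset as in the cells above
X_train = np.array([[1., -1., 2.],
                    [2., 0., 0.],
                    [0., 1., -1.]])

scaler = preprocessing.StandardScaler().fit(X_train)
X_scaled = scaler.transform(X_train)

# Per-column statistics of the scaled data
col_means = X_scaled.mean(axis=0)  # should be (numerically) zero
col_stds = X_scaled.std(axis=0)    # should be (numerically) one

# The same transformation written out by hand: (x - mean) / std
X_manual = (X_train - X_train.mean(axis=0)) / X_train.std(axis=0)

print(col_means)
print(col_stds)
print(np.allclose(X_scaled, X_manual))
```

The means may print as tiny floating-point values rather than exact zeros, which is expected.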

The following is an example from the documentation. `StandardScaler` implements the `Transformer` API, which computes the mean and standard deviation on a training set so that the same transformation can later be re-applied to the testing set.
This is probably something you'll wanna keep in mind, and it makes the class "suitable for use in the early steps of a [pipeline](https://scikit-learn.org/stable/modules/generated/sklearn.pipeline.Pipeline.html#sklearn.pipeline.Pipeline)":

In [None]:
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

In [None]:
X, y = make_classification(random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)
pipe = make_pipeline(StandardScaler(), LogisticRegression())
pipe.fit(X_train, y_train)  # apply scaling on training data

In [None]:
pipe.score(X_test, y_test) # apply scaling on testing data, without leaking training data

0.96
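To get a feel for what the pipeline is doing under the hood, here is a rough sketch of the same steps done by hand: fit the scaler on the training split only, then reuse those training statistics to transform the test split. The variable names (`X_train_scaled`, `manual_score`, `pipe_score`, and so on) are my own, and this is my reading of the workflow rather than anything taken from the documentation.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = make_classification(random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# Manual version: the scaler only ever sees the training split when fitting
scaler = StandardScaler().fit(X_train)
X_train_scaled = scaler.transform(X_train)
X_test_scaled = scaler.transform(X_test)  # reuses the training mean/std

clf = LogisticRegression().fit(X_train_scaled, y_train)
manual_score = clf.score(X_test_scaled, y_test)

# Pipeline version, for comparison
pipe = make_pipeline(StandardScaler(), LogisticRegression())
pipe.fit(X_train, y_train)
pipe_score = pipe.score(X_test, y_test)

print(manual_score, pipe_score)
```

Both scores should match, which suggests the pipeline is mostly a convenience that keeps the fit-on-train, transform-on-test bookkeeping in one object.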

Something that seems worth noting here, as is mentioned in the documentation, is that it's possible to disable centering or scaling by passing `with_mean=False` or `with_std=False` to the constructor of `StandardScaler`.
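A small sketch of those two options on the toy array from earlier, just to see what each one does. The variable names are mine, and the comparisons against hand-written formulas are my own check, not something from the documentation.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

X_train = np.array([[1., -1., 2.],
                    [2., 0., 0.],
                    [0., 1., -1.]])

# with_mean=False: skip centering, only divide by the per-column std
X_no_center = StandardScaler(with_mean=False).fit_transform(X_train)

# with_std=False: skip scaling, only subtract the per-column mean
X_no_scale = StandardScaler(with_std=False).fit_transform(X_train)

# Compare against the same operations written by hand
print(np.allclose(X_no_center, X_train / X_train.std(axis=0)))
print(np.allclose(X_no_scale, X_train - X_train.mean(axis=0)))
```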

### Now, we're gonna do `"Scaling features to a range"` ([6.3.1.1](https://scikit-learn.org/stable/modules/preprocessing.html))

According to the documentation, an alternative to `standardization` is scaling features to lie between a given minimum and maximum value, often between zero and one, or so that the maximum absolute value of each feature is scaled to unit size.
This can be done using the `MinMaxScaler` or `MaxAbsScaler`, respectively.
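It seems that the motivation for using this scaling includes robustness to very small standard deviations of features and preserving zero entries in sparse data.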
`note to self: get back to playing around with this section`
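As a starting point for that, a minimal sketch of both scalers on the same toy array used earlier. It uses the default `feature_range` of `(0, 1)` for `MinMaxScaler`; the variable names are my own.

```python
import numpy as np
from sklearn.preprocessing import MaxAbsScaler, MinMaxScaler

X_train = np.array([[1., -1., 2.],
                    [2., 0., 0.],
                    [0., 1., -1.]])

# MinMaxScaler: each column is mapped so that its min -> 0 and its max -> 1
min_max_scaler = MinMaxScaler()
X_minmax = min_max_scaler.fit_transform(X_train)

# MaxAbsScaler: each column is divided by its largest absolute value,
# so values end up in [-1, 1] and zeros stay zero
max_abs_scaler = MaxAbsScaler()
X_maxabs = max_abs_scaler.fit_transform(X_train)

print(X_minmax)
print(X_maxabs)
```

As with `StandardScaler`, a fitted scaler can then be applied to new data with `transform`, so the test set is scaled with the training set's minimum and maximum.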