# Scikit-Learn and OOP

In [1]:
import numpy as np

Most of your interaction with `sklearn` follows the same pattern: you instantiate an object of a given class, and then you call methods on that object such as `.fit()`, `.predict()`, `.score()`, and `.transform()`.

Let's look at a case of this that is already very familiar:

In [2]:
from sklearn.preprocessing import StandardScaler

We'll start by instantiating a `StandardScaler` object:

In [3]:
ss = StandardScaler()

Of course, we could have called it anything we wanted:

In [4]:
greg = StandardScaler()

Are these two objects the same?

In [5]:
ss == greg

False

Of course not! They are two separate instances of the same class, and I can create as many `StandardScaler` objects as I want.
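To make the distinction between "same object" and "same settings" concrete, here is a small sketch. The names `first` and `second` are just illustrative; `get_params()` is the standard scikit-learn estimator method that returns the constructor arguments as a dictionary.

```python
from sklearn.preprocessing import StandardScaler

first = StandardScaler()
second = StandardScaler()

# Identity: are these two names bound to the very same object?
same_object = first is second

# Equality: the notebook above shows this evaluating to False for two instances.
equal_objects = first == second

# Both objects are instances of the same class...
same_class = type(first) is type(second)

# ...and both were built with the same default constructor arguments.
same_settings = first.get_params() == second.get_params()

print(same_object, equal_objects, same_class, same_settings)
```

So the two scalers are distinct objects that happen to be configured identically, which is why fitting one of them leaves the other untouched.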

What attributes and methods are available for a `StandardScaler` object? Let's check out the code on [GitHub](https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/preprocessing/_data.py)!

## Attributes

### `.scale_`

In [6]:
greg.scale_

AttributeError: 'StandardScaler' object has no attribute 'scale_'

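Some attributes don't exist until we've fitted the object. By scikit-learn convention, attributes that are learned from the data have names ending in an underscore, like `scale_`, and they are only set when we call `.fit()`.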
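If we'd rather check for a fit than trigger an `AttributeError`, one option is sketched below. It uses `hasattr` together with scikit-learn's `check_is_fitted` helper and `NotFittedError` exception; the tiny one-column dataset is made up for the illustration.

```python
from sklearn.exceptions import NotFittedError
from sklearn.preprocessing import StandardScaler
from sklearn.utils.validation import check_is_fitted

scaler = StandardScaler()

# Before fitting, the learned attribute is simply not there.
has_scale_before = hasattr(scaler, "scale_")

# check_is_fitted raises NotFittedError for an estimator that has not been fitted.
try:
    check_is_fitted(scaler)
    raised = False
except NotFittedError:
    raised = True

# Fit on a tiny made-up dataset (two samples, one feature).
scaler.fit([[0.0], [2.0]])
has_scale_after = hasattr(scaler, "scale_")

print(has_scale_before, raised, has_scale_after)
```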

In [7]:
X1, X2 = np.random.normal(size=20), np.random.normal(loc=2, size=20)

In [8]:
X = list(zip(X1, X2))
X

[(-1.4359812800263534, 4.199987532348802),
 (2.064298131814442, 2.5537868030218163),
 (-0.2535539939207522, 2.9951284356399084),
 (-0.9214140151624111, 2.185259735787001),
 (0.5533208453536919, 2.030173953885489),
 (1.9113351490640125, 2.7999187502850265),
 (0.355748226320544, 2.7761078775747703),
 (1.6488317692657382, 2.1941486248496385),
 (0.7859100037299711, 1.201918057083422),
 (-1.4271970055532468, 2.103159012731747),
 (-0.8037460164449384, 1.560978112765135),
 (0.8175361419450755, 1.6668405742643655),
 (1.216948324988673, 2.066953776035555),
 (-0.8592322310261523, 0.7322851851425309),
 (0.6335823268417408, 1.8636182218716615),
 (1.3311041818969338, 2.235854537436865),
 (2.8395278646114708, 1.450338526378557),
 (0.4092467815194047, 1.4853959419184835),
 (0.14453726422289132, 2.8906978510862475),
 (-1.8692413440511053, 1.5108096390383308)]

In [9]:
greg.fit(X)

StandardScaler()

In [10]:
greg.scale_

array([1.25693913, 0.75416913])

### `.mean_`

In [11]:
greg.mean_

array([0.35707806, 2.12516806])

In [12]:
np.allclose(greg.mean_[0], X1.mean())

True

In [13]:
np.allclose(greg.mean_[1], X2.mean())

True

### `.var_`

In [14]:
greg.var_

array([1.57989597, 0.56877107])

In [15]:
np.allclose(greg.var_[0], X1.var())

True

In [16]:
np.allclose(greg.var_[1], X2.var())

True
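The checks above pass because NumPy's `.var()` defaults to the population variance (`ddof=0`), which is also what `var_` stores. Here is a minimal sketch on a small made-up array that contrasts this with the sample variance (`ddof=1`) and shows how `scale_` relates to `var_`:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Small made-up dataset: three samples, two features.
data = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 60.0]])
scaler = StandardScaler().fit(data)

# var_ agrees with the population variance (divide by n)...
matches_population = np.allclose(scaler.var_, data.var(axis=0, ddof=0))

# ...and not with the sample variance (divide by n - 1).
matches_sample = np.allclose(scaler.var_, data.var(axis=0, ddof=1))

# scale_ is the square root of var_, i.e. the per-column standard deviation.
scale_is_sqrt_of_var = np.allclose(scaler.scale_, np.sqrt(scaler.var_))

print(matches_population, matches_sample, scale_is_sqrt_of_var)
```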

### `.n_samples_seen_`

In [17]:
greg.n_samples_seen_

20
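One place where `n_samples_seen_` earns its keep is incremental fitting with `.partial_fit()`, where the scaler updates its statistics batch by batch. The following is only a sketch on made-up random batches (`batch_a` and `batch_b` are illustrative names), comparing the incremental result with a single `.fit()` on all the data:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
batch_a = rng.normal(size=(12, 2))  # first made-up batch: 12 samples
batch_b = rng.normal(size=(8, 2))   # second made-up batch: 8 samples

incremental = StandardScaler()
incremental.partial_fit(batch_a)
seen_after_first = int(incremental.n_samples_seen_)

incremental.partial_fit(batch_b)
seen_after_second = int(incremental.n_samples_seen_)

# Fitting once on the stacked batches should give the same statistics.
all_at_once = StandardScaler().fit(np.vstack([batch_a, batch_b]))
same_mean = np.allclose(incremental.mean_, all_at_once.mean_)
same_var = np.allclose(incremental.var_, all_at_once.var_)

print(seen_after_first, seen_after_second, same_mean, same_var)
```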

## Methods

### `._reset()`

If I ever want to "cancel" the fit, I can do that with `._reset()`. The leading underscore marks this as a private method, so it isn't part of the public API:

In [18]:
greg._reset()

In [19]:
greg.n_samples_seen_

AttributeError: 'StandardScaler' object has no attribute 'n_samples_seen_'

In [20]:
greg.fit(X)

StandardScaler()
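Since `._reset()` is private, it may be safer to rely on public tools in everyday code. Two options are sketched here on made-up data: `sklearn.base.clone` builds a new, unfitted estimator with the same parameters, and calling `.fit()` again discards the old statistics and starts over.

```python
from sklearn.base import clone
from sklearn.preprocessing import StandardScaler

fitted = StandardScaler().fit([[0.0], [2.0], [4.0]])

# clone() copies the constructor parameters but none of the fitted state.
fresh = clone(fitted)
fresh_is_unfitted = not hasattr(fresh, "mean_")
original_still_fitted = hasattr(fitted, "mean_")

# Refitting replaces the old statistics instead of adding to them.
fitted.fit([[10.0], [20.0]])
refit_mean = float(fitted.mean_[0])
refit_count = int(fitted.n_samples_seen_)

print(fresh_is_unfitted, original_still_fitted, refit_mean, refit_count)
```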

### `.transform()`

This does the main job I want the `StandardScaler` to do: it subtracts each column's mean and divides by that column's standard deviation.

In [21]:
greg.transform(X)

array([[-1.42652838e+00,  2.75113287e+00],
       [ 1.35823608e+00,  5.68332394e-01],
       [-4.85808770e-01,  1.15353486e+00],
       [-1.01714717e+00,  7.96793136e-02],
       [ 1.56127520e-01, -1.25958621e-01],
       [ 1.23654126e+00,  8.94694133e-01],
       [-1.05799073e-03,  8.63121808e-01],
       [ 1.02769791e+00,  9.14656474e-02],
       [ 3.41171612e-01, -1.22419490e+00],
       [-1.41953976e+00, -2.91831685e-02],
       [-9.23532452e-01, -7.48094724e-01],
       [ 3.66332845e-01, -6.07725067e-01],
       [ 6.84098578e-01, -7.71899556e-02],
       [-9.67676368e-01, -1.84691049e+00],
       [ 2.19982229e-01, -3.46805281e-01],
       [ 7.74919092e-01,  1.46766125e-01],
       [ 1.97499605e+00, -8.94798669e-01],
       [ 4.15045758e-02, -8.48313850e-01],
       [-1.69093942e-01,  1.01506382e+00],
       [-1.77122293e+00, -8.14616240e-01]])

In [22]:
greg.transform(X)[0, 0]

-1.4265283812919012

In [23]:
(X[0][0] - greg.mean_[0]) / greg.var_[0]**0.5

-1.4265283812919012
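The single-entry check above extends to the whole array. As a sketch on freshly generated data (the seed and the names `data` and `by_hand` are my own choices, not part of the notebook), we can reproduce `.transform()` with NumPy broadcasting and confirm that each scaled column ends up with mean 0 and standard deviation 1:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(42)
data = rng.normal(loc=2, size=(20, 2))  # made-up data: 20 samples, 2 features

scaler = StandardScaler().fit(data)
scaled = scaler.transform(data)

# Broadcasting applies each column's own mean and scale to every row.
by_hand = (data - scaler.mean_) / scaler.scale_
matches = np.allclose(scaled, by_hand)

# After scaling, every column should be centered with unit spread.
col_means = scaled.mean(axis=0)
col_stds = scaled.std(axis=0)

print(matches, col_means, col_stds)
```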

### `.inverse_transform()`

If I ever need to recover my initial values, I can use `.inverse_transform()`:

In [24]:
greg.inverse_transform(greg.transform(X))

array([[-1.43598128,  4.19998753],
       [ 2.06429813,  2.5537868 ],
       [-0.25355399,  2.99512844],
       [-0.92141402,  2.18525974],
       [ 0.55332085,  2.03017395],
       [ 1.91133515,  2.79991875],
       [ 0.35574823,  2.77610788],
       [ 1.64883177,  2.19414862],
       [ 0.78591   ,  1.20191806],
       [-1.42719701,  2.10315901],
       [-0.80374602,  1.56097811],
       [ 0.81753614,  1.66684057],
       [ 1.21694832,  2.06695378],
       [-0.85923223,  0.73228519],
       [ 0.63358233,  1.86361822],
       [ 1.33110418,  2.23585454],
       [ 2.83952786,  1.45033853],
       [ 0.40924678,  1.48539594],
       [ 0.14453726,  2.89069785],
       [-1.86924134,  1.51080964]])

In [25]:
np.allclose(greg.inverse_transform(greg.transform(X)), X)

True
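To tie the attributes and methods together, here is a minimal sketch of how a class following the same pattern could be organized. `ToyScaler` is a hypothetical, simplified stand-in written for this illustration; it is not how scikit-learn implements `StandardScaler`, which also handles cases this sketch ignores (zero-variance columns, sparse input, incremental fitting, and so on).

```python
import numpy as np


class ToyScaler:
    """Hypothetical, simplified scaler used only to illustrate the fit/transform pattern."""

    def fit(self, X):
        X = np.asarray(X, dtype=float)
        # Learned attributes get a trailing underscore and only exist after fit().
        self.mean_ = X.mean(axis=0)
        self.var_ = X.var(axis=0)          # population variance (ddof=0)
        self.scale_ = np.sqrt(self.var_)   # assumes no column has zero variance
        self.n_samples_seen_ = X.shape[0]
        return self  # returning self allows chaining: ToyScaler().fit(X).transform(X)

    def transform(self, X):
        return (np.asarray(X, dtype=float) - self.mean_) / self.scale_

    def inverse_transform(self, Z):
        return np.asarray(Z, dtype=float) * self.scale_ + self.mean_


rng = np.random.default_rng(0)
X_demo = rng.normal(loc=2, size=(20, 2))  # made-up data for the demo

toy = ToyScaler().fit(X_demo)
Z_demo = toy.transform(X_demo)
round_trip_ok = np.allclose(toy.inverse_transform(Z_demo), X_demo)

print(toy.n_samples_seen_, round_trip_ok)
```

Each instance carries its own `mean_` and `scale_`, so two `ToyScaler` objects fitted on different data never interfere with each other, just like `ss` and `greg` above.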