# Scikit-Learn and OOP

In [1]:
import numpy as np

Most of the time, you'll interact with `sklearn` by instantiating an object of a given class. That object will usually have methods like `.fit()`, `.predict()`, `.score()`, and `.transform()`.

Let's look at a case of this that is already very familiar:

In [2]:
from sklearn.preprocessing import StandardScaler

We'll start by instantiating a `StandardScaler` object:

In [3]:
ss = StandardScaler()

Of course, we could have called it anything we wanted:

In [4]:
greg = StandardScaler()

Are these two objects the same?

In [5]:
ss == greg

False

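Of course not! Each call to `StandardScaler()` builds a separate object, and `==` here just checks whether the two names point to the same object. I can have as many `StandardScaler` objects as I want.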
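To make the object-oriented idea concrete, here is a minimal sketch of a class that follows the same pattern. `ToyScaler` is a hypothetical name invented for this illustration; it is not part of `sklearn` and leaves out everything the real `StandardScaler` does beyond the basics.

```python
import numpy as np


class ToyScaler:
    """Hypothetical stand-in for StandardScaler, for illustration only."""

    def fit(self, X):
        X = np.asarray(X, dtype=float)
        # Learned state is stored on the instance, with a trailing underscore.
        self.mean_ = X.mean(axis=0)
        self.scale_ = X.std(axis=0)
        return self  # returning self lets us chain calls

    def transform(self, X):
        return (np.asarray(X, dtype=float) - self.mean_) / self.scale_


# Two instances of the same class, each with its own state.
a = ToyScaler().fit([[0.0], [2.0]])
b = ToyScaler().fit([[10.0], [30.0]])

print(a.mean_, b.mean_)
print(a == b)  # False: two distinct objects
```

Fitting `a` does nothing to `b`: each object carries its own `mean_` and `scale_`.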

What attributes and methods are available for a `StandardScaler` object? Let's check out the code on [GitHub](https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/preprocessing/_data.py)!

## Attributes

### `.scale_`

In [6]:
greg.scale_

AttributeError: 'StandardScaler' object has no attribute 'scale_'

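Some attributes don't exist until we've fitted the object. In `sklearn`, these learned attributes end with an underscore, like `scale_`. Methods such as `.transform()` do exist beforehand, but they can't do their job until the object has been fitted.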
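If we'd rather test for the fitted state than run into an `AttributeError`, one option is sketched below. It uses `check_is_fitted` and `NotFittedError` from `sklearn`; the tiny two-row dataset and the variable names are made up for the example.

```python
from sklearn.exceptions import NotFittedError
from sklearn.preprocessing import StandardScaler
from sklearn.utils.validation import check_is_fitted

scaler = StandardScaler()

# Before fitting, the learned attribute simply isn't there.
exists_before = hasattr(scaler, "scale_")

try:
    check_is_fitted(scaler, "scale_")
    raised = False
except NotFittedError:
    raised = True

# After fitting on a tiny made-up dataset, the attribute appears.
scaler.fit([[0.0], [2.0]])
exists_after = hasattr(scaler, "scale_")

print(exists_before, raised, exists_after)  # False True True
```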

In [9]:
X1, X2 = np.random.normal(size=20), np.random.normal(loc=2, size=20)

In [10]:
X = list(zip(X1, X2))
X

[(1.4486144268329002, 3.5587873291904337),
 (0.5899961079633507, 0.583575978849977),
 (-0.36446677629367485, -1.654397557446833),
 (-0.8200513515494632, 0.7979549243174489),
 (2.8380943925069277, 1.4087611344386914),
 (0.5354624576771265, 0.516080015144275),
 (0.8981831000752887, 0.34355932186831084),
 (0.12339512149599292, 1.650741534922807),
 (-0.08496548773778145, 1.5256276898278267),
 (1.9094215321972436, -0.5926097094871139),
 (-0.18439148764504223, 2.100823957469769),
 (0.9193763150426615, 2.9153729037410643),
 (-2.9791621698900714, 4.140553971316042),
 (1.8329521965125493, 2.41415426959914),
 (-0.6188817292057505, 0.8389358594524006),
 (0.7922814171699555, 0.331444103610919),
 (-0.9538451753736248, 3.405910005164107),
 (-0.004557123694882684, 2.5093786252767467),
 (-1.5591055952848278, 2.813827793598626),
 (-2.023972549850196, 1.8482317541578543)]

In [11]:
greg.fit(X)

StandardScaler(copy=True, with_mean=True, with_std=True)

In [12]:
greg.scale_

array([1.36688468, 1.42487457])

### `.mean_`

In [13]:
greg.mean_

array([0.11471888, 1.5728357 ])

In [14]:
np.allclose(greg.mean_[0], X1.mean())

True

In [15]:
np.allclose(greg.mean_[1], X2.mean())

True

### `.var_`

In [16]:
greg.var_

array([1.86837374, 2.03026754])

In [17]:
np.allclose(greg.var_[0], X1.var())

True

In [18]:
np.allclose(greg.var_[1], X2.var())

True
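The checks above pass because `X1.var()` uses NumPy's default `ddof=0`, the population variance. The sketch below, on freshly generated data with an arbitrary seed, shows that the sample variance (`ddof=1`) would not match, and that `scale_` is the square root of `var_`.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)  # arbitrary seed, for reproducibility only
X_demo = rng.normal(loc=2, size=(20, 2))

scaler = StandardScaler().fit(X_demo)

matches_population = np.allclose(scaler.var_, X_demo.var(axis=0, ddof=0))
matches_sample = np.allclose(scaler.var_, X_demo.var(axis=0, ddof=1))
scale_is_sqrt_var = np.allclose(scaler.scale_, np.sqrt(scaler.var_))

print(matches_population, matches_sample, scale_is_sqrt_var)  # True False True
```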

### `.n_samples_seen_`

In [19]:
greg.n_samples_seen_

20

## Methods

### `._reset()`

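If I ever want to "cancel" the fit, I can do that with `._reset()`. The leading underscore marks it as a private method, so it isn't part of the public API and may change between versions: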

In [20]:
greg._reset()

In [21]:
greg.n_samples_seen_

AttributeError: 'StandardScaler' object has no attribute 'n_samples_seen_'

In [22]:
greg.fit(X)

StandardScaler(copy=True, with_mean=True, with_std=True)
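Since `._reset()` is private, a public alternative worth knowing is `sklearn.base.clone`, which returns a new, unfitted estimator with the same parameters. It is not the same operation (the original object stays fitted), so treat this as a sketch of one substitute rather than an equivalent. The data and the `with_mean=False` setting are made up for the example.

```python
from sklearn.base import clone
from sklearn.preprocessing import StandardScaler

fitted = StandardScaler(with_mean=False).fit([[0.0], [2.0]])

# clone copies the constructor parameters but none of the learned state.
fresh = clone(fitted)

print(hasattr(fitted, "scale_"))  # True
print(hasattr(fresh, "scale_"))   # False
print(fresh.get_params() == fitted.get_params())  # True
```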

### `.transform()`

This of course does the main job I want the StandardScaler to do: it subtracts each column's mean and divides by that column's standard deviation.

In [23]:
greg.transform(X)

array([[ 0.97586546,  1.39377295],
       [ 0.34770836, -0.69427846],
       [-0.35056773, -2.26492445],
       [-0.68386913, -0.54382385],
       [ 1.99239595, -0.11515018],
       [ 0.30781205, -0.74164821],
       [ 0.57317507, -0.86272602],
       [ 0.00634746,  0.05467558],
       [-0.14608721, -0.03313134],
       [ 1.31298761, -1.51974458],
       [-0.21882634,  0.37055069],
       [ 0.58867982,  0.94221431],
       [-2.26345433,  1.80206618],
       [ 1.25704336,  0.59045097],
       [-0.53669532, -0.51506276],
       [ 0.49569839, -0.87122868],
       [-0.78175143,  1.28648117],
       [-0.08726121,  0.65728096],
       [-1.22455427,  0.87094831],
       [-1.56464657,  0.1932774 ]])

In [24]:
greg.transform(X)[0, 0]

0.9758654567510187

In [25]:
(X[0][0] - greg.mean_[0]) / greg.var_[0]**0.5

0.9758654567510187
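The same check can be done for every entry at once. This sketch regenerates similar data with an arbitrary seed (so the numbers won't match the ones above) and uses `scale_` directly rather than `var_**0.5`.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(42)  # arbitrary seed
X_demo = np.column_stack([rng.normal(size=20), rng.normal(loc=2, size=20)])

scaler = StandardScaler().fit(X_demo)

# Broadcasting applies each column's mean and scale to that column.
manual = (X_demo - scaler.mean_) / scaler.scale_

print(np.allclose(manual, scaler.transform(X_demo)))  # True

# After scaling, each column has mean 0 and standard deviation 1.
print(np.allclose(manual.mean(axis=0), 0), np.allclose(manual.std(axis=0), 1))
```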

### `.inverse_transform()`

If I ever need to recover my initial values, I can use `.inverse_transform()`:

In [None]:
greg.inverse_transform(greg.transform(X))

In [None]:
np.allclose(greg.inverse_transform(greg.transform(X)), X)
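Undoing the scaling by hand is just the forward formula reversed: multiply by `scale_`, then add `mean_`. A minimal sketch, again on made-up data with an arbitrary seed:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(7)  # arbitrary seed
X_demo = rng.normal(loc=2, size=(20, 2))

scaler = StandardScaler().fit(X_demo)
scaled = scaler.transform(X_demo)

# Reverse the standardization manually.
manual_inverse = scaled * scaler.scale_ + scaler.mean_

print(np.allclose(manual_inverse, scaler.inverse_transform(scaled)))  # True
print(np.allclose(manual_inverse, X_demo))  # True
```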