# Object Oriented Programming

## Agenda
2. Describe what a class is in relation to Object Oriented Programming
3. Write a class definition, instantiate an object, define/inspect parameters, define/call class methods, define/code `__init__`
4. Overview of Inheritance
5. Important data science tools through the lens of objects: Standard Scaler and one-hot-encoder

## 2.  Describe what a class is in relation to Object Oriented Programming

Python is an object-oriented programming language. You'll hear people say that "everything is an object" in Python. What does this mean?

Go back to the idea of a function for a moment. A function is a kind of abstraction whereby an algorithm is made repeatable. So instead of coding:

In [2]:
print(3**2 + 10)
print(4**2 + 10)
print(5**2 + 10)

19
26
35


or even:

In [3]:
for x in range(3, 6):
    print(x**2 + 10)

19
26
35


I can write:

In [4]:
def square_and_add_ten(x):
    return x**2 + 10

Now imagine a further abstraction: Before, creating a function was about making a certain algorithm available to different inputs. Now I want to make that function available to different **objects**.

An object is what we get out of this further abstraction. Each object is an instance of a **class** that defines a bundle of attributes and functions (now, as proprietary to the object type, called *methods*), the point being that **every object of that class will automatically have those proprietary attributes and methods**.

A class is like a blueprint that describes how to create a specific type of object.
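As a small illustration of the blueprint idea, the sketch below uses Python's built-in `str` class. The two strings are separate objects, yet both carry the same methods because they were built from the same class. The variable names are only for illustration.

```python
# Two distinct objects built from the same "blueprint" (the str class)
greeting = "hello"
farewell = "goodbye"

# Both are instances of str ...
print(type(greeting) is type(farewell))   # True

# ... so both automatically have the same methods available
print(greeting.upper())   # HELLO
print(farewell.upper())   # GOODBYE
```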

![blueprint](img/blueprint.jpeg)


Even Python integers are objects. Consider:

In [5]:
x = 3

We can see what type of object a variable refers to with the built-in `type` function:

In [6]:
type(x)

int

By setting x equal to an integer, I'm imbuing x with the attributes and methods of the integer class.

In [7]:
x.bit_length()

2

In [8]:
x.__float__()

3.0

For more details on this general feature of Python, see [here](https://jakevdp.github.io/WhirlwindTourOfPython/03-semantics-variables.html).
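To make "everything is an object" a little more concrete, here is a short sketch that checks a few very different things (an integer, a string, the function defined earlier, and even the `int` class itself). Each one has a type and each one is an instance of `object`.

```python
def square_and_add_ten(x):
    return x**2 + 10

# Integers, strings, functions, and even classes are all objects
for thing in [3, "three", square_and_add_ten, int]:
    print(type(thing).__name__, isinstance(thing, object))
```

Because a function is an object too, it has attributes of its own, such as `square_and_add_ten.__name__`.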

# Exercise

## Look up a different type and find either a method or attribute that you did not know existed

There is a nice module in the standard library, `inspect`, which can be used to look at the attributes and methods associated with built-in objects.


In [9]:
import inspect

example = 1
inspect.getmembers(example)

[('__abs__', <method-wrapper '__abs__' of int object at 0x102d8a5a0>),
 ('__add__', <method-wrapper '__add__' of int object at 0x102d8a5a0>),
 ('__and__', <method-wrapper '__and__' of int object at 0x102d8a5a0>),
 ('__bool__', <method-wrapper '__bool__' of int object at 0x102d8a5a0>),
 ('__ceil__', <function int.__ceil__>),
 ('__class__', int),
 ('__delattr__', <method-wrapper '__delattr__' of int object at 0x102d8a5a0>),
 ('__dir__', <function int.__dir__()>),
 ('__divmod__', <method-wrapper '__divmod__' of int object at 0x102d8a5a0>),
 ('__doc__',
  "int([x]) -> integer\nint(x, base=10) -> integer\n\nConvert a number or string to an integer, or return 0 if no arguments\nare given.  If x is a number, return x.__int__().  For floating point\nnumbers, this truncates towards zero.\n\nIf x is not a number or if base is given, then x must be a string,\nbytes, or bytearray instance representing an integer literal in the\ngiven base.  The literal can be preceded by '+' or '-' and be surround

Below, there are four different built-in types. Each person will get a type.  
Use `inspect` to find methods or attributes that you:
  - didn't know existed
  - forgot existed
  - find especially useful
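The full output of `inspect.getmembers` is long because it includes every dunder method. One possible way to narrow it down is sketched below; `public_members` is a hypothetical helper written for this exercise, not part of `inspect`.

```python
import inspect

def public_members(obj):
    """Return the names of members that do not start with an underscore."""
    return [name for name, _ in inspect.getmembers(obj)
            if not name.startswith("_")]

# Try it on a float; swap in your own assigned object
print(public_members(1.5))
```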

In [10]:
import numpy as np

w = [1,2,3]
x = {1:1, 2:2}
y = 'A string'
z = 1.5

types = ['w', 'x', 'y', 'z']

mccalister = ['Adam', 'Amanda','Chum', 'Dann', 
 'Jacob', 'Jason', 'Johnhoy', 'Karim', 
'Leana','Luluva', 'Matt', 'Maximilian', ]

while len(mccalister) >= 3:
    new_choices = np.random.choice(mccalister, 3, replace=False)
    type_choice = np.random.choice(types, 1)
    types.remove(type_choice)
    print(new_choices, type_choice)
    for choice in new_choices:
        mccalister.remove(choice)


['Adam' 'Jason' 'Leana'] ['w']
['Karim' 'Amanda' 'Chum'] ['x']
['Matt' 'Johnhoy' 'Maximilian'] ['y']
['Jacob' 'Dann' 'Luluva'] ['z']


# 3. Write a class definition, instantiate an object, define/inspect parameters, define/call class methods 

## Classes

We can define **new** classes of objects altogether by using the keyword `class`:

In [11]:
class Car:
    """Transportation object"""
    pass  # This is called a stub. It lets us create an empty class without an error.

In [12]:
# Instantiate a car object
ferrari =  Car()
type(ferrari)

__main__.Car

In [13]:
# We can describe the ferrari as having four wheels

ferrari.wheels = 4
ferrari.wheels

4

In [14]:
# But wouldn't it be nice to not have to do that every time? 
# We assume the blueprint of a car will include the 4-wheel specification
# and assign it as an attribute when building the class

In [15]:
class Car:
    """Automotive object"""
    
    wheels = 4                      # These are attributes of *every* car.


In [16]:
civic = Car()
civic.wheels

4

In [17]:
#  Then we can add more attributes
class Car:
    """Automotive object"""
    
    wheels = 4                      # These are attributes of *every* car.
    doors = 4


In [18]:
ferrari = Car()
ferrari.doors

4

In [19]:
# But a ferrari does not have 4 doors! 
# These attributes can be overwritten 

ferrari.doors = 2
ferrari.doors

2
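It may help to see what "overwritten" means here. In the sketch below, assigning `ferrari.doors = 2` creates an attribute on that one instance, which shadows the class attribute. The class itself and any other instance are left unchanged.

```python
class Car:
    """Automotive object"""
    wheels = 4
    doors = 4

ferrari = Car()
civic = Car()

ferrari.doors = 2          # creates an instance attribute on ferrari only

print(ferrari.doors)       # 2  (instance attribute shadows the class attribute)
print(civic.doors)         # 4  (still reads the class attribute)
print(Car.doors)           # 4  (the class itself is unchanged)
print(ferrari.__dict__)    # {'doors': 2}
print(civic.__dict__)      # {}
```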

### Methods

We can also write functions that are associated with each class.  
As said above, a function associated with a class is called a method.

In [20]:
# Now we add a method to the class
class Car:
    """Automotive object"""
    
    wheels = 4                      # These are attributes of *every* car.
    doors = 4

    def honk(self):                   # These are methods we can call on *any* car.
        print('Beep beep')
    

In [21]:
# Create two separate Car objects (chained assignment would bind both names to one object)
ferrari = Car()
civic = Car()
ferrari.honk()
civic.honk()


Beep beep
Beep beep


Wait a second, what's that `self` doing? 
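One way to see what `self` does: calling a method on an instance is equivalent to calling the function on the class and passing the instance as the first argument. The sketch below uses a hypothetical `describe` method (not part of the class above) so that `self` is actually used inside the method body.

```python
class Car:
    """Automotive object"""
    wheels = 4

    def describe(self):
        # self is the particular car the method was called on
        return f"This car has {self.wheels} wheels"

ferrari = Car()

# These two calls are equivalent
print(ferrari.describe())      # This car has 4 wheels
print(Car.describe(ferrari))   # This car has 4 wheels
```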

## Magic Methods

It is common for a class to have magic methods. These are identifiable by the "dunder" (i.e. **d**ouble **under**score) prefixes and suffixes, such as `__init__()`. These methods will get called **automatically**, as we'll see below.

For more on these "magic methods", see [here](https://www.geeksforgeeks.org/dunder-magic-methods-python/).
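We have in fact been triggering magic methods all along. As a quick sketch, the familiar operators and built-in functions below each delegate to a dunder method on the object:

```python
x = 3
w = [1, 2, 3]

# The + operator calls __add__
print(x + 4, x.__add__(4))         # 7 7

# The built-in len() calls __len__
print(len(w), w.__len__())         # 3 3

# The built-in float() calls __float__
print(float(x), x.__float__())     # 3.0 3.0
```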

When we create an instance of a class, Python invokes `__init__` to initialize the object.  Let's add `__init__` to our class.


In [22]:
# Now we add __init__ so that each car sets its own attributes
class Car:
    """Automotive object"""
    
    WHEELS = 4                      # By convention, an all-caps name signals a constant
    
    def __init__(self, doors, sedan):
        
        self.doors = doors
        self.sedan = sedan
        

    def honk(self):                   # These are methods we can call on *any* car.
        print('Beep beep')
    

Because `doors` and `sedan` are now set in `__init__`, we need to pass arguments when instantiating the object.

In [23]:
civic = Car(4, True)
civic.doors

4

We can also give parameters default values when a certain value is very common.

In [24]:
# Redefine Car so that the __init__ parameters have default values
class Car:
    """Automotive object"""
    
    WHEELS = 4                     
    
    # default arguments included now in __init__
    def __init__(self, doors=4, sedan=False):
        
        self.doors = doors
        self.sedan = sedan
        

    def honk(self):                  
        print('Beep beep')
    

In [25]:
civic = Car(sedan=True)

#### Positional vs. Named arguments

In [26]:
# we can pass our arguments without names
civic = Car(4, True)



In [27]:
# or with names
civic = Car(doors=4, sedan=True)


In [28]:
# or with a mix
civic = Car(4, sedan=True)


In [29]:
# but only when positional arguments precede named ones
civic = Car(doors = 4, True)

SyntaxError: positional argument follows keyword argument (<ipython-input-29-6046029021d3>, line 2)

In [30]:
# The self argument allows our methods to update our attributes.
class Car:
    """Automotive object"""
    
    WHEELS = 4                     
    
    # default arguments included now in __init__
    def __init__(self, doors=4, sedan=False, driver_mood='peaceful'):
        
        self.doors = doors
        self.sedan = sedan
        self.driver_mood = driver_mood
        

    def honk(self):                  
        print('Beep beep')
        self.driver_mood = 'pissed'
    

In [31]:
civic = Car()
print(civic.driver_mood)
civic.honk()
print(civic.driver_mood)

peaceful
Beep beep
pissed


# Pair

Let's bring our knowledge together, and in pairs, code out the following:

Add an attribute `moving` which indicates, with a boolean, whether the car is moving or not.  

Fill in the methods `stop` and `go` to change the attribute `moving` to reflect the car's present state of motion after the method is called.  Also, include a print statement that indicates the car has started moving or has stopped.

Make sure the method works by calling it, then printing the attribute.


In [32]:
# Starter code: fill in the go and stop methods
class Car:
    """Automotive object"""
    
    # default arguments included now in __init__
    def __init__(self, doors=4, sedan=False, driver_mood='peaceful'):
        
        self.doors = doors
        self.sedan = sedan
        self.driver_mood = driver_mood
        
    def honk(self):                   # These are methods we can call on *any* car.
        print('Beep beep')
        
    def go(self):
        pass
    
    def stop(self):
        pass

In [82]:
#__SOLUTION__
class Car:
    """Automotive object"""
    
    # default arguments included now in __init__
    def __init__(self, doors=4, sedan=False, driver_mood='peaceful', moving=False):
        
        self.doors = doors
        self.sedan = sedan
        self.moving = moving
        self.driver_mood = driver_mood

    def honk(self):                   # These are methods we can call on *any* car.
        print('Beep beep')
        
    def go(self):
        self.moving = True
        print('Whoa, that\'s some acceleration!')
    
    def stop(self):
        self.moving = False
        print('Screeech!')

In [83]:
# Run this code to make sure your methods work
civic = Car()
print(civic.moving)

civic.go()
print(civic.moving)

civic.stop()
print(civic.moving)

False
Whoa, that's some acceleration!
True
Screeech!
False


## 4. Overview of inheritance

We can also define classes in terms of *other* classes, in which case the new classes **inherit** the attributes and methods of the classes they are defined from.

Suppose we decided we want to create an electric car class.

In [84]:
# ElectricCar inherits from Car
class ElectricCar(Car):
    """Electric automotive object"""
    
    def __init__(self, hybrid=False):
        super().__init__()       # run Car's __init__ to set the inherited attributes
        self.hybrid = hybrid     # store the argument instead of a hard-coded value

In [85]:
prius = ElectricCar()
prius.honk()

Beep beep
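To check the inheritance relationship directly, Python offers `isinstance` and `issubclass`, and every class records its lookup chain in `__mro__`. The sketch below uses stripped-down stand-ins for `Car` and `ElectricCar` so that it runs on its own.

```python
class Car:
    def honk(self):
        return "Beep beep"

class ElectricCar(Car):
    pass  # defines nothing of its own; everything comes from Car

prius = ElectricCar()

print(prius.honk())                      # Beep beep
print(isinstance(prius, ElectricCar))    # True
print(isinstance(prius, Car))            # True: an ElectricCar is also a Car
print(issubclass(ElectricCar, Car))      # True

# The order in which Python searches classes for an attribute or method
print([cls.__name__ for cls in ElectricCar.__mro__])
# ['ElectricCar', 'Car', 'object']
```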


In [100]:
#  And we can override inherited methods and parent attributes
class ElectricCar(Car):
    """Electric automotive object"""
    
    def __init__(self, hybrid=False):
        
        # Prius owners are calmer than the average car owner
        super().__init__(driver_mood='serene')
        
        self.hybrid = hybrid
        
    # override inherited methods
    
    def go(self):
        
        print('Whirrrrrr')
        self.moving = True

In [101]:
prius = ElectricCar()
print(prius.moving)
prius.go()
print(prius.moving)
print(prius.driver_mood)

False
Whirrrrrr
True
serene
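An overriding method does not have to repeat the parent's logic. As a sketch, using simplified stand-ins for the classes above (and a `go` that returns a string rather than printing, purely to keep the example short), the child can call `super().go()` to reuse the parent's behavior and then change only what differs:

```python
class Car:
    def __init__(self):
        self.moving = False

    def go(self):
        self.moving = True
        return "Vroom"

class ElectricCar(Car):
    def go(self):
        # Reuse the parent's logic (which sets self.moving), change only the sound
        super().go()
        return "Whirrrrrr"

prius = ElectricCar()
print(prius.moving)    # False
print(prius.go())      # Whirrrrrr
print(prius.moving)    # True
```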


## 5. Important data science tools through the lens of objects: Standard Scaler and One-Hot Encoder

We are becoming more and more familiar with a series of methods with names such as `fit` or `fit_transform`.

After instantiating an instance of a Standard Scaler, Linear Regression model, or One Hot Encoder, we use fit to learn about the dataset and save what is learned. What is learned is saved in the attributes.
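The fit-then-store pattern can be sketched in a few lines of plain Python. The class below is a hypothetical, heavily simplified stand-in (it is not scikit-learn's implementation and handles only a single list of numbers). It assumes the scaler centers by the mean and divides by the population standard deviation, and it borrows scikit-learn's convention of a trailing underscore for learned attributes.

```python
import statistics

class SimpleScaler:
    """Hypothetical, simplified scaler; not scikit-learn's implementation."""

    def fit(self, values):
        # Learn from the data and store what was learned as attributes
        self.mean_ = statistics.fmean(values)
        self.scale_ = statistics.pstdev(values)   # population standard deviation
        return self                               # returning self allows chaining

    def transform(self, values):
        # Reuse the stored attributes on any data, including unseen data
        return [(v - self.mean_) / self.scale_ for v in values]

    def fit_transform(self, values):
        return self.fit(values).transform(values)

scaler = SimpleScaler().fit([1.0, 2.0, 3.0, 4.0, 5.0])
print(scaler.mean_)                # 3.0
print(scaler.scale_)               # roughly 1.414
print(scaler.transform([3.0]))     # [0.0], because 3.0 is the learned mean
```

Note that `transform` never recomputes the mean or standard deviation. It only reads the attributes that `fit` saved, which is why a fitted object can be applied consistently to new data.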

### 1. Standard Scaler 

The standard scaler takes a series and, for each element, subtracts the mean of the series and divides the result by the standard deviation. The sign is kept, so values below the mean become negative.

$\Large z = \frac{x - \mu}{s}$

What attributes and methods are available for a Standard Scaler object? Let's check out the code on [GitHub](https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/preprocessing/_data.py)!

## Attributes

### `.scale_`

In [349]:
from sklearn.preprocessing import StandardScaler

# instantiate a standard scaler object
ss = StandardScaler()

# We can instantiate as many scaler objects as we want
maxs_scaler = StandardScaler()

In [350]:
# Let's create a series of 1000 draws from a normal distribution

series_1 = np.random.normal(3,1,1000)
print(series_1.mean())
print(series_1.std())

3.0432788793378815
1.0030391214164902


In [351]:
ss.fit(series_1.reshape(-1,1))

# standard deviation is saved in the attribute scale_
ss.scale_

array([1.00303912])

In [352]:
# the mean is saved in the attribute mean_
ss.mean_

array([3.04327888])

In [353]:
# Knowledge Check

# What value should I pass to the standard scaler to make the call below return 0?

ss.transform([])

ValueError: Expected 2D array, got 1D array instead:
array=[].
Reshape your data either using array.reshape(-1, 1) if your data has a single feature or array.reshape(1, -1) if it contains a single sample.

In [354]:
#__SOLUTION__
ss.transform([ss.mean_])

array([[0.]])

In [355]:
# we can then use these attributes to transform new data
np.random.seed(42)
random_numbers = np.random.normal(3,1, 2)
random_numbers

array([3.49671415, 2.8617357 ])

In [356]:
ss.transform(random_numbers.reshape(-1,1))

array([[ 0.4520614 ],
       [-0.18099312]])

In [357]:
# We can also use a scaler on a DataFrame
import pandas as pd

series_1 = np.random.normal(3,1,1000)
series_2 = np.random.uniform(0,100, 1000)
df_2 = pd.DataFrame([series_1, series_2]).T
ss_df = StandardScaler()
ss_df.fit_transform(df_2)


array([[ 0.63918361, -1.63325007],
       [ 1.53240185,  1.50265028],
       [-0.260668  , -1.56258467],
       ...,
       [ 0.56254398, -1.61544876],
       [ 1.40620165, -1.36827099],
       [ 0.92178475, -0.56807826]])

In [358]:
ss_df.transform([[5, 50]])

array([[ 2.01911307, -0.00948621]])

## Exercise One-hot Encoder

Another object that you will use often is OneHotEncoder from sklearn. It is recommended over `pd.get_dummies()` because it can be trained, with the learned information stored in the attributes of the object.
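To see why a trainable encoder is useful, consider the hypothetical, simplified sketch below (it is not scikit-learn's implementation). Because the categories are learned once in `fit` and stored, new data is always encoded with the same columns in the same order, even when it contains only some of the categories.

```python
class SimpleOneHotEncoder:
    """Hypothetical, simplified encoder; not scikit-learn's implementation."""

    def fit(self, values):
        # Learn the categories once and store them as an attribute
        self.categories_ = sorted(set(values))
        return self

    def transform(self, values):
        # One column per learned category, always in the learned order
        return [[1.0 if v == cat else 0.0 for cat in self.categories_]
                for v in values]

encoder = SimpleOneHotEncoder().fit(["m", "t", "w", "t"])
print(encoder.categories_)         # ['m', 't', 'w']

# New data containing a single day still gets all three columns
print(encoder.transform(["w"]))    # [[0.0, 0.0, 1.0]]
```

In this sketch an unseen category would silently become a row of zeros. That is a simplification made here and not a description of how sklearn behaves.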

In [359]:
from sklearn.preprocessing import OneHotEncoder

In [360]:
np.random.seed(42)
# Let's create a dataframe that has days of the week and number of orders. 

days = np.random.choice(['m','t', 'w','th','f','s','su'], 1000)
orders = np.random.randint(0,1000,1000)

df = pd.DataFrame([days, orders]).T
df.columns = ['days', 'orders']
df.head()

|   | days | orders |
|---|------|--------|
| 0 | su   | 758    |
| 1 | th   | 105    |
| 2 | f    | 562    |
| 3 | su   | 80     |
| 4 | w    | 132    |


Let's interact with an important parameter which we can pass when instantiating the OneHotEncoder object: `drop`.  

By dropping one column, we avoid the [dummy variable trap](https://en.wikipedia.org/wiki/Dummy_variable_(statistics)).  

By passing `drop='first'`, sklearn drops the first category of each feature. Categories are stored in sorted order (see `categories_`), so in this case that is 'f'.  But what if we want to drop 'm'?  We can pass an array-like object as the parameter to specify which category to drop.




In [361]:
# Instantiate the OHE object with a param that tells it to drop Monday
ohe = None

In [362]:
#__SOLUTION__
# Instantiate a OneHotEncoder object

ohe = OneHotEncoder(drop=['m'])

In [363]:
# Now, fit_transform the days column of the dataframe

ohe_matrix = None

In [364]:
#__SOLUTION__
ohe_matrix = ohe.fit_transform(df[['days']])

In [365]:
# look at __dict__ and check out drop_idx_
# did it do what you wanted it to do?
ohe.__dict__

{'categories': 'auto',
 'sparse': True,
 'dtype': numpy.float64,
 'handle_unknown': 'error',
 'drop': array(['m'], dtype=object),
 'categories_': [array(['f', 'm', 's', 'su', 't', 'th', 'w'], dtype=object)],
 'drop_idx_': array([1])}

In [366]:
# check out the categories_ attribute
ohe.categories_

[array(['f', 'm', 's', 'su', 't', 'th', 'w'], dtype=object)]

In [367]:
# Check out the object itself
ohe_matrix

<1000x6 sparse matrix of type '<class 'numpy.float64'>'
	with 844 stored elements in Compressed Sparse Row format>

It is a sparse matrix: a matrix composed mostly of zeros, stored in a format that records only the non-zero entries.

In [368]:
# We can convert it to a DataFrame like so
oh_df = pd.DataFrame.sparse.from_spmatrix(ohe_matrix)

In [369]:
# Now, using the categories_ attribute, set the column names to the correct days of the week
# you can use drop_idx_ for this as well



In [370]:
#__SOLUTION__
ohe_columns = list(ohe.categories_[0])
ohe_columns.pop(int(ohe.drop_idx_[0]))  # remove the name of the dropped category
oh_df.columns = ohe_columns
oh_df.head()

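|   | f   | s   | su  | t   | th  | w   |
|---|-----|-----|-----|-----|-----|-----|
| 0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 |
| 1 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 |
| 2 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 3 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 |
| 4 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 |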


In [371]:
# Now, add the onehotencoded columns to the original df, and drop the days column


In [372]:
#__SOLUTION__
# Now, add the onehotencoded columns to the original df, and drop the days column

df = df.join(oh_df).drop('days', axis=1)
df.head()

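|   | orders | f   | s   | su  | t   | th  | w   |
|---|--------|-----|-----|-----|-----|-----|-----|
| 0 | 758    | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 |
| 1 | 105    | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 |
| 2 | 562    | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 3 | 80     | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 |
| 4 | 132    | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 |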
