# Python Object Oriented Programming


  

The basic idea behind OOP isn't difficult to understand. However, the *motivation* is often unclear to newcomers and difficult to explain.  

To help motivate OOP, I'll start with the familiar topic of *functions*.


## What are Functions?

Functions take inputs known as "arguments" and return an output.  

Functions in Python are largely the same as functions in other languages. Python has useful built-in functions like max and range. But there are plenty of times when a programmer needs to define their own custom functions.  
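
As a quick illustration of both kinds, the sketch below calls the built-ins max and range, then defines a small custom function that builds on them. The name `clamp` is only an illustrative choice and does not come from any library.

```python
# Built-in functions are available without any imports.
print(max(3, 10, 7))        # largest of the arguments
print(list(range(1, 5)))    # integers from 1 up to, but not including, 5

# A custom function that combines two built-ins.
def clamp(value, low, high):
    '''
    Returns value, limited to the range [low, high].
    '''
    return max(low, min(value, high))

print(clamp(15, 0, 10))     # above the range, so the upper bound is returned
print(clamp(-3, 0, 10))     # below the range, so the lower bound is returned
print(clamp(4, 0, 10))      # already inside the range, so it is unchanged
```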

I assume most people here are familiar with writing functions. As a reminder, here's the basic form in Python:



In [2]:
def func_name(arg1, arg2, argN):
    '''
    Documentation here.
    '''
    do_stuff_with_args
    return output

## Why use Functions?

It's important to understand the motivation behind functions because without it you'll probably end up writing some terrible code. The motivation to use functions can be boiled down to two reasons:  
* Reusability
* Abstraction

### Reusability

Functions let you re-use logic you've already written. This makes your programs shorter, which makes them easier to read and write. It also means any change in logic only needs to be updated once in the function rather than scattered throughout your code.  

Suppose we wanted to calculate the factorial of 10. We might do this:


In [3]:
#factorial of 10
output = 1
for i in range(1, 10+1):
    output *= i
print(output)

3628800


Great! That's the answer we're looking for.  

But now imagine printing the factorial of 3, 10, and 7 without using functions.

In [4]:
#Factorials of 3, 10, and 7 without functions
#factorial of 3
output = 1
for i in range(1, 3+1):
    output *= i
print(output)

#factorial of 10
output = 1
for i in range(1, 10+1):
    output *= i
print(output)

#factorial of 7
output = 1
for i in range(1, 7+1):
    output *= i
print(output)

6
3628800
5040


This takes a long time to write and is very error-prone. Imagine we wanted to fix our factorial formula to handle cases of negative integers. Now we have to update the code in three separate places.  

Instead, we could write a single function and call it three times:

In [5]:
def fact(n):
    '''
    Returns n factorial. n is a non-negative integer.
    '''
    output = 1
    for i in range(1, n+1):
        output *= i
    return output

#Print the factorial of several numbers
print(fact(3))
print(fact(10))
print(fact(7))

6
3628800
5040


Much faster, and less error prone!  
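
The payoff is clearest when the logic has to change. As a sketch, here is one way the negative-integer case mentioned earlier could be handled, assuming we want invalid input to raise a `ValueError` (that choice is an assumption; returning `None` would be another option). The check is written once, and every call site picks it up.

```python
def fact(n):
    '''
    Returns n factorial. Raises ValueError if n is negative.
    '''
    if n < 0:
        raise ValueError("n must be a non-negative integer")
    output = 1
    for i in range(1, n + 1):
        output *= i
    return output

# The three original calls are unchanged.
print(fact(3))
print(fact(10))
print(fact(7))

# A negative input now fails loudly instead of silently returning 1.
try:
    fact(-2)
except ValueError as err:
    print("Error:", err)
```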

### Abstraction

Abstraction in this context means thinking about the function's high level purpose rather than low level implementation details. Put another way, it means thinking about the function's interface rather than what goes on under the hood.  

A function's interface is sometimes known as an **Application Programming Interface**, or API. It can be boiled down to:  
* Inputs. What arguments do I need to supply the function?
* Outputs. What outputs can I expect the function to return?  

As long as you understand a function's interface, you can use it without knowing what happens under the hood. For example, we'll later learn that reading a CSV can be as simple as:

In [None]:
import pandas as pd
table = pd.read_csv('filename')

There's a lot that goes into reading and parsing a CSV file, but it's much easier to think at this level without worrying about detailed implementation. Here are a few more examples of abstraction:
* Driving a Tesla vs a gas powered car. The API is the same (steering wheel, pedals etc.). The implementation is radically different.  
* Ordering a burger at a restaurant. Inputs: money, order. Outputs: burger, change. Kitchens probably differ in various restaurants, but users have no idea.  

A good function should perform a single logical task. It would be bad if, for example, the steering wheel both turned the car and changed the volume.  
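
To make the single-task idea concrete, here is a small sketch. The function names (`parse_scores`, `mean_score`) and the comma-separated input format are hypothetical, chosen only for illustration. Each function does one job, so either one can be reused or replaced without touching the other.

```python
def parse_scores(text):
    '''
    Converts a comma-separated string like "90, 85, 77" into a list of floats.
    '''
    return [float(piece) for piece in text.split(",")]

def mean_score(scores):
    '''
    Returns the arithmetic mean of a non-empty list of numbers.
    '''
    return sum(scores) / len(scores)

scores = parse_scores("90, 85, 77")
print(scores)
print(mean_score(scores))
```

If the input format later changes (say, to a file), only `parse_scores` needs to be rewritten.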

Abstractions help us think at different levels at different times. Our brains have only so much RAM. We need abstraction to be able to code effectively. Are we thinking at the atomic level, cellular level, organism level, or societal level? If you're working on a random forest, it might not be helpful to think of each tree, each branch of each tree, or each leaf of each branch all at once.  

Abstraction is often nested. Returning to the restaurant example, when you order fries you don't worry about where the deep fryer is. When the employee in the kitchen pushes a button to lower the frozen fries into the boiling grease, he doesn't worry about what software the machine is running.

Practically, this means your functions will often call other functions. For example, if you wanted to write a combinatorial function (e.g. "n choose k"), it makes sense to use our factorial function.

In [7]:
def choose(n, k):
    '''
    Returns the number of unique combinations
    of k elements selected from a population of n members.
    Formula is n!/(k!*(n-k)!)
    '''
    #Integer division keeps the result an int (plain / returns a float in Python 3)
    return fact(n) // (fact(k) * fact(n - k))

print(choose(10, 8))


45


We were able to code up *n choose k*, a fairly complex idea, with two relatively straightforward functions.  
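
As a sanity check, Python 3.8 and later include `math.comb` in the standard library, which computes the same quantity. The sketch below restates the two functions so it runs on its own, uses integer division (`//`) so the result stays an integer, and compares the results against `math.comb`.

```python
import math

def fact(n):
    '''
    Returns n factorial. n is a non-negative integer.
    '''
    output = 1
    for i in range(1, n + 1):
        output *= i
    return output

def choose(n, k):
    '''
    Returns n choose k using the formula n!/(k!*(n-k)!).
    '''
    return fact(n) // (fact(k) * fact(n - k))

# Compare our nested functions against the standard library.
for n, k in [(10, 8), (5, 2), (6, 0)]:
    assert choose(n, k) == math.comb(n, k)
    print(n, k, choose(n, k))
```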


## Isn't This About OOP?

Back to Object Oriented Programming.  

OOP takes functions to the next level. That is, it allows for even greater levels of **abstraction** and **reusability** than standalone functions.  

OOP *bundles several related functions and data together into what is called an "object."* Object Oriented Programming is a way of thinking about our programs as a set of objects that interact with each other. It’s meant to mimic objects in the real world, which is a very natural way for humans to think. Objects have attributes or facts about the object. Objects also can perform various functions, which we call methods.  

Since we're at 84.51, let's use the example of a Logistic Regression and a Decision Tree. Using ordinary functions, our program might look like this:

In [32]:
#Here are the model parameters and data
model_1 = 'logistic'
param_1 = 0.1 #Regularization term

model_2 = 'decision_tree'
param_2 = 5 #Max depth of tree

X = [[1, 2, 3], 
     [2, 4, 6], 
     [3, 6, 9], 
     [4, 8, 12], 
     [5, 10, 15]] #Some fake training data
y = [0,0,0,1,1] #Some fake target data



And suppose we train the model using a train_model function like this:

In [10]:
#assume "train_model" is a function that runs whatever model we indicate
coefficients_1 = train_model(model_1, param_1, X, y)
coefficients_2 = train_model(model_2, param_2, X, y)

Note that the first set of coefficients is an actual set of coefficients from a logistic regression. But the second set is a trained decision tree. This is already a bit odd, but just assume the function can handle it for now.  

Now let's make some predictions:

In [None]:
pred_1 = predict_model(coefficients_1, X)
pred_2 = predict_model(coefficients_2, X)

Presumably the output here will be a vector of predictions that are similar to y.  

At this point we might want to compare the training error rates of the two models. Presumably now we have to pass the results, pred_1 and pred_2, to some other function. Now what were the parameters associated with each model? We might have forgotten by now... and what if we wanted to run 3, 4 or dozens of models?  

The point I'm making is there are lots of functions, constants, lists, and outputs all floating around in this common space. These are all highly conceptually related, so it might make sense to make a single "LogisticRegression" object that contains:
* Model parameters (like regularization terms)
* A "fit" method, which is a function that learns the relationship between X and y
* A vector of learned coefficients that resulted from the training
* A "predict" method, which is a function that predicts y_hat based on new X observations
* A training (or testing) model score that tells us how well our model is doing
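
To show the shape of such a bundle without any machine learning machinery, here is a deliberately trivial sketch. `MajorityModel` is a made-up toy, not a scikit-learn class: it "learns" only the most common label and always predicts it. The class syntax itself is explained in the Classes vs Objects section; the point here is just that parameters, learned values, and the fit/predict/score functions all travel together.

```python
class MajorityModel:
    '''
    Toy stand-in for a model object. Always predicts the most
    common label seen during training.
    '''
    def __init__(self, name="majority"):
        self.name = name            # a model parameter
        self.most_common_ = None    # learned during fit

    def fit(self, X, y):
        # "Training" is just finding the most frequent label in y.
        self.most_common_ = max(set(y), key=y.count)
        return self

    def predict(self, X):
        # One prediction per row of X.
        return [self.most_common_ for _ in X]

    def score(self, X, y):
        # Fraction of predictions that match the true labels.
        preds = self.predict(X)
        return sum(p == t for p, t in zip(preds, y)) / len(y)


X = [[1, 2, 3], [2, 4, 6], [3, 6, 9], [4, 8, 12], [5, 10, 15]]
y = [0, 0, 0, 1, 1]

model = MajorityModel()
model.fit(X, y)
print(model.predict(X))
print(model.score(X, y))
```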

As it turns out, this exists in scikit-learn.

In [33]:
from sklearn.linear_model import LogisticRegression

lr = LogisticRegression(penalty = 'l2', C = 1)
lr.fit(X, y)
y_hat = lr.predict(X)
y_hat

array([0, 0, 1, 1, 1])

In [46]:
lr.score(X, y)

0.80000000000000004

In summary, lr is a LogisticRegression **object**.  
It has the following "**methods**" (which are functions attached to the object):
* fit --trains a model
* predict --runs a model on new data
* score --gives an accuracy score for the model
* and many more...

It also has the following "**attributes**" (which are data attached to the object):
* C --a regularization parameter
* coef_ --the learned model coefficients
* classes_ --the unique classes in the classification problem
* and many more...


## Classes vs Objects

In the interest of code re-use, I'm going to borrow a "car" example that I've used before.  

### Car Class

A “Class” is a blueprint for a specific object. It’s like the idea of a car. I can say “I love cars” and you know generally what I’m talking about. I’m not talking about a specific car, but the idea of cars in general.  

When defining a Car class, we define what attributes it has and what methods (functionality) it can perform. For example, it will have the attributes make, model, color, mileage, location, and fuel_level among others. It will be able to perform methods (functions) like drive, stop, and honk. We access these “attributes” and “methods” using dot-notation. To get the mileage we would type:  

    Car.mileage

To honk the horn we would use:  

    Car.honk()

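Note we can distinguish attributes and methods because methods use parentheses. Methods are simply functions, which is why they use parentheses like functions. In actual Python code, you would use dot-notation on a specific car object rather than on the Car class itself; the next section shows how to create one.  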
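
The same distinction shows up on Python's built-in objects. As a small sketch, a complex number has the attributes `real` and `imag` and a method `conjugate`:

```python
z = complex(3, 4)

# Attributes: data stored on the object, accessed without parentheses.
print(z.real)           # 3.0
print(z.imag)           # 4.0

# Method: a function attached to the object, called with parentheses.
print(z.conjugate())    # (3-4j)

# Without the parentheses you get the method itself rather than its result.
print(callable(z.conjugate))    # True
```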

### Car Object
So far we’ve only talked about Cars in general (i.e. the car “class”). Now let’s talk about creating a specific car (i.e. a car “object”).  
To create a specific car with particular make, model etc. we would type the following:  

```python
kevins_car = Car(make = "honda", model = "civic", color = "black", mileage = 65000,
                 location = [-84.51, 39.10], fuel_level = 0.75)
```

To create a much cooler car, it might look like this:  

```python
elons_car = Car(make = "tesla", model = "s", color = "red" …etc.)
```

We now have two specific car objects, each of which is a member of the Car class. In particular, we have one standard car called kevins_car and one amazing car called elons_car.  
To make kevins_car honk we type:  

```python
kevins_car.honk()
```

Some methods like honk() don’t change any of the object’s attributes. The car simply honks, and that’s that. Other methods might actually change the attributes of the car. For example if we type:  

```python
kevins_car.drive()
```

That will change the value of kevins_car.location. It should also increase the value of kevins_car.mileage and decrease the value of kevins_car.fuel_level.



In [43]:
class Car():
    def __init__(self, make, model, color, mileage, location, fuel_level):
        self.make = make
        self.model = model
        self.color = color
        self.mileage = mileage
        self.location = location
        self.fuel_level = fuel_level
        
    def honk(self):
        print("HONK!")
  
    def drive(self):
        #Simplify and assume car always drives west (longitude decreases)
        self.location[0] -= 0.1
        self.mileage += 1
        self.fuel_level -= 0.1


kevins_car = Car(make = "honda", model = "civic", color = "black", mileage = 65000,
                 location = [-84.51, 39.10], fuel_level = 0.75)

print("Statistics before driving:")
print(kevins_car.mileage)
print(kevins_car.fuel_level)
print(kevins_car.location)

print("Statistics after driving")
kevins_car.drive()
print(kevins_car.mileage)
print(kevins_car.fuel_level)
print(kevins_car.location)

Statistics before driving:
65000
0.75
[-84.51, 39.1]
Statistics after driving
65001
0.65
[-84.61, 39.1]
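
One consequence worth seeing directly: each object carries its own attribute values, so driving one car leaves another untouched. The sketch below uses a trimmed-down class so that it runs on its own. The name `SimpleCar` is made up for this example, and it keeps only two of the attributes.

```python
class SimpleCar:
    def __init__(self, mileage, fuel_level):
        self.mileage = mileage
        self.fuel_level = fuel_level

    def drive(self):
        # Same bookkeeping as Car.drive, minus the location update.
        self.mileage += 1
        self.fuel_level -= 0.1


car_a = SimpleCar(mileage=65000, fuel_level=0.75)
car_b = SimpleCar(mileage=100, fuel_level=1.0)

# Drive only the first car, twice.
car_a.drive()
car_a.drive()

print(car_a.mileage)    # 65002
print(car_b.mileage)    # 100
```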


## Inheritance and Subclasses

OOP allows you to modify existing classes through a concept called **inheritance**.  

With inheritance, you have a superclass that has certain functionality and a subclass that modifies or adds some functionality, but otherwise retains all the characteristics of the superclass.  

To continue with our car example, if we wanted to make a CompactCar or FullSizeCar class, rather than re-build the entire class from the ground up, we can "inherit" from the basic Car class. This automatically gives us all of the functionality of a car. We can then modify just the functionality we need.

In [45]:
class CompactCar(Car):
    '''
    A subclass of car that is smaller, and therefore has
    a quieter honk method.
    '''
    def honk(self):
        print("Beep!")


class FullSizeCar(Car):
    '''
    A subclass of car that is larger, and therefore has
    a louder honk method.
    '''

    def honk(self):
        print("AOOOGGAAA!!!")
        

little_car = CompactCar(make = "honda", model = "civic", color = "black", mileage = 65000,
                 location = [-84.51, 39.10], fuel_level = 0.75)

big_car = FullSizeCar(make = "honda", model = "civic", color = "black", mileage = 65000,
                 location = [-84.51, 39.10], fuel_level = 0.75)


kevins_car.honk()
little_car.honk()
big_car.honk()

HONK!
Beep!
AOOOGGAAA!!!
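
Two follow-up points can be sketched briefly. First, an instance of a subclass still counts as an instance of the superclass. Second, a subclass method can extend the parent's behavior with `super()` rather than replace it outright. The classes below are cut-down stand-ins written for this example (the names `Vehicle` and `LoudVehicle` are hypothetical), and the code assumes Python 3.

```python
class Vehicle:
    def __init__(self, mileage):
        self.mileage = mileage

    def honk(self):
        return "HONK!"

    def drive(self):
        self.mileage += 1


class LoudVehicle(Vehicle):
    def honk(self):
        # Reuse the parent's honk, then add to it.
        return super().honk() + " HONK! HONK!"


truck = LoudVehicle(mileage=10)
truck.drive()                       # inherited unchanged from Vehicle
print(truck.mileage)                # 11
print(truck.honk())                 # HONK! HONK! HONK!
print(isinstance(truck, Vehicle))   # True
```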
