# Object Oriented Programming in Python

Noah Markowitz

February 2019

## What is Object-Oriented Programming

Object-Oriented Programming (OOP):
* A way to group related variables and functions into logical units that can be reused for generalized tasks
* A way to build flexible and reusable code
* It works like a recipe: you supply the data, and it goes through a series of specific commands

In contrast to object-oriented programming, there's *imperative coding*, where the commands are written out step by step and run directly on the data.

**OOP**

```
class PrintList:
    def __init__(self,numberlist):
        self.numberlist = numberlist
    
    def print_list(self):
        for item in self.numberlist:
            print(f"Item {item}")

A = PrintList([1,2,3])
A.print_list()
```

**Imperative**


```
our_list = [1,2,3]
for item in our_list:
    print(f"Item {item}")
```


As the quantity of data gets larger, the OOP style becomes more practical: the logic is written once in the class and reused for each new dataset, rather than being repeated in loop after loop.
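A rough sketch of that reuse, assuming a hypothetical `NumberList` class with `total` and `largest` methods (none of these names come from the original example): the class is defined once and then instantiated for as many lists as needed.

```python
class NumberList:
    """Hypothetical container that bundles a list with operations on it."""

    def __init__(self, numbers):
        self.numbers = numbers

    def total(self):
        # Sum of the stored numbers
        return sum(self.numbers)

    def largest(self):
        # Largest stored number
        return max(self.numbers)


# One class definition serves any number of datasets
datasets = [NumberList([1, 2, 3]), NumberList([10, 20]), NumberList([7])]
totals = [d.total() for d in datasets]
largest_values = [d.largest() for d in datasets]

print(totals)          # [6, 30, 7]
print(largest_values)  # [3, 20, 7]
```

Adding a new dataset only requires one more `NumberList(...)` call; the methods do not need to be rewritten.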

### Reviewing Functions

In this exercise, we will review functions, as they are key building blocks of object-oriented programs. For this, we will create a simple function `average_numbers()` which averages a list of numbers. 

In [1]:
# Create function that returns the average of an integer list
def average_numbers(num_list): 
    avg = sum(num_list)/float(len(num_list)) # divide by length of list
    return avg

# Take the average of a list: my_avg
my_avg = average_numbers([1,2,3,4,5,6])

# Print out my_avg
print(my_avg)

3.5


In [2]:
# Create a list that contains two lists: matrix
matrix = [[1,2,3,4], [5,6,7,8]]

# Print the matrix list
print(matrix)

[[1, 2, 3, 4], [5, 6, 7, 8]]


### Internals of Numpy

NumPy is the foundation of pandas and many other packages.

NumPy arrays, unlike lists, can have multiple axes: an array can be an n-dimensional structure rather than a 1D sequence. In a 2D array:

* Axis 0 - Rows
* Axis 1 - Columns

Each axis can be indexed like a Python list.

**Create a function that returns a NumPy array**

In this exercise, we'll continue working with the numpy package and our previous structures.
We'll create a NumPy array of the float (numerical) data type so that we can work with multi-dimensional data objects, much like columns and rows in a spreadsheet.

In [3]:
# Import numpy as np
import numpy as np

# List input: my_matrix
my_matrix = [[1,2,3,4], [5,6,7,8]] 

# Function that converts lists to arrays: return_array
def return_array(matrix):
    array = np.array(matrix, dtype = float)
    return array
    
# Call return_array on my_matrix, and print the output
print(return_array(my_matrix))

[[1. 2. 3. 4.]
 [5. 6. 7. 8.]]


### Intro to Objects

What if we want to create the same `np.array()` multiple times or reference it in different pieces of code? Use classes.

`class` - Reusable chunk of code that bundles functions and variables

The vocabulary changes a bit with classes:
* Attribute = Variable that belongs to a class or object
* Method = Function that belongs to a class

Classes are like a cookie cutter, and an object is the cookie made from the cookie cutter. 
* We can use the cookie cutter to make multiple cookies and each can have different fillings or toppings
* However all cookies share similar properties
* Class is a template for an object
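The cookie-cutter analogy can be sketched in code. The `Cookie` class, its `shape` attribute, and the filling values are made up for illustration only.

```python
class Cookie:
    """Hypothetical class used only to illustrate the cookie-cutter analogy."""

    shape = "round"  # shared by every cookie cut from this template

    def __init__(self, filling):
        self.filling = filling  # differs from cookie to cookie


chocolate = Cookie("chocolate")
raisin = Cookie("raisin")

print(chocolate.filling, raisin.filling)  # chocolate raisin
print(chocolate.shape == raisin.shape)    # True
print(chocolate is raisin)                # False
```

Both objects come from the same template and share its properties, yet they are separate objects with their own fillings.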

**Creating a class**

We're going to be working on building a class, which is a way to organize functions and variables in Python.

In [4]:
# Create a class: DataShell
class DataShell: 
    pass

`pass` is a placeholder statement that does nothing. Python requires a class body to contain at least one statement, so `pass` lets us define the class before putting any attributes or methods into it.

Remember: **Objects are instances of classes. Classes are templates**, not the other way around.

---


## Deep Dive into Classes and Objects

Classes have a constructor function, `__init__`, which sets up each new instance. Functions defined within the class definition are methods.

Parts of a Python class are:
* Constructor
* Methods
* Attributes

Here's what a fully constructed class looks like:
```
import numpy as np
from scipy import stats  # assumption: `stats` refers to scipy.stats

class DataShell:
    # Constructor
    def __init__(self, filename):
        self.filename = filename

    # Load the file into a NumPy array
    def create_datashell(self):
        self.array = np.genfromtxt(self.filename, delimiter=',', dtype=None)
        return self.array

    # Rename a column in the header row (row 0)
    def rename_column(self, old_colname, new_colname):
        for index, value in enumerate(self.array[0]):
            if value == old_colname.encode('UTF-8'):
                self.array[0][index] = new_colname
        return self.array

    def show_shell(self):
        print(self.array)

    # Summarize one column, skipping the header row
    def five_figure_summary(self, col_pos):
        # np.float was removed from NumPy; the built-in float works the same here
        statistics = stats.describe(self.array[1:, col_pos].astype(float))
        return f"Five-figure stats of column {col_pos}: {statistics}"
```

### Creating an Instance

As we learned earlier, a class is like a blueprint: we can make many copies of our class. When we do this, we say that we are instantiating our class. These instances are called objects.

Here is an example of class instantiation:

`object_name = ClassName()`

In [5]:
# Create empty class: DataShell
class DataShell:
  
    # Pass statement
    pass

# Instantiate DataShell: my_data_shell
my_data_shell = DataShell()

# Print my_data_shell
print(my_data_shell)

<__main__.DataShell object at 0x1069a70b8>


### Initializing a class 

The `__init__` method and the `self` parameter for objects

The `__init__` method is the constructor. It is called automatically when an object is created, so it is where the object's initial values are set. It's like the container for all the ingredients in the recipe for an object.

`self` represents the current instance of the class, that is, the object the method is being called on.

Example:

```
class Dinosaur:
    def __init__(self):
        self.tail = 'Yes'
```
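To make the role of `self` concrete, here is a small sketch that extends the `Dinosaur` example with a `name` parameter and a `describe` method (both are additions for illustration). Calling a method on an instance is equivalent to calling it on the class and passing the instance explicitly.

```python
class Dinosaur:
    def __init__(self, name):
        self.name = name

    def describe(self):
        # self is whichever instance the method was called on
        return f"{self.name} is a dinosaur"


rex = Dinosaur("Rex")
via_instance = rex.describe()       # Python passes rex as self
via_class = Dinosaur.describe(rex)  # the same call, with self passed by hand

print(via_instance)               # Rex is a dinosaur
print(via_instance == via_class)  # True
```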

In [6]:
# Create class: DataShell
class DataShell:
  
    # Initialize class with self argument
    def __init__(self):
      
        # Pass statement
        pass

# Instantiate DataShell: my_data_shell
my_data_shell = DataShell()

# Print my_data_shell
print(my_data_shell)

<__main__.DataShell object at 0x1069a73c8>


#### Instance Variables

Class instances are useful in that we can store values in them at the time of instantiation. We store these values in instance variables. This means that we can have many instances of the same class whose instance variables hold different values!

In [7]:
# Create class: DataShell
class DataShell:
  
    # Initialize class with self and integerInput arguments
    def __init__(self, integerInput):

        # Set data as instance variable, and assign the value of integerInput
        self.data = integerInput

# Declare variable x with value of 10
x = 10    

# Instantiate DataShell passing x as argument: my_data_shell
my_data_shell = DataShell(x)

# Print my_data_shell.data
print(my_data_shell.data)

10


Above, `data` is an instance variable.

Notice that instance variables are usually set in the body of the initialization method, since that is the code that runs when the object is instantiated.

It is also important to notice that they are preceded by `self.`, which refers to the instance itself.
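As a minimal sketch of the point about different values per instance, two objects built from the same class can be compared side by side. The variable names `first` and `second` are illustrative; `vars()` is the built-in that lists an object's instance variables.

```python
class DataShell:
    def __init__(self, data):
        # self.data belongs to the instance being created
        self.data = data


first = DataShell(10)
second = DataShell([1, 2, 3])

print(first.data)   # 10
print(second.data)  # [1, 2, 3]
print(vars(first))  # {'data': 10}
```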

Now we'll create multiple instance variables. We will declare two instance variables: `identifier` and `data`.

In [8]:
# Create class: DataShell
class DataShell:
  
    # Initialize class with self, identifier and data arguments
    def __init__(self, identifier, data):

        # Set identifier and data as instance variables, assigning value of input arguments
        self.identifier = identifier
        self.data = data

# Declare variable x with value of 100, and y with list of integers from 1 to 5
x = 100
y = [1, 2, 3, 4, 5]

# Instantiate DataShell passing x and y as arguments: my_data_shell
my_data_shell = DataShell(x,y)

# Print my_data_shell.identifier
print(my_data_shell.identifier)

# Print my_data_shell.data
print(my_data_shell.data)

100
[1, 2, 3, 4, 5]


#### Class and Instance Variables

Static variables - Variables shared by every member of a class; they keep the same value no matter which instance we work with

Instance variables - Set when the class is instantiated, from the arguments passed to it

**Class Variables**

We saw that each instance can hold different values in its instance variables.

But what if we want every instance of a class to hold the same value for a specific variable? Enter class variables, which are Python's form of static variables.

Class variables are not specified at the time of instantiation; instead, they are declared in the class definition.

In [9]:
# Create class: DataShell
class DataShell:
  
    # Declare a class variable family, and assign value of "DataShell"
    family = "DataShell"
    
    # Initialize class with self, identifier arguments
    def __init__(self, identifier):
      
        # Set identifier as instance variable of input argument
        self.identifier = identifier

# Declare variable x with value of 100
x = 100

# Instantiate DataShell passing x as argument: my_data_shell
my_data_shell = DataShell(x)

# Print my_data_shell class variable family
print(my_data_shell.family)

DataShell


**Overriding Class Variables**

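Sometimes the value of a class variable is not correct for a particular object, and hence not useful. In that case we can override the value on that object. Assigning through the instance, as in `my_data_shell.family = ...`, changes what that one object reports; the class itself and its other instances keep the original value.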

In [10]:
# Create class: DataShell
class DataShell:
  
    # Declare a class variable family, and assign value of "DataShell"
    family = "DataShell"
    
    # Initialize class with self, identifier arguments
    def __init__(self, identifier):
      
        # Set identifier as instance variables, assigning value of input arguments
        self.identifier = identifier

# Declare variable x with value of 100
x = 100

# Instantiate DataShell passing x as argument: my_data_shell
my_data_shell = DataShell(x)

# Print my_data_shell class variable family
print(my_data_shell.family)

# Override the my_data_shell.family value with "NotDataShell"
my_data_shell.family = "NotDataShell"

# Print my_data_shell class variable family once again
print(my_data_shell.family)

DataShell
NotDataShell
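A short sketch of what overriding through an instance does, using two hypothetical instances `first` and `second`: the assignment creates an instance attribute that shadows the class variable, while the class and every other instance are left untouched.

```python
class DataShell:
    family = "DataShell"  # class variable

    def __init__(self, identifier):
        self.identifier = identifier


first = DataShell(1)
second = DataShell(2)

# Assigning through an instance shadows the class variable for that instance only
first.family = "NotDataShell"
print(first.family)      # NotDataShell
print(second.family)     # DataShell
print(DataShell.family)  # DataShell

# Assigning through the class changes the value for instances that have not shadowed it
DataShell.family = "Shell"
print(first.family)   # NotDataShell
print(second.family)  # Shell
```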


### Methods in Classes

Methods are functions defined within a class. They take the instance, `self`, as their first argument.

Below, a simple method `print_static()` is defined and run.

In [11]:
# Create class: DataShell
class DataShell:
  
    # Initialize class with self argument
    def __init__(self):
        pass

    # Define method which takes self argument: print_static
    def print_static(self):
        # Print string
        print("You just executed a class method!")
        
# Instantiate DataShell taking no arguments: my_data_shell
my_data_shell = DataShell()

# Call the print_static method of your newly created object
my_data_shell.print_static()

You just executed a class method!


Now another method, this time one that uses an instance variable:

In [12]:
# Create class: DataShell
class DataShell:
  
    # Initialize class with self and dataList as arguments
    def __init__(self, dataList):
        # Set data as instance variable, and assign it the value of dataList
        self.data = dataList

    # Define method which takes self argument: show
    def show(self):
        # Print the instance variable data
        print(self.data)

# Declare variable with list of integers from 1 to 10: integer_list   
integer_list = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
        
# Instantiate DataShell taking integer_list as argument: my_data_shell
my_data_shell = DataShell(integer_list)

# Call the show method of your newly created object
my_data_shell.show()

[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]


A more interesting one: a method that computes the average of the data

In [13]:
# Create class: DataShell
class DataShell:
  
    # Initialize class with self and dataList as arguments
    def __init__(self, dataList):
        # Set data as instance variable, and assign it the value of dataList
        self.data = dataList

    # Define method that prints data: show
    def show(self):
        print(self.data)
        
    # Define method that prints average of data: avg 
    def avg(self):
        # Declare avg and assign it the average of data
        avg = sum(self.data)/float(len(self.data))
        # Print avg
        print(avg)
        
# Instantiate DataShell taking integer_list as argument: my_data_shell
my_data_shell = DataShell(integer_list)

# Call the show and avg methods of your newly created object
my_data_shell.show()
my_data_shell.avg()

[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
5.5


## Fancy classes, fancy objects

Now we apply classes to data.

We start by using the `return` statement, so that methods hand their results back to the caller instead of only printing them.

In [14]:
# Create class: DataShell
class DataShell:
  
    # Initialize class with self and dataList as arguments
    def __init__(self, dataList):
        # Set data as instance variable, and assign it the value of dataList
        self.data = dataList

    # Define method that returns data: show
    def show(self):
        return self.data

    # Define method that returns average of data: avg
    def avg(self):
        # Declare avg and assign it the average of data
        avg = sum(self.data)/float(len(self.data))
        # Return avg
        return avg

integer_list = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

# Instantiate DataShell taking integer_list as argument: my_data_shell
my_data_shell = DataShell(integer_list)

# Print output of your object's show method
print(my_data_shell.show())

# Print output of your object's avg method
print(my_data_shell.avg())

[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
5.5
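The difference between printing and returning can be sketched as follows. The method name `avg_printed` is made up for the comparison: a method that only prints gives the caller `None`, while a method that returns gives a value the caller can keep working with.

```python
class DataShell:
    def __init__(self, data):
        self.data = data

    def avg_printed(self):
        # Prints the average; the method itself returns None
        print(sum(self.data) / len(self.data))

    def avg(self):
        # Returns the average so the caller can reuse it
        return sum(self.data) / len(self.data)


shell = DataShell([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
printed_result = shell.avg_printed()  # displays 5.5
returned_result = shell.avg()

print(printed_result)       # None
print(returned_result * 2)  # 11.0
```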


In [15]:
# Load numpy as np and pandas as pd
import numpy as np
import pandas as pd

# Create class: DataShell
class DataShell:
  
    # Initialize class with self and inputFile
    def __init__(self, inputFile):
        self.file = inputFile
        
    # Define generate_csv method, with self argument
    def generate_csv(self):
        self.data_as_csv = pd.read_csv(self.file)
        return self.data_as_csv

# Instantiate DataShell with us_life_expectancy as input argument
data_shell = DataShell('us_life_expectancy.csv')

# Call data_shell's generate_csv method, assign it to df
df = data_shell.generate_csv()

# Print df
print(df.head())

         country code  year  life_expectancy
0  United States  USA  1880        39.410000
1  United States  USA  1890        45.209999
2  United States  USA  1901        49.299999
3  United States  USA  1902        50.500000
4  United States  USA  1903        50.599998


### Data as Attributes

In the previous coding exercise you wrote a method within your DataShell class that returns a pandas DataFrame.

In this one, we will bake the data into our class as an instance variable. This is so that we can do fancy things later, such as renaming columns and getting descriptive statistics.

In [16]:
# Import numpy as np, pandas as pd
import numpy as np
import pandas as pd

# Create class: DataShell
class DataShell:
  
    # Define initialization method
    def __init__(self, filepath):
        # Set filepath as instance variable  
        self.filepath = filepath
        # Set data_as_csv as instance variable
        self.data_as_csv = pd.read_csv(filepath)

# Instantiate DataShell as us_data_shell
us_data_shell = DataShell('us_life_expectancy.csv')

# Print your object's data_as_csv attribute
print(us_data_shell.data_as_csv.head())

         country code  year  life_expectancy
0  United States  USA  1880        39.410000
1  United States  USA  1890        45.209999
2  United States  USA  1901        49.299999
3  United States  USA  1902        50.500000
4  United States  USA  1903        50.599998


Now your classes can store data as instance variables, which means you can execute methods on that data!

In [17]:
# Create class DataShell
class DataShell:
  
    # Define initialization method
    def __init__(self, filepath):
        self.filepath = filepath
        self.data_as_csv = pd.read_csv(filepath)
    
    # Define method rename_column, with arguments self, column_name, and new_column_name
    def rename_column(self, column_name, new_column_name):
        self.data_as_csv.columns = self.data_as_csv.columns.str.replace(column_name, new_column_name)

# Instantiate DataShell as us_data_shell with argument us_life_expectancy
us_data_shell = DataShell('us_life_expectancy.csv')

# Print the datatype of your object's data_as_csv attribute
print(us_data_shell.data_as_csv.dtypes)

# Rename your object's column 'code' to 'country_code'
us_data_shell.rename_column('code', 'country_code')

# Again, print the datatype of your object's data_as_csv attribute
print(us_data_shell.data_as_csv.dtypes)

country             object
code                object
year                 int64
life_expectancy    float64
dtype: object
country             object
country_code        object
year                 int64
life_expectancy    float64
dtype: object


#### Self-Describing DataShells

In this exercise you will add functionality to your DataShell class such that it returns information about itself.

In [18]:
# Create class DataShell
class DataShell:

    # Define initialization method
    def __init__(self, filepath):
        self.filepath = filepath
        self.data_as_csv = pd.read_csv(filepath)

    # Define method rename_column, with arguments self, column_name, and new_column_name
    def rename_column(self, column_name, new_column_name):
        self.data_as_csv.columns = self.data_as_csv.columns.str.replace(column_name, new_column_name)
        
    # Define get_stats method, with argument self
    def get_stats(self):
        # Return a description of data_as_csv
        return self.data_as_csv.describe()
    
# Instantiate DataShell as us_data_shell
us_data_shell = DataShell('us_life_expectancy.csv')

# Print the output of your object's get_stats method
print(us_data_shell.get_stats())

              year  life_expectancy
count   117.000000       117.000000
mean   1956.752137        66.556684
std      34.398252         9.551079
min    1880.000000        39.410000
25%    1928.000000        58.500000
50%    1957.000000        69.599998
75%    1986.000000        74.772003
max    2015.000000        79.244003


### OOP Best Practices

Three good things to do:

1. Read other people's code

2. Use PEP 8 style

3. Use a combination of the two

One of the best ways to get good at coding is to read other people's code.

Documentation is also critical.
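As a small sketch of what PEP 8 naming and basic documentation look like together, consider the hypothetical class below (`TemperatureLog` and its contents are invented for illustration). PEP 8 recommends `CapWords` for class names and `lowercase_with_underscores` for methods and variables, and docstrings make the code self-describing.

```python
class TemperatureLog:
    """Store temperature readings and summarize them.

    Hypothetical class for illustration only.
    """

    def __init__(self, readings):
        self.readings = list(readings)

    def mean_reading(self):
        """Return the arithmetic mean of the stored readings."""
        return sum(self.readings) / len(self.readings)


log = TemperatureLog([20.0, 22.0, 24.0])
print(log.mean_reading())                   # 22.0
print(TemperatureLog.mean_reading.__doc__)  # Return the arithmetic mean of the stored readings.
```

Under these conventions, argument names such as `integerInput` and `dataList` would be written `integer_input` and `data_list`.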

## Inheritance

Inheritance is about reusability. A child class starts with everything its parent class defines and can then add to it or change it, subtly or substantially.

Ex: Parent class is "Dinosaur" and child classes that inherit the properties of "Dinosaur" can be "TRex" and "Pterodactyl"

Inheritance is useful because you can extend the functionality of your current class without overwriting it.

**Easy Key to tell Inheritance**

* A Pterodactyl is-a Dinosaur
* A Tyrannosaurus is-a Dinosaur
* Is a Pterodactyl a dinosaur? Yes, Pterodactyl inherits from Dinosaur.
* Is a Tyrannosaurus a pterodactyl? No, but they're both dinosaurs.
* Is a dinosaur a pterodactyl? No, so it doesn't work the other way, either
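The is-a questions above map directly onto Python's built-in `issubclass` and `isinstance` checks. The empty classes below are a minimal sketch written only for this test.

```python
class Dinosaur:
    pass


class Pterodactyl(Dinosaur):
    pass


class Tyrannosaurus(Dinosaur):
    pass


print(issubclass(Pterodactyl, Dinosaur))       # True
print(issubclass(Tyrannosaurus, Pterodactyl))  # False
print(issubclass(Dinosaur, Pterodactyl))       # False
print(isinstance(Tyrannosaurus(), Dinosaur))   # True
```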

Below is an example of inheritance using animals. There's a general class, `Animal`, and `Mammal` and `Reptile` are Animals. 

In [19]:
# Create a class Animal
class Animal:
    def __init__(self, name):
        self.name = name

# Create a class Mammal, which inherits from Animal
class Mammal(Animal):
    def __init__(self, name, animal_type):
        # Let Animal set self.name
        super().__init__(name)
        self.animal_type = animal_type

# Create a class Reptile, which also inherits from Animal
class Reptile(Animal):
    def __init__(self, name, animal_type):
        # Let Animal set self.name
        super().__init__(name)
        self.animal_type = animal_type

# Instantiate a mammal with name 'Daisy' and animal_type 'dog': daisy
daisy = Mammal('Daisy', 'dog')

# Instantiate a reptile with name 'Stella' and animal_type 'alligator': stella
stella = Reptile('Stella', 'alligator')

# Print both objects
print(daisy)
print(stella)

<__main__.Mammal object at 0x1158d3f60>
<__main__.Reptile object at 0x1158d3e48>
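One detail worth spelling out: a child class that defines its own `__init__` replaces the parent's, so the parent initializer only runs if it is called explicitly, typically through `super()`. The sketch below contrasts the two cases; the class names `MammalWithoutSuper` and `MammalWithSuper` are made up for the comparison.

```python
class Animal:
    def __init__(self, name):
        self.name = name


class MammalWithoutSuper(Animal):
    def __init__(self, name, animal_type):
        # Animal.__init__ is never called, so self.name is never set
        self.animal_type = animal_type


class MammalWithSuper(Animal):
    def __init__(self, name, animal_type):
        super().__init__(name)  # let Animal set self.name
        self.animal_type = animal_type


without_name = MammalWithoutSuper("Daisy", "dog")
with_name = MammalWithSuper("Daisy", "dog")

print(hasattr(without_name, "name"))  # False
print(with_name.name)                 # Daisy
```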


In [20]:
# Create a class Vertebrate
class Vertebrate:
    spinal_cord = True
    def __init__(self, name):
        self.name = name

# Create a class Mammal, which inherits from Vertebrate
class Mammal(Vertebrate):
    def __init__(self, name, animal_type):
        # Let Vertebrate set self.name
        super().__init__(name)
        self.animal_type = animal_type
        self.temperature_regulation = True

# Create a class Reptile, which also inherits from Vertebrate
class Reptile(Vertebrate):
    def __init__(self, name, animal_type):
        # Let Vertebrate set self.name
        super().__init__(name)
        self.animal_type = animal_type
        self.temperature_regulation = False

# Instantiate a mammal with name 'Daisy' and animal_type 'dog': daisy
daisy = Mammal('Daisy', 'dog')

# Instantiate a reptile with name 'Stella' and animal_type 'alligator': stella
stella = Reptile('Stella', 'alligator')

# Print stella's attributes spinal_cord and temperature_regulation
print("Stella Spinal cord: " + str(stella.spinal_cord))
print("Stella temperature regulation: " + str(stella.temperature_regulation))

# Print daisy's attributes spinal_cord and temperature_regulation
print("Daisy Spinal cord: " + str(daisy.spinal_cord))
print("Daisy temperature regulation: " + str(daisy.temperature_regulation))

Stella Spinal cord: True
Stella temperature regulation: False
Daisy Spinal cord: True
Daisy temperature regulation: True


**Now Applying Inheritance to DataShell**

In [21]:
# Load numpy as np and pandas as pd
import numpy as np
import pandas as pd

# Create class: DataShell
class DataShell:
    def __init__(self, inputFile):
        self.file = inputFile

# Instantiate DataShell as my_data_shell
my_data_shell = DataShell('us_life_expectancy.csv')

# Print my_data_shell
print(my_data_shell)

<__main__.DataShell object at 0x1158ccc18>


In [22]:
# Load numpy as np and pandas as pd
import numpy as np
import pandas as pd

# Create class: DataShell
class DataShell:
    def __init__(self, inputFile):
        self.file = inputFile

# Create class CsvDataShell, which inherits from DataShell
class CsvDataShell(DataShell):
    # Initialization method with arguments self, inputFile
    def __init__(self, inputFile):
        # Let DataShell set self.file
        super().__init__(inputFile)
        # Instance variable data
        self.data = pd.read_csv(inputFile)

# Instantiate CsvDataShell as us_data_shell, passing us_life_expectancy as argument
us_data_shell = CsvDataShell('us_life_expectancy.csv')

# Print us_data_shell.data
print(us_data_shell.data.head())

         country code  year  life_expectancy
0  United States  USA  1880        39.410000
1  United States  USA  1890        45.209999
2  United States  USA  1901        49.299999
3  United States  USA  1902        50.500000
4  United States  USA  1903        50.599998


## Composition

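Composition means building a class out of other objects: instead of inheriting from another class (is-a), the class stores instances of other classes as attributes (has-a), producing a kind of "Frankenstein class". One example is using pandas objects as data in classes, as `CsvDataShell` does below by keeping a DataFrame in `self.data`.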
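A minimal standard-library sketch of the has-a idea, independent of pandas: `CountryReport` stores a `CsvTable` object as an attribute and delegates to it. Both class names are hypothetical, and the sample values are made-up numbers, not taken from the life expectancy files.

```python
import csv
import io


class CsvTable:
    """Hypothetical helper that parses CSV text into a list of row dicts."""

    def __init__(self, text):
        self.rows = list(csv.DictReader(io.StringIO(text)))


class CountryReport:
    """Has a CsvTable rather than inheriting from one."""

    def __init__(self, name, text):
        self.name = name
        self.table = CsvTable(text)  # composition: an object stored as an attribute

    def mean_life_expectancy(self):
        values = [float(row["life_expectancy"]) for row in self.table.rows]
        return sum(values) / len(values)


# Made-up sample data for illustration
sample = "year,life_expectancy\n1900,40.0\n1950,60.0\n2000,80.0\n"
report = CountryReport("Example", sample)
print(report.mean_life_expectancy())  # 60.0
```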

In [23]:
# Define base class DataShell
class DataShell:
    # Class variable family
    family = 'DataShell'
    # Initialization method with arguments, and instance variables
    def __init__(self, name, filepath): 
        self.name = name
        self.filepath = filepath

# Define class CsvDataShell      
class CsvDataShell(DataShell):
    # Initialization method with arguments self, name, filepath
    def __init__(self, name, filepath):
        # Let DataShell set self.name and self.filepath
        super().__init__(name, filepath)
        # Instance variable data
        self.data = pd.read_csv(filepath)
        # Instance variable stats
        self.stats = self.data.describe()

# Instantiate CsvDataShell as us_data_shell
us_data_shell = CsvDataShell("US", 'us_life_expectancy.csv')

# Print us_data_shell.stats
print(us_data_shell.stats)

              year  life_expectancy
count   117.000000       117.000000
mean   1956.752137        66.556684
std      34.398252         9.551079
min    1880.000000        39.410000
25%    1928.000000        58.500000
50%    1957.000000        69.599998
75%    1986.000000        74.772003
max    2015.000000        79.244003


In [24]:
# Define base class DataShell
class DataShell:
    family = 'DataShell'
    def __init__(self, name, filepath): 
        self.name = name
        self.filepath = filepath

# Define class CsvDataShell
class CsvDataShell(DataShell):
    def __init__(self, name, filepath):
        # Let DataShell set self.name and self.filepath
        super().__init__(name, filepath)
        self.data = pd.read_csv(filepath)
        self.stats = self.data.describe()

# Define class TsvDataShell
class TsvDataShell(DataShell):
    # Initialization method with arguments self, name, filepath
    def __init__(self, name, filepath):
        # Let DataShell set self.name and self.filepath
        super().__init__(name, filepath)
        # Instance variable data
        self.data = pd.read_table(filepath)
        # Instance variable stats
        self.stats = self.data.describe()

# Instantiate CsvDataShell as us_data_shell, print us_data_shell.stats
us_data_shell = CsvDataShell("US", 'us_life_expectancy.csv')
print(us_data_shell.stats)

# Instantiate TsvDataShell as france_data_shell, print france_data_shell.stats
france_data_shell = TsvDataShell('France', 'france_life_expectancy.csv')
print(france_data_shell.stats)

              year  life_expectancy
count   117.000000       117.000000
mean   1956.752137        66.556684
std      34.398252         9.551079
min    1880.000000        39.410000
25%    1928.000000        58.500000
50%    1957.000000        69.599998
75%    1986.000000        74.772003
max    2015.000000        79.244003
       country,code,year,life_expectancy
count                                200
unique                               200
top            France,FRA,1957,69.045998
freq                                   1
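The second table in the output above has a single column named `country,code,year,life_expectancy`. That suggests `france_life_expectancy.csv` is comma-separated, while `pd.read_table` splits on tabs by default, so each row was read as one string. The same effect can be reproduced with the standard library's `csv` module; the sample text below is a made-up fragment.

```python
import csv
import io

# Made-up comma-separated sample
comma_text = "country,code,year\nFrance,FRA,1957\n"

# Parsed with the matching delimiter: three fields per row
as_csv = list(csv.reader(io.StringIO(comma_text), delimiter=","))

# Parsed as tab-separated: no tabs are found, so each row is a single field
as_tsv = list(csv.reader(io.StringIO(comma_text), delimiter="\t"))

print(as_csv[0])  # ['country', 'code', 'year']
print(as_tsv[0])  # ['country,code,year']
```

Picking the subclass whose reader matches the file's delimiter avoids the single-column result.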
