# DSDJ OOP Workshop

<br>

# Part 1 - Theory and Examples

## 1.1 Procedural programming

### What is procedural code?

**Procedural code is as much a mindset as it is a literal coding style.**

**The idea is that your code is a set of instructions that run one after another.**

### Let's look at an example

Our goal is to find the best of 3 given students. Each student has 3 exam grades.
We define the best student as the one with the highest average grade.

Then, we want to add two new students to the group and find the best and worst student.
We define the worst student as the one with the lowest average grade.

In [17]:
student_grades = {'ahmad':[92, 81, 86], 'rahul':[39, 58, 81], 'kelly':[100, 71, 80]}

average_grades = {}

for student in student_grades:
    grades = student_grades[student]
    average_grades[student] = sum(grades)/len(grades)
    
best_student = max(average_grades, key=average_grades.get)

print('The best student is ' + best_student)

student_grades['joe'] = [94, 96, 91]

average_grades['joe'] = sum(student_grades['joe'])/len(student_grades['joe'])

student_grades['afra'] = [94, 23, 0]

average_grades['afra'] = sum(student_grades['afra'])/len(student_grades['afra'])
    
best_student = max(average_grades, key=average_grades.get)

print('The best student is ' + best_student)

worst_student = min(average_grades, key=average_grades.get)

print('The worst student is ' + worst_student)


The best student is ahmad
The best student is joe
The worst student is afra


### Problems with this code:
-  Can't be reused
 - what if we want to do the same procedure with 5 new students?
 - we need to copy-paste code and change parts of it, hoping we don't make any mistakes along the way
<br>
<br>
- Can't be tested
 - what if the worst_student calculation is wrong? How would we know?
 - different pieces of functionality can't be isolated; they can all become intertwined because every variable has global scope
<br>
<br>
-  Can't be refactored easily
 - if we want to change the average calculation, we need to change it in 3 places
<br>
<br>
-  Can't be debugged easily
 - if we end up with a bug, we have to manually trace it back through the entire program
<br>
<br>
-  The code becomes complex very quickly: every new requirement adds more copy-pasted steps and more global variables to keep track of
<br>
<br>
-  If it can't be reused or tested, then it can't be automated.
<br>
<br>
-  It's too specific! At best, it solves exactly 1 problem... it will never scale.

## 1.2 Object-oriented programming

### What is object-oriented code?

**Object-oriented programming is a paradigm that says that our program should consist of well-defined objects that can be modified and interact with each other in predictable ways.**

**There should be a pre-defined interface that says what an object can and can't do and what data it has associated with it.**

**Instead of having a list of complex procedures to follow, we'll create and modify objects that hide the complexity internally.**

**Then, our procedural code will simply describe how objects are created, modified, and interact.**

<br>

### Let's look at an example

In [68]:
class Students():
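    def __init__(self, student_grades=None):
        # avoid a mutable default argument: a {} default would be shared by every instance
        self.student_grades = {} if student_grades is None else student_grades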
        self.average_grades = {}
        self.best_student = None
        self.worst_student = None
        self._calculate()
        
    def _calculate(self):
        """calls internal functions to calculate and update self.average_grades, self.best_student, and self.worst_student"""
        self._calc_ave_grades()
        self._calc_best_student()
        self._calc_worst_student()

    def _calc_ave_grades(self):
        """calculates the average grades for all students and updates self.average_grades"""
        for student in self.student_grades:
            grades = self.student_grades[student]
            self.average_grades[student] = sum(grades)/len(grades)
    
    def _calc_best_student(self):
        """finds the student with the highest average grade and stores result in self.best_student"""
        # default=None avoids a ValueError when there are no students yet
        self.best_student = max(self.average_grades, key=self.average_grades.get, default=None)

    def _calc_worst_student(self):
        """finds the student with the lowest average grade and stores result in self.worst_student"""
        self.worst_student = min(self.average_grades, key=self.average_grades.get, default=None)
        
    def add_student(self, name, grades):
        """adds a new student and recalculates grade averages, best student, and worst student"""
        self.student_grades[name] = grades
        self._calculate()
    
    def print_best_student(self):
        """prints the best student"""
        print('The best student is ' + self.best_student + '.')
        
    def print_worst_student(self):
        """prints the worst student"""
        print('The worst student is ' + self.worst_student + '.')

#### By clearly defining our Students object, our procedural code becomes extremely simple:

In [70]:
my_students = Students({'ahmad':[92, 81, 86], 'rahul':[39, 58, 81], 'kelly':[100, 71, 80]})

my_students.print_best_student()

my_students.add_student('joe', [94, 96, 91])

my_students.add_student('afra', [94, 23, 0])

my_students.print_best_student()

my_students.print_worst_student()

The best student is ahmad.
The best student is joe.
The worst student is afra.


### What's the difference?
15 lines of code were reduced to 6 lines of code.
Our untestable code was broken into smaller, testable chunks.
<br>

Now, we're just creating a Students object and interacting with it in a pre-defined way.

There is far less risk of accidentally miscalculating the average grade or best student, because that logic now lives in one place inside the class.

All of the complexity has been moved into the class and abstracted out of the procedural code.

We simply have an interface that interacts with a Students object.

It would now be easy to create an API for adding students to the group and finding the best student.

This interface can easily be extended with new functionality for all students.

We can also create new student groups... for example, we may want one Students object to represent a class of 3rd graders and another Students object to represent a class of 4th graders.

The procedures will not increase in complexity because they only go through the API: the class defines how we can and cannot interact with each object.

We have abstracted away the complexity and encapsulated the data and functions needed for each object as attributes and methods in the class.

In [7]:
third_graders = Students({'ahmad':[92, 81, 86], 'rahul':[39, 58, 81], 'kelly':[100, 71, 80]})
fourth_graders = Students({'ashish':[78, 85, 80], 'jim':[55, 85, 80], 'li':[40, 61, 50]})

third_graders.print_best_student()
fourth_graders.print_best_student()

The best student is ahmad.
The best student is ashish.


If we want to do the same thing with the procedural code, we now have to copy-paste the entire code and add prefixes to every variable...

<br>
<br>

## 1.3 Definitions and OOP advantages

### Object
Def: group of functions (aka methods) and variables (aka attributes) bundled together

### Class
Def: template used to create objects

### Instantiation
Def: the process of creating an object from a class and some input parameter(s)

### OOP Advantages:
-  moves complexity away from procedure structure into object structure
-  allows many similar objects to be created from the same template
-  ensures all objects are consistent since the same template is used (no risk of writing an incorrect procedure)
-  keeps procedures simple by creating a well-defined system
-  easily testable with unit tests
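
As a minimal sketch of the last point, the tests below check a condensed version of the Students class from section 1.2 using plain assert statements. The condensed class body and the test function names are assumptions made for illustration; a real project would more likely use a test framework such as unittest or pytest.

```python
class Students:
    """Condensed version of the Students class from section 1.2 (illustration only)."""

    def __init__(self, student_grades=None):
        self.student_grades = {} if student_grades is None else dict(student_grades)
        self._calculate()

    def _calculate(self):
        # recompute the averages, then the best and worst student
        self.average_grades = {
            name: sum(grades) / len(grades)
            for name, grades in self.student_grades.items()
        }
        self.best_student = max(self.average_grades, key=self.average_grades.get, default=None)
        self.worst_student = min(self.average_grades, key=self.average_grades.get, default=None)

    def add_student(self, name, grades):
        self.student_grades[name] = grades
        self._calculate()


def test_best_and_worst():
    group = Students({'ahmad': [92, 81, 86], 'rahul': [39, 58, 81]})
    assert group.best_student == 'ahmad'
    assert group.worst_student == 'rahul'


def test_add_student_updates_results():
    group = Students({'ahmad': [92, 81, 86]})
    group.add_student('joe', [94, 96, 91])
    assert group.best_student == 'joe'
    assert group.worst_student == 'ahmad'


test_best_and_worst()
test_add_student_updates_results()
print('all tests passed')
```

Each test builds its own small object, so a failure points directly at the method that is wrong.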

<br>

## 1.4 Four Pillars of OOP

### 1 - Encapsulation

Def: Bundling data together with the functions that operate on it, hiding the implementation details from the end user and reducing variable scope to the object level

Benefits:
-  Protects the user from accidentally accessing or modifying local variables
-  Ties variables and functions together
    
Example:
    
    player_1_score = 7
    player_1_time = 243
    player_1_final_level = 3
    
becomes...
    
    player_1.score
    player_1.time
    player_1.final_level
    
Now, it is clear which score belongs with which player.
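
A minimal sketch of a class that would make the attribute access above possible. The Player class, its default values, and the summary method are hypothetical names chosen for illustration.

```python
class Player:
    """Hypothetical Player class that bundles one player's data together."""

    def __init__(self, score=0, time=0, final_level=1):
        self.score = score
        self.time = time
        self.final_level = final_level

    def summary(self):
        # the method works on the object's own data; no global variables are needed
        return f'score={self.score}, time={self.time}, level={self.final_level}'


player_1 = Player(score=7, time=243, final_level=3)
player_2 = Player(score=4, time=180, final_level=2)

print(player_1.summary())  # score=7, time=243, level=3
print(player_2.score)      # 4
```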
<br>

### 2 - Abstraction

Def: The process of hiding implementation details from the end user

For example:
-  Creating functions for common code patterns
-  Creating groups of functions and variables for common, but more complex code patterns (these are called objects)

Benefits:
-  Ensures consistency across all procedures (don't try to copy-paste)
-  Allows changes to be made in many places at once by simply changing the abstraction instead of each procedure
-  Makes complex procedures easily testable
    
Example:
    
    data_1 = [3, 4, 1, 5]
    data_2 = [5, 1, 1, 8]
    data_3 = [2, 2, 2, 3]
    
    data_1_average = sum(data_1)/len(data_1)
    data_2_average = sum(data_2)/len(data_2)
    data_3_average = sum(data_3)/len(data_3)
    
becomes...

    def my_average(data):
        return sum(data)/len(data)
        
    data_1_average = my_average(data_1)
    data_2_average = my_average(data_2)
    data_3_average = my_average(data_3)
    
We no longer need to worry that all 3 average calculations are done the same way.
We can also simultaneously change all of them if we choose without changing our procedural code.
We can easily test that our my_average function works correctly by creating unit tests.
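
A minimal sketch of such unit tests, using Python's built-in unittest module. The test class name and the chosen test cases are assumptions; the suite is run explicitly so that it also works inside a notebook cell.

```python
import unittest


def my_average(data):
    return sum(data) / len(data)


class TestMyAverage(unittest.TestCase):
    def test_known_values(self):
        self.assertEqual(my_average([3, 4, 1, 5]), 3.25)
        self.assertEqual(my_average([2, 2, 2, 3]), 2.25)

    def test_empty_list_raises(self):
        # my_average divides by len(data), so an empty list raises an error
        with self.assertRaises(ZeroDivisionError):
            my_average([])


# load and run only the tests from TestMyAverage
suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestMyAverage)
result = unittest.TextTestRunner().run(suite)
```

The second test documents an edge case: if we later decide that an empty list should return 0 instead, we change my_average once and update this one test.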

Abstractions can be done for much more complex patterns.
<br>    

### 3 - Inheritance

Def: The passing of methods and attributes from one class to another. A child (or derived) class will inherit the properties of the parent (or base) class.

Example:

    class Person:
        def __init__(self, name):
            self.name = name

    class Teacher(Person):
        def __init__(self, name, subject):
            super().__init__(name)
            self.subject = subject
            
    class Student(Person):
        def __init__(self, name, grade_level):
            super().__init__(name)
            self.grade_level = grade_level
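
The sketch below extends the example to show what the child classes actually gain. The introduce method is a hypothetical addition: it is defined once on Person and is available on Teacher and Student without being rewritten.

```python
class Person:
    def __init__(self, name):
        self.name = name

    def introduce(self):
        # hypothetical method, defined once on the parent class
        return 'Hi, my name is ' + self.name


class Teacher(Person):
    def __init__(self, name, subject):
        super().__init__(name)  # reuse the parent's initialisation
        self.subject = subject


class Student(Person):
    def __init__(self, name, grade_level):
        super().__init__(name)
        self.grade_level = grade_level


teacher = Teacher('Kelly', 'math')
student = Student('Joe', 3)

print(teacher.introduce())          # Hi, my name is Kelly
print(student.introduce())          # Hi, my name is Joe
print(isinstance(teacher, Person))  # True
```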
           
<br>

### 4 - Polymorphism

Def: The ability to have a common interface for different data types. 

Example 1:

    >>> len([1, 2, 3, 4])
    4
    
    >>> len('hello!')
    6
    
The same function can be applied to different data types, and each type delivers its own behavior.
This works like overloading a function: under the hood, len() calls the object's `__len__` method, so each type supplies its own implementation.
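
Your own classes can take part in this too. As a minimal sketch, the hypothetical Playlist class below supports len() by defining the `__len__` method.

```python
class Playlist:
    """Hypothetical class used only to illustrate len() support."""

    def __init__(self, songs):
        self.songs = list(songs)

    def __len__(self):
        # len(playlist) calls this method
        return len(self.songs)


print(len(Playlist(['song a', 'song b', 'song c'])))  # 3
```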


Example 2:

    >>> 4 + 5
    9
    
    >>> 'abc' + 'xyz'
    'abcxyz'
    
This is an example of overloading an operator. If you wanted, you could extend the '+' operator to work with your classes as well.
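
A minimal sketch of extending '+' to a class by defining the `__add__` method. The Vector class is a hypothetical example, not something used elsewhere in this workshop.

```python
class Vector:
    """Hypothetical 2D vector used only to illustrate operator overloading."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __add__(self, other):
        # called for `self + other`
        if not isinstance(other, Vector):
            return NotImplemented
        return Vector(self.x + other.x, self.y + other.y)

    def __repr__(self):
        return f'Vector({self.x}, {self.y})'


print(Vector(1, 2) + Vector(3, 4))  # Vector(4, 6)
```

Returning NotImplemented for unsupported types lets Python raise a normal TypeError instead of failing inside the method.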


Example 3:

    class Dog():
        def __init__(self):
            pass
        
        def speak(self):
            print('woof')

    class Cat():
        def __init__(self):
            pass
            
        def speak(self):
            print('meow')

    >>> fido = Dog()
    >>> sprinkle = Cat()

    >>> for animal in [fido, sprinkle]:
    ...     animal.speak()
    woof
    meow

Here, Dog and Cat share the same interface, so the loop can call speak() without knowing which class each object belongs to.


Example 4:

    from sklearn.ensemble import RandomForestClassifier
    from sklearn.svm import SVC

    model_1 = RandomForestClassifier()
    model_2 = SVC()
    ...
    model_1.fit(feature_df, target_df)
    model_2.fit(feature_df, target_df)
    
Same interface for multiple ML models.
You can even create your own models that follow the sklearn API and use them interchangeably with sklearn's built-in models!
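
As a rough sketch of the idea, the two toy models below follow the same fit/predict convention without importing sklearn at all. Both classes are hypothetical; a model meant for real use with sklearn tools would typically also subclass sklearn's BaseEstimator, which is left out here to keep the example self-contained.

```python
from collections import Counter


class MajorityClassifier:
    """Hypothetical toy model: always predicts the most common training label."""

    def fit(self, X, y):
        # remember the most frequent label; return self, as sklearn estimators do
        self.majority_ = Counter(y).most_common(1)[0][0]
        return self

    def predict(self, X):
        return [self.majority_ for _ in X]


class ConstantClassifier:
    """Hypothetical toy model: always predicts a fixed label."""

    def __init__(self, label):
        self.label = label

    def fit(self, X, y):
        return self  # nothing to learn

    def predict(self, X):
        return [self.label for _ in X]


X = [[0], [1], [2], [3]]
y = ['a', 'b', 'b', 'b']

# the loop does not care which model it is given, only that fit and predict exist
for model in [MajorityClassifier(), ConstantClassifier('a')]:
    model.fit(X, y)
    print(type(model).__name__, model.predict([[5], [6]]))
```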
    
<br>

<br>

# Part 2 - Interactive coding challenge

## 2.1 LinkedList basics

Create a LinkedList class that contains the following methods:
1. insert_first - inserts a new node at the head position
2. print_list - prints list, starting at head

## But first... what is a linked list? - whiteboard example
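
A singly linked list is a chain of nodes in which each node stores a value and a reference to the next node; the list itself only needs to remember the first node (the head). As a rough stand-in for the whiteboard example, the sketch below links three nodes by hand. SketchNode is a hypothetical name, chosen so that it does not clash with the Node class you will write in the exercise.

```python
class SketchNode:
    def __init__(self, data, next_node=None):
        self.data = data            # the value stored in this node
        self.next_node = next_node  # reference to the next node, or None at the end


# build 1 -> 2 -> 3 by hand, starting from the tail
third = SketchNode(3)
second = SketchNode(2, third)
head = SketchNode(1, second)

# walk the chain from the head until we run out of nodes
values = []
current = head
while current is not None:
    values.append(current.data)
    current = current.next_node

print(values)  # [1, 2, 3]
```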

In [50]:
class Node():
    def __init__(self, data=None, next_node=None):
        pass

class LinkedList():
    def __init__(self, head=None):
        pass
    
    def insert_first(self, data):
        pass
    
    def print_list(self):
        pass

In [51]:
my_list = LinkedList()
my_list.insert_first(7)
my_list.insert_first(6)
my_list.insert_first(5)
my_list.insert_first(4)
my_list.insert_first(3)
my_list.insert_first(2)
my_list.insert_first(1)

In [52]:
my_list.print_list()

1
2
3
4
5
6
7


<br>

## 2.2 LinkedList challenge

Implement a method to find the "kth to last" element of a singly linked list.

Example:

    1->2->3->4->5
    1st to last = 5
    2nd to last = 4
    3rd to last = 3
    4th to last = 2
    5th to last = 1
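
Note that k is 1-based in this definition. On a plain Python list, the kth to last element is simply values[-k], which gives a convenient reference to check a linked-list implementation against. The helper below is a hypothetical checking aid under that assumption, not the linked-list solution itself.

```python
def kth_to_last_reference(values, k):
    """Hypothetical reference helper: kth to last element of a plain Python list."""
    if not 1 <= k <= len(values):
        raise IndexError('k must be between 1 and the length of the list')
    return values[-k]


values = [1, 2, 3, 4, 5]
for k in range(1, 6):
    print(k, kth_to_last_reference(values, k))
```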


In [65]:
class Node():
    def __init__(self, data=None, next_node=None):
        pass

class LinkedList():
    def __init__(self, head=None):
        pass
    
    def insert_first(self, data):
        pass
    
    def print_list(self):
        pass
            
    def kth_to_last(self, k=1):
        pass
        
    def get_length(self):
        pass

In [66]:
my_list = LinkedList()
my_list.insert_first(7)
my_list.insert_first(6)
my_list.insert_first(5)
my_list.insert_first(4)
my_list.insert_first(3)
my_list.insert_first(2)
my_list.insert_first(1)

In [67]:
my_list.kth_to_last(2).data

6