### Why Object-Oriented Programming?

Object-oriented programming has a few benefits over procedural programming, which is the programming style you most likely first learned. As you'll see in this lesson,

* object-oriented programming allows you to create large, modular programs that can easily expand over time;
* object-oriented programs hide the implementation from the end-user.
Consider Python packages like Scikit-learn, pandas, and NumPy. These are all Python packages built with object-oriented programming. Scikit-learn, for example, is a relatively large and complex package built with object-oriented programming. This package has expanded over the years with new functionality and new algorithms.

### What you will learn in this lesson
* First, we'll cover the fundamentals of object-oriented programming, including:
    * Procedural vs. object-oriented programming
    * Classes, objects, methods and attributes
    * Coding a class
    * Magic methods
    * Inheritance
* Then we'll use object-oriented programming to make a Python package. In this section, we will:
    * Make our own package
    * Take a tour of the source code inside some scikit-learn packages
    * Learn how to put a package on PyPi so that it can be easily installed

### Objects are defined by characteristics and actions

![image.png](attachment:0da63270-d4cf-45f9-a446-5b9e5cd72414.png)

### Object-Oriented Programming (OOP) Vocabulary
* __class__ - a blueprint consisting of methods and attributes
* __object__ - an instance of a class. It can help to think of objects as something in the real world like a yellow pencil, a small dog, a blue shirt, etc. However, as you'll see later in the lesson, objects can be more abstract.
* __attribute__ - a descriptor or characteristic. Examples would be color, length, size, etc. These attributes can take on specific values like blue, 3 inches, large, etc.
* __method__ - an action that a class or object could take
* __OOP__ - a commonly used abbreviation for object-oriented programming
* __encapsulation__ - one of the fundamental ideas behind object-oriented programming is called encapsulation: you can combine functions and data all into a single entity. In object-oriented programming, this single entity is called a class. Encapsulation allows you to hide implementation details much like how the scikit-learn package hides the implementation of machine learning algorithms.

    In English, you might hear an attribute described as a property, description, feature, quality, trait, or characteristic. All of these are saying the same thing.

Here is a reminder of how a class, object, attributes and methods relate to each other.

In [7]:
class Shirt:
    def __init__(self, shirt_color, shirt_size, shirt_style, shirt_price):
        self.color = shirt_color
        self.size = shirt_size
        self.style = shirt_style 
        self.price = shirt_price 

    def change_price(self, new_price):
        self.price = new_price

    def discount(self, discount):
        return self.price * (1-discount)


new_shirt = Shirt('red', 'S', 'short sleeve', 15)



In [8]:
print(new_shirt.color)
print(new_shirt.size)
print(new_shirt.style)
print(new_shirt.price)

red
S
short sleeve
15


In [11]:
new_shirt.change_price(10)
print(new_shirt.price)

10


In [13]:
new_shirt.discount(.30)


7.0

In [18]:
tshirt_collection = []

shirt_one = Shirt('red', 'S', 'short sleeve', 15)
shirt_two = Shirt('blue', 'M', 'short sleeve', 10)
shirt_three = Shirt('Green', 'S', 'short sleeve', 35)
shirt_four = Shirt('yellow', 'M', 'short sleeve', 20)

tshirt_collection.append(shirt_one)
tshirt_collection.append(shirt_two)
tshirt_collection.append(shirt_three)
tshirt_collection.append(shirt_four)

for i in range(len(tshirt_collection)):
    print(tshirt_collection[i].color)

red
blue
Green
yellow


In [19]:
class Shirt:
    def __init__(self, shirt_color, shirt_size, shirt_style, shirt_price):
        self.color = shirt_color
        self.size = shirt_size
        self.style = shirt_style 
        self.price = shirt_price 

    def change_price(self, new_price):
        self.price = new_price

    def discount(self, discount):
        return self.price * (1-discount)


new_shirt = Shirt('red', 'S', 'short sleeve', 15)

In [21]:

shirt_one = Shirt('red', 'S', 'long-sleeve', 25)

print(shirt_one.price)
shirt_one.change_price(10)
print(shirt_one.price)
print(shirt_one.discount(.12))

shirt_two = Shirt('orange', 'L', 'short-sleeve', 10)

total = shirt_one.price + shirt_two.price

total_discount =  shirt_one.discount(.14) + shirt_two.discount(.06) 
print(round(total_discount))

25
10
8.8
18


# OOP Syntax Exercise - Part 2

Now that you've had some practice instantiating objects, it's time to write your own class from scratch. This lesson has two parts. In the first part, you'll write a Pants class. This class is similar to the shirt class with a couple of changes. Then you'll practice instantiating Pants objects

In the second part, you'll write another class called SalesPerson. You'll also instantiate objects for the SalesPerson.

For this exercise, you can do all of your work in this Jupyter notebook. You will not need to import the class because all of your code will be in this Jupyter notebook.

Answers are also provided. If you click on the Jupyter icon, you can open a folder called 2.OOP_syntax_pants_practice, which contains this Jupyter notebook ('exercise.ipynb') and a file called answer.py.

# Pants class

Write a Pants class with the following characteristics:
* the class name should be Pants
* the class attributes should include
 * color
 * waist_size
 * length
 * price
* the class should have an init function that initializes all of the attributes
* the class should have two methods
 * change_price() a method to change the price attribute
 * discount() to calculate a discount

In [24]:
class Pants:
    def __init__(self, color, waist_size, length, price):
        self.color = color 
        self.waist_size = waist_size
        self.length = length 
        self.price = price 
    def change_price(self, new_price):
        self.price = new_price

    def discount(self, discount):
        return self.price * (1 - discount)

    

In [25]:
def check_results():
    pants = Pants('red', 35, 36, 15.12)
    assert pants.color == 'red'
    assert pants.waist_size == 35
    assert pants.length == 36
    assert pants.price == 15.12
    
    pants.change_price(10) == 10
    assert pants.price == 10 
    
    assert pants.discount(.1) == 9
    
    print('You made it to the end of the check. Nice job!')

check_results()

You made it to the end of the check. Nice job!


# SalesPerson class

The Pants class and Shirt class are quite similar. Here is an exercise to give you more practice writing a class. **This exercise is trickier than the previous exercises.**

Write a SalesPerson class with the following characteristics:
* the class name should be SalesPerson
* the class attributes should include
 * first_name 
 * last_name
 * employee_id
 * salary
 * pants_sold
 * total_sales
* the class should have an init function that initializes all of the attributes
* the class should have four methods
 * sell_pants() a method to change the price attribute
 * calculate_sales() a method to calculate the sales
 * display_sales() a method to print out all the pants sold with nice formatting
 * calculate_commission() a method to calculate the salesperson commission based on total sales and a percentage

In [42]:
class SalesPerson:
    def __init__(self, first_name, last_name, employee_id, salary):
        self.first_name = first_name
        self.last_name = last_name
        self.employee_id = employee_id
        self.salary = salary
        self.pants_sold = []
        self.total_sales = 0
        
    def sell_pants(self, pants_object):
        self.pants_sold.append(pants_object)
        
    def calculate_sales(self):
        total =0
        for pants in self.pants_sold:
            total += pants.price 
        self.total_sales = total 
        return total
            

    def display_sales(self):
        for pants in self.pants_sold:
            print(f"color: {pants.color}, waist_size: {pants.waist_size}, lenght: {pants.length}, price: {pants.price}")

    def calculate_commission(self, percentage ):
        sales_total = self.calculate_sales()
        return sales_total * percentage
        
        
    

pants_one = Pants('red', 35, 36, 15.12)
pants_two = Pants('blue', 40, 38, 24.12)
pants_three = Pants('tan', 28, 30, 8.12)

salesperson = SalesPerson('Amy', 'Gonzalez', 2581923, 40000)

salesperson.sell_pants(pants_one)    
salesperson.sell_pants(pants_two)
salesperson.sell_pants(pants_three)

salesperson.display_sales()

In [44]:
def check_results():
    pants_one = Pants('red', 35, 36, 15.12)
    pants_two = Pants('blue', 40, 38, 24.12)
    pants_three = Pants('tan', 28, 30, 8.12)
    
    salesperson = SalesPerson('Amy', 'Gonzalez', 2581923, 40000)
    
    assert salesperson.first_name == 'Amy'
    assert salesperson.last_name == 'Gonzalez'
    assert salesperson.employee_id == 2581923
    assert salesperson.salary == 40000
    assert salesperson.pants_sold == []
    assert salesperson.total_sales == 0
    
    salesperson.sell_pants(pants_one)
    salesperson.pants_sold[0] == pants_one.color
    
    salesperson.sell_pants(pants_two)
    salesperson.sell_pants(pants_three)
    
    assert len(salesperson.pants_sold) == 3
    assert round(salesperson.calculate_sales(),2) == 47.36
    assert round(salesperson.calculate_commission(.1),2) == 4.74
    
    print('Great job, you made it to the end of the code checks!')
    
check_results()

Great job, you made it to the end of the code checks!


__This lesson is going to introduce the idea of packages through a Gaussian Distribution example.__


Include in this Python package will be :

* Read in dataset
* Calculate mean
* Calculate standard deviation
* Plot histogram
* Plot probability density function
* add two gaussian distributions


In [4]:
import math 
import matplotlib.pyplot as plt 

class Gaussian():

    def __init__(self, mu=0, sigma=1):

        self.mean = mu
        self.stdev = sigma
        self.data = []

    def calculate_mean(self):

        avg = 1.0 * sum(self.data)/len(self.data)
        self.mean = avg
        return self.mean 

    def calculate_stdev(self, sample=True):
        if sample:
            n = len(self.data)-1
        else:
            n = len(self.data)
        mean = self.mean
        sigma = 0
        for d in self.data:
            sigma += (d-mean) ** 2
        sigma = math.sqrt(sigma/n)
        self.stdev = sigma 
        
        return self.stdev

    def read_data_file(self, file_name, sample=True):
        with open(file_name) as file:
            data_list = []
            line = file.readline()
            while line:
                data_list.append(int(line))
                line = file.readline()
        file.close()

        self.data = data_list
        self.mean = self.calculate_mean()
        self.stdev = self.calculate_stdev(sample)

    def plot_histogram(self):
        plt.hist(self.data)
        plt.title("Histogram Data")
        plt.xlabel("Data")
        plt.ylabel("count")

    def pdf(self, x):

        return (1.0 / (self.stdev * math.sqrt(2*math.pi))) * math.exp(-0.5*((x-self.mean)/self.stdev)**2)
        
    def plot_histogram_pdf(self, n_spaces = 50):

        """Method to plot the normalized histogram of the data and a plot of the 
        probability density function along the same range
        
        Args:
            n_spaces (int): number of data points 
        
        Returns:
            list: x values for the pdf plot
            list: y values for the pdf plot
            
        """
        
        #TODO: Nothing to do for this method. Try it out and see how it works.
        
        mu = self.mean
        sigma = self.stdev

        min_range = min(self.data)
        max_range = max(self.data)
        
         # calculates the interval between x values
        interval = 1.0 * (max_range - min_range) / n_spaces

        x = []
        y = []
        
        # calculate the x values to visualize
        for i in range(n_spaces):
            tmp = min_range + interval*i
            x.append(tmp)
            y.append(self.pdf(tmp))

        # make the plots
        fig, axes = plt.subplots(2,sharex=True)
        fig.subplots_adjust(hspace=.5)
        axes[0].hist(self.data, density=True)
        axes[0].set_title('Normed Histogram of Data')
        axes[0].set_ylabel('Density')

        axes[1].plot(x, y)
        axes[1].set_title('Normal Distribution for \n Sample Mean and Sample Standard Deviation')
        axes[0].set_ylabel('Density')
        plt.show()

        return x, y
        

    
        

In [5]:
# Unit tests to check your solution

import unittest

class TestGaussianClass(unittest.TestCase):
    def setUp(self):
        self.gaussian = Gaussian(25, 2)

    def test_initialization(self): 
        self.assertEqual(self.gaussian.mean, 25, 'incorrect mean')
        self.assertEqual(self.gaussian.stdev, 2, 'incorrect standard deviation')

    def test_pdf(self):
        self.assertEqual(round(self.gaussian.pdf(25), 5), 0.19947,\
         'pdf function does not give expected result') 

    def test_meancalculation(self):
        self.gaussian.read_data_file('files/numbers.txt', True)
        self.assertEqual(self.gaussian.calculate_mean(),\
         sum(self.gaussian.data) / float(len(self.gaussian.data)), 'calculated mean not as expected')

    def test_stdevcalculation(self):
        self.gaussian.read_data_file('files/numbers.txt', True)
        self.assertEqual(round(self.gaussian.stdev, 2), 92.87, 'sample standard deviation incorrect')
        self.gaussian.read_data_file('files/numbers.txt', False)
        self.assertEqual(round(self.gaussian.stdev, 2), 88.55, 'population standard deviation incorrect')
                
tests = TestGaussianClass()

tests_loaded = unittest.TestLoader().loadTestsFromModule(tests)

unittest.TextTestRunner().run(tests_loaded)

....
----------------------------------------------------------------------
Ran 4 tests in 0.005s

OK


<unittest.runner.TextTestResult run=4 errors=0 failures=0>

### Magic Methods

Below you'll find the same code from the previous exercise except two more methods have been added: an __add__ method and a __repr__ method. Your task is to fill out the code and get all of the unit tests to pass. You'll find the code cell with the unit tests at the bottom of this Jupyter notebook.

As in previous exercises, there is an answer key that you can look at if you get stuck. Click on the "Jupyter" icon at the top of this notebook, and open the folder 4.OOP_code_magic_methods. You'll find the answer.py file inside the folder.

In [4]:
import math 
import matplotlib.pyplot as plt

class Gaussian():
    """Gaussian distribution class for calculating and visualizing a 
    gaussian distribution.
    
    Attributes:
    mean(float) : Representing the mean value of the distribution
    stdev(float) : Representing the standard deviation of the distribution
    data_list : list of floats extracted from the data file"""

    def __init__(self, mu=0, sigma=1):
        self.mean = mu
        self.stdev = sigma
        self.data = []
    
    
    def clculate_mean(self):
        self.mean = sum(self.data)/len(self.data)
        return self.mean


    def calculate_stdev(self, sample=True):
        """method to calculate the standard deviation of the dataset
        Args : sample(bool) : whether the data represents a sample or population
        Returns : floats : standard deviation of the data set"""
        
        n = len(self.data)
        if sample :
            n = n-1
        mean = self.mean
        variance = sum((x - mean)**2 for x in self.data)/(n)
        self.stdev = math.sqrt(variance)
        return self.stdev

    def read_data_files(self, file_name, sample=True):
        """Method to read in data from txt file. The txt file should have one number float per line, the mean and standard deviation are calculated
        Args : file_name : name of a file read from 
        Returns : None """
        with open(file_name) as file:
            data_list = []
            for line in file:
                data_list.append(int(line.strip()))
        self.data = data_list
        self.mean = self.calculate_mean()
        self.stdev = self.calculate_stdev()

    def plot_histogram(self):
        """Method to output a histogram of the instance variable data"""
        plt.hist(self.data)
        plt.title('Histogram of Data')
        plt.xlabel('Data')
        plt.ylabel('Count')
        
    def pdf(self, x):
        """Probability density function calculator for the gaussian distribution."""
        return (1.0 / (self.stdev * math.sqrt(2*math.pi))) * math.exp(-0.5*((x - self.mean) / self.stdev) ** 2)


    def plot_histogram_pdf(self, n_spaces = 50):

        """Function to plot the normalized histogram of the data and a plot of the 
        probability density function along the same range
        
        Args:
            n_spaces (int): number of data points 
        
        Returns:
            list: x values for the pdf plot
            list: y values for the pdf plot
        """
        
        mu = self.mean
        sigma = self.stdev

        min_range = min(self.data)
        max_range = max(self.data)
        
         # calculates the interval between x values
        interval = 1.0 * (max_range - min_range) / n_spaces

        x = []
        y = []
        
        # calculate the x values to visualize
        for i in range(n_spaces):
            tmp = min_range + interval*i
            x.append(tmp)
            y.append(self.pdf(tmp))

        # make the plots
        fig, axes = plt.subplots(2,sharex=True)
        fig.subplots_adjust(hspace=.5)
        axes[0].hist(self.data, density=True)
        axes[0].set_title('Normed Histogram of Data')
        axes[0].set_ylabel('Density')

        axes[1].plot(x, y)
        axes[1].set_title('Normal Distribution for \n Sample Mean and Sample Standard Deviation')
        axes[0].set_ylabel('Density')
        plt.show()

        return x, y
    
    
        
    

In [5]:
# Unit tests to check your solution

import unittest

class TestGaussianClass(unittest.TestCase):
    def setUp(self):
        self.gaussian = Gaussian(25, 2)

    def test_initialization(self): 
        self.assertEqual(self.gaussian.mean, 25, 'incorrect mean')
        self.assertEqual(self.gaussian.stdev, 2, 'incorrect standard deviation')

    def test_pdf(self):
        self.assertEqual(round(self.gaussian.pdf(25), 5), 0.19947,\
         'pdf function does not give expected result') 

    def test_meancalculation(self):
        self.gaussian.read_data_file('numbers.txt', True)
        self.assertEqual(self.gaussian.calculate_mean(),\
         sum(self.gaussian.data) / float(len(self.gaussian.data)), 'calculated mean not as expected')

    def test_stdevcalculation(self):
        self.gaussian.read_data_file('numbers.txt', True)
        self.assertEqual(round(self.gaussian.stdev, 2), 92.87, 'sample standard deviation incorrect')
        self.gaussian.read_data_file('numbers.txt', False)
        self.assertEqual(round(self.gaussian.stdev, 2), 88.55, 'population standard deviation incorrect')

    def test_add(self):
        gaussian_one = Gaussian(25, 3)
        gaussian_two = Gaussian(30, 4)
        gaussian_sum = gaussian_one + gaussian_two
        
        self.assertEqual(gaussian_sum.mean, 55)
        self.assertEqual(gaussian_sum.stdev, 5)

    def test_repr(self):
        gaussian_one = Gaussian(25, 3)
        
        self.assertEqual(str(gaussian_one), "mean 25, standard deviation 3")
        
tests = TestGaussianClass()

tests_loaded = unittest.TestLoader().loadTestsFromModule(tests)

unittest.TextTestRunner().run(tests_loaded)

E.E.FE
ERROR: test_add (__main__.TestGaussianClass)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/var/folders/k9/xlvqj9kj2mjg567zvbhp94l00000gp/T/ipykernel_1292/1165813698.py", line 31, in test_add
    gaussian_sum = gaussian_one + gaussian_two
TypeError: unsupported operand type(s) for +: 'Gaussian' and 'Gaussian'

ERROR: test_meancalculation (__main__.TestGaussianClass)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/var/folders/k9/xlvqj9kj2mjg567zvbhp94l00000gp/T/ipykernel_1292/1165813698.py", line 18, in test_meancalculation
    self.gaussian.read_data_file('numbers.txt', True)
AttributeError: 'Gaussian' object has no attribute 'read_data_file'

ERROR: test_stdevcalculation (__main__.TestGaussianClass)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/var/folders/k9/xlvqj9kj2mjg567zvb

<unittest.runner.TextTestResult run=6 errors=3 failures=1>