# Object-Oriented Programming
1. Syntax
2. Build a Python Package

## Syntax
1. procedural vs object-oriented programming
2. classes, objects, methods and attributes
3. coding a class
4. magic methods
5. inheritance

## Build a Python Package
1. making a package
2. tour of scikit-learn source code
3. putting your package on PyPI

### Overview: Why Object-Oriented Programming?
Object-oriented programming has a few benefits over procedural programming, which is the programming style you most likely first learned. As you'll see in this lesson,
* OOP allows you to create large, modular programs that can easily expand over time
* OOP hides the implementation from the end user

Consider Python packages like [Scikit-learn](https://github.com/scikit-learn/scikit-learn), [pandas](https://pandas.pydata.org/), and [NumPy](http://www.numpy.org/). All three are built with object-oriented programming. Scikit-learn, for example, is a relatively large and complex package that has expanded over the years with new functionality and new algorithms.

When you train a machine learning algorithm with Scikit-learn, you don't have to know anything about how the algorithms work or how they were coded. You can focus directly on the modeling.

Here's an example taken from the [Scikit-learn website](http://scikit-learn.org/stable/modules/svm.html):
```python
from sklearn import svm
X = [[0, 0], [1, 1]]
y = [0, 1]
clf = svm.SVC()
clf.fit(X, y)
```

How does Scikit-learn train the SVM model? You don't need to know because the implementation is hidden with object-oriented programming. If the implementation changes, you as a user of Scikit-learn might not ever find out. Whether or not you SHOULD understand how SVM works is a different question.

In this lesson, you'll practice the fundamentals of object-oriented programming. By the end of the lesson, you'll have built a Python package using object-oriented programming.

This lesson uses classroom workspaces that contain all of the files and functionality you will need. You can also find the files in the [data scientist nanodegree term 2 GitHub repo](https://github.com/udacity/DSND_Term2/tree/master/lessons/ObjectOrientedProgramming) or [my repo here](https://github.com/ChristopherDaigle/udacity_nano_ds/tree/master/udacity_dsnd_two/lessons/ObjectOrientedProgramming).

## Syntax: Procedural vs Object-Oriented Programming
* Procedural programming organizes a program as a sequence of steps (functions) that run in order and operate on separate data
* Object-oriented programming organizes a program around objects, which bundle data (attributes) with the functions that act on that data (methods)
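
The difference is easier to see side by side. The sketch below computes the same discounted price in both styles; the function and class names are made up for illustration and are not part of the lesson's files.

```python
# Procedural style: data lives in plain variables and functions act on it
def discounted_price(price, discount):
    """Return the price after applying a fractional discount."""
    return price * (1 - discount)


shirt_price = 20
procedural_result = discounted_price(shirt_price, 0.5)


# Object-oriented style: the data and the function travel together
class Shirt:
    def __init__(self, price):
        self.price = price

    def discounted_price(self, discount):
        """Return this shirt's price after a fractional discount."""
        return self.price * (1 - discount)


shirt = Shirt(20)
oop_result = shirt.discounted_price(0.5)

print(procedural_result, oop_result)  # 10.0 10.0
```

Both versions give the same answer. What changes is where the price lives: in a loose variable, or inside the object that owns it.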

**Objects are defined by characteristics and actions:**
Example:<br>
* Sales Person as Object:

**Characteristics (Attributes)**
> * Name
> * Address
> * Phone Number
> * Hourly Pay

**Actions (Methods)**
> * Sell item
> * Take item

* Shirt as Object:

**Attributes**
> * Color
> * Size
> * Style
> * Price

**Method**
> * Change Price

**Characteristics and Actions in English Grammar**<br>
Another way to think about characteristics and actions is in terms of English grammar:
> * a **characteristic** would be a noun
> * an **action** would be a verb

Let's pick something from the real-world: a dog. A few characteristics could be the dog's weight, color, breed, and height. These are all nouns. What actions would a dog take? A dog can bark, run, bite and eat. These are all verbs.
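
As a rough sketch, the dog could be written as a class where the nouns become attributes and the verbs become methods. The `Dog` class and its values are hypothetical, chosen only to mirror the paragraph above.

```python
class Dog:
    """Hypothetical class: nouns become attributes, verbs become methods."""

    def __init__(self, weight, color, breed, height):
        # Characteristics (nouns) are stored as attributes
        self.weight = weight
        self.color = color
        self.breed = breed
        self.height = height

    # Actions (verbs) are written as methods
    def bark(self):
        return "Woof!"

    def eat(self, food_weight):
        # An action can change a characteristic: eating makes the dog heavier
        self.weight += food_weight
        return self.weight


rex = Dog(weight=30, color="brown", breed="labrador", height=60)
print(rex.bark())  # Woof!
print(rex.eat(1))  # 31
```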

## Syntax: Vocabulary
* class - a blueprint consisting of methods and attributes
* object - an instance of a class. It can help to think of objects as something in the real world like a yellow pencil, a small dog, a blue shirt, etc. However, as you'll see later in the lesson, objects can be more abstract.
* attribute - a descriptor or characteristic. Examples would be color, length, size, etc. These attributes can take on specific values like blue, 3 inches, large, etc.
* method - an action that a class or object could take
* OOP - a commonly used abbreviation for object-oriented programming
* encapsulation - one of the fundamental ideas behind object-oriented programming: you can combine functions and data into a single entity. In object-oriented programming, this single entity is called a class. Encapsulation allows you to hide implementation details, much like how the Scikit-learn package hides the implementation of machine learning algorithms.

In English, you might hear an attribute described as a property, description, feature, quality, trait, or characteristic. All of these are saying the same thing.

Defining a shirt:
```python
class Shirt:

    def __init__(self, shirt_color, shirt_size, shirt_style, shirt_price):
        self.color = shirt_color
        self.size = shirt_size
        self.style = shirt_style
        self.price = shirt_price
    
    def change_price(self, new_price):
    
        self.price = new_price
        
    def discount(self, discount):

        return self.price * (1 - discount)
```
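
A short usage sketch may help connect the vocabulary to the code. It repeats the `Shirt` class so it can run on its own; the colors, sizes, and prices are arbitrary example values.

```python
class Shirt:
    def __init__(self, shirt_color, shirt_size, shirt_style, shirt_price):
        self.color = shirt_color
        self.size = shirt_size
        self.style = shirt_style
        self.price = shirt_price

    def change_price(self, new_price):
        self.price = new_price

    def discount(self, discount):
        return self.price * (1 - discount)


# Each call to Shirt(...) creates a separate object with its own attribute values
shirt_one = Shirt('red', 'S', 'short-sleeve', 15)
shirt_two = Shirt('yellow', 'M', 'long-sleeve', 20)

shirt_one.change_price(10)      # only shirt_one is affected
print(shirt_one.price)          # 10
print(shirt_two.price)          # 20
print(shirt_two.discount(0.5))  # 10.0
```

Here `Shirt` is the class (the blueprint), `shirt_one` and `shirt_two` are objects, `price` is an attribute, and `change_price` is a method.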

### Set and Get methods

The Shirt class has a method to change the price of the shirt: `shirt_one.change_price(20)`. In Python, you can also change the value of an attribute directly with the following syntax:
```python
shirt_one.price = 10
shirt_one.price = 20
shirt_one.color = 'red'
shirt_one.size = 'M'
shirt_one.style = 'long_sleeve'
```

This code accesses and changes the price, color, size and style attributes directly. *Accessing attributes directly would be frowned upon in many other languages* **but not in Python.** In those languages, the *general object-oriented programming convention is to use methods to access attributes or change attribute values.* These methods are called set and get methods or setter and getter methods.

A **get method** is for obtaining an attribute value. A **set method** is for changing an attribute value. If you were writing a Shirt class, the code could look like this:
```python
class Shirt:

    def __init__(self, shirt_color, shirt_size, shirt_style, shirt_price):
        self._price = shirt_price

    def get_price(self):
        return self._price

    def set_price(self, new_price):
        self._price = new_price
```
Instantiating and using an object might look like:
```python
shirt_one = Shirt('yellow', 'M', 'long-sleeve', 15)
print(shirt_one.get_price())
shirt_one.set_price(10)
```

### Set and Get Methods: Explained
In the class definition,
```python
def __init__(self, shirt_color, shirt_size, shirt_style, shirt_price):
    self._price = shirt_price
```

the underscore in front of price (i.e. `self._price = shirt_price`) is a somewhat controversial Python convention. In other languages like C++ or Java, price could be explicitly labeled as a private variable. This would *prohibit code outside the class from accessing the price attribute directly*, as in `shirt_one._price = 15`.

However, Python does not distinguish between private and public variables like other languages. Therefore, there is some controversy about using the underscore convention as well as get and set methods in Python. Why use get and set methods in Python when Python wasn't designed to use them?

At the same time, you'll find that some Python programmers develop object-oriented programs using get and set methods anyway. Following the Python convention, the underscore in front of price is to let a programmer know that price should only be accessed with get and set methods rather than accessing price directly with `shirt_one._price`. However, a programmer could still access `_price` directly because there is nothing in the Python language to prevent the direct access.

**To reiterate**, a programmer could technically still do something like `shirt_one._price = 10`, and the code would work. But *accessing price directly, in this case, would not be following the intent of how the Shirt class was designed.*
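
The sketch below shows both routes on a trimmed-down `Shirt` that takes only a price (a simplification for illustration). Nothing stops the direct assignment; the underscore is only a signal to other programmers.

```python
class Shirt:
    def __init__(self, shirt_price):
        self._price = shirt_price  # leading underscore: "please use the methods"

    def get_price(self):
        return self._price

    def set_price(self, new_price):
        self._price = new_price


shirt_one = Shirt(15)

shirt_one.set_price(10)             # the intended route
via_method = shirt_one.get_price()

shirt_one._price = 12               # runs without error, but sidesteps the design
via_direct = shirt_one.get_price()

print(via_method, via_direct)       # 10 12
```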

One of the benefits of set and get methods is that, as previously mentioned in the course, you can hide the implementation from your user. Maybe originally a variable was coded as a list and later became a dictionary. With set and get methods, you could easily change how that variable gets accessed. Without set and get methods, you'd have to go to every place in the code that accessed the variable directly and change the code.
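
Here is one way the list-to-dictionary scenario could play out. The two classes are hypothetical stand-ins for an "old" and a "new" version of the same class; the point is that the calling code at the bottom is identical for both.

```python
class SizeStockV1:
    """Hypothetical first version: sizes in stock are stored in a list."""

    def __init__(self):
        self._sizes = ['S', 'M', 'M', 'L']

    def get_stock(self, size):
        return self._sizes.count(size)


class SizeStockV2:
    """Hypothetical later version: same method, now backed by a dictionary."""

    def __init__(self):
        self._sizes = {'S': 1, 'M': 2, 'L': 1}

    def get_stock(self, size):
        return self._sizes.get(size, 0)


# Code that only calls get_stock() does not care which version it receives
for stock in (SizeStockV1(), SizeStockV2()):
    print(stock.get_stock('M'))  # 2 both times
```

Code that had reached into `_sizes` directly would have broken when the list became a dictionary.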

You can read more about get and set methods in Python on this [Python Tutorial site](https://www.python-course.eu/python3_properties.php).

### Note about Attributes
There are some drawbacks to accessing attributes directly versus writing a method for accessing attributes.

In terms of object-oriented programming, the rules in Python are a bit looser than in other programming languages. As previously mentioned, in some languages, like C++, you can explicitly state whether or not an object should be allowed to change or access an attribute's values directly. Python does not have this option.

Why might it be better to change a value with a method instead of directly? Changing values via a method gives you more flexibility in the long-term. What if the units of measurement change, like the store was originally meant to work in US dollars and now has to handle Euros?

---
**Example: Dollars versus Euros**
If you've changed attribute values directly, you'll have to go through your code and find all the places where US dollars were used, like:
```python
shirt_one.price = 10 # US dollars
```
and then manually change to Euros:
```python
shirt_one.price = 8 # Euros
```
---
If you had used a method, then you would only have to change the method to convert from dollars to Euros:

---
```python
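class Shirt:
    def __init__(self, shirt_color, shirt_size, shirt_style, shirt_price):
        self.price = shirt_price

    def change_price(self, new_price):
        # self.price = new_price  # USD
        self.price = new_price * 0.81  # Euros

shirt_one = Shirt('yellow', 'M', 'long-sleeve', 15)
shirt_one.change_price(10)
print(round(shirt_one.price, 2))  # round to cents to hide floating-point noise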
```
**OUTPUT**<br>
`8.1`

---

### Modularized Code
If you were developing a software program, you would want to modularize this code.

You would put the Shirt class into its own Python script called, say, `shirt.py`. And then in another Python script, you would import the Shirt class with a line like: `from shirt import Shirt`

## Commenting Object-Oriented Code
A docstring is a type of comment that describes how a Python module, function, class or method works. Docstrings, therefore, are not unique to object-oriented programming. This section of the course is merely reminding you to use docstrings and to comment your code. It's not just going to help you understand and maintain your code. It will also make you a better job candidate.

Use both in-line comments and document level comments as appropriate.

Check out [this link](http://sphinxcontrib-napoleon.readthedocs.io/en/latest/example_google.html) to read more about docstrings.

---
### Docstrings and Object-Oriented Code
Below is an example of a class with docstrings and a few things to keep in mind:
* Make sure to indent your docstrings correctly or the code will not run. A docstring should be indented one indentation underneath the class or method being described.
* You don't have to define 'self' in your method docstrings. It's understood that any method will have self as the first method input.

```python
class Pants:
    """The Pants class represents an article of clothing sold in a store
    """

    def __init__(self, color, waist_size, length, price):
        """Method for initializing a Pants object

        Args: 
            color (str)
            waist_size (int)
            length (int)
            price (float)

        Attributes:
            color (str): color of a pants object
            waist_size (int): waist size of a pants object
            length (int): length of a pants object
            price (float): price of a pants object
        """

        self.color = color
        self.waist_size = waist_size
        self.length = length
        self.price = price

    def change_price(self, new_price):
        """The change_price method changes the price attribute of a pants object

        Args: 
            new_price (float): the new price of the pants object

        Returns: None

        """
        self.price = new_price

    def discount(self, percentage):
        """The discount method outputs a discounted price of a pants object

        Args:
            percentage (float): a decimal representing the amount to discount

        Returns:
            float: the discounted price
        """
        return self.price * (1 - percentage)
```
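
Docstrings are more than comments: Python stores them on the object, so tools and other programmers can read them at runtime. The sketch below uses a shortened `Pants` class (only `price` is kept, for brevity) and the standard library's `inspect.getdoc`.

```python
import inspect


class Pants:
    """The Pants class represents an article of clothing sold in a store."""

    def __init__(self, price):
        self.price = price

    def discount(self, percentage):
        """Return the discounted price of a pants object.

        Args:
            percentage (float): a decimal representing the amount to discount

        Returns:
            float: the discounted price
        """
        return self.price * (1 - percentage)


# The raw docstring is stored in the __doc__ attribute
print(Pants.__doc__)

# inspect.getdoc returns the same text with the indentation cleaned up
print(inspect.getdoc(Pants.discount))
```

Calling `help(Pants)` in an interactive session displays the same docstrings.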

# Creating a Python package - Analyze a Gaussian Distribution
- Read dataset
- Calculate mean
- Calculate standard deviation
- Plot histogram
- Plot probability density function (PDF)

<br><br>
* **Gaussian Distribution PDF**:
$$f\left(x\ \middle| \ \mu , \ \sigma^{2} \right)=\frac{1}{\sqrt{2\pi\sigma^{2}}}e^{\frac{-\left(x-\mu\right)^{2}}{2\sigma^{2}}}$$<br>

* **Binomial Distribution**:
The binomial distribution is used when there are exactly two mutually exclusive outcomes of a trial. These outcomes are appropriately labeled "success" and "failure". The binomial distribution is used to obtain the probability of observing $k$ successes in $n$ trials, with the probability of success on a single trial denoted by $p$. The binomial distribution assumes that $p$ is fixed for all trials

$$f\left(k,\ n,\ p\right) = \frac{n!}{k!\left(n-k\right)!}p^{k}\left(1-p\right)^{\left(n-k\right)}$$

where $p$ is the probability of success on a single trial, $n$ is the number of trials, and $k$ is the number of successes of interest.

> * mean: $\mu = n\cdot p$<br>
> * variance: $\sigma^{2}=n\cdot p \cdot \left(1-p\right)$
> * standard deviation: $\sqrt{n\cdot p \cdot \left(1-p\right)}$<br>

> Example:<br>
Let $k=2$, $n=2$, and $p=0.5$
$$f\left(k=2,\ n=2,\ p=0.5\right)=\frac{2!}{2!\left(2-2\right)!}\left(0.5\right)^{2}\left(1-0.5\right)^{\left(2-2\right)}=\frac{2}{2}\left(0.25\right)\left(1\right)=0.25$$
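
Both formulas translate almost directly into Python. The functions below are a minimal sketch for checking values by hand; their names are chosen for illustration, and `math.comb` requires Python 3.8 or later.

```python
import math


def gaussian_pdf(x, mu, sigma):
    """Gaussian probability density at x for mean mu and standard deviation sigma."""
    return (1 / math.sqrt(2 * math.pi * sigma ** 2)) * math.exp(
        -((x - mu) ** 2) / (2 * sigma ** 2)
    )


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n trials with success probability p."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)


def binomial_mean(n, p):
    return n * p


def binomial_stdev(n, p):
    return math.sqrt(n * p * (1 - p))


print(binomial_pmf(2, 2, 0.5))  # 0.25, matching the worked example
print(binomial_mean(2, 0.5))    # 1.0
print(binomial_stdev(2, 0.5))   # square root of 0.5
print(gaussian_pdf(0, 0, 1))    # peak of the standard normal, 1 / sqrt(2 * pi)
```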

## ASIDE: Further Resources
If you would like to review the Gaussian (normal) distribution and binomial distribution, here are a few resources:

This free Udacity course, [Intro to Statistics](https://www.udacity.com/course/intro-to-statistics--st101), has a lesson on Gaussian distributions as well as the Binomial distribution.

This free course, [Intro to Descriptive Statistics](https://www.udacity.com/course/intro-to-descriptive-statistics--ud827), also has a Gaussian distributions lesson.

**Here are the wikipedia articles:**
* [Gaussian Distributions Wikipedia](https://en.wikipedia.org/wiki/Normal_distribution)
* [Binomial Distributions Wikipedia](https://en.wikipedia.org/wiki/Binomial_distribution)

## Gaussian PDF Code:
```python
import math
import matplotlib.pyplot as plt

class Gaussian():
    """
    Gaussian distribution class for calculating and 
    visualizing a Gaussian distribution.
    Attributes:
        mean (float) representing the mean value of the distribution
        stdev (float) representing the standard deviation of the distribution
        data (list of floats) a list of floats extracted from the data file
    """
    def __init__(self, mu = 0, sigma = 1):
        
        self.mean = mu
        self.stdev = sigma
        self.data = []


    def calculate_mean(self):
        """
        Method to calculate the mean of the data set.
        Args: 
            None
        
        Returns: 
            float: mean of the data set    
        """
        self.mean = sum(self.data) / len(self.data)
        
        return self.mean


    def calculate_stdev(self, sample=True):
        """
        Method to calculate the standard deviation of the data set.
        Args: 
            sample (bool): whether the data represents a sample or population
        Returns: 
            float: standard deviation of the data set    
        """
        mu = self.mean
        diff = [(x - mu) ** 2 for x in self.data]
        quantity = sum(diff)
        
        n = len(self.data)
        if sample:
            n -= 1
        
        variance = quantity / n
        
        self.stdev = math.sqrt(variance)
        
        return self.stdev


    def read_data_file(self, file_name, sample=True):
        """
        Method to read in data from a txt file. The txt file should have
        one number (float) per line. The numbers are stored in the data attribute. 
        After reading in the file, the mean and standard deviation are calculated
        Args:
            file_name (string): name of a file to read from
            sample (bool): whether the data represents a sample or population
        Returns:
            None
        """
        
        # Read one number per line; the with block closes the file automatically
        with open(file_name) as file:
            data_list = [float(line) for line in file]
        
        self.data = data_list
        self.mean = self.calculate_mean()
        self.stdev = self.calculate_stdev(sample)


    def plot_histogram(self):
        """
        Method to output a histogram of the instance variable data using 
        matplotlib pyplot library.
        Args:
            None
        Returns:
            None
        """
        x_label = 'Data'
        y_label = 'Frequency'
        plt.hist(self.data)
        plt.title('Histogram of Data')
        plt.xlabel(x_label)
        plt.ylabel(y_label)


    def pdf(self, x):
        """
        Probability density function calculator for the gaussian distribution.
        Args:
            x (float): point for calculating the probability density function
        Returns:
            float: probability density function output
        """
        # Compute the normalizing fraction and the exponent separately, then combine
        
        denom_quantity = 2 * math.pi * self.stdev ** 2
        denom = math.sqrt(denom_quantity)
        
        frac = 1 / denom
        
        exp_frac_num = (-1) * (x - self.mean) ** 2
        exp_frac_denom = 2 * self.stdev ** 2
        exp_frac = exp_frac_num / exp_frac_denom
        
        prob_dens_fct = frac * math.exp(exp_frac)
        
        return prob_dens_fct
```
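
The `sample` flag in `calculate_stdev` decides whether to divide by `n - 1` (sample) or `n` (population). One way to sanity-check that logic is to compare it with Python's `statistics` module. The standalone function below is a hypothetical rewrite of the method's arithmetic, and the data values are made up.

```python
import math
import statistics


def calculate_stdev(data, sample=True):
    """Standalone version of the method's logic (hypothetical helper)."""
    mean = sum(data) / len(data)
    squared_diffs = sum((x - mean) ** 2 for x in data)
    n = len(data) - 1 if sample else len(data)
    return math.sqrt(squared_diffs / n)


data = [2, 4, 4, 4, 5, 5, 7, 9]

# Sample standard deviation divides by n - 1
print(calculate_stdev(data, sample=True), statistics.stdev(data))

# Population standard deviation divides by n
print(calculate_stdev(data, sample=False), statistics.pstdev(data))  # 2.0 2.0
```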

## Magic Methods
Magic methods let you customize and override default Python behavior.

By default, it's not possible to add two `Gaussian` objects together with `+`. If we define the `__add__` magic method for this class, we can!
```python
    def __add__(self, other):
        """
        Magic method to add together two Gaussian distributions
            When summing two independent Gaussian distributions, the mean value
                is the sum of the means of each Gaussian.
            The standard deviation is the square root of the sum of squares,
                i.e. sqrt(stdev_one ** 2 + stdev_two ** 2)
        Args:
            other (Gaussian): Gaussian instance
        Returns:
            Gaussian: Gaussian distribution 
        """
        result = Gaussian()

        result.mean = self.mean + other.mean
        result.stdev = math.sqrt(self.stdev ** 2 + other.stdev ** 2)
        
        return result


    def __repr__(self):
        """
        Magic method to output the characteristics of the Gaussian instance
        Args:
            None
        Returns:
            string: characteristics of the Gaussian
        """
        return "mean {}, standard deviation {}".format(self.mean, self.stdev)
```
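
To see the two magic methods in action, the sketch below uses a trimmed-down copy of `Gaussian` that keeps only the constructor, `__add__`, and `__repr__`. The means and standard deviations are arbitrary example values.

```python
import math


class Gaussian:
    """Trimmed-down copy of the class: only what the magic methods need."""

    def __init__(self, mu=0, sigma=1):
        self.mean = mu
        self.stdev = sigma

    def __add__(self, other):
        result = Gaussian()
        result.mean = self.mean + other.mean
        result.stdev = math.sqrt(self.stdev ** 2 + other.stdev ** 2)
        return result

    def __repr__(self):
        return "mean {}, standard deviation {}".format(self.mean, self.stdev)


gaussian_one = Gaussian(25, 3)
gaussian_two = Gaussian(30, 4)

# The + operator calls gaussian_one.__add__(gaussian_two)
gaussian_sum = gaussian_one + gaussian_two

# print() falls back to __repr__ because no __str__ is defined
print(gaussian_sum)  # mean 55, standard deviation 5.0
```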

# Inheritance
Inheritance lets you define a general class once and then share its attributes and methods with more specific classes.

For example, a clothing class with color, size, price, etc. generalizes shirts, socks, pants, etc.

Each of those classes can inherit from the clothing class instead of redefining the shared attributes and methods, and each "child" class can still be adjusted on its own. Changes to the parent propagate: adding a material attribute to the clothing class makes it available to every class that inherits from it (socks, pants, etc.).

Example of inheritance:
```python
class Clothing:
    
    def __init__(self, color, size, style, price):
        self.color = color
        self.size = size
        self.style = style
        self.price = price


    def change_price(self, price):
        self.price = price


    def calculate_discount(self, discount):
        return self.price * (1 - discount)
```
---
And here, we see the Shirt and Pants classes (child classes) inherit the attributes and methods of the Clothing class (parent class):

```python
class Shirt(Clothing):
    
    def __init__(self, color, size, style, price, long_or_short):
        
        Clothing.__init__(self, color, size, style, price)
        self.long_or_short = long_or_short


    def double_price(self):
        self.price = 2 * self.price

        
class Pants(Clothing):
    
    def __init__(self, color, size, style, price, waist):
        
        Clothing.__init__(self, color, size, style, price)
        self.waist = waist
        
        
    def calculate_discount(self, discount):
        return self.price * (1 - discount / 2)
```
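
The following sketch repeats the three classes so it can run on its own, then exercises an inherited method, a child-only method, and an overridden method. The argument values are arbitrary.

```python
class Clothing:
    def __init__(self, color, size, style, price):
        self.color = color
        self.size = size
        self.style = style
        self.price = price

    def change_price(self, price):
        self.price = price

    def calculate_discount(self, discount):
        return self.price * (1 - discount)


class Shirt(Clothing):
    def __init__(self, color, size, style, price, long_or_short):
        Clothing.__init__(self, color, size, style, price)
        self.long_or_short = long_or_short

    def double_price(self):
        self.price = 2 * self.price


class Pants(Clothing):
    def __init__(self, color, size, style, price, waist):
        Clothing.__init__(self, color, size, style, price)
        self.waist = waist

    def calculate_discount(self, discount):
        return self.price * (1 - discount / 2)


shirt = Shirt('red', 'M', 'casual', 20, 'long')
pants = Pants('blue', 'L', 'jeans', 40, 32)

shirt.change_price(10)                # inherited from Clothing
shirt.double_price()                  # defined only on Shirt
print(shirt.price)                    # 20
print(shirt.calculate_discount(0.5))  # inherited: 20 * (1 - 0.5) = 10.0
print(pants.calculate_discount(0.5))  # overridden: 40 * (1 - 0.25) = 30.0
print(isinstance(pants, Clothing))    # True
```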

### Another example of inheritance:
Parent class:
```python
class Clothing:

    def __init__(self, color, size, style, price):
        self.color = color
        self.size = size
        self.style = style
        self.price = price
        
    def change_price(self, price):
        self.price = price
        
        
    def calculate_discount(self, discount):
        return self.price * (1 - discount)
    
    
    def calculate_shipping(self, weight, rate):
        return weight * rate
```
---
Child classes:
```python
class Shirt(Clothing):
    
    def __init__(self, color, size, style, price, long_or_short):
        
        Clothing.__init__(self, color, size, style, price)
        self.long_or_short = long_or_short
    
    def double_price(self):
        self.price = 2*self.price
    
class Pants(Clothing):

    def __init__(self, color, size, style, price, waist):
        
        Clothing.__init__(self, color, size, style, price)
        self.waist = waist
        
    def calculate_discount(self, discount):
        return self.price * (1 - discount / 2)


class Blouse(Clothing):
    
    def __init__(self, color, size, style, price, country_of_origin):
        
        Clothing.__init__(self, color, size, style, price)
        self.country_of_origin = country_of_origin
        
        
    def triple_price(self):
        return 3 * self.price
```
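
As a usage sketch, the block below shows a `Blouse` calling a method it never defined (`calculate_shipping`, inherited from `Clothing`). It also swaps `Clothing.__init__(self, ...)` for `super().__init__(...)`, a common alternative in Python 3 that is not used in the examples above. The argument values are made up.

```python
class Clothing:
    def __init__(self, color, size, style, price):
        self.color = color
        self.size = size
        self.style = style
        self.price = price

    def calculate_shipping(self, weight, rate):
        return weight * rate


class Blouse(Clothing):
    def __init__(self, color, size, style, price, country_of_origin):
        # super() looks up the parent class, so its name is not repeated here
        super().__init__(color, size, style, price)
        self.country_of_origin = country_of_origin

    def triple_price(self):
        return 3 * self.price


blouse = Blouse('blue', 'M', 'luxury', 40, 'Brazil')

print(blouse.calculate_shipping(0.5, 3))  # inherited from Clothing: 1.5
print(blouse.triple_price())              # 120
print(blouse.price)                       # still 40: triple_price only returns a value
```

Note that `triple_price` returns a number without changing `self.price`, unlike `double_price` on `Shirt`, which updates the attribute in place.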