# Week 03 - Project

## Guidelines and Docstrings

* https://peps.python.org/pep-0008/
* https://sphinxcontrib-napoleon.readthedocs.io/en/latest/example_google.html

### Arguments vs Parameters

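The two terms are often used interchangeably, but technically a parameter is the name listed in the function definition, and an argument is the value passed in when the function is called.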

<pre>
def my_function(parameter):
    return parameter
    
my_function(argument)
</pre>

In [None]:
# google docstring https://www.programiz.com/python-programming/docstrings example 5
def add_binary(a, b):
    '''
    Returns the sum of two decimal numbers in binary digits.

            Parameters:
                    a (int): A decimal integer
                    b (int): Another decimal integer

            Returns:
                    binary_sum (str): Binary string of the sum of a and b
    '''

    binary_sum = bin(a+b)[2:]
    return binary_sum

print(add_binary.__doc__)

In [None]:
# click right after add_binary and then shift+tab (repeat tab key to open the full docstring)
add_binary

## Testing

* Assert
* Unittest
* Pytest

In [None]:
# show add_binary, binary, and use assert to validate
print(add_binary(3, 2))
print(bin(5))
assert add_binary(3, 2) == bin(5)[2:]
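
The bullet list above also names Unittest. As a minimal sketch (the class name `TestAddBinary` and the two test cases are illustrative assumptions, not part of the course material), the same check could be written with the standard library's `unittest` module:

```python
import unittest


def add_binary(a, b):
    '''Returns the sum of two integers as a binary string (same logic as above).'''
    return bin(a + b)[2:]


class TestAddBinary(unittest.TestCase):
    # Hypothetical test cases, chosen only for illustration.
    def test_small_sum(self):
        self.assertEqual(add_binary(3, 2), '101')

    def test_zero(self):
        self.assertEqual(add_binary(0, 0), '0')


# Build and run the suite explicitly so the cell also works in a notebook,
# where unittest.main() would try to parse the kernel's command-line arguments.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestAddBinary)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Under pytest, the equivalent checks would typically be plain functions whose names start with `test_` and that use bare `assert` statements.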

In [None]:
# sklearn dataset to dataframe to csv using pandas
# import pandas as pd
# from sklearn.datasets import load_iris

# data = load_iris()
# df = pd.DataFrame(data=data.data, columns=data.feature_names)
# df.to_csv('iris.csv', index=False)

In [None]:
# pandas read_csv (more in Week 4)
import pandas as pd

df = pd.read_csv('iris.csv')
df.head()

In [None]:
# just get the data from sklearn datasets
import pandas as pd
from sklearn.datasets import load_iris

X, y = load_iris(return_X_y = True)
X[0:5]

In [None]:
# create our own reader
from csv import reader

def read_csv(csv_file):
    '''
    Returns the rows of a csv file as a list of lists, skipping the header row.

            Parameters:
                    csv_file (str): Path to a csv file

            Returns:
                    dataset (list): One list of string values per non-empty row
    '''
    
    dataset = list()
    with open(csv_file, 'r', newline='') as file:
        csv_reader = reader(file)
        next(csv_reader)
        for row in csv_reader:
            if not row:
                continue
            dataset.append(row)
    return dataset

csv_file = 'iris.csv'
data = read_csv(csv_file)
data[0:5]

In [None]:
def string_to_float(data):
    '''
    Returns float values from a list of rows of string values.

            Parameters:
                    data (list): A list of rows, each a list of strings that look like floats

            Returns:
                    temp (list): A list of rows, each a list of floats
    '''
    temp = []
    for line in data:
        temp.append([float(el) for el in line])
        
    return temp

csv_file = 'iris.csv'
data = read_csv(csv_file)
data = string_to_float(data)
data[0:5]

## Exponents

($x^\frac{1}{n} = \sqrt[n]{x}$)

$x^\frac{1}{n}$

In [None]:
# write your function, include docstring
...

$\sqrt[n]{x}$

In [None]:
# write your function, include docstring
...
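
One possible way to approach the two exponent cells, as a minimal sketch: since $x^\frac{1}{n} = \sqrt[n]{x}$, a single function using a fractional exponent covers both forms. The name `nth_root` and the input checks are assumptions made for illustration.

```python
def nth_root(x, n):
    '''
    Returns the n-th root of x using a fractional exponent.

            Parameters:
                    x (float): A non-negative number
                    n (int): The degree of the root, must not be zero

            Returns:
                    root (float): x raised to the power 1/n
    '''
    if n == 0:
        raise ValueError('n must not be zero')
    if x < 0:
        # A negative base with a fractional exponent yields a complex number
        # in Python, so this sketch restricts itself to non-negative input.
        raise ValueError('x must be non-negative')
    root = x ** (1 / n)
    return root


print(nth_root(16, 2))
print(nth_root(27, 3))
```

Floating-point rounding means some roots come out very slightly off the exact value, so comparisons are safer with a tolerance than with `==`.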

## Measures of Center

### Mean
$\frac{1}{N} \sum_{i=1}^{N} x_i$

In [None]:
# write your function, include docstring
...
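
A minimal sketch of the mean formula above, assuming the input is a non-empty list of numbers (the empty-list check is an added assumption):

```python
def mean(values):
    '''
    Returns the arithmetic mean of a list of numbers.

            Parameters:
                    values (list): A non-empty list of numbers

            Returns:
                    average (float): The sum of the values divided by their count
    '''
    if not values:
        raise ValueError('values must not be empty')
    average = sum(values) / len(values)
    return average


print(mean([1, 2, 3, 4]))
```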

### Median

$
\left\{
    \begin{array}{ll}
        x[\frac{n+1}{2}] & \mbox{if n is odd} \\
        \frac{x[\frac{n}{2}] + x[\frac{n}{2}+1]}{2} & \mbox{if n is even} 
    \end{array}
\right.
$

In [None]:
# write your function, include docstring
...
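
A minimal sketch of the median formula above. The formula counts positions from 1, while Python lists count from 0, so the indices in the code are shifted down by one. Sorting the input first is assumed, since the formula only makes sense on ordered values.

```python
def median(values):
    '''
    Returns the median of a list of numbers.

            Parameters:
                    values (list): A non-empty list of numbers, in any order

            Returns:
                    The middle value, or the mean of the two middle values
    '''
    if not values:
        raise ValueError('values must not be empty')
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # 1-based position (n + 1) / 2 is 0-based index n // 2
        return ordered[mid]
    # 1-based positions n / 2 and n / 2 + 1 are 0-based indices mid - 1 and mid
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median([3, 1, 2]))
print(median([4, 1, 3, 2]))
```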

### Mode

https://www.mathsisfun.com/mode.html
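
A minimal sketch of a mode function, assuming that when several values tie for the highest count, all of them should be returned. Returning a list is a design choice made here for illustration, not something the notes prescribe.

```python
from collections import Counter


def mode(values):
    '''
    Returns the most frequent value(s) in a list.

            Parameters:
                    values (list): A non-empty list of hashable values

            Returns:
                    modes (list): Every value that occurs with the highest count
    '''
    if not values:
        raise ValueError('values must not be empty')
    counts = Counter(values)
    highest = max(counts.values())
    # Keep every value whose count matches the highest count (handles ties)
    modes = [value for value, count in counts.items() if count == highest]
    return modes


print(mode([6, 3, 9, 6, 6, 5]))
print(mode([1, 2, 2, 3, 3]))
```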

In [None]:
# write your function, include docstring
...

## Measures of Spread

### Variance

if population: $\sigma^2 = \frac{1}{N}\sum({x}-\mu)^2$

else if sample: $s^2 = \frac{\sum(x-\bar{x})^2}{n-1}$

else unknown

parameters: list, df 

In [1]:
# write your function, include docstring
...
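
A minimal sketch of the three branches above (population, sample, unknown). The parameter name `kind`, its string values, and raising `ValueError` for the unknown case are assumptions made for illustration; the notes do not fix how the second parameter is passed.

```python
def variance(values, kind='sample'):
    '''
    Returns the population or sample variance of a list of numbers.

            Parameters:
                    values (list): A list of numbers
                    kind (str): 'population' or 'sample' (hypothetical parameter)

            Returns:
                    var (float): The variance of the values
    '''
    n = len(values)
    if kind == 'population':
        divisor = n
    elif kind == 'sample':
        divisor = n - 1
    else:
        raise ValueError("kind must be 'population' or 'sample'")
    if divisor <= 0:
        raise ValueError('not enough values for this kind of variance')
    center = sum(values) / n
    var = sum((x - center) ** 2 for x in values) / divisor
    return var


data = [2, 4, 4, 4, 5, 5, 7, 9]
print(variance(data, kind='population'))
print(variance(data, kind='sample'))
```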

### Standard Deviation

if population: $\sigma = \sqrt{\frac{1}{N}\sum(x-\mu)^2}$

else if sample: $s = \sqrt{\frac{\sum(x-\bar{x})^2}{n - 1}}$

else unknown

parameters: list, df

In [None]:
# write your function, include docstring
...
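
A minimal sketch of the standard deviation as the square root of the variance. As with the variance, the `kind` parameter and the `ValueError` for the unknown case are illustrative assumptions.

```python
from math import sqrt


def standard_deviation(values, kind='sample'):
    '''
    Returns the population or sample standard deviation of a list of numbers.

            Parameters:
                    values (list): A list of numbers
                    kind (str): 'population' or 'sample' (hypothetical parameter)

            Returns:
                    std (float): The square root of the variance
    '''
    n = len(values)
    if kind == 'population':
        divisor = n
    elif kind == 'sample':
        divisor = n - 1
    else:
        raise ValueError("kind must be 'population' or 'sample'")
    if divisor <= 0:
        raise ValueError('not enough values for this kind of standard deviation')
    center = sum(values) / n
    std = sqrt(sum((x - center) ** 2 for x in values) / divisor)
    return std


print(standard_deviation([2, 4, 4, 4, 5, 5, 7, 9], kind='population'))
```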

## Scaling

### Standardization

$\large{\frac{x - \bar{x}}{\sigma}}$

parameter: list

In [None]:
# write your function, include docstring
...
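
A minimal sketch of standardization. Because the formula writes $\sigma$, this version assumes the population standard deviation (dividing by the number of values); swapping in the sample version would only change the divisor.

```python
from math import sqrt


def standardize(values):
    '''
    Returns the values rescaled to mean 0 and standard deviation 1.

            Parameters:
                    values (list): A non-empty list of numbers that are not all equal

            Returns:
                    scaled (list): (x - mean) / sigma for each x
    '''
    if not values:
        raise ValueError('values must not be empty')
    n = len(values)
    center = sum(values) / n
    # Population standard deviation (an assumption, see the note above)
    sigma = sqrt(sum((x - center) ** 2 for x in values) / n)
    if sigma == 0:
        raise ValueError('all values are equal, so sigma is zero')
    scaled = [(x - center) / sigma for x in values]
    return scaled


print(standardize([2, 4, 4, 4, 5, 5, 7, 9]))
```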

### Normalization (Min Max Scaling)

$\large{\frac{x - x_{min}}{x_{max} - x_{min}}}$

parameter: list

In [None]:
# write your function, include docstring
...
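
A minimal sketch of min max scaling, which maps the smallest value to 0 and the largest to 1. The check for identical values is an added assumption to avoid dividing by zero.

```python
def min_max_scale(values):
    '''
    Returns the values rescaled to the range 0 to 1.

            Parameters:
                    values (list): A non-empty list of numbers that are not all equal

            Returns:
                    scaled (list): (x - min) / (max - min) for each x
    '''
    if not values:
        raise ValueError('values must not be empty')
    lowest, highest = min(values), max(values)
    if highest == lowest:
        raise ValueError('all values are equal, so the range is zero')
    scaled = [(x - lowest) / (highest - lowest) for x in values]
    return scaled


print(min_max_scale([10, 20, 30]))
```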

## Linear Regression

### Simple Linear Regression

$y = \beta_0 + \beta_1 x_1$

In [1]:
# write your function, include docstring
...
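
A minimal sketch of fitting $y = \beta_0 + \beta_1 x_1$. The notes give only the model equation, so this assumes ordinary least squares, where $\beta_1 = \frac{\sum(x-\bar{x})(y-\bar{y})}{\sum(x-\bar{x})^2}$ and $\beta_0 = \bar{y} - \beta_1\bar{x}$. The function names are hypothetical.

```python
def fit_simple_linear_regression(x, y):
    '''
    Returns the least-squares intercept and slope for one predictor.

            Parameters:
                    x (list): Predictor values
                    y (list): Response values, same length as x

            Returns:
                    (beta_0, beta_1) (tuple): Intercept and slope
    '''
    if len(x) != len(y) or len(x) < 2:
        raise ValueError('x and y must have the same length, at least 2')
    x_bar = sum(x) / len(x)
    y_bar = sum(y) / len(y)
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    if s_xx == 0:
        raise ValueError('x values are all equal, so the slope is undefined')
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    beta_1 = s_xy / s_xx
    beta_0 = y_bar - beta_1 * x_bar
    return beta_0, beta_1


def predict(x, beta_0, beta_1):
    '''Returns beta_0 + beta_1 * xi for each xi in x.'''
    return [beta_0 + beta_1 * xi for xi in x]


beta_0, beta_1 = fit_simple_linear_regression([1, 2, 3, 4], [3, 5, 7, 9])
print(beta_0, beta_1)
print(predict([5, 6], beta_0, beta_1))
```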

### Mean Squared Error

$\frac{1}{n}\sum(y-\hat{y})^2$

In [None]:
# write your function, include docstring
...
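
A minimal sketch of the mean squared error, assuming two equal-length lists of actual values $y$ and predictions $\hat{y}$ (the parameter names are illustrative):

```python
def mean_squared_error(actual, predicted):
    '''
    Returns the mean of the squared differences between actual and predicted.

            Parameters:
                    actual (list): Observed values y
                    predicted (list): Predicted values y-hat, same length as actual

            Returns:
                    mse (float): The mean squared error
    '''
    if not actual or len(actual) != len(predicted):
        raise ValueError('inputs must be non-empty and of equal length')
    squared_errors = [(y - y_hat) ** 2 for y, y_hat in zip(actual, predicted)]
    mse = sum(squared_errors) / len(actual)
    return mse


print(mean_squared_error([3, 5, 7], [2, 5, 9]))
```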

### Root Mean Squared Error

$\sqrt{\frac{1}{n}\sum(y-\hat{y})^2}$

In [None]:
# write your function, include docstring
...
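
A minimal sketch of the root mean squared error, which takes the square root of the MSE so the result is in the same units as $y$. The parameter names are illustrative assumptions.

```python
from math import sqrt


def root_mean_squared_error(actual, predicted):
    '''
    Returns the square root of the mean squared error.

            Parameters:
                    actual (list): Observed values y
                    predicted (list): Predicted values y-hat, same length as actual

            Returns:
                    rmse (float): The root mean squared error
    '''
    if not actual or len(actual) != len(predicted):
        raise ValueError('inputs must be non-empty and of equal length')
    mse = sum((y - y_hat) ** 2 for y, y_hat in zip(actual, predicted)) / len(actual)
    rmse = sqrt(mse)
    return rmse


print(root_mean_squared_error([1, 2, 3, 4], [3, 4, 5, 6]))
```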

### R Squared

* $SS_{res} = \sum{(y - \hat{y})^2}$
* $SS_{tot} = \sum{(y - \bar{y})^2}$
* $R^2 = 1 - \frac{SS_{res}}{SS_{tot}}$

In [None]:
# write your function, include docstring
...
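
A minimal sketch that follows the three bullet points above step by step. The check for a zero $SS_{tot}$ is an added assumption, since $R^2$ is undefined when every actual value is the same.

```python
def r_squared(actual, predicted):
    '''
    Returns the coefficient of determination, 1 - SS_res / SS_tot.

            Parameters:
                    actual (list): Observed values y
                    predicted (list): Predicted values y-hat, same length as actual

            Returns:
                    r2 (float): The R squared score
    '''
    if not actual or len(actual) != len(predicted):
        raise ValueError('inputs must be non-empty and of equal length')
    y_bar = sum(actual) / len(actual)
    ss_res = sum((y - y_hat) ** 2 for y, y_hat in zip(actual, predicted))
    ss_tot = sum((y - y_bar) ** 2 for y in actual)
    if ss_tot == 0:
        raise ValueError('actual values are all equal, so SS_tot is zero')
    r2 = 1 - ss_res / ss_tot
    return r2


print(r_squared([1, 2, 3], [1, 2, 3]))
print(r_squared([1, 2, 3], [2, 2, 2]))
```

A perfect fit gives $SS_{res} = 0$, and predicting the mean for every point gives $SS_{res} = SS_{tot}$, which is why those two cases make useful sanity checks.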