# Before you start:
- Read the README.md file
- Comment as much as you can and use the resources in the README.md file
- Happy learning!

In [1]:
import numpy as np
import pandas as pd

# Challenge 1 - Inside a function
You are already familiar with several built-in functions in Python, as well as functions from libraries. Let's try to figure out how they are made in pure Python.

In [162]:
L = [1,2,3,4,5,6,7,8]


Create a function that returns the count of a list of numbers

In [14]:
def count(my_list):
    return len(my_list)
count(L)

8

Create a function that returns the mean of a list of numbers

In [15]:
def mean_list(my_list):
    # Use the argument, not the global L, so the function works for any list
    return np.mean(my_list)
mean_list(L)

4.5

Create a function that returns the standard deviation of a list of numbers

In [17]:
def std_list(my_list):
    # Use the argument, not the global L; np.std divides by n by default
    return np.std(my_list)
std_list(L)

2.29128784747792
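
Since this challenge asks how such functions are made in pure Python, here is a minimal sketch that avoids NumPy altogether. The names `count_pure`, `mean_pure` and `std_pure` are hypothetical, and the standard deviation is the population version (dividing by n), which is what `np.std` computes by default.

```python
import math

def count_pure(values):
    """Count the elements by iterating, without calling len()."""
    total = 0
    for _ in values:
        total += 1
    return total

def mean_pure(values):
    """Arithmetic mean: sum of the elements divided by their count."""
    return sum(values) / count_pure(values)

def std_pure(values):
    """Population standard deviation (divides by n, like np.std's default)."""
    mu = mean_pure(values)
    squared_deviations = [(v - mu) ** 2 for v in values]
    return math.sqrt(sum(squared_deviations) / count_pure(values))

numbers = [1, 2, 3, 4, 5, 6, 7, 8]
print(count_pure(numbers), mean_pure(numbers), std_pure(numbers))
```

For the list used above, these should agree with the NumPy-based results: a count of 8, a mean of 4.5 and a standard deviation of about 2.2913.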

Create a function that returns the count, mean and standard deviation of a given list in dataframe format.  

In [232]:
def data_frame(my_list):
    d = {'Count': len(my_list), 'Mean': np.mean(my_list), 'STD': np.std(my_list)}
    return pd.DataFrame(d, index = ['list'])
data_frame(L)

Unnamed: 0,Count,Mean,STD
list,8,4.5,2.291288


# Challenge 2 - String Cleaning
When working with textual data you will often have to clean it before being able to analyse it. Create a function that receives a string and returns that string in all lower case and without special characters. 

In [233]:
s = 'Your code HERE:'
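def str_lower(stri):
    # Lowercase, then keep only letters, digits and whitespace;
    # strip() would only remove characters at the ends of the string
    return ''.join(ch for ch in stri.lower() if ch.isalnum() or ch.isspace())
str_lower(s)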

'your code here'
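
Special characters can appear anywhere in a string, not only at its ends. As a sketch of one possible approach, a regular expression can drop every character that is not a lowercase letter, a digit or whitespace. The function name `clean_string` is hypothetical, and treating everything outside `a-z`, `0-9` and whitespace as "special" is an assumption about what the exercise means.

```python
import re

def clean_string(text):
    """Lowercase the text and remove anything that is not a-z, 0-9 or whitespace."""
    lowered = text.lower()
    return re.sub(r"[^a-z0-9\s]", "", lowered)

print(clean_string("Your code HERE:"))
print(clean_string("#Hello, World! @2024"))
```

Note that this character class also removes accented letters; widen it if the text is not plain ASCII.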

# Challenge 2 - Bonus
Create a function that receives a string and returns the string in alphabetical order. 

In [200]:
s = 'Robson Silva'
def str_alpha(stri):
    # sorted() orders characters by code point: space, then uppercase, then lowercase
    return ' '.join(sorted(stri))
str_alpha(s)

'  R S a b i l n o o s v'
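
The result above places the space and the uppercase letters first because characters are compared by code point. If a case-insensitive ordering without spaces is wanted instead, a sketch could look like the following. The name `sort_letters` is hypothetical, and ignoring case and whitespace is an assumption about the intended meaning of "alphabetical order".

```python
def sort_letters(text):
    """Sort the non-space characters alphabetically, ignoring case."""
    letters = [ch for ch in text if not ch.isspace()]
    # key=str.lower compares 'R' as 'r'; the sort is stable, so equal keys keep their order
    return "".join(sorted(letters, key=str.lower))

print(sort_letters("Robson Silva"))
```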

# Challenge 3 - Applying Functions to DataFrames

In this challenge, we will look at how to transform cells or entire columns at once.

First, let's load a dataset. We will download the famous Iris classification dataset in the cell below.

In [88]:
columns = ['sepal_length', 'sepal_width', 'petal_length','petal_width','iris_type']
iris = pd.read_csv("https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data", names=columns)

In [90]:
iris.head()

Unnamed: 0,sepal_length,sepal_width,petal_length,petal_width,iris_type
0,5.1,3.5,1.4,0.2,Iris-setosa
1,4.9,3.0,1.4,0.2,Iris-setosa
2,4.7,3.2,1.3,0.2,Iris-setosa
3,4.6,3.1,1.5,0.2,Iris-setosa
4,5.0,3.6,1.4,0.2,Iris-setosa


Let's start off by using built-in functions. Try to apply the numpy mean function and describe what happens in the comments of the code.

In [201]:
# np.mean returns the mean of each numeric column; the text column iris_type is left out.
np.mean(iris)

sepal_length    5.843333
sepal_width     3.054000
petal_length    3.758667
petal_width     1.198667
dtype: float64

Next, we'll apply the standard deviation function in numpy (`np.std`). Describe what happened in the comments.

In [202]:
# np.std returns the standard deviation of each numeric column; the text column iris_type is left out.
np.std(iris)

sepal_length    0.825301
sepal_width     0.432147
petal_length    1.758529
petal_width     0.760613
dtype: float64
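
When comparing standard deviations across tools, it helps to know which divisor is used. The sketch below, on a small made-up list, contrasts the population form (divide by n, NumPy's default) with the sample form (divide by n - 1, the default of the pandas `std` method). The variable names are illustrative only.

```python
import numpy as np
import pandas as pd

values = [1, 2, 3, 4, 5, 6, 7, 8]

population_std = np.std(values)          # ddof=0: divide by n
sample_std_np = np.std(values, ddof=1)   # ddof=1: divide by n - 1
sample_std_pd = pd.Series(values).std()  # pandas defaults to ddof=1

print(population_std, sample_std_np, sample_std_pd)
```

The two sample values agree with each other and are slightly larger than the population value, so results from `np.std` and from a pandas `.std()` call should not be mixed without setting `ddof` explicitly.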

The measurements are in centimeters. Let's convert them all to inches. First, we will create a dataframe that contains only the numeric columns. Assign this new dataframe to `iris_numeric`.

In [219]:
iris_numeric = iris.iloc[:,0:4]
iris_numeric.head()

Unnamed: 0,sepal_length,sepal_width,petal_length,petal_width
0,5.1,3.5,1.4,0.2
1,4.9,3.0,1.4,0.2
2,4.7,3.2,1.3,0.2
3,4.6,3.1,1.5,0.2
4,5.0,3.6,1.4,0.2


Next, we will write a function that converts centimeters to inches in the cell below. Recall that 1cm = 0.393701in.

In [220]:
def cm_to_in(x):
    # 1 cm = 0.393701 in
    return x * 0.393701

Now convert all columns in `iris_numeric` to inches in the cell below. We like to think of functional transformations as immutable. Therefore, save the transformed data in a dataframe called `iris_inch`.

In [225]:
def convert(df):
    # Apply cm_to_in column by column; apply returns a new dataframe,
    # so the input is left unchanged
    return df.apply(cm_to_in)

In [227]:
iris_inch = convert(iris_numeric)
iris_inch.head()

Unnamed: 0,sepal_length,sepal_width,petal_length,petal_width
0,2.007875,1.377954,0.551181,0.07874
1,1.929135,1.181103,0.551181,0.07874
2,1.850395,1.259843,0.511811,0.07874
3,1.811025,1.220473,0.590552,0.07874
4,1.968505,1.417324,0.551181,0.07874
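
To illustrate what "immutable" means for a transformation like this, here is a small sketch on a made-up dataframe. The names `to_inches`, `CM_TO_IN` and `sample` are hypothetical. Multiplying a dataframe by a scalar builds a new dataframe, so the original stays in centimeters.

```python
import pandas as pd

CM_TO_IN = 0.393701  # conversion factor stated in the text

def to_inches(df):
    """Return a new dataframe in inches; the input is not modified."""
    return df * CM_TO_IN

sample = pd.DataFrame({"length_cm": [1.0, 10.0], "width_cm": [2.0, 20.0]})
sample_inch = to_inches(sample)

print(sample_inch)
print(sample)  # still in centimeters
```

Operating on whole columns also avoids cell-by-cell assignment such as `df[col][row] = value`, which pandas may warn about or ignore because it can end up writing to a temporary copy.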


We have just found that the original measurements were off by a constant. Define the global constant `error` and set it to 2. Write a function that uses the global constant and adds it to each cell in the dataframe. Apply this function to `iris_numeric` and save the result in `iris_constant`.

In [230]:
error = 2
def add_constant(x):
    # Reads the global constant `error` and returns the corrected value
    return x + error

# apply passes each column to add_constant and builds a new dataframe
iris_constant = iris_numeric.apply(add_constant)
iris_constant.head()

Unnamed: 0,sepal_length,sepal_width,petal_length,petal_width
0,7.1,5.5,3.4,2.2
1,6.9,5.0,3.4,2.2
2,6.7,5.2,3.3,2.2
3,6.6,5.1,3.5,2.2
4,7.0,5.6,3.4,2.2


# Bonus Challenge - Applying Functions to Columns

Read more about applying functions to either rows or columns [here](https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.apply.html) and write a function that computes the maximum value for each row of `iris_numeric`.

In [231]:
iris_numeric.apply(max, axis=1).head()

0    5.1
1    4.9
2    4.7
3    4.6
4    5.0
dtype: float64
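
The `axis` argument decides what `apply` hands to the function: with `axis=1` it receives one row at a time, and with `axis=0` (the default) one column at a time. The following sketch on a made-up dataframe shows both, along with the built-in `max` method, which gives the same row-wise result. The dataframe `sample` and the variable names are illustrative only.

```python
import pandas as pd

sample = pd.DataFrame({"a": [1.0, 5.0, 3.0], "b": [4.0, 2.0, 6.0]})

row_max_apply = sample.apply(max, axis=1)  # one value per row
col_max_apply = sample.apply(max, axis=0)  # one value per column
row_max_builtin = sample.max(axis=1)       # vectorized equivalent of the first line

print(row_max_apply.tolist())
print(col_max_apply.tolist())
print(row_max_builtin.tolist())
```

On large dataframes the built-in method is usually the faster choice, since `apply` with `axis=1` calls the Python function once per row.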

Compute the combined lengths for each row and the combined widths for each row using a function. Assign these values to new columns `total_length` and `total_width`.

In [243]:
def total(df):
    # Work on a copy so the input dataframe is left unchanged
    df_new = df.copy()
    df_new['total_length'] = df_new['sepal_length'] + df_new['petal_length']
    df_new['total_width'] = df_new['sepal_width'] + df_new['petal_width']
    return df_new

In [244]:
final_iris = total(iris_numeric)

In [245]:
final_iris.head()

Unnamed: 0,sepal_length,sepal_width,petal_length,petal_width,total_length,total_width
0,5.1,3.5,1.4,0.2,6.5,3.7
1,4.9,3.0,1.4,0.2,6.3,3.2
2,4.7,3.2,1.3,0.2,6.0,3.4
3,4.6,3.1,1.5,0.2,6.1,3.3
4,5.0,3.6,1.4,0.2,6.4,3.8
