# Week 11 Warm-Up Type-Along
## Using `map()`, `apply()`, `applymap()`

Source: https://towardsdatascience.com/introduction-to-pandas-apply-applymap-and-map-5d3e044e93ff


#### First, import pandas and numpy, then build a DataFrame

NumPy underlies much of pandas and provides fast, array-based mathematical operations.
>- numpy documentation: https://numpy.org/devdocs/user/whatisnumpy.html

In [3]:
import pandas as pd
import numpy as np

# Small example DataFrame with labeled rows
df = pd.DataFrame({'A': [1, 2, 3, 4],
                   'B': [10, 20, 30, 40],
                   'C': [20, 40, 60, 80]},
                  index=['Row 1', 'Row 2', 'Row 3', 'Row 4'])

In [4]:
df.head()

Unnamed: 0,A,B,C
Row 1,1,10,20
Row 2,2,20,40
Row 3,3,30,60
Row 4,4,40,80


## Using `map()` to modify a Series

#### First, add 10 to all numbers in column 'A'
>- Store the results in a new column named `'A+10'`

In [11]:
df['A+10'] = df['A'].map(lambda x: x + 10)

df

Unnamed: 0,A,B,C,A+10
Row 1,1,10,20,11
Row 2,2,20,40,12
Row 3,3,30,60,13
Row 4,4,40,80,14
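
`map()` is not limited to functions: it also accepts a dict (or another Series) as a lookup table. The sketch below is only illustrative; the toy Series `s` and the `labels` dictionary are made up for this example and are not part of the type-along. It also shows that for simple arithmetic, a plain vectorized expression gives the same values as the `lambda` version.

```python
import pandas as pd

# Toy Series standing in for column 'A' (illustrative only)
s = pd.Series([1, 2, 3, 4], index=['Row 1', 'Row 2', 'Row 3', 'Row 4'])

# Hypothetical lookup table; the key 4 is left out on purpose
labels = {1: 'one', 2: 'two', 3: 'three'}

mapped = s.map(labels)               # values with no matching key become NaN
plus_ten = s.map(lambda x: x + 10)   # same idea as the cell above
vectorized = s + 10                  # same values without calling a Python function per element
```

For plain arithmetic the vectorized form is usually the simpler choice; `map()` earns its keep when the transformation is a lookup or arbitrary Python logic.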


#### We can also use conditional logic
>- Add a new column that stores a `0` for all values in 'A' below 3 and `1` for all values 3 and above
>- Name the new column `flag3`

In [14]:
df['flag3'] = df['A'].map(lambda x: 0 if x < 3 else 1)

df

Unnamed: 0,A,B,C,A+10,flag3
Row 1,1,10,20,11,0
Row 2,2,20,40,12,0
Row 3,3,30,60,13,1
Row 4,4,40,80,14,1
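
As a point of comparison, the same two-way flag can be built without `map()`. This is a minimal sketch on a stand-in Series `a` (an assumption for illustration, not the notebook's `df`), using `np.where` and a boolean comparison.

```python
import numpy as np
import pandas as pd

# Stand-in for column 'A' (illustrative only)
a = pd.Series([1, 2, 3, 4], name='A')

# Element-by-element version, as in the cell above
flag_map = a.map(lambda x: 0 if x < 3 else 1)

# Vectorized alternatives that produce the same flags
flag_where = pd.Series(np.where(a < 3, 0, 1), index=a.index)
flag_bool = (a >= 3).astype(int)
```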


### If you have more conditions or a more complicated request, define a custom function
>- Then call your function within `map()`

In [16]:
def myfilter(x):
    """Return 0 for values below 2, 3 for exactly 3, and 1 for everything else."""
    if x < 2:
        return 0
    elif x == 3:
        return 3
    else:
        return 1

In [19]:
df['flagX'] = df['A'].map(myfilter)

df

Unnamed: 0,A,B,C,A+10,flag3,flag4,flagX
Row 1,1,10,20,11,0,0,0
Row 2,2,20,40,12,0,1,1
Row 3,3,30,60,13,1,3,3
Row 4,4,40,80,14,1,1,1
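
One thing to watch with custom functions is missing data. The sketch below assumes a toy Series containing a `NaN` (not present in the notebook's data) and shows how the `na_action='ignore'` argument of `Series.map()` skips missing values instead of passing them to the function.

```python
import numpy as np
import pandas as pd

def myfilter(x):
    """Same logic as the function defined above."""
    if x < 2:
        return 0
    elif x == 3:
        return 3
    else:
        return 1

# Toy Series with a missing value (illustrative only)
s = pd.Series([1, 2, 3, np.nan])

# NaN fails both comparisons, so it falls through to the else branch and becomes 1
without_ignore = s.map(myfilter)

# With na_action='ignore', NaN is passed through untouched
with_ignore = s.map(myfilter, na_action='ignore')
```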


In [28]:
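# drop() returns a new DataFrame; because the result is not assigned back,
# df itself still contains 'flag4' in the output below
df.drop(columns = ['flag4'])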

df

Unnamed: 0,A,B,C,A+10,flag3,flag4,flagX,rowTot
Row 1,1,10,20,11,0,0,0,31
Row 2,2,20,40,12,0,1,1,62
Row 3,3,30,60,13,1,3,3,93
Row 4,4,40,80,14,1,1,1,124
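
The output above still lists `flag4` because `drop()` hands back a new DataFrame rather than changing the original. A minimal sketch of the difference, using a throwaway frame named `df_demo` (a name chosen here for illustration):

```python
import pandas as pd

# Throwaway frame (illustrative only)
df_demo = pd.DataFrame({'A': [1, 2], 'flag4': [0, 1]})

# The returned frame is discarded, so df_demo is unchanged
df_demo.drop(columns=['flag4'])
still_there = 'flag4' in df_demo.columns

# Assigning the result back is what actually removes the column
df_demo = df_demo.drop(columns=['flag4'])
gone = 'flag4' not in df_demo.columns
```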


# Using `apply()` to apply a function along an axis of a DataFrame or to the values of a Series

## Let's use `apply()` to sum all the values across the columns
>- Name the column `rowTot`
>- We will only sum columns 'A', 'B', 'C'

### First, define a function

In [25]:
def mySum(value):
    return value.sum()

### Then, use `apply()` to apply your function across the columns
>- `axis = 1` tells `apply()` to work across the columns, so the function receives one row at a time

In [31]:
df['rowTot'] = df[['A','B','C']].apply(mySum, axis=1)

df

Unnamed: 0,A,B,C,A+10,flag3,flag4,flagX,rowTot
Row 1,1,10,20,11,0,0,0,31
Row 2,2,20,40,12,0,1,1,62
Row 3,3,30,60,13,1,3,3,93
Row 4,4,40,80,14,1,1,1,124
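
To make the `axis=1` behavior concrete, here is a small sketch on a fresh copy of the three original columns (the name `df_demo` is an assumption for illustration). It checks what kind of object the function receives for each row and compares the result with the built-in `sum(axis=1)`.

```python
import pandas as pd

# Fresh copy of the original three columns (illustrative only)
df_demo = pd.DataFrame({'A': [1, 2, 3, 4],
                        'B': [10, 20, 30, 40],
                        'C': [20, 40, 60, 80]},
                       index=['Row 1', 'Row 2', 'Row 3', 'Row 4'])

def mySum(value):
    return value.sum()

# With axis=1, each row arrives as a Series indexed by the column names
kinds = df_demo.apply(lambda row: type(row).__name__, axis=1)

row_tot_apply = df_demo.apply(mySum, axis=1)
row_tot_builtin = df_demo.sum(axis=1)   # built-in equivalent for a plain sum
```

For a simple sum the built-in method does the same job; `apply()` becomes more useful when the per-row logic has no ready-made method.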


#### You can also use a lambda function because our custom function was not that complicated

In [32]:
df['rowTot'] = df[['A','B','C']].apply(lambda x: x.sum(), axis = 1)

df

Unnamed: 0,A,B,C,A+10,flag3,flag4,flagX,rowTot
Row 1,1,10,20,11,0,0,0,31
Row 2,2,20,40,12,0,1,1,62
Row 3,3,30,60,13,1,3,3,93
Row 4,4,40,80,14,1,1,1,124


### We can also use `apply()` to sum across (or down if you prefer) the rows
>- We will label the sum row as `colTot`
>- Passing `axis = 0` (the default) tells `apply()` to work down the rows, so the function receives one column at a time

In [33]:
df.loc['colTot'] = df.apply(lambda x: x.sum(), axis = 0)

df

Unnamed: 0,A,B,C,A+10,flag3,flag4,flagX,rowTot
Row 1,1,10,20,11,0,0,0,31
Row 2,2,20,40,12,0,1,1,62
Row 3,3,30,60,13,1,3,3,93
Row 4,4,40,80,14,1,1,1,124
colTot,10,100,200,50,2,5,5,310
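
A caution that follows from storing the totals as a row: once `colTot` is part of the DataFrame, any later column-wise operation includes it. The sketch below uses a small stand-in frame `df_demo` (an assumed name, for illustration) to show the totals before and after the row is added.

```python
import pandas as pd

# Small stand-in frame (illustrative only)
df_demo = pd.DataFrame({'A': [1, 2, 3, 4],
                        'B': [10, 20, 30, 40]},
                       index=['Row 1', 'Row 2', 'Row 3', 'Row 4'])

# With axis=0, each column arrives as a Series; the result is indexed by column name
col_tot = df_demo.apply(lambda col: col.sum(), axis=0)
df_demo.loc['colTot'] = col_tot

# Summing again now counts the colTot row too, so every total doubles
rerun = df_demo.apply(lambda col: col.sum(), axis=0)
```

Keeping totals in a separate Series, or excluding the `colTot` row before further calculations, avoids this double counting.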


# Using `applymap()` to apply a function to every element in a DataFrame

## Square everything in the DataFrame using `np.square`
>- Check here for other numpy math functions: https://numpy.org/doc/stable/reference/routines.math.html

In [34]:
df

Unnamed: 0,A,B,C,A+10,flag3,flag4,flagX,rowTot
Row 1,1,10,20,11,0,0,0,31
Row 2,2,20,40,12,0,1,1,62
Row 3,3,30,60,13,1,3,3,93
Row 4,4,40,80,14,1,1,1,124
colTot,10,100,200,50,2,5,5,310


In [37]:
df.applymap(np.square)

Unnamed: 0,A,B,C,A+10,flag3,flag4,flagX,rowTot
Row 1,1,100,400,121,0,0,0,961
Row 2,4,400,1600,144,0,1,1,3844
Row 3,9,900,3600,169,1,9,9,8649
Row 4,16,1600,6400,196,1,1,1,15376
colTot,100,10000,40000,2500,4,25,25,96100


#### You can also use a `lambda` function

In [38]:
df.applymap(lambda x: x**2)

Unnamed: 0,A,B,C,A+10,flag3,flag4,flagX,rowTot
Row 1,1,100,400,121,0,0,0,961
Row 2,4,400,1600,144,0,1,1,3844
Row 3,9,900,3600,169,1,9,9,8649
Row 4,16,1600,6400,196,1,1,1,15376
colTot,100,10000,40000,2500,4,25,25,96100
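
A version note: in pandas 2.1 and later, `DataFrame.applymap()` is deprecated in favor of `DataFrame.map()`, which does the same element-wise job. The sketch below, on a throwaway frame `df_demo` (an assumed name), picks whichever method is available and also shows two vectorized ways to square every value.

```python
import numpy as np
import pandas as pd

# Throwaway frame (illustrative only)
df_demo = pd.DataFrame({'A': [1, 2], 'B': [10, 20]})

# DataFrame.map exists from pandas 2.1 on; older versions only have applymap
if hasattr(pd.DataFrame, 'map'):
    squared_elementwise = df_demo.map(lambda x: x ** 2)
else:
    squared_elementwise = df_demo.applymap(lambda x: x ** 2)

# Vectorized alternatives that operate on the whole frame at once
squared_numpy = np.square(df_demo)
squared_operator = df_demo ** 2
```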
