**Applying operations over a pandas DataFrame**

https://chrisalbon.com/python/data_wrangling/pandas_apply_operations_to_dataframes/

**Preliminaries**

In [1]:
import pandas as pd
import numpy as np

**Create a dataframe**

In [5]:
data = {'name': ['Jason', 'Molly', 'Tina', 'Jake', 'Amy'], 
        'year': [2012, 2012, 2013, 2014, 2014], 
        'reports': [4, 24, 31, 2, 3],
        'coverage': [25, 94, 57, 62, 70]}

df = pd.DataFrame(data, index = ['Cochice', 'Pima', 'Santa Cruz', 'Maricopa', 'Yuma'])
df

            coverage   name  reports  year
Cochice           25  Jason        4  2012
Pima              94  Molly       24  2012
Santa Cruz        57   Tina       31  2013
Maricopa          62   Jake        2  2014
Yuma              70    Amy        3  2014


**Create a capitalization lambda function**

In [3]:
capitalizer = lambda x: x.upper()

**Apply the capitalizer over the column 'name'**

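On a dataframe, apply() runs a function along either axis (over each column or over each row). Called on a single column, as here, it runs the function on each value of that series.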

In [6]:
df['name'].apply(capitalizer)

Cochice       JASON
Pima          MOLLY
Santa Cruz     TINA
Maricopa       JAKE
Yuma            AMY
Name: name, dtype: object
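The call above works on a single column, so it does not show how the axis argument behaves. As a minimal sketch (the `demo` frame and both lambda functions are illustrative and not part of the original notebook), apply() on a dataframe passes each column to the function with axis=0 and each row with axis=1:

```python
import pandas as pd

# Hypothetical numeric frame, reusing two columns from the example data
demo = pd.DataFrame({'reports': [4, 24, 31, 2, 3],
                     'coverage': [25, 94, 57, 62, 70]},
                    index=['Cochice', 'Pima', 'Santa Cruz', 'Maricopa', 'Yuma'])

# axis=0 (the default): the function receives each column as a Series
col_range = demo.apply(lambda col: col.max() - col.min(), axis=0)

# axis=1: the function receives each row as a Series
row_total = demo.apply(lambda row: row['reports'] + row['coverage'], axis=1)

print(col_range)
print(row_total)
```

With axis=0 the result is indexed by column name, and with axis=1 it is indexed by the row labels.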

**Map the capitalizer lambda function over each element in the series 'name'**

In [7]:
df['name'].map(capitalizer)

Cochice       JASON
Pima          MOLLY
Santa Cruz     TINA
Maricopa       JAKE
Yuma            AMY
Name: name, dtype: object
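For a function of one value, map() and apply() give the same result on a series. Two related options may be worth knowing, sketched here on a small stand-in series (`names` and the team labels are made up for illustration): the vectorized string accessor, and map() with a dict instead of a function.

```python
import pandas as pd

# Hypothetical stand-in for the 'name' column
names = pd.Series(['Jason', 'Molly', 'Tina'],
                  index=['Cochice', 'Pima', 'Santa Cruz'])

# Vectorized string method: same result as mapping a capitalizer function
upper_names = names.str.upper()

# map() also accepts a dict; values missing from the dict become NaN
teams = names.map({'Jason': 'A', 'Molly': 'B'})

print(upper_names)
print(teams)
```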

**Apply a square root function to every single cell in the entire dataframe**

applymap() applies a function to every element in the dataframe. In pandas 2.1 and later, applymap() is deprecated in favor of the equivalent DataFrame.map().

In [9]:
# Drop the string column 'name' so that applymap() can run
df.drop('name', axis=1, inplace=True)

# Return the square root of every cell in the dataframe
df.applymap(np.sqrt)

            coverage   reports       year
Cochice     5.000000  2.000000  44.855323
Pima        9.695360  4.898979  44.855323
Santa Cruz  7.549834  5.567764  44.866469
Maricopa    7.874008  1.414214  44.877611
Yuma        8.366600  1.732051  44.877611
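For a NumPy function such as np.sqrt, an element-wise call is not strictly needed, because NumPy universal functions accept a dataframe directly. The sketch below, on a small stand-in frame (`nums` is illustrative), compares that with an element-wise call that uses DataFrame.map() where it exists (pandas 2.1 and later) and falls back to applymap() otherwise:

```python
import numpy as np
import pandas as pd

# Hypothetical numeric frame with two rows of the example data
nums = pd.DataFrame({'coverage': [25, 94], 'reports': [4, 24]},
                    index=['Cochice', 'Pima'])

# NumPy ufuncs accept a DataFrame directly and preserve its labels
vectorized = np.sqrt(nums)

# Element-wise call: prefer DataFrame.map() when the installed pandas has it,
# otherwise use the older applymap()
if hasattr(nums, 'map'):
    elementwise = nums.map(np.sqrt)
else:
    elementwise = nums.applymap(np.sqrt)

print(vectorized)
print(elementwise)
```

Both approaches should produce the same values; the vectorized form is usually faster on large frames.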


**Applying a function over a dataframe**

Create a function that multiplies every non-string value by 100 and returns strings unchanged

In [10]:
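def times100(x):
    # Leave strings untouched
    if isinstance(x, str):
        return x
    # Multiply any truthy (for example, non-zero) value by 100
    elif x:
        return x * 100
    # Pass falsy values such as 0 or None through unchanged
    else:
        return x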

**Apply the times100() function over every cell in the dataframe**

In [11]:
df.applymap(times100)

            coverage  reports    year
Cochice         2500      400  201200
Pima            9400     2400  201200
Santa Cruz      5700     3100  201300
Maricopa        6200      200  201400
Yuma            7000      300  201400
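A cell-by-cell function like times100() is flexible, but it calls Python once per element. When the goal is only to scale the numeric columns, a vectorized alternative is possible. This is a minimal sketch on a stand-in frame that still contains a string column (`mixed`, `numeric_cols`, and `scaled` are illustrative names):

```python
import pandas as pd

# Hypothetical frame mixing a string column with numeric columns
mixed = pd.DataFrame({'name': ['Jason', 'Molly'],
                      'reports': [4, 24],
                      'coverage': [25, 94]},
                     index=['Cochice', 'Pima'])

# Pick out the numeric columns and scale them in one vectorized step
numeric_cols = mixed.select_dtypes(include='number').columns
scaled = mixed.copy()
scaled[numeric_cols] = scaled[numeric_cols] * 100

print(scaled)
```

The string column is left as it was, so there is no need to drop it first.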
