https://towardsdatascience.com/apply-functions-to-pandas-dataframe-using-map-apply-applymap-and-pipe-9571b1f1cb18

DataFrame.pipe      Apply chainable functions that expect Series or DataFrames.

DataFrame.apply     Apply a function along input axis of DataFrame.

DataFrame.applymap  Apply a function elementwise on a whole DataFrame.

Series.map          Apply a mapping correspondence on a Series.

pipe lets you pass a callable and have the object that called pipe handed to that callable as its first argument, so df.pipe(f) is equivalent to f(df).
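
To make the equivalence concrete, here is a minimal sketch. The helpers `add_bmi` and `keep_above` are hypothetical names invented for this illustration, and the frame reuses the height and weight values from the example data further down.

```python
import pandas as pd

df = pd.DataFrame({'height': [161.0, 173.5, 180.5],
                   'weight': [62.3, 55.7, 80.0]})

def add_bmi(frame):
    # hypothetical helper: return a new frame with a BMI column (kg / m^2)
    return frame.assign(bmi=frame['weight'] / (frame['height'] / 100) ** 2)

def keep_above(frame, column, minimum):
    # hypothetical helper: arguments given after the callable are forwarded by pipe
    return frame[frame[column] > minimum]

# the calling object becomes the first argument of each callable
piped = df.pipe(add_bmi).pipe(keep_above, 'bmi', 20)

# the same computation written as nested calls
nested = keep_above(add_bmi(df), 'bmi', 20)

print(piped.equals(nested))
```

The piped form reads left to right in the order the steps run, which is the usual reason to prefer it over nested calls.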

With apply, the calling object is assumed to have subcomponents, each of which is passed in turn to the callable given to apply.  In a groupby context the subcomponents are the slices of the dataframe that called groupby, and each slice is itself a dataframe.  The same holds for a series groupby, where each slice is a series.
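
A small sketch of this behaviour, assuming a made-up `team` column for grouping (it is not part of the example data in these notes). The callables record the type of each slice they receive.

```python
import pandas as pd

df = pd.DataFrame({'team': ['a', 'a', 'b'],          # 'team' is an assumed column
                   'weight': [62.3, 55.7, 80.0]})

frame_types = []

def heaviest(group):
    # each call receives one slice of the grouped dataframe
    frame_types.append(type(group).__name__)
    return group['weight'].max()

# dataframe groupby: every slice is a DataFrame
frame_result = df.groupby('team')[['weight']].apply(heaviest)

series_types = []

def heaviest_value(values):
    # each call receives one slice of the grouped column
    series_types.append(type(values).__name__)
    return values.max()

# series groupby: every slice is a Series
series_result = df.groupby('team')['weight'].apply(heaviest_value)

print(set(frame_types), set(series_types))
```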

The main difference between pipe and apply in a groupby context is that with pipe the callable has the entire groupby object in scope, whereas with apply it only knows about the local slice.
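
The contrast can be sketched as follows, again assuming a made-up `team` column. The function name `spread_of_group_means` is hypothetical; it needs to see every group at once, which is only possible through pipe.

```python
import pandas as pd

df = pd.DataFrame({'team': ['a', 'a', 'b'],          # 'team' is an assumed column
                   'weight': [62.3, 55.7, 80.0]})

grouped = df.groupby('team')['weight']

def spread_of_group_means(gb):
    # gb is the whole groupby object, so the callable can compare groups
    means = gb.mean()
    return means.max() - means.min()

# pipe: called once, with the groupby object itself
spread = grouped.pipe(spread_of_group_means)

# apply: called once per group, each time with only that group's slice
sizes = grouped.apply(lambda values: len(values))

print(spread)
print(sizes)
```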

<img src = "./Images/MapPipe.png">

In [42]:
# python version 3.9
# pandas version 1.4.1
import pandas as pd
df = pd.DataFrame({'name':['John Doe', 'Mary Re', 'Harley Me'],
                   'gender':[1,2,0],
                   'age':[80, 38, 12],
                   'height': [161.0, 173.5, 180.5],
                   'weight': [62.3, 55.7, 80.0]
                   })
df

Unnamed: 0,name,gender,age,height,weight
0,John Doe,1,80,161.0,62.3
1,Mary Re,2,38,173.5,55.7
2,Harley Me,0,12,180.5,80.0


In [43]:
d = df.copy(deep=True)
gender_map = {0: 'Unknown', 1:'Male', 2:'Female'}
d['gender'] = d['gender'].map(gender_map)
d

Unnamed: 0,name,gender,age,height,weight
0,John Doe,Male,80,161.0,62.3
1,Mary Re,Female,38,173.5,55.7
2,Harley Me,Unknown,12,180.5,80.0


In [44]:
d=df.copy(deep=True)
gender_map = {0: 'Unknown', 1:'Male', 2:'Female'}
s = pd.Series(gender_map) # mapping series
d['gender'] = d['gender'].map(s)
d

Unnamed: 0,name,gender,age,height,weight
0,John Doe,Male,80,161.0,62.3
1,Mary Re,Female,38,173.5,55.7
2,Harley Me,Unknown,12,180.5,80.0
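
One behaviour worth keeping in mind with a dict or Series mapping: values that have no matching key come back as NaN. The sketch below adds a code `3` that is not in `gender_map` (an assumption made purely for illustration) and then fills the gap.

```python
import pandas as pd

codes = pd.Series([1, 2, 0, 3])   # 3 has no entry in the mapping (illustrative value)
gender_map = {0: 'Unknown', 1: 'Male', 2: 'Female'}

mapped = codes.map(gender_map)        # the unmapped code becomes NaN
filled = mapped.fillna('Unknown')     # one possible way to handle the gap

print(mapped)
print(filled)
```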


In [45]:
d = df.copy(deep=True)
d['age_group'] = d['age'].map(lambda x: 'Adult' if x >= 21 else 'Child')
d

Unnamed: 0,name,gender,age,height,weight,age_group
0,John Doe,1,80,161.0,62.3,Adult
1,Mary Re,2,38,173.5,55.7,Adult
2,Harley Me,0,12,180.5,80.0,Child


In [46]:
def get_age_group(age, threshold):
    if age >= int(threshold):
        age_group = 'Adult'
    else:
        age_group = 'Child'
    return age_group
d = df.copy(deep=True)
# keyword argument
d['age_group'] = d['age'].apply(get_age_group, threshold = 21)
d

Unnamed: 0,name,gender,age,height,weight,age_group
0,John Doe,1,80,161.0,62.3,Adult
1,Mary Re,2,38,173.5,55.7,Adult
2,Harley Me,0,12,180.5,80.0,Child


In [47]:
d = df.copy(deep=True)
# positional argument, passed through args
d['age_group'] = d['age'].apply(get_age_group,args=(21,))
d

Unnamed: 0,name,gender,age,height,weight,age_group
0,John Doe,1,80,161.0,62.3,Adult
1,Mary Re,2,38,173.5,55.7,Adult
2,Harley Me,0,12,180.5,80.0,Child


In [48]:
def get_age_group(age, lower_threshold, upper_threshold):
    if age >= int(upper_threshold):
        age_group = 'Senior'
    elif age <= int(lower_threshold):
        age_group = 'Child'
    else:
        age_group = 'Adult'
    return age_group

d = df.copy(deep=True)
d['age_group'] = d['age'].apply(get_age_group, args=(20, 65))
d

Unnamed: 0,name,gender,age,height,weight,age_group
0,John Doe,1,80,161.0,62.3,Senior
1,Mary Re,2,38,173.5,55.7,Adult
2,Harley Me,0,12,180.5,80.0,Child


In [49]:
def get_last_name(x):
    return pd.Series(x.split(' ')[-1]) # returning a Series makes apply build a DataFrame
# type(df['name'].apply(get_last_name)) is pandas.core.frame.DataFrame
df['name'].apply(get_last_name)

Unnamed: 0,0
0,Doe
1,Re
2,Me


In [50]:
import numpy as np
df[['height', 'weight']].apply(np.round, axis = 0)

Unnamed: 0,height,weight
0,161.0,62.0
1,174.0,56.0
2,180.0,80.0
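
The cell above hands whole columns to `np.round`. For comparison, a minimal sketch of the elementwise counterpart listed at the top of these notes, `DataFrame.applymap`. As far as I know, pandas 2.1 renamed it to `DataFrame.map`, so the sketch picks whichever method exists; treat that version detail as an assumption to check against your installed pandas.

```python
import pandas as pd

df = pd.DataFrame({'height': [161.0, 173.5, 180.5],
                   'weight': [62.3, 55.7, 80.0]})

# use DataFrame.map where available, otherwise fall back to applymap
elementwise = df.map if hasattr(df, 'map') else df.applymap

# the callable receives one scalar at a time, not a column or a row
labels = elementwise(lambda value: f'{value:.1f}')

print(labels)
```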


In [51]:
def find_average_weight(df):
    return df['weight'].mean()
type(df.pipe(find_average_weight))

float

In [52]:
def report_average_weight(df):
    avg_weight = df['weight'].mean()
    return f'The average weight is {avg_weight}'
df.pipe(report_average_weight)

'The average weight is 66.0'

In [56]:
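import pandas as pd

tst = []  # module-level list that records side effects

def fn(me):
    # append two markers as a side effect, then return the input unchanged
    tst.append('A')
    tst.append('B')
    return me

mepps = [["Mark", 1, 2], ["Craig", 2, 3]]
me1 = pd.DataFrame(mepps, columns=["Name", "Value1", "Value2"])
print(fn(me1))
print(tst)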

    Name  Value1  Value2
0   Mark       1       2
1  Craig       2       3
['A', 'B']
