# Map( ), Apply( ), Applymap( ) With the Lambda Function

from Evelyn Li

References: 
    
https://medium.com/@evelynli_30748/map-apply-applymap-with-the-lambda-function-5e83028be759

In [1]:
import numpy as np
import pandas as pd

In [2]:
multi = lambda x, y: (x * y)

In [3]:
multi(5, 6)

30

# Map () 

The map() method used here belongs to the pandas Series: it lets you apply different kinds of operations to the items in a Series. (This notebook was run on a pandas version where DataFrames have no map() method; pandas 2.1 added one.)
- When you call map(function) on a Series, it passes each element to the function and returns the transformed Series.
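
As a minimal sketch of the behaviour described above, the snippet below calls map() on two small Series that are made up for illustration (they are not part of the iris data). It also shows that map() accepts a dict as a lookup table, in which case values without a matching key become NaN.

```python
import pandas as pd

# Toy Series, invented for this illustration
lengths_cm = pd.Series([5.1, 4.9, 4.7], name="length_cm")

# map() with a function: the lambda is called once per element
lengths_mm = lengths_cm.map(lambda cm: cm * 10)

# map() with a dict: each value is looked up as a key;
# values that are missing from the dict become NaN
codes = pd.Series(["a", "b", "z"])
labels = codes.map({"a": "alpha", "b": "beta"})

print(lengths_mm)
print(labels)
```

The result keeps the index and the name of the original Series, which makes it easy to assign back into a data frame column.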

In [4]:
from sklearn.datasets import load_iris

In [5]:
data = load_iris()

In [6]:
features = pd.DataFrame(data= data['data'], columns= data['feature_names'])

In [7]:
features.head()

Unnamed: 0,sepal length (cm),sepal width (cm),petal length (cm),petal width (cm)
0,5.1,3.5,1.4,0.2
1,4.9,3.0,1.4,0.2
2,4.7,3.2,1.3,0.2
3,4.6,3.1,1.5,0.2
4,5.0,3.6,1.4,0.2


In [8]:
# Example: say we want to convert the sepal length from cm to mm.
# We can do this by passing a function called cm_to_mm to map().
def cm_to_mm(cm):
    mm = cm * 10
    
    return mm

In [9]:
features['sepal length (cm)'].map(cm_to_mm).head() #this way, we have used this function on this pandas series 

0    51.0
1    49.0
2    47.0
3    46.0
4    50.0
Name: sepal length (cm), dtype: float64

In [10]:
#what if we want to do it on two columns? 
features[['sepal length (cm)',"sepal width (cm)"]].map(cm_to_mm).head()

AttributeError: 'DataFrame' object has no attribute 'map'

<br><p style="font-family: Arial; font-size:3.75em;color:red; font-style:bold">
As you can see, this does not work</p><br><br>

- This is when we need the apply() function.
- The apply() method works on both pandas Series and DataFrames.
- On a Series, apply() behaves like map(): it passes each element to the function and returns the transformed Series. On a DataFrame, it passes each column to the function as a whole Series (or each row, with axis=1) and combines the results. cm_to_mm works on a DataFrame because multiplying a Series by 10 multiplies every element in it.
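
To make the column-by-column behaviour concrete, here is a small sketch on a toy data frame (the frame and the helper name `spread` are made up for illustration). It records the type of the object that apply() hands to the function, and shows that the function may return either a single value per column or a whole transformed column.

```python
import pandas as pd

# Toy data frame, invented for this illustration
df = pd.DataFrame({"a": [1.0, 2.0, 3.0], "b": [10.0, 20.0, 40.0]})

seen_types = []

def spread(col):
    # Record what apply() passes in, then reduce the column to one number
    seen_types.append(type(col).__name__)
    return col.max() - col.min()

# Default axis=0: spread() is called with each column as a Series
col_spread = df.apply(spread)

# Returning a Series of the same length gives back a data frame
scaled = df.apply(lambda col: col * 10)

print(set(seen_types))
print(col_spread)
print(scaled)
```

When the function reduces each column to a single value, the result is a Series indexed by column name; when it returns a column of the same length, the result is a data frame of the original shape.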

In [11]:
#what if we want to do it on two columns? 
features[['sepal length (cm)',"sepal width (cm)"]].apply(cm_to_mm).head()

#and it works!!!!

Unnamed: 0,sepal length (cm),sepal width (cm)
0,51.0,35.0
1,49.0,30.0
2,47.0,32.0
3,46.0,31.0
4,50.0,36.0


<br><p style="font-family: Arial; font-size:3.75em;color:red; font-style:bold">
Change the entire data frame?</p><br><br>

- with apply()
- with applymap()

In [12]:
# with apply()
features[['sepal length (cm)','sepal width (cm)',
          'petal length (cm)','petal width (cm)']].apply(cm_to_mm).head()

Unnamed: 0,sepal length (cm),sepal width (cm),petal length (cm),petal width (cm)
0,51.0,35.0,14.0,2.0
1,49.0,30.0,14.0,2.0
2,47.0,32.0,13.0,2.0
3,46.0,31.0,15.0,2.0
4,50.0,36.0,14.0,2.0


# Applymap ()

The applymap() method works on an entire pandas DataFrame, applying the input function to every element individually. In other words, applymap() is apply() + map()! Note that pandas 2.1 deprecated applymap() in favor of DataFrame.map(), which does the same thing.
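
The sketch below illustrates the element-by-element behaviour on a toy data frame (invented for this example). Because the method name depends on the pandas version, it picks DataFrame.map() when it exists and falls back to applymap() otherwise; the `elementwise` variable is just a local name for whichever method is available.

```python
import pandas as pd

# Toy data frame, invented for this illustration
df = pd.DataFrame({"a": [1.5, 2.5], "b": [3.0, 4.0]})

# Use DataFrame.map on newer pandas, applymap on older versions
elementwise = df.map if hasattr(pd.DataFrame, "map") else df.applymap

# The function receives one scalar at a time, so scalar-only
# operations such as string formatting work here
formatted = elementwise(lambda v: f"{v:.1f} cm")

print(formatted)
```

A function like this one would not work with apply(), because apply() would pass a whole column to the f-string instead of a single number.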

In [13]:
# So this is what we do: we use applymap()
features.applymap(cm_to_mm).head()

# Save the converted values in a new data frame and rename its columns.
# (No .head() on the assignment, so all rows are kept, not just the first five.)

iris = features.applymap(cm_to_mm)
iris.columns = ['sepal length (mm)', 'sepal width (mm)', 
                'petal length (mm)', 'petal width (mm)'] 
iris.head()

Unnamed: 0,sepal length (mm),sepal width (mm),petal length (mm),petal width (mm)
0,51.0,35.0,14.0,2.0
1,49.0,30.0,14.0,2.0
2,47.0,32.0,13.0,2.0
3,46.0,31.0,15.0,2.0
4,50.0,36.0,14.0,2.0


# Combining Lambda Function With Apply( )

In [14]:
# Instead of the cm_to_mm function, I will just use a lambda function to transform the dataset
features[['sepal length (cm)']].apply(lambda x: x*10).head()

Unnamed: 0,sepal length (cm)
0,51.0
1,49.0
2,47.0
3,46.0
4,50.0


In [15]:
# Now try it with the entire dataframe 
features.applymap(lambda x: x*10).head()

Unnamed: 0,sepal length (cm),sepal width (cm),petal length (cm),petal width (cm)
0,51.0,35.0,14.0,2.0
1,49.0,30.0,14.0,2.0
2,47.0,32.0,13.0,2.0
3,46.0,31.0,15.0,2.0
4,50.0,36.0,14.0,2.0


In [18]:
# create a list of all the column names
col_name = ['sepal length (cm)', 'sepal width (cm)', 'petal length (cm)', 'petal width (cm)']

# create a new column called interaction using a lambda and the apply function
features['interaction'] = features.apply(lambda x : x[col_name[0]]*x[col_name[1]]*\
                                         x[col_name[2]]* x[col_name[3]], axis = 1)

features.head()

Unnamed: 0,sepal length (cm),sepal width (cm),petal length (cm),petal width (cm),interaction
0,5.1,3.5,1.4,0.2,4.998
1,4.9,3.0,1.4,0.2,4.116
2,4.7,3.2,1.3,0.2,3.9104
3,4.6,3.1,1.5,0.2,4.278
4,5.0,3.6,1.4,0.2,5.04


- The apply function lets us pass a function along an axis of our DataFrame; with axis = 1, the function is called once per row.

Lambda functions are disposable, and often just use the variable x as a placeholder for whatever they're operating on:

- Because we're operating on rows, x becomes a row (a Series indexed by column name) each time our function is applied.
- We can pick the columns we want by name; the values multiplied together all come from that same row.
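
The row-wise pattern above can be sketched on a toy data frame (the column names `w`, `h` and `d` are made up for illustration). The second computation is an alternative, not what the notebook does: it uses DataFrame.prod(axis=1) to get the same row products without calling a Python function for every row.

```python
import pandas as pd

# Toy data frame, invented for this illustration
df = pd.DataFrame({"w": [2.0, 3.0], "h": [4.0, 5.0], "d": [0.5, 2.0]})
cols = ["w", "h", "d"]

# axis=1: the lambda is called once per row, and `row` is a
# Series whose index holds the column names
by_apply = df.apply(lambda row: row["w"] * row["h"] * row["d"], axis=1)

# Alternative: multiply across the selected columns in one call
by_prod = df[cols].prod(axis=1)

print(by_apply)
print(by_prod)
```

On a data frame as small as iris the difference hardly matters, but the built-in reduction tends to be much faster on large frames.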

In [25]:
# let's say we have a dataframe with names 
name = pd.DataFrame(data = ['Braund, Mr. Owen Harris',
 'Cumings, Mrs. John Bradley (Florence Briggs Thayer)',
 'Heikkinen, Miss. Laina',
 'Futrelle, Mrs. Jacques Heath (Lily May Peel)',
 'Allen, Mr. William Henry',
 'Moran, Mr. James',
 'McCarthy, Mr. Timothy J',
 'Palsson, Master. Gosta Leonard',
 'Johnson, Mrs. Oscar W (Elisabeth Vilhelmina Berg)',
 'Nasser, Mrs. Nicholas (Adele Achem)'], columns = ['Name'] )

#Take a look at the Data 
name.head()

Unnamed: 0,Name
0,"Braund, Mr. Owen Harris"
1,"Cumings, Mrs. John Bradley (Florence Briggs Th..."
2,"Heikkinen, Miss. Laina"
3,"Futrelle, Mrs. Jacques Heath (Lily May Peel)"
4,"Allen, Mr. William Henry"


In [29]:
#Let's say I want to extract the title from each name. We can do this: 

name['Name'].apply(lambda x: x.split(" ")[1].replace(".", ""))

#Save this output to "Title". Each name has the form "Surname, Title. Given names",
#so the first word is the last name and the third word is the first name.
name['Title'] = name['Name'].apply(lambda x: x.split(" ")[1].replace(".", ""))
name['LastName'] = name['Name'].apply(lambda x: x.split(" ")[0].replace(",", ""))
name['FirstName'] = name['Name'].apply(lambda x: x.split(" ")[2])

#take a look at our dataframe 
name = name[['Title', 'FirstName', 'LastName']]
name

Unnamed: 0,Title,FirstName,LastName
0,Mr,Owen,Braund
1,Mrs,John,Cumings
2,Miss,Laina,Heikkinen
3,Mrs,Jacques,Futrelle
4,Mr,William,Allen
5,Mr,James,Moran
6,Mr,Timothy,McCarthy
7,Master,Gosta,Palsson
8,Mrs,Oscar,Johnson
9,Mrs,Nicholas,Nasser
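
Splitting on single spaces relies on every name having exactly one word before the comma. As a hedged alternative, the sketch below uses Series.str.extract with a regular expression that anchors on the comma and the period instead. The pattern and the column names `Surname`, `Title` and `GivenName` are assumptions chosen for this illustration, and the second name in the sample is made up to show a surname that contains spaces.

```python
import pandas as pd

# The first name comes from the data above; the second is a
# made-up example with a multi-word surname
names = pd.Series(["Braund, Mr. Owen Harris",
                   "van der Berg, Mr. Jan"])

# Everything before the comma is the surname, the text up to the
# next period is the title, and the following word is the given name
pattern = r"^(?P<Surname>[^,]+),\s*(?P<Title>[^.]+)\.\s*(?P<GivenName>\S+)"

# Named groups become the column names of the resulting data frame
parts = names.str.extract(pattern)

print(parts)
```

Names that do not match the pattern produce NaN in every column rather than raising an IndexError, which makes malformed entries easy to spot.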
