# Applying Functions To Series and DataFrames

In [1]:
import pandas as pd

In [2]:
# Sample data: one list of values per column
people = {
    "email": ["CoreyMSchafer@gmail.com", "JaneDoe@gmail.com", "JohnDoe@gmail.com"],
    "first": ["Corey", "Jane", "John"],
    "last": ["Schafer", "Doe", "Doe"],
}

In [3]:
df = pd.DataFrame(people)

In [4]:
df

,email,first,last
0,CoreyMSchafer@gmail.com,Corey,Schafer
1,JaneDoe@gmail.com,Jane,Doe
2,JohnDoe@gmail.com,John,Doe


## The apply method
- **The `apply` method of a Series calls a function on every element of that Series and returns the results as a new Series**

In [5]:
def my_func(text):
    # Return an uppercase copy of a single string
    return text.upper()

In [6]:
# Let's apply the my_func function to every element in the email column.
# Note that df['email'] is a Series.
df['email'] = df['email'].apply(my_func)
df

,email,first,last
0,COREYMSCHAFER@GMAIL.COM,Corey,Schafer
1,JANEDOE@GMAIL.COM,Jane,Doe
2,JOHNDOE@GMAIL.COM,John,Doe


In [7]:
# Applying the built-in len function to every element in the 'first' column
print("The length of string 'hello' is:", len('hello'), '\n\n')
print("The length of every element in 'first' column is:")
df['first'].apply(len)

The length of string 'hello' is: 5 


The length of every element in 'first' column is:


0    5
1    4
2    4
Name: first, dtype: int64

In [8]:
# Applying a lambda function to every element in the email column
df['email'] = df['email'].apply(lambda x: x.lower())
df

,email,first,last
0,coreymschafer@gmail.com,Corey,Schafer
1,janedoe@gmail.com,Jane,Doe
2,johndoe@gmail.com,John,Doe
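
The cells above pass `apply` a named function, a built-in, and a lambda. As a minimal sketch (the names `emails` and `to_upper` are made up for this illustration), any callable that accepts one element works, including an unbound string method such as `str.upper`:

```python
import pandas as pd

# Hypothetical stand-alone Series mirroring the email column
emails = pd.Series(
    ["CoreyMSchafer@gmail.com", "JaneDoe@gmail.com", "JohnDoe@gmail.com"],
    name="email",
)

def to_upper(text):
    """Return an uppercase copy of a single string."""
    return text.upper()

via_function = emails.apply(to_upper)                 # named function
via_lambda = emails.apply(lambda text: text.upper())  # lambda
via_method = emails.apply(str.upper)                  # unbound str method

# All three calls produce the same Series
print(via_function.equals(via_lambda) and via_lambda.equals(via_method))
```

Which form to use is mostly a readability choice: a named function suits logic that is reused or needs a docstring, while a lambda or an existing method keeps one-off transformations short.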


- **Note that `apply` also works on a DataFrame, but there the function receives a whole Series at a time instead of a single element: each column by default, or each row when `axis="columns"` is passed.**

In [9]:
# Getting the value that sorts first in each column.
# pd.Series.min takes a Series (here, one column) and returns its smallest value.
# For a Series of strings, that is the value that comes first in lexicographic order
# (uppercase letters sort before lowercase ones).
df.apply(pd.Series.min)
# By default, DataFrame.apply passes each column to the function, so the call above is the same as:
# df.apply(pd.Series.min, axis="rows")   # documented spellings: axis=0 or axis="index"
# With this axis, the function receives all rows of one column, in other words a whole column.

email    coreymschafer@gmail.com
first                      Corey
last                         Doe
dtype: object
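
To see what the function actually receives on each axis, here is a small sketch. The helper `describe_labels` is invented for this illustration; it reports the index labels of the Series it is handed, and the DataFrame is rebuilt so the snippet runs on its own.

```python
import pandas as pd

df = pd.DataFrame({
    "email": ["coreymschafer@gmail.com", "janedoe@gmail.com", "johndoe@gmail.com"],
    "first": ["Corey", "Jane", "John"],
    "last": ["Schafer", "Doe", "Doe"],
})

def describe_labels(series):
    """Join the index labels of the Series passed in by apply."""
    return ",".join(str(label) for label in series.index)

# Default axis: the function gets one column at a time,
# so the labels it sees are the row labels 0, 1, 2.
per_column = df.apply(describe_labels)

# axis="columns": the function gets one row at a time,
# so the labels it sees are the column names.
per_row = df.apply(describe_labels, axis="columns")

print(per_column)
print(per_row)
```

The result of the first call is indexed by column name and the result of the second by row label, which matches the shape of the outputs in the surrounding cells.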

In [10]:
# Getting the value that sorts first in each row.
# With axis="columns", the function receives all columns of one row, in other words a whole row.
# min() compares strings lexicographically, not by length, and uppercase letters sort
# before lowercase ones, so row 0 yields 'Corey' rather than the lowercase email address.
df.apply(lambda x: x.min(), axis="columns")

0    Corey
1      Doe
2      Doe
dtype: object
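
`min` on strings compares them lexicographically, so it does not pick the shortest value. If the shortest string in each row is what you want, one option is Python's built-in `min` with `key=len`. This is a sketch under that assumption, with the DataFrame rebuilt so the snippet runs on its own:

```python
import pandas as pd

df = pd.DataFrame({
    "email": ["coreymschafer@gmail.com", "janedoe@gmail.com", "johndoe@gmail.com"],
    "first": ["Corey", "Jane", "John"],
    "last": ["Schafer", "Doe", "Doe"],
})

# Lexicographic minimum of each row (what Series.min does for strings)
alphabetical = df.apply(lambda row: row.min(), axis="columns")

# Shortest string of each row: iterate the row's values and compare by length
shortest = df.apply(lambda row: min(row, key=len), axis="columns")

# The two ideas differ in general, as this plain-Python pair shows
plain_alphabetical = min(["b", "aaa"])       # 'aaa' sorts before 'b'
plain_shortest = min(["b", "aaa"], key=len)  # 'b' is shorter

print(alphabetical.tolist(), shortest.tolist())
print(plain_alphabetical, plain_shortest)
```

For this particular DataFrame both approaches happen to return the same values, which is why the difference is easy to miss.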

## The map method
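**1. `DataFrame.map` applies a function to every element in a DataFrame** (available since pandas 2.1; earlier versions call this method `applymap`)

**2. `Series.map` substitutes values in a Series according to a mapping such as a dictionary**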

In [11]:
# Getting length of every element in the DataFrame
df.map(len)

,email,first,last
0,23,5,7
1,17,4,3
2,17,4,3


In [12]:
# Converting all the elements of a DataFrame to uppercase
df_UPPER = df.map(str.upper)
df_UPPER

,email,first,last
0,COREYMSCHAFER@GMAIL.COM,COREY,SCHAFER
1,JANEDOE@GMAIL.COM,JANE,DOE
2,JOHNDOE@GMAIL.COM,JOHN,DOE
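
`DataFrame.map` was added in pandas 2.1, and older releases expose the same element-wise behaviour as `DataFrame.applymap`. The helper below is a hypothetical compatibility sketch (the name `elementwise` is not part of pandas) for code that has to run on both:

```python
import pandas as pd

df = pd.DataFrame({
    "email": ["coreymschafer@gmail.com", "janedoe@gmail.com", "johndoe@gmail.com"],
    "first": ["Corey", "Jane", "John"],
    "last": ["Schafer", "Doe", "Doe"],
})

def elementwise(frame, func):
    """Apply func to every cell of frame on old and new pandas versions."""
    # Check the class, not the instance: a column named "map" would
    # otherwise be found by attribute access on the DataFrame.
    if hasattr(pd.DataFrame, "map"):
        return frame.map(func)
    return frame.applymap(func)  # assumed fallback for pandas < 2.1

lengths = elementwise(df, len)
print(lengths)
```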


In [13]:
# Substituting values in a Series
df['first'].map({'Corey': 'Korey'})
# Replaces every 'Corey' in the 'first' column with 'Korey'

0    Korey
1      NaN
2      NaN
Name: first, dtype: object

- **Note that any element with no matching key in the dictionary is replaced by NaN**

In [14]:
# Substituting only the values that match a key; all other values are left unchanged
df['first'].replace({'Corey': 'Korey'})

0    Korey
1     Jane
2     John
Name: first, dtype: object
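
The difference between the two methods can be checked side by side. The sketch below uses a stand-alone Series named `first` (a made-up variable mirroring the column) and also shows one way to keep unmatched values when using `map`, by filling the resulting NaN entries from the original Series:

```python
import pandas as pd

first = pd.Series(["Corey", "Jane", "John"], name="first")
mapping = {"Corey": "Korey"}

# map: values without a key in the mapping become NaN
mapped = first.map(mapping)

# map followed by fillna: fall back to the original value where map gave NaN
filled = first.map(mapping).fillna(first)

# replace: values without a key are left untouched
replaced = first.replace(mapping)

print(mapped.isna().tolist())
print(filled.tolist())
print(replaced.tolist())
```

The `fillna` fallback assumes the mapping never maps a value to NaN on purpose, since such entries would also be filled back in; `replace` does not have that ambiguity.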

In [19]:
# Listing unique values
df['last'].unique()

array(['Schafer', 'Doe'], dtype=object)