# Week-4

### Contents

* Functions
    * `map()`
    * `filter()`
    * `lambda` functions
* Pandas DataFrame functions
    * `pandas.apply()`
    * `pandas.applymap()`


# Functions

Tutorial: https://docs.python.org/3.7/tutorial/controlflow.html#defining-functions

The keyword *def* introduces a function definition.<br/>
It is followed by the function name and a parenthesized list of parameters.<br/>
The statements that form the function body must be indented.

In [105]:
def myFunction1(myInput = 1): # 1 is the default value for the input parameter
    i = 1
    while i <= myInput:
        print("Hello")
        i += 1
    print("End")

## Calling a Function

In [106]:
myFunction1() #no input parameters. default input parameter of 1 is used

Hello
End


In [107]:
myFunction1(myInput = 2) #call function with a custom input parameter value

Hello
Hello
End


In [108]:
myFunction1(1) #we don't need to specify the input parameter by name; passing the value by position works too

Hello
End


In [110]:
def myFunction2(myInput): #no default value is specified
    i = 1
    while i <= myInput:
        print("Hello")
        i += 1
    print("End")

In [111]:
myFunction2() #throws an error. we must specify an input value

TypeError: myFunction2() missing 1 required positional argument: 'myInput'

In [112]:
myFunction2(1)

Hello
End


### Multiple Input Parameters

In [20]:
def myFunction3(a, b=2, c, d=4): #throws an error
    print(a,b,c,d)

SyntaxError: non-default argument follows default argument (<ipython-input-20-41ccd4f12d01>, line 1)

In [26]:
def myFunction4(b=2, d=4, a, c): #throws an error
    print(a,b,c,d)

SyntaxError: non-default argument follows default argument (<ipython-input-26-f34550b78d50>, line 1)

### Remark:
Always put parameters with default values after those without any.<br/>
Otherwise Python could not tell which positional argument belongs to which parameter when the function is called.

In [113]:
def myFunction4(a, c, b=2, d=4):
    print(a,b,c,d)

In [23]:
myFunction4(0,1) #a=0 and c=1 are given values; b and d keep their default values

0 2 1 4


In [27]:
myFunction4(0,1,3) #a=0 b=3 and c=1 are given values

0 3 1 4


In [28]:
myFunction4(0,1,3,5) #a=0 b=3 c=1 d=5 are given values

0 3 1 5


### Remark:
When calling a function, the order does not matter as long as we specify input parameter names.

In [24]:
myFunction4(c=1, a=55)

55 2 1 4


In [25]:
myFunction4(b=2, d=1, c=23, a=5)

5 2 23 1


In [115]:
def myFunction5(a=0, b=0, c=0, d=0): #all parameters have default values
    print(a, b, c, d)

In [116]:
myFunction5(1,2) #python will assign input parameter values from the beginning

1 2 0 0


In [34]:
myFunction5(c=2) # specifying an argument we wish to pass a value to

0 0 2 0


### Remark:

Parameters without a default value are required; parameters with a default value are optional.<br/>
In a call, an argument passed by position is a "positional argument" and an argument passed by name is a "keyword argument".

• In a function definition, all parameters without defaults must come before all parameters with defaults.<br/>
• In a function call, all positional arguments must come before all keyword arguments.<br/>
• Keyword arguments may occur in any order.<br/>
• The function call must supply a value for every parameter that has no default.<br/>
• If the caller supplies more positional arguments than there are required parameters, the extra ones are matched with the optional parameters according to their position.<br/>
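
A minimal sketch of these rules in action. The function `describe` is a hypothetical helper that mirrors the signature of `myFunction4` but returns its values instead of printing them, so the results can be inspected.

```python
def describe(a, c, b=2, d=4):
    """Hypothetical helper: a and c have no defaults, b and d do."""
    return (a, b, c, d)

# Two positional arguments fill the parameters without defaults, a and c.
two_args = describe(0, 1)        # (0, 2, 1, 4)

# Extra positional arguments fill the parameters with defaults in order: b, then d.
three_args = describe(0, 1, 3)   # (0, 3, 1, 4)

# Arguments passed by name can be given in any order.
by_name = describe(c=1, a=55)    # (55, 2, 1, 4)

# Leaving out a parameter that has no default raises a TypeError.
try:
    describe(0)
except TypeError as error:
    missing_message = str(error)

print(two_args, three_args, by_name)
print(missing_message)
```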

### First-class Functions

In Python, functions are so-called “first-class citizens”. This means that they can be
used like any other value: <br/>
They can be assigned to variables, passed to and returned from other functions, and stored in collections (e.g. a list).

In [50]:
def myFunction6(x):
    return x ** 2 #returns square of input value

myVariable = myFunction6 (12) #function value is assigned to a variable
print(myVariable)

144


In [72]:
myNewFunction = myFunction6 #function itself is assigned under a different name
myNewFunction(2)

4
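
The cells above show a function being assigned to a new name. Passing a function to another function and returning one from a function can be sketched as follows; the names `square`, `apply_twice` and `make_power` are made up for this illustration.

```python
def square(x):
    return x ** 2

def apply_twice(func, value):
    # A function received as an argument is called like any other function.
    return func(func(value))

def make_power(exponent):
    # A new inner function is created and returned on every call to make_power.
    def power(x):
        return x ** exponent
    return power

cube = make_power(3)            # cube is a function that raises its input to the 3rd power
print(apply_twice(square, 3))   # (3 ** 2) ** 2 = 81
print(cube(2))                  # 2 ** 3 = 8
```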

### Example:
Let's store functions in a list and call them with an input

In [85]:
def backwards(string):# a function to print any string backwards
    print(string[::-1])

def half(string): #a function to print half of the input string
    print(string[:len(string)//2])


#create a list of functions. only use the name of the functions 
#the first entry in the list is the built-in function "print"
print_functions = [print, backwards, half] 

for func in print_functions: #calling multiple different functions with a single input
    func("Hello World!")

Hello World!
!dlroW olleH
Hello 


### Remark:
Storing functions in a list is useful when we want to call several different functions with the same input or set of inputs.<br/>
For the opposite case, passing many different inputs to the same function, we can use the following Python feature.

## Map
A common operation in Python is to call the same function for every element in a collection, such as a list or a tuple and then create a new list with the results. 

Let's first accomplish this with a for loop:

In [117]:
def myFunction6(x):
    return x ** 2 #returns square of input value

myInputsList = [1, 2, 3]
my_newList = []

for x in myInputsList:
    my_newList.append(myFunction6(x)) #call the function with the elements of the given list and create a new list
print(my_newList)

[1, 4, 9]


Now let's accomplish this with map

In [128]:
my_newList = list(map(myFunction6, myInputsList)) #only a single line does the same job!
print(my_newList)

[1, 4, 9]


In [120]:
pow(2,3)

8

The map function accepts any iterable (e.g. list, set, tuple etc.). It returns a map object, an iterator that can be converted into a list.

In [100]:
a = [1.1, 1.2]
list(map(round, a)) #round each element of a list and return a list

[1, 1]
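
Two further properties of `map` may be worth seeing, shown here as a small sketch with made-up variable names: the map object can be consumed only once, and `map` accepts several iterables when the function takes several arguments (this is where the two-argument `pow` shown earlier fits in).

```python
squares = map(lambda x: x ** 2, (1, 2, 3))  # a tuple works just like a list
first_pass = list(squares)    # [1, 4, 9]
second_pass = list(squares)   # [] because the map object is already used up

# With two iterables, map passes one element from each to the function,
# so this computes pow(1, 3), pow(2, 3) and pow(3, 3).
powers = list(map(pow, [1, 2, 3], [3, 3, 3]))

print(first_pass, second_pass, powers)
```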

## Filter
filter and map are similar to each other. In filter, the input function should return either True or False. An element is included in the result only if the function returns True for it.<br/>

In [154]:
def isOdd(x):
    if x %2 == 1:
        return True
    else:
        return False

myInputsList = [1, 2, 3, 42, 568, 0, 99, 45]
my_newList = []

for x in myInputsList:
    if isOdd(x):
        my_newList.append(x) #keep the element only if the function returns True
print(my_newList)

[1, 3, 99, 45]


In [155]:
my_newList = list(filter(isOdd, myInputsList)) #only a single line does the same job!
print(my_newList) 

[1, 3, 99, 45]
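
For comparison, the same selection can be written as a list comprehension, and `filter` also accepts `None` in place of a function, in which case it keeps the truthy elements. The sketch below uses its own variable names and is only one way to write this.

```python
numbers = [1, 2, 3, 42, 568, 0, 99, 45]

# filter with a test function and the equivalent list comprehension
odd_with_filter = list(filter(lambda x: x % 2 == 1, numbers))
odd_with_comprehension = [x for x in numbers if x % 2 == 1]

# With None as the function, elements that are falsy (here: 0) are dropped.
truthy = list(filter(None, numbers))

print(odd_with_filter)
print(odd_with_comprehension)
print(truthy)
```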


In [157]:
import numpy as np
import itertools as it

my_list = [1, float("nan") , 3] #define a nan value in a list. let's filter out nan from this list.
my_newList = list(it.filterfalse(np.isnan, my_list)) #itertools.filterfalse() function does the opposite. 
my_newList

[1, 3]

In [1]:
float("nan")

nan
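
A dedicated test such as `np.isnan` is needed because nan does not compare equal to anything, not even to itself. The sketch below illustrates this with the standard-library `math.isnan`, used here as a substitute for `np.isnan` so that the example needs no third-party import.

```python
import math

nan_value = float("nan")
is_equal_to_itself = (nan_value == nan_value)  # False: nan is never equal to nan

values = [1, float("nan"), 3]
# An equality test cannot find nan, so we test with math.isnan instead.
cleaned = [v for v in values if not math.isnan(v)]

print(is_equal_to_itself)
print(cleaned)
```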

## Lambda Functions
Lambda functions are anonymous functions that are created while the program is running and are not assigned to a name like normal functions. They can be used very similarly to normal functions if assigned to a variable:

In [73]:
def myPowerFunction(x): #we define a python function in a regular way
    return x ** 2
myPowerFunction(4)

16

In [78]:
myNewPowerFunction = lambda x: x ** 2 #we can define a python function this way, too. Just use the 'lambda' keyword
myNewPowerFunction(4)

16

In [130]:
myInputsList = [1, 2, 3]
list(map(lambda x: pow(x,2), myInputsList)) #use a built-in function

[1, 4, 9]

Lambda functions are restricted to a single expression. A common use is to pass a small function as an argument.

In [92]:
data = [(5, 2, 4), (6, 3, 2), (4, 4, 4), (3, 3, 3), (5, 3, 10)]
sorted(data, key=lambda x: x[2]) #sort according to the last element in each tuple.
#Without a key, the tuples themselves are compared, starting with the first element.

[(6, 3, 2), (3, 3, 3), (5, 2, 4), (4, 4, 4), (5, 3, 10)]
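
To see what the `key` changes, it may help to compare against sorting without one, where Python compares the tuples element by element from left to right. The sketch also shows a key that returns a tuple, which sorts by the last element and breaks ties with the first; the variable names are chosen for this illustration.

```python
data = [(5, 2, 4), (6, 3, 2), (4, 4, 4), (3, 3, 3), (5, 3, 10)]

# No key: tuples are compared by first element, then second, then third.
default_order = sorted(data)

# Key returning a tuple: sort by last element, then by first element on ties.
by_last_then_first = sorted(data, key=lambda x: (x[2], x[0]))

print(default_order)
print(by_last_then_first)
```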

In [82]:
pairs = [(1, 'one'), (2, 'two'), (4, 'four'), (6, 'six')]
pairs.sort(key=lambda pair: pair[1]) #sort it according to second element in each tuple
pairs

[(4, 'four'), (1, 'one'), (6, 'six'), (2, 'two')]

In [2]:
pairs = [(1, 'ox'), (2, 'tweet'), (4, 'fox'), (6, 'fur')]
pairs.sort(key=lambda pair: pair[1]) #sort it according to second element in each tuple
pairs

[(4, 'fox'), (6, 'fur'), (1, 'ox'), (2, 'tweet')]

In [84]:
myInputsList = [1, 2, 3, 42, 568, 0, 99, 45]
myNewList = list(filter(lambda x: x % 2 == 1, myInputsList)) #check if an element is odd and keep only the odd ones
print(myNewList)

[1, 3, 99, 45]


We can re-create the list of functions that we created before by using lambda expressions.

In [90]:
print_functions = [
    print, #built-in function
    lambda inputString: print(inputString[::-1]), #create a single line function definition
    lambda inputString: print(inputString[:len(inputString)//2]) #create a single line function definition
] 

for func in print_functions: #calling multiple different functions with a single input
    func("Hello World!")

Hello World!
!dlroW olleH
Hello 


## Pandas DataFrames Functions
Docs: https://pandas.pydata.org/pandas-docs/stable/reference/frame.html

## Pandas.apply()
Apply a function to each row or column of a DataFrame.

DataFrame.apply(self, func, axis=0, raw=False, result_type=None, args=(), **kwds)<br/>
*func*: Function to be applied to each column or row.<br/>
*axis*: Axis along which the function is applied. The default value is 0.<br/>
If axis = 0, the function is applied to each column.<br/>
If axis = 1, the function is applied to each row.<br/>
*args*: Tuple of positional arguments to be passed to the function in addition to the row or column.

More on the help file:

In [14]:
import pandas as pd
help(pd.DataFrame.apply)

Help on function apply in module pandas.core.frame:

apply(self, func, axis=0, raw=False, result_type=None, args=(), **kwds)
    Apply a function along an axis of the DataFrame.
    
    Objects passed to the function are Series objects whose index is
    either the DataFrame's index (``axis=0``) or the DataFrame's columns
    (``axis=1``). By default (``result_type=None``), the final return type
    is inferred from the return type of the applied function. Otherwise,
    it depends on the `result_type` argument.
    
    Parameters
    ----------
    func : function
        Function to apply to each column or row.
    axis : {0 or 'index', 1 or 'columns'}, default 0
        Axis along which the function is applied:
    
        * 0 or 'index': apply function to each column.
        * 1 or 'columns': apply function to each row.
    
    raw : bool, default False
        Determines if row or column is passed as a Series or ndarray object:
    
        * ``False`` : passes each row or co

Let's create a simple dataframe

In [4]:
import pandas as pd
matrix = [[1,2],
          [3,4],
          [5,6]
         ] #3x2 matrix (list of lists)
 
df = pd.DataFrame(matrix, columns=['FirstColumn','SecondColumn'])
df

|   | FirstColumn | SecondColumn |
|---|---|---|
| 0 | 1 | 2 |
| 1 | 3 | 4 |
| 2 | 5 | 6 |


Let's create our own function and apply it to each row or each column in the dataframe.

In [22]:
def get_sum(args = ()): #the input is any iterable (e.g. list, tuple etc.)
    total = 0 #not named "sum", which would hide the built-in sum() inside the function
    for element in args:
        total += element
    return total
get_sum([1,2,4]) #returns the sum of elements

7

Let's apply the function to each column and row of the dataframe

In [27]:
ps = df.apply(get_sum) #apply our sum function to each column (i.e return sum of each column)
ps #returns a pandas series

FirstColumn      9
SecondColumn    12
dtype: int64

In [28]:
ps = df.apply(get_sum, axis = 1) #apply our sum function to each row (i.e return sum of each row)
ps #returns a pandas series

0     3
1     7
2    11
dtype: int64

We could achieve the same results with lambda functions

In [41]:
ps = df.apply(lambda x: sum(x), axis = 1) # returns a pandas series
ps

0     3
1     7
2    11
dtype: int64
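
As the help text above notes, with `axis=1` each row is passed to the function as a Series indexed by the column names. A small sketch of what that allows, rebuilding the same 3x2 dataframe so the block runs on its own; the name `products` is made up here.

```python
import pandas as pd

df = pd.DataFrame([[1, 2], [3, 4], [5, 6]],
                  columns=["FirstColumn", "SecondColumn"])

# Each row arrives as a Series, so individual entries can be read by column name.
products = df.apply(lambda row: row["FirstColumn"] * row["SecondColumn"], axis=1)

print(products.tolist())
```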

We could achieve the same by passing a built-in Python function or a function from a library

In [42]:
ps = df.apply(sum, axis = 1) #just use Python built-in function
ps

0     3
1     7
2    11
dtype: int64

In [8]:
import numpy as np

ps = df.apply(np.sum) #use sum function from numpy library
ps

FirstColumn      9
SecondColumn    12
dtype: int64

In [9]:
import numpy as np

ps = df.apply(np.sum, axis=1) #use sum function from numpy library
ps

0     3
1     7
2    11
dtype: int64
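
The `args` parameter described earlier has not been used so far. A minimal sketch of how it might be used, assuming a hypothetical function `scale_and_sum` that takes two extra arguments besides the column or row:

```python
import pandas as pd

df = pd.DataFrame([[1, 2], [3, 4], [5, 6]],
                  columns=["FirstColumn", "SecondColumn"])

def scale_and_sum(values, factor, offset):
    # values is the column (or row); factor and offset come from args
    return sum(values) * factor + offset

# args supplies the extra positional arguments after the column/row itself.
per_column = df.apply(scale_and_sum, args=(10, 1))        # column sums are 9 and 12
per_row = df.apply(scale_and_sum, axis=1, args=(10, 1))   # row sums are 3, 7 and 11

print(per_column.tolist())
print(per_row.tolist())
```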

If we want to modify each entry in the dataframe, we can use another pandas method, ".applymap()"

## Pandas.applymap()
This function applies a function that accepts and returns a scalar to every element of a dataframe. It returns a transformed dataframe.

DataFrame.applymap(self, func)<br/>
*func*: Python function, returns a single value from a single value.

More on the help file.

In [57]:
help(df.applymap)

Help on method applymap in module pandas.core.frame:

applymap(func) -> 'DataFrame' method of pandas.core.frame.DataFrame instance
    Apply a function to a Dataframe elementwise.
    
    This method applies a function that accepts and returns a scalar
    to every element of a DataFrame.
    
    Parameters
    ----------
    func : callable
        Python function, returns a single value from a single value.
    
    Returns
    -------
    DataFrame
        Transformed DataFrame.
    
    See Also
    --------
    DataFrame.apply : Apply a function along input axis of DataFrame.
    
    Notes
    -----
    In the current implementation applymap calls `func` twice on the
    first column/row to decide whether it can take a fast or slow
    code path. This can lead to unexpected behavior if `func` has
    side-effects, as they will take effect twice for the first
    column/row.
    
    Examples
    --------
    >>> df = pd.DataFrame([[1, 2.12], [3.356, 4.567]])
    >>> df
        

In [54]:
new_df = df.applymap(lambda x: x ** 2) # returns a new transformed dataframe with the square of each entry
new_df

|   | FirstColumn | SecondColumn |
|---|---|---|
| 0 | 1 | 4 |
| 1 | 9 | 16 |
| 2 | 25 | 36 |


In [10]:
df ** 2 # this would also return the same. but you'll see the necessity of applymap in the following examples

|   | FirstColumn | SecondColumn |
|---|---|---|
| 0 | 1 | 4 |
| 1 | 9 | 16 |
| 2 | 25 | 36 |


In [11]:
df

|   | FirstColumn | SecondColumn |
|---|---|---|
| 0 | 1 | 2 |
| 1 | 3 | 4 |
| 2 | 5 | 6 |
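
Displaying `df` again shows that `applymap` returned a new dataframe and left the original unchanged. The sketch below checks this explicitly. It also accounts for a version difference: since pandas 2.1, `DataFrame.applymap` is deprecated in favor of `DataFrame.map`, so the block picks whichever method the installed version offers. The name `elementwise` is made up for this example.

```python
import pandas as pd

df = pd.DataFrame([[1, 2], [3, 4], [5, 6]],
                  columns=["FirstColumn", "SecondColumn"])

# Use DataFrame.map where it exists (pandas 2.1+), otherwise fall back to applymap.
elementwise = df.map if hasattr(pd.DataFrame, "map") else df.applymap

squared = elementwise(lambda x: x ** 2)

print(squared.values.tolist())  # the transformed copy
print(df.values.tolist())       # the original entries are untouched
```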


Let's take a look at nba dataframe. 

In [12]:
nba = pd.read_csv("https://media.geeksforgeeks.org/wp-content/uploads/nba.csv") 
nba_new = nba[:7] #create a new dataframe with 7 rows of nba
nba_new

|   | Name | Team | Number | Position | Age | Height | Weight | College | Salary |
|---|---|---|---|---|---|---|---|---|---|
| 0 | Avery Bradley | Boston Celtics | 0.0 | PG | 25.0 | 6-2 | 180.0 | Texas | 7730337.0 |
| 1 | Jae Crowder | Boston Celtics | 99.0 | SF | 25.0 | 6-6 | 235.0 | Marquette | 6796117.0 |
| 2 | John Holland | Boston Celtics | 30.0 | SG | 27.0 | 6-5 | 205.0 | Boston University | NaN |
| 3 | R.J. Hunter | Boston Celtics | 28.0 | SG | 22.0 | 6-5 | 185.0 | Georgia State | 1148640.0 |
| 4 | Jonas Jerebko | Boston Celtics | 8.0 | PF | 29.0 | 6-10 | 231.0 | NaN | 5000000.0 |
| 5 | Amir Johnson | Boston Celtics | 90.0 | PF | 29.0 | 6-9 | 240.0 | NaN | 12000000.0 |
| 6 | Jordan Mickey | Boston Celtics | 55.0 | PF | 21.0 | 6-8 | 235.0 | LSU | 1170960.0 |


In [13]:
nba_new.applymap(lambda x: len(str(x))) #convert each entry into string and return its length 

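|   | Name | Team | Number | Position | Age | Height | Weight | College | Salary |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 13 | 14 | 3 | 2 | 4 | 3 | 5 | 5 | 9 |
| 1 | 11 | 14 | 4 | 2 | 4 | 3 | 5 | 9 | 9 |
| 2 | 12 | 14 | 4 | 2 | 4 | 3 | 5 | 17 | 3 |
| 3 | 11 | 14 | 4 | 2 | 4 | 3 | 5 | 13 | 9 |
| 4 | 13 | 14 | 3 | 2 | 4 | 4 | 5 | 3 | 9 |
| 5 | 12 | 14 | 4 | 2 | 4 | 3 | 5 | 3 | 10 |
| 6 | 13 | 14 | 4 | 2 | 4 | 3 | 5 | 3 | 9 |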


What if you desire to apply a function to each entry of a specific column of a dataframe?

In [15]:
nba_new[['Name']].applymap(lambda x: x.replace(" ", "")) #remove white space from a string

|   | Name |
|---|---|
| 0 | AveryBradley |
| 1 | JaeCrowder |
| 2 | JohnHolland |
| 3 | R.J.Hunter |
| 4 | JonasJerebko |
| 5 | AmirJohnson |
| 6 | JordanMickey |


In [16]:
nba_new[['Age']].applymap(lambda x:round(x)) #round each entry in Age column

|   | Age |
|---|---|
| 0 | 25 |
| 1 | 25 |
| 2 | 27 |
| 3 | 22 |
| 4 | 29 |
| 5 | 29 |
| 6 | 21 |
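
The cells above select the column with double brackets, which keeps it a one-column dataframe so that `applymap` can be used. An alternative sketch is to select the column as a Series with single brackets and use `Series.map`. The small `players` dataframe below is a stand-in built from two rows of the table above, so the block runs without downloading the CSV.

```python
import pandas as pd

# Stand-in for nba_new, using two rows from the table above
players = pd.DataFrame({"Name": ["Avery Bradley", "Jae Crowder"],
                        "Age": [25.0, 25.0]})

# Single brackets give a Series; Series.map applies the function to each entry.
names = players["Name"].map(lambda x: x.replace(" ", ""))

# For rounding, the vectorized Series.round avoids calling a Python function per entry.
ages = players["Age"].round().astype(int)

print(names.tolist())
print(ages.tolist())
```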


In [17]:
nba_new

|   | Name | Team | Number | Position | Age | Height | Weight | College | Salary |
|---|---|---|---|---|---|---|---|---|---|
| 0 | Avery Bradley | Boston Celtics | 0.0 | PG | 25.0 | 6-2 | 180.0 | Texas | 7730337.0 |
| 1 | Jae Crowder | Boston Celtics | 99.0 | SF | 25.0 | 6-6 | 235.0 | Marquette | 6796117.0 |
| 2 | John Holland | Boston Celtics | 30.0 | SG | 27.0 | 6-5 | 205.0 | Boston University | NaN |
| 3 | R.J. Hunter | Boston Celtics | 28.0 | SG | 22.0 | 6-5 | 185.0 | Georgia State | 1148640.0 |
| 4 | Jonas Jerebko | Boston Celtics | 8.0 | PF | 29.0 | 6-10 | 231.0 | NaN | 5000000.0 |
| 5 | Amir Johnson | Boston Celtics | 90.0 | PF | 29.0 | 6-9 | 240.0 | NaN | 12000000.0 |
| 6 | Jordan Mickey | Boston Celtics | 55.0 | PF | 21.0 | 6-8 | 235.0 | LSU | 1170960.0 |
