# ![](https://ga-dash.s3.amazonaws.com/production/assets/logo-9f88ae6c9c3871690e33280fcf557f33.png) Flash lesson: lambda functions
Week 3 | Lesson 4.1



### LEARNING OBJECTIVES
*After this lesson, you will be able to:*
- Write and apply one-line **lambda functions**

In [2]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline



### Lambda

`lambda` is a tool for building small functions in a single line. We already know how to build functions
using `def`, so let's do a quick comparison of the two.

Here's building a function using def:
```Python
def square_root(x): return x ** .5
```

Here's building the same function using lambda:
```Python
square_root_lambda = lambda x: x ** .5
```

In [None]:
def square_root(x): 
    return x** .5

square_root_lambda = lambda x: x ** .5

In [5]:
square_root(2)

1.4142135623730951

In [6]:
square_root_lambda(2)

1.4142135623730951
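
As a quick check that the two definitions behave the same, the sketch below compares them on a few inputs and then inspects each function's `__name__`. The sample inputs are arbitrary and chosen only for illustration.

```python
def square_root(x):
    return x ** .5

square_root_lambda = lambda x: x ** .5

# Both callables give the same result for each (arbitrary) sample input
samples = [0, 2, 9, 16]
same = all(square_root(n) == square_root_lambda(n) for n in samples)
print(same)  # True

# A def-built function records its own name; a lambda is always '<lambda>'
print(square_root.__name__)         # square_root
print(square_root_lambda.__name__)  # <lambda>
```

The `<lambda>` name is why a lambda passed as an aggregation function shows up under a `<lambda>` column label rather than a descriptive one.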

### Quick check
Write a normal function to calculate the area of a rectangle.

Then, re-write that function as a lambda function.

In [5]:
#Function here
def area(x, y):
    return ???
    
    
#Lambda function here:
area_lambda = ???

# Test them out!
print(area(4, 5), area_lambda(4, 5))

20 20


### Lambdas are 'anonymous functions'
Lambda functions are useful when you need a small function in only one place, for example as an argument passed to another function.

Some things to remember about lambda:
- it does not contain a return statement
- it is not a named function
- it is a tool for creating anonymous procedures
- it only takes a single expression (so, no loops or `if` statements, although a conditional expression such as `a if condition else b` is allowed)

More information on [Lambda](https://pythonconquerstheuniverse.wordpress.com/2011/08/29/lambda_tutorial/).
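
To illustrate the points above, here is a minimal sketch of a lambda that is used once, as an argument to another function, without ever being given a name. The `rectangles` list and its values are made up for illustration.

```python
# Hypothetical sample data: (width, height) pairs
rectangles = [(4, 5), (1, 2), (3, 3)]

# The lambda is passed straight to sorted() as the key and never named
by_area = sorted(rectangles, key=lambda r: r[0] * r[1])
print(by_area)  # [(1, 2), (3, 3), (4, 5)]

# A lambda body is a single expression, but that expression can be a
# conditional expression: value_if_true if condition else value_if_false
labels = list(map(lambda r: 'big' if r[0] * r[1] > 10 else 'small', rectangles))
print(labels)  # ['big', 'small', 'small']
```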


For example, in this code:
```Python
def _range(x):
    return np.max(x) - np.min(x)
    
df.pivot_table(index='A', aggfunc = (np.mean, _range))
```


You can replace the `_range` function with a lambda:

```Python
df.pivot_table(index='A', aggfunc = (np.mean, lambda x: np.max(x) - np.min(x)))
```

In [3]:
df = pd.DataFrame(np.random.randint(0,100,size=(8, 4)), columns=list('ABCD'))
df2 = pd.DataFrame([x for x in range(0,2)]*4, columns=['Group'])
df = pd.concat([df, df2], axis = 1)
df

Unnamed: 0,A,B,C,D,Group
0,71,42,70,22,0
1,29,45,82,16,1
2,64,81,29,72,0
3,78,13,44,14,1
4,73,74,4,7,0
5,74,31,19,26,1
6,94,86,29,66,0
7,48,14,22,59,1


In [68]:
df = df.applymap(lambda x: x*2)
df.head()

Unnamed: 0,A,B,C,D,Group
0,128,88,100,22,0
1,176,174,6,42,2
2,18,136,44,188,0
3,0,180,84,48,2
4,168,14,12,186,0


In [69]:
def _range(x):
    return np.max(x) - np.min(x)

df.pivot_table(index = 'Group', aggfunc = (np.mean, _range))

Unnamed: 0_level_0,A,A,B,B,C,C,D,D
Unnamed: 0_level_1,mean,_range,mean,_range,mean,_range,mean,_range
Group,Unnamed: 1_level_2,Unnamed: 2_level_2,Unnamed: 3_level_2,Unnamed: 4_level_2,Unnamed: 5_level_2,Unnamed: 6_level_2,Unnamed: 7_level_2,Unnamed: 8_level_2
0,95.5,150,61.5,128,82.5,162,124,166
2,83.5,176,143.0,106,76.0,180,79,120


In [70]:
df.pivot_table(index='Group', aggfunc = (np.mean, lambda x: np.max(x) - np.min(x)))

Unnamed: 0_level_0,A,A,B,B,C,C,D,D
Unnamed: 0_level_1,mean,<lambda>,mean,<lambda>,mean,<lambda>,mean,<lambda>
Group,Unnamed: 1_level_2,Unnamed: 2_level_2,Unnamed: 3_level_2,Unnamed: 4_level_2,Unnamed: 5_level_2,Unnamed: 6_level_2,Unnamed: 7_level_2,Unnamed: 8_level_2
0,95.5,150,61.5,128,82.5,162,124,166
2,83.5,176,143.0,106,76.0,180,79,120


### Independent practice
Practice writing a few lambda functions and applying them to this DataFrame.

In [6]:
df.mean  # without parentheses, this displays the bound method itself rather than the column means

<bound method DataFrame.mean of     A   B   C   D  Group
0  71  42  70  22      0
1  29  45  82  16      1
2  64  81  29  72      0
3  78  13  44  14      1
4  73  74   4   7      0
5  74  31  19  26      1
6  94  86  29  66      0
7  48  14  22  59      1>

In [9]:
df

Unnamed: 0,A,B,C,D,Group
0,71,42,70,22,0
1,29,45,82,16,1
2,64,81,29,72,0
3,78,13,44,14,1
4,73,74,4,7,0
5,74,31,19,26,1
6,94,86,29,66,0
7,48,14,22,59,1


In [14]:
# Problem 1: apply a lambda function that gives the square root of every element
def square_root(n):
    return n**0.5

df.applymap(lambda x: x**.5)
df.applymap(square_root)

Unnamed: 0,A,B,C,D,Group
0,8.42615,6.480741,8.3666,4.690416,0.0
1,5.385165,6.708204,9.055385,4.0,1.0
2,8.0,9.0,5.385165,8.485281,0.0
3,8.831761,3.605551,6.63325,3.741657,1.0
4,8.544004,8.602325,2.0,2.645751,0.0
5,8.602325,5.567764,4.358899,5.09902,1.0
6,9.69536,9.273618,5.385165,8.124038,0.0
7,6.928203,3.741657,4.690416,7.681146,1.0


Problem 2: apply a lambda function that returns 'skewed' for each column if the difference between its mean and median is more than 10% of its standard deviation, otherwise return 'normalish'

v1
- check if difference between mean and median is > 10% of s.d.
- if it is, then return the string 'skewed'
- if it is not, then return the string 'normalish'

v2
- for every column in our dataframe...: df.apply()
- calculate the mean: np.mean(column)
- calculate the median: np.median(column)
- subtract the median from the mean: np.mean(column) - np.median(column)
- get absolute value of this: abs(...)
- get sd of column: np.std(column)
- see if result is > 10% of s.d.: abs(...) > .1 * np.std(column)
- if it is, then return the string 'skewed': 'skewed' if above is true
- if it is not, then return the string 'normalish': else 'normalish'

lambda n: 'big' if n > 100 else 'small'

df.apply(
lambda c: 'skewed' if abs(np.mean(c) - np.median(c)) > np.std(c) * 0.1 else 'normalish'
)
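
The v2 steps above can also be written out as a named function, one step per line, before collapsing them into a lambda. This is only a sketch: it uses the standard library's `statistics` module on plain lists instead of NumPy on DataFrame columns, and the names `skew_label` and `skew_label_lambda` are hypothetical. The two sample lists copy the `A` and `Group` columns of the DataFrame shown earlier.

```python
import statistics

def skew_label(column):
    mean = statistics.mean(column)
    median = statistics.median(column)
    # pstdev is the population standard deviation, matching np.std's default
    sd = statistics.pstdev(column)
    difference = abs(mean - median)
    return 'skewed' if difference > 0.1 * sd else 'normalish'

# The same logic collapsed into a single-expression lambda
skew_label_lambda = lambda c: (
    'skewed'
    if abs(statistics.mean(c) - statistics.median(c)) > 0.1 * statistics.pstdev(c)
    else 'normalish'
)

column_a = [71, 29, 64, 78, 73, 74, 94, 48]  # column A from the DataFrame above
group = [0, 1, 0, 1, 0, 1, 0, 1]             # the Group column

print(skew_label(column_a), skew_label(group))  # skewed normalish
```

Writing the named version first makes each intermediate value easy to print and check; the lambda is then just the final `return` expression with the intermediate names substituted back in.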



In [15]:
df.apply(lambda c: 'skewed' if abs(np.mean(c) - np.median(c)) > np.std(c) * 0.1 else 'normalish')

A           skewed
B           skewed
C           skewed
D           skewed
Group    normalish
dtype: object

In [23]:
df.pivot_table(index='Group', aggfunc = lambda x: 1)

Unnamed: 0_level_0,A,B,C,D
Group,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1
0,1,1,1,1
1,1,1,1,1


In [75]:
# Problem 1: apply a lambda function that gives the square root of every element
df.applymap(lambda x: x**.5)

# Problem 2: apply a lambda function that returns 'skewed' for each column if the difference between
# its mean and median is more than 10% of its standard deviation, otherwise return 'normalish'
df.apply(???)

# Problem 3: create a pivot table, indexed on 'Group', with a lambda aggfunc that returns 
# the number of group elements greater than 100
df.pivot_table(???)

A           skewed
B           skewed
C           skewed
D           skewed
Group    normalish
dtype: object


Unnamed: 0_level_0,A,B,C,D
Group,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1
0,2,1,1,2
2,1,3,1,1
