# ![](https://ga-dash.s3.amazonaws.com/production/assets/logo-9f88ae6c9c3871690e33280fcf557f33.png) Flash lesson: lambda functions



### LEARNING OBJECTIVES
*After this lesson, you will be able to:*
- Write and apply one-line **lambda functions**

In [1]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline

### Lambda

`lambda` is a Python keyword for building a small function out of a single expression. We already know how to build functions
using `def`, so let's do a quick comparison of the two.

In [2]:
def square_root(x): return x ** .5

square_root(2)

1.4142135623730951

In [3]:
(lambda x: x ** .5)(2)

1.4142135623730951

In [4]:
square_root_lambda = lambda x: x ** .5

square_root_lambda(2)

1.4142135623730951

### Quick check
Write a normal function to calculate the area of a rectangle.

Then, re-write that function as a lambda function.

In [5]:
# Function here:
def area(x, y):
    return x * y
    
# Lambda function here:
area_lambda = lambda x, y: x * y

# Test them out!
print(area(4, 5), area_lambda(4, 5))

20 20


### Lambdas are 'anonymous functions'
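Lambda functions are useful when you need a short function in only one place, for example as an argument to another function.

Some things to remember about lambda:
- it does not contain a `return` statement; the value of its single expression is returned automatically
- it has no name of its own, although it can be assigned to a variable
- it is a tool for creating anonymous functions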

More information on [Lambda](https://pythonconquerstheuniverse.wordpress.com/2011/08/29/lambda_tutorial/).
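
The points above can be checked directly. This is a minimal sketch; the names `double` and `double_lambda` are made up for illustration and do not appear elsewhere in the lesson.

```python
# A named function built with def (hypothetical example)
def double(x):
    return x * 2

# The same operation as a lambda: no return statement,
# the value of the expression is returned implicitly
double_lambda = lambda x: x * 2

print(double.__name__)         # double
print(double_lambda.__name__)  # <lambda>
print(double(21) == double_lambda(21))  # True
```

Assigning the lambda to a variable gives you a handle to call it, but the function object itself still reports its name as `<lambda>`.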


In [6]:
[x for x in range(0, 2)] * 4

[0, 1, 0, 1, 0, 1, 0, 1]

In [7]:
# create some data

df = pd.DataFrame(np.random.randint(0, 100, size = (8, 4)),
                  columns = list('ABCD'))

# create a binary column:
df['Group'] = [x for x in range(0, 2)] * 4

df

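|   | A | B | C | D | Group |
|---|---|---|---|---|---|
| 0 | 26 | 78 | 76 | 77 | 0 |
| 1 | 60 | 24 | 73 | 67 | 1 |
| 2 | 67 | 20 | 22 | 9 | 0 |
| 3 | 75 | 9 | 65 | 14 | 1 |
| 4 | 43 | 2 | 52 | 97 | 0 |
| 5 | 26 | 81 | 60 | 51 | 1 |
| 6 | 11 | 35 | 74 | 70 | 0 |
| 7 | 84 | 94 | 94 | 71 | 1 |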


In [8]:
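# double every element; this also doubles 'Group', so its values become 0 and 2
# (pandas 2.1+ deprecates applymap in favor of DataFrame.map)
df = df.applymap(lambda x: x * 2)
df.head()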

|   | A | B | C | D | Group |
|---|---|---|---|---|---|
| 0 | 52 | 156 | 152 | 154 | 0 |
| 1 | 120 | 48 | 146 | 134 | 2 |
| 2 | 134 | 40 | 44 | 18 | 0 |
| 3 | 150 | 18 | 130 | 28 | 2 |
| 4 | 86 | 4 | 104 | 194 | 0 |


In [9]:
def _range(x):
    return np.max(x) - np.min(x)

df.pivot_table(index = 'Group',
               aggfunc = (np.mean, _range, max))

| Group | A `_range` | A `max` | A `mean` | B `_range` | B `max` | B `mean` | C `_range` | C `max` | C `mean` | D `_range` | D `max` | D `mean` |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 112.0 | 134.0 | 73.5 | 152.0 | 156.0 | 67.5 | 108 | 152 | 112 | 176.0 | 194.0 | 126.5 |
| 2 | 116.0 | 168.0 | 122.5 | 170.0 | 188.0 | 104.0 | 68 | 188 | 146 | 114.0 | 142.0 | 101.5 |


In [10]:
df.pivot_table(index = 'Group',
               aggfunc = (np.mean,
                          lambda x: np.max(x) - np.min(x)))

| Group | A `<lambda_0>` | A `mean` | B `<lambda_0>` | B `mean` | C `<lambda_0>` | C `mean` | D `<lambda_0>` | D `mean` |
|---|---|---|---|---|---|---|---|---|
| 0 | 112.0 | 73.5 | 152.0 | 67.5 | 108 | 112 | 176.0 | 126.5 |
| 2 | 116.0 | 122.5 | 170.0 | 104.0 | 68 | 146 | 114.0 | 101.5 |


### Lambdas in sorting


In [11]:
people = [
    {'name': 'Xlegic', 'age': 37, 'role': 'student'},
    {'name': 'John', 'age': 45, 'role': 'student'},
    {'name': 'Jasmine', 'age': 20, 'role': 'student'},
    {'name': 'Eva', 'age': 33, 'role': 'instructor'},
]

sorted(people, key = lambda person: person['age'])

[{'name': 'Jasmine', 'age': 20, 'role': 'student'},
 {'name': 'Eva', 'age': 33, 'role': 'instructor'},
 {'name': 'Xlegic', 'age': 37, 'role': 'student'},
 {'name': 'John', 'age': 45, 'role': 'student'}]

In [12]:
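# str.lower is an existing function, so no lambda is needed here;
# key = lambda word: word.lower() would give the same ordering
sorted("This is a test string from Andrew".split(),
       key = str.lower)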

['a', 'Andrew', 'from', 'is', 'string', 'test', 'This']
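
A lambda key can also return a tuple, which sorts by the first item and breaks ties with the second. The sketch below reuses the `people` list from above (repeated so the snippet runs on its own); the variable names `by_role_then_age` and `names` are illustrative only.

```python
people = [
    {'name': 'Xlegic', 'age': 37, 'role': 'student'},
    {'name': 'John', 'age': 45, 'role': 'student'},
    {'name': 'Jasmine', 'age': 20, 'role': 'student'},
    {'name': 'Eva', 'age': 33, 'role': 'instructor'},
]

# Sort by role alphabetically, then by age within each role
by_role_then_age = sorted(people,
                          key = lambda person: (person['role'], person['age']))

names = [person['name'] for person in by_role_then_age]
print(names)  # ['Eva', 'Jasmine', 'Xlegic', 'John']
```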

### Independent practice
Practice writing a few lambda functions and applying them to this dataframe (solutions in repo).

**Problem 1**: apply a lambda function that gives the square root of every element

In [13]:
# Code here:
df.applymap(lambda x: x ** .5)

|   | A | B | C | D | Group |
|---|---|---|---|---|---|
| 0 | 7.211103 | 12.489996 | 12.328828 | 12.409674 | 0.0 |
| 1 | 10.954451 | 6.928203 | 12.083046 | 11.575837 | 1.414214 |
| 2 | 11.575837 | 6.324555 | 6.63325 | 4.242641 | 0.0 |
| 3 | 12.247449 | 4.242641 | 11.401754 | 5.291503 | 1.414214 |
| 4 | 9.273618 | 2.0 | 10.198039 | 13.928388 | 0.0 |
| 5 | 7.211103 | 12.727922 | 10.954451 | 10.099505 | 1.414214 |
| 6 | 4.690416 | 8.3666 | 12.165525 | 11.83216 | 0.0 |
| 7 | 12.961481 | 13.711309 | 13.711309 | 11.916375 | 1.414214 |


**Problem 2**: apply a lambda function that returns 'skewed' for each column if the absolute difference between its mean and median is more than 10% of its standard deviation, and 'normalish' otherwise

In [14]:
# Code here:
# apply() passes each column to the lambda as c
df.apply(lambda c: 'skewed'
         if abs(np.mean(c) - np.median(c)) > np.std(c) * .1
         else 'normalish')

A           skewed
B           skewed
C           skewed
D           skewed
Group    normalish
dtype: object

**Problem 3**: create a pivot table, indexed on 'Group', with a lambda aggfunc that returns the number of group elements greater than 100

* `x` in `lambda x` is the set of values from one column within one group; the lambda is called once for each column in each group
* `i` in the list comprehension iterates over each element of `x`

In [15]:
# Code here:
df.pivot_table(index = 'Group', 
               aggfunc = (lambda x: len([i for i in x if i > 100])))

| Group | A | B | C | D |
|---|---|---|---|---|
| 0 | 1 | 1 | 3 | 3 |
| 2 | 3 | 2 | 4 | 3 |
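
To see what the lambda receives, the grouping can be mimicked by hand in plain Python. This is only a sketch: the dictionary `column_a_by_group` is a hypothetical stand-in for the split that `pivot_table` performs internally, filled with the doubled values of column A from the dataframe above.

```python
# Column A after doubling, split by 'Group' by hand.
# pivot_table does this split itself; the dict is a stand-in for illustration.
column_a_by_group = {
    0: [52, 134, 86, 22],     # rows 0, 2, 4, 6
    2: [120, 150, 52, 168],   # rows 1, 3, 5, 7
}

# x: the values of one column within one group
# i: a single element of x
count_over_100 = lambda x: len([i for i in x if i > 100])

counts = {group: count_over_100(values)
          for group, values in column_a_by_group.items()}
print(counts)  # {0: 1, 2: 3}
```

The counts match the A column of the pivot table: one value above 100 in group 0 and three in group 2.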
