# Lambda Functions - Lab

## Introduction

In this lab, you'll get some hands-on practice creating and using lambda functions.

## Objectives
You will be able to:
* Understand what lambda functions are and why they are useful
* Use lambda functions to transform data within lists and DataFrames

## Lambda Functions
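
As a quick refresher before working with the dataset: a lambda is an anonymous function made of a single expression. The sketch below uses made-up names (`double`, `double_lambda`, `numbers`) purely for illustration. It shows that a lambda behaves like the equivalent `def`, and that it can be passed straight to `map` to transform a list.

```python
# A named function and an equivalent lambda (names are illustrative only)
def double(x):
    return x * 2

double_lambda = lambda x: x * 2

numbers = [1, 2, 3]  # made-up sample data

# map applies the function to each element; list() materializes the result
doubled = list(map(lambda x: x * 2, numbers))

print(double(4), double_lambda(4))  # both return 8
print(doubled)                      # [2, 4, 6]
```

The lambda passed to `map` is never given a name, which is the typical use case: a short, one-off transformation.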

In [34]:
import pandas as pd
df = pd.read_csv('Yelp_Reviews.csv')
df.head(2)

| Unnamed: 0.1 | Unnamed: 0 | business_id | cool | date | funny | review_id | stars | text | useful | user_id |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | pomGBqfbxcqPv14c3XH-ZQ | 0 | 2012-11-13 | 0 | dDl8zu1vWPdKGihJrwQbpw | 5 | I love this place! My fiance And I go here atl... | 0 | msQe1u7Z_XuqjGoqhB0J5g |
| 1 | 2 | jtQARsP6P-LbkyjbO1qNGg | 1 | 2014-10-23 | 1 | LZp4UX5zK3e-c5ZGSeo3kA | 1 | Terrible. Dry corn bread. Rib tips were all fa... | 3 | msQe1u7Z_XuqjGoqhB0J5g |


# Simple Arithmetic

Use a lambda function to create a new column called 'stars_squared' by squaring the stars column.

In [35]:
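# Square each rating with a lambda, then store the result as a new column
stars_squared = df['stars'].map(lambda x: x**2)
df['stars_squared'] = stars_squared
stars_squared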

0       25
1        1
2       25
3        1
4       25
5        1
6       25
7       25
8        1
9        1
10      25
11      25
12      16
13      25
14      25
15       4
16      25
17       1
18      25
19      25
20      16
21       1
22      16
23      25
24      16
25       1
26       1
27      16
28      16
29      16
        ..
2580    16
2581    25
2582    16
2583     1
2584    25
2585    16
2586    25
2587     4
2588    25
2589    25
2590    16
2591    25
2592     4
2593    25
2594     9
2595    16
2596     4
2597     1
2598    16
2599     1
2600    25
2601     4
2602    25
2603    16
2604     1
2605    25
2606    25
2607    25
2608     4
2609     1
Name: stars, Length: 2610, dtype: int64
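
Squaring can be written either with a lambda passed to `Series.map` or with the vectorized `**` operator. The sketch below checks on a small made-up DataFrame (the `toy` frame and its values are assumptions, not the Yelp data) that both give the same result.

```python
import pandas as pd

# Made-up ratings for illustration only
toy = pd.DataFrame({'stars': [5, 1, 4]})

# Element-wise version: the lambda is called once per value
toy['stars_squared'] = toy['stars'].map(lambda x: x ** 2)

# Vectorized version: pandas squares the whole column at once
vectorized = toy['stars'] ** 2

print(toy['stars_squared'].tolist())  # [25, 1, 16]
```

The vectorized form is usually faster for plain arithmetic; the lambda form is handy when the per-value logic has no built-in vectorized equivalent.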

# Dates
Select the year from the date string using a lambda function. The dates are stored as `YYYY-MM-DD` strings, so the year is `x[:4]`; the month would be `x[5:7]`.

In [36]:
type(df['date'])
df.date.map(lambda x: x[:4])

0       2012
1       2014
2       2014
3       2011
4       2016
5       2016
6       2014
7       2017
8       2017
9       2017
10      2015
11      2012
12      2012
13      2012
14      2012
15      2014
16      2016
17      2015
18      2015
19      2017
20      2016
21      2016
22      2016
23      2012
24      2017
25      2017
26      2010
27      2016
28      2016
29      2016
        ... 
2580    2015
2581    2013
2582    2016
2583    2015
2584    2017
2585    2015
2586    2015
2587    2012
2588    2015
2589    2015
2590    2007
2591    2016
2592    2015
2593    2013
2594    2011
2595    2014
2596    2014
2597    2016
2598    2011
2599    2014
2600    2012
2601    2014
2602    2012
2603    2013
2604    2013
2605    2013
2606    2016
2607    2016
2608    2013
2609    2016
Name: date, Length: 2610, dtype: object
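
Because each date is a `YYYY-MM-DD` string, every part sits at a fixed position and can be sliced out. A minimal sketch, using the two dates from the first rows of the dataset shown earlier (the `dates` Series itself is a stand-in for `df['date']`):

```python
import pandas as pd

# Stand-in for df['date']; values are plain strings, not datetime objects
dates = pd.Series(['2012-11-13', '2014-10-23'])

years = dates.map(lambda x: x[:4])    # characters 0-3
months = dates.map(lambda x: x[5:7])  # characters 5-6

print(years.tolist())   # ['2012', '2014']
print(months.tolist())  # ['11', '10']
```

Note that the results are still strings; convert with `int(...)` inside the lambda if numeric values are needed.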

# What is the average number of words for a Yelp review?
Do this with a single line of code!

In [37]:
# Note: .head() limits this to the first five reviews; drop it to average over every review
df['text'].map(lambda x: len(x.split())).head().mean()

46.4
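
To see how the word count works step by step, this sketch uses two made-up review strings (assumed sample data, not taken from the dataset). Calling `str.split()` with no arguments splits on any run of whitespace, so `len(x.split())` counts the words.

```python
import pandas as pd

# Made-up reviews for illustration only
reviews = pd.Series(['Great food', 'Terrible. Dry corn bread.'])

# One word count per review
word_counts = reviews.map(lambda x: len(x.split()))

print(word_counts.tolist())  # [2, 4]
print(word_counts.mean())    # 3.0
```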

# Create a new column for the number of words in the review.

In [62]:
df['word_num'] = df['text'].map(lambda x: len(x.split()))
df['word_num']

0        58
1        30
2        30
3        82
4        32
5        49
6        21
7        70
8       131
9       112
10       19
11       28
12       20
13       15
14       28
15       31
16       31
17       49
18       26
19       47
20       85
21      120
22       28
23       32
24       37
25      122
26       51
27       42
28       80
29       79
       ... 
2580     32
2581     19
2582     85
2583    117
2584     72
2585     22
2586     31
2587     48
2588     95
2589     60
2590     89
2591     25
2592     39
2593     22
2594     46
2595    141
2596     24
2597    148
2598    589
2599    122
2600     11
2601     48
2602    133
2603     37
2604    128
2605     61
2606     43
2607     79
2608    185
2609     42
Name: word_num, Length: 2610, dtype: int64

## Rewrite the following as a lambda function. Create a new column 'review_length'

In [73]:
#def rewrite_as_lambda(value):
#    if len(value) < 50:
#        return 'Short'
#    elif len(value) < 80:
#        return 'Medium'
#    else:
#        return 'Long'
#Hint: nest your if, else conditionals

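# Apply the lambda to each review's text; len(x) is the character count, as in the function above
df['review_length'] = df['text'].map(lambda x: 'Short' if len(x) < 50 else ('Medium' if len(x) < 80 else 'Long'))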
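
One way to check a nested conditional lambda is to compare it against the original function on a few inputs, including the boundary lengths. In this sketch the sample strings are made up, and `classify` is a hypothetical name given to the lambda so it can be tested.

```python
# The original function from the exercise
def rewrite_as_lambda(value):
    if len(value) < 50:
        return 'Short'
    elif len(value) < 80:
        return 'Medium'
    else:
        return 'Long'

# Nested conditional expression: the inner one handles the elif/else branch
classify = lambda x: 'Short' if len(x) < 50 else ('Medium' if len(x) < 80 else 'Long')

# Made-up strings of length 10, 50, 60, 80 and 100
samples = ['a' * n for n in (10, 50, 60, 80, 100)]
labels = [classify(s) for s in samples]

print(labels)  # ['Short', 'Medium', 'Medium', 'Long', 'Long']
```

To build the column, the lambda has to be passed to `map` (or `apply`) on a Series; assigning the bare lambda to a column does not call it.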

# Level Up: Dates Advanced!
<img src="date_format_map.png" width=500>  

Overwrite the date column by reordering the year and day, from YYYY-MM-DD to DD-MM-YYYY. Try to do this using a lambda function.

In [64]:
type(df['date'][0])
# Parse the 'YYYY-MM-DD' strings (input format '%Y-%m-%d'), then write them back out as 'DD/MM/YYYY';
# use '%d-%m-%Y' as the output format for dashes instead of slashes
df['date'] = pd.to_datetime(df['date'], format='%Y-%m-%d').dt.strftime('%d/%m/%Y')
df['date'].head(20)

0     13/11/2012
1     23/10/2014
2     05/09/2014
3     25/02/2011
4     15/06/2016
5     23/09/2016
6     23/08/2014
7     16/08/2017
8     18/11/2017
9     18/11/2017
10    10/12/2015
11    12/02/2012
12    12/02/2012
13    12/02/2012
14    12/02/2012
15    03/06/2014
16    18/11/2016
17    05/11/2015
18    05/11/2015
19    21/01/2017
Name: date, dtype: object
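
The reordering can also be done directly on the strings with a lambda, as the exercise suggests. This is a minimal sketch on a stand-in Series, and it assumes every value is a well-formed `YYYY-MM-DD` string: split on `-`, reverse the three parts, and join them again.

```python
import pandas as pd

# Stand-in for the original df['date'] strings
dates = pd.Series(['2012-11-13', '2014-10-23'])

# ['2012', '11', '13'] -> ['13', '11', '2012'] -> '13-11-2012'
reordered = dates.map(lambda x: '-'.join(reversed(x.split('-'))))

print(reordered.tolist())  # ['13-11-2012', '23-10-2014']
```

Unlike `pd.to_datetime`, this approach does no validation, so a malformed date would be silently rearranged rather than raising an error.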

## Summary

Great! Hopefully you're getting the hang of lambda functions now! It's important not to overuse them: it will often make more sense to define a named function so that it's reusable elsewhere. But whenever you need to quickly apply some simple processing to a collection of data, you have a new technique that will help you do just that. It'll also be useful if you're reading someone else's code that happens to use lambdas.