# Lambda Functions - Lab

## Introduction

In this lab, you'll get some hands-on practice creating and using lambda functions.

## Objectives
In this lab you will: 
* Create lambda functions to use as arguments of other functions   
* Use the `.map()` or `.apply()` method to apply a function to a pandas Series or DataFrame
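
As a quick illustration of both objectives, the sketch below passes a lambda as the `key` argument of `sorted()` and then applies lambdas with `.map()` and `.apply()`. The word list, the `toy` DataFrame, and its column names are made up for illustration and are not part of the lab's dataset.

```python
import pandas as pd

# Hypothetical toy data, not the Yelp dataset used in this lab
words = ['pandas', 'map', 'lambda']

# A lambda passed as the `key` argument of another function
by_length = sorted(words, key=lambda w: len(w))

toy = pd.DataFrame({'a': [1, 2, 3], 'b': [10, 20, 30]})

# Series.map() calls the lambda once per element
doubled = toy['a'].map(lambda x: x * 2)

# DataFrame.apply() with axis=1 calls the lambda once per row
row_sums = toy.apply(lambda row: row['a'] + row['b'], axis=1)

print(by_length)
print(doubled.tolist())
print(row_sums.tolist())
```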

## Lambda Functions

In [1]:
import pandas as pd
df = pd.read_csv('Yelp_Reviews.csv', index_col=0)
df.head(2)

| | business_id | cool | date | funny | review_id | stars | text | useful | user_id |
|---|---|---|---|---|---|---|---|---|---|
| 1 | pomGBqfbxcqPv14c3XH-ZQ | 0 | 2012-11-13 | 0 | dDl8zu1vWPdKGihJrwQbpw | 5 | I love this place! My fiance And I go here atl... | 0 | msQe1u7Z_XuqjGoqhB0J5g |
| 2 | jtQARsP6P-LbkyjbO1qNGg | 1 | 2014-10-23 | 1 | LZp4UX5zK3e-c5ZGSeo3kA | 1 | Terrible. Dry corn bread. Rib tips were all fa... | 3 | msQe1u7Z_XuqjGoqhB0J5g |


In [2]:
df.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 2610 entries, 1 to 4206
Data columns (total 9 columns):
business_id    2610 non-null object
cool           2610 non-null int64
date           2610 non-null object
funny          2610 non-null int64
review_id      2610 non-null object
stars          2610 non-null int64
text           2610 non-null object
useful         2610 non-null int64
user_id        2610 non-null object
dtypes: int64(4), object(5)
memory usage: 203.9+ KB


In [3]:
df.describe()

| | cool | funny | stars | useful |
|---|---|---|---|---|
| count | 2610.0 | 2610.0 | 2610.0 | 2610.0 |
| mean | 0.229119 | 0.211877 | 3.724521 | 0.872414 |
| std | 0.713175 | 0.793281 | 1.52467 | 1.862308 |
| min | 0.0 | 0.0 | 1.0 | 0.0 |
| 25% | 0.0 | 0.0 | 3.0 | 0.0 |
| 50% | 0.0 | 0.0 | 4.0 | 0.0 |
| 75% | 0.0 | 0.0 | 5.0 | 1.0 |
| max | 14.0 | 16.0 | 5.0 | 24.0 |


## Simple arithmetic

Use a lambda function to create a new column called `'stars_squared'` by squaring each value in the `'stars'` column.

In [4]:
# Your code here
df['stars_squared'] = df['stars'].map(lambda x: x**2)
df.head(3)


| | business_id | cool | date | funny | review_id | stars | text | useful | user_id | stars_squared |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | pomGBqfbxcqPv14c3XH-ZQ | 0 | 2012-11-13 | 0 | dDl8zu1vWPdKGihJrwQbpw | 5 | I love this place! My fiance And I go here atl... | 0 | msQe1u7Z_XuqjGoqhB0J5g | 25 |
| 2 | jtQARsP6P-LbkyjbO1qNGg | 1 | 2014-10-23 | 1 | LZp4UX5zK3e-c5ZGSeo3kA | 1 | Terrible. Dry corn bread. Rib tips were all fa... | 3 | msQe1u7Z_XuqjGoqhB0J5g | 1 |
| 4 | Ums3gaP2qM3W1XcA5r6SsQ | 0 | 2014-09-05 | 0 | jsDu6QEJHbwP2Blom1PLCA | 5 | Delicious healthy food. The steak is amazing. ... | 0 | msQe1u7Z_XuqjGoqhB0J5g | 25 |
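
For simple arithmetic like this, pandas can also operate on the whole column at once, without a lambda. The sketch below compares the two approaches on a small, hypothetical Series standing in for `df['stars']`; it is meant as a side-by-side illustration, not as a replacement for the exercise.

```python
import pandas as pd

# Hypothetical star ratings standing in for df['stars']
stars = pd.Series([5, 1, 5, 3])

# Element-by-element, using a lambda as in the exercise
squared_with_lambda = stars.map(lambda x: x**2)

# Vectorized arithmetic on the whole Series
squared_vectorized = stars ** 2

print(squared_with_lambda.tolist())
print(squared_vectorized.tolist())
```

Both produce the same values; the lambda version is the more general pattern, since it works for logic that has no vectorized equivalent.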


## Dates
Select the month from the date string using a lambda function.

In [5]:
df['date'].dtype


dtype('O')

In [9]:
# Your code here
pd.to_datetime(df['date']).map(lambda x: x.strftime('%m'))

1       11
2       10
4       09
5       02
10      06
        ..
689     06
4874    08
564     06
3458    10
4206    08
Name: date, Length: 2610, dtype: object
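
Since `'date'` is stored as strings (dtype `object`), the month can also be taken straight from the string, without converting to datetime first. This is a minimal sketch on a hypothetical sample that mirrors the first rows of `df['date']`, and it assumes every value follows the `YYYY-MM-DD` pattern.

```python
import pandas as pd

# Hypothetical sample mirroring the first rows of df['date']
dates = pd.Series(['2012-11-13', '2014-10-23', '2014-09-05'])

# Split 'YYYY-MM-DD' on '-' and keep the middle piece (the month)
months = dates.map(lambda x: x.split('-')[1])

print(months.tolist())
```

Like the `strftime('%m')` version, this returns the month as a zero-padded string rather than an integer.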

## What is the average number of words in a Yelp review?
Do this with a single line of code!

In [10]:
# Your code here
df['text'].map(lambda x: len(x.split())).mean()

77.06551724137931
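
The same word count can be written with the pandas `.str` accessor instead of a lambda. The sketch below uses two made-up review strings, not rows from the dataset, to show that both forms agree.

```python
import pandas as pd

# Hypothetical review texts, not taken from Yelp_Reviews.csv
texts = pd.Series(['I love this place', 'Delicious food'])

# Lambda version: split each string on whitespace and count the pieces
mean_with_lambda = texts.map(lambda x: len(x.split())).mean()

# .str accessor version: the same steps applied to the whole Series
mean_with_str = texts.str.split().str.len().mean()

print(mean_with_lambda, mean_with_str)
```

One practical difference: the lambda raises an error on a missing value (a float `NaN` has no `.split()`), whereas the `.str` methods pass missing values through. The `'text'` column in this lab has no nulls, so either form works here.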

## Create a new column for the number of words in the review

In [11]:
# Your code here
df['review_word_count'] = df['text'].map(lambda x: len(x.split()))
df.head(3)

| | business_id | cool | date | funny | review_id | stars | text | useful | user_id | stars_squared | review_word_count |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | pomGBqfbxcqPv14c3XH-ZQ | 0 | 2012-11-13 | 0 | dDl8zu1vWPdKGihJrwQbpw | 5 | I love this place! My fiance And I go here atl... | 0 | msQe1u7Z_XuqjGoqhB0J5g | 25 | 58 |
| 2 | jtQARsP6P-LbkyjbO1qNGg | 1 | 2014-10-23 | 1 | LZp4UX5zK3e-c5ZGSeo3kA | 1 | Terrible. Dry corn bread. Rib tips were all fa... | 3 | msQe1u7Z_XuqjGoqhB0J5g | 1 | 30 |
| 4 | Ums3gaP2qM3W1XcA5r6SsQ | 0 | 2014-09-05 | 0 | jsDu6QEJHbwP2Blom1PLCA | 5 | Delicious healthy food. The steak is amazing. ... | 0 | msQe1u7Z_XuqjGoqhB0J5g | 25 | 30 |


## Rewrite the following as a lambda function

Create a new column `'Review_length'` by applying this lambda function to the `'review_word_count'` column.

In [12]:
# Rewrite the following function as a lambda function
# (`value` is a word count, i.e. an integer, so it is compared directly)
def rewrite_as_lambda(value):
    if value < 50:
        return 'Short'
    elif value < 80:
        return 'Medium'
    else:
        return 'Long'
# Hint: nest your if, else conditionals

df['Review_length'] = df['review_word_count'].map(lambda x: 'Short' if x < 50 
                                                 else ('Medium' if x < 80 else 'Long'))

df.head(2)

| | business_id | cool | date | funny | review_id | stars | text | useful | user_id | stars_squared | review_word_count | Review_length |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | pomGBqfbxcqPv14c3XH-ZQ | 0 | 2012-11-13 | 0 | dDl8zu1vWPdKGihJrwQbpw | 5 | I love this place! My fiance And I go here atl... | 0 | msQe1u7Z_XuqjGoqhB0J5g | 25 | 58 | Medium |
| 2 | jtQARsP6P-LbkyjbO1qNGg | 1 | 2014-10-23 | 1 | LZp4UX5zK3e-c5ZGSeo3kA | 1 | Terrible. Dry corn bread. Rib tips were all fa... | 3 | msQe1u7Z_XuqjGoqhB0J5g | 1 | 30 | Short |


In [13]:
df['Review_length'].value_counts()

Short     1287
Long       769
Medium     554
Name: Review_length, dtype: int64
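
Nested conditionals in a lambda get hard to read once there are more than two or three categories. One possible alternative for this kind of binning is `pd.cut`. The sketch below uses hypothetical word counts and the same thresholds as above (under 50, under 80, otherwise), and checks that both approaches label the values the same way.

```python
import pandas as pd

# Hypothetical word counts, chosen to include the boundary values 50 and 80
counts = pd.Series([58, 30, 80, 50, 49])

# Nested conditional lambda, as in the exercise
with_lambda = counts.map(
    lambda x: 'Short' if x < 50 else ('Medium' if x < 80 else 'Long')
)

# pd.cut with right=False builds the intervals [-inf, 50), [50, 80), [80, inf)
with_cut = pd.cut(
    counts,
    bins=[float('-inf'), 50, 80, float('inf')],
    right=False,
    labels=['Short', 'Medium', 'Long'],
)

print(with_lambda.tolist())
print(with_cut.astype(str).tolist())
```

Note that `pd.cut` returns a categorical Series, so it is converted with `.astype(str)` here only to compare it with the lambda's plain strings.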

## Level Up: Dates Advanced!
<img src="images/world_map.png" width="600">  

Print the first five rows of the `'date'` column. 

In [14]:
# Your code here
df['date'].head()

1     2012-11-13
2     2014-10-23
4     2014-09-05
5     2011-02-25
10    2016-06-15
Name: date, dtype: object

Overwrite the `'date'` column by reordering the month and day from `YYYY-MM-DD` to `DD-MM-YYYY`. Try to do this using a lambda function.

In [17]:
# Your code here
df['date'] = pd.to_datetime(df['date']).map(lambda x: x.strftime('%d-%m-%Y'))
df['date'].head()

1     13-11-2012
2     23-10-2014
4     05-09-2014
5     25-02-2011
10    15-06-2016
Name: date, dtype: object
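
Because this cell overwrites `'date'`, running it a second time would ask `pd.to_datetime` to parse strings such as `05-09-2014` without a stated format, and an ambiguous value like that can be read with day and month swapped. The sketch below shows two ways to make the conversion explicit, using a hypothetical two-row sample: a pure string lambda, and a datetime conversion with the input format spelled out.

```python
import pandas as pd

# Hypothetical sample in the original 'YYYY-MM-DD' layout
dates = pd.Series(['2012-11-13', '2014-09-05'])

# Pure string approach: split on '-', reverse the pieces, join them again
reordered = dates.map(lambda x: '-'.join(x.split('-')[::-1]))

# Datetime approach with an explicit input format, so parsing never guesses
parsed = pd.to_datetime(dates, format='%Y-%m-%d')
reordered_dt = parsed.map(lambda x: x.strftime('%d-%m-%Y'))

print(reordered.tolist())
print(reordered_dt.tolist())
```

With `format='%Y-%m-%d'`, input that is already in `DD-MM-YYYY` order should raise an error instead of being silently reinterpreted, which makes an accidental second run easier to notice.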

## Summary

Great! Hopefully you're getting the hang of lambda functions now. It's important not to overuse them: it often makes more sense to define a named function so that it's reusable elsewhere. But whenever you need to quickly apply some simple processing to a collection of data, you now have a technique for doing just that. It'll also be useful when you're reading someone else's code that happens to use lambdas.
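
To illustrate the point about reuse, here is a minimal sketch in which the word-count logic is written once as a named function and applied to two columns. The `word_count` helper, the toy DataFrame, and its `'title'` column are hypothetical and do not exist in the lab's dataset.

```python
import pandas as pd

def word_count(text):
    """Return the number of whitespace-separated words in `text`."""
    return len(text.split())

# Hypothetical toy data; the 'title' column is not part of the lab's dataset
reviews = pd.DataFrame({
    'title': ['Great spot', 'Never again'],
    'text': ['I love this place', 'Terrible. Dry corn bread.'],
})

# The same named function is reused for two columns
title_words = reviews['title'].map(word_count)
text_words = reviews['text'].map(word_count)

print(title_words.tolist())
print(text_words.tolist())
```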