# Lambda Functions - Lab

## Introduction

In this lab, you'll get some hands-on practice creating and using lambda functions.

## Objectives
You will be able to:
* Understand what lambda functions are and why they are useful
* Use lambda functions to transform data within lists and DataFrames

## Lambda Functions
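
As a quick refresher before the exercises, here is a minimal sketch of a lambda applied to a plain list. The names `double`, `numbers`, and `doubled` are made up for illustration and do not come from the lab's dataset.

```python
# A lambda is an anonymous function limited to a single expression.
double = lambda x: x * 2

# Hypothetical sample data, not taken from the Yelp reviews.
numbers = [1, 2, 3]

# map applies the function to every element; list() materializes the result.
doubled = list(map(double, numbers))
print(doubled)  # [2, 4, 6]
```

The same idea carries over to pandas, where `Series.map` plays the role of the built-in `map`.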

In [22]:
import pandas as pd
df = pd.read_csv('Yelp_Reviews.csv')
df.head(2)

Unnamed: 0.1,Unnamed: 0,business_id,cool,date,funny,review_id,stars,text,useful,user_id
0,1,pomGBqfbxcqPv14c3XH-ZQ,0,2012-11-13,0,dDl8zu1vWPdKGihJrwQbpw,5,I love this place! My fiance And I go here atl...,0,msQe1u7Z_XuqjGoqhB0J5g
1,2,jtQARsP6P-LbkyjbO1qNGg,1,2014-10-23,1,LZp4UX5zK3e-c5ZGSeo3kA,1,Terrible. Dry corn bread. Rib tips were all fa...,3,msQe1u7Z_XuqjGoqhB0J5g


In [25]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 2610 entries, 0 to 2609
Data columns (total 10 columns):
Unnamed: 0     2610 non-null int64
business_id    2610 non-null object
cool           2610 non-null int64
date           2610 non-null object
funny          2610 non-null int64
review_id      2610 non-null object
stars          2610 non-null int64
text           2610 non-null object
useful         2610 non-null int64
user_id        2610 non-null object
dtypes: int64(5), object(5)
memory usage: 204.0+ KB


In [29]:
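print('rows in df...', len(df))
# One row per (cool, stars, useful) combination: first value of each remaining column, count of user_id
dfG = df.groupby(['cool', 'stars', 'useful']).agg({
    'business_id': 'first', 'date': 'first', 'funny': 'first',
    'review_id': 'first', 'text': 'first', 'user_id': 'count',
}).reset_index()
print('rows in group ..', len(dfG))
# Inner join keeps only the original rows that match a group's key columns and business_id
dfM = pd.merge(df, dfG, on=['cool', 'stars', 'useful', 'business_id'], how='inner')
print('rows in Merge ....', len(dfM))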


rows in df... 2610
rows in group .. 128
rows in Merge .... 144


## Simple Arithmetic

Use a lambda function to create a new column called 'stars_squared' by squaring the stars column.

In [33]:
#Your code here
df['stars_squared'] = df['stars'].map(lambda x: x ** 2)
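
To see the squaring step in isolation, the sketch below runs it on a small made-up frame (`toy` is a hypothetical stand-in for the Yelp data). It also shows the vectorized form `toy['stars'] ** 2`, which gives the same values here and is often preferred for plain arithmetic.

```python
import pandas as pd

# Hypothetical toy frame standing in for the Yelp reviews.
toy = pd.DataFrame({'stars': [5, 1, 3]})

# Apply the lambda to each value in the column.
toy['stars_squared'] = toy['stars'].map(lambda x: x ** 2)

# Vectorized equivalent, with no Python-level function call per row.
vectorized = toy['stars'] ** 2

print(toy['stars_squared'].tolist())  # [25, 1, 9]
```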

## Dates
Select the month from the date string using a lambda function.

In [39]:
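# Your code here
# Characters at index 5 and 6 of each 'YYYY-MM-DD' string are the month.
# The result is not stored, and only the last expression in the cell is displayed.
df.date.map(lambda x : x[5:7]).head()
df.head(2)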

Unnamed: 0.1,Unnamed: 0,business_id,cool,date,funny,review_id,stars,text,useful,user_id,stars_squared
0,1,pomGBqfbxcqPv14c3XH-ZQ,0,2012-11-13,0,dDl8zu1vWPdKGihJrwQbpw,5,I love this place! My fiance And I go here atl...,0,msQe1u7Z_XuqjGoqhB0J5g,25
1,2,jtQARsP6P-LbkyjbO1qNGg,1,2014-10-23,1,LZp4UX5zK3e-c5ZGSeo3kA,1,Terrible. Dry corn bread. Rib tips were all fa...,3,msQe1u7Z_XuqjGoqhB0J5g,1
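
The month slice can be tried on a small made-up frame, as sketched below. The column name `month` and the frame `toy` are assumptions for illustration; the lab itself does not store the result. Note that slicing returns strings, so wrap the slice in `int(...)` if numeric months are needed.

```python
import pandas as pd

# Hypothetical dates in the same 'YYYY-MM-DD' string format as the lab data.
toy = pd.DataFrame({'date': ['2012-11-13', '2014-10-23']})

# Characters at index 5 and 6 hold the two-digit month.
toy['month'] = toy['date'].map(lambda x: x[5:7])

print(toy['month'].tolist())  # ['11', '10']
```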


## What is the average number of words for a Yelp review?
Do this with a single line of code!

In [41]:
# Your code here
df.text.map(lambda x : len(x.split())).mean()

77.06551724137931
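
The one-liner chains two steps: count the words in each review, then average the counts. A minimal sketch on made-up text (the frame `toy` and its contents are hypothetical) makes the intermediate result visible.

```python
import pandas as pd

# Hypothetical review snippets, not the real Yelp text.
toy = pd.DataFrame({'text': ['I love this place', 'Terrible', 'Dry corn bread']})

# str.split() with no arguments splits on any run of whitespace.
word_counts = toy['text'].map(lambda x: len(x.split()))
print(word_counts.tolist())  # [4, 1, 3]

# The mean of the per-review counts is the average review length in words.
average_words = word_counts.mean()
print(average_words)
```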

## Create a new column for the number of words in the review.

In [42]:
#Your code here
df['review_count'] = df.text.map(lambda x : len(x.split())) 
df.head(2)

Unnamed: 0.1,Unnamed: 0,business_id,cool,date,funny,review_id,stars,text,useful,user_id,stars_squared,review_count
0,1,pomGBqfbxcqPv14c3XH-ZQ,0,2012-11-13,0,dDl8zu1vWPdKGihJrwQbpw,5,I love this place! My fiance And I go here atl...,0,msQe1u7Z_XuqjGoqhB0J5g,25,58
1,2,jtQARsP6P-LbkyjbO1qNGg,1,2014-10-23,1,LZp4UX5zK3e-c5ZGSeo3kA,1,Terrible. Dry corn bread. Rib tips were all fa...,3,msQe1u7Z_XuqjGoqhB0J5g,1,30


## Rewrite the following as a lambda function. Create a new column 'Review_Length'

In [5]:
def rewrite_as_lambda(value):
    if len(value) < 50:
        return 'Short'
    elif len(value) < 80:
        return 'Medium'
    else:
        return 'Long'
#Hint: nest your if, else conditionals
#map(lambda x: 'Short' if len(x) < 50 else ('Medium' if len(x) < 80 else 'Long'))

In [56]:
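#Your code here
# Note: this version thresholds on word count (len(x.split())), whereas rewrite_as_lambda above uses character count (len(value))
df['Review_Length'] = df.text.map(lambda x: 'Short' if len(x.split()) < 50 else ('Medium' if len(x.split()) < 80 else 'Long'))
df.Review_Length.value_counts()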

Short     1287
Long       769
Medium     554
Name: Review_Length, dtype: int64
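
A direct translation of `rewrite_as_lambda` keeps its character-count thresholds and nests one conditional expression inside another. The sketch below checks the lambda against the named function on made-up strings; `review_length`, `as_lambda`, and `samples` are illustrative names only.

```python
def review_length(value):
    """Named version, same logic as rewrite_as_lambda in the lab."""
    if len(value) < 50:
        return 'Short'
    elif len(value) < 80:
        return 'Medium'
    else:
        return 'Long'

# Nested conditional expression: the inner one handles the elif/else branches.
as_lambda = lambda value: 'Short' if len(value) < 50 else ('Medium' if len(value) < 80 else 'Long')

# Hypothetical strings of 10, 60, and 100 characters.
samples = ['a' * 10, 'a' * 60, 'a' * 100]
print([as_lambda(s) for s in samples])  # ['Short', 'Medium', 'Long']
```

To bucket by word count instead, replace `len(value)` with `len(value.split())` in both places.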

## Level Up: Dates Advanced!
<img src="date_format_map.png" width=500>  

Overwrite the date column by reordering the month and day from YYYY-MM-DD to DD-MM-YYYY. Try to do this using a lambda function.

In [50]:
#Your code here
# The date column holds strings, so df['date'].dt.strftime('%d-%m-%Y') raises
# "AttributeError: Can only use .dt accessor with datetimelike values".
# Slicing the string inside a lambda avoids the datetime conversion.
df['date'] = df['date'].map(lambda x: '{}-{}-{}'.format(x[-2:], x[5:7], x[:4]))
df.head()
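
The `.dt` accessor only works on datetime-like columns, so it cannot be called on raw date strings. The sketch below compares two ways to reorder the date on a made-up frame (`toy`, `sliced`, and `parsed` are hypothetical names): slicing the string inside a lambda, and converting with `pd.to_datetime` before calling `.dt.strftime`.

```python
import pandas as pd

# Hypothetical dates in 'YYYY-MM-DD' string form.
toy = pd.DataFrame({'date': ['2012-11-13', '2014-10-23']})

# Option 1: slice the string and reassemble it as DD-MM-YYYY.
sliced = toy['date'].map(lambda x: '{}-{}-{}'.format(x[-2:], x[5:7], x[:4]))

# Option 2: parse to datetime first, after which the .dt accessor is available.
parsed = pd.to_datetime(toy['date'], format='%Y-%m-%d').dt.strftime('%d-%m-%Y')

print(sliced.tolist())  # ['13-11-2012', '23-10-2014']
```

Both options produce the same strings here. The parsing route also validates the dates, while slicing assumes every value already has the expected layout.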

## Summary

Great! Hopefully you're getting the hang of lambda functions now! It's important not to overuse them - it will often make more sense to define a named function so that it's reusable elsewhere. But whenever you need to quickly apply some simple processing to a collection of data, you have a new technique that will help you do just that. It'll also be useful if you're reading someone else's code that happens to use lambdas.
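
As one illustration of the reuse point, the sketch below defines a named `word_count` function once and passes it to two different built-ins. The function name and the sample `reviews` list are made up for this example.

```python
def word_count(text):
    """Return the number of whitespace-separated words in text."""
    return len(text.split())

# Hypothetical review snippets.
reviews = ['I love this place', 'Terrible']

# The same named function is reused as a mapper and as a sort key.
counts = list(map(word_count, reviews))
longest = max(reviews, key=word_count)

print(counts)   # [4, 1]
print(longest)  # I love this place
```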