# Lambda Functions - Lab

## Introduction

In this lab, you'll get some hands-on practice creating and using lambda functions.

## Objectives

In this lab you will:

* Create lambda functions to use as arguments of other functions   
* Use the `.map()` or `.apply()` method to apply a function to a pandas series or DataFrame

## Lambda Functions

In [1]:
import pandas as pd
df = pd.read_csv('Yelp_Reviews.csv', index_col=0)
df.head(2)

| Unnamed: 0 | business_id | cool | date | funny | review_id | stars | text | useful | user_id |
|---|---|---|---|---|---|---|---|---|---|
| 1 | pomGBqfbxcqPv14c3XH-ZQ | 0 | 2012-11-13 | 0 | dDl8zu1vWPdKGihJrwQbpw | 5 | I love this place! My fiance And I go here atl... | 0 | msQe1u7Z_XuqjGoqhB0J5g |
| 2 | jtQARsP6P-LbkyjbO1qNGg | 1 | 2014-10-23 | 1 | LZp4UX5zK3e-c5ZGSeo3kA | 1 | Terrible. Dry corn bread. Rib tips were all fa... | 3 | msQe1u7Z_XuqjGoqhB0J5g |


## Simple arithmetic

Use a lambda function to create a new column called `'stars_squared'` by squaring the stars column.

In [2]:
# Create the 'stars_squared' column
df['stars_squared'] = df['stars'].apply(lambda x: x**2)
print(df)



                 business_id  cool        date  funny               review_id  \
1     pomGBqfbxcqPv14c3XH-ZQ     0  2012-11-13      0  dDl8zu1vWPdKGihJrwQbpw   
2     jtQARsP6P-LbkyjbO1qNGg     1  2014-10-23      1  LZp4UX5zK3e-c5ZGSeo3kA   
4     Ums3gaP2qM3W1XcA5r6SsQ     0  2014-09-05      0  jsDu6QEJHbwP2Blom1PLCA   
5     vgfcTvK81oD4r50NMjU2Ag     0  2011-02-25      0  pfavA0hr3nyqO61oupj-lA   
10    yFumR3CWzpfvTH2FCthvVw     0  2016-06-15      0  STiFMww2z31siPY7BWNC2g   
...                      ...   ...         ...    ...                     ...   
689   BTcY04QFiS1uh-RpkR7rAg     1  2013-06-02      0  6_A58CCY8SHB7r-Wu7-A5g   
4874  t0T_4MM4EUHbCzBTF11FHA     0  2016-08-14      0  KqQwNyfoFiJOw911mrULIg   
564   5XYR6doRa5Nj1JMfSDei6A     1  2016-06-14      0  xlGJkxoIBl8XH8wVsPZpnw   
3458  aLcFhMe6DDJ430zelCpd2A     0  2013-10-02      0  kwiEG_KCpDB6aK5fTSM7iw   
4206  WdBWhGe4Siqg3IYTc4_K4A     0  2016-08-15      0  O0ttxNGxHKtD8Cnnwc_j1g   

      stars                
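As a minimal sketch of the same pattern, the snippet below squares a small made-up `stars` Series (the values are illustrative, not read from `Yelp_Reviews.csv`) with `.apply()` and a lambda, then compares the result with the vectorized `** 2` operator.

```python
import pandas as pd

# Hypothetical ratings, used only for illustration
stars = pd.Series([5, 1, 4], name="stars")

# Element-wise squaring with a lambda passed to .apply()
squared_apply = stars.apply(lambda x: x ** 2)

# Vectorized equivalent that needs no lambda
squared_vectorized = stars ** 2

print(squared_apply.tolist())
print(squared_apply.equals(squared_vectorized))
```

For plain arithmetic like this, the vectorized form is usually the simpler choice; the lambda version is shown because it generalizes to logic that has no built-in operator.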

## Dates
Select the month from the date string using a lambda function.

In [15]:
# Lambda function to extract the month from a 'YYYY-MM-DD' string
from datetime import datetime

extract_month = lambda date_str: datetime.strptime(date_str, '%Y-%m-%d').month
extract_month

<function __main__.<lambda>(date_str)>
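The cell above only defines the lambda. One way to put it to work, sketched here on two date strings in the same `YYYY-MM-DD` format as the `'date'` column (the `dates` and `months` names are made up for this example), is to pass it to `.apply()`:

```python
from datetime import datetime

import pandas as pd

# Two sample date strings in YYYY-MM-DD format
dates = pd.Series(["2012-11-13", "2014-10-23"])

# Parse each string and keep only the month number
months = dates.apply(lambda d: datetime.strptime(d, "%Y-%m-%d").month)

print(months.tolist())  # [11, 10]
```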

## What is the average number of words for a yelp review?
Do this with a single line of code.

In [18]:
reviews = ["Great service!"]
average_words = sum(len(review.split()) for review in reviews) / len(reviews) if reviews else 0
average_words


2.0
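The same idea can be sketched against a pandas Series instead of a plain list. The snippet below assumes a column of review text like `df['text']`, but uses three made-up reviews so that it runs on its own; the `texts` name is hypothetical.

```python
import pandas as pd

# Made-up review texts standing in for df['text']
texts = pd.Series(["Great service!", "Terrible. Dry corn bread.", "I love it"])

# Single expression: count the words in each review, then average the counts
average_words = texts.apply(lambda review: len(review.split())).mean()

print(average_words)  # 3.0
```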

## Create a new column for the number of words in the review

In [26]:
# Add a new column 'word_count' with the number of words in each review
df['word_count'] = df['review_id'].apply(lambda x: len(x.split()))
print(df['word_count'])

1       1
2       1
4       1
5       1
10      1
       ..
689     1
4874    1
564     1
3458    1
4206    1
Name: word_count, Length: 2610, dtype: int64
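Note that `review_id` values contain no whitespace, so splitting them always yields a single token, which is why every count in the output above is 1. As a hedged sketch, assuming the goal is to count the words in the review `text` column, the toy frame below contrasts the two. The two-row frame, its `text` values, and the `id_tokens` column name are made up for illustration.

```python
import pandas as pd

# Hypothetical two-row frame reusing the Yelp column names
toy = pd.DataFrame({
    "review_id": ["dDl8zu1vWPdKGihJrwQbpw", "LZp4UX5zK3e-c5ZGSeo3kA"],
    "text": ["I love this place!", "Terrible. Dry corn bread."],
})

# IDs have no spaces, so splitting them always gives one token
toy["id_tokens"] = toy["review_id"].apply(lambda x: len(x.split()))

# Splitting the review text gives the actual word count
toy["word_count"] = toy["text"].apply(lambda x: len(x.split()))

print(toy[["id_tokens", "word_count"]])
```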


In [25]:
print(df.columns)


Index(['business_id', 'cool', 'date', 'funny', 'review_id', 'stars', 'text',
       'useful', 'user_id', 'stars_squared', 'Review_length'],
      dtype='object')


## Rewrite the following as a lambda function

Create a new column `'Review_length'` by applying this lambda function to the `'text'` column, so that `len(value)` measures the length of each review in characters.

In [9]:
# Rewrite the following function as a lambda function
def rewrite_as_lambda(value):
    if len(value) < 50:
        return 'Short'
    elif len(value) < 80:
        return 'Medium'
    else:
        return 'Long'
# Hint: nest your if, else conditionals

df['Review_length'] = None


In [27]:
# Classify each review by character length using nested conditional expressions
df['Review_length'] = df['text'].apply(lambda value: 'Short'
                                         if len(value) < 50 
                                         else ('Medium' if len(value) < 80 else 'Long'))
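To check that a nested conditional expression behaves like the original `if`/`elif`/`else` function, you can compare the two on a few strings of known length. This is a small sketch; the `classify` name and the sample strings are made up for the comparison.

```python
def rewrite_as_lambda(value):
    if len(value) < 50:
        return 'Short'
    elif len(value) < 80:
        return 'Medium'
    else:
        return 'Long'

# Lambda version: the inner conditional plays the role of the elif/else branches
classify = lambda value: 'Short' if len(value) < 50 else ('Medium' if len(value) < 80 else 'Long')

# Strings whose lengths sit on either side of both thresholds
samples = ["a" * 49, "a" * 50, "a" * 79, "a" * 80]

labels = [classify(s) for s in samples]
print(labels)  # ['Short', 'Medium', 'Medium', 'Long']

# Both versions agree on every sample
print(all(classify(s) == rewrite_as_lambda(s) for s in samples))  # True
```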


## Level Up: Dates Advanced
<img src="images/world_map.png" width="600">  

Print the first five rows of the `'date'` column. 

In [10]:
# Print the first five rows of the 'date' column
print(df['date'].head())


1     2012-11-13
2     2014-10-23
4     2014-09-05
5     2011-02-25
10    2016-06-15
Name: date, dtype: object


Overwrite the `'date'` column by reordering the month and day from `YYYY-MM-DD` to `DD-MM-YYYY`. Try to do this using a lambda function.

In [12]:
# Reverse the 'YYYY-MM-DD' parts to get 'DD-MM-YYYY'
df['date'] = df['date'].apply(lambda x: '-'.join(x.split('-')[::-1]))
df['date']


1       13-11-2012
2       23-10-2014
4       05-09-2014
5       25-02-2011
10      15-06-2016
           ...    
689     02-06-2013
4874    14-08-2016
564     14-06-2016
3458    02-10-2013
4206    15-08-2016
Name: date, Length: 2610, dtype: object
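Slice-based reordering is easy to get subtly wrong, so it can help to try the lambda on plain strings before overwriting a column. The sketch below (all names are made up for illustration) shows that reversing the split parts gives `DD-MM-YYYY`, while moving the year to the end gives `MM-DD-YYYY`, and cross-checks the first result against `datetime`.

```python
from datetime import datetime

dates = ["2012-11-13", "2014-10-23"]

# Reverse ['YYYY', 'MM', 'DD'] to get DD-MM-YYYY
to_dd_mm_yyyy = lambda d: "-".join(d.split("-")[::-1])

# Move the year to the end to get MM-DD-YYYY
to_mm_dd_yyyy = lambda d: "-".join(d.split("-")[1:] + d.split("-")[:1])

# Cross-check using real date parsing and formatting
via_datetime = lambda d: datetime.strptime(d, "%Y-%m-%d").strftime("%d-%m-%Y")

dd_mm = [to_dd_mm_yyyy(d) for d in dates]
mm_dd = [to_mm_dd_yyyy(d) for d in dates]

print(dd_mm)  # ['13-11-2012', '23-10-2014']
print(mm_dd)  # ['11-13-2012', '10-23-2014']
print(dd_mm == [via_datetime(d) for d in dates])  # True
```

Because the lambda overwrites the column in place, running it a second time would reorder the already reordered strings; testing on a small list first avoids that surprise.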

## Summary

Hopefully, you're getting the hang of lambda functions now! It's important not to overuse them: it often makes more sense to define a named function so that it's reusable elsewhere. But whenever you need to quickly apply some simple processing to a collection of data, you now have a technique for doing just that. Lambdas are also worth knowing for when you read someone else's code that happens to use them.
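As a closing sketch of that trade-off, the example below uses a lambda for a one-off sort key and a named function for logic that is needed in more than one place. The `word_count` helper and the sample reviews are made up for illustration.

```python
def word_count(text):
    """Return the number of whitespace-separated words in text."""
    return len(text.split())

reviews = ["Terrible. Dry corn bread.", "Great service!"]

# One-off use: a lambda passed directly as the sort key
by_length = sorted(reviews, key=lambda r: len(r))

# Reusable logic: the named function serves two different calls
counts = list(map(word_count, reviews))
longest = max(reviews, key=word_count)

print(by_length)  # ['Great service!', 'Terrible. Dry corn bread.']
print(counts)     # [4, 2]
print(longest)    # Terrible. Dry corn bread.
```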