# Lambda Functions - Lab

## Introduction

In this lab, you'll get some hands-on practice creating and using lambda functions.

## Objectives
You will be able to:
* Understand what lambda functions are and why they are useful
* Use lambda functions to transform data within lists and DataFrames
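
As a quick orientation for the objectives above, a lambda is an anonymous function limited to a single expression. The sketch below uses made-up lists (`numbers` and `words` are illustrative names, not part of the lab data) to show a lambda transforming and sorting plain Python lists.

```python
# A lambda is an anonymous, single-expression function.
# This is equivalent to: def square(x): return x ** 2
square = lambda x: x ** 2

numbers = [1, 2, 3, 4]             # hypothetical sample data
squared = list(map(lambda x: x ** 2, numbers))

words = ["pear", "fig", "banana"]  # hypothetical sample data
by_length = sorted(words, key=lambda w: len(w))

print(squared)
print(by_length)
```

Lambdas are most convenient when the function is short and used only once, for example as the `key` argument to `sorted` or as the argument to `map`.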

## Lambda Functions

In [3]:
import pandas as pd
df = pd.read_csv('Yelp_Reviews.csv')
df.head(2)

| Unnamed: 0.1 | Unnamed: 0 | business_id | cool | date | funny | review_id | stars | text | useful | user_id |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | pomGBqfbxcqPv14c3XH-ZQ | 0 | 2012-11-13 | 0 | dDl8zu1vWPdKGihJrwQbpw | 5 | I love this place! My fiance And I go here atl... | 0 | msQe1u7Z_XuqjGoqhB0J5g |
| 1 | 2 | jtQARsP6P-LbkyjbO1qNGg | 1 | 2014-10-23 | 1 | LZp4UX5zK3e-c5ZGSeo3kA | 1 | Terrible. Dry corn bread. Rib tips were all fa... | 3 | msQe1u7Z_XuqjGoqhB0J5g |


In [4]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 2610 entries, 0 to 2609
Data columns (total 10 columns):
Unnamed: 0     2610 non-null int64
business_id    2610 non-null object
cool           2610 non-null int64
date           2610 non-null object
funny          2610 non-null int64
review_id      2610 non-null object
stars          2610 non-null int64
text           2610 non-null object
useful         2610 non-null int64
user_id        2610 non-null object
dtypes: int64(5), object(5)
memory usage: 204.0+ KB


## Simple Arithmetic

Use a lambda function to create a new column called 'stars_squared' by squaring the stars column.

In [5]:
# df['stars']

In [6]:
# def square_col_vals(col):
#     for idx in range(len(col)): 
#         col[idx] = col[idx] ** 2
#     return col 

In [7]:
# new_squared_col = square_col_vals(df['stars'])

In [8]:
# stars_squared = df['stars']**2


In [9]:
# Assign the result to a new column, as the exercise asks
df['stars_squared'] = df['stars'].apply(lambda x: x ** 2)
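
The squaring step can be tried without the Yelp file. The sketch below is a minimal illustration on a made-up three-row DataFrame (`sample` is a hypothetical stand-in for `df`) and also shows the vectorized form, which gives the same values.

```python
import pandas as pd

# Hypothetical stand-in for the Yelp DataFrame
sample = pd.DataFrame({"stars": [5, 1, 3]})

# Element-wise lambda, stored as a new column
sample["stars_squared"] = sample["stars"].apply(lambda x: x ** 2)

# Vectorized equivalent: pandas applies ** to the whole column at once
vectorized = sample["stars"] ** 2

print(sample)
```

For simple arithmetic like this, the vectorized form is usually faster on large columns. The lambda version is used here because practicing it is the point of the exercise.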

## Dates
Select the month from the date string using a lambda function.

In [10]:
df['date'] = pd.to_datetime(df['date'])
month_date = df['date'].apply(lambda x: x.month)
month_date

0       11
1       10
2        9
3        2
4        6
5        9
6        8
7        8
8       11
9       11
10      12
11       2
12       2
13       2
14       2
15       6
16      11
17      11
18      11
19       1
20      11
21       5
22       5
23      10
24       4
25       4
26      11
27       4
28       3
29       2
        ..
2580     9
2581     7
2582     7
2583    10
2584     6
2585    10
2586     6
2587     8
2588     7
2589     3
2590    10
2591     7
2592     8
2593     8
2594     8
2595     2
2596     7
2597     8
2598     8
2599     1
2600     6
2601     1
2602     5
2603     3
2604    11
2605     6
2606     8
2607     6
2608    10
2609     8
Name: date, Length: 2610, dtype: int64
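
As a small self-contained check of the month extraction above, the sketch below uses the two dates shown in `df.head(2)` on a hypothetical `sample` DataFrame. It also shows the `.dt.month` accessor, which should give the same result without a lambda.

```python
import pandas as pd

# Two dates taken from the first rows of the lab data
sample = pd.DataFrame({"date": ["2012-11-13", "2014-10-23"]})

# Convert strings to Timestamps so that .month is available
sample["date"] = pd.to_datetime(sample["date"])

# Lambda version: called once per Timestamp
month_via_lambda = sample["date"].apply(lambda x: x.month)

# Accessor version: operates on the whole column
month_via_dt = sample["date"].dt.month

print(month_via_lambda.tolist())
```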

## What is the average number of words for a Yelp review?
Do this with a single line of code!

In [78]:
df['text']

0       I love this place! My fiance And I go here atl...
1       Terrible. Dry corn bread. Rib tips were all fa...
2       Delicious healthy food. The steak is amazing. ...
3       This place sucks. The customer service is horr...
4       I have been an Emerald Club member for a numbe...
5       The score should be negative. Its HORRIBLE. Th...
6       I went there twice and I am pretty happy with ...
7       Finally! After trying many Mexican restaurants...
8       I have to write a review on the Fractured Prun...
9       I wish i could tell you all about the food but...
10      Wonderful! One of my favorite places. BBQ is e...
11      The price is about the same as Quizinos. The s...
12      The food was ok. You get a lot of food for the...
13      The staff were friendly and helpful. The food ...
14      The staff were very helpful. We painted potter...
15      Limited vegetarian options. Ordered Greek sala...
16      I trust my vehicles here. Comparable prices. M...
17      This p

In [92]:
# df['text'].describe()
df['text'].apply(lambda x: len(x.split())).mean()


# number_of_words = df['text'].map(lambda x: len(x.split())).describe()
# number_of_words



77.06551724137931
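
To see how the one-liner works, the sketch below runs it on three short made-up reviews (the `sample` DataFrame and its texts are illustrative assumptions, not the lab data). Here `x.split()` breaks each review on whitespace, `len` counts the pieces, and `.mean()` averages the counts.

```python
import pandas as pd

# Hypothetical mini-dataset standing in for the Yelp reviews
sample = pd.DataFrame({
    "text": [
        "I love this place!",
        "Terrible. Dry corn bread.",
        "Delicious healthy food.",
    ]
})

# Count whitespace-separated words per review, then average
word_counts = sample["text"].apply(lambda x: len(x.split()))
average_words = word_counts.mean()

# The same counts via pandas string methods, without a lambda
counts_via_str = sample["text"].str.split().str.len()

print(word_counts.tolist(), average_words)
```

Note that splitting on whitespace treats punctuation attached to a word as part of that word, so this is an approximate word count.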

## Create a new column for the number of words in the review.

In [95]:
df['number_of_words'] = df['text'].apply(lambda x: len(x.split()))
# df

## Rewrite the following as a lambda function. Create a new column 'Review_Length'

In [13]:
def rewrite_as_lambda(value):
    if len(value) < 50:
        return 'Short'
    elif len(value) < 80:
        return 'Medium'
    else:
        return 'Long'


In [16]:
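# Lambda equivalent of rewrite_as_lambda, using a chained conditional expression
df['Review_Length'] = df['text'].apply(lambda value: 'Short' if len(value) < 50 else ('Medium' if len(value) < 80 else 'Long'))
# df['Review_Length']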
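
One way to fold the `if`/`elif`/`else` above into a lambda is a chained conditional expression. The sketch below is an illustration on made-up strings (`classify` and `sample` are hypothetical names) and checks the lambda against the original function at the boundary lengths.

```python
import pandas as pd

# Original function from the lab, kept for comparison
def rewrite_as_lambda(value):
    if len(value) < 50:
        return 'Short'
    elif len(value) < 80:
        return 'Medium'
    else:
        return 'Long'

# Lambda version: conditional expressions are evaluated left to right
classify = lambda value: 'Short' if len(value) < 50 else ('Medium' if len(value) < 80 else 'Long')

# Hypothetical texts of 10, 60, and 100 characters
sample = pd.DataFrame({"text": ["a" * 10, "b" * 60, "c" * 100]})
sample["Review_Length"] = sample["text"].apply(classify)

# Both versions should agree at and around the thresholds
boundary_lengths = [0, 49, 50, 79, 80]
agree = all(classify("x" * n) == rewrite_as_lambda("x" * n) for n in boundary_lengths)

print(sample["Review_Length"].tolist(), agree)
```

Note that `len(value)` counts characters, not words, so these thresholds are in characters.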

## Level Up: Dates Advanced!
<img src="images/date_format_map.png" width="600">  

Overwrite the date column by reordering the month and day from YYYY-MM-DD to DD-MM-YYYY. Try to do this using a lambda function.

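Note that once the date column has been converted with `pd.to_datetime`, each value is a `Timestamp` rather than a string or list, so trying to reverse it directly fails:

AttributeError: 'Timestamp' object has no attribute 'reverse'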
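
The reordering task above can be approached in at least two ways, depending on whether the column holds `Timestamp` objects or plain strings. The sketch below is a minimal illustration on a hypothetical `sample` DataFrame using the two dates from the first rows of the lab data: `strftime` for Timestamps, and split, reverse, and join for strings.

```python
import pandas as pd

raw_dates = ["2012-11-13", "2014-10-23"]

# Case 1: the column holds Timestamps (after pd.to_datetime)
sample = pd.DataFrame({"date": pd.to_datetime(raw_dates)})
from_timestamps = sample["date"].apply(lambda x: x.strftime("%d-%m-%Y"))

# Case 2: the column still holds YYYY-MM-DD strings
sample_str = pd.DataFrame({"date": raw_dates})
from_strings = sample_str["date"].apply(lambda s: "-".join(s.split("-")[::-1]))

print(from_timestamps.tolist())
print(from_strings.tolist())
```

Either way the result is a column of strings, so any later date arithmetic would need the column converted back with `pd.to_datetime` and an explicit format.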

## Summary

Great! Hopefully you're getting the hang of lambda functions now. It's important not to overuse them: it will often make more sense to define a named function so that it's reusable elsewhere. But whenever you need to quickly apply some simple processing to a collection of data, you now have a technique for doing just that. Knowing lambdas will also help when you read someone else's code that happens to use them.