<div style="color:white;
           display:fill;
           border-radius:5px;
           background-color:#5642C5;
           font-size:200%;
           font-family:Arial;letter-spacing:0.5px">

<p width = 20%, style="padding: 10px;
              color:white;">
Lambda Functions and Pandas Transformations
              
</p>
</div>

Data Science Cohort Live NYC May 2022
<p>Phase 1: Topic 5</p>
<br>
<br>

<div align = "right">
<img src="Images/flatiron-school-logo.png" align = "right" width="200"/>
</div>
    
    

#### Lambda Functions

- Lambda functions: a simple way to write small, single-use functions.
- Often passed as an argument to other functions:
    - E.g., the `.map()` or `.apply()` method of a pandas Series/DataFrame
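As a minimal sketch of what "small, single-use function" means, the block below compares a named function with an equivalent lambda expression. The names `square_def`, `square_lambda`, and `numbers` are made up for illustration.

```python
# A named function and an equivalent lambda expression.
def square_def(x):
    return x ** 2

# Bound to a name here only so the two can be compared side by side;
# in practice a lambda is usually written inline where it is needed.
square_lambda = lambda x: x ** 2

# Typical use: pass the lambda directly as an argument to another function.
numbers = [3, 1, 2]
squares = list(map(lambda x: x ** 2, numbers))
print(squares)  # [9, 1, 4]
```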

Let's see lambda functions aiding us in a sort operation:

#### Lambda functions within the `sorted()` function
Sort this list by last name.


In [None]:
# Without a key
names = ['Miriam Marks','Sidney Baird','Elaine Barrera','Eddie Reeves','Marley Beard',
         'Jaiden Liu','Bethany Martin','Stephen Rios','Audrey Mayer','Kameron Davidson',
         'Carter Wong','Teagan Bennett']
sorted(names)

Hmmm... it's sorting on the whole string, so the first name decides the order.
- Pass a lambda function as the `key` argument: return the last name as the sorting key

In [None]:
# Sorting by last name
names = ['Miriam Marks','Sidney Baird','Elaine Barrera','Eddie Reeves','Marley Beard',
         'Jaiden Liu','Bethany Martin','Stephen Rios','Audrey Mayer','Kameron Davidson',
         'Carter Wong','Teagan Bennett']
# split() breaks each name on whitespace; [-1] picks the final word (the last name)
sorted(names, key=lambda x: x.split()[-1])
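The `key` function is called once per element, and the values it returns are what get compared. The sketch below uses a short made-up list (`people` is hypothetical, not part of the lesson data) to show two details: Python's sort is stable, so ties keep their input order, and a tuple key can break ties explicitly.

```python
# Hypothetical names, chosen so that two entries share a last name.
people = ['Mary Lovelace', 'Grace Brewster Hopper', 'Alan Turing', 'Ada Lovelace']

# Key = last word of the name. The two Lovelaces tie, so they keep
# their original relative order (Mary before Ada).
by_last = sorted(people, key=lambda name: name.split()[-1])

# Key = (last name, first name). The tuple breaks the tie on first name.
by_last_then_first = sorted(
    people, key=lambda name: (name.split()[-1], name.split()[0])
)

print(by_last)
print(by_last_then_first)
```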


#### Lambda functions with pandas `.map()`
Let's take a look at using lambda expressions on a Yelp ratings dataset.

In [None]:
import pandas as pd
df = pd.read_csv('data/Yelp_Reviews.csv', index_col = 0).reset_index()
df.head(5)

Simple example: naively select the year from the date string rather than convert it to a datetime object.

In [None]:
df.date.map(lambda x: x[:4]).head()
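For comparison, here is a small sketch of the naive string slice next to the datetime route. The `dates` Series is invented and assumes a `YYYY-MM-DD` string format; the actual Yelp file may differ.

```python
import pandas as pd

# Hypothetical date strings (assumed format, not read from the Yelp file).
dates = pd.Series(['2012-11-13', '2014-10-23', '2011-04-05'])

# Naive approach: slice the first four characters. The result is still a string.
year_str = dates.map(lambda x: x[:4])

# Datetime approach: parse the strings, then read the year as an integer.
year_int = pd.to_datetime(dates).dt.year

print(year_str.tolist())  # ['2012', '2014', '2011']
print(year_int.tolist())  # [2012, 2014, 2011]
```

The slice only works while every string starts with a four-digit year; parsing is slower but does not depend on character positions.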

More realistic example:
- Get list of the length of each word in a given review.

In [None]:
df['text'][0]

In [None]:
df['text'].map(lambda text: [len(word) for word in text.split()]).head()

The variable name you use as the parameter in a `lambda` expression does not matter:

In [None]:
df['text'].map(lambda banana: [len(word) for word in banana.split()]).head()
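To check this without the Yelp file, the sketch below runs the same two lambdas on a tiny made-up Series (`reviews` is hypothetical) and confirms that the results match.

```python
import pandas as pd

# Two invented review strings standing in for the 'text' column.
reviews = pd.Series(['Great food', 'Service was slow'])

# Same logic, different parameter names.
word_lengths = reviews.map(lambda text: [len(word) for word in text.split()])
same_result = reviews.map(lambda banana: [len(word) for word in banana.split()])

print(word_lengths.tolist())  # [[5, 4], [7, 3, 4]]
print(word_lengths.tolist() == same_result.tolist())  # True
```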

#### Lambda functions with conditionals
Lambda functions can also include a conditional expression (`a if condition else b`), here combined with a list comprehension:

In [None]:
df['text'].map(lambda x: 'Good' if any([word in x.lower() for word in ['awesome', 'love', 'good', 'great']]) else 'Bad').head()

##### Note
This is ugly, un-Pythonic, and not in line with [PEP 8](https://www.python.org/dev/peps/pep-0008/).
- Guideline for max characters in a line: 79 (72 for comments and docstrings)
- Above: 127 characters.
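One possible way to tidy this up is to move the logic into a named function and pass that to `.map()`. This is a sketch rather than the only option; `POSITIVE_WORDS`, `label_review`, and the sample `reviews` Series are hypothetical names introduced here.

```python
import pandas as pd

# Same word list as in the lambda version.
POSITIVE_WORDS = ('awesome', 'love', 'good', 'great')


def label_review(text):
    """Return 'Good' if the review contains any positive word, else 'Bad'."""
    lowered = text.lower()
    # Substring match, as in the lambda version (so 'goodbye' also counts).
    if any(word in lowered for word in POSITIVE_WORDS):
        return 'Good'
    return 'Bad'


# Invented reviews standing in for df['text'].
reviews = pd.Series(['I LOVE this place', 'Cold fries and slow service'])
labels = reviews.map(label_review)
print(labels.tolist())  # ['Good', 'Bad']
```

Every line now fits within 79 characters, and the function can be tested and reused on its own.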

#### Lambda functions with pandas `.apply()`

Let's go back to our trusty cereal dataset!

In [None]:
cereal_df = pd.read_csv('Data/cereal.csv', index_col = 'name').drop(columns = ['shelf'])
cereal_df.head(2)

Now we want to apply a standardization transformation to the numeric columns of this DataFrame:
- For each column, subtract its mean and divide by its standard deviation: $$ \hat{x}_i^{col} = \frac{x_i^{col} - \mu^{col} }{s^{col}} $$

- The `lambda` expression takes in a column (Series) of the DataFrame
- `.apply()` with `axis = 0` applies it to each column in the DataFrame.


In [None]:
cereal_df.loc[:, 'calories':'rating'].apply(lambda col: (col - col.mean())/col.std(ddof = 1), axis = 0)
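A quick sanity check of the same idea on a toy DataFrame: after standardizing, each column should have mean 0 and sample standard deviation 1. The values in `toy_df` are invented for illustration and do not come from `cereal.csv`.

```python
import pandas as pd

# Small invented table standing in for the numeric cereal columns.
toy_df = pd.DataFrame({'calories': [70, 120, 110], 'sugars': [6, 8, 10]})

# Column-wise standardization with a lambda, as in the cell above.
standardized = toy_df.apply(
    lambda col: (col - col.mean()) / col.std(ddof=1), axis=0
)

# Equivalent vectorized form: pandas aligns the column means and
# standard deviations with the DataFrame's columns.
vectorized = (toy_df - toy_df.mean()) / toy_df.std(ddof=1)

print(standardized.mean().round(10))       # approximately 0 for each column
print(standardized.std(ddof=1).round(10))  # approximately 1 for each column
```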

This is a very important kind of transformation. We'll see it later in greater detail.

#### When to use lambda functions

- Single line of code
- Single use function
- Relatively easy to read.

#### When not to use lambda functions

- Several lines of code in the lambda expression.
- Multiple conditions, loops, etc. in the function.
- You want to reuse the function often.

If it's hard for you to read, it's even harder for anyone else.
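As an illustration of the "multiple conditions" case, the sketch below uses a named function where a lambda would need nested conditionals. The `star_bucket` function, its thresholds, and the `stars` Series are all hypothetical.

```python
import pandas as pd


def star_bucket(stars):
    """Map a numeric star rating to a coarse label (thresholds are arbitrary)."""
    if stars >= 4:
        return 'high'
    if stars >= 3:
        return 'medium'
    return 'low'


# Invented ratings for demonstration.
stars = pd.Series([5, 3, 1, 4])
buckets = stars.map(star_bucket)
print(buckets.tolist())  # ['high', 'medium', 'low', 'high']

# The lambda equivalent works but is harder to read and cannot be reused:
# stars.map(lambda s: 'high' if s >= 4 else ('medium' if s >= 3 else 'low'))
```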
