# Lambda functions

- Simple inline functions for short operations.
- Usually anonymous (no name) and defined at the point of use.
- Can take any number of parameters.
- Can contain only a single expression, whose value is returned.
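The points above can be sketched in a few lines. The names `add`, `power`, `words`, and `by_length` are made up for this illustration and do not appear elsewhere in the notebook.

```python
# A lambda can take several parameters, including ones with defaults,
# but its body is always a single expression whose value is returned.
add = lambda a, b: a + b
power = lambda base, exp=2: base ** exp

print(add(2, 3))    # 5
print(power(4))     # 16
print(power(2, 5))  # 32

# Most often a lambda is passed inline without a name, e.g. as a sort key.
words = ["pear", "fig", "banana"]
by_length = sorted(words, key=lambda w: len(w))
print(by_length)    # ['fig', 'pear', 'banana']
```

Anything that needs statements (loops, assignments, several steps) is better written as a regular `def` function.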

In [1]:
double = lambda x: x*2

In [2]:
print(double(5))

10


## Filtering a list with a lambda function

In [3]:
my_list = [1, 5, 4, 6, 8, 11, 13, 17]
even = lambda x: (x%2 == 0)

The lambda returns `True` when `x` is divisible by 2 (the remainder is 0) and `False` otherwise, so it selects the even numbers.

In [4]:
new_list = list(filter(even, my_list))

`filter` calls the function on every element and keeps only the elements for which it returns a true value. True means **keep this element**, False means **drop this element**.

In [5]:
new_list

[4, 6, 8]
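For comparison, the same selection can be written as a list comprehension. This is only a sketch of an equivalent form; the names `evens_filter`, `evens_comp`, and `odds` are introduced here for illustration.

```python
my_list = [1, 5, 4, 6, 8, 11, 13, 17]

# filter() returns a lazy iterator, so list() is needed to materialize it.
evens_filter = list(filter(lambda x: x % 2 == 0, my_list))

# Equivalent list comprehension: keep x when the condition is True.
evens_comp = [x for x in my_list if x % 2 == 0]

# Flipping the condition keeps the odd numbers instead.
odds = [x for x in my_list if x % 2 != 0]

print(evens_filter)  # [4, 6, 8]
print(evens_comp)    # [4, 6, 8]
print(odds)          # [1, 5, 11, 13, 17]
```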

### Filter with an inline lambda

In [6]:
new_list = list(filter(lambda x: (x%2 == 0), my_list))
new_list

[4, 6, 8]

## Real-life example for machine learning

In [27]:
import pandas as pd
import numpy as np

Initial data:

In [28]:
data = {
    'mpg': [18.5, 12.6, 16.5, '?', 21.2],
    'disp': [307, 305, 213, 243, 333],
    'hp': [130.5, 141.2, 132.3, 121.8, 151.6],
    'wt': [3504, 3693, 3498, 3761, '?'],
    'acc': [12.0, 11.1, 13.2, 11.2, 12.2],
    'yr': [70, 71, 70, 72, 68],
    'origin': [1, 2, 1, 3, 1],
    'car_type': ['?', 0, 0, 0, 1],
    'car_name': ['ford', 'audi', 'opel', 'fiat', 'lux']
}

df1 = pd.DataFrame(data)
df1

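|   | mpg | disp | hp | wt | acc | yr | origin | car_type | car_name |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 18.5 | 307 | 130.5 | 3504 | 12.0 | 70 | 1 | ? | ford |
| 1 | 12.6 | 305 | 141.2 | 3693 | 11.1 | 71 | 2 | 0 | audi |
| 2 | 16.5 | 213 | 132.3 | 3498 | 13.2 | 70 | 1 | 0 | opel |
| 3 | ? | 243 | 121.8 | 3761 | 11.2 | 72 | 3 | 0 | fiat |
| 4 | 21.2 | 333 | 151.6 | ? | 12.2 | 68 | 1 | 1 | lux |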


Replace the `'?'` placeholders, which mark missing values, with NaN:

In [19]:
mpg_df = df1.replace('?', np.nan)
mpg_df

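|   | mpg | disp | hp | wt | acc | yr | origin | car_type | car_name |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 18.5 | 307 | 130.5 | 3504.0 | 12.0 | 70 | 1 | NaN | ford |
| 1 | 12.6 | 305 | 141.2 | 3693.0 | 11.1 | 71 | 2 | 0.0 | audi |
| 2 | 16.5 | 213 | 132.3 | 3498.0 | 13.2 | 70 | 1 | 0.0 | opel |
| 3 | NaN | 243 | 121.8 | 3761.0 | 11.2 | 72 | 3 | 0.0 | fiat |
| 4 | 21.2 | 333 | 151.6 | NaN | 12.2 | 68 | 1 | 1.0 | lux |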


In [20]:
mpg_df['hp'] = mpg_df['hp'].astype('float64')
mpg_df

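|   | mpg | disp | hp | wt | acc | yr | origin | car_type | car_name |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 18.5 | 307 | 130.5 | 3504.0 | 12.0 | 70 | 1 | NaN | ford |
| 1 | 12.6 | 305 | 141.2 | 3693.0 | 11.1 | 71 | 2 | 0.0 | audi |
| 2 | 16.5 | 213 | 132.3 | 3498.0 | 13.2 | 70 | 1 | 0.0 | opel |
| 3 | NaN | 243 | 121.8 | 3761.0 | 11.2 | 72 | 3 | 0.0 | fiat |
| 4 | 21.2 | 333 | 151.6 | NaN | 12.2 | 68 | 1 | 1.0 | lux |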
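Columns that contained `'?'` start out with dtype `object`, and whether `replace` converts them to floats may depend on the pandas version. A possible way to make the conversion explicit is `pd.to_numeric` with `errors="coerce"`, which turns anything it cannot parse into NaN. The sketch below uses a small hypothetical frame `demo` rather than the full table above.

```python
import pandas as pd

# Hypothetical two-column sample with '?' marking missing values.
demo = pd.DataFrame({
    "mpg": [18.5, 12.6, "?"],
    "wt": [3504, "?", 3498],
})

# Mixing numbers and strings gives both columns dtype object.
print(demo.dtypes)

# Convert each column; values that cannot be parsed ('?') become NaN.
cleaned = demo.apply(lambda col: pd.to_numeric(col, errors="coerce"))

print(cleaned.dtypes)        # both columns are now float64
print(cleaned.isna().sum())  # one missing value per column
```

Here the lambda receives one column at a time, the same pattern used for the median fill later in this notebook.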


Keep only the numeric columns by dropping `car_name`:

In [21]:
numeric_cols = mpg_df.drop('car_name', axis=1)
numeric_cols

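|   | mpg | disp | hp | wt | acc | yr | origin | car_type |
|---|---|---|---|---|---|---|---|---|
| 0 | 18.5 | 307 | 130.5 | 3504.0 | 12.0 | 70 | 1 | NaN |
| 1 | 12.6 | 305 | 141.2 | 3693.0 | 11.1 | 71 | 2 | 0.0 |
| 2 | 16.5 | 213 | 132.3 | 3498.0 | 13.2 | 70 | 1 | 0.0 |
| 3 | NaN | 243 | 121.8 | 3761.0 | 11.2 | 72 | 3 | 0.0 |
| 4 | 21.2 | 333 | 151.6 | NaN | 12.2 | 68 | 1 | 1.0 |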


**Using a lambda function to fill in the missing values**

In [23]:
numeric_cols.head()

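|   | mpg | disp | hp | wt | acc | yr | origin | car_type |
|---|---|---|---|---|---|---|---|---|
| 0 | 18.5 | 307 | 130.5 | 3504.0 | 12.0 | 70 | 1 | NaN |
| 1 | 12.6 | 305 | 141.2 | 3693.0 | 11.1 | 71 | 2 | 0.0 |
| 2 | 16.5 | 213 | 132.3 | 3498.0 | 13.2 | 70 | 1 | 0.0 |
| 3 | NaN | 243 | 121.8 | 3761.0 | 11.2 | 72 | 3 | 0.0 |
| 4 | 21.2 | 333 | 151.6 | NaN | 12.2 | 68 | 1 | 1.0 |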


In [25]:
filtered_data = numeric_cols.apply(lambda x: x.fillna(x.median()), axis=0)
filtered_data


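|   | mpg | disp | hp | wt | acc | yr | origin | car_type |
|---|---|---|---|---|---|---|---|---|
| 0 | 18.5 | 307 | 130.5 | 3504.0 | 12.0 | 70 | 1 | 0.0 |
| 1 | 12.6 | 305 | 141.2 | 3693.0 | 11.1 | 71 | 2 | 0.0 |
| 2 | 16.5 | 213 | 132.3 | 3498.0 | 13.2 | 70 | 1 | 0.0 |
| 3 | 17.5 | 243 | 121.8 | 3761.0 | 11.2 | 72 | 3 | 0.0 |
| 4 | 21.2 | 333 | 151.6 | 3598.5 | 12.2 | 68 | 1 | 1.0 |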


In [29]:
numeric_cols.mpg.median()

17.5

In [34]:
numeric_cols.wt.median()

3598.5

As you can see, the NaN values in the 'mpg' and 'wt' columns have been replaced by the medians of those columns ( _17.5_ and _3598.5_ ). The NaN in 'car_type' was likewise replaced by that column's median, _0.0_.
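The lambda form generalizes to any per-column rule, but for a plain median fill pandas can also do it without a lambda, because `fillna` accepts a Series of per-column fill values. The sketch below is a check of that equivalence on a hypothetical two-column frame `cols` that reuses the 'mpg' and 'wt' values from above.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with the 'mpg' and 'wt' values used above.
cols = pd.DataFrame({
    "mpg": [18.5, 12.6, 16.5, np.nan, 21.2],
    "wt": [3504, 3693, 3498, 3761, np.nan],
})

# Lambda version: apply() passes one column (a Series) at a time.
with_lambda = cols.apply(lambda x: x.fillna(x.median()), axis=0)

# Without a lambda: cols.median() is a Series indexed by column name,
# and fillna() matches each fill value to its column.
without_lambda = cols.fillna(cols.median())

print(with_lambda.equals(without_lambda))  # True
print(without_lambda.loc[3, "mpg"])        # 17.5
print(without_lambda.loc[4, "wt"])         # 3598.5
```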

In [33]:
# Save the filled data; numeric_cols itself still contains the NaN values.
filtered_data.to_csv('mpg_cars.csv', index=False)