# ⚡ Lesson 6: Apply(), Lambda, and Vectorization

Goal:
Write efficient Pandas code without loops.

Key Concepts:
- `.apply()` for column-wise or row-wise transformations
- `lambda` for quick inline logic
- Vectorized operations (fastest performance)


In [1]:
import pandas as pd

df = pd.DataFrame({
    'Name': ['Piyush', 'Sneha', 'Ravi', 'Amit'],
    'Age': [26, 25, 27, 24],
    'Salary': [55000, 60000, 72000, 48000]
})

df


Unnamed: 0,Name,Age,Salary
0,Piyush,26,55000
1,Sneha,25,60000
2,Ravi,27,72000
3,Amit,24,48000


Apply on a Single Column

In [2]:
df['Age_plus_5'] = df['Age'].apply(lambda x: x + 5)
df

Unnamed: 0,Name,Age,Salary,Age_plus_5
0,Piyush,26,55000,31
1,Sneha,25,60000,30
2,Ravi,27,72000,32
3,Amit,24,48000,29


Why Lambda Works Here

`Series.apply()` calls the function once for each element of the column, so the lambda receives a single value `x` and returns the transformed value.
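As a minimal sketch of that point, the lambda can be swapped for an ordinary named function without changing the result. The helper name `add_five` is made up for illustration.

```python
import pandas as pd

ages = pd.Series([26, 25, 27, 24], name='Age')

def add_five(x):
    # x is one element of the Series, not the whole column
    return x + 5

with_lambda = ages.apply(lambda x: x + 5)
with_named = ages.apply(add_five)

# Both calls run the same logic once per element
same = with_lambda.equals(with_named)
print(same)
```

A lambda is just an unnamed function, so it is a good fit when the logic is a single expression and will not be reused elsewhere.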

Apply on Multiple Columns (row-wise)

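Row-wise apply (`axis=1`) calls a Python function once per row, which is much slower than operating on whole columns. Use it only when the logic cannot be expressed column-wise.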

In [3]:
df['Salary_Category'] = df.apply(lambda row: 
                                 'High' if row['Salary'] > 60000 else 'Low',
                                 axis=1)
df


Unnamed: 0,Name,Age,Salary,Age_plus_5,Salary_Category
0,Piyush,26,55000,31,Low
1,Sneha,25,60000,30,Low
2,Ravi,27,72000,32,High
3,Amit,24,48000,29,Low
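The same two-way split can usually be written without a row-wise apply. The sketch below uses `np.where` as one possible vectorized alternative; it rebuilds a small copy of the data so it runs on its own.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Name': ['Piyush', 'Sneha', 'Ravi', 'Amit'],
    'Salary': [55000, 60000, 72000, 48000]
})

# Row-wise version from the lesson: one Python call per row
row_wise = df.apply(lambda row: 'High' if row['Salary'] > 60000 else 'Low', axis=1)

# Vectorized version: the comparison runs on the whole column at once
df['Salary_Category'] = np.where(df['Salary'] > 60000, 'High', 'Low')

print(df)
```

`np.where(condition, a, b)` picks `a` where the condition is true and `b` elsewhere, so it suits simple either/or rules like this one.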


Vectorization (fastest + cleanest)

Pandas uses NumPy under the hood, so arithmetic, comparisons, and helpers such as `pd.cut` operate on entire columns at once: no `.apply()`, no loops.

In [10]:
df['Age_Group'] = pd.cut(df['Age'], bins=[0,25,100], labels=['Junior','Senior'])

In [13]:
df['Salary_With_Bonus_Vectorized'] = df['Salary'] * 1.10
df


Unnamed: 0,Name,Age,Salary,Age_plus_5,Salary_Category,Age_Group,Salary_With_Bonus_Vectorized
0,Piyush,26,55000,31,Low,Senior,60500.0
1,Sneha,25,60000,30,Low,Junior,66000.0
2,Ravi,27,72000,32,High,Senior,79200.0
3,Amit,24,48000,29,Low,Junior,52800.0
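To check the speed claim on your own machine, a rough timing comparison can be sketched like this. The 100,000-row column is made-up data, and the numbers will vary with hardware and pandas version, so no timings are quoted here.

```python
import timeit

import numpy as np
import pandas as pd

# Hypothetical larger column so the difference is measurable
salary = pd.Series(np.arange(100_000, dtype='int64') + 40_000)

def with_apply():
    # One Python-level function call per element
    return salary.apply(lambda x: x * 1.10)

def vectorized():
    # A single operation on the whole column
    return salary * 1.10

# Both approaches should produce the same values
results_match = np.allclose(with_apply(), vectorized())

t_apply = timeit.timeit(with_apply, number=3)
t_vec = timeit.timeit(vectorized, number=3)
print(f"apply: {t_apply:.4f}s, vectorized: {t_vec:.4f}s, same result: {results_match}")
```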


Mini Task:
Compute Tax_15 = 15% of Salary (no apply(), only vectorized math)

In [14]:
df['Tax_15'] = df['Salary']*0.15
df

Unnamed: 0,Name,Age,Salary,Age_plus_5,Salary_Category,Age_Group,Salary_With_Bonus_Vectorized,Tax_15
0,Piyush,26,55000,31,Low,Senior,60500.0,8250.0
1,Sneha,25,60000,30,Low,Junior,66000.0,9000.0
2,Ravi,27,72000,32,High,Senior,79200.0,10800.0
3,Amit,24,48000,29,Low,Junior,52800.0,7200.0


Step 6 — Conditional Vectorization with .loc

You can do fast conditional updates like this:

In [18]:
df.loc[df['Age'] > 26, 'Level'] = 'Senior'
df.loc[df['Age'] <= 26, 'Level'] = 'Junior'
df


Unnamed: 0,Name,Age,Salary,Age_plus_5,Salary_Category,Age_Group,Salary_With_Bonus_Vectorized,Tax_15,Level
0,Piyush,26,55000,31,Low,Senior,60500.0,8250.0,Junior
1,Sneha,25,60000,30,Low,Junior,66000.0,9000.0,Junior
2,Ravi,27,72000,32,High,Senior,79200.0,10800.0,Senior
3,Amit,24,48000,29,Low,Junior,52800.0,7200.0,Junior


Mini Task:
Add a new column `Performance`:

- "Excellent" if Salary > 65000
- "Good" if Salary is between 58000 and 65000 (inclusive)
- "Average" otherwise

Do it all with `.loc` vectorization, no `apply()`.

In [20]:
df.loc[df['Salary'] > 65000, 'Performance'] = 'Excellent'
df.loc[(df['Salary'] >= 58000) & (df['Salary'] <= 65000), 'Performance'] = 'Good'
df.loc[df['Salary'] < 58000, 'Performance'] = 'Average'
df

Unnamed: 0,Name,Age,Salary,Age_plus_5,Salary_Category,Age_Group,Salary_With_Bonus_Vectorized,Tax_15,Level,Performance
0,Piyush,26,55000,31,Low,Senior,60500.0,8250.0,Junior,Average
1,Sneha,25,60000,30,Low,Junior,66000.0,9000.0,Junior,Good
2,Ravi,27,72000,32,High,Senior,79200.0,10800.0,Senior,Excellent
3,Amit,24,48000,29,Low,Junior,52800.0,7200.0,Junior,Average


Step 7 — map() for Entire DataFrame

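`DataFrame.map()` applies a function to every cell of the selected columns. It requires pandas 2.1 or later; older versions call the same method `.applymap()`: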

In [24]:
df[['Name', 'Salary_Category']] = df[['Name', 'Salary_Category']].map(lambda x: x.upper())
df

Unnamed: 0,Name,Age,Salary,Age_plus_5,Salary_Category,Age_Group,Salary_With_Bonus_Vectorized,Tax_15,Level,Performance
0,PIYUSH,26,55000,31,LOW,Senior,60500.0,8250.0,Junior,Average
1,SNEHA,25,60000,30,LOW,Junior,66000.0,9000.0,Junior,Good
2,RAVI,27,72000,32,HIGH,Senior,79200.0,10800.0,Senior,Excellent
3,AMIT,24,48000,29,LOW,Junior,52800.0,7200.0,Junior,Average


### ✅ Summary

- `.apply()` applies custom functions to Series or DataFrames.
- `lambda` allows compact, one-line anonymous functions.
- Vectorized operations are the fastest and most efficient, so prefer them whenever an equivalent exists.
- `.loc` helps with conditional updates.
- `DataFrame.map()` (called `.applymap()` before pandas 2.1) modifies every element in a DataFrame.
