# **📘 Day 7 – Applying Functions (apply, applymap, map) 💡**

#### **Goal:** Master the three primary methods for applying functions in pandas: apply(), applymap(), and map(). They run custom or built-in functions element-wise, across columns, or across rows to transform data.

#### **Topics To Cover:** Series.map(), DataFrame.apply() (Row-wise & Column-wise), DataFrame.applymap(), and using lambda functions for efficient, in-line transformations.

----

## **Introduction to Vectorized Operations vs. Functions 🚀**
Pandas and NumPy are highly optimized for vectorized operations (like df['A'] + df['B']), which are fast. However, not every data transformation can be solved with a simple built-in operation.

The Applying Functions family (map, apply, applymap) provides the necessary flexibility to execute arbitrary functions (whether custom functions defined with def or anonymous lambda functions) across your data, bridging the gap between simple vectorized speed and complex business logic.
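
As a minimal sketch of that contrast, the snippet below computes the same column sum twice, once vectorized and once through `apply()`. The toy DataFrame and its column names `A` and `B` are assumptions chosen to match the `df['A'] + df['B']` example above; they are not part of the BMW dataset.

```python
import pandas as pd

# Hypothetical toy data; A and B mirror the df['A'] + df['B'] example in the text.
toy = pd.DataFrame({"A": [1, 2, 3], "B": [10, 20, 30]})

# Vectorized: a single operation over whole columns.
vectorized = toy["A"] + toy["B"]

# Function application: Python calls the lambda once per row.
applied = toy.apply(lambda row: row["A"] + row["B"], axis=1)

# Both approaches produce the same values.
print((vectorized == applied).all())
```

When a vectorized form exists, it is usually the faster choice on large data; the function-based form earns its place when the logic cannot be written as a column operation.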

### **Why are Function Applications Important? 🔑**
* **Custom Logic:** They allow you to integrate custom, complex rules that can't be achieved with standard aggregation (e.g., classifying text, calculating a derived metric based on multiple columns).

* **Flexibility:** They apply operations at two levels:

  * **Element-wise:** To every single value (like applymap(), or map() on a Series).

  * **Series-wise:** To a whole column or row at a time (like apply() on a DataFrame, which passes each column or row to the function as a Series).

* **Code Clarity:** Using lambda functions with these methods often leads to concise and readable data transformation code.


----

## Let's Begin 🚀

In [1]:
# Import necessary libraries
import pandas as pd
import numpy as np

# Load the data (read_csv already returns a DataFrame, so no extra conversion is needed)
df = pd.read_csv(r'..\data\BMW sales data (2010-2024) (1).csv')
df

Unnamed: 0,Model,Year,Region,Color,Fuel_Type,Transmission,Engine_Size_L,Mileage_KM,Price_USD,Sales_Volume,Sales_Classification
0,5 Series,2016,Asia,Red,Petrol,Manual,3.5,151748,98740,8300,High
1,i8,2013,North America,Red,Hybrid,Automatic,1.6,121671,79219,3428,Low
2,5 Series,2022,North America,Blue,Petrol,Automatic,4.5,10991,113265,6994,Low
3,X3,2024,Middle East,Blue,Petrol,Automatic,1.7,27255,60971,4047,Low
4,7 Series,2020,South America,Black,Diesel,Manual,2.1,122131,49898,3080,Low
...,...,...,...,...,...,...,...,...,...,...,...
49995,i3,2014,Asia,Red,Hybrid,Manual,4.6,151030,42932,8182,High
49996,i3,2023,Middle East,Silver,Electric,Manual,4.2,147396,48714,9816,High
49997,5 Series,2010,Middle East,Red,Petrol,Automatic,4.5,174939,46126,8280,High
49998,i3,2020,Asia,White,Electric,Automatic,3.8,3379,58566,9486,High


## **7.1 Core Methods**

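| Method        | Works on                           | Scope                                                        | Example use case                                                        |
| ------------- | ---------------------------------- | ------------------------------------------------------------ | ----------------------------------------------------------------------- |
| `.map()`      | Series (DataFrame in pandas ≥ 2.1) | Element-wise                                                 | Clean a text column (`.map(str.lower)`)                                 |
| `.apply()`    | Series / DataFrame                 | Element-wise on a Series; column- or row-wise on a DataFrame | Row/column-wise transformations                                         |
| `.applymap()` | DataFrame                          | Element-wise                                                 | Apply to every element in a DataFrame (deprecated since pandas 2.1)     |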
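
To make the "Scope" column concrete, here is a small sketch on a made-up three-column DataFrame (the values and the column names `model`, `price`, and `volume` are illustrative assumptions, not taken from the dataset). It shows what the function receives in each case: a single value for `.map()` and `Series.apply()`, and a whole column or row for `DataFrame.apply()`.

```python
import pandas as pd

# Hypothetical toy data for illustration only.
toy = pd.DataFrame({"model": ["X3", "I8"], "price": [1.5, 2.5], "volume": [10.0, 20.0]})

# Series.map: the function sees one value at a time.
lower_models = toy["model"].map(str.lower)

# Series.apply: also element-wise when called on a Series.
price_plus_one = toy["price"].apply(lambda x: x + 1)

# DataFrame.apply: the function sees a whole column (axis=0) or a whole row (axis=1).
numeric = toy[["price", "volume"]]
col_max = numeric.apply(lambda col: col.max(), axis=0)
row_sum = numeric.apply(lambda row: row.sum(), axis=1)

print(lower_models.tolist(), col_max.tolist(), row_sum.tolist())
```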


---

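**7.1.1 `.apply()`:** This is the most flexible of the three methods. On a DataFrame it applies a function along an axis: with `axis=0` (the default) the function receives each column as a Series, and with `axis=1` it receives each row as a Series. The function can return a single value per column or row (an aggregation) or a new Series (a transformation). A string such as `'mean'` can be passed in place of a function.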

In [2]:
# apply to axis=0
df_numeric = df.select_dtypes(include='number')
df_numeric.apply('mean', axis=0) # returns a Series of columns with mean values

Year               2017.01570
Engine_Size_L         3.24718
Mileage_KM       100307.20314
Price_USD         75034.60090
Sales_Volume       5067.51468
dtype: float64

In [3]:
# apply to axis=1
df_numeric.apply('mean', axis=1)

0        52161.50
1        41266.52
2        26655.30
3        18859.74
4        35426.22
           ...   
49995    40832.52
49996    41590.64
49997    46271.90
49998    14690.96
49999    50456.46
Length: 50000, dtype: float64
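
The two cells above pass the string `'mean'` rather than a function. A minimal sketch, on a made-up two-column DataFrame (`a` and `b` are assumed names), suggests how to read that: the string names a DataFrame method, so the result matches calling that method directly with the same `axis`.

```python
import pandas as pd

# Hypothetical toy data for illustration only.
toy = pd.DataFrame({"a": [1.0, 2.0, 3.0], "b": [10.0, 20.0, 30.0]})

by_name = toy.apply("mean", axis=1)  # string form, as in the cells above
direct = toy.mean(axis=1)            # direct method call

print(by_name.equals(direct))
```

For built-in reductions like this, the direct method call is the shorter spelling; `apply()` is more useful once the function is your own.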

In [4]:
# Using lambda for specific columns on axis=1
df_numeric.apply(lambda row: row['Engine_Size_L'] + row['Sales_Volume'], axis=1)

0        8303.5
1        3429.6
2        6998.5
3        4048.7
4        3082.1
          ...  
49995    8186.6
49996    9820.2
49997    8284.5
49998    9489.8
49999    1767.3
Length: 50000, dtype: float64
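
Row-wise `apply()` becomes more useful when the logic needs branching across several columns. The sketch below uses the first three rows of the dataset shown earlier; the `price_band` function, its thresholds, and its labels are hypothetical and only illustrate the pattern of a `def` function with keyword arguments forwarded by `apply()`.

```python
import pandas as pd

# First three rows of the dataset shown above (two numeric columns only).
sample = pd.DataFrame({
    "Mileage_KM": [151748, 121671, 10991],
    "Price_USD": [98740, 79219, 113265],
})

def price_band(row, high_price=90000, low_mileage=50000):
    """Hypothetical rule (not part of the dataset): label a car by price and mileage."""
    if row["Price_USD"] >= high_price and row["Mileage_KM"] <= low_mileage:
        return "premium"
    if row["Price_USD"] >= high_price:
        return "high-priced"
    return "standard"

# axis=1 passes each row to price_band as a Series.
bands = sample.apply(price_band, axis=1)

# Extra keyword arguments are forwarded to the function.
bands_strict = sample.apply(price_band, axis=1, high_price=100000)

print(bands.tolist())
print(bands_strict.tolist())
```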

**7.1.2 `.map()`:** This is a versatile method that performs element-wise transformations. Its behavior depends on whether it's called on a Series or a DataFrame:

- **When used on a Series (a single column):** It maps each value in that column to a new value, using either a function or a dictionary of old-to-new values. This is useful for tasks like replacing categorical labels with different strings or numbers.

- **When used on a DataFrame (the entire table):** It applies a function to every single element in the DataFrame. `DataFrame.map()` was added in pandas 2.1 as the replacement for the deprecated `.applymap()` method. This is useful for element-wise calculations across the entire table, such as rounding every number.

In [5]:
df['Type'] = df['Fuel_Type'].map({'Electric': 'EV', 'Petrol': 'ICE', 'Diesel': 'ICE', 'Hybrid': 'HB'})
df

Unnamed: 0,Model,Year,Region,Color,Fuel_Type,Transmission,Engine_Size_L,Mileage_KM,Price_USD,Sales_Volume,Sales_Classification,Type
0,5 Series,2016,Asia,Red,Petrol,Manual,3.5,151748,98740,8300,High,ICE
1,i8,2013,North America,Red,Hybrid,Automatic,1.6,121671,79219,3428,Low,HB
2,5 Series,2022,North America,Blue,Petrol,Automatic,4.5,10991,113265,6994,Low,ICE
3,X3,2024,Middle East,Blue,Petrol,Automatic,1.7,27255,60971,4047,Low,ICE
4,7 Series,2020,South America,Black,Diesel,Manual,2.1,122131,49898,3080,Low,ICE
...,...,...,...,...,...,...,...,...,...,...,...,...
49995,i3,2014,Asia,Red,Hybrid,Manual,4.6,151030,42932,8182,High,HB
49996,i3,2023,Middle East,Silver,Electric,Manual,4.2,147396,48714,9816,High,EV
49997,5 Series,2010,Middle East,Red,Petrol,Automatic,4.5,174939,46126,8280,High,ICE
49998,i3,2020,Asia,White,Electric,Automatic,3.8,3379,58566,9486,High,EV
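
One behavior of dictionary-based `Series.map()` is worth checking before relying on it: a value with no matching key becomes `NaN`. The sketch below reuses the mapping from the cell above on a small made-up Series; the `"LPG"` entry and the `"Other"` fallback label are assumptions added only to trigger and handle that case.

```python
import pandas as pd

# "LPG" is a made-up value that does not appear in the mapping.
fuel = pd.Series(["Petrol", "Hybrid", "Electric", "LPG"])
mapping = {"Electric": "EV", "Petrol": "ICE", "Diesel": "ICE", "Hybrid": "HB"}

mapped = fuel.map(mapping)                   # "LPG" has no key, so it becomes NaN
filled = fuel.map(mapping).fillna("Other")   # hypothetical fallback label

print(mapped.isna().tolist())
print(filled.tolist())
```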


In [6]:
df.columns = df.columns.map(str.lower) # convert column names to lowercase
df

Unnamed: 0,model,year,region,color,fuel_type,transmission,engine_size_l,mileage_km,price_usd,sales_volume,sales_classification,type
0,5 Series,2016,Asia,Red,Petrol,Manual,3.5,151748,98740,8300,High,ICE
1,i8,2013,North America,Red,Hybrid,Automatic,1.6,121671,79219,3428,Low,HB
2,5 Series,2022,North America,Blue,Petrol,Automatic,4.5,10991,113265,6994,Low,ICE
3,X3,2024,Middle East,Blue,Petrol,Automatic,1.7,27255,60971,4047,Low,ICE
4,7 Series,2020,South America,Black,Diesel,Manual,2.1,122131,49898,3080,Low,ICE
...,...,...,...,...,...,...,...,...,...,...,...,...
49995,i3,2014,Asia,Red,Hybrid,Manual,4.6,151030,42932,8182,High,HB
49996,i3,2023,Middle East,Silver,Electric,Manual,4.2,147396,48714,9816,High,EV
49997,5 Series,2010,Middle East,Red,Petrol,Automatic,4.5,174939,46126,8280,High,ICE
49998,i3,2020,Asia,White,Electric,Automatic,3.8,3379,58566,9486,High,EV


<div class='alert alert-info alert-box'>
<strong>Note:</strong> The applymap() method has been deprecated since pandas 2.1 and is not recommended; use DataFrame.map() instead.
</div>

**7.1.3 `.applymap()`:** This method works on an entire DataFrame and applies a function to every single element. It is often used on a subset of numerical columns to perform a consistent transformation, such as rounding or scaling. **Key feature:** It only works on a DataFrame, not on a Series.

In [7]:
# select all the numerical columns and use .applymap() to round every value.
df_numeric = df.select_dtypes(include='number')
df_numeric = df_numeric.applymap(lambda x: round(x))
df_numeric



Unnamed: 0,year,engine_size_l,mileage_km,price_usd,sales_volume
0,2016,4,151748,98740,8300
1,2013,2,121671,79219,3428
2,2022,4,10991,113265,6994
3,2024,2,27255,60971,4047
4,2020,2,122131,49898,3080
...,...,...,...,...,...
49995,2014,5,151030,42932,8182
49996,2023,4,147396,48714,9816
49997,2010,4,174939,46126,8280
49998,2020,4,3379,58566,9486
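
Because `applymap()` is deprecated, the same rounding can be written so that it works across pandas versions. This is a sketch under stated assumptions: the toy DataFrame reuses the first three `engine_size_l` and `price_usd` values from the dataset, and the `elementwise` helper name is made up for illustration. It also shows the vectorized `DataFrame.round()` as an alternative.

```python
import pandas as pd

# First three rows of two columns from the dataset shown above.
toy = pd.DataFrame({
    "engine_size_l": [3.5, 1.6, 4.5],
    "price_usd": [98740, 79219, 113265],
})

# DataFrame.map exists in pandas >= 2.1; fall back to applymap on older versions.
elementwise = getattr(toy, "map", None) or toy.applymap
rounded = elementwise(lambda x: round(x))

# Vectorized alternative that needs no Python-level function call per element.
rounded_vec = toy.round()

print(rounded["engine_size_l"].tolist())
print(rounded_vec["engine_size_l"].tolist())
```

Python's built-in `round()` rounds halves to the nearest even integer, which is why 4.5 becomes 4 in the output above.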
