# **11. Advanced Features & Optimization**

## 🔁 1. **Advanced Function Application**

In [1]:
import pandas as pd

## 📌 1. **Purpose & When to Use It**

**Purpose**:
Advanced function application in pandas allows you to **apply custom logic** or **transformations** to DataFrame rows, columns, or individual elements. It's especially useful when built-in pandas or NumPy operations aren't flexible enough for your task.

**When to Use**:

* When you need to perform **row-wise or column-wise** computations.
* When doing **custom transformations** not achievable with vectorized operations.
* When applying **complex business rules**, conditions, or calculations.
* During **feature engineering**, text transformation, or conditional value mapping.


## 🧠 2. **Different Methods and Techniques**

| Method       | Scope              | Description                                                             |
| ------------ | ------------------ | ----------------------------------------------------------------------- |
| `apply()`    | Series / DataFrame | Applies a function along axis (`axis=0` for columns, `axis=1` for rows) |
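| `map()`      | Series             | Element-wise transformation on a Series (accepts a function, dict, or Series); pandas 2.1 added an element-wise `DataFrame.map()` |
| `applymap()` | DataFrame only     | Element-wise transformation across the whole DataFrame; deprecated since pandas 2.1 in favor of `DataFrame.map()` |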
| `lambda`     | Any                | Anonymous function used with `apply`, `map`, or `applymap`              |
| `.pipe()`    | DataFrame / Series | Allows clean chaining of custom functions                               |
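
The `axis` argument of `apply()` determines what the function receives. A minimal sketch, assuming a small illustrative frame (`demo` and its column names are made up for this example):

```python
import pandas as pd

# Illustrative data; the frame and column names are hypothetical.
demo = pd.DataFrame({"A": [1, 2, 3], "B": [10, 20, 30]})

# axis=0 (the default): the function receives each column as a Series.
col_ranges = demo.apply(lambda col: col.max() - col.min())

# axis=1: the function receives each row as a Series indexed by column name.
row_totals = demo.apply(lambda row: row["A"] + row["B"], axis=1)

print(col_ranges)
print(row_totals)
```

With `axis=0` the result has one entry per column; with `axis=1` it has one entry per row.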


## 🧪 3. **Examples with Code**

### 🔹 a) `apply()` with Series

In [2]:
s = pd.Series([1, 2, 3])
s_squared = s.apply(lambda x: x ** 2)
s_squared

0    1
1    4
2    9
dtype: int64

### 🔹 b) `apply()` on DataFrame (Row-wise)

In [3]:
df = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})

df

|     | A   | B   |
| --- | --- | --- |
| 0   | 1   | 3   |
| 1   | 2   | 4   |


In [4]:
df['Sum'] = df.apply(lambda row: row['A'] + row['B'], axis=1)

df

|     | A   | B   | Sum |
| --- | --- | --- | --- |
| 0   | 1   | 3   | 4   |
| 1   | 2   | 4   | 6   |


### 🔹 c) `map()` on a Series

In [5]:
grades = pd.Series(['A', 'B', 'C'])

grade_points = grades.map({'A': 1, 'B': 2, 'C': 3})

grade_points

0    1
1    2
2    3
dtype: int64
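
One behavior worth knowing when mapping with a dict: values that have no key in the mapping become `NaN`, which also turns the result into a float column. A short sketch, assuming a grade `"D"` that is deliberately absent from the mapping and an arbitrary fallback of `0`:

```python
import pandas as pd

grades = pd.Series(["A", "B", "D"])    # "D" is deliberately missing from the mapping
points = {"A": 1, "B": 2, "C": 3}

mapped = grades.map(points)            # unmapped values become NaN, so dtype is float64
filled = mapped.fillna(0).astype(int)  # 0 is an arbitrary placeholder default

print(mapped)
print(filled)
```

Whether to fill, drop, or raise on unmapped values depends on the data; the `fillna(0)` here is only one option.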

### 🔹 d) `applymap()` on DataFrame (element-wise)

In [6]:
df = pd.DataFrame({'X': [1, 2], 'Y': [10, 20]})

df

|     | X   | Y   |
| --- | --- | --- |
| 0   | 1   | 10  |
| 1   | 2   | 20  |


In [7]:
squared_df = df.applymap(lambda x: x ** 2)
squared_df

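**Note**: In pandas 2.1 and later, `applymap()` emits a `FutureWarning` because it is deprecated in favor of `DataFrame.map()`, which performs the same element-wise operation.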


|     | X   | Y   |
| --- | --- | --- |
| 0   | 1   | 100 |
| 1   | 4   | 400 |


### 🔹 e) `.pipe()` for clean chaining

In [8]:
def add_constant(df, const):
    return df + const

df = pd.DataFrame({'val': [1, 2, 3]})
df = df.pipe(add_constant, const=10)

df

|     | val |
| --- | --- |
| 0   | 11  |
| 1   | 12  |
| 2   | 13  |
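
The benefit of `.pipe()` shows up once there is more than one step. The sketch below reuses `add_constant` and adds a hypothetical second step, `scale`, to compare the chained form with the equivalent nested call:

```python
import pandas as pd

def add_constant(df, const):
    return df + const

def scale(df, factor):
    # Hypothetical second step, added only for this illustration.
    return df * factor

raw = pd.DataFrame({"val": [1, 2, 3]})

# Chained: reads top to bottom in the order the steps run.
piped = raw.pipe(add_constant, const=10).pipe(scale, factor=2)

# Nested: same result, but reads inside out.
nested = scale(add_constant(raw, const=10), factor=2)

print(piped)
```

Both forms compute the same frame; the chained version simply keeps the steps in execution order.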


## ⚡ 4. **Performance Considerations**

| Method       | Performance                                    |
| ------------ | ---------------------------------------------- |
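| `map()`      | Usually fast with a dict or Series mapping; with a Python function it still runs once per element |
| `apply()`    | Slow along `axis=1` (one Python call per row); on a Series, one call per element |
| `applymap()` | One Python call per element; much slower than vectorized operations |
| `lambda`     | No slower than a named function; the cost comes from the per-element Python call, not from the lambda itself |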
| `.pipe()`    | No performance boost, but improves readability |

🟨 **Warning**: These are **not** vectorized operations. If possible, prefer **NumPy or pandas built-ins** (e.g., `df['col'] ** 2`) for better performance.
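
The gap can be measured with a rough timing sketch like the one below. The Series size and repeat count are arbitrary choices, and absolute timings depend on the machine and pandas version, so treat the printed numbers as indicative only:

```python
import timeit

import numpy as np
import pandas as pd

# Arbitrary size; int64 is set explicitly so the squares cannot overflow.
values = pd.Series(np.arange(100_000, dtype="int64"))

via_apply = values.apply(lambda x: x ** 2)  # one Python call per element
vectorized = values ** 2                    # a single vectorized operation

t_apply = timeit.timeit(lambda: values.apply(lambda x: x ** 2), number=3)
t_vec = timeit.timeit(lambda: values ** 2, number=3)

print(f"apply: {t_apply:.4f}s  vectorized: {t_vec:.4f}s")
```

Both approaches produce the same values; only the execution path differs.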


## ⚠️ 5. **Common Pitfalls & Mistakes**

1. **Using `apply()` when a vectorized method is available**:

   ```python
   # BAD
   df['col'].apply(lambda x: x * 2)

   # GOOD
   df['col'] * 2
   ```

2. **Forgetting `axis=1` when applying row-wise**:

   ```python
   # Wrong: the default axis=0 passes each column, so row['A'] raises KeyError
   df.apply(lambda row: row['A'] + row['B'])

   # Correct: axis=1 passes each row
   df.apply(lambda row: row['A'] + row['B'], axis=1)
   ```

3. **Using `applymap()` on Series** – it only works on DataFrames.

4. **Complex logic inside lambda** – makes debugging hard.

5. **Memory and speed** – each of these methods runs Python code once per element, row, or column, so runtime grows quickly on large datasets.
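
Pitfall 2 is easy to reproduce. In this sketch (`df_ab` is an illustrative frame), omitting `axis=1` passes each column to the function, and the column's index holds the row labels `0` and `1`, so looking up `'A'` fails:

```python
import pandas as pd

df_ab = pd.DataFrame({"A": [1, 2], "B": [3, 4]})

try:
    # Default axis=0: the lambda receives a column, not a row.
    df_ab.apply(lambda row: row["A"] + row["B"])
    error_name = None
except KeyError as exc:
    error_name = type(exc).__name__
    print("KeyError:", exc)

# With axis=1 the lambda receives each row, indexed by column name.
row_sums = df_ab.apply(lambda row: row["A"] + row["B"], axis=1)
print(row_sums)
```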


## ✅ 6. **Best Practices**

* ✅ **Prefer vectorized operations** whenever possible.
* ✅ Use `map()` for simple Series value replacements.
* ✅ Use `apply()` when logic depends on **multiple columns or rows**.
* ✅ Use `applymap()` (`DataFrame.map()` in pandas 2.1+) for **element-wise** transformations across the whole DataFrame.
* ✅ Keep **lambda functions simple**.
* ✅ Use `.pipe()` to write clean, modular, and testable transformation pipelines.
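
To illustrate keeping lambdas simple, here is a sketch that moves multi-branch logic into a named function. The rule, the thresholds, and the column names (`weight_kg`, `express`) are all hypothetical:

```python
import pandas as pd

def shipping_tier(row):
    """Hypothetical business rule: heavy parcels always go by freight."""
    if row["weight_kg"] > 20:
        return "freight"
    if row["express"]:
        return "express"
    return "standard"

orders = pd.DataFrame({
    "weight_kg": [2.0, 25.0, 5.0],
    "express": [True, True, False],
})
orders["tier"] = orders.apply(shipping_tier, axis=1)

# Because the function only uses key lookup, it can be checked with a plain dict.
assert shipping_tier({"weight_kg": 30, "express": False}) == "freight"

print(orders)
```

A named function can carry a docstring, be tested in isolation, and show up by name in tracebacks, none of which a long lambda offers.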


## 💼 7. **Use Cases in Real Projects**

| Domain                    | Use Case                                                                  |
| ------------------------- | ------------------------------------------------------------------------- |
| 🎓 Education              | Mapping grades to grade points using `map()`                              |
| 🏥 Healthcare             | Calculating BMI using `apply()` across weight and height columns          |
| 📈 Finance                | Creating a custom risk score using multiple columns                       |
| 🛒 E-commerce             | Creating a feature to classify products by combining name and price range |
| 🧪 ML Feature Engineering | Applying custom logic to generate new features                            |
| 📊 Data Cleaning          | Normalizing strings using `applymap()` on an entire DataFrame             |

