# 🚀 Feature Engineering: Feature Crosses & Feature Splitting

## 📌 Feature Crosses

### 🔹 Introduction
Feature crossing is a technique where we create **synthetic features** by combining (typically multiplying) two or more existing features. This lets a **linear model** learn **non-linear relationships** in the data.

### 🏗️ Example: Linear vs. Non-Linear Models
- A **linear model** works well when data points can be separated with a straight line.

![Screenshot (9976).png](attachment:52172d79-0b45-4ef7-a1fe-883aa8c38df2.png)

- However, when the data has a **non-linear pattern**, a simple linear model fails.

![Screenshot (9972).png](attachment:3c07d945-65b9-49f6-aa32-71385d941031.png)
  
- **Solution?** Introduce a **feature cross**!

### ✨ Creating a Feature Cross
- Define a new feature, let's call it **X3**.
- Set **X3 = X1 * X2** (the product of two existing features).
- Modify the linear equation:
  
  **Before:**
  
  ```math
  Y = W1*X1 + W2*X2 + Bias
  ```
  
  **After:**
  
  ```math
  Y = W1*X1 + W2*X2 + W3*X3 + Bias
  ```

![Screenshot (9973).png](attachment:e479d680-40bd-40ec-bb8f-b1a29a897ef3.png)

### 📊 Why Feature Crosses Work
- If **X1** and **X2** have the **same sign** (both positive or both negative), then **X3** is positive.
- If **X1** and **X2** have **opposite signs** (exactly one of them is negative), then **X3** is negative.
- The sign of **X3** therefore separates the two groups of quadrants. The decision boundary is **non-linear** in X1 and X2, yet the model is still **linear** in its inputs (X1, X2, X3).
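
The sign argument above can be checked with a minimal sketch. It assumes four hand-picked points (one per quadrant) and a small perceptron written from scratch; `train_perceptron` and `accuracy` are illustrative helpers, not functions from any library.

```python
def train_perceptron(rows, labels, epochs=20):
    """Train a plain perceptron; rows are feature lists, labels are 0 or 1."""
    weights = [0.0] * len(rows[0])
    bias = 0.0
    for _ in range(epochs):
        for features, label in zip(rows, labels):
            score = sum(w * f for w, f in zip(weights, features)) + bias
            predicted = 1 if score > 0 else 0
            error = label - predicted  # 0 when the prediction is right
            weights = [w + error * f for w, f in zip(weights, features)]
            bias += error
    return weights, bias


def accuracy(rows, labels, weights, bias):
    """Fraction of rows the linear rule classifies correctly."""
    correct = 0
    for features, label in zip(rows, labels):
        score = sum(w * f for w, f in zip(weights, features)) + bias
        correct += int((1 if score > 0 else 0) == label)
    return correct / len(rows)


# One point per quadrant: label 1 when X1 and X2 share a sign
points = [(1, 1), (-1, -1), (1, -1), (-1, 1)]
labels = [1, 1, 0, 0]

plain = [[x1, x2] for x1, x2 in points]             # X1, X2 only
crossed = [[x1, x2, x1 * x2] for x1, x2 in points]  # adds X3 = X1 * X2

acc_plain = accuracy(plain, labels, *train_perceptron(plain, labels))
acc_crossed = accuracy(crossed, labels, *train_perceptron(crossed, labels))
print("without cross:", acc_plain)
print("with cross:   ", acc_crossed)
```

On this toy data `acc_crossed` reaches 1.0, while `acc_plain` cannot, because no straight line separates the two classes.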

## 🏡 Real-World Examples of Feature Crosses  

Feature crosses are particularly useful when individual features alone don't capture the full complexity of a problem. Here’s how they apply in different scenarios:

### ✅ Housing Prices: Latitude × Number of Bedrooms  
- Consider a **house pricing model** that predicts home values based on factors like **latitude** (geographic location) and **number of bedrooms**.
- **Why is this useful?**  
  - The price of a **3-bedroom house** in **San Francisco** is drastically different from a **3-bedroom house** in **Sacramento**.
  - If you use **only** the "number of bedrooms" as a feature, the model assumes that all 3-bedroom homes have similar prices, **which is incorrect**.
  - Adding latitude as a separate feature is not enough either: in a purely additive model, an extra bedroom adds the same amount to the price everywhere.
  - By **crossing latitude with the number of bedrooms** (e.g., `latitude × bedrooms`), the model can let the **bedroom impact** vary with **location**.

#### How it works mathematically:
**Original equation:**
```math
Price = W_1*latitude + W_2*bedrooms + bias
```
**After feature crossing:**
```math
Price =  W_1*latitude + W_2*bedrooms + W_3*(latitude*bedrooms) + bias
```
- Now the model can express that the **effect of bedrooms on price depends on location**: each extra bedroom changes the predicted price by `W_2 + W_3*latitude` instead of a fixed `W_2`.
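
To make the dependence on location concrete, here is a small sketch of the crossed equation. The weights and the two latitudes are made up purely for illustration, and `predict_price` and `extra_bedroom_effect` are hypothetical helper names.

```python
def predict_price(latitude, bedrooms, w1, w2, w3, bias):
    """Linear model with an optional latitude x bedrooms cross term."""
    return w1 * latitude + w2 * bedrooms + w3 * (latitude * bedrooms) + bias


# Hypothetical weights, chosen only to keep the arithmetic easy to follow
W1, W2, BIAS = 1_000.0, 20_000.0, 50_000.0


def extra_bedroom_effect(latitude, w3):
    """Price change when going from 3 to 4 bedrooms at a given latitude."""
    with_four = predict_price(latitude, 4, W1, W2, w3, BIAS)
    with_three = predict_price(latitude, 3, W1, W2, w3, BIAS)
    return with_four - with_three


for latitude in (34.0, 38.0):
    no_cross = extra_bedroom_effect(latitude, w3=0.0)      # W_3 = 0: no cross
    with_cross = extra_bedroom_effect(latitude, w3=500.0)  # cross term active
    print(latitude, no_cross, with_cross)
```

Without the cross term, the extra bedroom is worth 20,000 at both latitudes. With it, the same bedroom is worth 37,000 at latitude 34.0 and 39,000 at latitude 38.0.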

---

### ✅ Tic-Tac-Toe: Crossing Board Coordinates  
- Suppose you're building a model to **predict game-winning moves** in Tic-Tac-Toe.
- Instead of treating each board position separately, we can **cross board coordinates** to learn patterns.

#### **Why is this useful?**
- Tic-Tac-Toe is about **winning sequences** (rows, columns, diagonals).
- If the model treats each **board cell separately**, it won’t understand **relationships between positions**.
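- By crossing **row and column indices** (written `row × column`; for index features like these, that means one indicator per `(row, column)` pair rather than a numeric product, which would be 0 for every cell in row 0 or column 0), the model can learn weights for: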
  - Diagonal moves (e.g., `(0,0)`, `(1,1)`, `(2,2)`).
  - Horizontal/vertical patterns.

#### **Example:**
**Original Features:**

```plaintext
Row | Column | X/O 
-------------------
 0  |   0    |  X
 0  |   1    |  O
 0  |   2    |  X
 1  |   0    |  O
 1  |   1    |  X
```
**Feature Cross:**

```plaintext
New Feature = Row × Column
```
With this cross, the model can weight each cell individually, which is a first step toward recognizing **winning patterns** instead of treating row and column as unrelated inputs.
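
One way to read `Row × Column` for index-valued features is as a categorical cross: one slot per `(row, column)` pair. The sketch below assumes a 3×3 board and uses illustrative names (`CELLS`, `cross_row_column`). It also shows why a plain numeric product would not work here.

```python
from itertools import product

BOARD_SIZE = 3
CELLS = list(product(range(BOARD_SIZE), repeat=2))  # all (row, column) pairs
CELL_INDEX = {cell: i for i, cell in enumerate(CELLS)}


def cross_row_column(row, column):
    """One-hot vector with a single 1 at the crossed (row, column) slot."""
    vector = [0] * len(CELLS)
    vector[CELL_INDEX[(row, column)]] = 1
    return vector


# A numeric product collapses many different cells onto the same value
numeric_products = {row * column for row, column in CELLS}

print(cross_row_column(1, 1))    # centre cell
print(sorted(numeric_products))  # distinct values of row * column
```

The categorical cross gives nine distinct vectors, one per cell, while `row * column` takes only four distinct values across the nine cells.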

---

## 🤔 Why Use Feature Crosses?  
### 1⃣ Allows Non-Linear Learning in Linear Models  
- Linear models are limited to **straight-line decision boundaries** (hyperplanes, in more than two dimensions).
- Feature crosses add **new, transformed features** that let a linear model capture **complex, non-linear relationships** without switching to a non-linear model.

### 2⃣ Scalability – Works Well with Large Datasets  
- Engineering a few crossed features is often **cheaper than increasing model complexity**.
- Simple models (like logistic regression) train efficiently on very large datasets, and feature crosses can let them **perform well** there without the need for deep learning.

### 3⃣ Deep Learning Synergy – Enhancing Neural Networks  
- Deep learning models can **learn feature interactions on their own**, but explicitly adding important crosses can **reduce training time** and **improve interpretability**.
- Well-chosen crosses can also **help neural networks converge faster** by handing them interactions they would otherwise have to discover from raw inputs.

---

## 🔥 Key Takeaway:  
Feature crosses **capture hidden relationships** in data, **enhancing models** without adding much complexity. 🚀

---

## ✂️ Feature Splitting

### 🔹 Introduction
Feature splitting is a **preprocessing** technique where we break down a single feature into **multiple meaningful components**.

### 🎯 Why Split Features?
- Extract **useful information** from a single column.
- Make **features more interpretable**.
- Improve model **performance** by exposing information that was hidden inside a combined value.
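
As one illustration of these points, a date stored as a single string can be split into several numeric features. This is a minimal sketch using only the standard library; `split_date` is a hypothetical helper, and the choice of components is an assumption about what a model might find useful.

```python
from datetime import date


def split_date(iso_string):
    """Break an ISO 'YYYY-MM-DD' string into separate numeric features."""
    parsed = date.fromisoformat(iso_string)
    return {
        "year": parsed.year,
        "month": parsed.month,
        "day": parsed.day,
        "day_of_week": parsed.weekday(),  # Monday = 0 ... Sunday = 6
    }


features = split_date("2024-03-15")
print(features)
```

Each component can now carry its own weight, so a model can pick up, for example, a weekday effect that the raw string would hide.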

### ⚙️ Example 1: Splitting Text
#### **Extracting Words from a Sentence**
```python
text = "Machine Learning is amazing"
words = text.split()  # With no argument, split() breaks on any run of whitespace
print(words)
```
✅ **Output**:
```python
['Machine', 'Learning', 'is', 'amazing']
```

#### **Splitting a Paragraph by Sentences**
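```python
paragraph = "AI is powerful. Data is essential."
# Split at full stops, then strip surrounding whitespace and drop the
# empty string that split() leaves after the final full stop
sentences = [s.strip() for s in paragraph.split(".") if s.strip()]
print(sentences)
```
✅ **Output**:
```python
['AI is powerful', 'Data is essential']
```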
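
Splitting on a literal full stop discards the punctuation and ignores `?` and `!`. A slightly more general sketch, assuming sentences end in `.`, `!` or `?` followed by whitespace, uses the standard `re` module; `split_sentences` is an illustrative name.

```python
import re


def split_sentences(paragraph):
    """Split after '.', '!' or '?' when followed by whitespace."""
    # The lookbehind keeps the punctuation attached to its sentence
    pieces = re.split(r"(?<=[.!?])\s+", paragraph.strip())
    return [piece for piece in pieces if piece]


sentences = split_sentences("AI is powerful. Is data essential? Yes!")
print(sentences)
```

This rule is still naive: abbreviations such as "e.g." would be cut in the wrong place, so real text usually calls for a dedicated sentence tokenizer.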

### ⚙️ Example 2: Feature Splitting in a Dataset
#### **Extracting First & Last Names**
```python
import pandas as pd

df = pd.DataFrame({
    'Player Name': ['Michael Jordan', 'LeBron James', 'Kobe Bryant']
})

# Split each name on whitespace, then take the first and last tokens
df['First Name'] = df['Player Name'].str.split().str[0]
df['Last Name'] = df['Player Name'].str.split().str[-1]

print(df[['First Name', 'Last Name']])
```
✅ **Output**:
```plaintext
  First Name Last Name
0    Michael    Jordan
1     LeBron     James
2       Kobe    Bryant
```
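
Taking the first and last tokens works for two-word names, but real columns often contain single names or middle names. The sketch below shows one possible policy in plain Python; `split_name` and the sample names are hypothetical, and the rule "a single token has no last name" is an assumption you may want to change.

```python
def split_name(full_name):
    """Return (first, last); last is '' when the name has a single token."""
    tokens = full_name.split()
    if not tokens:
        return "", ""
    first = tokens[0]
    last = tokens[-1] if len(tokens) > 1 else ""
    return first, last


for name in ["Michael Jordan", "Mary Jane Watson", "Madonna", "   "]:
    print(repr(name), "->", split_name(name))
```

A helper like this can be passed to `df['Player Name'].apply(...)` so that unusual names yield empty strings instead of a duplicated token or an `IndexError`.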

### 📌 Key Takeaways
- **Feature Crossing**: Combine features to introduce non-linearity into linear models.
- **Feature Splitting**: Extract meaningful parts from existing features.
- Both techniques can improve **model accuracy** and **interpretability** when the new features reflect real structure in the data! 🚀

