# Feature Construction And Feature Splitting | [Link](https://github.com/AdilShamim8/50-Days-of-Machine-Learning/tree/main/Day%2028%20Feature%20Construction%20and%20Feature%20Splitting)

**Feature Construction** and **Feature Splitting** are two feature engineering techniques that can improve machine learning model performance by transforming and refining input data.

---

## **1. Feature Construction**
Feature Construction refers to creating new features from existing ones to better capture relationships in data.

### **Key Techniques**
- **Mathematical Operations**:  
  $$ x_{\text{new}} = x_1 + x_2 $$
  $$ x_{\text{new}} = x_1 \times x_2 $$
  $$ x_{\text{new}} = \frac{x_1}{x_2 + \epsilon} $$

- **Polynomial Features** (degree 2, listed in the column order `PolynomialFeatures` returns):  
  $$ \phi(X) = [x_1, x_2, x_1^2, x_1 x_2, x_2^2] $$

- **Log Transform**:  
  $$ x_{\text{log}} = \log (x + 1) $$

- **Binning**:  
  $$ x_{\text{bin}} = \text{floor} \left( \frac{x}{\text{bin\_size}} \right) $$

### **Python Example: Feature Construction**
```python
import pandas as pd
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

df = pd.DataFrame({'x1': [1, 2, 3, 4], 'x2': [10, 20, 30, 40]})

# Interaction feature
df['x1_x2'] = df['x1'] * df['x2']

# Log transform: log1p(x) computes log(x + 1) and stays accurate for small x
df['x1_log'] = np.log1p(df['x1'])

# Polynomial features
poly = PolynomialFeatures(degree=2, include_bias=False)
poly_features = poly.fit_transform(df[['x1', 'x2']])
print("Polynomial Features:\n", poly_features)
```
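
The example above does not cover the ratio and binning formulas from the Key Techniques list. The following is a minimal sketch of both on the same toy data. The values of `eps` and `bin_size` are assumptions chosen for illustration, not recommended defaults.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'x1': [1, 2, 3, 4], 'x2': [10, 20, 30, 40]})

# Ratio feature: eps (assumed value) keeps the denominator away from zero
eps = 1e-9
df['x1_over_x2'] = df['x1'] / (df['x2'] + eps)

# Fixed-width binning: floor(x / bin_size), with an assumed bin_size of 15
bin_size = 15
df['x2_bin'] = np.floor(df['x2'] / bin_size).astype(int)

print(df)
```

With this bin width, `x2` values 10, 20, 30, 40 fall into bins 0, 1, 2, 2. In practice the bin width is usually chosen from the data's range or from domain knowledge.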

---

## **2. Feature Splitting**
Feature Splitting involves breaking down a complex feature into multiple simpler features, making it easier for models to learn relationships.

### **Common Scenarios**
- **Splitting Text Features** (e.g., Names, Dates, Addresses)
- **Splitting Categorical Data**
- **Extracting Date/Time Components**

### **Mathematical Representation**
Given a timestamp **\( t \)**:
- **Extracting Year, Month, Day**
  $$ t_{\text{year}} = \text{year}(t) $$
  $$ t_{\text{month}} = \text{month}(t) $$
  $$ t_{\text{day}} = \text{day}(t) $$

- **Extracting Words from Text**
  $$ x_{\text{split}} = \text{split}(x, \text{delimiter}) $$

### **Python Example: Feature Splitting**
```python
import pandas as pd

df = pd.DataFrame({'full_date': ['2025-02-27 14:30:00', '2024-12-10 08:15:00']})

# Convert to datetime
df['full_date'] = pd.to_datetime(df['full_date'])

# Splitting into new features
df['year'] = df['full_date'].dt.year
df['month'] = df['full_date'].dt.month
df['day'] = df['full_date'].dt.day
df['hour'] = df['full_date'].dt.hour

print(df)
```
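
The datetime example covers only one of the common scenarios. As a rough sketch of the text and categorical cases, the snippet below applies $ \text{split}(x, \text{delimiter}) $ with pandas string methods. The DataFrame `people`, its column names, and the `CATEGORY-ID` code format are all hypothetical, made up for illustration.

```python
import pandas as pd

# Hypothetical sample data for illustration
people = pd.DataFrame({
    'full_name': ['Ada Lovelace', 'Alan Turing'],
    'product_code': ['ELEC-1001', 'BOOK-2050'],
})

# Text feature: split on the first space into first and last name
people[['first_name', 'last_name']] = (
    people['full_name'].str.split(' ', n=1, expand=True)
)

# Composite categorical feature: split "CATEGORY-ID" on the hyphen
code_parts = people['product_code'].str.split('-', n=1, expand=True)
people['category'] = code_parts[0]
people['item_id'] = code_parts[1].astype(int)

print(people)
```

Passing `n=1` limits the split to the first delimiter, so a value with extra delimiters (for example, a name with a middle name) still yields exactly two columns.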

---

## **Feature Construction vs. Feature Splitting: Comparison Table**

| Aspect               | Feature Construction | Feature Splitting |
|----------------------|---------------------|-------------------|
| **Definition**       | Creating new features by combining or transforming existing ones | Breaking an existing feature into multiple smaller features |
| **Purpose**         | Enhances predictive power by introducing new relationships | Simplifies complex data for better interpretability |
| **Example**         | Creating an interaction feature: \( x_{\text{new}} = x_1 \times x_2 \) | Splitting a datetime into year, month, and day |
| **Methods Used**    | Arithmetic operations, transformations, polynomial features | String operations, datetime parsing, categorical splitting |
| **Use Case**        | Improving model performance with engineered features | Breaking complex features into useful components |

---

## **Conclusion**
- **Feature Construction** creates new features to capture hidden relationships.
- **Feature Splitting** simplifies existing features by breaking them into meaningful components.