# Feature Engineering

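This notebook prepares the cleaned dataset for machine learning models.
The feature engineering steps are:

- Handling remaining missing values
- Feature scaling (standardization with `StandardScaler`)
- Dimensionality reduction using PCA, keeping the components that
  explain 95% of the variance

The resulting dataset is saved to `../data/dataset_final_pca.csv` and
will be used for model training.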


In [None]:
import numpy as np
import pandas as pd

from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA

In [None]:
df = pd.read_csv("../data/dataset_clean.csv")

df.head()

In [None]:
X = df.drop(columns=["m_falla"])
y = df["m_falla"]

The target variable `m_falla` is separated from the feature matrix so
that scaling and PCA are applied to the features only.
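
The cells in this notebook do not show how remaining missing values are handled, and scikit-learn's `PCA` raises an error when its input contains NaN. A minimal sketch of one option, median imputation with `SimpleImputer`, is shown below on a small made-up frame. The column names and the choice of the median are assumptions for illustration, not the method used for `dataset_clean.csv`.

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer

# Hypothetical toy features standing in for X; the real columns will differ.
X_demo = pd.DataFrame({
    "sensor_a": [1.0, 2.0, np.nan, 4.0],
    "sensor_b": [10.0, np.nan, 30.0, 50.0],
})

# Count missing values per column before deciding how to handle them.
missing_per_column = X_demo.isna().sum()

# Replace each NaN with the median of its column (one common choice among several).
imputer = SimpleImputer(strategy="median")
X_imputed = pd.DataFrame(
    imputer.fit_transform(X_demo),
    columns=X_demo.columns,
    index=X_demo.index,
)
```

If `missing_per_column` is all zeros for the real feature matrix, no imputation step is needed before scaling.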

In [None]:
# Standardize each feature to zero mean and unit variance
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)
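
`StandardScaler` replaces each value with (x - mean) / std, using the per-column mean and population standard deviation learned during `fit`. The sketch below checks this by hand on a small made-up array; the values are illustrative only.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical data: two features on very different scales.
X_demo = np.array([[1.0, 100.0], [2.0, 200.0], [3.0, 300.0], [4.0, 400.0]])

scaler_demo = StandardScaler()
X_demo_scaled = scaler_demo.fit_transform(X_demo)

# Same result computed by hand: subtract the column mean, then divide by the
# population standard deviation (ddof=0), which is what StandardScaler uses.
X_manual = (X_demo - X_demo.mean(axis=0)) / X_demo.std(axis=0)
```

Standardizing matters before PCA because PCA is driven by variance, so an unscaled feature with large values would otherwise dominate the leading components.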

In [None]:
# A float n_components keeps the fewest components that explain at least 95% of the variance
pca = PCA(n_components=0.95, random_state=42)
X_pca = pca.fit_transform(X_scaled)

In [None]:
X_pca.shape
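
The shape alone does not show how much variance each retained component carries. The sketch below shows how this could be inspected through the fitted attributes `n_components_` and `explained_variance_ratio_`. It uses randomly generated stand-in data, so the numbers it prints will not match this dataset.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Hypothetical stand-in data: 200 rows, 6 features driven by 2 latent factors plus noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 6))
X_demo = latent @ mixing + 0.05 * rng.normal(size=(200, 6))

X_demo_scaled = StandardScaler().fit_transform(X_demo)

pca_demo = PCA(n_components=0.95)
X_demo_pca = pca_demo.fit_transform(X_demo_scaled)

# Number of components kept and the cumulative share of variance they explain.
n_kept = pca_demo.n_components_
cumulative = np.cumsum(pca_demo.explained_variance_ratio_)
print(n_kept, cumulative)
```

In the notebook itself, the same two attributes are available on the fitted `pca` object.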

In [None]:
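# Label components PC1, PC2, ... and append the target (stored as "target")
df_pca = pd.DataFrame(X_pca, columns=[f"PC{i + 1}" for i in range(X_pca.shape[1])])
df_pca["target"] = y.values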

df_pca.head()

In [None]:
df_pca.to_csv("../data/dataset_final_pca.csv", index=False)

## Summary

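Feature engineering was applied to prepare the dataset for machine
learning models. The features were standardized and then projected onto
the principal components that together explain 95% of the variance. The
transformed features and the target column were saved to
`../data/dataset_final_pca.csv`. Fewer input dimensions can shorten
training time, although the components are harder to interpret than the
original features.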
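
Because the scaler and PCA in this notebook are fitted on all rows, their statistics also reflect rows that may later be used as test data. One common way to avoid this during model training is to fit both steps on the training split only, for example with a scikit-learn `Pipeline`. The sketch below assumes a classification target and uses synthetic data and `LogisticRegression` purely as placeholders; none of these choices come from this notebook.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic placeholder data; in practice X and y would come from dataset_clean.csv.
X_demo, y_demo = make_classification(n_samples=300, n_features=10, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X_demo, y_demo, test_size=0.25, random_state=42, stratify=y_demo
)

# Scaling and PCA are fitted on the training rows only, then reused on the test rows.
pipeline = Pipeline([
    ("scaler", StandardScaler()),
    ("pca", PCA(n_components=0.95)),
    ("model", LogisticRegression(max_iter=1000)),
])
pipeline.fit(X_train, y_train)
test_accuracy = pipeline.score(X_test, y_test)
```

With this layout the same fitted transformations are applied to new data through `pipeline.predict`, so the preprocessing cannot drift from what the model saw in training.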