<a href="https://colab.research.google.com/github/Ghazaleh-Ramezani/data-driven-materials-optimization/blob/main/notebooks/03_nonlinear_models_and_optimization.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## Nonlinear Modeling and Data-Driven Optimization

This notebook extends the baseline linear and LASSO-based models with a nonlinear
regression technique, Gaussian Process Regression (GPR), to better capture complex
structure–property relationships in nanocomposite systems.

The focus is on evaluating nonlinear predictive performance on a held-out test set and
on laying the groundwork for a simple data-driven optimization workflow based on the
trained model.


In [None]:
import numpy as np
import pandas as pd

from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import Pipeline
from sklearn.metrics import r2_score, mean_squared_error

from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel as C

# =========================
# Load dataset
# =========================
# Use the same dataset as in the previous notebooks.
# The path below is an example; update it to match your setup.
df = pd.read_csv("../data/data.csv")

target_col = "y"   # update if needed

X = df.drop(columns=[target_col])
y = df[target_col]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# =========================
# Gaussian Process Regression
# =========================
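# Constant (signal variance) times RBF kernel; these values are only starting
# points, since the kernel hyperparameters are optimized during fit().
kernel = C(1.0) * RBF(length_scale=1.0)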

gpr_model = Pipeline([
    ("scaler", StandardScaler()),
    ("gpr", GaussianProcessRegressor(kernel=kernel, normalize_y=True))
])

gpr_model.fit(X_train, y_train)

y_pred = gpr_model.predict(X_test)

r2 = r2_score(y_test, y_pred)
rmse = np.sqrt(mean_squared_error(y_test, y_pred))

print(f"R2 score: {r2:.3f}")
print(f"RMSE: {rmse:.3f}")
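
The optimization workflow mentioned in the introduction could look roughly like the sketch below: fit a GPR surrogate, sample candidate compositions inside the observed feature ranges, and pick the candidate with the highest predicted property. This is a minimal illustration under stated assumptions, not the notebook's actual method. The data are synthetic, the names `X_demo`, `y_demo`, `demo_model`, and `candidates` are hypothetical, and the `WhiteKernel` noise term is an addition that the model above does not use.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel as C, WhiteKernel
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-in for the real dataset (assumption): two features in [0, 1]
# and a smooth response that peaks near (0.6, 0.3), plus measurement noise.
X_demo = rng.uniform(0.0, 1.0, size=(60, 2))
y_demo = np.exp(-((X_demo[:, 0] - 0.6) ** 2 + (X_demo[:, 1] - 0.3) ** 2) / 0.05)
y_demo += rng.normal(0.0, 0.02, size=60)

# Same pipeline layout as above; the WhiteKernel (assumption) lets the model
# learn a noise level instead of interpolating noisy points exactly.
demo_model = Pipeline([
    ("scaler", StandardScaler()),
    ("gpr", GaussianProcessRegressor(
        kernel=C(1.0) * RBF(length_scale=1.0) + WhiteKernel(noise_level=1e-2),
        normalize_y=True,
        random_state=0,
    )),
])
demo_model.fit(X_demo, y_demo)

# Sample candidates uniformly inside the observed feature ranges, so the
# search does not extrapolate beyond the training data.
lower, upper = X_demo.min(axis=0), X_demo.max(axis=0)
candidates = rng.uniform(lower, upper, size=(5000, 2))

# Apply the fitted scaler, then ask the GPR step for mean and standard deviation.
scaled = demo_model.named_steps["scaler"].transform(candidates)
mean, std = demo_model.named_steps["gpr"].predict(scaled, return_std=True)

# Select the candidate with the highest predicted property value.
best_idx = int(np.argmax(mean))
best_candidate = candidates[best_idx]
print("Best candidate:", best_candidate)
print(f"Predicted value: {mean[best_idx]:.3f} +/- {std[best_idx]:.3f}")
```

With real data, `X_demo` and `y_demo` would be replaced by `X_train` and `y_train`, and the candidate sampling would need to respect any physical constraints on the features. The predicted standard deviation is worth inspecting alongside the mean: a candidate with a high predicted value but large uncertainty is a suggestion for the next experiment rather than a confirmed optimum.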


### Conclusion

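Gaussian Process Regression can capture nonlinear structure–property relationships that
linear and LASSO models cannot represent. Whether it actually improves predictive accuracy
on this dataset should be judged by comparing the test-set R² and RMSE reported above with
the baselines from the previous notebooks, evaluated on the same train/test split.
Where that comparison favors the nonlinear model, it provides a foundation for
data-driven optimization and advanced materials design.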
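
One way to make the comparison with linear approaches concrete is to score both model families with the same cross-validation folds. The sketch below does this on synthetic data with a deliberately nonlinear response, so its numbers say nothing about the nanocomposite dataset. The names `X_syn`, `y_syn`, `models`, and `cv_r2` are hypothetical, `LassoCV` is used as a stand-in for the LASSO baseline from the earlier notebooks, and the anisotropic RBF with a `WhiteKernel` noise term is an assumption rather than the kernel used above.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel as C, WhiteKernel
from sklearn.linear_model import LassoCV
from sklearn.model_selection import KFold, cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(1)

# Synthetic data (assumption): a response that is nonlinear in both features.
X_syn = rng.uniform(-2.0, 2.0, size=(120, 2))
y_syn = np.sin(2.0 * X_syn[:, 0]) + 0.5 * X_syn[:, 1] ** 2
y_syn += rng.normal(0.0, 0.1, size=120)

models = {
    # Linear baseline with L1 regularization chosen by internal cross-validation.
    "LASSO": Pipeline([
        ("scaler", StandardScaler()),
        ("lasso", LassoCV(cv=5)),
    ]),
    # GPR with one length scale per feature and a learned noise level.
    "GPR": Pipeline([
        ("scaler", StandardScaler()),
        ("gpr", GaussianProcessRegressor(
            kernel=C(1.0) * RBF(length_scale=[1.0, 1.0]) + WhiteKernel(noise_level=1e-2),
            normalize_y=True,
            n_restarts_optimizer=2,
            random_state=0,
        )),
    ]),
}

# Use identical folds for both models so the scores are directly comparable.
cv = KFold(n_splits=5, shuffle=True, random_state=42)
cv_r2 = {}
for name, model in models.items():
    scores = cross_val_score(model, X_syn, y_syn, cv=cv, scoring="r2")
    cv_r2[name] = scores.mean()
    print(f"{name}: mean R2 = {scores.mean():.3f} (std {scores.std():.3f})")
```

Cross-validation is used here instead of a single split because small materials datasets can give noticeably different scores depending on which samples land in the test set.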
