# 🔌 Using Featuristic in scikit-learn Pipelines

Featuristic supports full integration with the **scikit-learn pipeline API**, so you can:

- Chain synthesis, selection, and modeling steps
- Use `Pipeline`, `GridSearchCV`, `cross_val_score`, etc.
- Deploy or serialize Featuristic pipelines like any other transformer

---

## ✅ Basic Pipeline Example

```python
from sklearn.pipeline import Pipeline
from sklearn.ensemble import RandomForestClassifier
from featuristic import GeneticFeatureSynthesis
from featuristic.datasets import fetch_wine_dataset

X, y = fetch_wine_dataset()

pipeline = Pipeline([
    ("gfs", GeneticFeatureSynthesis(num_features=5, max_generations=30)),
    ("clf", RandomForestClassifier())
])

pipeline.fit(X, y)
```

---

## 🧪 Cross-Validation

```python
from sklearn.model_selection import cross_val_score

scores = cross_val_score(pipeline, X, y, cv=5)
print("CV Accuracy:", scores.mean())
```
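
Because the synthesis step sits inside the pipeline, `cross_val_score` refits it on the training portion of each fold, so the engineered features never see the held-out data. To report fold-to-fold spread or more than one metric, `cross_validate` is one option. The sketch below runs with scikit-learn alone: `PolynomialFeatures` and `load_wine` are stand-ins chosen for illustration (they are not part of Featuristic), and `demo_pipeline` is a hypothetical name.

```python
from sklearn.datasets import load_wine
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_validate
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures

# Stand-in data: scikit-learn's wine dataset (assumption, for illustration only)
X_demo, y_demo = load_wine(return_X_y=True)

# Stand-in pipeline: PolynomialFeatures plays the role of a feature-generating step
demo_pipeline = Pipeline([
    ("features", PolynomialFeatures(degree=2, include_bias=False)),
    ("clf", RandomForestClassifier(n_estimators=50, random_state=0)),
])

# Shuffled, stratified folds with a fixed seed for repeatable splits
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

# Score each fold on two metrics at once
results = cross_validate(
    demo_pipeline, X_demo, y_demo, cv=cv, scoring=["accuracy", "neg_log_loss"]
)

acc = results["test_accuracy"]
loss = -results["test_neg_log_loss"]  # flip the sign back to a plain log loss
print(f"Accuracy: {acc.mean():.3f} +/- {acc.std():.3f}")
print(f"Log loss: {loss.mean():.3f} +/- {loss.std():.3f}")
```

With a Featuristic pipeline, the same call should apply unchanged by passing `pipeline`, `X`, and `y` from the example above.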

---

## 🔍 Grid Search Integration

```python
from sklearn.model_selection import GridSearchCV

grid = GridSearchCV(
    pipeline,
    param_grid={
        "clf__n_estimators": [100, 200],
        "gfs__num_features": [5, 10]
    },
    cv=3
)

grid.fit(X, y)
print(grid.best_params_)
```
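
When the grid grows, a randomized search samples a fixed number of configurations instead of trying every combination. The following is a minimal sketch using scikit-learn's `RandomizedSearchCV`; it again uses `PolynomialFeatures` and `load_wine` as stand-ins so that it runs without Featuristic, and the parameter ranges are arbitrary illustrative choices.

```python
from scipy.stats import randint
from sklearn.datasets import load_wine
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures

# Stand-in data and pipeline (assumptions, for illustration only)
X_demo, y_demo = load_wine(return_X_y=True)
demo_pipeline = Pipeline([
    ("features", PolynomialFeatures(include_bias=False)),
    ("clf", RandomForestClassifier(random_state=0)),
])

search = RandomizedSearchCV(
    demo_pipeline,
    param_distributions={
        # Step name, double underscore, parameter name
        "features__degree": [1, 2],
        "clf__n_estimators": randint(50, 201),  # integers from 50 to 200
    },
    n_iter=4,        # number of sampled configurations
    cv=3,
    random_state=0,  # makes the sampled configurations repeatable
)

search.fit(X_demo, y_demo)
print(search.best_params_)
```

For the Featuristic pipeline, the keys would follow the same `step__parameter` pattern as in the grid above, for example `gfs__num_features`. Keeping `n_iter` small matters here, since every sampled configuration reruns the genetic search once per fold.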

---

## 🔗 Combining Synthesis + Selection

To first generate new features, then filter them down:

```python
from featuristic import GeneticFeatureSynthesis, GeneticFeatureSelector
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.metrics import log_loss

# Selector objective: log loss of a logistic regression fitted on the
# candidate feature subset (a lower log loss means a better subset)
def objective(X_subset, y):
    clf = LogisticRegression(max_iter=500).fit(X_subset, y)
    probs = clf.predict_proba(X_subset)
    return log_loss(y, probs)

pipeline = Pipeline([
    ("synthesis", GeneticFeatureSynthesis(num_features=20, max_generations=25)),
    ("select", GeneticFeatureSelector(objective_function=objective, max_generations=30)),
    ("model", GradientBoostingClassifier())
])

pipeline.fit(X, y)
```

✅ This pattern works well when:

- You want rich, expressive symbolic features
- You also want to trim noisy features through model-aware selection
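
The `objective` above fits and scores on the same rows, which tends to favor subsets that overfit. One possible variant scores each subset with cross-validation instead. This is a sketch, not part of Featuristic: `cv_objective` is a hypothetical name, the scaler is an added convenience for the logistic regression, and it assumes the selector treats lower values as better, as the log-loss objective implies.

```python
from sklearn.datasets import load_wine
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def cv_objective(X_subset, y):
    """Mean 3-fold cross-validated log loss for a feature subset (lower is better)."""
    model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=500))
    # scikit-learn reports log loss as a negated score, so flip the sign back
    scores = cross_val_score(model, X_subset, y, cv=3, scoring="neg_log_loss")
    return -scores.mean()


# Quick check on a stand-in dataset, using the first five columns as the "subset"
X_demo, y_demo = load_wine(return_X_y=True, as_frame=True)
value = cv_objective(X_demo.iloc[:, :5], y_demo)
print(f"Cross-validated log loss: {value:.3f}")
```

If the selector accepts any callable with this `(X_subset, y)` signature, as the example above suggests, `cv_objective` could be passed as `objective_function` in place of `objective`. Expect longer run times, since every candidate subset is now fitted three times.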

---

## 💾 Saving & Reusing Pipelines

```python
import joblib
joblib.dump(pipeline, "gfs_pipeline.pkl")
```

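Later, load the saved pipeline and use it to predict on new data. Here `X_new` is a placeholder for new samples with the same columns as the training data: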

```python
loaded = joblib.load("gfs_pipeline.pkl")
loaded.predict(X_new)
```
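
A quick way to gain confidence in a saved pipeline is a round trip: dump it, load it back, and compare predictions. The sketch below does this with a stand-in scikit-learn pipeline and a temporary directory; `demo_pipeline` and the file name are hypothetical, and `PolynomialFeatures` stands in for a Featuristic step.

```python
import tempfile
from pathlib import Path

import joblib
import numpy as np
from sklearn.datasets import load_wine
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures

# Stand-in data: hold out some rows to play the role of X_new
X_demo, y_demo = load_wine(return_X_y=True)
X_train, X_new, y_train, _ = train_test_split(X_demo, y_demo, random_state=0)

demo_pipeline = Pipeline([
    ("features", PolynomialFeatures(degree=2, include_bias=False)),
    ("clf", RandomForestClassifier(n_estimators=50, random_state=0)),
]).fit(X_train, y_train)

# Save and reload inside a temporary directory so nothing is left on disk
with tempfile.TemporaryDirectory() as tmp_dir:
    path = Path(tmp_dir) / "demo_pipeline.pkl"
    joblib.dump(demo_pipeline, path)
    reloaded = joblib.load(path)

# The reloaded pipeline should reproduce the original predictions
same = np.array_equal(demo_pipeline.predict(X_new), reloaded.predict(X_new))
print("Predictions match after reload:", same)
```

One thing to check for the synthesis + selection pipeline: pickle stores plain functions by reference, so if the fitted selector keeps a reference to `objective`, that function has to be importable under the same module path wherever the file is loaded. Whether the selector retains it is an assumption worth verifying. As with any pickle, load only files you trust, and prefer the same library versions that were used when saving.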

---

## ✅ Summary

| Use Case                    | Featuristic Support |
| --------------------------- | ------------------- |
| `Pipeline` chaining         | ✅                  |
| Cross-validation            | ✅                  |
| Grid/randomized search      | ✅                  |
| Synthesis + selection combo | ✅                  |
| `joblib` serialization      | ✅                  |

