# Lab 13: Exporting scikit-learn Models & Building Gradio Demos

This notebook walks you through saving scikit-learn models (as Pipelines) and creating simple Gradio demos you can use for your final project.

In this lab we will:

- Save trained scikit-learn Pipelines with `joblib`.
- Load saved models in a new session (simulating deployment).
- Build simple Gradio demos for classification and regression, based on templates you can adapt for your final project.

## 1. Install dependencies

Run this cell to install `gradio` (for demos), `joblib` (for saving models), and `scikit-learn` (for training).

In [None]:
!pip install -q gradio joblib scikit-learn

## 2. Classification example: Train a small example model (Iris data)

We'll train an `SVC` inside a `Pipeline` with a `StandardScaler` and save it with `joblib`.

In [None]:
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC
import joblib
import numpy as np

# Load data
X, y = load_iris(return_X_y=True)

# Split (we keep this small and reproducible)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Build pipeline (ALWAYS save the whole pipeline so the scaler travels with the model)
clf_pipeline = Pipeline([
    ('scaler', StandardScaler()),
    ('svc', SVC(probability=True, random_state=42))
])

# Fit
clf_pipeline.fit(X_train, y_train)

# Quick eval
acc = clf_pipeline.score(X_test, y_test)
print(f"Test accuracy: {acc:.3f}")

# Save the pipeline
joblib.dump(clf_pipeline, 'iris_svc_pipeline.joblib')
print('Saved iris_svc_pipeline.joblib')


##### <font color='red'>**TRY IT**</font> &#x1f9e0;: Inspect the saved file in the file browser (left side of Colab) to confirm `iris_svc_pipeline.joblib` exists.
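Beyond confirming that the file exists, we can check that a reloaded pipeline reproduces the in-memory predictions. The cell below is a minimal sketch of that round trip. It refits a small pipeline so it runs on its own, and it writes to a temporary folder under the hypothetical file name `roundtrip_check.joblib` (an assumption, chosen so the lab's own file is not overwritten).

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.datasets import load_iris
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Refit a small pipeline so this cell is self-contained
X, y = load_iris(return_X_y=True)
pipeline = Pipeline([
    ('scaler', StandardScaler()),
    ('svc', SVC(probability=True, random_state=42)),
])
pipeline.fit(X, y)

# Save to a temporary folder (hypothetical file name, not the lab's file)
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'roundtrip_check.joblib')
    joblib.dump(pipeline, path)
    file_size = os.path.getsize(path)
    restored = joblib.load(path)

# The restored pipeline should give the same predictions as the original
same_predictions = np.array_equal(pipeline.predict(X), restored.predict(X))
print(f"File size: {file_size} bytes")
print(f"Predictions match after reload: {same_predictions}")
```

If the two sets of predictions ever differ, the usual suspect is a preprocessing step that was fitted outside the pipeline and therefore never saved.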

## 3. Simulate a fresh session: load the model and build a Gradio demo (classification)

In practice, demos and apps run in a separate process from training. We'll simulate that by **loading** the saved joblib file and creating a `predict` function that the Gradio UI calls.

In [None]:
import joblib
import gradio as gr
import numpy as np
from sklearn.datasets import load_iris  # needed for target_names in a fresh session

# Load the saved pipeline
loaded_clf = joblib.load('iris_svc_pipeline.joblib')
print('Loaded pipeline: ', loaded_clf)

# Map numeric target to names (keeps UI friendly)
target_names = load_iris().target_names.tolist()

def predict_iris(sepal_length, sepal_width, petal_length, petal_width):
    """Takes four numeric inputs and returns a predicted species and probabilities."""
    X_input = np.array([[sepal_length, sepal_width, petal_length, petal_width]])
    pred_idx = int(loaded_clf.predict(X_input)[0])
    proba = loaded_clf.predict_proba(X_input)[0]
    proba_str = ', '.join([f"{name}: {p:.2f}" for name, p in zip(target_names, proba)])
    return f"Predicted species: {target_names[pred_idx]}\nProbabilities -> {proba_str}"

# Build Gradio interface (simple sliders for numeric features)
clf_demo = gr.Interface(
    fn=predict_iris,
    inputs=[
        gr.Slider(4.0, 8.0, value=5.0, step=0.1, label='Sepal length (cm)'),
        gr.Slider(2.0, 4.5, value=3.0, step=0.1, label='Sepal width (cm)'),
        gr.Slider(1.0, 7.0, value=4.0, step=0.1, label='Petal length (cm)'),
        gr.Slider(0.1, 2.5, value=1.2, step=0.1, label='Petal width (cm)'),
    ],
    outputs='text',
    title='Iris SVC demo (sklearn pipeline)')

# Launch the demo inline in Colab
print('Starting Gradio demo... (this cell will show a link)')
clf_demo.launch(share=True)
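
The demo above formats the probabilities into a string. As an alternative, Gradio's `gr.Label` output accepts a dictionary mapping class names to confidences and renders it as a bar display. The sketch below shows only the prediction function, under a few assumptions: `model` is a stand-in fitted in the cell (in the lab it would be the pipeline returned by `joblib.load`), and `predict_iris_label` is a hypothetical name.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

iris = load_iris()
target_names = iris.target_names.tolist()

# Stand-in for the pipeline you would normally get from joblib.load(...)
model = Pipeline([
    ('scaler', StandardScaler()),
    ('svc', SVC(probability=True, random_state=42)),
]).fit(iris.data, iris.target)

def predict_iris_label(sepal_length, sepal_width, petal_length, petal_width):
    """Return {species name: probability}, the dict format gr.Label accepts."""
    X_input = np.array([[sepal_length, sepal_width, petal_length, petal_width]])
    proba = model.predict_proba(X_input)[0]
    # Cast to plain floats so the values serialize cleanly
    return {name: float(p) for name, p in zip(target_names, proba)}

result = predict_iris_label(5.0, 3.4, 1.5, 0.2)
print(result)

# To use it in the demo, pass this function as fn= and replace outputs='text'
# with outputs=gr.Label(num_top_classes=3) in gr.Interface.
```

Returning structured data rather than a preformatted string keeps the prediction function easy to test without launching the UI.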


## 4. Regression example (California housing, tiny pipeline)

This is a template for regression. We scale features, fit a linear regression, save the pipeline, and create a Gradio demo that accepts numeric inputs. For your final project, replace the feature names and ranges with those that match your dataset.

In [None]:
from sklearn.datasets import fetch_california_housing
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X_reg, y_reg = fetch_california_housing(return_X_y=True)

# For speed, let's use a small subset (optional)
Xr = X_reg[:3000]
yr = y_reg[:3000]

reg_pipeline = Pipeline([
    ('scaler', StandardScaler()),
    ('lr', LinearRegression())
])

reg_pipeline.fit(Xr, yr)
print('R^2 on training set:', reg_pipeline.score(Xr, yr))
joblib.dump(reg_pipeline, 'california_lr_pipeline.joblib')
print('Saved california_lr_pipeline.joblib')
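
The cell above reports R^2 on the training set only, which tends to be optimistic. For your final project, a held-out split gives a more honest estimate. The sketch below illustrates the pattern on synthetic data from `make_regression` (an assumption, used so the cell runs without downloading anything); the variable names are placeholders.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data with 8 features (NOT the California housing data)
X_demo, y_demo = make_regression(
    n_samples=500, n_features=8, noise=10.0, random_state=42
)

# Hold out 20% of the rows for evaluation
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=42
)

pipe = Pipeline([
    ('scaler', StandardScaler()),
    ('lr', LinearRegression()),
])
pipe.fit(X_tr, y_tr)  # the scaler is fitted on the training rows only

train_r2 = pipe.score(X_tr, y_tr)
test_r2 = pipe.score(X_te, y_te)
print(f"R^2 on training set: {train_r2:.3f}")
print(f"R^2 on held-out set: {test_r2:.3f}")
```

Because the scaler lives inside the pipeline, it never sees the held-out rows during fitting, which is one more reason to keep preprocessing and estimator together.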

In [None]:
import joblib
import gradio as gr
import numpy as np

reg_loaded = joblib.load('california_lr_pipeline.joblib')
print('Loaded regressor: ', reg_loaded)

feature_names = [
    'MedInc', 'HouseAge', 'AveRooms', 'AveBedrms',
    'Population', 'AveOccup', 'Latitude', 'Longitude'
]

def predict_house(*vals):
    """Accepts 8 numeric values (order as feature_names) and returns predicted median house value."""
    X_in = np.array([list(vals)])
    pred = reg_loaded.predict(X_in)[0]
    # The target is expressed in units of $100,000, so rescale to dollars
    return f"Predicted median house value: ${pred*100000:,.0f}"

inputs = [gr.Number(label=n, value=0.0) for n in feature_names]
reg_demo = gr.Interface(fn=predict_house, inputs=inputs, outputs='text', title='Housing price demo')
reg_demo.launch(share=True)
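
Because `predict_house` accepts `*vals`, nothing stops a blank or misordered input from reaching the model. A small validation step before calling `predict` can catch that. The sketch below is one possible approach, not part of the lab's template: `build_feature_row` is a hypothetical helper, the sample values are illustrative only, and it assumes that an empty `gr.Number` field arrives as `None`.

```python
import numpy as np

feature_names = [
    'MedInc', 'HouseAge', 'AveRooms', 'AveBedrms',
    'Population', 'AveOccup', 'Latitude', 'Longitude'
]

def build_feature_row(*vals):
    """Validate raw UI values and return a (1, 8) array in feature_names order."""
    if len(vals) != len(feature_names):
        raise ValueError(f"Expected {len(feature_names)} values, got {len(vals)}")
    # Assumption: a field left blank in the UI is passed in as None
    missing = [name for name, v in zip(feature_names, vals) if v is None]
    if missing:
        raise ValueError(f"Missing values for: {', '.join(missing)}")
    return np.array([[float(v) for v in vals]])

# Illustrative values only, in the same order as feature_names
row = build_feature_row(8.3, 41.0, 7.0, 1.0, 322.0, 2.6, 37.88, -122.23)
print(row.shape)
```

Inside a Gradio app, one option is to catch the `ValueError` in the prediction function and re-raise it as `gr.Error` so the message appears in the UI.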


## 5. Final project checklist (copy this into your demo notebook to make sure you complete every step!)

- [ ] Trained scikit-learn model inside a `Pipeline` (preprocessing + estimator)
- [ ] Saved `.joblib` file included in your repo
- [ ] A separate demo notebook that **loads** the `.joblib` file and runs a Gradio app
- [ ] Friendly UI labels
- [ ] Description at the top of the notebook telling users how to run the demo

### Final tips
- If you have categorical features, use `gr.Dropdown()` with the exact categories.
- If a numeric feature has a bounded range, use `gr.Slider()` with a sensible range and step.
- Keep the demo focused: one clean prediction function is better than multiple confusing ones.
- If your model requires a mapping (e.g., label encoder), include that mapping in the notebook and save it with the pipeline or as a separate `.joblib` file.
- You can have some "pre-loaded" examples in your notebook so users can quickly select test cases. Check out [the documentation](https://www.gradio.app/docs/gradio/examples) for more details.
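To illustrate the tips on categorical features and mappings, here is a minimal sketch that keeps the encoding inside the pipeline and reads the exact category list back from the fitted encoder. Everything in it is hypothetical: the toy data, the column names `color` and `size_cm`, and the function `predict_item`.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical toy data: one categorical and one numeric feature
df = pd.DataFrame({
    'color': ['red', 'green', 'blue', 'green', 'red', 'blue', 'red', 'green'],
    'size_cm': [1.0, 2.5, 3.0, 2.0, 1.2, 3.3, 0.9, 2.4],
})
labels = ['small', 'large', 'large', 'large', 'small', 'large', 'small', 'large']

# The encoder is a pipeline step, so it is saved along with the model
preprocess = ColumnTransformer([
    ('cat', OneHotEncoder(handle_unknown='ignore'), ['color']),
    ('num', StandardScaler(), ['size_cm']),
])
pipe = Pipeline([('prep', preprocess), ('clf', LogisticRegression())])
pipe.fit(df, labels)

# Read the exact categories from the fitted encoder for the dropdown choices
encoder = pipe.named_steps['prep'].named_transformers_['cat']
color_choices = encoder.categories_[0].tolist()

def predict_item(color, size_cm):
    """Build a one-row DataFrame with the training column names and predict."""
    row = pd.DataFrame({'color': [color], 'size_cm': [size_cm]})
    return str(pipe.predict(row)[0])

print(color_choices)
print(predict_item('red', 1.1))

# In the demo, gr.Dropdown(choices=color_choices, label='Color') would be one input.
```

Pulling the choices from the fitted encoder, rather than typing them by hand, keeps the dropdown in sync with what the model saw during training.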

### Persistent hosting (optional)

- `gradio` gives you a temporary public link via `share=True`. This is fine for quick demos but not for production, since the link eventually expires.
- For a persistent demo, you can deploy on Hugging Face Spaces (or wherever you prefer). Those platforms accept `gradio` apps and can host them permanently. See [this guide](https://www.gradio.app/guides/using-hugging-face-integrations) for information on how to host on Hugging Face. If you go that route, be sure to keep your `joblib` file in the repo.
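
Keeping the `.joblib` file in the repo is only part of the story: scikit-learn's persistence documentation cautions that a saved model should be loaded with the same library versions that created it. One way to record those versions is to generate pinned requirement lines from the training environment. The sketch below assumes a hypothetical helper named `pinned_requirements`; whether your hosting platform reads a `requirements.txt` file is something to confirm in its documentation.

```python
from importlib.metadata import PackageNotFoundError, version

def pinned_requirements(packages):
    """Return 'name==version' lines for installed packages, bare names otherwise."""
    lines = []
    for name in packages:
        try:
            lines.append(f"{name}=={version(name)}")
        except PackageNotFoundError:
            # Package is not installed here, so leave it unpinned
            lines.append(name)
    return lines

requirements = pinned_requirements(['scikit-learn', 'joblib'])
print('\n'.join(requirements))

# To write the file next to your notebook, one could run:
# with open('requirements.txt', 'w') as f:
#     f.write('\n'.join(requirements) + '\n')
```

Run this in the same environment where you trained and saved the model, so the pinned versions match the ones baked into the file.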

Good luck and ***have fun with it!*** This is the exciting part: showing the world your awesome achievements!