# Model Packaging and Persistence

Learn how to **save, load, and package** machine learning models for reuse and deployment.

We'll cover:
- Model persistence basics
- Pickle and Joblib usage
- MLflow model logging and packaging

## Why Model Persistence?

Saving a trained model lets you reuse it for predictions later without retraining. Every environment then loads the same fitted parameters, which keeps predictions consistent and shortens deployment.

In [None]:
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier

# Load dataset
iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(iris.data, iris.target, test_size=0.2, random_state=42)

# Train model
model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)

## 🧰 Method 1: Pickle

In [None]:
import pickle

# Save model
with open('rf_model.pkl', 'wb') as f:
    pickle.dump(model, f)

# Load model
with open('rf_model.pkl', 'rb') as f:
    loaded_model = pickle.load(f)

print('✅ Model loaded successfully!')

In [None]:
y_pred = loaded_model.predict(X_test)
print('Sample predictions:', y_pred[:5])
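
Printing a few predictions shows that the loaded model runs, but not that it behaves like the original. As a minimal sketch, the check below round-trips the model through `pickle.dumps` / `pickle.loads` in memory and compares its outputs with the original on the test split. The names `restored_model`, `same_labels`, and `same_probas` are illustrative and not part of the notebook above; the model is rebuilt here only so the block runs on its own.

```python
import pickle

import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Rebuild the same model as in the notebook so this block is self-contained.
iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.2, random_state=42
)
model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)

# Serialize to bytes and back; no file is needed for this check.
restored_model = pickle.loads(pickle.dumps(model))

# Compare predicted labels and class probabilities on the held-out data.
same_labels = np.array_equal(
    model.predict(X_test), restored_model.predict(X_test)
)
same_probas = np.allclose(
    model.predict_proba(X_test), restored_model.predict_proba(X_test)
)

print('Labels match:', same_labels)
print('Probabilities match:', same_probas)
```

The same comparison works for a model loaded from `rf_model.pkl`; the in-memory round trip just avoids leaving files behind.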

## ⚡ Method 2: Joblib

In [None]:
import joblib

# Save model
joblib.dump(model, 'rf_model.joblib')

# Load model
loaded_joblib_model = joblib.load('rf_model.joblib')
print('✅ Joblib model loaded successfully!')
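
`joblib.dump` also accepts a `compress` argument, which usually shrinks the file on disk at the cost of slower saving and loading. The sketch below is one way to compare the two, assuming a compression level of 3 is an acceptable trade-off; the file names and the temporary directory are illustrative choices, not part of the notebook above.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

# Train a small model so this block runs on its own.
iris = load_iris()
model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(iris.data, iris.target)

with tempfile.TemporaryDirectory() as tmp_dir:
    raw_path = os.path.join(tmp_dir, 'rf_model_raw.joblib')
    packed_path = os.path.join(tmp_dir, 'rf_model_compressed.joblib')

    # Same model, saved without and with compression (level 3 is an assumption).
    joblib.dump(model, raw_path)
    joblib.dump(model, packed_path, compress=3)

    raw_size = os.path.getsize(raw_path)
    packed_size = os.path.getsize(packed_path)

    # Loading a compressed file uses the same call as an uncompressed one.
    compressed_model = joblib.load(packed_path)

predictions_match = np.array_equal(
    model.predict(iris.data), compressed_model.predict(iris.data)
)

print('Uncompressed size (bytes):', raw_size)
print('Compressed size (bytes):', packed_size)
print('Predictions match:', predictions_match)
```

Compression changes only how the file is stored, so the loaded model should predict exactly as the original does.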

## 📦 Method 3: MLflow Packaging

In [None]:
import mlflow
import mlflow.sklearn

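with mlflow.start_run(run_name='RF_Model_Packaging') as run:
    # Log the fitted model as an artifact of this run
    mlflow.sklearn.log_model(model, 'random_forest_model')
    # The run ID is needed later to build the model URI for loading
    print('✅ Model logged with MLflow! Run ID:', run.info.run_id)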

### 🔍 Load MLflow Model (Example)

In [None]:
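# Replace <run_id> with the ID of the run that logged the model
# (it is shown in the MLflow UI), then uncomment the lines below.
# model_uri = 'runs:/<run_id>/random_forest_model'
# loaded_mlflow_model = mlflow.sklearn.load_model(model_uri)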

## 🧾 Save Metadata

In [None]:
import json

metadata = {
    'model_name': 'RandomForestClassifier',
    'version': '1.0',
    'accuracy': float(model.score(X_test, y_test)),
    'features': list(iris.feature_names)
}

with open('model_metadata.json', 'w') as f:
    json.dump(metadata, f, indent=4)

print('🗂️ Metadata saved successfully!')
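
Metadata is most useful when it can be tied to one specific model file. One possible approach, sketched below, is to store a SHA-256 checksum of the saved model in the metadata and verify it before loading. The `model_sha256` and `python_version` keys and the `sha256_of_file` helper are hypothetical additions, not part of the notebook above, and a plain dictionary stands in for the trained model so the block runs without scikit-learn.

```python
import hashlib
import json
import os
import pickle
import platform
import tempfile


def sha256_of_file(path):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, 'rb') as f:
        for chunk in iter(lambda: f.read(8192), b''):
            digest.update(chunk)
    return digest.hexdigest()


with tempfile.TemporaryDirectory() as tmp_dir:
    model_path = os.path.join(tmp_dir, 'rf_model.pkl')
    metadata_path = os.path.join(tmp_dir, 'model_metadata.json')

    # Stand-in for the trained model; in the notebook this would be `model`.
    stand_in_model = {'kind': 'placeholder', 'weights': [0.1, 0.2, 0.3]}
    with open(model_path, 'wb') as f:
        pickle.dump(stand_in_model, f)

    # Record which file and which Python version the metadata describes.
    metadata = {
        'model_name': 'RandomForestClassifier',
        'version': '1.0',
        'python_version': platform.python_version(),
        'model_sha256': sha256_of_file(model_path),
    }
    with open(metadata_path, 'w') as f:
        json.dump(metadata, f, indent=4)

    # Later, before loading: confirm the file still matches its metadata.
    with open(metadata_path) as f:
        stored = json.load(f)
    checksum_ok = stored['model_sha256'] == sha256_of_file(model_path)

print('Checksum matches:', checksum_ok)
```

A mismatch means the model file was replaced or corrupted after the metadata was written, so the accuracy and feature list recorded there may no longer describe it.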

## 🧩 Best Practices
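- Use Joblib for scikit-learn models that hold large NumPy arrays; it handles them more efficiently than plain Pickle.
- Always store the model version and metadata alongside the model file.
- Use MLflow to track experiments and package models.
- Test loading in a clean environment with the same library versions used for training.
- Maintain a consistent directory structure.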
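
A consistent directory structure can be as simple as one folder per model name and version, holding the model file and its metadata side by side. The layout and the `model_dir` and `write_package` helpers below are a hypothetical convention for illustration, not something prescribed by Pickle, Joblib, or MLflow; placeholder bytes stand in for a serialized model such as `pickle.dumps(model)`.

```python
import json
import tempfile
from pathlib import Path


def model_dir(root, model_name, version):
    """Build the folder path <root>/<model_name>/v<version>."""
    return Path(root) / model_name / f'v{version}'


def write_package(root, model_name, version, model_bytes, metadata):
    """Write the model file and its metadata into one versioned folder."""
    target = model_dir(root, model_name, version)
    # exist_ok=False raises if this version exists, so nothing is overwritten.
    target.mkdir(parents=True, exist_ok=False)
    (target / 'model.pkl').write_bytes(model_bytes)
    (target / 'metadata.json').write_text(json.dumps(metadata, indent=4))
    return target


with tempfile.TemporaryDirectory() as root:
    package = write_package(
        root,
        'random_forest',
        '1.0',
        b'placeholder bytes',  # stand-in for pickle.dumps(model)
        {'model_name': 'RandomForestClassifier', 'version': '1.0'},
    )
    written_files = sorted(p.name for p in package.iterdir())
    relative_layout = package.relative_to(root).as_posix()

print(relative_layout)
print(written_files)
```

Refusing to overwrite an existing version folder means that a given version number always refers to the same model file.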

## ✅ Summary
You learned how to:
- Save models using Pickle and Joblib
- Log models via MLflow
- Store model metadata for reproducibility
- Prepare models for deployment