# Model export

The previous notebooks explored methods for improving the results of your model. Once you are satisfied with it, you might want to use your model somewhere outside this notebook.

In [10]:
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error
import lightgbm as lgb
import joblib
import json

In [3]:
df = pd.read_csv('data/chl_regression_tutorial.csv')
df_train, df_test = train_test_split(df, test_size=0.2, random_state=42)

features = ['rho_443_a', 'rho_492_a', 'rho_560_a', 'rho_665_a', 'rho_704_a', 'rho_740_a', 'rho_783_a', 'rho_865_a']
target = 'CHL'

X_train = df_train[features]
y_train = df_train[target]

X_test = df_test[features]
y_test = df_test[target]

Saving a scikit-learn model is usually done with the `joblib` package, which lets you save your model in one line of code. The same approach works for the `LGBMRegressor` used here, because it follows the scikit-learn estimator interface. It is also considered good practice to export metadata alongside your model, for example in a JSON file, which can include information such as the features used to train the model.

Exporting the feature order used during training allows the person importing the model to pass the correct features, in the correct order, when applying the model in practice.

In [12]:
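import os

model = lgb.LGBMRegressor(verbose=-1)
model.fit(X_train, y_train)

# Create the output directory if needed, then save the model and its feature order
os.makedirs('models', exist_ok=True)
joblib.dump(model, 'models/regression_model.joblib')
with open('models/regression_features.json', "w", encoding="utf-8") as f:
    json.dump(features, f)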
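
The metadata file can carry more than the feature list. The sketch below is one possible layout, not a fixed convention: the helper name `build_metadata`, the key names, and the shortened feature list are assumptions made for illustration. It records the feature order, the target column, and the installed versions of the packages the model depends on, using `importlib.metadata` from the standard library.

```python
import json
import sys
from importlib.metadata import PackageNotFoundError, version


def build_metadata(features, target, packages=("lightgbm", "scikit-learn", "joblib")):
    """Collect model metadata in a JSON-serializable dict (hypothetical layout)."""
    versions = {}
    for name in packages:
        try:
            versions[name] = version(name)
        except PackageNotFoundError:
            # The package is not installed in this environment; record that explicitly.
            versions[name] = None
    return {
        "features": list(features),  # order matters: keep it as used in training
        "target": target,
        "python_version": ".".join(map(str, sys.version_info[:3])),
        "package_versions": versions,
    }


# Shortened feature list, for illustration only.
metadata = build_metadata(["rho_443_a", "rho_492_a", "rho_560_a"], "CHL")
print(json.dumps(metadata, indent=2))
```

To store this next to the model, the dict can be written with `json.dump(metadata, f)` in the same way as the feature list above.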

The model can then be loaded with `joblib.load` and applied to other data.

In [13]:
model = joblib.load('models/regression_model.joblib')
with open('models/regression_features.json', "r", encoding="utf-8") as f:
    features = json.load(f)

# Select the columns in the same order as during training
X_test = X_test[features]
y_pred = model.predict(X_test)
mse = mean_squared_error(y_test, y_pred)
print(f'MSE: {mse}')

MSE: 3.090875641325473
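
When the model is applied to new data, it can help to verify that every exported feature is actually present before predicting. The following is a minimal sketch of such a check; the helper name `ordered_features` and the example column names (including `site_id`) are hypothetical and not part of the notebook.

```python
def ordered_features(available_columns, expected_features):
    """Return the expected features in training order, or raise if any is missing."""
    available = set(available_columns)
    missing = [name for name in expected_features if name not in available]
    if missing:
        raise ValueError(f"Input data is missing expected features: {missing}")
    return list(expected_features)


# New data whose columns arrive in a different order, with one extra column.
new_columns = ["rho_560_a", "site_id", "rho_443_a", "rho_492_a"]
expected = ["rho_443_a", "rho_492_a", "rho_560_a"]

selected = ordered_features(new_columns, expected)
print(selected)
```

With a DataFrame of new data, say `df_new` (a hypothetical name), `df_new[ordered_features(df_new.columns, features)]` would select the columns in training order and fail early with a readable message if one is absent.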


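One thing to be wary of is loading your model in a different Python environment, in particular one with a different version of scikit-learn or LightGBM: a file written by `joblib` is not guaranteed to load, or to behave identically, under other library versions. To avoid this, a model is often hosted on a dedicated server and queried remotely instead of copying the model file, but this requires significant infrastructure setup.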
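
A lighter-weight safeguard is to compare the package versions in the loading environment against those in use at export time. The sketch below assumes those versions were written to the metadata file when the model was exported, which the notebook above does not do; the function name `find_version_mismatches` and the recorded version strings are placeholders for illustration.

```python
from importlib.metadata import PackageNotFoundError, version


def find_version_mismatches(recorded_versions):
    """Return {package: (recorded, installed)} for every package whose version differs."""
    mismatches = {}
    for name, recorded in recorded_versions.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None  # the package is absent from this environment
        if installed != recorded:
            mismatches[name] = (recorded, installed)
    return mismatches


# Placeholder versions standing in for what was stored at export time.
recorded_versions = {"lightgbm": "0.0.0", "scikit-learn": "0.0.0"}

mismatches = find_version_mismatches(recorded_versions)
for name, (recorded, installed) in mismatches.items():
    print(f"{name}: exported with {recorded}, installed is {installed}")
```

A mismatch does not necessarily mean the model will fail to load, but it is a signal to re-check predictions on known data before relying on them.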