# Model Persistence

Training a model can take a long time when the data set has millions of samples, as real-world data sets often do. This is why model persistence is important. We build and train our model once in a while and save it to a file. The next time we want to make predictions, we simply load the model from that file and ask it to make predictions. That model is already trained, so no retraining is needed.

In [1]:
# before model persistence

import pandas as pd
from sklearn.tree import DecisionTreeClassifier

music_data = pd.read_csv("music.csv")
X = music_data.drop(columns = 'genre')
y = music_data['genre']

model = DecisionTreeClassifier()
model.fit(X, y)

predictions = model.predict([ [21, 1] ])

In [3]:
# after: train once, then save the model

import pandas as pd
from sklearn.tree import DecisionTreeClassifier
# joblib is a standalone package; sklearn.externals.joblib was removed in scikit-learn 0.23
import joblib

music_data = pd.read_csv("music.csv")
X = music_data.drop(columns = 'genre')
y = music_data['genre']

model = DecisionTreeClassifier()
model.fit(X, y)

# save model to a file
joblib.dump(model, 'music-recommender.joblib')

['music-recommender.joblib']

In [4]:
# load model from the file
model = joblib.load('music-recommender.joblib')

predictions = model.predict([ [21, 1] ])
predictions

array(['HipHop'], dtype=object)
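
The save and load steps above can be combined into a single "load if the file exists, otherwise train and save" helper. The following is only a minimal sketch of that idea, not the notebook's own method. To stay self-contained it swaps in the standard library's `pickle` for `joblib`, and a toy majority-vote "model" (a plain dict) for `DecisionTreeClassifier`. The names `train_toy_model` and `load_or_train`, the file name `toy-model.pkl`, and the sample rows are all hypothetical and made up for illustration.

```python
import os
import pickle
import tempfile
from collections import Counter


def train_toy_model(samples):
    """Hypothetical stand-in for model.fit(): learn the most common genre per gender."""
    counts = {}
    for (age, gender), genre in samples:
        counts.setdefault(gender, Counter())[genre] += 1
    # The "trained model" is a plain dict mapping gender -> most common genre.
    return {gender: c.most_common(1)[0][0] for gender, c in counts.items()}


def load_or_train(path, train_fn):
    """Return (model, trained_now): load from path if present, else train and save."""
    if os.path.exists(path):
        with open(path, "rb") as f:
            return pickle.load(f), False
    model = train_fn()
    with open(path, "wb") as f:
        pickle.dump(model, f)
    return model, True


# Made-up sample rows in the shape ((age, gender), genre); not taken from music.csv.
samples = [((20, 1), "HipHop"), ((23, 1), "HipHop"),
           ((25, 0), "Dance"), ((27, 0), "Dance")]

path = os.path.join(tempfile.mkdtemp(), "toy-model.pkl")

# First call trains and writes the file; second call only reads it back.
first_model, trained_first = load_or_train(path, lambda: train_toy_model(samples))
second_model, trained_second = load_or_train(path, lambda: train_toy_model(samples))
```

With scikit-learn, the same pattern should work by replacing `pickle.load` and `pickle.dump` with `joblib.load(path)` and `joblib.dump(model, path)`. Either way, only load model files from sources you trust, because unpickling can execute arbitrary code.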