How can I save a pyod model? #88

Open
singyaowu opened this issue May 8, 2019 · 13 comments
singyaowu commented May 8, 2019

I've just trained an auto-encoder model, and I wonder how I can save it so that I don't need to train it again the next time I want to use it. I didn't see any save-related function in auto_encoder.py, so I'm not sure whether one exists. Have you implemented this kind of function?

yzhao062 (Owner) commented May 8, 2019

Agreed that model-save functionality should be added; marked as a todo task. I am not sure whether pickle will work (hopefully yes), so I will run some tests.
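A quick way to test whether pickle works is an in-memory round-trip. A minimal sketch, using a hypothetical stand-in class in place of a fitted detector (a real check would fit e.g. pyod.models.knn.KNN and pickle that instead):

```python
import pickle

# Stand-in for a fitted pyod detector (the attribute names mimic
# pyod's fitted attributes, but this class is only illustrative).
class FittedDetector:
    def __init__(self):
        self.threshold_ = 0.5
        self.decision_scores_ = [0.1, 0.9, 0.3]

clf = FittedDetector()
blob = pickle.dumps(clf)           # serialize to bytes
restored = pickle.loads(blob)      # deserialize

# The round-trip preserves the fitted attributes.
print(restored.threshold_)         # -> 0.5
print(restored.decision_scores_)   # -> [0.1, 0.9, 0.3]
```

If `pickle.dumps` raises here, the detector holds something unpicklable (as happens with the Keras-backed models discussed below).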

osancus commented Aug 6, 2019

When trying to save an AutoEncoder model using pickle, the following error occurs. Any idea how I can fix it?

TypeError: can't pickle _thread.RLock objects

# code
clf = fit_model(X_train)
pickle.dump(clf, open('./autoencoder.h5', 'wb'))

yzhao062 (Owner) commented:

@epicsol-inc sorry for the late response. The AE in pyod is written with Keras, so saving the model can be tricky.

To my understanding, Keras models may not be picklable (keras-team/keras#10528)...

If saving the model is a must, you may have to copy the code out of auto_encoder.py directly. Sorry for the inconvenience.

sbysiak commented Aug 14, 2019

@epicsol-inc
I managed to save it using dill (https://pypi.org/project/dill/), whose syntax is very similar to pickle's:

with open(out_fname, 'wb') as f:
    dill.dump(model, f, dill.HIGHEST_PROTOCOL)

You can check whether it works in your case.
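For reference, a full dill round-trip looks like the sketch below. dill is a third-party drop-in superset of pickle, so the fallback import here exists only so the example still runs where dill is not installed; the dict is a stand-in for a real model object:

```python
import io

# dill (https://pypi.org/project/dill/) exposes the same API as pickle,
# including dump/load and HIGHEST_PROTOCOL; fall back to the stdlib
# pickle only so this sketch runs without dill installed.
try:
    import dill as serializer
except ImportError:
    import pickle as serializer

model = {"weights": [0.1, 0.2], "threshold": 0.5}  # stand-in for a model

buf = io.BytesIO()
serializer.dump(model, buf, serializer.HIGHEST_PROTOCOL)  # save
buf.seek(0)
restored = serializer.load(buf)                           # load back
print(restored == model)  # -> True
```

The advantage of dill over pickle is that it can serialize more object types (e.g. lambdas and closures), which is why it sometimes succeeds where pickle fails.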

yzhao062 (Owner) commented:

@sbysiak Thanks for the note, much appreciated. I will check it out and consider adding it to the documentation :)

lgo7 commented Oct 16, 2019

Any news regarding saving PyOD models? I need to save an IForest model; can I use pickle?

yzhao062 (Owner) commented:

> Any news regarding saving PyOD models? I need to save an IForest model; can I use pickle?

Sorry, I have not yet tested which approach works. If pickle does not work, I would suggest dill (https://pypi.org/project/dill/), as mentioned above.

This will be listed on the top of my priority list now.

yzhao062 self-assigned this Oct 16, 2019
lgo7 commented Oct 24, 2019

I've used pickle.dump and it worked!

AlexDelPab commented:

I've also used pickle.dump() for the kNN, OC-SVM, IForest, and FABOD classifiers; saving and loading them works with:

# save
pickle.dump(clf, open(folder + clf_name + '.h5', 'wb'))
# load
clf = pickle.loads(open(folder + 'k Nearest Neighbors (kNN).h5', 'rb').read())
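The same save/load can be written with context managers so the file handles are closed deterministically. A sketch, with a plain dict standing in for a fitted clf and a temp-file path chosen only for the example:

```python
import os
import pickle
import tempfile

clf = {"name": "IForest", "threshold_": 0.5}  # stand-in for a fitted detector

path = os.path.join(tempfile.gettempdir(), "iforest.pkl")

# save: the with-block closes the file even if dump() raises
with open(path, 'wb') as f:
    pickle.dump(clf, f, protocol=pickle.HIGHEST_PROTOCOL)

# load
with open(path, 'rb') as f:
    loaded = pickle.load(f)

print(loaded == clf)  # -> True
```

Note that a `.pkl` extension is more conventional than `.h5` for pickle files; `.h5` usually signals an HDF5 file, which pickle does not produce.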

bhowmiks commented Apr 17, 2020

Pickle and dill can save the model successfully, but loading from those formats can be slow. For the autoencoder model, I saved the Keras weights as HDF5 and the classifier object as a pickle, for faster loads and less disk space.

import pickle

from pyod.models.auto_encoder import AutoEncoder

autoenModel = AutoEncoder()
autoenModel.fit(X=x_train)

## serialize the network architecture to JSON
model_json = autoenModel.model_.to_json()
with open(model_path + ".json", "w") as json_file:
    json_file.write(model_json)
## serialize the weights to HDF5
autoenModel.model_.save_weights(model_path + "model.h5")

## then set the inner Keras model to None: it is not picklable,
## and dropping it makes the pickle much smaller
autoenModel.model_ = None
with open(newpath + "//" + model_name + "_model" + ".pickle", 'wb') as handle:
    pickle.dump(autoenModel, handle, protocol=pickle.HIGHEST_PROTOCOL)

Model Load

## load the auto encoder instance
with open(path + "//" + model_n + "_model" + ".pickle", 'rb') as handle:
    loaded_model = pickle.load(handle)

## load the JSON architecture and rebuild the Keras model
## (model_from_json is imported from keras.models)
with open(path + "//" + model_n + ".json", 'r') as json_file:
    loaded_model_json = json_file.read()
loaded_model_json = loaded_model_json.replace("\"ragged\": false,", " ")
loaded_model_ = model_from_json(loaded_model_json)
## load the weights into the new model
loaded_model_.load_weights(path + "//" + model_n + "model.h5")
print("Loaded model from disk")

## reattach the rebuilt Keras model to the auto encoder instance
loaded_model.model_ = loaded_model_

This loads almost 5x faster, and the saved model is about 10x smaller.

yzhao062 added a commit that referenced this issue Sep 19, 2020
ezzeldinadel commented Feb 9, 2021

> loaded_model_ = model_from_json(loaded_model_json)

What is model_from_json? Is it https://www.tensorflow.org/api_docs/python/tf/keras/models/model_from_json ?

SaqlainHussainShah commented:

I have tried the .pkl and .h5 extensions along with dill, pickle, and joblib, but the issue persists:

Unable to save model: can't pickle _thread.RLock objects
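The RLock error comes from the Keras model object held inside the detector: thread locks cannot be pickled by any of pickle, dill, or joblib. The workaround mirrors the weights-plus-pickle recipe above: persist the Keras part separately, detach it, and pickle the rest. A stdlib-only sketch that reproduces the error, using an actual threading.RLock as a stand-in for the Keras model:

```python
import pickle
import threading

# Hypothetical stand-in for a fitted AutoEncoder: the inner Keras model
# holds thread locks, which is exactly what triggers the pickling error.
class Detector:
    def __init__(self):
        self.threshold_ = 0.5
        self.model_ = threading.RLock()  # proxy for the unpicklable Keras model

clf = Detector()

try:
    pickle.dumps(clf)                    # fails: locks are not picklable
    raised = False
except TypeError as exc:
    raised = True
    print("pickle failed:", exc)

# Workaround: save the Keras part separately (e.g. save_weights to HDF5),
# detach it, pickle the remainder, and reattach it after loading.
clf.model_ = None
restored = pickle.loads(pickle.dumps(clf))
print(restored.threshold_)               # -> 0.5
```

Changing the file extension (.pkl vs .h5) has no effect here; the failure happens during serialization, before anything is written to disk.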

lfvillavicencio commented:
> Pickle and dill can save successfully. But these formats can make it time consuming to load the model. For autoencoder model, I saved the weights as HDF5 and the classifier object as pickle for faster loads and less disk space. [...]
> loaded_model_ = model_from_json(loaded_model_json) [...]
> This works almost 5x faster and model size is 10X smaller.

Hi!
Where do you import that model_from_json function from?
Thanks!
