[ENH] save/load aka serialization/deserialization for estimators #3336
Conversation
I see, with this design you don't have to implement separate functions or logic, just overload the three new methods.
Hey @fkiraly ,

```python
if isinstance(serial, tuple):
    cls = serial[0]
    stored = serial[1]
    return cls.load_from_serial(stored)
elif isinstance(serial, (str, ZipFile)):
    if isinstance(serial, str):
        zipfile = ZipFile(serial)
    else:
        zipfile = serial  # already an open ZipFile
    with zipfile.open("metadata", mode="r") as metadata:
        cls = metadata.read()
    with zipfile.open("object", mode="r") as object:
        return cls.load_from_path(object.read())
```

Here, I am assuming …
The idea is that, in the default implementation, …
Could you confirm if this works for DL classifiers? When I try, I get: `TypeError: object of type 'ABCMeta' has no len()`
@achieveordie, in the tests, this is currently skipped for all DL classifiers, since they are not compatible with … Re your proposed change, is there a draft PR? I could look at it. Your error message might be related to …
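As an aside (not from the thread): one likely culprit in the snippet above is that `ZipFile` members are read back as raw `bytes`, not as a class object, so the stored class reference has to be serialized and deserialized itself, e.g. with `pickle`. A minimal self-contained illustration (`Dummy` is a hypothetical stand-in for an estimator class):

```python
import io
import pickle
import zipfile


class Dummy:
    """Hypothetical stand-in for an estimator class."""


buf = io.BytesIO()
with zipfile.ZipFile(buf, mode="w") as zf:
    # store the class reference itself, not just its name or repr
    zf.writestr("metadata", pickle.dumps(Dummy))

with zipfile.ZipFile(buf) as zf:
    raw = zf.read("metadata")   # this is bytes, not a class
    cls = pickle.loads(raw)     # this is the actual class object again

assert isinstance(raw, bytes)
assert cls is Dummy
```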
@achieveordie, I suppose I have misunderstood your question. Did you want to know whether the default in this PR works for DL classifiers? The answer is "no", and I know that it doesn't work. However, with the abstract methods in place, it should be possible to override this in the deep learning base class with something that works, as you have probably tried to do (this was the idea, I just don't know precisely what would work).
Fixes #3022 - adds save/load functionality for DL estimators. Also changes the file format of file saves slightly, but this has not been released yet and therefore does not require deprecation.

The save/load functionality for classical estimators is from #3336 and has been extended to DL models along the same design. Keras is known to be problematic for certain kinds of (de)serialization, so this uses `model.save()` from `keras` along with `pickle.dump()`. The estimators are saved as `.zip` files. Components of this archive include:

1. Metadata
2. Object
3. Keras History
4. Keras Model subdirectory

The first two components are the same as in the previously mentioned PR, stored differently to [avoid](#3336 (comment)) this problem. All the attributes (optimizers, history, etc.) are saved and restored appropriately.
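For illustration, an archive with the layout described above could be produced like this; a stdlib-only sketch where the member names follow the list in this PR, but the payloads are placeholder stand-ins (no actual keras model is saved):

```python
import pickle
import tempfile
import zipfile
from pathlib import Path

# build a placeholder archive mirroring the described layout
archive = Path(tempfile.mkdtemp()) / "estimator.zip"

with zipfile.ZipFile(archive, mode="w") as zf:
    zf.writestr("metadata", pickle.dumps("SomeDLClassifier"))    # 1. Metadata
    zf.writestr("object", pickle.dumps({"n_epochs": 1}))         # 2. Object
    zf.writestr("history", pickle.dumps({"loss": [0.7, 0.4]}))   # 3. Keras History
    zf.writestr("keras/saved_model.pb", b"")  # 4. stand-in for model.save() output

with zipfile.ZipFile(archive) as zf:
    names = zf.namelist()
```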
Serialization and deserialization for sktime estimators

Serialization and deserialization interface for sktime estimators, supporting the following workflows:

1. serialization to an in-memory object and deserialization
2. serialization to a single file and deserialization

Initially proposed in sktime issues [#3128](sktime/sktime#3128) and [#3336](sktime/sktime#3336).

The proposed design introduces:

* an object interface point for estimators, `save`
* a loose (module-level) function `load`
* a file format specification `skt` for serialized estimator objects

`save` and `load` are counterparts and universal. Concrete class or intermediate class specific logic is implemented in:

* `save`, a class method
* `load_from_path` and `load_from_serial`, class methods called by `load`

while no class specific logic is present in `load`.
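A minimal self-contained sketch of these extension points, using only the standard library (the method bodies are illustrative `pickle`-based defaults, not sktime's actual implementation, and `ToyEstimator` is a hypothetical example class):

```python
import os
import pickle
import tempfile
from zipfile import ZipFile


class BaseObject:
    """Sketch of the extension points: save, load_from_serial, load_from_path."""

    def save(self, path=None):
        # in-memory serialization if no path is given, single-file (zip) otherwise
        if path is None:
            return (type(self), pickle.dumps(self))
        with ZipFile(path, mode="w") as zf:
            zf.writestr("metadata", pickle.dumps(type(self)))
            zf.writestr("object", pickle.dumps(self))
        return path

    @classmethod
    def load_from_serial(cls, stored):
        return pickle.loads(stored)

    @classmethod
    def load_from_path(cls, stored):
        return pickle.loads(stored)


def load(serial):
    """Class-agnostic loader: dispatches to the class methods above."""
    if isinstance(serial, tuple):
        cls, stored = serial
        return cls.load_from_serial(stored)
    with ZipFile(serial) as zf:
        cls = pickle.loads(zf.read("metadata"))
        return cls.load_from_path(zf.read("object"))


class ToyEstimator(BaseObject):
    def __init__(self, a=1):
        self.a = a


est = ToyEstimator(a=7)
restored = load(est.save())  # workflow 1: in-memory round trip

path = os.path.join(tempfile.mkdtemp(), "est.zip")
est.save(path)
restored_file = load(path)   # workflow 2: single-file round trip
```

Under this design, subclasses with non-picklable components (e.g. deep learning estimators) would override the three class methods rather than `load` itself.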
This adds a prototype design for serializing and deserializing estimators.
This follows the current (Aug 19) draft STEP 27 sktime/enhancement-proposals#27.
This is done via `save` and two `load` methods which can be overridden by class, and a generic, class-agnostic `load`, in the `BaseObject` module.

Also changes the pickle/unpickle test to a generic serialize/deserialize test which allows custom, class specific implementations of serialize and deserialize.
FYI @AurumnPegasus - design alternative to #3128 which avoids the workflow where we have to construct an object first when we load.