
dill load and sklearn clone result in error #1026

Open · DCoupry opened this issue Oct 23, 2023 · 7 comments

Comments

DCoupry (Author) commented Oct 23, 2023

Dumping a skorch model with dill and then reloading it (it does not matter whether with dill or pickle) makes it incompatible with sklearn.base.clone, apparently because some attributes come back empty (the optimizers, I think, but I had no time to investigate further). This behaviour occurs with neither pickle nor joblib.

This makes functions such as cross_val_predict unusable after loading a previously dumped model.

To reproduce (Python 3.10; tried with a bunch of versions of dill / torch / skorch / sklearn, and all of them exhibit the bug):

from sklearn.datasets import make_regression
from sklearn.base import clone
import torch
import skorch
import dill

X, y = make_regression()
base_model = skorch.NeuralNetRegressor(torch.nn.Linear(100, 1))
cloned_model = clone(base_model)
dumped_model = dill.loads(dill.dumps(base_model))
cloned_dumped_model = clone(dumped_model)

base_model.fit(X, y)  # works
cloned_model.fit(X, y)  # works
dumped_model.fit(X, y)  # works
cloned_dumped_model.fit(X, y)  # does not work
BenjaminBossan (Collaborator) commented:

I could not reproduce the issue 100%, so I had to make some small changes:

from sklearn.datasets import make_regression
from sklearn.base import clone
import numpy as np
import torch
import skorch
import dill

dill.__version__  # 0.3.6

X, y = make_regression()
X, y = X.astype(np.float32), y.astype(np.float32).reshape(-1, 1)  # added
base_model = skorch.NeuralNetRegressor(torch.nn.Linear(100, 1))
cloned_model = clone(base_model)
dumped_model = dill.loads(dill.dumps(base_model))
cloned_dumped_model = clone(dumped_model)

base_model.fit(X, y) # works
cloned_model.fit(X, y) # works
dumped_model.fit(X, y) # THIS ALREADY FAILS FOR ME
cloned_dumped_model.fit(X, y) # fails with same error

First, could you please confirm that my snippet produces the same error for you?

Second, is the error you get also:

...

1225 self.notify("on_batch_begin", batch=batch, training=training)
1226 step = step_fn(batch, **fit_params)
-> 1227 self.history.record_batch(prefix + "_loss", step["loss"].item())
1228 batch_size = (get_len(batch[0]) if isinstance(batch, (tuple, list))
1229 else get_len(batch))
1230 self.history.record_batch(prefix + "_batch_size", batch_size)

TypeError: 'NoneType' object is not subscriptable

DCoupry (Author) commented Oct 23, 2023

I can reproduce it, yes, and indeed the dumped version dies as well. I am confused, because the dumped model did work for me at one point while the cloned one did not. Trying to refine this.

What does work is:

import pickle

dumped_model = dill.loads(pickle.dumps(base_model))  # dump with plain pickle, load with dill
dumped_model.fit(X, y)
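Note that dill extends the standard pickle protocol, so a stream produced by plain pickle.dumps can be read back with dill.loads; that this combination works suggests the problem lies on dill's dump side rather than its load side.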

The error is the same; the loss is None here. The process looks okay to me and goes through all the initializations, and I have tracked it to the train_step function, where printing the optimizers gives an empty list. But when you take the models themselves and print the pre-fit attributes, everything looks good! Quite frustrating.
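A quick diagnostic sketch along these lines makes the asymmetry visible (the attribute names below are skorch internals and may differ between versions):

# Sketch: compare skorch's internal attribute lists on the original net
# and on the dill round-tripped net (names are skorch internals).
for attr in ("_modules", "_criteria", "_optimizers"):
    print(attr,
          getattr(base_model, attr, "<missing>"),
          getattr(dumped_model, attr, "<missing>"))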

DCoupry (Author) commented Oct 23, 2023

Okay, after some checks:

from sklearn.datasets import make_regression
from sklearn.base import clone
import numpy as np
import torch
import skorch
import dill
import pickle

dill.__version__  # 0.3.6

X, y = make_regression()
X, y = X.astype(np.float32), y.astype(np.float32).reshape(-1, 1)  # added
base_model = skorch.NeuralNetRegressor(torch.nn.Linear(100, 1))
cloned_model = clone(base_model)
dumped_model = dill.loads(dill.dumps(base_model))
dumped_fitted_model = dill.loads(dill.dumps(base_model.fit(X, y)))
cloned_dumped_model = clone(dumped_model)
cloned_dumped_fitted_model = clone(cloned_dumped_model)

base_model.fit(X, y) # works
cloned_model.fit(X, y) # works
dumped_model.fit(X, y) # fails
dumped_fitted_model.fit(X, y) # works
cloned_dumped_fitted_model.fit(X, y) # fails
cloned_dumped_model.fit(X, y) # fails with same error

BenjaminBossan (Collaborator) commented Oct 24, 2023

Thanks for investigating further. This is super strange IMO, because the _optimizers attribute is empty but _modules and _criteria are not, even though these three attributes are treated exactly the same. Do you know whether dill uses __getstate__ and __setstate__, or whether it has equivalent methods? Maybe we can salvage something there.

Edit: Just checked it; dill does call __getstate__ and __setstate__, which makes this even more confusing.
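For reference, a minimal standalone sketch (unrelated to skorch) confirms this:

import dill

class Probe:
    def __init__(self):
        self.x = 1

    def __getstate__(self):
        print("__getstate__ called")
        return self.__dict__

    def __setstate__(self, state):
        print("__setstate__ called")
        self.__dict__.update(state)

# prints "__getstate__ called" on dump and "__setstate__ called" on load
dill.loads(dill.dumps(Probe()))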

DCoupry (Author) commented Oct 31, 2023

We could print a trace of the execution up to the final fit and diff across the two, maybe?

BenjaminBossan (Collaborator) commented:

Sorry, I don't understand. How can this be done?

DCoupry (Author) commented Oct 31, 2023

I was thinking pdb might be of some help here; I will report if I manage anything. In the meantime, I have found that dumping byref with dill fixes the failure:

# works
dill.loads(dill.dumps(base_model, byref=True)).fit(X, y)
clone(dill.loads(dill.dumps(base_model.fit(X, y), byref=True))).fit(X, y)
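(For context: dill's byref=True setting pickles certain objects, such as classes and modules, by reference instead of by value, which makes its behaviour much closer to plain pickle. That is consistent with the plain-pickle dump working earlier in this thread.)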
