PicklingError on compute with HyperbandSearchCV #549
That looks like an error from unclear class definitions: Python doesn't know what to pickle. Here's a good SO question: https://stackoverflow.com/questions/1412787/picklingerror-cant-pickle-class-decimal-decimal-its-not-the-same-object

Basically, move the class definition out of the notebook and into an importable module. I'd be surprised if this is an issue with skorch: https://skorch.readthedocs.io/en/stable/user/save_load.html
@fonnesbeck have you had a chance to look into this again? I was recently able to use HyperbandSearchCV with Skorch.
Yes, removing my model class from the notebook and putting it into a Python file did the trick. The error message will continue to confuse users, though.
I suspect there's not much we can do about it, since it's an error from Python about a different package. We just happen to hit it here since dask needs to pickle things to move them around :/
Would this be a good place for us to build custom serialization? Is there an obvious subclass for all of these and a clean way of serializing them? (I also ran into this)
Does anyone have a reproducible example? This doesn't do it:

```python
from distributed import Client
import torch
import skorch
import pickle

client = Client()

class DNNRegressor(torch.nn.Module):
    pass

dnnr = skorch.NeuralNetRegressor(
    module=DNNRegressor,
    module__n_feature=128,
    module__n_hidden=128,
    module__n_output=1,
    module__dropout_rate=0.5,
    criterion=torch.nn.MSELoss,
)

pickle.loads(pickle.dumps(dnnr))
client.scatter([dnnr], broadcast=True)
```

Do I need Hyperband to reproduce the problem?
Here's a reproducer:

```python
from distributed import Client
from dask_ml.model_selection import HyperbandSearchCV
from dask_ml.datasets import make_classification
import torch
import skorch
import torch.optim as optim
import torch.nn as nn
import torch.nn.functional as F
from skorch import NeuralNetRegressor
from scipy.stats import loguniform, uniform

client = Client()
X, y = make_classification(chunks=(10, -1))

class HiddenLayerNet(nn.Module):
    def __init__(self, n_features=10, n_outputs=1, n_hidden=100, activation="relu"):
        super().__init__()
        self.fc1 = nn.Linear(n_features, n_hidden)
        self.fc2 = nn.Linear(n_hidden, n_outputs)
        self.activation = getattr(F, activation)

    def forward(self, x, **kwargs):
        return self.fc2(self.activation(self.fc1(x)))

niceties = {
    "callbacks": False,
    "warm_start": True,
    "train_split": None,
    "max_epochs": 1,
}
model = NeuralNetRegressor(
    module=HiddenLayerNet,
    module__n_features=X.shape[1],
    optimizer=optim.SGD,
    criterion=nn.MSELoss,
    lr=0.0001,
    **niceties,
)
params = {
    "module__activation": ["relu", "elu", "softsign", "leaky_relu", "rrelu"],
    "batch_size": [32, 64, 128, 256],
    "optimizer__lr": loguniform(1e-4, 1e-3),
    "optimizer__weight_decay": loguniform(1e-6, 1e-3),
    "optimizer__momentum": uniform(0, 1),
    "optimizer__nesterov": [True],
}
search = HyperbandSearchCV(model, params, random_state=2, verbose=True, max_iter=2)
search.fit(X, y)
```
It is odd that it requires HyperbandSearchCV, though. We might try various combinations of scatter/submit.
It's almost like the class is being mutated, by Hyperband or someone else? I'll look a bit today.
Nothing on the pickling yet, but a couple updates to James' reproducer: use `make_regression` with `float32` casts (so `NeuralNetRegressor.fit` gets a valid regression target) and start the client with `processes=True`:

```python
from distributed import Client
from dask_ml.model_selection import HyperbandSearchCV
from dask_ml.datasets import make_classification, make_regression
import torch
import skorch
import torch.optim as optim
import torch.nn as nn
import torch.nn.functional as F
from skorch import NeuralNetRegressor
from scipy.stats import loguniform, uniform

client = Client(processes=True)
X, y = make_regression(chunks=(10, -1))
y = y.reshape(-1, 1).astype("float32")
X = X.astype("float32")

class HiddenLayerNet(nn.Module):
    def __init__(self, n_features=10, n_outputs=1, n_hidden=100, activation="relu"):
        super().__init__()
        self.fc1 = nn.Linear(n_features, n_hidden)
        self.fc2 = nn.Linear(n_hidden, n_outputs)
        self.activation = getattr(F, activation)

    def forward(self, x, **kwargs):
        return self.fc2(self.activation(self.fc1(x)))

niceties = {
    "callbacks": False,
    "warm_start": True,
    "train_split": None,
    "max_epochs": 1,
}
model = NeuralNetRegressor(
    module=HiddenLayerNet,
    module__n_features=X.shape[1],
    optimizer=optim.SGD,
    criterion=nn.MSELoss,
    lr=0.0001,
    **niceties,
)
params = {
    "module__activation": ["relu", "elu", "softsign", "leaky_relu", "rrelu"],
    "batch_size": [32, 64, 128, 256],
    "optimizer__lr": loguniform(1e-4, 1e-3),
    "optimizer__weight_decay": loguniform(1e-6, 1e-3),
    "optimizer__momentum": uniform(0, 1),
    "optimizer__nesterov": [True],
}
search = HyperbandSearchCV(model, params, random_state=2, verbose=True, max_iter=2)
search.fit(X, y)
```

But I'm still seeing

```
_pickle.PicklingError: Can't pickle <class '__main__.HiddenLayerNet'>: attribute lookup HiddenLayerNet on __main__ failed
```

with that.
This rabbit hole keeps on going. I don't fully understand the issue, but the original exception came from trying to pickle

```python
>>> model.module().to("cpu")
HiddenLayerNet(
  (fc1): Linear(in_features=10, out_features=100, bias=True)
  (fc2): Linear(in_features=100, out_features=1, bias=True)
)
```

That's an instance of the interactively defined class. Apparently something (cloudpickle? Dask?) has trouble serializing those when they're attributes of another object. Anyway, we can get around that by serializing it separately:

```python
import cloudpickle
import skorch
from distributed.protocol import dask_serialize, dask_deserialize


@dask_serialize.register(skorch.NeuralNet)
def serialize_skorch(x):
    has_module = hasattr(x, "module_")
    headers = {"has_module": has_module}
    if has_module:
        module = x.__dict__.pop("module_")
        try:
            frames = [cloudpickle.dumps(x), cloudpickle.dumps(module)]
        finally:
            x.__dict__["module_"] = module
    else:
        frames = [cloudpickle.dumps(x)]
    return headers, frames


@dask_deserialize.register(skorch.NeuralNet)
def deserialize_skorch(header, frames):
    model = cloudpickle.loads(frames[0])
    if header["has_module"]:
        module = cloudpickle.loads(frames[1])
        model.module_ = module
    return model
```

But now we face a trickier problem: Hyperband calls `sklearn.base.clone`. I'd hoped that would just work, but it's failing some tests. Will need to look more later.
I'm attempting to do a hyperparameter search using `HyperbandSearchCV` on a PyTorch model that has been wrapped with `skorch`, but am running into a failure when I call `fit`. The exception does not seem to make sense.

My model is a subclass of `torch.nn.Module` that is just a deep neural network regressor, and it has been wrapped by a skorch `NeuralNetRegressor`. Any obvious reason for this to be happening?

Running dask_ml 1.0.0, skorch 0.6.0 and pytorch 1.1.0 on a GCS instance.