[BUG] PicklingError: "Can't pickle <function <lambda> at 0x2bc431280>: attribute lookup <lambda> on __main__ failed" happened when fitting model #1843

Closed
joshua-xia opened this issue Jun 21, 2023 · 6 comments · Fixed by #1957
Labels
bug Something isn't working good first issue Good for newcomers

Comments

@joshua-xia

joshua-xia commented Jun 21, 2023

Describe the bug
When testing a model (probably any forecasting model; I tried TFTModel, NLinearModel, etc.), I set the following model parameters:

force_reset=True,
save_checkpoints=True,
optimizer_kwargs={"lr": results.suggestion()},
pl_trainer_kwargs={"callbacks": [early_stopper]},
add_encoders={
    'cyclic': {'future': ['month']},
    'datetime_attribute': {'future': ['hour', 'dayofweek']},
    'position': {'past': ['relative'], 'future': ['relative']},
    'custom': {'past': [lambda idx: (idx.year - 1950) / 50]},
    'transformer': Scaler(StandardScaler())
}

model.fit(
    series=[train_1, train_2, train_3],
    val_series=[val_1, val_2, val_3],
    past_covariates=[train_1_covariates, train_2_covariates, train_3_covariates],
    val_past_covariates=[val_1_covariates, val_2_covariates, val_3_covariates],
)

I then train the model with validation data, which should trigger saving the best model checkpoint.

However, the following error occurs:

634 pickler.persistent_id = persistent_id
635 print(type(obj))

--> 636 pickler.dump(obj)
637 data_value = data_buf.getvalue()
638 zip_file.write_record('data.pkl', data_value, len(data_value))

PicklingError: Can't pickle <function <lambda> at 0x2bc431280>: attribute lookup <lambda> on __main__ failed

When I remove add_encoders, training works fine.
I think add_encoders introduces something that cannot be dumped correctly.

Could anybody take a look at this issue? It is blocking my model performance testing.

Thanks!

To Reproduce
See the description.

Expected behavior
The model should train and save the best checkpoint.

System (please complete the following information):

  • Python version: 3.8
  • darts version 0.24.0
@joshua-xia joshua-xia added bug Something isn't working triage Issue waiting for triaging labels Jun 21, 2023
@joshua-xia
Author

joshua-xia commented Jun 21, 2023

In class TorchForecastingModel:

    with open(path, "wb") as f_out:
-->     torch.save(self, f_out)

The error happens on this line.

@joshua-xia
Author

The Python documentation says that lambda functions cannot be pickled:
https://docs.python.org/3/library/pickle.html#id2

Should we consider using dill to dump the model?
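
To illustrate the limitation, here is a minimal sketch that uses only the standard library. The names `encoder` and `try_pickle` are made up for this example and are not part of darts, and the exact exception type and message can differ between Python versions.

```python
import pickle

# Same kind of callable as the 'custom' encoder in the report.
encoder = lambda idx: (idx.year - 1950) / 50


def try_pickle(obj):
    """Return None on success, else the name of the exception raised."""
    try:
        pickle.dumps(obj)
    except (pickle.PicklingError, AttributeError) as err:
        return type(err).__name__
    return None


# pickle stores functions by reference, i.e. by module and qualified name.
# A lambda's qualified name is "<lambda>", which cannot be looked up again.
print(encoder.__qualname__)
print(try_pickle(encoder))
```

Because the lookup is by name, the failure does not depend on what the lambda computes, only on the fact that it has no importable name.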

@philippGraf

Yep, I had the same issue.
I used cloudpickle as the pickling module.
I don't really know if it is connected, but this also seems to matter if you want to train on GPU and evaluate on CPU.
I ended up with:

  • pickling with cloudpickle (handling lambda)
  • using torch==2.0.1
  • darts==0.24

@joshua-xia
Author

The workaround is to replace the lambda with a named function; my suggested fix is to use dill to dump the model.
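
As a sketch of that workaround, the lambda from the report can be replaced by a module-level function. The name `scaled_year` is hypothetical, the `transformer` entry is left out so the snippet runs without darts, and `datetime.date` stands in for the pandas index that darts would pass to the function.

```python
import pickle
from datetime import date


def scaled_year(idx):
    """Named replacement for `lambda idx: (idx.year - 1950) / 50`."""
    return (idx.year - 1950) / 50


add_encoders = {
    "cyclic": {"future": ["month"]},
    "datetime_attribute": {"future": ["hour", "dayofweek"]},
    "position": {"past": ["relative"], "future": ["relative"]},
    "custom": {"past": [scaled_year]},  # function reference, not a lambda
}

# The dict should now survive a pickle round trip.
restored = pickle.loads(pickle.dumps(add_encoders))
print(restored["custom"]["past"][0](date(2000, 1, 1)))
```

Since pickle stores only a reference to the function, `scaled_year` has to be importable under the same module and name wherever the saved model is loaded later.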

@madtoinou madtoinou added good first issue Good for newcomers and removed triage Issue waiting for triaging labels Jun 23, 2023
@madtoinou
Collaborator

Thank you for reporting the bug with a code snippet and finding a workaround @joshua-xia, thank you @philippGraf for sharing another solution.

I don't think we want to add a dependency just to pickle the lambda functions in add_encoders. Another way to fix this would be to add kwargs to the TorchForecastingModel.save() method and pass them on to torch.save(), so that users can change the pickle module and protocol as described in the torch documentation.

These kwargs should also be added to the TorchForecastingModel._init_model() as it calls the save() method when the model is trained for the first time.
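
The forwarding idea can be sketched without torch. In the snippet below, `save_object` is a hypothetical stand-in that mirrors the `pickle_module` and `pickle_protocol` keyword arguments of `torch.save`, and `SketchModel` is a made-up class, not the darts implementation. The only point is that `save()` passes its extra keyword arguments through unchanged.

```python
import io
import pickle


def save_object(obj, f, pickle_module=pickle, pickle_protocol=2):
    """Hypothetical stand-in mirroring the torch.save keyword arguments."""
    pickle_module.dump(obj, f, protocol=pickle_protocol)


class SketchModel:
    """Made-up model class; only the kwargs forwarding matters here."""

    def __init__(self, name):
        self.name = name

    def save(self, f, **pkl_kwargs):
        # Forward pickle_module / pickle_protocol untouched to the saver.
        save_object(self, f, **pkl_kwargs)


buf = io.BytesIO()
SketchModel("demo").save(buf, pickle_protocol=4)

# Pickle streams for protocol 2 and above start with 0x80 and the protocol number.
header = buf.getvalue()[:2]
print(header)
```

With this shape, a caller who has cloudpickle or dill installed could pass it as `pickle_module` without the library itself depending on it; that variant is not exercised here.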

Would you be interested in opening a PR to fix this?

@dennisbader
Collaborator

I think we can just update the docs (and maybe enforce it) to use a named function instead of a lambda for the encoders.
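
One way such enforcement could look is sketched below. `is_lambda` and `check_no_lambdas` are hypothetical helpers, not existing darts functions, and the sketch assumes custom encoders sit under `add_encoders['custom']` as lists keyed by 'past' or 'future', as in the report.

```python
def is_lambda(func):
    """Return True if func was created with a lambda expression."""
    return callable(func) and getattr(func, "__name__", "") == "<lambda>"


def check_no_lambdas(add_encoders):
    """Raise ValueError if any custom encoder is a lambda (hypothetical check)."""
    custom = add_encoders.get("custom", {})
    for kind, funcs in custom.items():
        for func in funcs:
            if is_lambda(func):
                raise ValueError(
                    f"custom {kind} encoder is a lambda and cannot be pickled; "
                    "use a named function instead"
                )


def scaled_year(idx):
    """Named encoder function that passes the check."""
    return (idx.year - 1950) / 50


check_no_lambdas({"custom": {"past": [scaled_year]}})  # passes silently

try:
    check_no_lambdas({"custom": {"past": [lambda idx: idx.year]}})
except ValueError as err:
    print(err)
```

Failing at model construction gives the user a clear message up front, instead of a PicklingError when the first checkpoint is written.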
