
_pickle.PicklingError: Can't pickle <function AcceleratedOptimizer.step> #7

Open
jose opened this issue Mar 6, 2023 · 1 comment

jose commented Mar 6, 2023

$ python main.py --model codet5 --task assert --subset raw --train_batch_size 16 --eval_batch_size 16

results in the following error:

==================== INITIALIZING ====================
Distributed environment: NO
Num processes: 1
Process index: 0
Local process index: 0
Device: cuda
Mixed precision type: fp16

...

==================== LOADING ====================
Loaded config 'T5Config' from 'Salesforce/codet5-base'
Loaded tokenizer 'RobertaTokenizerFast' from 'Salesforce/codet5-base', size: 32100
Loaded unwrapped model 'T5ForConditionalGeneration' from 'Salesforce/codet5-base'
Loaded model 'T5ForConditionalGeneration' from 'Salesforce/codet5-base'
Trainable parameters: 223M
Start loading train data from ../datasets/assert/assert/raw
Loading train data: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 150523/150523 [00:00<00:00, 158473.45it/s]
train data loaded, total size: 150523
Start encoding train data into input features
Encoding: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 150523/150523 [00:16<00:00, 8902.78it/s]
train data encoded, total size: 150523
Start loading valid data from ../datasets/assert/assert/raw
Loading valid data: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 18816/18816 [00:00<00:00, 339752.22it/s]
valid data loaded, total size: 18816
Start encoding valid data into input features
Encoding: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 18816/18816 [00:08<00:00, 2348.51it/s]
valid data encoded, total size: 18816
Data is loaded and prepared

...

==================== TRAINING ====================

...

The best em model is saved to ../outputs/codet5_assert_assert_raw_bs16_ep30_lr5e-05_warmup1000_20230305_151212/models/best_em
Traceback (most recent call last):
  File "/tmp/FineTuner/src/main.py", line 113, in <module>
    main()
  File "/tmp/FineTuner/src/main.py", line 109, in main
    run_fine_tune(args, accelerator, run)
  File "/tmp/FineTuner/src/run_fine_tune.py", line 416, in run_fine_tune
    torch.save(optimizer, os.path.join(save_last_dir, "optimizer.pt"))
  File "/tmp/FineTuner/fine-tuner/lib/python3.9/site-packages/torch/serialization.py", line 423, in save
    _save(obj, opened_zipfile, pickle_module, pickle_protocol)
  File "/tmp/FineTuner/fine-tuner/lib/python3.9/site-packages/torch/serialization.py", line 635, in _save
    pickler.dump(obj)
_pickle.PicklingError: Can't pickle <function AcceleratedOptimizer.step at 0x7ff18e0d3670>: it's not the same object as accelerate.optimizer.AcceleratedOptimizer.step

PS: This issue assumes #5 has been accepted and merged.
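
The pickling failure comes from line 416 saving the whole optimizer object: under accelerate the optimizer is wrapped in an AcceleratedOptimizer whose step method is patched at runtime, so pickle's lookup of AcceleratedOptimizer.step by qualified name no longer matches and torch.save(optimizer, ...) fails. A minimal sketch of a workaround (my assumption, reusing the optimizer and save_last_dir variables that appear in the traceback) would be to save only the state dict, which keeps the same training state but avoids pickling the wrapper:

    import os
    import torch

    # Save only the optimizer state (plain tensors and dicts, picklable)
    # instead of the AcceleratedOptimizer object itself.
    torch.save(optimizer.state_dict(), os.path.join(save_last_dir, "optimizer.pt"))

    # To resume later, rebuild the optimizer as the script already does, then:
    # optimizer.load_state_dict(torch.load(os.path.join(save_last_dir, "optimizer.pt")))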

jose commented Mar 6, 2023

For now, I've removed line 416 (the torch.save(optimizer, ...) call) from run_fine_tune.py and it seems to be working fine. I will update this issue if I find any other related problems.
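
An alternative sketch (an assumption on my side, not code from this repository): since the script already creates an Accelerator, its built-in checkpointing could save the optimizer together with the model and scheduler, avoiding the pickling issue entirely:

    # accelerator is the Accelerator instance the script already initializes;
    # save_last_dir is the checkpoint directory seen in the traceback.
    accelerator.save_state(save_last_dir)

    # Restore everything (model, optimizer, scheduler, RNG states) with:
    # accelerator.load_state(save_last_dir)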
