Using bfloat16 Automatic Mixed Precision (AMP)
GPU available: False, used: False
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
WARNING:datasets.builder:Reusing dataset super_glue (/Users/caffrey/Documents/research/t-few-master/cache/super_glue/rte/1.0.2/d040c658e2ddef6934fdd97deb45c777b6ff50c524781ea434e7219b56a428a7)
Missing logger folder: exp_out/first_exp/log
WARNING:datasets.builder:Reusing dataset super_glue (/Users/caffrey/Documents/research/t-few-master/cache/super_glue/rte/1.0.2/d040c658e2ddef6934fdd97deb45c777b6ff50c524781ea434e7219b56a428a7)
Train size 32
Eval size 277
| Name | Type | Params
-----------------------------------------------------
0 | model | T5ForConditionalGeneration | 2.8 B
-----------------------------------------------------
2.8 B Trainable params
0 Non-trainable params
2.8 B Total params
11,399.029 Total estimated model params size (MB)
Sanity Checking: 0it [00:00, ?it/s]
Traceback (most recent call last):
File "/Users/caffrey/miniforge3/envs/tongji/lib/python3.9/runpy.py", line 197, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/Users/caffrey/miniforge3/envs/tongji/lib/python3.9/runpy.py", line 87, in _run_code
exec(code, run_globals)
File "/Users/caffrey/Documents/paper/t-few-master/src/pl_train.py", line 86, in <module>
main(config)
File "/Users/caffrey/Documents/paper/t-few-master/src/pl_train.py", line 57, in main
trainer.fit(model, datamodule)
File "/Users/caffrey/miniforge3/envs/tongji/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 770, in fit
self._call_and_handle_interrupt(
File "/Users/caffrey/miniforge3/envs/tongji/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 723, in _call_and_handle_interrupt
return trainer_fn(*args, **kwargs)
File "/Users/caffrey/miniforge3/envs/tongji/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 811, in _fit_impl
results = self._run(model, ckpt_path=self.ckpt_path)
File "/Users/caffrey/miniforge3/envs/tongji/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 1236, in _run
results = self._run_stage()
File "/Users/caffrey/miniforge3/envs/tongji/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 1323, in _run_stage
return self._run_train()
File "/Users/caffrey/miniforge3/envs/tongji/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 1345, in _run_train
self._run_sanity_check()
File "/Users/caffrey/miniforge3/envs/tongji/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 1413, in _run_sanity_check
val_loop.run()
File "/Users/caffrey/miniforge3/envs/tongji/lib/python3.9/site-packages/pytorch_lightning/loops/base.py", line 204, in run
self.advance(*args, **kwargs)
File "/Users/caffrey/miniforge3/envs/tongji/lib/python3.9/site-packages/pytorch_lightning/loops/dataloader/evaluation_loop.py", line 155, in advance
dl_outputs = self.epoch_loop.run(self._data_fetcher, dl_max_batches, kwargs)
File "/Users/caffrey/miniforge3/envs/tongji/lib/python3.9/site-packages/pytorch_lightning/loops/base.py", line 199, in run
self.on_run_start(*args, **kwargs)
File "/Users/caffrey/miniforge3/envs/tongji/lib/python3.9/site-packages/pytorch_lightning/loops/epoch/evaluation_epoch_loop.py", line 88, in on_run_start
self._data_fetcher = iter(data_fetcher)
File "/Users/caffrey/miniforge3/envs/tongji/lib/python3.9/site-packages/pytorch_lightning/utilities/fetching.py", line 178, in __iter__
self.dataloader_iter = iter(self.dataloader)
File "/Users/caffrey/miniforge3/envs/tongji/lib/python3.9/site-packages/torch/utils/data/dataloader.py", line 443, in __iter__
return self._get_iterator()
File "/Users/caffrey/miniforge3/envs/tongji/lib/python3.9/site-packages/torch/utils/data/dataloader.py", line 389, in _get_iterator
return _MultiProcessingDataLoaderIter(self)
File "/Users/caffrey/miniforge3/envs/tongji/lib/python3.9/site-packages/torch/utils/data/dataloader.py", line 1062, in __init__
w.start()
File "/Users/caffrey/miniforge3/envs/tongji/lib/python3.9/multiprocessing/process.py", line 121, in start
self._popen = self._Popen(self)
File "/Users/caffrey/miniforge3/envs/tongji/lib/python3.9/multiprocessing/context.py", line 224, in _Popen
return _default_context.get_context().Process._Popen(process_obj)
File "/Users/caffrey/miniforge3/envs/tongji/lib/python3.9/multiprocessing/context.py", line 284, in _Popen
return Popen(process_obj)
File "/Users/caffrey/miniforge3/envs/tongji/lib/python3.9/multiprocessing/popen_spawn_posix.py", line 32, in __init__
super().__init__(process_obj)
File "/Users/caffrey/miniforge3/envs/tongji/lib/python3.9/multiprocessing/popen_fork.py", line 19, in __init__
self._launch(process_obj)
File "/Users/caffrey/miniforge3/envs/tongji/lib/python3.9/multiprocessing/popen_spawn_posix.py", line 47, in _launch
reduction.dump(process_obj, fp)
File "/Users/caffrey/miniforge3/envs/tongji/lib/python3.9/multiprocessing/reduction.py", line 60, in dump
ForkingPickler(file, protocol).dump(obj)
AttributeError: Can't pickle local object 'create_collate_fn.<locals>.collate_fn'
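The failure mode can be reproduced without the repo at all: a `collate_fn` defined inside another function is a local object, and the `spawn` multiprocessing start method (the default on macOS) must pickle the DataLoader's `collate_fn` to send it to worker processes. A minimal sketch (the function names mirror the traceback, but this is not the actual t-few code):

```python
import pickle


def create_collate_fn():
    # Returning a nested function produces a local object that
    # pickle cannot serialize by qualified name.
    def collate_fn(batch):
        return batch

    return collate_fn


fn = create_collate_fn()
try:
    pickle.dumps(fn)
except AttributeError as e:
    # AttributeError: Can't pickle local object
    # 'create_collate_fn.<locals>.collate_fn'
    print(e)
```

This is why the crash surfaces inside `_MultiProcessingDataLoaderIter`: it only happens when `num_workers > 0` and the start method is `spawn`.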
Are you running this code on a node with multiple GPUs?
I remember we had a similar issue with pickling the scheduler when running on a server with multiple GPUs. Try `export CUDA_VISIBLE_DEVICES=0` in the terminal so that the code uses only one GPU; that fixed the issue for us. Let us know if it works for you.
I hit this error when trying to run the demo. @dptam @jmohta @muqeeth