Run `python scripts/train.py -c examples/bert_crf/configs/resume.yaml` on Windows 11 (i7-12700H, NVIDIA RTX 3070 Laptop GPU) with `modelscope==1.0.3` installed. The same command works on Linux.
2022-12-03 00:15:25,329 - modelscope - INFO - epoch [1][200/239] lr: 5.000e-05, eta: 0:26:17, iter_time: 0.319, data_load_time: 0.005, memory: 4263, loss: 17.1283
2022-12-03 00:15:37,843 - modelscope - WARNING - ('METRICS', 'default', 'ner-metric') not found in ast index file
2022-12-03 00:15:37,843 - modelscope - WARNING - ('METRICS', 'default', 'ner-dumper') not found in ast index file
Total test samples: 0%| | 0/463 [00:00<?, ?it/s]
2022-12-03 00:15:38,091 - modelscope - INFO - PyTorch version 1.12.0 Found.
2022-12-03 00:15:38,093 - modelscope - INFO - Loading ast index from C:\Users\zx920\.cache\modelscope\ast_indexer
2022-12-03 00:15:38,147 - modelscope - INFO - Loading done! Current index file version is 1.0.3, with md5 ab126a3e272314963017d9feade29ae0
Total test samples: 0%| | 0/463 [00:02<?, ?it/s]
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "C:\Users\zx920\.conda\envs\adas\lib\multiprocessing\spawn.py", line 105, in spawn_main
    exitcode = _main(fd)
  File "C:\Users\zx920\.conda\envs\adas\lib\multiprocessing\spawn.py", line 115, in _main
    self = reduction.pickle.load(from_parent)
EOFError: Ran out of input
Traceback (most recent call last):
  File "scripts/train.py", line 54, in <module>
    main(args)
  File "scripts/train.py", line 21, in main
    trainer.train(args.checkpoint_path)
  File "C:\Users\zx920\workspace\AdaSeq\adaseq\trainers\default_trainer.py", line 354, in train
    return super().train(checkpoint_path=checkpoint_path, *args, **kwargs)
  File "C:\Users\zx920\.conda\envs\adas\lib\site-packages\modelscope\trainers\trainer.py", line 459, in train
    self.train_loop(self.train_dataloader)
  File "C:\Users\zx920\.conda\envs\adas\lib\site-packages\modelscope\trainers\trainer.py", line 871, in train_loop
    self.invoke_hook(TrainerStages.after_train_epoch)
  File "C:\Users\zx920\.conda\envs\adas\lib\site-packages\modelscope\trainers\trainer.py", line 977, in invoke_hook
    getattr(hook, fn_name)(self)
  File "C:\Users\zx920\.conda\envs\adas\lib\site-packages\modelscope\trainers\hooks\evaluation_hook.py", line 31, in after_train_epoch
    self.do_evaluate(trainer)
  File "C:\Users\zx920\.conda\envs\adas\lib\site-packages\modelscope\trainers\hooks\evaluation_hook.py", line 35, in do_evaluate
    eval_res = trainer.evaluate()
  File "C:\Users\zx920\.conda\envs\adas\lib\site-packages\modelscope\trainers\trainer.py", line 484, in evaluate
    metric_classes)
  File "C:\Users\zx920\.conda\envs\adas\lib\site-packages\modelscope\trainers\trainer.py", line 921, in evaluation_loop
    data_loader_iters=self._eval_iters_per_epoch)
  File "C:\Users\zx920\.conda\envs\adas\lib\site-packages\modelscope\trainers\utils\inference.py", line 51, in single_gpu_test
    for i, data in enumerate(data_loader):
  File "C:\Users\zx920\.conda\envs\adas\lib\site-packages\torch\utils\data\dataloader.py", line 438, in __iter__
    return self._get_iterator()
  File "C:\Users\zx920\.conda\envs\adas\lib\site-packages\torch\utils\data\dataloader.py", line 384, in _get_iterator
    return _MultiProcessingDataLoaderIter(self)
  File "C:\Users\zx920\.conda\envs\adas\lib\site-packages\torch\utils\data\dataloader.py", line 1048, in __init__
    w.start()
  File "C:\Users\zx920\.conda\envs\adas\lib\multiprocessing\process.py", line 112, in start
    self._popen = self._Popen(self)
  File "C:\Users\zx920\.conda\envs\adas\lib\multiprocessing\context.py", line 223, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "C:\Users\zx920\.conda\envs\adas\lib\multiprocessing\context.py", line 322, in _Popen
    return Popen(process_obj)
  File "C:\Users\zx920\.conda\envs\adas\lib\multiprocessing\popen_spawn_win32.py", line 89, in __init__
    reduction.dump(process_obj, to_child)
  File "C:\Users\zx920\.conda\envs\adas\lib\multiprocessing\reduction.py", line 60, in dump
    ForkingPickler(file, protocol).dump(obj)
  File "C:\Users\zx920\.conda\envs\adas\lib\site-packages\torch\multiprocessing\reductions.py", line 145, in reduce_tensor
    raise RuntimeError("Cowardly refusing to serialize non-leaf tensor which requires_grad, "
RuntimeError: Cowardly refusing to serialize non-leaf tensor which requires_grad, since autograd does not support crossing process boundaries. If you just want to transfer the data, call detach() on the tensor before serializing (e.g., putting it on the queue).
[W C:\cb\pytorch_1000000000000\work\torch\csrc\CudaIPCTypes.cpp:15] Producer process has been terminated before all shared CUDA tensors released. See Note [Sharing CUDA tensors]
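A likely reading of the log: the first epoch trains fine, and the crash happens when the evaluation hook builds its DataLoader with worker processes. On Windows, workers are started with spawn, so the dataset (and everything it references) must be pickled, and somewhere in that object graph sits a non-leaf tensor with `requires_grad=True`, which `torch.multiprocessing` refuses to serialize. The `EOFError` in the spawned child is just the downstream symptom: `reduction.dump` failed in the parent, so the worker read a truncated pickle stream. Below is a minimal standalone sketch of the mechanism and two workarounds; the `FeatureDataset` class is hypothetical, not an AdaSeq or modelscope class.

```python
import torch
from torch.utils.data import DataLoader, Dataset


class FeatureDataset(Dataset):
    """Hypothetical dataset holding a pre-computed tensor (not an AdaSeq class)."""

    def __init__(self, features: torch.Tensor):
        self.features = features

    def __len__(self) -> int:
        return self.features.size(0)

    def __getitem__(self, idx: int) -> torch.Tensor:
        return self.features[idx]


if __name__ == "__main__":  # required on Windows, where workers are spawned
    x = torch.randn(463, 8, requires_grad=True)
    non_leaf = x * 2  # non-leaf tensor with requires_grad=True

    # Failure mode: with num_workers > 0 on Windows, the dataset is pickled
    # for each spawned worker, and pickling a non-leaf tensor that requires
    # grad raises the same RuntimeError as in the log above.
    # DataLoader(FeatureDataset(non_leaf), batch_size=32, num_workers=2)

    # Workaround 1: detach before the tensor crosses the process boundary,
    # as the error message itself suggests.
    loader = DataLoader(FeatureDataset(non_leaf.detach()), batch_size=32, num_workers=2)
    for batch in loader:
        pass

    # Workaround 2: keep the tensor as-is but use num_workers=0, so no
    # worker process is spawned and nothing needs to be pickled.
    loader = DataLoader(FeatureDataset(non_leaf), batch_size=32, num_workers=0)
    for batch in loader:
        pass
```

In AdaSeq itself, the equivalent of workaround 2 would be turning off dataloader workers for evaluation in `resume.yaml`. I have not verified the exact key, but if modelscope follows the mmcv-style convention it would be something like `workers_per_gpu: 0` under `evaluation.dataloader`; check the config schema before relying on that name.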