Horovod error: TypeError: __init__() missing 2 required positional arguments: 'named_parameters' and 'compression' #147

Closed
marsggbo opened this issue Jun 5, 2021 · 1 comment

marsggbo commented Jun 5, 2021

Reproduce command

mpirun -np 2 python run.py +trainer.accelerator=horovod

The error is raised during trainer.test().

The full error output is below:

Error executing job with overrides: ['+trainer.accelerator=horovod']
Traceback (most recent call last):
  File "run.py", line 31, in main
    return train(config)
  File "/home/user/code/lightning-hydra-template/src/train.py", line 82, in train
    trainer.test()
  File "/home/user/.conda/envs/torch18/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 579, in test
    results = self._run(model)
  File "/home/user/.conda/envs/torch18/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 753, in _run
    self.pre_dispatch()
  File "/home/user/.conda/envs/torch18/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 778, in pre_dispatch
    self.accelerator.pre_dispatch(self)
  File "/home/user/.conda/envs/torch18/lib/python3.8/site-packages/pytorch_lightning/accelerators/accelerator.py", line 108, in pre_dispatch
    self.training_type_plugin.pre_dispatch()
  File "/home/user/.conda/envs/torch18/lib/python3.8/site-packages/pytorch_lightning/plugins/training_type/horovod.py", line 93, in pre_dispatch
    optimizers = [
  File "/home/user/.conda/envs/torch18/lib/python3.8/site-packages/pytorch_lightning/plugins/training_type/horovod.py", line 94, in <listcomp>
    hvd.DistributedOptimizer(
  File "/home/user/.conda/envs/torch18/lib/python3.8/site-packages/horovod/torch/optimizer.py", line 585, in DistributedOptimizer
    return cls(optimizer.param_groups, named_parameters, compression, backward_passes_per_step, op,
  File "/home/user/.conda/envs/torch18/lib/python3.8/site-packages/horovod/torch/optimizer.py", line 41, in __init__
    super(self.__class__, self).__init__(params)
TypeError: __init__() missing 2 required positional arguments: 'named_parameters' and 'compression'
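
One plausible reading of the traceback above, not confirmed anywhere in this thread, is that the optimizer passed to hvd.DistributedOptimizer had already been wrapped once (for example during an earlier fit) before trainer.test() wrapped it again. The sketch below is pure Python and does not import Horovod or PyTorch. It assumes the wrapper class is built dynamically from the class of the optimizer it receives, and it copies the `super(self.__class__, self).__init__(params)` pattern from the last frame of the traceback. `BaseOptimizer`, `_Wrapper`, `wrap`, and `try_double_wrap` are hypothetical stand-ins, not Horovod or Lightning APIs.

```python
class BaseOptimizer:
    """Hypothetical stand-in for a plain optimizer class."""

    def __init__(self, params):
        self.param_groups = list(params)


class _Wrapper(BaseOptimizer):
    """Hypothetical stand-in for the wrapper whose __init__ is in the traceback."""

    def __init__(self, params, named_parameters, compression):
        # Same pattern as the failing line: resolve the parent via self.__class__.
        super(self.__class__, self).__init__(params)
        self.named_parameters = named_parameters
        self.compression = compression


def wrap(optimizer, named_parameters=None, compression=None):
    # Assumption: the wrapper is a new class derived from the optimizer's own
    # class, reusing the wrapper's methods (including __init__).
    cls = type(
        optimizer.__class__.__name__,
        (optimizer.__class__,),
        dict(_Wrapper.__dict__),
    )
    return cls(optimizer.param_groups, named_parameters, compression)


def try_double_wrap():
    """Wrap once (works), then wrap the result again and return the error text."""
    once = wrap(BaseOptimizer([{"params": []}]))
    try:
        wrap(once)  # parent class is now itself a wrapper class
    except TypeError as exc:
        return str(exc)
    return None


if __name__ == "__main__":
    print(try_double_wrap())
```

In this sketch the first wrap succeeds because the parent `__init__` takes only `params`. On the second wrap the parent is already a wrapper class, so `super(self.__class__, self).__init__(params)` reaches the wrapper's own `__init__` with a single argument, and Python reports the two missing positional arguments named in the traceback.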

ashleve (Owner) commented Jun 5, 2021

@marsggbo Thanks. I see you already resolved the issue in Lightning-AI/pytorch-lightning#7839.

ashleve closed this as completed Jun 5, 2021