MAML does not seem to support multi-GPU training #10

ypy516478793 · 2021-10-08T15:30:06Z

MAML trains fine on a single GPU, but training in parallel on multiple GPUs raises an error. The traceback is:

  File "/home/cougarnet.uh.edu/pyuan2/anaconda3/envs/py37/lib/python3.7/site-packages/torch/nn/parallel/parallel_apply.py", line 61, in _worker
    output = module(*input, **kwargs)
  File "/home/cougarnet.uh.edu/pyuan2/anaconda3/envs/py37/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/cougarnet.uh.edu/pyuan2/Projects/LibFewShot/core/model/backbone/conv_four.py", line 69, in forward
    out1 = self.layer1(x)
  File "/home/cougarnet.uh.edu/pyuan2/anaconda3/envs/py37/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/cougarnet.uh.edu/pyuan2/anaconda3/envs/py37/lib/python3.7/site-packages/torch/nn/modules/container.py", line 117, in forward
    input = module(input)
  File "/home/cougarnet.uh.edu/pyuan2/anaconda3/envs/py37/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/cougarnet.uh.edu/pyuan2/Projects/LibFewShot/core/model/backbone/utils/maml_module.py", line 63, in forward
    if self.weight.fast is not None and self.bias.fast is not None:
AttributeError: 'Tensor' object has no attribute 'fast'
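
A likely cause, offered as an assumption rather than a confirmed diagnosis: `nn.DataParallel` replicates the module onto each device, and the replicas hold broadcast copies of the parameters as plain tensors, so a Python attribute such as `fast` attached to the original `nn.Parameter` is not carried over. The sketch below uses a hypothetical stand-in for the layer in `maml_module.py` (`LinearFast` is an illustrative name, not LibFewShot code). It reads `fast` through `getattr`, so a tensor without the attribute falls back to the regular weights instead of raising. This only avoids the crash; it does not by itself deliver fast weights to the replicas.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class LinearFast(nn.Linear):
    """Hypothetical MAML-style linear layer with optional fast weights."""

    def __init__(self, in_features, out_features):
        super().__init__(in_features, out_features)
        # Custom attribute attached to the Parameter objects themselves.
        self.weight.fast = None
        self.bias.fast = None

    def forward(self, x):
        # getattr tolerates tensors that lack the custom attribute.
        w_fast = getattr(self.weight, "fast", None)
        b_fast = getattr(self.bias, "fast", None)
        if w_fast is not None and b_fast is not None:
            return F.linear(x, w_fast, b_fast)
        return F.linear(x, self.weight, self.bias)


layer = LinearFast(4, 2)
x = torch.randn(3, 4)

# A tensor derived from a Parameter does not inherit its Python attributes,
# which is similar to what a replica sees.
derived = layer.weight.clone()
print(hasattr(layer.weight, "fast"), hasattr(derived, "fast"))

out = layer(x)
```

Accessing `derived.fast` directly would raise the same kind of `AttributeError` as in the traceback above.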

wZuck · 2021-10-10T05:55:29Z

Hi, thanks for the report. We are working on this issue and will reply as soon as possible.

yangcedrus · 2021-10-11T11:28:34Z

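Hi, regarding the MAML multi-GPU problem you reported: we have confirmed that it exists. Supporting multiple GPUs would require fairly large changes to the code, so we plan to fix this and other larger issues in a later update.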

ypy516478793 · 2021-10-11T15:12:33Z

OK, thanks!

yangcedrus · 2022-09-26T06:42:48Z

MAML can now be trained on multiple GPUs.

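One issue remains unresolved: under DistributedDataParallel, MAML cannot be used together with SyncBatchNorm. We will analyze how the missing synchronization affects the final results and look for a solution.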
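
One way to respect this constraint in user code, shown only as a sketch under assumptions: skip the SyncBatchNorm conversion for models that rely on fast weights before wrapping them in DistributedDataParallel. `prepare_norm_layers`, `uses_fast_weights`, and `make_backbone` are hypothetical names for illustration and are not part of LibFewShot; `nn.SyncBatchNorm.convert_sync_batchnorm` is the standard PyTorch call.

```python
import torch.nn as nn


def prepare_norm_layers(model: nn.Module, uses_fast_weights: bool) -> nn.Module:
    """Convert BatchNorm to SyncBatchNorm unless the model relies on fast weights.

    `uses_fast_weights` is a hypothetical flag standing in for "this is MAML".
    """
    if uses_fast_weights:
        # Keep per-process BatchNorm; its statistics are not synchronized.
        return model
    # Standard PyTorch API: swaps BatchNorm layers for SyncBatchNorm.
    return nn.SyncBatchNorm.convert_sync_batchnorm(model)


def make_backbone() -> nn.Module:
    # Tiny illustrative backbone, not the conv_four model from the traceback.
    return nn.Sequential(nn.Conv2d(3, 8, 3), nn.BatchNorm2d(8), nn.ReLU())


maml_model = prepare_norm_layers(make_backbone(), uses_fast_weights=True)
other_model = prepare_norm_layers(make_backbone(), uses_fast_weights=False)
```

With this choice each process normalizes with its own batch statistics, which is the missing synchronization mentioned above.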

RL-VIG locked and limited conversation to collaborators Oct 13, 2022

wZuck converted this issue into discussion #59 Oct 13, 2022

This issue was moved to a discussion.
