
Any progress for pytorch 1.5? #6

Closed
LuChengTHU opened this issue Jul 19, 2020 · 4 comments

@LuChengTHU

Hi, do you have any ideas for running the code on PyTorch 1.5 with data parallel?

@jerrybai1995
Member

Yup, I have pushed a branch named `pytorch-1.5` to the repo. Please pull the repo, do `git checkout pytorch-1.5`, and train the model there. Also, see the updated README for what's been changed.

Let me know if it works!

@LuChengTHU
Author

Thanks! I'm also confused about the `func_copy` model: it seems we need 2x GPU memory because of this implementation. Is there a more efficient way of implementing the backward method?

@jerrybai1995
Member

Yes, that is a design choice forced by PyTorch's nn.DataParallel. If you use only 1 GPU (i.e., no nn.DataParallel), then you can do the actual implicit differentiation entirely in the backward() in deq.py, without func_copy. You can simply do it through one layer, as we intended.
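
For reference, here is a minimal single-GPU sketch of that pattern. It loosely follows the general DEQ recipe rather than the repo's actual deq.py: `SimpleDEQ`, the tanh-linear cell, and the plain fixed-point loops are placeholders for illustration. The equilibrium is found without building a graph, and a backward hook iterates a vector-Jacobian product to recover the implicit gradient through "one layer".

```python
import torch
import torch.nn as nn

class SimpleDEQ(nn.Module):
    """Illustrative single-GPU DEQ layer; self.f stands in for the real cell."""

    def __init__(self, dim, max_iter=50, tol=1e-5):
        super().__init__()
        self.f = nn.Linear(dim, dim)
        self.max_iter, self.tol = max_iter, tol

    def forward(self, x):
        # Phase 1: find the equilibrium z* = f(z*, x) without tracking the graph.
        with torch.no_grad():
            z = torch.zeros_like(x)
            for _ in range(self.max_iter):
                z_next = torch.tanh(self.f(z) + x)
                if (z_next - z).norm() < self.tol:
                    z = z_next
                    break
                z = z_next

        # Re-engage autograd with one application of f at the equilibrium, so the
        # corrected gradient can flow to the parameters and to x.
        z = torch.tanh(self.f(z) + x)

        # A second, detached application that only serves the backward hook's
        # vector-Jacobian products.
        z0 = z.detach().requires_grad_()
        f0 = torch.tanh(self.f(z0) + x)

        def backward_hook(grad):
            # Solve g = grad + (df/dz)^T g by fixed-point iteration, i.e.
            # g = (I - df/dz)^{-T} grad, the implicit-function-theorem gradient.
            g = grad
            for _ in range(self.max_iter):
                g = torch.autograd.grad(f0, z0, g, retain_graph=True)[0] + grad
            return g

        if z.requires_grad:
            z.register_hook(backward_hook)
        return z
```

With this, something like `SimpleDEQ(64)(torch.randn(8, 64)).sum().backward()` backpropagates through the equilibrium without storing any of the solver iterations.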

However, the weird thing we found was that once nn.DataParallel was invoked, the parameter gradients on the replica would all vanish. In other words, the gradients computed in the backward() would disappear. This happened in PyTorch 1.4; I'm not so sure about 1.5. But anyway, that was the rationale behind this design choice: we found no better option than to keep a func_copy there for the Jacobian-vector product computation.
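
To make the memory cost concrete, here is a rough sketch of that kind of workaround, under the assumption that func_copy is simply a weight-synchronized duplicate of the layer that the backward pass differentiates through for the Jacobian-vector product. The helper names are hypothetical, not the repo's actual API; the point is that holding a second copy of the weights is where the roughly 2x parameter memory comes from.

```python
import copy
import torch.nn as nn

def make_func_copy(func: nn.Module) -> nn.Module:
    # Duplicate the layer once; the copy only supplies vector-Jacobian
    # products inside the custom backward, so its parameters never need
    # to accumulate gradients themselves.
    func_copy = copy.deepcopy(func)
    for p in func_copy.parameters():
        p.requires_grad_(False)
    return func_copy

def sync_func_copy(func: nn.Module, func_copy: nn.Module) -> None:
    # Refresh the copy before each forward pass so it mirrors the live
    # weights; keeping both modules around is the 2x memory overhead.
    func_copy.load_state_dict(func.state_dict())
```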

Indeed, once we are able to solve this issue, we will have much better memory efficiency than the numbers reported in our paper.

@LuChengTHU
Author

Thanks! I'm looking forward to the improved implementation!
