
GPU memory (VRAM) runs out when training the model #27

Closed
cronfox opened this issue Aug 21, 2021 · 11 comments

@cronfox

cronfox commented Aug 21, 2021

Variable._execution_engine.run_backward(
RuntimeError: CUDA out of memory. Tried to allocate 88.00 MiB (GPU 0; 4.00 GiB total capacity; 2.68 GiB already allocated; 0 bytes free; 2.85 GiB reserved in total by PyTorch)

Could you provide a parameter for adjusting batch_size? My GPU only has 4 GB of VRAM (GTX 1050 Ti), and training with the default parameters frequently runs out of memory.

@wangkewk

CorentinJ/Real-Time-Voice-Cloning#664

Have a look at this one.


@babysor
Owner

babysor commented Aug 21, 2021

Yep, as shown above. I previously ran this on a 940MX and a batch size of 2 worked; in your case you could try 4.

@cronfox
Author

cronfox commented Aug 21, 2021

> Yep, as shown above. I previously ran this on a 940MX and a batch size of 2 worked; in your case you could try 4.

I tried raising batch_size to 6 by modifying a few files, and training runs very stably.

@XiuChen-Liu
Contributor

> Yep, as shown above. I previously ran this on a 940MX and a batch size of 2 worked; in your case you could try 4.
>
> I tried raising batch_size to 6 by modifying a few files, and training runs very stably.

May I ask which files you modified? I looked at the reference above and studied it for quite a while, but couldn't really figure it out.

@cronfox
Author

cronfox commented Aug 23, 2021

> Yep, as shown above. I previously ran this on a 940MX and a batch size of 2 worked; in your case you could try 4.
>
> I tried raising batch_size to 6 by modifying a few files, and training runs very stably.
>
> May I ask which files you modified? I looked at the reference above and studied it for quite a while, but couldn't really figure it out.

You need to modify this part of ./synthesizer/hparams.py

@XiuChen-Liu
Contributor

OK, I'll give it a try. Thanks!

@XiuChen-Liu
Contributor

XiuChen-Liu commented Aug 30, 2021

> Yep, as shown above. I previously ran this on a 940MX and a batch size of 2 worked; in your case you could try 4.

When training the vocoder I also hit RuntimeError: CUDA out of memory. Tried to allocate 74.00 MiB (GPU 0; 6.00 GiB total capacity; 3.56 GiB already allocated; 14.88 MiB free; 3.74 GiB reserved in total by PyTorch). Can that likewise be fixed by changing the batch size?
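[Editor's note] The replies below only cover the synthesizer. For the vocoder, if this fork mirrors the upstream CorentinJ/Real-Time-Voice-Cloning layout, the batch size is a plain module-level constant rather than a schedule. The file path and the name `voc_batch_size` are assumptions based on the upstream repo and may differ in this fork:

```python
# vocoder/hparams.py (hypothetical excerpt, following the upstream
# CorentinJ/Real-Time-Voice-Cloning layout -- check your fork's actual file)
voc_batch_size = 100   # upstream default; try e.g. 32 or 16 on a 6 GB GPU
```

Lowering this value trades training throughput for a smaller peak memory footprint, the same trade-off as editing tts_schedule for the synthesizer.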

@utmcontent

> Yep, as shown above. I previously ran this on a 940MX and a batch size of 2 worked; in your case you could try 4.
>
> When training the vocoder I also hit RuntimeError: CUDA out of memory. Tried to allocate 74.00 MiB (GPU 0; 6.00 GiB total capacity; 3.56 GiB already allocated; 14.88 MiB free; 3.74 GiB reserved in total by PyTorch). Can that likewise be fixed by changing the batch size?

I changed the 12s in the code below to 2, and the error is gone for now.
The file is yourmainfolder\synthesizer\hparams.py:

        tts_schedule = [(2,  1e-3,  20_000,  2),   # Progressive training schedule
                        (2,  5e-4,  40_000,  2),   # (r, lr, step, batch_size)
                        (2,  2e-4,  80_000,  2),   #
                        (2,  1e-4, 160_000,  2),   # r = reduction factor (# of mel frames
                        (2,  3e-5, 320_000,  2),   #     synthesized for each decoder iteration)
                        (2,  1e-5, 640_000,  2)]
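[Editor's note] The same fix can be sketched programmatically instead of hand-editing every row: scale the batch_size column (the fourth element of each tuple, per the `(r, lr, step, batch_size)` comment above). The helper name `scale_batch_sizes` is illustrative, not part of the repo:

```python
# Scale the batch_size column of a Tacotron-style training schedule.
# Each entry is (r, lr, step, batch_size), as in synthesizer/hparams.py.

def scale_batch_sizes(schedule, factor, minimum=1):
    """Return a copy of the schedule with every batch_size divided by factor."""
    return [(r, lr, step, max(minimum, bs // factor))
            for (r, lr, step, bs) in schedule]

# The default schedule ships with batch_size 12; the thread suggests
# values around 2-6 for 4 GB cards.
default_schedule = [(2, 1e-3,  20_000, 12),
                    (2, 5e-4,  40_000, 12),
                    (2, 2e-4,  80_000, 12),
                    (2, 1e-4, 160_000, 12),
                    (2, 3e-5, 320_000, 12),
                    (2, 1e-5, 640_000, 12)]

low_mem_schedule = scale_batch_sizes(default_schedule, factor=6)
print(low_mem_schedule[0])  # (2, 0.001, 20000, 2)
```

Only the batch_size column changes; the reduction factor, learning rates, and step boundaries are left as shipped.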

@XiuChen-Liu
Contributor

Got it, thank you!

@babysor
Owner

babysor commented Oct 1, 2021

This has now been added to the Readme.

@babysor babysor closed this as completed Oct 1, 2021