
GPU memory (VRAM) runs out when training the model #27

Closed
cronfox opened this issue Aug 21, 2021 · 11 comments

@cronfox

cronfox commented Aug 21, 2021

Variable._execution_engine.run_backward(
RuntimeError: CUDA out of memory. Tried to allocate 88.00 MiB (GPU 0; 4.00 GiB total capacity; 2.68 GiB already allocated; 0 bytes free; 2.85 GiB reserved in total by PyTorch)

Could you provide a parameter for adjusting batch_size? My GPU only has 4 GB of VRAM (GTX 1050 Ti), and training with the default parameters frequently runs out of memory.

@wangkewk

CorentinJ/Real-Time-Voice-Cloning#664

Have a look at this one.


@babysor
Owner

babysor commented Aug 21, 2021

Yep, as shown above. I previously ran this on a 940MX and a batch size of 2 worked; in your case you could try 4.

@cronfox
Author

cronfox commented Aug 21, 2021

> Yep, as shown above. I previously ran this on a 940MX and a batch size of 2 worked; in your case you could try 4.

I tried raising batch_size to 6 by modifying a few files, and training runs very stably.

@XiuChen-Liu
Contributor

> Yep, as shown above. I previously ran this on a 940MX and a batch size of 2 worked; in your case you could try 4.
>
> I tried raising batch_size to 6 by modifying a few files, and training runs very stably.

May I ask which files you modified? I looked at the reference above and studied it for quite a while, but couldn't really figure it out.

@cronfox
Author

cronfox commented Aug 23, 2021

> Yep, as shown above. I previously ran this on a 940MX and a batch size of 2 worked; in your case you could try 4.
>
> I tried raising batch_size to 6 by modifying a few files, and training runs very stably.
>
> May I ask which files you modified? I looked at the reference above and studied it for quite a while, but couldn't really figure it out.

You need to modify this part of ./synthesizer/hparams.py

@XiuChen-Liu
Contributor

OK, I'll give it a try. Thanks!

@XiuChen-Liu
Contributor

XiuChen-Liu commented Aug 30, 2021

> Yep, as shown above. I previously ran this on a 940MX and a batch size of 2 worked; in your case you could try 4.

When training the vocoder I also hit RuntimeError: CUDA out of memory. Tried to allocate 74.00 MiB (GPU 0; 6.00 GiB total capacity; 3.56 GiB already allocated; 14.88 MiB free; 3.74 GiB reserved in total by PyTorch). Can that likewise be fixed by changing the batch size?
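[Editor's note] The replies below only cover the synthesizer. For the vocoder, if this fork mirrors the upstream CorentinJ/Real-Time-Voice-Cloning layout, the batch size is a plain module-level constant rather than a schedule. The file path and the name `voc_batch_size` are assumptions based on the upstream repo and may differ in this fork:

```python
# vocoder/hparams.py (hypothetical excerpt, following the upstream
# CorentinJ/Real-Time-Voice-Cloning layout -- check your fork's actual file)
voc_batch_size = 100   # upstream default; try e.g. 32 or 16 on a 6 GB GPU
```

Lowering this value trades training throughput for a smaller peak memory footprint, the same trade-off as editing tts_schedule for the synthesizer.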

@utmcontent

> Yep, as shown above. I previously ran this on a 940MX and a batch size of 2 worked; in your case you could try 4.
>
> When training the vocoder I also hit RuntimeError: CUDA out of memory. Tried to allocate 74.00 MiB (GPU 0; 6.00 GiB total capacity; 3.56 GiB already allocated; 14.88 MiB free; 3.74 GiB reserved in total by PyTorch). Can that likewise be fixed by changing the batch size?

I changed the 12s in the code below to 2, and the error is gone for now.
The file is yourmainfolder\synthesizer\hparams.py:

        tts_schedule = [(2,  1e-3,  20_000,  2),   # Progressive training schedule
                        (2,  5e-4,  40_000,  2),   # (r, lr, step, batch_size)
                        (2,  2e-4,  80_000,  2),   #
                        (2,  1e-4, 160_000,  2),   # r = reduction factor (# of mel frames
                        (2,  3e-5, 320_000,  2),   #     synthesized for each decoder iteration)
                        (2,  1e-5, 640_000,  2)]
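[Editor's note] The same fix can be sketched programmatically instead of hand-editing every row: scale the batch_size column (the fourth element of each tuple, per the `(r, lr, step, batch_size)` comment above). The helper name `scale_batch_sizes` is illustrative, not part of the repo:

```python
# Scale the batch_size column of a Tacotron-style training schedule.
# Each entry is (r, lr, step, batch_size), as in synthesizer/hparams.py.

def scale_batch_sizes(schedule, factor, minimum=1):
    """Return a copy of the schedule with every batch_size divided by factor."""
    return [(r, lr, step, max(minimum, bs // factor))
            for (r, lr, step, bs) in schedule]

# The default schedule ships with batch_size 12; the thread suggests
# values around 2-6 for 4 GB cards.
default_schedule = [(2, 1e-3,  20_000, 12),
                    (2, 5e-4,  40_000, 12),
                    (2, 2e-4,  80_000, 12),
                    (2, 1e-4, 160_000, 12),
                    (2, 3e-5, 320_000, 12),
                    (2, 1e-5, 640_000, 12)]

low_mem_schedule = scale_batch_sizes(default_schedule, factor=6)
print(low_mem_schedule[0])  # (2, 0.001, 20000, 2)
```

Only the batch_size column changes; the reduction factor, learning rates, and step boundaries are left as shipped.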

@XiuChen-Liu
Contributor

Got it, thank you!

@babysor
Owner

babysor commented Oct 1, 2021

This has now been added to the Readme.

@babysor babysor closed this as completed Oct 1, 2021