Running out of memory in my environment #21
Reduce the parameters in config_small.json until the model fits; it is still too large.
Setting batch_size to 4 (it was 8) gets past the error. Will batch_size affect the output? My environment's memory usage is now as follows...
It has a slight effect, but given your hardware, being able to run at all is already good; don't expect too much.
Understood, thanks for the guidance!
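If halving batch_size changes results more than is acceptable, gradient accumulation can recover the effective batch size of 8 without the memory cost of holding 8 samples at once. This is a minimal sketch with a placeholder model and loss, not the repo's actual train.py loop:

```python
import torch
import torch.nn as nn

# Placeholder model and optimizer, standing in for the GPT-2 model in train.py.
model = nn.Linear(10, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
accumulation_steps = 2  # two micro-batches of 4 approximate one batch of 8

optimizer.zero_grad()
for step in range(accumulation_steps):
    x = torch.randn(4, 10)           # micro-batch of 4 fits in memory
    loss = model(x).pow(2).mean()    # placeholder loss
    # Scale the loss so accumulated gradients average over the full batch.
    (loss / accumulation_steps).backward()
optimizer.step()  # one weight update per accumulated batch
```

Only one micro-batch's activations live on the GPU at a time, so peak memory matches batch_size 4 while the gradient statistics approximate batch_size 8.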
Using small.json as the config and running on 4 GPUs with 8 GB each, I still run out of memory:
File "/home/ubuntu/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/pytorch_transformers-1.0.0-py3.6.egg/pytorch_transformers/modeling_gpt2.py", line 100, in gelu
return 0.5 * x * (1 + torch.tanh(math.sqrt(2 / math.pi) * (x + 0.044715 * torch.pow(x, 3))))
RuntimeError: CUDA out of memory. Tried to allocate 24.00 MiB (GPU 0; 7.44 GiB total capacity; 6.88 GiB already allocated; 21.50 MiB free; 128.30 MiB cached)
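The line in the traceback is the tanh approximation of GELU from modeling_gpt2.py. As a side note, the approximation tracks the exact erf-based GELU closely; this sketch (pure Python, no GPU needed) checks the maximum deviation over a small range:

```python
import math

def gelu_tanh(x):
    # tanh approximation, as in pytorch_transformers modeling_gpt2.py
    return 0.5 * x * (1 + math.tanh(math.sqrt(2 / math.pi) * (x + 0.044715 * x ** 3)))

def gelu_exact(x):
    # exact GELU: x * Phi(x), via the error function
    return 0.5 * x * (1 + math.erf(x / math.sqrt(2.0)))

# Maximum absolute deviation on a grid over [-3, 3].
grid = [i / 50 - 3 for i in range(301)]
max_err = max(abs(gelu_tanh(v) - gelu_exact(v)) for v in grid)
print(max_err)
```

The deviation is tiny, so the OOM here is unrelated to the activation function; it is simply where allocation happened to fail.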
I ran nvidia-smi:
(pytorch_p36) ubuntu@ip-172-31-38-29:~/GPT2$ nvidia-smi
Wed Aug 14 06:03:59 2019
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 418.67 Driver Version: 418.67 CUDA Version: 10.1 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 Tesla M60 On | 00000000:00:1B.0 Off | 0 |
| N/A 34C P8 23W / 150W | 0MiB / 7618MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 1 Tesla M60 On | 00000000:00:1C.0 Off | 0 |
| N/A 39C P8 22W / 150W | 0MiB / 7618MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 2 Tesla M60 On | 00000000:00:1D.0 Off | 0 |
| N/A 36C P8 22W / 150W | 0MiB / 7618MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 3 Tesla M60 On | 00000000:00:1E.0 Off | 0 |
| N/A 39C P8 22W / 150W | 0MiB / 7618MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
This confirms each GPU has 8 GB of VRAM, so why does it look as though only one GPU's 8 GB is being used?
The command I ran is as follows...
python3 train.py --raw --device="0,1,2,3"
What could be causing this?
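One common cause, assuming the script uses torch.nn.DataParallel (I have not verified train.py's internals): DataParallel splits each batch across the visible devices but gathers outputs and computes the loss on device 0, so GPU 0 needs extra headroom and is the first to OOM even with 4 GPUs available. A minimal sketch (falls back to CPU when no GPUs are present):

```python
import torch
import torch.nn as nn

# Placeholder model; with DataParallel a batch of 8 is split into
# chunks of 2 per GPU, but outputs are gathered back on device 0.
model = nn.Linear(16, 4)
if torch.cuda.device_count() > 1:
    model = nn.DataParallel(model).cuda()

x = torch.randn(8, 16)
if next(model.parameters()).is_cuda:
    x = x.cuda()
out = model(x)
print(out.shape)  # torch.Size([8, 4])
```

If that matches the script's behavior, the per-GPU memory limit is still 8 GB minus device 0's gathering overhead, which is why shrinking batch_size (or the config) remains necessary even on 4 GPUs.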