
train crashed. #4

Closed
10183308 opened this issue Jul 18, 2017 · 1 comment

Comments

@10183308 commented Jul 18, 2017

When I train the model, it crashes with an out-of-memory error:

F0718 15:20:34.923629 16833 syncedmem.cpp:56] Check failed: error == cudaSuccess (2 vs. 0) out of memory
*** Check failure stack trace: ***
./train_voc_reduced.sh: line 7: 16833 Aborted (core dumped) python tools/train_net.py --gpu 0 --solver models/pascalvoc/VGG16-REDUCED/solver.prototxt --imdb voc_2007_trainval --weights data/ImageNet_models/VGG_ILSVRC_16_layers_fc_reduced.caffemodel --batchsize 64 --iters 4000

root$ nvidia-smi
Tue Jul 18 15:29:09 2017
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 375.66                 Driver Version: 375.66                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla K80           Off  | 0000:06:00.0     Off |                    0 |
| N/A   69C    P0    57W / 149W |      0MiB / 11439MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  Tesla K80           Off  | 0000:07:00.0     Off |                    0 |
| N/A   51C    P0    71W / 149W |      0MiB / 11439MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   2  Tesla K80           Off  | 0000:83:00.0     Off |                    0 |
| N/A   62C    P0    58W / 149W |      0MiB / 11439MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   3  Tesla K80           Off  | 0000:84:00.0     Off |                    0 |
| N/A   48C    P0    72W / 149W |      0MiB / 11439MiB |     99%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

@taokong (Owner) commented Jul 29, 2017

The batch size is too large for your GPU's memory; rerun with a smaller --batchsize.
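A common way to pick a workable value is to halve the batch size until a training attempt no longer runs out of memory. The sketch below is only an illustration of that idea, not part of this repo: `find_max_batchsize` and `try_train` are hypothetical names, and a real `try_train` would launch tools/train_net.py with the given --batchsize and report whether it hit a CUDA OOM.

```python
def find_max_batchsize(try_train, start=64, min_bs=1):
    """Halve the batch size until try_train(bs) succeeds.

    try_train(bs) should return True on success and False when
    training aborts with a CUDA out-of-memory error.
    """
    bs = start
    while bs >= min_bs:
        if try_train(bs):
            return bs
        bs //= 2  # halve and retry
    raise RuntimeError("even batch size %d runs out of memory" % min_bs)

# Toy stand-in: pretend any batch size <= 16 fits on this GPU.
print(find_max_batchsize(lambda bs: bs <= 16, start=64))  # 16
```

Starting from the failing value of 64, this would settle on 32 or 16 depending on how much memory the VGG16-reduced model actually needs on a K80.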
