DetectNet training fails with error code -9 #1362

Open
SimonBirrell opened this issue Dec 26, 2016 · 2 comments

Comments

@SimonBirrell

Hi,

I've been following the object detection tutorial with one difference: I've compiled NVCaffe to be CPU only.

After a few hours of training, I get the following:

ERROR: error code -9

Ignoring source layer loss2/classifier
Ignoring source layer loss2/loss
Ignoring source layer pool4/3x3_s2
Ignoring source layer pool4/3x3_s2_pool4/3x3_s2_0_split
Ignoring source layer pool5/7x7_s1
Ignoring source layer pool5/drop_7x7_s1
Ignoring source layer loss3/classifier
Ignoring source layer loss3/loss3
Starting Optimization
Solving
Learning Rate Policy: step
Iteration 0, Testing net (#0)
Ignoring source layer train_data
Ignoring source layer train_label
Ignoring source layer train_transform
Test net output #0: loss_bbox = 18.4677 (* 2 = 36.9353 loss)
Test net output #1: loss_coverage = 329.823 (* 1 = 329.823 loss)
Test net output #2: mAP = 0
Test net output #3: precision = 0
Test net output #4: recall = 0

Any ideas? I can't find "-9" in the source code of either Caffe or DIGITS.

Thanks!

Simon

@gheinrich
Contributor

Hi @SimonBirrell, a negative error code means the process was terminated by the Linux signal with that number. Here, -9 is SIGKILL: your process was killed. A common cause is that it allocated too much memory. I have never tried to train DetectNet on CPU...
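
To illustrate the point about negative codes: Python's `subprocess` module reports a child terminated by signal N as a return code of -N on POSIX systems. The sketch below assumes the "error code" in the log is such a return code; `describe_returncode` is a hypothetical helper written for this example, not part of DIGITS or Caffe.

```python
import signal


def describe_returncode(returncode: int) -> str:
    """Translate a subprocess-style return code into a readable description.

    Hypothetical helper for illustration; assumes POSIX semantics, where a
    negative return code -N means the child was terminated by signal N.
    """
    if returncode < 0:
        signum = -returncode
        try:
            # signal.Signals maps a signal number to its symbolic name.
            name = signal.Signals(signum).name
        except ValueError:
            # The number is not a signal known on this platform.
            name = f"unknown signal {signum}"
        return f"killed by {name}"
    if returncode == 0:
        return "exited normally"
    return f"exited with status {returncode}"


if __name__ == "__main__":
    print(describe_returncode(-9))
```

On Linux this maps -9 to SIGKILL, which is why the value does not appear as a literal anywhere in the application source.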

@SimonBirrell
Author

Thanks for that! I'm trying to run the training again with more physical memory.

Training without a GPU is obviously not ideal - I'm travelling away from the various NVIDIA workstations I have access to, and wanted to give DIGITS and object detection a try. So far, 2 hours on 1 CPU with 4 GB of RAM on a virtual machine haven't reached 1% of training, so we'll see if this is feasible at all. I'll report back!
