caffe crashed with SIGABRT #1683

Open
Fengmoon93 opened this issue Jun 17, 2017 · 0 comments

I want to use DIGITS to train FCN-8s on the SYNTHIA dataset. I followed Greg Heinrich's comments on Google, and here is the caffemodel: http://dl.caffe.berkeleyvision.org/fcn8s-heavy-pascal.caffemodel
The model is 538 MB. When I select the model, I get this error: Check failed: fd != -1 (-1 vs. -1) File not found: /home/fengmoon/fcn8s.caffemodel
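The failing check is on the file descriptor obtained when opening the weights file, so a plain path or permission problem is worth ruling out first. Note that the download URL ends in fcn8s-heavy-pascal.caffemodel while the error mentions fcn8s.caffemodel. Below is a minimal Python sketch; `check_weights_file` is a hypothetical helper (not part of DIGITS or Caffe), and it assumes the script is run as the same user that runs the DIGITS server.

```python
import os

def check_weights_file(path):
    """Return a list of reasons why opening `path` for reading would fail.

    An empty list means no problem was found by these basic checks.
    """
    problems = []
    if not os.path.exists(path):
        problems.append("path does not exist")
    elif not os.path.isfile(path):
        problems.append("path is not a regular file")
    elif not os.access(path, os.R_OK):
        problems.append("file is not readable by the current user")
    return problems

if __name__ == "__main__":
    # Path copied from the error message above.
    print(check_weights_file("/home/fengmoon/fcn8s.caffemodel"))
```

An empty list only means this user can open the file; it says nothing about whether the contents parse as a valid caffemodel.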
When I tried to trace the problem, I found this:
## caffe crashed with SIGABRT in google::LogMessage::Fail()
I found that DIGITS runs /usr/bin/caffe train --solver --weights (why?). I also used DIGITS to train from the FCN-AlexNet caffemodel with my own data. The FCN-AlexNet caffemodel is much smaller, and it worked.
So I tried using the LMDB created by DIGITS and running ./caffe/tools/caffe train directly. But then the log shows "Restarting data prefetching from start" at the first iteration, and the loss never changes.
Here is the output log:

Learning Rate Policy: step
I0617 16:25:14.054409 17367 solver.cpp:330] Iteration 0, Testing net (#0)
### I0617 16:41:45.633934 17396 data_layer.cpp:73] Restarting data prefetching from start.
### I0617 16:41:45.646694 17395 data_layer.cpp:73] Restarting data prefetching from start.
I0617 16:41:46.954879 17367 solver.cpp:397] Test net output #0: accuracy = 0.0103052
I0617 16:41:46.954936 17367 solver.cpp:397] Test net output #1: loss = 2.48487 (* 1 = 2.48487 loss)
I0617 16:41:48.127647 17367 solver.cpp:218] Iteration 0 (-7.50073e-36 iter/s, 994.11s/1340 iters), loss = 2.48491
I0617 16:41:48.127719 17367 solver.cpp:237] Train net output #0: loss = 2.48491 (* 1 = 2.48491 loss)
I0617 16:41:48.127744 17367 sgd_solver.cpp:105] Iteration 0, lr = 0.0001
I0617 17:09:55.530222 17367 solver.cpp:218] Iteration 1340 (0.794095 iter/s, 1687.46s/1340 iters), loss = 2.48491
I0617 17:09:55.530390 17367 solver.cpp:237] Train net output #0: loss = 2.48491 (* 1 = 2.48491 loss)
I0617 17:09:55.530411 17367 sgd_solver.cpp:105] Iteration 1340, lr = 0.0001
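One observation on the numbers in this log: the loss of 2.48491 is very close to ln(12), which is what a softmax cross-entropy loss gives when every class receives the same probability (for example, when all scores are zero). The sketch below checks this arithmetic. It assumes the loss layer is a softmax cross-entropy, which the log does not state, and the helper names are hypothetical.

```python
import math

def uniform_softmax_loss(num_classes):
    """Cross-entropy loss when each of num_classes classes gets probability 1/num_classes."""
    return math.log(num_classes)

def matching_class_counts(loss, max_classes=100, tol=1e-4):
    """Class counts whose uniform-prediction loss is within tol of the given loss."""
    return [n for n in range(2, max_classes + 1)
            if abs(uniform_softmax_loss(n) - loss) < tol]

reported_loss = 2.48491  # training loss at iterations 0 and 1340 in the log above
print(matching_class_counts(reported_loss))
```

The only match is 12. If the network has 12 output classes, a loss stuck at exactly this value would suggest the outputs are not changing from a uniform prediction, rather than the training being merely slow.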

Here is the warning (when I use DIGITS directly, I get the error above instead):
[libprotobuf WARNING google/protobuf/io/coded_stream.cc:505] Reading dangerously large protocol message. If the message turns out to be larger than 2147483647 bytes, parsing will be halted for security reasons. To increase the limit (or to disable these warnings), see CodedInputStream::SetTotalBytesLimit() in google/protobuf/io/coded_stream.h.
[libprotobuf WARNING google/protobuf/io/coded_stream.cc:78] The total number of bytes read was 537963413.
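To put the two numbers in this warning side by side, here is a small Python sketch of the arithmetic. The hard limit and the byte count are taken from the warning text; the 512 MiB warning threshold is an assumption about what Caffe's loader passes to `SetTotalBytesLimit()` and should be checked against the Caffe source. `describe_message_size` is a hypothetical helper.

```python
PROTOBUF_HARD_LIMIT = 2147483647  # bytes, quoted in the warning above (2**31 - 1)
BYTES_READ = 537963413            # bytes, quoted in the second warning line
# Assumption (not stated in the log): warning threshold of 512 MiB.
ASSUMED_WARNING_THRESHOLD = 512 * 2**20

def describe_message_size(bytes_read, hard_limit=PROTOBUF_HARD_LIMIT):
    """Summarize how a protobuf message size compares with the parsing limit."""
    return {
        "megabytes": bytes_read / 1e6,
        "mebibytes": bytes_read / float(2**20),
        "fraction_of_limit": bytes_read / float(hard_limit),
        "would_be_halted": bytes_read > hard_limit,
        "above_assumed_warning_threshold": bytes_read > ASSUMED_WARNING_THRESHOLD,
    }

if __name__ == "__main__":
    for key, value in describe_message_size(BYTES_READ).items():
        print(key, value)
```

Under these numbers the 538 MB model is roughly a quarter of the limit at which parsing is halted, so the message reads as a warning only and would not by itself explain the crash. If the assumed threshold is right, it would also explain why a much smaller caffemodel produces no warning.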
I think this may be a bug.
DIGITS should support large caffemodels; if it can't, we can't use it to solve this problem at all.
How can I solve this?

Please help me or give me a hint if you know. This problem is driving me crazy. Thanks a lot.
