Floating point exception in solver.cpp #5976
Comments
Please attach the train.prototxt and solver.prototxt, as well as the input data you are using.
train.prototxt:
name: "Squeezenet_4"
...
}

solver.prototxt:
...

The training data is fine, because the network works in the TEST phase. It is the same issue as https://groups.google.com/forum/#!topic/caffe-users/9dmLlIeihnU
Does the problem persist when you change your dataset to some other file?
Yes, the problem persists when the dataset is changed, and also with type: "SGD". However, if I read the same dataset in TEST mode, it works.
The issue was caused by low storage space on disk.
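A quick way to rule this out before launching training is to check the free space on the partition that holds the dataset and the snapshot directory. The following is a minimal sketch using only the Python standard library; the 'data/' and 'snapshots/' paths are placeholders, not paths from this issue.

import os

def free_gib(path):
    """Return free space in GiB on the filesystem containing `path`."""
    st = os.statvfs(path)
    return st.f_bavail * st.f_frsize / float(1024 ** 3)

# 'data/' stands in for the directory holding the LMDB / image data,
# 'snapshots/' for the solver's snapshot_prefix directory.
for d in ('data/', 'snapshots/'):
    print('%s: %.2f GiB free' % (d, free_gib(d)))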
Floating point exception in Caffe.
I1012 21:53:53.901605 11666 net.cpp:761] Ignoring source layer argmax
I1012 21:53:53.902276 11666 solver.cpp:279] Solving Squeezenet_4
I1012 21:53:53.902292 11666 solver.cpp:280] Learning Rate Policy: step
I1012 21:53:54.131595 11666 solver.cpp:228] Iteration 0, loss = 0.0893308
I1012 21:53:54.166373 11666 solver.cpp:244] Train net output #0: label = 0
I1012 21:53:54.166404 11666 solver.cpp:244] Train net output #1: label = 0
I1012 21:53:54.166419 11666 solver.cpp:244] Train net output #2: label = 0
I1012 21:53:54.166430 11666 solver.cpp:244] Train net output #3: label = 0
I1012 21:53:54.166443 11666 solver.cpp:244] Train net output #4: label = 0
I1012 21:53:54.166523 11666 solver.cpp:244] Train net output #5: label_result = 0
I1012 21:53:54.166545 11666 solver.cpp:244] Train net output #6: label_result = 0
I1012 21:53:54.166559 11666 solver.cpp:244] Train net output #7: label_result = 0
I1012 21:53:54.166574 11666 solver.cpp:244] Train net output #8: label_result = 0
I1012 21:53:54.166586 11666 solver.cpp:244] Train net output #9: label_result = 0
I1012 21:53:54.166604 11666 solver.cpp:244] Train net output #10: loss = 0.0893308 (* 1 = 0.0893308 loss)
Floating point exception
Steps to reproduce
The issue appears when training the network and is reproducible even if the prototxt contains only the image data layer.
The issue is not present when the model is run in test mode (a sketch contrasting the two code paths follows below).
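For reference, the two code paths being compared can be exercised from pycaffe as sketched below. The file names are those from this report, and caffe.set_mode_gpu() is an assumption based on the CUDA 8.0 entry in the configuration. Stepping the solver (TRAIN phase) hits the floating point exception, while a forward pass over the same definition in TEST phase completes.

import caffe

caffe.set_mode_gpu()  # assumption: the report lists CUDA 8.0

# TRAIN phase: creating the solver and taking a single step reproduces the crash.
solver = caffe.SGDSolver('solver.prototxt')
solver.step(1)  # "Floating point exception" is raised during this step

# TEST phase: the same network definition runs fine with a plain forward pass.
net = caffe.Net('train.prototxt', caffe.TEST)
net.forward()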
Your system configuration
Operating system: Ubuntu 14.04
Compiler: g++ 4.8.4
CUDA version (if applicable): 8.0
CUDNN version (if applicable):
BLAS: atlas
Python or MATLAB version (for pycaffe and matcaffe respectively): python 2.7