This repository has been archived by the owner on Feb 7, 2023. It is now read-only.

Loading Pretrained Models tutorial fails when running prediction #1559

Closed
ryanrolds opened this issue Dec 2, 2017 · 14 comments
Comments

@ryanrolds

ryanrolds commented Dec 2, 2017

The tutorial notebook caffe2/caffe2/blob/master/caffe2/python/tutorials/Loading_Pretrained_Models.ipynb fails with:

RuntimeError                              Traceback (most recent call last)
<ipython-input-4-2e7dda059c39> in <module>()
      9 
     10 # run the net and return prediction
---> 11 results = p.run([img])
     12 
     13 # turn it into something we can play with and examine which is in a multi-dimensional array

RuntimeError: [enforce fail at tensor.h:671] i < dims_.size(). 0 vs 0. Exceeding ndim limit Error from operator: 
input: "data" input: "conv1_w" input: "conv1_b" output: "conv1" type: "Conv" arg { name: "stride" i: 2 } arg { name: "pad" i: 0 } arg { name: "kernel" i: 3 } device_option { } engine: ""
** while accessing input: data
$ ls -l /usr/local/caffe2/python/models/squeezenet/
total 6048
-rw-r--r-- 1 root root 6180983 Dec  2 20:00 init_net.pb
-rw-r--r-- 1 root root    6385 Dec  2 20:00 predict_net.pb
@pietern
Contributor

pietern commented Dec 20, 2017

Thanks for reporting. Trying to repro.

@pietern
Contributor

pietern commented Dec 21, 2017

I have a segv repro, now figuring out what's causing it. I'll document my steps here:

For starters, these are the blobs + dimensions after loading the init_net:

workspace.CreateNet(init_net, True)
for b in workspace.Blobs():
    x = workspace.FetchBlob(b)
    print("{}: {}".format(b, x.shape))
conv10_b: (1000,)
conv10_w: (1000, 512, 1, 1)
conv1_b: (64,)
conv1_w: (64, 3, 3, 3)
fire2/expand1x1_b: (64,)
fire2/expand1x1_w: (64, 16, 1, 1)
fire2/expand3x3_b: (64,)
fire2/expand3x3_w: (64, 16, 3, 3)
fire2/squeeze1x1_b: (16,)
fire2/squeeze1x1_w: (16, 64, 1, 1)
fire3/expand1x1_b: (64,)
fire3/expand1x1_w: (64, 16, 1, 1)
fire3/expand3x3_b: (64,)
fire3/expand3x3_w: (64, 16, 3, 3)
fire3/squeeze1x1_b: (16,)
fire3/squeeze1x1_w: (16, 128, 1, 1)
fire4/expand1x1_b: (128,)
fire4/expand1x1_w: (128, 32, 1, 1)
fire4/expand3x3_b: (128,)
fire4/expand3x3_w: (128, 32, 3, 3)
fire4/squeeze1x1_b: (32,)
fire4/squeeze1x1_w: (32, 128, 1, 1)
fire5/expand1x1_b: (128,)
fire5/expand1x1_w: (128, 32, 1, 1)
fire5/expand3x3_b: (128,)
fire5/expand3x3_w: (128, 32, 3, 3)
fire5/squeeze1x1_b: (32,)
fire5/squeeze1x1_w: (32, 256, 1, 1)
fire6/expand1x1_b: (192,)
fire6/expand1x1_w: (192, 48, 1, 1)
fire6/expand3x3_b: (192,)
fire6/expand3x3_w: (192, 48, 3, 3)
fire6/squeeze1x1_b: (48,)
fire6/squeeze1x1_w: (48, 256, 1, 1)
fire7/expand1x1_b: (192,)
fire7/expand1x1_w: (192, 48, 1, 1)
fire7/expand3x3_b: (192,)
fire7/expand3x3_w: (192, 48, 3, 3)
fire7/squeeze1x1_b: (48,)
fire7/squeeze1x1_w: (48, 384, 1, 1)
fire8/expand1x1_b: (256,)
fire8/expand1x1_w: (256, 64, 1, 1)
fire8/expand3x3_b: (256,)
fire8/expand3x3_w: (256, 64, 3, 3)
fire8/squeeze1x1_b: (64,)
fire8/squeeze1x1_w: (64, 384, 1, 1)
fire9/expand1x1_b: (256,)
fire9/expand1x1_w: (256, 64, 1, 1)
fire9/expand3x3_b: (256,)
fire9/expand3x3_w: (256, 64, 3, 3)
fire9/squeeze1x1_b: (64,)
fire9/squeeze1x1_w: (64, 512, 1, 1)
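
One way to read the listing above: the failing Conv operator takes "data", "conv1_w" and "conv1_b", and only the last two appear among the initialized blobs. The sketch below does that comparison in plain Python. The helper name missing_inputs is hypothetical, and the blob list is a shortened copy of the output above rather than a call into Caffe2.

```python
def missing_inputs(op_inputs, initialized_blobs):
    """Return the operator inputs that have no initialized blob, in order."""
    known = set(initialized_blobs)
    return [name for name in op_inputs if name not in known]


# Inputs of the failing Conv operator, copied from the error message.
conv1_inputs = ["data", "conv1_w", "conv1_b"]

# Shortened copy of the blob listing printed after loading init_net.
blobs_after_init = ["conv10_b", "conv10_w", "conv1_b", "conv1_w"]

print(missing_inputs(conv1_inputs, blobs_after_init))  # ['data']
```

This only shows that "data" has no shape at that point. It does not by itself say why the predictor failed to fill it.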

@pietern
Contributor

pietern commented Dec 21, 2017

The segv happens here, where dims_ is an empty vector.

https://github.com/caffe2/caffe2/blob/21dc87bac01a60b695e6f5f81dfa31b13713f110/caffe2/core/tensor.h#L674
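
To make the "0 vs 0" in the error message concrete, here is a toy Python stand-in for the guard i < dims_.size(). It is not the C++ code at the link above. The class name TensorSketch and the image shape are assumptions made for illustration only.

```python
class TensorSketch:
    """Toy stand-in for a tensor that only tracks its dimensions."""

    def __init__(self, dims=()):
        self.dims = list(dims)

    def dim(self, i):
        # Same idea as the guard `i < dims_.size()` quoted in the error text.
        if not i < len(self.dims):
            raise RuntimeError(
                "{} vs {}. Exceeding ndim limit".format(i, len(self.dims))
            )
        return self.dims[i]


empty = TensorSketch()                  # never given a shape, like "data" here
image = TensorSketch((1, 3, 227, 227))  # hypothetical NCHW image shape

print(image.dim(0))  # 1
try:
    empty.dim(0)
except RuntimeError as err:
    print(err)  # 0 vs 0. Exceeding ndim limit
```

Asking for dimension 0 of a tensor with zero dimensions is what produces "0 vs 0": index 0 compared against a size of 0.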

@pietern
Contributor

pietern commented Dec 21, 2017

(@ryanrolds I think you must have run with NDEBUG set to trigger the proper enforce guard)

@ryanrolds
Author

Is that the default behavior? I'm encountering this issue with a build from master following directions on the site.

@pietern
Contributor

pietern commented Dec 21, 2017

Yep, it's all good. That's just why I was seeing a hard segv instead of the enforce firing.

@nafest

nafest commented Jan 26, 2018

Any progress on this issue?

@danfouer

RuntimeError Traceback (most recent call last)
in ()
      9 
     10 # run the net and return prediction
---> 11 results = p.run([img])
     12 
     13 # turn it into something we can play with and examine which is in a multi-dimensional array

RuntimeError: [enforce fail at tensor.h:675] i < dims_.size(). 0 vs 0. Exceeding ndim limit Error from operator:
input: "data" input: "conv1_w" input: "conv1_b" output: "conv1" type: "Conv" arg { name: "stride" i: 2 } arg { name: "pad" i: 0 } arg { name: "kernel" i: 3 } device_option { } engine: ""
** while accessing input: data

@danfouer

@pietern have you solved this problem?

@orionr
Contributor

orionr commented Feb 1, 2018

Hi @danfouer and @ryanrolds - it looks like the SqueezeNet model in Amazon S3 (where the model downloader gets files from) was corrupt. I've just replaced it with the one from GitHub. Can you confirm that running

python -m caffe2.python.models.download -i squeezenet

again and replacing the file allows you to run the tutorial? Thanks for your patience!
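
If you want to check whether a local copy differs from a known-good one before re-running the tutorial, comparing file hashes is one option. This is only a sketch: the thread does not publish checksums for the model files, so the demo below hashes throwaway files, and the helper name sha256_of and the file names are made up for illustration.

```python
import hashlib
import os
import tempfile


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large model files need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Demo with throwaway files standing in for a good copy, an identical
# re-download, and a truncated download.
with tempfile.TemporaryDirectory() as tmp:
    good = os.path.join(tmp, "good.pb")
    copy = os.path.join(tmp, "copy.pb")
    truncated = os.path.join(tmp, "truncated.pb")
    payload = b"example model bytes" * 1000
    for path, data in [(good, payload), (copy, payload), (truncated, payload[:-1])]:
        with open(path, "wb") as f:
            f.write(data)

    copy_matches = sha256_of(copy) == sha256_of(good)
    truncated_matches = sha256_of(truncated) == sha256_of(good)

print(copy_matches, truncated_matches)  # True False
```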

@orionr orionr self-assigned this Feb 1, 2018
@ryanrolds
Author

Awesome. I will confirm ASAP, it may be this weekend. Thank you.

@houseroad
Contributor

Changing this line "results = p.run([img])" to "results = p.run({'data': img})" can solve the problem for the old model.
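
The thread does not spell out why the dict form helps. One possible explanation is that a list is matched to input names by position, while a dict names the target blob directly. The sketch below illustrates that difference in plain Python. The function bind_inputs and the input order are hypothetical and are not taken from Caffe2 or from the old model file.

```python
def bind_inputs(input_names, inputs):
    """Pair inputs with blob names.

    A list is bound by position to `input_names`; a dict is bound by key.
    """
    if isinstance(inputs, dict):
        return dict(inputs)
    return dict(zip(input_names, inputs))


# Hypothetical input order in which "data" does not come first.
names = ["conv1_w", "conv1_b", "data"]
img = "image tensor placeholder"

by_position = bind_inputs(names, [img])
by_name = bind_inputs(names, {"data": img})

print("data" in by_position)  # False: the image landed on "conv1_w"
print("data" in by_name)      # True: the image is bound to "data"
```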

@ryanrolds
Author

Confirming that the new model resolved the issue. Thanks.

@jjoss

jjoss commented Oct 31, 2018

Hi! I have the latest mobilenet_v2 and the error persists:

results = p.run({'data': img})
RuntimeError: [enforce fail at fully_connected_op.h:68] K == W.size() / N. Dimension mismatch: X: [1, 1280, 2, 2], W: [1000, 1280], b: [1000], axis: 1, M: 1, N: 1000, K: 5120 Error from operator:
input: "final_avg" input: "pred_w" input: "pred_b" output: "pred" name: "" type: "FC" arg { name: "order" s: "NCHW" } arg { name: "ws_nbytes_limit" i: 268435456 } device_option { } engine: ""
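
The numbers in this second error can be reproduced by hand. The sketch below recomputes M, N and K from the shapes in the message, assuming the FC operator flattens X at the given axis as the message suggests. The helper prod is defined locally and nothing here calls Caffe2.

```python
from functools import reduce
from operator import mul


def prod(dims):
    """Product of a sequence of dimensions (1 for an empty sequence)."""
    return reduce(mul, dims, 1)


# Shapes copied from the error message.
x_shape = (1, 1280, 2, 2)
w_shape = (1000, 1280)
axis = 1

M = prod(x_shape[:axis])          # rows after flattening X at `axis`
K = prod(x_shape[axis:])          # columns after flattening X at `axis`
N = w_shape[0]                    # number of outputs
expected_K = prod(w_shape) // N   # W.size() / N

print(M, N, K, expected_K)  # 1 1000 5120 1280

# With a 1x1 spatial size the check K == W.size() / N would pass.
print(prod((1, 1280, 1, 1)[axis:]) == expected_K)  # True
```

So the weights expect 1280 features per image, but "final_avg" carries 1280 * 2 * 2 = 5120. Why the spatial size is 2x2 rather than 1x1 is not established in this thread. One possibility is an input image whose height and width differ from what the model expects.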


7 participants