This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

Fine-tune the mxnet ssd get mismatchfrom.shape() error #6474

Closed
liumusicforever opened this issue May 27, 2017 · 15 comments

Comments

@liumusicforever

Hi,

  1. I created my own .rec file from a single-class dataset (people),
    modeled on the train.rec produced by prepare_dataset.py.
  2. I use this .rec to fine-tune the pre-trained model (vgg16_ssd_300_voc0712_trainval) with the following command:
    python train.py --gpus 0,1,2 --batch-size 100 --train-path ~/path/to/my/own/train.rec --val-path ~/path/to/my/own/val.rec --num-example 10000 --end-epoch 1000 --prefix=model/ssd --batch-size 32 --class-names people --num-class 1 --finetune 1

  3. Why do I get the following error?
mxnet.base.MXNetError: [11:58:35] src/ndarray/ndarray.cc:299: Check failed: from.shape() == to->shape() operands shape mismatchfrom.shape = (126,) to.shape=(12,)

Stack trace returned 10 entries:
[bt] (0) /usr/local/lib/python2.7/dist-packages/mxnet-0.10.0-py2.7.egg/mxnet/libmxnet.so(_ZN4dmlc15LogMessageFatalD1Ev+0x3c) [0x7f440199efbc]
[bt] (1) /usr/local/lib/python2.7/dist-packages/mxnet-0.10.0-py2.7.egg/mxnet/libmxnet.so(_ZN5mxnet10CopyFromToERKNS_7NDArrayEPS0_i+0x105) [0x7f4402474eb5]
[bt] (2) /usr/local/lib/python2.7/dist-packages/mxnet-0.10.0-py2.7.egg/mxnet/libmxnet.so(+0x115ac54) [0x7f44024d2c54]
[bt] (3) /usr/local/lib/python2.7/dist-packages/mxnet-0.10.0-py2.7.egg/mxnet/libmxnet.so(MXImperativeInvoke+0x2cd) [0x7f44023533fd]
[bt] (4) /usr/lib/x86_64-linux-gnu/libffi.so.6(ffi_call_unix64+0x4c) [0x7f440dfc6adc]
[bt] (5) /usr/lib/x86_64-linux-gnu/libffi.so.6(ffi_call+0x1fc) [0x7f440dfc640c]
[bt] (6) /usr/lib/python2.7/lib-dynload/_ctypes.x86_64-linux-gnu.so(_ctypes_callproc+0x48e) [0x7f440e1dd5fe]
[bt] (7) /usr/lib/python2.7/lib-dynload/_ctypes.x86_64-linux-gnu.so(+0x15f9e) [0x7f440e1def9e]
[bt] (8) python(PyEval_EvalFrameEx+0x98d) [0x5244dd]
[bt] (9) python(PyEval_EvalCodeEx+0x2b1) [0x555551]

  4. Regarding from.shape = (126,) and to.shape = (12,) above: I guess this means the old model has 20 classes plus one background, (20+1)*6=126, while my own data has only one class plus one background, (1+1)*6=12. But I already pass "--num-class 1" and "--finetune 1", so why does this error still appear?

  5. Please help me fix it. Thanks!
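
The arithmetic in point 4 can be checked with a short sketch. It assumes, as the poster does, that the mismatched parameter is a class-prediction bias with 6 anchors per feature-map location and one extra background class. The function name and the default anchor count are illustrative and not taken from the repository.

```python
def cls_pred_bias_size(num_classes, anchors_per_location=6):
    """Length of a class-prediction bias vector: one score per anchor
    for every object class plus the background class."""
    return (num_classes + 1) * anchors_per_location


pretrained_size = cls_pred_bias_size(20)  # 20 object classes in the pretrained model
finetune_size = cls_pred_bias_size(1)     # single "people" class

print(pretrained_size, finetune_size)     # 126 12
```

If this reading is right, the (126,) array comes from the pretrained checkpoint and the (12,) array from the newly built single-class network.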

@adrianloy

Were you able to solve this? I get the same error every time I try to run the demo with a model that I trained myself.

@hungpt297

hungpt297 commented Jul 13, 2017

@liumusicforever @adrianloy
Were you able to solve this?
I get the same error.
I saw that Release-v0.2-beta provides two models:
- Pretrained classification model: vgg16_reduced.zip
- Pretrained detection model: ssd_300_voc0712.zip
What is the difference between them?
How do I convert the pretrained detection model into a pretrained classification model?

Please help me. Many thanks.

@adrianloy

Yeah, I could. Check that CLASSES is set correctly both at training time and in demo.py. If you train with a single class, you need to set CLASS = [classname,]; otherwise len(CLASS) returns a wrong value.
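
One way a single-class setting can yield a wrong length is sketched below. It assumes the class names are written as a tuple or end up as a plain string, which is an assumption about the script rather than something stated in this thread. The variable names are illustrative.

```python
as_string = ('people')   # parentheses alone do not make a tuple: this is a str
as_tuple = ('people',)   # the trailing comma makes a one-element tuple
as_list = ['people']     # a list has one element with or without the comma

print(len(as_string))  # 6, the number of characters
print(len(as_tuple))   # 1
print(len(as_list))    # 1
```

With the string form, any code that derives the class count from `len(...)` would build a network for 6 classes instead of 1.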

@lijuan123

@adrianloy Hi, I have set num_class and class_names in train.py, but I still get the error. Are there other places where I need to change the settings? Thank you very much!

@adrianloy

When I worked with this project, I also had to adjust it in demo.py and when preparing the dataset. I do not know whether that is still the case; they changed some things in the last month and I am no longer up to date.

@liumusicforever
Author

liumusicforever commented Sep 18, 2017

I solved it by making sure that num_class is the same in the symbol built via symbol.py (imported with importlib) and in the model params file loaded with load_checkpoint.
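
A minimal sketch of such a check: compare, name by name, the parameter shapes the new symbol expects with the shapes stored in the checkpoint. Shapes are plain tuples here so the sketch runs without MXNet; in practice they might be read from the arrays returned by `mx.model.load_checkpoint` and from the symbol's shape inference. The helper name and the parameter names are hypothetical.

```python
def find_shape_mismatches(expected_shapes, loaded_shapes):
    """Return {name: (expected, loaded)} for every parameter that appears
    in both dicts with different shapes."""
    mismatches = {}
    for name, expected in expected_shapes.items():
        loaded = loaded_shapes.get(name)
        if loaded is not None and tuple(loaded) != tuple(expected):
            mismatches[name] = (tuple(expected), tuple(loaded))
    return mismatches


# Hypothetical parameter names; the sizes mirror the error in this thread.
expected = {"cls_pred_bias": (12,), "conv1_weight": (64, 3, 3, 3)}
loaded = {"cls_pred_bias": (126,), "conv1_weight": (64, 3, 3, 3)}

print(find_shape_mismatches(expected, loaded))
# {'cls_pred_bias': ((12,), (126,))}
```

Any name reported here points at a layer whose class count differs between the symbol and the checkpoint.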

@lijuan123

@liumusicforever Sorry, could you tell me more about it? I don't quite understand what you mean. The symbol I use is resnet50, and I load the pretrained model from epoch 0. Thank you again.

@liumusicforever
Author

Is the number of classes in your pretrained model the same as the number of classes in your symbol?

@liumusicforever
Author

Sorry, I made a mistake: above I meant the number of classes, not the shape of the data.

@lijuan123

@liumusicforever thank you!

@szha
Member

szha commented Dec 22, 2017

@apache/mxnet-committers: This issue has been inactive for the past 90 days. It has no label and needs triage.

For general "how-to" questions, our user forum (and Chinese version) is a good place to get help.

@lanking520
Member

Hi @liumusicforever , are you still facing this issue?

@liumusicforever
Author

@lanking520 I solved it by checking that the dimensions of the pretrained model, the symbol, and the data agree.
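
When such a dimension check finds a disagreement, one possible workaround (not confirmed in this thread) is to load only the pretrained parameters whose shapes match the new symbol and let the remaining layers be initialized fresh. The sketch below only decides which names to keep and which to drop; shapes are plain tuples, and the helper and parameter names are hypothetical.

```python
def split_by_compatibility(expected_shapes, pretrained_shapes):
    """Partition pretrained parameter names into those whose shape matches
    what the new symbol expects (keep) and all others (drop)."""
    keep, drop = [], []
    for name, shape in pretrained_shapes.items():
        if expected_shapes.get(name) == shape:
            keep.append(name)
        else:
            drop.append(name)
    return sorted(keep), sorted(drop)


# Hypothetical parameter names; the sizes mirror the error in this thread.
expected = {"cls_pred_bias": (12,), "conv1_weight": (64, 3, 3, 3)}
pretrained = {"cls_pred_bias": (126,), "conv1_weight": (64, 3, 3, 3)}

keep, drop = split_by_compatibility(expected, pretrained)
print(keep)  # ['conv1_weight']
print(drop)  # ['cls_pred_bias']
```

Dropping the class-prediction parameters means those layers start untrained, which is usually acceptable when the set of classes changes anyway.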

@lanking520
Member

Thank you @liumusicforever. Is it safe to close this issue?

@liumusicforever
Author

yes !


8 participants