Unable to convert caffe ResNet model #12
Comments
I have replicated the first error you mention, but it does not appear to come from the first Scale layer. Instead, as this line of Theano's error suggests:
it is in layer 'scale5a_branch1', which is close to the end of the network. If we look at the result of model.summary():
it seems that, for some reason, it is trying to reshape the 'gamma' parameter, which has shape (2048), to (1,1,1,7), when it should be reshaped to (1,2048,1,1) so that the corresponding operations can be applied to the input data. As for your tests with TensorFlow, I am not providing compatibility for it, so several unknown errors might arise. I will keep looking into this; if you have any other ideas or clues, let me know.
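To make the shape mismatch concrete, here is a minimal sketch in plain Python. It is not the converter's actual code. It assumes the feature map entering 'scale5a_branch1' has shape (batch, 2048, 7, 7) in channels-first ordering, and the helper names `broadcast_shape` and `can_reshape` are hypothetical. One way a target of (1,1,1,7) could arise is if the broadcast shape is built around the last axis instead of the channel axis, although the thread does not confirm that this is the cause.

```python
from math import prod


def broadcast_shape(input_shape, axis):
    """Return the shape a per-axis parameter vector needs so that it
    broadcasts against a tensor of shape `input_shape`."""
    ndim = len(input_shape)
    axis = axis % ndim              # accept negative axes such as -1
    shape = [1] * ndim
    shape[axis] = input_shape[axis]
    return tuple(shape)


def can_reshape(param_size, target_shape):
    """A reshape is only valid when the element counts match."""
    return prod(target_shape) == param_size


# Assumed input shape: (batch, channels, rows, cols); batch size is unknown.
input_shape = (None, 2048, 7, 7)
gamma_size = 2048                   # one gamma value per channel

good = broadcast_shape(input_shape, axis=1)    # channel axis
bad = broadcast_shape(input_shape, axis=-1)    # last axis (wrong here)

print(good, can_reshape(gamma_size, good))
print(bad, can_reshape(gamma_size, bad))
```

Under these assumptions, only the shape built around axis 1 has the 2048 elements that 'gamma' holds, so a reshape to the other shape has to fail.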
Any update on this front? I'm getting the same error.
Hey Marc! I understand. However, I'm not trying to convert a ResNet pretrained on ImageNet but one pretrained on Places2 (which we could then add to Keras). Any idea when you might have a look at this?
I am pretty busy right now, so I'm afraid I can't promise anything. That said, PRs are welcome if anybody is willing to help with this or any other functionality.
Moved this from #8 since it doesn't really fit under that topic:
I was able to convert ResNet50 using the current commit, after adding the data layer to the model text, but wasn't able to get inference working.
Assuming the model was accidentally getting recompiled, I removed the explicit compile step and got this error (the same one as when replacing mode=0 with mode=2 in the model files):
The only change to test_converted.py was replacing the 3 losses with one when compiling.
model.summary() looks reasonable around that area.
Using TensorFlow as the backend instead crashes when trying to load the weights.
I modified batch_set_value to pass a sanitized version of the name along, and the problem occurs on the first caffe batchnorm scaling layer:
So it seems related to the new Scale layer, which attempts to replicate Caffe's split of batch normalization into separate BatchNorm and Scale layers.
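For context, here is a rough sketch of what the BatchNorm and Scale pair computes for a single channel at inference time, as I understand Caffe's layers: BatchNorm stores an accumulated mean, an accumulated variance and a scale factor, and has no learned gamma/beta, while the following Scale layer applies the per-channel gamma and beta. The function names are hypothetical and this is plain Python for illustration, not the converter's Keras layer.

```python
import math


def caffe_batchnorm(values, mean_sum, var_sum, scale_factor, eps=1e-5):
    """Normalize one channel using Caffe-style stored statistics.

    The stored mean and variance are accumulated sums, so they are
    divided by the stored scale factor before use (0 is treated as 0).
    """
    factor = 0.0 if scale_factor == 0 else 1.0 / scale_factor
    mean = mean_sum * factor
    var = var_sum * factor
    return [(v - mean) / math.sqrt(var + eps) for v in values]


def caffe_scale(values, gamma, beta):
    """Apply the per-channel affine transform of the Scale layer."""
    return [gamma * v + beta for v in values]


# Made-up numbers for one channel, chosen only to keep the arithmetic simple.
channel = [1.0, 3.0]
normalized = caffe_batchnorm(channel, mean_sum=4.0, var_sum=2.0,
                             scale_factor=2.0, eps=0.0)
output = caffe_scale(normalized, gamma=2.0, beta=0.5)
print(normalized, output)
```

In a real network, gamma and beta are vectors with one entry per channel, which is why they need to be broadcast along the channel axis of the input.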