Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Group convolution in Keras] ResNeXt mxnet -> IR -> keras #58

Closed
kamikawa opened this Issue Jan 18, 2018 · 14 comments

Comments

Projects
None yet
3 participants
@kamikawa
Copy link

kamikawa commented Jan 18, 2018

Hi
Thank you for a great covert tool.

I am trying to convert from mxnet resnext to keras.
symbol file: http://data.mxnet.io/models/imagenet/resnext/101-layers/resnext-101-64x4d-symbol.json
param file: http://data.mxnet.io/models/imagenet/resnext/101-layers/resnext-101-64x4d-0000.params

I could convert from mxnet to IR with no error,

python -m mmdnn.conversion._script.convertToIR -f mxnet -n resnext-101-64x4d-symbol.json -w resnext-101-64x4d-0000.params -d resnext-101-64x4d --inputShape 3 224 224

but failed to convert from IR to keras with an error below.
Would you support this model?

Regards,


python -m mmdnn.conversion._script.IRToCode -f keras --IRModelPath resnext-101-64x4d.pb --dstModelPath keras_resnext-101-64x4d.py

Parse file [resnext-101-64x4d.pb] with binary format successfully.
Traceback (most recent call last):
File "C:\Anaconda3\lib\runpy.py", line 193, in _run_module_as_main
"main", mod_spec)
File "C:\Anaconda3\lib\runpy.py", line 85, in _run_code
exec(code, run_globals)
File "C:\Anaconda3\lib\site-packages\mmdnn\conversion_script\IRToCode.py", line 120, in
_main()
File "C:\Anaconda3\lib\site-packages\mmdnn\conversion_script\IRToCode.py", line 115, in _main
ret = _convert(args)
File "C:\Anaconda3\lib\site-packages\mmdnn\conversion_script\IRToCode.py", line 56, in _convert
emitter.run(args.dstModelPath, args.dstWeightPath, args.phase)
File "C:\Anaconda3\lib\site-packages\mmdnn\conversion\common\DataStructure\emitter.py", line 21, in run
self.save_code(dstNetworkPath, phase)
File "C:\Anaconda3\lib\site-packages\mmdnn\conversion\common\DataStructure\emitter.py", line 53, in save_code
code = self.gen_code(phase)
File "C:\Anaconda3\lib\site-packages\mmdnn\conversion\keras\keras2_emitter.py", line 95, in gen_code
func(current_node)
File "C:\Anaconda3\lib\site-packages\mmdnn\conversion\keras\keras2_emitter.py", line 194, in emit_Conv
return self._emit_convolution(IR_node, 'layers.Conv{}D'.format(dim))
File "C:\Anaconda3\lib\site-packages\mmdnn\conversion\keras\keras2_emitter.py", line 179, in _emit_convolution
input_node, padding = self._defuse_padding(IR_node)
File "C:\Anaconda3\lib\site-packages\mmdnn\conversion\keras\keras2_emitter.py", line 160, in _defuse_padding
padding = self._convert_padding(padding)
File "C:\Anaconda3\lib\site-packages\mmdnn\conversion\keras\keras2_emitter.py", line 139, in _convert_padding
padding = convert_onnx_pad_to_tf(padding)[1:-1]
File "C:\Anaconda3\lib\site-packages\mmdnn\conversion\common\utils.py", line 62, in convert_onnx_pad_to_tf
return np.transpose(np.array(pads).reshape([2, -1])).reshape(-1, 2).tolist()

ValueError: cannot reshape array of size 1 into shape (2,newaxis)

@kitstar

This comment has been minimized.

Copy link
Contributor

kitstar commented Jan 19, 2018

Hi @kamikawa, fixed the issue you met. But there is no offical "group conv" support in keras. Any help to fill this group_conv part like this pull request are welcome.

@kitstar kitstar changed the title ResNeXt mxnet -> IR -> keras [Group convolution in Keras] ResNeXt mxnet -> IR -> keras Jan 19, 2018

@kamikawa

This comment has been minimized.

Copy link
Author

kamikawa commented Jan 19, 2018

Thank you for an answer, I understand the situation.
For official group conv support in keras, backend support may also be needed...

@kitstar

This comment has been minimized.

Copy link
Contributor

kitstar commented Jan 20, 2018

We can split the input into groups, apply the kernel for each one, and concat them after all, to work around it.

@kitstar

This comment has been minimized.

Copy link
Contributor

kitstar commented Jan 26, 2018

Hi @kamikawa . The keras group convolution is implemented by @skybigzhou and tested mxnet resnext->keras conversion. Please try the newest code. Thanks both.

@kamikawa

This comment has been minimized.

Copy link
Author

kamikawa commented Jan 26, 2018

Hi @kitstar , thank you for handling this issue, @skybigzhou , thank you for the fixed code.
I tried the fixed version.
resnext IR architecture file could be converted fine, but I failed to convert IR weights file (hang ups) .
It might be my environment specific problem, or weights file convert also needs work around...
But thanks, anyway.


$ python -m mmdnn.conversion._script.IRToCode -f keras --IRModelPath resnext-101-64x4d.pb --dstModelPath keras_resnext-101-64x4d.py
Parse file [resnext-101-64x4d.pb] with binary format successfully.
Target network code snippet is saved as [keras_resnext-101-64x4d.py].

$ python -m mmdnn.conversion.examples.keras.imagenet_test -n keras_resnext-101-64x4d.py -w resnext-101-64x4d.npy --dump keras_resnext-101-64x4d.h5
/home/kamikawa/anaconda3/envs/mmdnn_test/lib/python3.6/site-packages/h5py/init.py:36: FutureWarning: Conversion of the second argument of issubdtype from float to np.floating is deprecated. In future, it will be treated as np.float64 == np.dtype(float).type.
from ._conv import register_converters as _register_converters
Using TensorFlow backend.
2018-01-26 23:16:15.297039: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:892] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2018-01-26 23:16:15.297483: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1030] Found device 0 with properties:
name: GeForce GTX 1080 Ti major: 6 minor: 1 memoryClockRate(GHz): 1.582
pciBusID: 0000:01:00.0
2018-01-26 23:16:15.667115: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1045] Device peer to peer matrix
2018-01-26 23:16:15.667190: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1051] DMA: 0
2018-01-26 23:16:15.667211: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1061] 0: Y
2018-01-26 23:16:15.667226: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1120] Creating TensorFlow device (/device:GPU:0) -> (device: 0, name: GeForce GTX 1080 Ti, pci bus id: 0000:01:00.0, compute capability: 6.1)

@kitstar

This comment has been minimized.

Copy link
Contributor

kitstar commented Jan 27, 2018

It is not hanged up.... It took 30 minutes in my machine to finish the conversion (keras model.save() function). Maybe it needs to copy weights for each conv group since it is a workaround solution. You can put it background and wait for ... 1hour?

@kamikawa

This comment has been minimized.

Copy link
Author

kamikawa commented Jan 27, 2018

Oh, sorry. I think I waited about 10 minutes. I will try again next week, and let you know.

@kamikawa

This comment has been minimized.

Copy link
Author

kamikawa commented Jan 29, 2018

Hi @kitstar , I tried the resnext-101-64x64d IR weight conversion (took about 70 min.) , but failed to save to hdf5 by the limitation of hdf5 object header size. I could convert resnext-50 cases.
I am wondering why you succeeded, but hdf5 64KB object header limit may be the reason for the failure.
If so, the solution is discussed below and we might want to wait for it.
keras-team/keras#7508


Traceback (most recent call last):
File "/home/kamikawa/anaconda3/envs/mmdnn_test/lib/python3.6/runpy.py", line 193, in _run_module_as_main
"main", mod_spec)
File "/home/kamikawa/anaconda3/envs/mmdnn_test/lib/python3.6/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/home/kamikawa/anaconda3/envs/mmdnn_test/lib/python3.6/site-packages/mmdnn-0.1.2-py3.6.egg/mmdnn/conversion/examples/keras/imagenet_test.py", line 58, in
File "/home/kamikawa/anaconda3/envs/mmdnn_test/lib/python3.6/site-packages/mmdnn-0.1.2-py3.6.egg/mmdnn/conversion/examples/keras/imagenet_test.py", line 50, in dump
File "/home/kamikawa/anaconda3/envs/mmdnn_test/lib/python3.6/site-packages/keras/engine/topology.py", line 2573, in save
save_model(self, filepath, overwrite, include_optimizer)
File "/home/kamikawa/anaconda3/envs/mmdnn_test/lib/python3.6/site-packages/keras/models.py", line 119, in save_model
topology.save_weights_to_hdf5_group(model_weights_group, model_layers)
File "/home/kamikawa/anaconda3/envs/mmdnn_test/lib/python3.6/site-packages/keras/engine/topology.py", line 2869, in save_weights_to_hdf5_group
f.attrs['layer_names'] = [layer.name.encode('utf8') for layer in layers]
File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
File "/home/kamikawa/anaconda3/envs/mmdnn_test/lib/python3.6/site-packages/h5py/_hl/attrs.py", line 95, in setitem
self.create(name, data=value, dtype=base.guess_dtype(value))
File "/home/kamikawa/anaconda3/envs/mmdnn_test/lib/python3.6/site-packages/h5py/_hl/attrs.py", line 188, in create
attr = h5a.create(self._id, self._e(tempname), htype, space)
File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
File "h5py/h5a.pyx", line 47, in h5py.h5a.create
RuntimeError: Unable to create attribute (object header message is too large)


@kamikawa

This comment has been minimized.

Copy link
Author

kamikawa commented Feb 2, 2018

Another problem.
KitModel(weight_file=None) function of keras resnext-50 model I converted does not work with an error below.
Regards.


Traceback (most recent call last):
File "resnext_test.py", line 169, in
main()
File "resnext_test.py", line 93, in main
model = KitModel(weight_file=None)
File "/home/kamikawa/test/keras_resnext50.py", line 59, in KitModel
stage1_unit1_conv2 = convolution(weights_dict, name='stage1_unit1_conv2', input=stage1_unit1_conv2_input, group=32, conv_type='layers.Conv2D', filters=128, kernel_size=(3, 3), strides=(1, 1), dilation_rate=(1, 1), padding='valid', use_bias=False)
File "/home/kamikawa/test/keras_resnext50.py", line 266, in convolution
weights_dict[name + "_" + str(c)] = dict()
TypeError: 'NoneType' object does not support item assignment

@kitstar

This comment has been minimized.

Copy link
Contributor

kitstar commented Feb 2, 2018

Hi @kamikawa

For resnext-50, we tested with below steps:

  1. Download MXNet resnext-50 model
$ python3 -m mmdnn.conversion.examples.mxnet.extract_model -n imagenet1k-resnext-50 -i mmdnn/conversion/examples/data/seagull.jpg

Downloading file [./resnext-50-symbol.json] from [http://data.mxnet.io/models/imagenet/resnext/50-layers/resnext-50-symbol.json]
100% [..............................................................................] 79195 / 79195Downloading file [./resnext-50-0000.params] from [http://data.mxnet.io/models/imagenet/resnext/50-layers/resnext-50-0000.params]
100% [......................................................................] 100404458 / 100404458Model imagenet1k-resnext-50 saved.
[11:13:01] src/nnvm/legacy_json_util.cc:190: Loading symbol saved by previous version v0.8.0. Attempting to upgrade...
[11:13:01] src/nnvm/legacy_json_util.cc:198: Symbol successfully upgraded!
[(396, 0.7104751), (398, 0.122665755), (438, 0.06391319), (440, 0.029796895), (417, 0.019492012)]
  1. MXNet -> IR
$ python3 -m mmdnn.conversion._script.convertToIR -f mxnet -n examples/mxnet/models/resnext-50-symbol.json -w examples/mxnet/models/resnext-50-0000.params -d mxnet_resnext50 --inputShape 3 224 224

IR network structure is saved as [mxnet_resnext50.json].
IR network structure is saved as [mxnet_resnext50.pb].
IR weights are saved as [mxnet_resnext50.npy].
  1. IR -> Keras
$ python3 -m mmdnn.conversion._script.IRToCode -f keras --IRModelPath mxnet_resnext50.pb --dstModelPath converted_resnext50.py --IRWeightPath mxnet_resnext50.npy

Parse file [mxnet_resnext50.pb] with binary format successfully.
Target network code snippet is saved as [converted_resnext50.py].
  1. Test Keras Result
$ python3 -m mmdnn.conversion.examples.keras.imagenet_test -p imagenet1k-resnext-50 -s mxnet -n converted_resnext50 -w mxnet_resnext50.npy -i mmdnn/conversion/examples/data/seagull.jpg

[(396, 0.71047646), (398, 0.12266552), (438, 0.063913494), (440, 0.02979695), (417, 0.01949201)]
Test model [imagenet1k-resnext-50] from [mxnet] passed.
  1. Save as Keras model
$ python3 -m mmdnn.conversion.examples.keras.imagenet_test -n converted_resnext50 -w mxnet_resnext50.npy --dump mxnet_resnext50.h5

Keras model file is saved as [mxnet_resnext50.h5], generated by [converted_resnext50.py] and [mxnet_resnext50.npy]

Converted model inference result matches the original inference result. Test passed.


For resnext-101, we tested only without last step and result is ok like resnext-50. Seems we can wait for Keras update to fix this problem.

Thanks.


You can download the resnext-50 related files(Including final h5 file and IR files) from here.

@kamikawa

This comment has been minimized.

Copy link
Author

kamikawa commented Feb 2, 2018

Hi @kitstar , thank you for the detailed procedure.
I did not notice IR weights npy file can be used instead of hdf5 keras weights file.
This is a good specification (though it is slow to load).
Your test sequence command seems set IR weights filename for an argument of KitModel function.
So I think I figure out the situation, and below is summary of error condition I get.

model = KitModel(weight_file=None) # error when resnext , OK when resnet
model = KitModel(weight_file=IR_weights_filename) # OK resnext or resnet. The test mode is run with this setting.

It seems that the converted resnext Kitmodel function does not support when argument is set "None"

Regards.

@kitstar

This comment has been minimized.

Copy link
Contributor

kitstar commented Feb 2, 2018

Glad it help! Thank you for using it and issue submission! Any other idea is welcome!

@kamikawa

This comment has been minimized.

Copy link
Author

kamikawa commented Apr 5, 2018

The limitation of Keras hdf5 object header size seems to be solved
keras-team/keras#7508
I close this issue.
Tensorflow (a backend of Keras) official support of group conv is being discussed here.
tensorflow/tensorflow#3332
When keras supports it , I hope MMdnn convertor will also support it.
Thank you.

@chrisluu

This comment has been minimized.

Copy link

chrisluu commented Dec 27, 2018

Hi @kitstar I followed ur step, but got an error in the final step saving as Keras model, the error is as following:

Traceback (most recent call last):
  File "/home/imsight/anaconda3/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/home/imsight/anaconda3/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/imsight/anaconda3/lib/python3.6/site-packages/mmdnn/conversion/examples/keras/imagenet_test.py", line 161, in <module>
    tester = TestKeras()
  File "/home/imsight/anaconda3/lib/python3.6/site-packages/mmdnn/conversion/examples/keras/imagenet_test.py", line 31, in __init__
    self.model = self.MainModel.KitModel(self.args.w)
  File "converted_resnext50.py", line 247, in KitModel
    set_layer_weights(model, weights_dict)
  File "converted_resnext50.py", line 27, in set_layer_weights
    current_layer_parameters.extend([cur_dict['mean'], cur_dict['var']])
KeyError: 'mean'

Do you know why this happened? Any suggestions?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.
You signed in with another tab or window. Reload to refresh your session. You signed out in another tab or window. Reload to refresh your session.