
Mobilenet depth_multiplier issue #10349

Closed
pedghz opened this issue Jun 3, 2018 · 8 comments
pedghz commented Jun 3, 2018

I want to change depth_multiplier to values other than 1. I tried these two lines of code:

from keras.applications.mobilenet import MobileNet
basic_model = MobileNet(alpha=0.25, depth_multiplier=0.25, weights=None)

But this error occurs:

Traceback (most recent call last):
  File ".../test.py", line 2, in <module>
    basic_model = MobileNet(alpha=0.25, depth_multiplier=0.25, weights=None)
  File "...\Anaconda3\lib\site-packages\keras\applications\mobilenet.py", line 456, in MobileNet
    x = _depthwise_conv_block(x, 64, alpha, depth_multiplier, block_id=1)
  File "...\Anaconda3\lib\site-packages\keras\applications\mobilenet.py", line 654, in _depthwise_conv_block
    name='conv_dw%d' % block_id)(inputs)
  File "...\Anaconda3\lib\site-packages\keras\engine\topology.py", line 576, in __call__
    self.build(input_shapes[0])
  File "...\Anaconda3\lib\site-packages\keras\applications\mobilenet.py", line 228, in build
    constraint=self.depthwise_constraint)
  File "...\Anaconda3\lib\site-packages\keras\legacy\interfaces.py", line 87, in wrapper
    return func(*args, **kwargs)
  File "...\Anaconda3\lib\site-packages\keras\engine\topology.py", line 397, in add_weight
    weight = K.variable(initializer(shape),
  File "...\Anaconda3\lib\site-packages\keras\initializers.py", line 212, in __call__
    dtype=dtype, seed=self.seed)
  File "...\Anaconda3\lib\site-packages\keras\backend\tensorflow_backend.py", line 3627, in random_uniform
    dtype=dtype, seed=seed)
  File "...\Anaconda3\lib\site-packages\tensorflow\python\ops\random_ops.py", line 240, in random_uniform
    shape, dtype, seed=seed1, seed2=seed2)
  File "...\Anaconda3\lib\site-packages\tensorflow\python\ops\gen_random_ops.py", line 247, in _random_uniform
    seed=seed, seed2=seed2, name=name)
  File "...\Anaconda3\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 589, in apply_op
    param_name=input_name)
  File "...\Anaconda3\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 60, in _SatisfiesTypeConstraint
    ", ".join(dtypes.as_dtype(x).name for x in allowed_list)))
TypeError: Value passed to parameter 'shape' has DataType float32 not in list of allowed values: int32, int64

titu1994 commented Jun 3, 2018

The depth_multiplier argument takes a positive integer. The error you are seeing shows that you passed a float32 value for depth_multiplier, where an int32 or int64 value is required.
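The constraint can be illustrated with a small sketch. This is not Keras's actual validation code; `depthwise_kernel_shape` is a made-up helper that mimics the shape check TensorFlow applies when the depthwise kernel is created:

```python
# Hypothetical sketch of why the TypeError above occurs: a depthwise
# kernel has shape (kernel, kernel, in_channels, depth_multiplier), and
# TensorFlow only accepts int32/int64 values for shape dimensions.
def depthwise_kernel_shape(kernel_size, in_channels, depth_multiplier):
    shape = (kernel_size, kernel_size, in_channels, depth_multiplier)
    if not all(isinstance(dim, int) for dim in shape):
        raise TypeError("shape %r has a non-integer dimension" % (shape,))
    return shape

print(depthwise_kernel_shape(3, 32, 1))    # a valid shape: (3, 3, 32, 1)
try:
    depthwise_kernel_shape(3, 32, 0.25)    # float multiplier, as in the report
except TypeError as err:
    print(err)
```

With depth_multiplier=0.25 the last shape dimension becomes a float, which is exactly what the traceback complains about.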

pedghz commented Jun 3, 2018

@titu1994 But isn't it supposed to make the model smaller and faster, as described in the original paper for the resolution multiplier?

titu1994 commented Jun 3, 2018

That is the alpha parameter. depth_multiplier is kept constant at 1 for all MobileNet and MobileNetV2 models.
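The effect of alpha can be sketched in a few lines. `scaled_filters` is an illustrative helper name, not the actual Keras function, but it mirrors how the width multiplier from the paper scales each layer's filter count:

```python
# Sketch of how alpha (the width multiplier) shrinks every layer's
# filter count; depth_multiplier stays fixed at 1 in the ported models.
def scaled_filters(filters, alpha):
    return int(filters * alpha)

base_widths = [32, 64, 128, 256, 512, 1024]   # MobileNet v1 block widths
print([scaled_filters(f, alpha=0.25) for f in base_widths])
# -> [8, 16, 32, 64, 128, 256]
```

So alpha=0.25 alone already yields the smaller, faster network, without touching depth_multiplier.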

pedghz commented Jun 3, 2018

@titu1994 I think in the original paper both of them reduce the model size, but the Keras implementation differs on this point.

titu1994 commented Jun 3, 2018

In both papers, as well as in the TF code from which the models were ported, the DepthwiseConvolution2D layer had depth_multiplier set to 1 internally, and no option was exposed to change it. Only alpha was provided to change the width of the layers.

Since the DepthwiseConvolution2D layer can take a depth_multiplier argument, I added it as an additional parameter to the original MobileNet building code in my repo.

pedghz commented Jun 5, 2018

@titu1994 from the mobilent paper: "The second hyper-parameter to reduce the computational cost of a neural network is a resolution multiplier ρ. We apply this to the input image and the internal representation of every layer is subsequently reduced by the same multiplier. In practice, we implicitly set ρ by setting the input resolution. We can now express the computational cost for the core layers of our network as depthwise separable convolutions with width multiplier α and resolution multiplier ρ:
D_K · D_K · αM · ρD_F · ρD_F + αM · αN · ρD_F · ρD_F   (7)
where ρ ∈ (0, 1] which is typically set implicitly so that the input resolution of the network is 224, 192, 160 or 128. ρ = 1 is the baseline MobileNet and ρ < 1 are reduced computation MobileNets. Resolution multiplier has the effect of reducing computational cost by ρ^2."
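The quoted cost formula (7) can be checked numerically. `block_cost` is an illustrative helper, not code from the paper; the values 3, 14, and 512 are an assumed 3×3 kernel, a 14×14 feature map, and 512 channels:

```python
# Eq. (7) as arithmetic: D_K·D_K·αM·ρD_F·ρD_F + αM·αN·ρD_F·ρD_F.
# Both terms scale with (rho*D_F)**2, so the total cost scales by rho**2.
def block_cost(D_K, D_F, M, N, alpha=1.0, rho=1.0):
    depthwise = D_K * D_K * (alpha * M) * (rho * D_F) ** 2
    pointwise = (alpha * M) * (alpha * N) * (rho * D_F) ** 2
    return depthwise + pointwise

full = block_cost(D_K=3, D_F=14, M=512, N=512)
reduced = block_cost(D_K=3, D_F=14, M=512, N=512, rho=0.5)
print(reduced / full)   # -> 0.25, i.e. rho**2
```

This confirms the paper's claim that the resolution multiplier reduces computational cost by ρ², and note that ρ enters only through the input resolution, never as a layer argument.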

pedghz commented Jun 5, 2018

And Keras has claimed that depth_multiplier is the same as the resolution multiplier here and here: "depth_multiplier: depth multiplier for depthwise convolution (also called the resolution multiplier)"

titu1994 commented Jun 5, 2018

@pedghz Admittedly, that was a poor choice of words on my part in the documentation, and a poor choice for the name of the parameter itself.

The input resolution parameter rho is implicitly defined by the input shape. The depth multiplier parameter has no relation to it.

The depth multiplier parameter is used as the output channel multiplier for the weight matrix (https://github.com/fchollet/deep-learning-models/blob/master/mobilenet.py#L220). The documentation is clearer in the original Tensorflow repository (https://www.tensorflow.org/api_docs/python/tf/nn/depthwise_conv2d).

The output has in_channels * channel_multiplier channels.

Here, the channel multiplier is the depth_multiplier parameter, which is why it needs to be an integer.
