
how to convert tensorflow model to caffe model? #59

Open
cyh24 opened this issue Sep 9, 2016 · 34 comments
@cyh24

cyh24 commented Sep 9, 2016

How do I convert a TensorFlow .ckpt to a Caffe model? Is that possible without a .prototxt?

Any help is appreciated.

@Dagalaki

Have the same problem. I want to convert a TensorFlow model to a Caffe model.
Have you found any way to do it?

@soldier828

Same problem. @cyh24 @Dagalaki, if you find a solution, please let me know.

@catsdogone

catsdogone commented Feb 14, 2017

@ethereon could you provide some advice?

@ethereon
Owner

The reverse conversion is fairly similar:

  1. Map TensorFlow ops (or groups of ops) to Caffe layers
  2. Transform parameters to match Caffe's expected format

Things are slightly trickier in step 1 when going from TF to Caffe, since the equivalent of a single Caffe layer might be split into multiple TF sub-ops. Pattern matching against the op signatures / scopes might be one approach for tackling this.

For certain ops like convolutions, you can avoid the transformation in step 2 by specifying a Caffe-compatible ordering (e.g. data_format = NCHW)
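A minimal sketch of the scope-matching idea in step 1 (plain Python; the variable names below are hypothetical, not taken from any real checkpoint): group the checkpoint variables by scope prefix and infer the Caffe layer stack from the variable names under each scope.

```python
from collections import defaultdict

# Hypothetical TF checkpoint variable names (not from a real model).
var_names = [
    "conv1/weights", "conv1/biases",
    "conv2/weights", "conv2/BatchNorm/beta",
    "conv2/BatchNorm/moving_mean", "conv2/BatchNorm/moving_variance",
]

def group_by_scope(names):
    """Map each top-level scope to the set of variable suffixes under it."""
    scopes = defaultdict(set)
    for name in names:
        scope, _, suffix = name.partition("/")
        scopes[scope].add(suffix)
    return scopes

def infer_layers(scopes):
    """Guess the Caffe layer stack for each scope from its variable signature."""
    layers = {}
    for scope, suffixes in scopes.items():
        stack = ["Convolution"] if "weights" in suffixes else []
        if any(s.startswith("BatchNorm/") for s in suffixes):
            stack += ["BatchNorm", "Scale"]
        layers[scope] = stack
    return layers

layers = infer_layers(group_by_scope(var_names))
print(layers["conv1"])  # ['Convolution']
print(layers["conv2"])  # ['Convolution', 'BatchNorm', 'Scale']
```

A real converter would match many more signatures (depthwise weights, fully connected layers, etc.), but this scope-prefix grouping is the core of the pattern-matching step.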

@catsdogone

catsdogone commented Feb 21, 2017

@ethereon @cyh24 Thank you for your help. I am trying to convert the inception-resnet-v2 model to Caffe, and I am not sure about the params of the BatchNorm layer.
Is this mapping right:
tfLayer/weights:0 -> caffeLayer_weights, ...[0].data
tfLayer/BatchNorm/beta:0 -> caffeLayerScale_bias, ...[1].data
tfLayer/BatchNorm/moving_mean:0 -> caffeLayerBn_mean, ...[0].data
tfLayer/BatchNorm/moving_variance:0 -> caffeLayerBn_var? ...[1].data
I copy the parameters over, but the resulting Caffe model shows bad classification results.

@Jerryzcn

Jerryzcn commented Mar 23, 2017

@catsdogone I tried the same thing and my activations are off; I cannot get the same accuracy. I also set the scale parameter in the Scale layer to 1, and set BatchNorm's moving average factor to 1. :(

@sskgit

sskgit commented Apr 5, 2017

Same issue. I want to convert the TensorFlow Inception V3 and ResNet models to Caffe. That would be great!

@Jerryzcn

Jerryzcn commented Apr 7, 2017

Okay, I was able to achieve similar performance after changing the padding for the 1x7 and 7x1 filters to (0,3) and (3,0) instead of (1,2) and (2,1)
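For reference, those numbers fall out of TensorFlow's 'SAME' padding rule, which can be computed with a small helper (plain Python; the input size below is just an example):

```python
import math

def same_pads(in_size, kernel, stride):
    """Padding TF 'SAME' applies along one axis, as (before, after)."""
    out_size = math.ceil(in_size / stride)
    total = max((out_size - 1) * stride + kernel - in_size, 0)
    return total // 2, total - total // 2

# A 1x7 filter at stride 1 needs no vertical padding and 3 columns on each
# side, i.e. Caffe pad_h=0, pad_w=3; the 7x1 case is the mirror image (3, 0).
print(same_pads(17, 1, 1))  # (0, 0)
print(same_pads(17, 7, 1))  # (3, 3)
```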

@nyyznyyz1991

nyyznyyz1991 commented May 19, 2017

@Jerryzcn @catsdogone
Could you share how you transferred the model from TensorFlow to Caffe?
Did you rewrite Caffe's prototxt from scratch based on the TensorFlow model, or write a transfer.py script to do it?

@Jerryzcn

@nyyznyyz1991 I use pycaffe to generate the prototxt based on the TensorFlow model. I cannot share it, though.

@neobarney

@Jerryzcn is it hard to code the conversion script with pycaffe?

@Jerryzcn

@neobarney took me about 1 week.

@neobarney

@Jerryzcn wow, that's long. Aren't you planning to release it on GitHub? It would be helpful to lots of people! :)

@Jerryzcn

@neobarney it should only take you 2-3 days; I spent half a week figuring out why my activations did not match the original network.

@neobarney

neobarney commented Jun 21, 2017 via email

@zmlmanly

zmlmanly commented Jul 3, 2017

@Jerryzcn If I use tf.contrib.layers.batch_norm(input, scale=False) in TensorFlow, scale=False means that gamma is not used in "y = gamma*x + beta" (it is effectively fixed at 1).
The definition of contrib.layers.batch_norm in TensorFlow:

    def batch_norm(inputs,
                   decay=0.999,
                   center=True,
                   scale=False,
                   epsilon=0.001,
                   activation_fn=None,
                   param_initializers=None,
                   param_regularizers=None,
                   updates_collections=ops.GraphKeys.UPDATE_OPS,
                   is_training=True,
                   reuse=None,
                   variables_collections=None,
                   outputs_collections=None,
                   trainable=True,
                   batch_weights=None,
                   fused=False,
                   data_format=DATA_FORMAT_NHWC,
                   zero_debias_moving_mean=False,
                   scope=None,
                   renorm=False,
                   renorm_clipping=None,
                   renorm_decay=0.99):

From its docstring: "scale: If True, multiply by gamma. If False, gamma is not used. When the next layer is linear (also e.g. nn.relu), this can be disabled since the scaling can be done by the next layer."

How should the params of the BatchNorm layer in Caffe be set to make the result the same between TensorFlow and Caffe?

@Jerryzcn

Jerryzcn commented Jul 4, 2017

@zmlmanly setting scale to 1 in Caffe should work
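A quick numpy sanity check of this (all values below are synthetic): with scale=False, TF's batch norm computes y = (x - mean) / sqrt(var + eps) + beta, and a Caffe Scale layer whose weight (gamma) is fixed at 1 reproduces it exactly:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8))
mean, var = x.mean(axis=0), x.var(axis=0)
beta = rng.standard_normal(8)
eps = 0.001  # default in tf.contrib.layers.batch_norm

# scale=False: TF applies no gamma at all.
tf_out = (x - mean) / np.sqrt(var + eps) + beta

# Caffe Scale layer with its weight (gamma) set to 1, as suggested above.
gamma = np.ones(8)
caffe_out = gamma * ((x - mean) / np.sqrt(var + eps)) + beta

assert np.allclose(tf_out, caffe_out)
```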

@zmlmanly

zmlmanly commented Jul 4, 2017

@Jerryzcn Thank you very much. I'm converting a TensorFlow model to Caffe. I use tf.contrib.layers.batch_norm(input, scale=False) in TensorFlow, so there is only the beta param in the checkpoint. In your view:
caffeLayer: scale_layer_gamma = 1
caffeLayer: scale_layer_beta = tfLayer/BatchNorm/beta:0
But I cannot find the mean and variance in the checkpoint, so how should I set the mean and variance in Caffe?

@zmlmanly

zmlmanly commented Jul 4, 2017

@catsdogone Hi, I want to know how to save the moving_mean and moving_variance params of the TensorFlow batch norm layer. I have checked the params in my trained TensorFlow model, but there is no mean and variance in the batch norm layer. Thank you for your help.

@Jerryzcn

Jerryzcn commented Jul 4, 2017

@zmlmanly I think I set them either to zero or one. I forgot which exactly.

@MayankSingal

@zmlmanly @neobarney Were you able to get it running?

@jzhaosc

jzhaosc commented Oct 14, 2017

Converting a batch normalization layer from TensorFlow to Caffe: one batchnorm layer in TF is equivalent to a succession of two layers in Caffe, BatchNorm + Scale:

    net.params[bn_name][0].data[:] = tf_movingmean
    # epsilon 0.001 is the default value used by tf.contrib.layers.batch_norm!
    net.params[bn_name][1].data[:] = tf_movingvariance + 0.001
    net.params[bn_name][2].data[:] = 1  # important: set the scale factor to 1
    net.params[scale_name][0].data[:] = tf_gamma
    net.params[scale_name][1].data[:] = tf_beta
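The epsilon-folding above can be checked numerically (plain numpy, synthetic values; this assumes Caffe's own eps, 1e-5 by default, is negligible or set to 0 in batch_norm_param):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8))
mean = rng.standard_normal(8)
var = rng.random(8) + 0.5
gamma, beta = rng.standard_normal(8), rng.standard_normal(8)
eps = 0.001  # tf.contrib.layers.batch_norm default

# What TF computes.
tf_out = gamma * (x - mean) / np.sqrt(var + eps) + beta

# Caffe: BatchNorm normalizes with the stored variance blob; storing
# moving_variance + eps (as above) folds TF's epsilon into that blob.
bn_out = (x - mean) / np.sqrt(var + eps)  # stored blob holds var + eps
caffe_out = gamma * bn_out + beta         # the Scale layer

assert np.allclose(tf_out, caffe_out)
```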

@lhCheung1991

@jzhaosc
Thx, it helps a lot.

Be careful of the epsilon, guys.

@jiezhicheng

@catsdogone May I ask if you have successfully converted inception_resnet_v2 from TensorFlow to Caffe? Thank you.

@AddASecond

When you are using tf.nn.batch_normalization:
tf moving_variance + 0.001 ==> caffe BatchNorm variance blob
tf moving_mean ==> caffe BatchNorm mean blob
tf gamma ==> caffe Scale weights
tf beta ==> caffe Scale bias

@zhongchengyong

Maybe you can look at another open-source library, MMdnn by Microsoft: https://github.com/Microsoft/MMdnn

@AddASecond

AddASecond commented Mar 7, 2018

@giticaniup yes, I've already done it with MMdnn, which is much easier and more intuitive. The only remaining question is how to make the image pre-processing in TensorFlow match that in Caffe.

@cjerry1243

I'm now converting the TF mobilenet-v2 model to a Caffe model.
I use the prototxt here (https://github.com/shicai/MobileNet-Caffe)
and have converted all the params correctly, but cannot get the same accuracy.

Did I miss any detail?

@AddASecond

@cjerry1243 I have tried that before, so I strongly recommend not wasting time training MobileNet in Caffe with that repository. It uses Caffe's built-in group convolution, where the depthwise convolution implementation is a non-parallel for loop without good CUDA/cuDNN support. The training process was very slow, so it takes a lot of time to tune hyperparameters. Maybe you should try https://github.com/yonghenglh6/DepthwiseConvolution or another implementation instead.

@cjerry1243

@bobauditore Thanks for your advice.
I still want to convert the mobilenet-v2 ckpt to a Caffe model.

Apart from the different depthwise convolution in that repository (https://github.com/shicai/MobileNet-Caffe), I found another problem during conversion.

The first conv layer output values
(caffe: net.params['conv1'][0].data and tf: sess.graph.get_tensor_by_name('MobilenetV2/Conv/Conv2D:0')) are different when I feed in the same preprocessed image. The only difference in the image input is channels-last vs channels-first for TF and Caffe.

Besides, I use np.swapaxes to swap the TF variables and feed them into the Caffe variables:
tf_var shape: (height, width, depth, channel)
caffe_var shape: (channel, depth, height, width)

Where's the mistake in my conversion?
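For what it's worth, the layout change described above is the axis permutation (3, 2, 0, 1), which a single np.swapaxes call cannot express, and a plain axis reversal gets height and width backwards. A minimal numpy sketch (shapes are illustrative):

```python
import numpy as np

# TF conv weights are HWIO: (height, width, in_channels, out_channels).
w_tf = np.zeros((1, 7, 64, 128))

# Caffe expects OIHW: (out_channels, in_channels, height, width).
w_caffe = np.transpose(w_tf, (3, 2, 0, 1))
print(w_caffe.shape)  # (128, 64, 1, 7)

# Pitfall: a plain axis reversal also swaps height and width, which is
# wrong for non-square kernels such as 1x7.
print(w_tf.T.shape)  # (128, 64, 7, 1)
```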

@KeyKy

KeyKy commented May 20, 2018

@cjerry1243 Did you solve the problem?

@cjerry1243

cjerry1243 commented May 20, 2018

@KeyKy I have solved it!
The padding method is different in TF and Caffe. Use pad=2 followed by two Slice layers when the conv has stride=2 (pad=1 in TF).
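The reason the stride-2 case needs the extra Slice layers: TF's 'SAME' rule can pad asymmetrically, while Caffe's pad parameter is always symmetric. A small illustration (plain Python):

```python
import math

def same_pads(in_size, kernel, stride):
    """(before, after) padding TF 'SAME' uses along one axis."""
    out_size = math.ceil(in_size / stride)
    total = max((out_size - 1) * stride + kernel - in_size, 0)
    return total // 2, total - total // 2

# 3x3 conv, stride 2, even input size: TF pads 0 before and 1 after.
# No symmetric Caffe pad value matches this exactly, hence padding more
# in Caffe and slicing off the excess rows/columns.
print(same_pads(224, 3, 2))  # (0, 1)
```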

@KeyKy

KeyKy commented May 21, 2018

@cjerry1243 could you share your code with me? It is hard to understand.

Update: I see! Thanks!

@yzsatgithub

Does anyone have any updates on new tools for converting TF models to caffemodel? It's been a long time since this question was posted.
