
Support caffemodel directly #72

Open
waTeim opened this issue Jul 14, 2015 · 21 comments


waTeim commented Jul 14, 2015

Yes, this was asked before in #55, and yes, caffemodels can be converted. That's not good enough. But it looks like they can be read directly using Julia's ProtoBuf.jl; see JuliaIO/ProtoBuf.jl#48.
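
For reference, a minimal sketch of reading a .caffemodel directly, assuming the caffe proto definitions have already been compiled into a Julia module `caffe` with ProtoBuf.jl (as discussed in JuliaIO/ProtoBuf.jl#48); the names here are illustrative:

import ProtoBuf

# Parse the binary caffemodel into the generated NetParameter message.
net = open("bvlc_googlenet.caffemodel", "r") do io
   ProtoBuf.readproto(io, caffe.NetParameter())
end

println(net.name)   # e.g. "GoogleNet"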

To confirm, what test would you propose?


waTeim commented Jul 15, 2015

Following up, I wrote some exploratory code and here are the layer types of GoogLeNet. InnerProducts are supported, and I guess Convolutions are too. What about the rest? In addition, there are DATA, POOLING, RELU, SPLIT, SOFTMAX_LOSS, LRN, CONCAT, and DROPOUT.

julia> import CaffeOperations;
julia> x = CaffeOperations.loadCaffeeNetwork("bvlc_googlenet.caffemodel");

julia> reshape(CaffeOperations.layerTypes(x),(13,13))
13x13 Array{Symbol,2}:
 :DATA         :CONVOLUTION  :CONCAT       :CONVOLUTION  :CONVOLUTION    :INNER_PRODUCT  …  :RELU           :DROPOUT        :POOLING      :RELU         :RELU         
 :SPLIT        :RELU         :SPLIT        :RELU         :RELU           :SOFTMAX_LOSS      :CONVOLUTION    :INNER_PRODUCT  :CONVOLUTION  :CONVOLUTION  :CONVOLUTION  
 :CONVOLUTION  :CONVOLUTION  :CONVOLUTION  :CONCAT       :POOLING        :CONVOLUTION       :RELU           :SOFTMAX_LOSS   :RELU         :RELU         :RELU         
 :RELU         :RELU         :RELU         :POOLING      :CONVOLUTION    :RELU              :POOLING        :CONVOLUTION    :CONCAT       :POOLING      :CONVOLUTION  
 :POOLING      :CONVOLUTION  :CONVOLUTION  :SPLIT        :RELU           :CONVOLUTION       :CONVOLUTION    :RELU           :POOLING      :CONVOLUTION  :RELU         
 :LRN          :RELU         :RELU         :CONVOLUTION  :CONCAT         :RELU           …  :RELU           :CONVOLUTION    :SPLIT        :RELU         :POOLING      
 :CONVOLUTION  :CONVOLUTION  :CONVOLUTION  :RELU         :SPLIT          :CONVOLUTION       :CONCAT         :RELU           :CONVOLUTION  :CONCAT       :CONVOLUTION  
 :RELU         :RELU         :RELU         :CONVOLUTION  :POOLING        :RELU              :SPLIT          :CONVOLUTION    :RELU         :SPLIT        :RELU         
 :CONVOLUTION  :CONVOLUTION  :CONVOLUTION  :RELU         :CONVOLUTION    :CONVOLUTION       :POOLING        :RELU           :CONVOLUTION  :CONVOLUTION  :CONCAT       
 :RELU         :RELU         :RELU         :CONVOLUTION  :RELU           :RELU              :CONVOLUTION    :CONVOLUTION    :RELU         :RELU         :POOLING      
 :LRN          :POOLING      :CONVOLUTION  :RELU         :INNER_PRODUCT  :CONVOLUTION    …  :RELU           :RELU           :CONVOLUTION  :CONVOLUTION  :DROPOUT      
 :POOLING      :CONVOLUTION  :RELU         :CONVOLUTION  :RELU           :RELU              :INNER_PRODUCT  :CONVOLUTION    :RELU         :RELU         :INNER_PRODUCT
 :SPLIT        :RELU         :POOLING      :RELU         :DROPOUT        :POOLING           :RELU           :RELU           :CONVOLUTION  :CONVOLUTION  :SOFTMAX_LOSS 

julia> x.name
"GoogleNet"


pluskid commented Jul 15, 2015

All the layers mentioned here are supported. Check out the IJulia notebook for the pretrained ImageNet model for an example of the correspondence.


waTeim commented Jul 15, 2015

Yeah, I'm reading the docs now. It looks like a translation is possible; I'm looking at the layers one by one.

So far it looks like all of the convolution layers have 2 blobs associated with them; is that expected?

As for the Xavier filler <--> Initializer mapping, it looks like the caffe model allows parameterization?

dump(x.layers[9].convolution_param)
...
  weight_filler: CaffeOperations.caffe.FillerParameter 
    _type: ASCIIString "xavier"
    value: Float32 0.0
    min: Float32 0.0
    max: Float32 1.0
    mean: Float32 0.0
    std: Float32 0.03      <--- here
    sparse: Int32 -1
    variance_norm: Int32 0

Do you have a URL for that notebook?


pluskid commented Jul 15, 2015

Sorry, I'm currently traveling and do not have a computer, so I'll try to be brief.

There is a link to the notebook in the tutorial section of the docs. Currently the Xavier initializer is not customizable, I believe, but it should be very easy to add a parameter.

Yes, a convolutional layer expects two blobs, but you can always set the bias blob to zero if you do not need it.
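
For reference, a rough sketch of what the newInitializer helper used later in this thread might look like, assuming Mocha's stock XavierInitializer, GaussianInitializer, and ConstantInitializer constructors; the fallback branch is illustrative only:

# Rough sketch: map a caffe FillerParameter onto a Mocha initializer.
# ProtoBuf.jl renames caffe's `type` field to `_type` (see the dump above).
function newInitializer(filler::caffe.FillerParameter)
   if filler._type == "xavier"
      return Mocha.XavierInitializer()
   elseif filler._type == "gaussian"
      return Mocha.GaussianInitializer(mean=filler.mean, std=filler.std)
   elseif filler._type == "constant"
      return Mocha.ConstantInitializer(filler.value)
   else
      error("unsupported filler type: $(filler._type)")
   end
end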


waTeim commented Jul 17, 2015

That's fine, the caffe file does have 2 blobs. Bias blob? Does that correspond to bottom?

make_blob(backend, ...  

Should the backend argument have a default value of whatever the current backend is?


pluskid commented Jul 18, 2015

Yes, caffe has two blobs for convolution. They are not bottoms; bottoms are input blobs. What we are talking about here are parameter blobs.

I'm not sure I like the idea of a global backend. The idea is that a user should supply an initialized backend whenever they want to do something important. I think it is perfectly fine for the function that converts a caffe model to accept a backend parameter.


waTeim commented Jul 21, 2015

So how are the parameter blobs connected to a Convolution layer? I see the only candidates are bottom and top. If not those, then what else is there?


pluskid commented Jul 22, 2015

@waTeim filters and bias are parameters of a layer. For example, in an InnerProductLayer, top = parameter * bottom. There are three kinds of blobs: input (bottom), output (top), and weight/filters (parameters).
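
A toy illustration of the three blob kinds for an inner-product layer, using plain arrays with arbitrary sizes (not Mocha code):

W = rand(Float32, 10, 4)      # parameter blob: weights/filters
b = zeros(Float32, 10)        # parameter blob: bias
bottom = rand(Float32, 4)     # input blob (bottom)
top = W * bottom + b          # output blob (top)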


waTeim commented Jul 22, 2015

The part I'm having trouble with is the mapping. Here's the ProtoBuf description. There is a blobs field in the layers section. When read, this field is populated with 2 blobs. Which is which? They're not labeled.

julia> size(x.layers[9].blobs)
(2,)

Bottoms and tops are set to arrays of symbols, which I think refer to some index; how do the blobs get associated with those symbols? Does the .ipynb make it clear?

Current code, maybe wrong:

  return Mocha.ConvolutionLayer(
   name = caffeLayer.name,
   n_filter = Int(caffeLayer.convolution_param.num_output),
   kernel = kernel,
   pad = pad,
   stride = stride,
   filter_init = newInitializer(caffeLayer.convolution_param.weight_filler),
   bias_init = biasInitializer,
   tops = getLayerRefList(caffeLayer.top),
   bottoms = getLayerRefList(caffeLayer.bottom)
  );


pluskid commented Jul 22, 2015

Is the x object a Mocha net or a caffe net? In a Mocha layer state, there is a field called blobs which holds references to output blobs, but you don't need to care about them as they will be created automatically. In contrast, in caffe, IIRC, the blobs field holds the parameter blobs. You can do the following things with it:

  1. Ignore it, as the parameter blobs will be created automatically according to the specification such as n_filter, etc.
  2. You may do cross checking to make sure that the shape of the parameter blobs matches the specification of the layer definition, e.g., is the n_filter parameter correct? (See the sketch after this list.)
  3. If the caffe file contains an already trained model, you can actually copy those blobs out and use a customized initializer for the parameter blobs so that they are filled with those trained parameters instead of random initial values.
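
A rough sketch of the cross-check in option 2, assuming a V1-style BlobProto with num/channels/height/width fields; this is illustrative, not the converter's actual code:

# Illustrative cross-check: the first parameter blob of a convolution layer
# should have `num` equal to the layer's num_output (n_filter).
layer = x.layers[9]
n_filter = Int(layer.convolution_param.num_output)
weight_blob = layer.blobs[1]
@assert weight_blob.num == n_filter "stored weights do not match n_filter"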


waTeim commented Jul 23, 2015

x is a parsed, trained caffe net, so it looks like option 3. Is this simply a matter of creating a new Initializer type?


pluskid commented Jul 23, 2015

Yes, the easiest way I can imagine is to create an initializer that simply copies the content of an existing array into the target blob being initialized. Something roughly like

ConvolutionLayer(..., filter_init=CopyInitializer(caffe_layer.blobs[1]), bias_init=CopyInitializer(caffe_layer.blobs[2]),...)


waTeim commented Aug 9, 2015

Took a while, but I'm back on it. Does this look about right?

immutable CopyInitializer <: Mocha.Initializer
   caffeBlob::caffe.BlobProto
end

function init(initializer::CopyInitializer,blob::Mocha.Blob)
   Mocha.fill!(blob,initializer.caffeBlob.data)
end


pluskid commented Aug 9, 2015

Yes, maybe with small modifications:

  • I'm not sure whether the data in caffe.BlobProto will remain valid after you close the protobuffer file. You might need to copy the data into a Julia array and hold the Julia array in your CopyInitializer instead.
  • You should use Mocha.copy! instead of fill!, as fill! is only used to fill a blob everywhere with a scalar.
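
A rough sketch combining both suggestions; extending Mocha.init and the Mocha.copy!(blob, array) form are assumptions about how a custom initializer hooks into Mocha:

# Copy the protobuf data into an independent Julia array up front, and use
# copy! (not fill!) when the blob is initialized.
immutable CopyInitializer <: Mocha.Initializer
   data::Vector{Float32}
end
CopyInitializer(caffeBlob::caffe.BlobProto) = CopyInitializer(copy(caffeBlob.data))

function Mocha.init(initializer::CopyInitializer, blob::Mocha.Blob)
   # the stored array is flat; its length must match the blob's total size
   Mocha.copy!(blob, initializer.data)
end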


waTeim commented Aug 9, 2015

I'm pretty sure that by the time it gets into caffe.BlobProto it's an array that exists independently of the file, so normal GC applies. Re using copy!, yeah, easy change.

Coming up next is the InnerProduct layer type, which seems to be straightforward, except that it's not clear to me that caffe's num_output is equivalent to Mocha's output_dim, though it did appear to be the only choice left.

Here's the Protobuf stuff:

type InnerProductParameter
    num_output::UInt32
    bias_term::Bool
    weight_filler::FillerParameter
    bias_filler::FillerParameter
    axis::Int32
end #type InnerProductParameter

From caffe's docs:

Parameters (InnerProductParameter inner_product_param)
  • Required
    num_output (c_o): the number of filters
  • Strongly recommended
    weight_filler [default type: 'constant' value: 0]
  • Optional
    bias_filler [default type: 'constant' value: 0]
    bias_term [default true]: specifies whether to learn and apply a set of additive biases to the filter outputs


pluskid commented Aug 10, 2015

@waTeim Yes, num_output is exactly output_dim, and similarly to before, the fillers correspond to initializers in Mocha.
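
A rough sketch of the corresponding mapping, reusing the newInitializer and getLayerRefList helpers from the convolution case; the keyword names (output_dim, weight_init, bias_init) are assumed from Mocha's InnerProductLayer:

# Map a caffe INNER_PRODUCT layer onto Mocha's InnerProductLayer.
function newInnerProductLayer(caffeLayer::caffe.V1LayerParameter)
   p = caffeLayer.inner_product_param
   return Mocha.InnerProductLayer(
    name = caffeLayer.name,
    output_dim = Int(p.num_output),
    weight_init = newInitializer(p.weight_filler),
    bias_init = newInitializer(p.bias_filler),
    tops = getLayerRefList(caffeLayer.top),
    bottoms = getLayerRefList(caffeLayer.bottom)
   );
end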


waTeim commented Aug 16, 2015

Last layer type: the data layer, which comes with a TransformationParameter:

type TransformationParameter
    scale::Float32
    mirror::Bool
    crop_size::UInt32
    mean_file::AbstractString
    mean_value::Array{Float32,1}
    force_color::Bool
    force_gray::Bool
    TransformationParameter() = (o=new(); fillunset(o); o)
end #type TransformationParameter

A number of these things don't appear to be supported, except scale and mean. It looks like Caffe assumes both of these happen simultaneously, while Mocha appears to want to apply one and then the other (presumably mean subtraction followed by scaling). Caffe appears to have multiple mean values (1 per channel?) while Mocha wants a blob.

What's the expected format of this blob?


waTeim commented Aug 16, 2015

Limited success.

  1. To keep things simple I used cifar10_nin.caffemodel from Model Zoo
  2. The output can be seen here.
  3. I just arbitrarily picked input blob dimensions of 10x10x1x1 which is almost certainly wrong.

The critical line is this:

 x = CaffeOperations.convertCaffeNetwork("cifar10_nin.caffemodel",[(10,10,1,1),(10,10,1,1)]);

How do I determine the input blob dims? Does this come from the data?


pluskid commented Aug 17, 2015

scale and mean can be mapped to DataTransformers in Mocha.

Caffe specifies everything together, but technically they cannot happen "together". For example, caffe subtracts the mean first, and then does re-scaling. See their code here: https://github.com/BVLC/caffe/blob/master/src/caffe/data_transformer.cpp#L113

Yes, the Mocha data transformer expects a mean blob, which should be of the same shape as a data point. Specifically, for image data, we can make this blob by duplicating the per-channel values at each pixel location. For example,

mean_channels = [1,2,3] # mean values for each of the RGB channels
img_width = 256
img_height = 256
mean_channels = reshape(mean_channels, (1,1,3)) # make it proper shape
mean_img = repeat(mean_channels, inner=[img_width,img_height,1]) # of proper layout for mean_blob

The crop option can be supported by the CropLayer in Mocha.

force_color and force_gray are not supported yet.
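
Putting the pieces together, a rough sketch of how the mean image and scale might be wired up as Mocha data transformers; make_blob(backend, array), SubMean's mean_blob keyword, the (blob_name, transformer) pairing, and the 1/255 scale value are all assumptions here:

# Mean subtraction followed by scaling, mirroring caffe's order.
# `backend` is an already-initialized Mocha backend (e.g. CPUBackend).
mean_blob = Mocha.make_blob(backend, convert(Array{Float32}, mean_img))
transformers = [
   (:data, Mocha.DataTransformers.SubMean(mean_blob=mean_blob)),
   (:data, Mocha.DataTransformers.Scale(scale=1/255))
]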


pluskid commented Aug 17, 2015

@waTeim That is brilliant! I'm not sure why you need to decide the input blob dims; I'm not sure whether the Caffe model stores this information somewhere. They will be automatically determined when the program starts reading data from the HDF5 files. Do you mean you need this shape information in the data transformer?


waTeim commented Aug 17, 2015

Hey, thanks! As far as dims, I kind of brought it on myself as I'm trying to remain as agnostic as I can, and am therefore using MemoryDataLayer. Potentially I can use LevelDB directly as well with some additional help.

Here's the still primitive function in question:

function newDataLayer(caffeLayer::caffe.V1LayerParameter,dims)
   # placeholder arrays for the input blobs (MemoryDataLayer needs concrete data)
   data = Vector{Array}();
   for i = 1:length(dims)
      push!(data,Array(Float32,dims[i]))
   end
   # map caffe's TransformationParameter onto Mocha data transformers
   transformers::Vector = [];
   if ProtoBuf.has_field(caffeLayer,:transform_param)
      scale = Float32(caffeLayer.transform_param.scale)
      push!(transformers,Mocha.DataTransformers.Scale(scale));
   end
   return Mocha.MemoryDataLayer(
    name = caffeLayer.name,
    batch_size = 1,
    data = data,
    transformers = transformers,
    tops = getLayerRefList(caffeLayer.top)
   );
end
