
Bias term in Conv operator #54

Closed
zhreshold opened this issue Sep 24, 2017 · 8 comments

@zhreshold

Is there a possible bug in the Conv operator defined in ConvOpSchemaGenerator?

```cpp
schema.NumInputs(2, 3);
schema.NumOutputs(1);
schema.Input(0,
             "X",
             "Input data tensor from previous layer; has size (N x C x H x W)"
             ", where N is the batch size, C is the number of channels, and"
             " H and W are the height and width. Note that this is for the 2D image."
             "Otherwise the size is (N x D1 x D2 ... x Dn)");
schema.Input(1,
             "weights",
             "The weight tensor that will be used in the convolutions; "
             "has size (M x C x kH x kW), where C is the number of channels, "
             "and kH and kW are the height and width of the kernel, and M is the number "
             "of feature maps. For more than 2 dimensions, the kernel shape will be "
             "(M x C x k1 x k2 x ... x kn), where is the dimension of the kernel");
schema.Output(0,
              "Y",
              "Output data tensor that contains the result of the convolution. The "
              "output dimensions are functions of the kernel size, stride size, "
              "and pad lengths.");
```

Is input 2, the bias, missing here?

BTW, there is no 'num_output' attribute for conv, conv_transpose, or fullyconnected, and no 'use_bias' attribute either.
In other words, these attributes are not retrievable from the graph; instead, we have to infer them from the shapes of the initializers. This design is obviously not friendly to frameworks other than Caffe2.

Am I missing something?
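
For illustration, a minimal sketch of the shape-parsing workaround described above, assuming the `onnx` Python package (the model path is illustrative):

```python
import onnx

# Load a model and index its initializers by name (path is illustrative).
model = onnx.load("model.onnx")
initializers = {init.name: init for init in model.graph.initializer}

for node in model.graph.node:
    if node.op_type == "Conv":
        # Input 1 is the weight tensor of shape (M x C x kH x kW); the
        # output-channel count M is only recoverable from its dims, because
        # there is no 'num_output' attribute on the node itself.
        weight = initializers.get(node.input[1])
        if weight is not None:
            print(node.name, "num_output =", weight.dims[0])
```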

@zhreshold (Author)

Okay, it seems like bias terms are explicitly excluded from the convolution op and registered as a separate 'Add' op. Is that on purpose?
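
For reference, a hedged sketch of that decomposition using `onnx.helper` (tensor names and shapes are illustrative; the bias is shaped (1, M, 1, 1) so a plain Add broadcasts over the N x M x H x W convolution output):

```python
from onnx import helper, TensorProto

# Conv takes only data and weights; the bias is applied by a separate Add node.
conv = helper.make_node("Conv", inputs=["X", "W"], outputs=["conv_out"],
                        kernel_shape=[3, 3], pads=[1, 1, 1, 1])
add = helper.make_node("Add", inputs=["conv_out", "B"], outputs=["Y"])

graph = helper.make_graph(
    [conv, add], "conv_with_separate_bias",
    inputs=[helper.make_tensor_value_info("X", TensorProto.FLOAT, [1, 3, 224, 224]),
            helper.make_tensor_value_info("W", TensorProto.FLOAT, [64, 3, 3, 3]),
            helper.make_tensor_value_info("B", TensorProto.FLOAT, [1, 64, 1, 1])],
    outputs=[helper.make_tensor_value_info("Y", TensorProto.FLOAT, [1, 64, 224, 224])],
)
```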

@ebarsoum (Contributor)

@zhreshold It isn't missing. The reason we still allow 2 required inputs with an optional third is to avoid breaking PyTorch or Caffe2. In my previous PR, I updated the schema to have only 2 inputs, and I reverted it temporarily to avoid breaking the current back-ends. So yes, it is on purpose; for the bias, simply use the Add op.

@ezyang (Contributor) commented Sep 25, 2017

This is related to the situation in #24, where there is a tension between keeping the operator definitions minimal (and thus easier to implement from first principles) and having fused operators that accurately reflect the state-of-the-art implementations everyone uses (e.g., cuDNN).

@bddppq (Member) commented Sep 25, 2017

@zhreshold How is Caffe2 different from other frameworks with regard to retrieving this information from the graph?

@zhreshold (Author)

@bddppq Is there any way I can get the number of channels for a convolution or the number of hidden nodes in FC from the GraphProto without checking the shape of the weights?
In the onnx-caffe2 example, https://github.com/onnx/onnx-caffe2/blob/master/onnx_caffe2/backend.py, it seems that no such information is required for Caffe2?

@ezyang (Contributor) commented Oct 9, 2017

@zhreshold Unfortunately not. If you have a need for it, we should consider adding it. There is currently a debate about how much information should or should not go into operators: the more info we add, the harder it is for producers but the easier it is for consumers. Use cases will help us make a better decision.

@ebarsoum (Contributor)

FC is experimental, and we put it there temporarily because Caffe2 depends on it. But I was surprised that the ONNX model we share uses FC instead of the core ONNX ops that we agreed on.

@ezyang (Contributor) commented Oct 11, 2017

PyTorch HEAD has been fixed to use Gemm instead of FC, but I think @zhreshold will have the same problem with Gemm.
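
For illustration, the same shape inspection applied to Gemm, assuming the common transB=1 layout that exporters typically emit (the model path is illustrative):

```python
import onnx

model = onnx.load("model.onnx")  # illustrative path
initializers = {init.name: init for init in model.graph.initializer}

for node in model.graph.node:
    if node.op_type == "Gemm":
        # With transB=1, the weight input has shape (hidden_units, in_features),
        # so the layer width must again be read from the initializer's dims
        # rather than from any attribute on the node.
        weight = initializers.get(node.input[1])
        if weight is not None:
            print(node.name, "hidden_units =", weight.dims[0])
```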

bddppq closed this as completed Aug 14, 2018