Bias term in Conv operator #54
Okay, it seems that bias terms are explicitly excluded from the convolution op and registered as a separate 'Add' op. Is that on purpose?
@zhreshold It isn't missing. The reason we still declare the number of inputs as 2, with an optional 3rd, is to avoid breaking PyTorch or Caffe2. In a previous PR I changed the op to have only 2 inputs, but I reverted it temporarily so as not to break the current back-ends. So yes, it is on purpose that the bias simply uses the Add op.
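The equivalence being discussed can be checked numerically: a Conv node that takes a bias input produces the same result as a bias-free Conv followed by an Add that broadcasts the bias over the channel dimension. The toy `conv2d` below (stride 1, no padding, no groups) is a simplified stand-in for the real operator, written only to illustrate that the two graph shapes compute the same thing.

```python
import numpy as np

def conv2d(x, w, b=None):
    # Toy 2-D convolution: x is (N, C, H, W), w is (M, C, kH, kW).
    # Stride 1, no padding -- a minimal reference, not ONNX's actual Conv.
    n, c, h, wd = x.shape
    m, _, kh, kw = w.shape
    out = np.zeros((n, m, h - kh + 1, wd - kw + 1))
    for i in range(out.shape[2]):
        for j in range(out.shape[3]):
            patch = x[:, :, i:i + kh, j:j + kw]
            out[:, :, i, j] = np.tensordot(patch, w, axes=([1, 2, 3], [1, 2, 3]))
    if b is not None:
        out += b.reshape(1, -1, 1, 1)  # fused bias, one value per output channel
    return out

rng = np.random.default_rng(0)
x = rng.standard_normal((1, 3, 5, 5))
w = rng.standard_normal((2, 3, 3, 3))
b = rng.standard_normal(2)

fused = conv2d(x, w, b)                        # Conv with a bias input
split = conv2d(x, w) + b.reshape(1, -1, 1, 1)  # Conv followed by Add
assert np.allclose(fused, split)
```

Either encoding is valid from a consumer's point of view; the trade-off is purely about how many back-ends recognize the fused form.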
This is related to the situation at #24, where there is a tension between keeping the operator definitions minimal (and thus easier to implement from first principles) and having fused operators that accurately reflect the state-of-the-art implementations everyone uses (e.g. cuDNN).
@zhreshold How is Caffe2 different from other frameworks with regard to retrieving this information from the graph?
@bddppq Is there any way I can get the # channels for convolution or # hidden nodes in FC from GraphProto without checking the shape of the weights?
@zhreshold Unfortunately not. If you have a need for it, we should consider adding it. There is currently a debate about how much information should or should not go in operators; the more info we add, the harder it is for producers but the easier it is for consumers. Use cases will help us make a better decision.
FC is experimental and we put it there temporarily because Caffe depends on it. But I was surprised that the ONNX model we share uses FC instead of the core ONNX ops we agreed on.
PyTorch HEAD has been fixed to use Gemm instead of FC, but I think @zhreshold will have the same problem with Gemm. |
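For context, an FC layer maps onto Gemm by treating the Caffe-style weight matrix (stored as `(out_features, in_features)`) as the transposed `B` input. The toy `gemm` below models the Gemm semantics `Y = alpha * A' * B' + beta * C` discussed here; it is a sketch for illustration, not the ONNX runtime implementation.

```python
import numpy as np

def gemm(a, b, c=None, alpha=1.0, beta=1.0, transA=0, transB=0):
    # Toy model of Gemm semantics: Y = alpha * A' * B' + beta * C,
    # where A' / B' are optionally transposed.
    a = a.T if transA else a
    b = b.T if transB else b
    y = alpha * (a @ b)
    if c is not None:
        y = y + beta * c
    return y

# A Caffe-style FC layer (Y = X @ W.T + bias) expressed as Gemm with transB=1.
x = np.arange(6.0).reshape(2, 3)  # batch of 2 samples, 3 input features
w = np.ones((4, 3))               # 4 hidden units, weights stored (out, in)
bias = np.zeros(4)

y = gemm(x, w, bias, transB=1)
assert y.shape == (2, 4)
```

Note that the original problem remains even after the FC-to-Gemm switch: the hidden size still has to be read off the weight initializer's shape rather than an attribute.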
Is there a possible bug in the Conv operator defined in ConvOpSchemaGenerator?
Is input 2 (the bias) missing here?
BTW, there's no 'num_output' attribute for conv, conv_transpose, or fullyconnected, and no 'use_bias' attribute either.
In other words, these attributes are not retrievable from the graph; instead, we have to derive them from the shapes of the initializers. This design is obviously not friendly to frameworks other than Caffe2.
Am I missing something?
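What the issue describes in practice: without a `num_output`-style attribute, a consumer has to look up each node's weight tensor in the graph's initializers and read the output size off its first dimension. The sketch below uses a plain dict of shapes as a stand-in for `GraphProto.initializer`; the tensor names and shapes are hypothetical, and the first-dimension rule assumes the Caffe-style `(out, in, ...)` weight layout (for Gemm, this corresponds to `transB=1`).

```python
# Hypothetical stand-in for GraphProto.initializer: name -> tensor shape.
initializers = {
    "conv1_w": (64, 3, 7, 7),  # Conv weight: (M, C/group, kH, kW)
    "fc1_w": (1000, 2048),     # FC/Gemm weight: (out_features, in_features)
}

def num_output(weight_name):
    # For Conv weights and (out, in)-layout FC/Gemm weights alike,
    # the first dimension is the number of output channels / hidden units.
    return initializers[weight_name][0]

assert num_output("conv1_w") == 64
assert num_output("fc1_w") == 1000
```

This works, but it couples the consumer to the initializer layout, which is exactly the friction being raised here.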