Add grouped binary convolution support (1/3): the converter. #549

Merged
1 commit merged from grouped-convolutions-converter into master on Nov 5, 2020

Conversation

@AdamHillier (Contributor) commented Oct 22, 2020

What do these changes do?

This is the first of a group of PRs to add support for grouped binary convolutions. I've split the work into three PRs to make review easier.

First, this PR adds support to the converter by adding a groups attribute to the op definition. The correct value for groups is inferred by dividing the number of input channels by the channels_in dimension of the filter. An error is raised if the size of each group (the input channels divided by groups) is not a multiple of 32; I have chosen to add this constraint because it significantly simplifies the implementation in the kernels.
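For illustration, here is a minimal standalone sketch of the inference and the multiple-of-32 check described above. It is not the converter code itself; the names InferGroups and kBitpackingBitwidth are made up for this example.

#include <cassert>

constexpr int kBitpackingBitwidth = 32;

// Returns the inferred number of groups, or -1 if the shapes are invalid.
int InferGroups(int total_input_channels, int filter_input_channels) {
  // The filter's channels_in dimension must divide the total input channels.
  if (total_input_channels % filter_input_channels != 0) return -1;
  const int groups = total_input_channels / filter_input_channels;
  // Each group must cover a multiple of 32 input channels, which keeps the
  // kernel implementation simple.
  if (groups > 1 && filter_input_channels % kBitpackingBitwidth != 0) return -1;
  return groups;
}

int main() {
  assert(InferGroups(/*total_input_channels=*/128, /*filter_input_channels=*/64) == 2);
  assert(InferGroups(128, 128) == 1);  // ordinary, non-grouped convolution
  assert(InferGroups(128, 16) == -1);  // group size 16 is not a multiple of 32
}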

How Has This Been Tested?

MLIR FileCheck tests have been added to check that the attribute is set correctly, and that an error is raised if the group size is not a multiple of 32.

Benchmark Results

N/A.

Related issue number

#550, #551.

Comment on lines 126 to 128
if (total_input_channels % filter_input_channels != 0) {
mlir::emitError(filter_val.getLoc())
<< "Filter dimensions invalid: the number of filter input channels "
<< filter_input_channels
<< " does not divide the total number of input channels "
<< total_input_channels << "\n";
num_groups = 1;
}
@AdamHillier (Contributor Author)

Technically I don't think this is required, because the Conv2D op throws an error if this condition is violated, but it's probably better safe than sorry.

Comment on lines 134 to 136
if (num_groups > 1 && filter_input_channels % bitpacking_bitwidth != 0) {
mlir::emitError(filter_val.getLoc())
<< "Invalid binary group convolution: the number of input channels "
"per-group "
"must be a multiple of "
<< bitpacking_bitwidth << ", but is " << filter_input_channels << "\n";
num_groups = 1;
}
@AdamHillier (Contributor Author)

We could choose not to throw an error here, and simply not apply the transformation if this condition is violated, so that there will be a float op in the model. I'm not sure what the right approach is.

@AdamHillier (Contributor Author)

Upon reflection and discussion with @Tombana, there's actually no technical need to have a groups argument, but only because of the constraint that the input depth per group (if the number of groups is greater than one) is a multiple of 32. If we know we'll never see a flatbuffer with a grouped convolution that has a non-multiple-of-32 group size, then we can infer the number of groups from the (bitpacked) filter channels-in dimension and the (bitpacked) input channels dimension.

I'm not sure what the right thing to do here is. Conciseness is nice, but I worry that inferring the number of groups by relying on that constraint being satisfied might lead to subtle bugs in the future.
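To make the inference idea concrete, here is a rough standalone sketch (illustrative names only, not real converter code). It assumes the multiple-of-32 constraint holds, so that a channel count c bitpacks to exactly c / 32 words and the ratio of the bitpacked dimensions equals the ratio of the unpacked ones.

#include <cassert>

int InferGroupsFromBitpackedDims(int bitpacked_input_channels,
                                 int bitpacked_filter_channels_in) {
  return bitpacked_input_channels / bitpacked_filter_channels_in;
}

int main() {
  // 128 input channels in groups of 64 bitpack to 4 and 2 words respectively.
  assert(InferGroupsFromBitpackedDims(128 / 32, 64 / 32) == 2);
}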

@Tombana (Collaborator) commented Oct 23, 2020

Even if we can infer it in theory, I think it's good to have it as a separate parameter, if only as an extra correctness check.

@AdamHillier added the feature (New feature or request) label on Oct 23, 2020
@lgeiger (Member) commented Nov 2, 2020

I'm not sure what the right thing to do here is. Conciseness is nice, but I worry that inferring the number of groups by relying on that constraint being satisfied might lead to subtle bugs in the future.

I'd actually prefer not to have a dedicated group attribute in the op. TensorFlow does the same for their Conv2DOp (although in their case I probably would've preferred a dedicated groups argument in the user-facing code for better validation).

I think we can do all the necessary validation in Prepare even without a group argument, since we already have a channels_in argument and the shapes of the bitpacked input and bitpacked filters. Or am I missing a case that we can't catch that could lead to subtle bugs?

In both cases we should always make sure that the number of filters per group is divisible by 32 in the converter anyway.

@AdamHillier (Contributor Author)

I think I'm on board with that.

In case somebody tries to convert a grouped Larq bconv with a non-multiple-of-32 group size, do you think we should throw a converter error (as this PR currently does) or fall back to an emulated bsign + float conv?

@lgeiger (Member) commented Nov 2, 2020

In case somebody tries to convert a grouped Larq bconv with a non-multiple-of-32 group size, do you think we should throw a converter error (as this PR currently does) or fall back to an emulated bsign + float conv?

I think it's fair to throw an error for now, since TFLite doesn't support grouped convolutions so it would fail at runtime anyway.

Add a validation check to the converter to ensure that grouped
convolutions have a group size that is a multiple of 32, and report
an error otherwise.
@AdamHillier (Contributor Author)

@lgeiger I've updated the PR so that it only performs the check, rather than adding the groups attribute.

@lgeiger (Member) left a comment

This is great!

Do you think it would make sense to move this check into an op verifier for the bconv op, as mentioned in #406, instead? That way the check would be op-specific instead of limited to this transformation. Although I'm not sure if we'd still have all of the information there, since the bconv op will already have bitpacked inputs and might have either float or binary filters.

@AdamHillier (Contributor Author) commented Nov 5, 2020

Do you think it would make sense to move this check into an op verifier for the bconv op, as mentioned in #406, instead? That way the check would be op-specific instead of limited to this transformation. Although I'm not sure if we'd still have all of the information there, since the bconv op will already have bitpacked inputs and might have either float or binary filters.

I really like this idea in theory, but you're right that it won't be possible once the filters have been bitpacked. E.g. the number of input channels could be 64 and the bitpacked filter channels-in dimension could be 1, and we can't tell whether that's because the filter has 32 input channels (which would be valid) or only 1 input channel that's been padded by bitpacking (as in a depthwise conv, which would not be valid).
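Here's a tiny, purely illustrative sketch of that ambiguity, assuming ceil-division bitpacking with a 32-bit bitwidth (the BitpackedSize name is made up for this example):

#include <cassert>

int BitpackedSize(int channels, int bitwidth = 32) {
  return (channels + bitwidth - 1) / bitwidth;  // ceil division
}

int main() {
  assert(BitpackedSize(32) == 1);  // 32 filter input channels: a valid group size
  assert(BitpackedSize(1) == 1);   // 1 filter input channel, padded by bitpacking: not valid
  // Both cases look identical to a verifier that only sees bitpacked shapes.
}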

@lgeiger (Member) commented Nov 5, 2020

I really like this idea in theory, but you're right that it won't be possible once the filters have been bitpacked.

That's true. In theory I think it would still work, since the filters will be bitpacked in a later pass, but that means the verifier would be tied to this transformation anyway. Let's keep it like this for now.

@lgeiger merged commit e4fb333 into master on Nov 5, 2020
@lgeiger deleted the grouped-convolutions-converter branch on November 5, 2020 at 12:53