
Conversation

@nsthorat (Contributor) commented Nov 6, 2019

XNNPack expects kernels in the following format:
[output channels, kernel height, kernel width, input channels]

TensorFlow and TensorFlow.js use the following format:
[kernel height, kernel width, input channels, output channels]

This PR transposes the filter when an XNNPack kernel is created. Since XNNPack keeps its own copy of the filter, we transpose it, hand it to XNNPack, and immediately discard the transposed copy.
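For illustration, here is a minimal sketch of the layout conversion (the function and parameter names are hypothetical, not the PR's actual code):

    #include <vector>

    // Hypothetical sketch: convert a filter from TensorFlow's HWIO layout
    // [height, width, in_channels, out_channels] into XNNPack's OHWI layout
    // [out_channels, height, width, in_channels].
    std::vector<float> TransposeHWIOToOHWI(const float* hwio, int height,
                                           int width, int in_c, int out_c) {
      std::vector<float> ohwi(height * width * in_c * out_c);
      for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
          for (int i = 0; i < in_c; ++i) {
            for (int o = 0; o < out_c; ++o) {
              // OHWI element (o, y, x, i) <- HWIO element (y, x, i, o).
              ohwi[((o * height + y) * width + x) * in_c + i] =
                  hwio[((y * width + x) * in_c + i) * out_c + o];
            }
          }
        }
      }
      return ohwi;
    }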

To share the transpose implementation, I moved the body of Transpose.cc into a separate transpose_impl.cc/h (the files must be named differently beyond capitalization, or Bazel gets confused).


Nikhil Thorat added 2 commits November 6, 2019 15:05
Nikhil Thorat added 3 commits November 6, 2019 17:41
@nsthorat nsthorat changed the title from "WIP: Conv2d bug" to "[WIP] Transpose the filter of the convolution before calling xnnpack." Nov 6, 2019
@googlebot commented

All (the pull request submitter and all commit authors) CLAs are signed, but one or more commits were authored or co-authored by someone other than the pull request submitter.

We need to confirm that all authors are OK with their commits being contributed to this project. Please have them confirm this by leaving a comment in this pull request that contains only "@googlebot I consent."

Note to project maintainer: There may be cases where the author cannot leave a comment, or the comment is not properly detected as consent. In those cases, you can manually confirm consent of the commit author(s), and set the cla label to yes (if enabled on your project).


@googlebot googlebot added cla: no and removed cla: yes labels Nov 7, 2019
Nikhil Thorat added 3 commits November 7, 2019 10:07
@nsthorat (Contributor, Author) commented Nov 7, 2019

@googlebot I consent.

@nsthorat nsthorat requested review from Maratyszcza, annxingyuan and dsmilkov and removed request for dsmilkov November 7, 2019 15:20
@nsthorat nsthorat changed the title from "[WIP] Transpose the filter of the convolution before calling xnnpack." to "[WASM] Transpose the filter of the convolution before calling xnnpack." Nov 7, 2019
@Maratyszcza (Collaborator) left a comment

The code is more complicated than I expected. It would be sufficient to always do just a 2D transpose [M, N] -> [N, M], where M = kernel height × kernel width × input channels and N = output channels.
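As a sketch of this suggestion (assuming row-major storage; this collapses the four nested loops sketched above into two and produces the same reordering):

    #include <vector>

    // Sketch: view the HWIO filter as a row-major [M, N] matrix with
    // M = height * width * in_channels and N = out_channels; a plain 2D
    // transpose then yields the OHWI layout XNNPack expects.
    std::vector<float> Transpose2D(const float* src, int m, int n) {
      std::vector<float> dst(m * n);
      for (int i = 0; i < m; ++i) {
        for (int j = 0; j < n; ++j) {
          dst[j * m + i] = src[i * n + j];
        }
      }
      return dst;
    }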

Reviewable status: 0 of 1 approvals obtained (waiting on @annxingyuan, @dsmilkov, and @Maratyszcza)

@nsthorat (Contributor, Author) commented Nov 7, 2019

Most of the complexity is actually just moving transpose to a shared implementation. However, you're right that a direct 2D transpose is possible here (we do this internally anyway), so I made it 2D.

@Maratyszcza (Collaborator) left a comment

Reviewable status: 0 of 1 approvals obtained (waiting on @annxingyuan, @dsmilkov, and @nsthorat)


tfjs-backend-wasm/src/cc/kernels/Conv2D.cc, line 112 at r4 (raw file):

    // This can be transposed with a 2d transpose to move output_channels to
    // the outermost dimension.
    float* transposed_filter = new float[filter_info.size]();

It is better to avoid directly allocating memory via new, as it can lead to memory leaks. A safer way is to create a std::vector<float> transposed_filter(filter_info.size); the vector's memory is automatically released when it goes out of scope.
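A minimal sketch of the suggested pattern (PackTransposedFilter is a hypothetical name, not the PR's actual function):

    #include <cstddef>
    #include <vector>

    // Hypothetical sketch of the RAII pattern suggested above: the vector
    // zero-initializes its elements and frees its storage automatically
    // when it goes out of scope, so no delete[] is needed.
    void PackTransposedFilter(const float* filter, std::size_t size) {
      std::vector<float> transposed_filter(size);
      // ... transpose `filter` into transposed_filter, then pass
      // transposed_filter.data() to the XNNPack setup call ...
    }  // transposed_filter's storage is released here.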

@Maratyszcza (Collaborator) left a comment

LGTM (strong) for the XNNPACK-related parts.

Reviewable status: 0 of 1 approvals obtained (waiting on @annxingyuan, @dsmilkov, and @nsthorat)

@nsthorat (Contributor, Author) left a comment

Reviewable status: 0 of 1 approvals obtained (waiting on @annxingyuan, @dsmilkov, and @Maratyszcza)


tfjs-backend-wasm/src/cc/kernels/Conv2D.cc, line 112 at r4 (raw file):

Previously, Maratyszcza (Marat Dukhan) wrote…

It is better to avoid directly allocating memory via new, as it can lead to memory leaks. A safer way is to create a std::vector<float> transposed_filter(filter_info.size); the vector's memory is automatically released when it goes out of scope.

Thanks! Done!

@dsmilkov (Contributor) left a comment

Great! That transpose should be pretty fast and it's only a one-time cost.

Reviewed 4 of 11 files at r1, 1 of 6 files at r3, 2 of 3 files at r4, 1 of 1 files at r5.
Reviewable status: :shipit: complete! 1 of 1 approvals obtained (waiting on @annxingyuan, @dsmilkov, and @Maratyszcza)

@dsmilkov dsmilkov added cla: yes and removed cla: no labels Nov 7, 2019
@googlebot commented

A Googler has manually verified that the CLAs look good.

(Googler, please make sure the reason for overriding the CLA status is clearly documented in these comments.)


@nsthorat nsthorat merged commit 0b0a569 into master Nov 7, 2019
@nsthorat nsthorat deleted the conv2d-bug branch November 7, 2019 19:24