Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

two questions about the indirection in dwconv #236

Closed
dongxiao92 opened this issue Dec 4, 2019 · 7 comments
Closed

two questions about the indirection in dwconv #236

dongxiao92 opened this issue Dec 4, 2019 · 7 comments

Comments

@dongxiao92
Copy link

I'm trying to understand the indirect convolution algorithm used in xnnpack. It's a cool idea to implement convolution and thanks for contributing this project!
During the code reading, I find a few questions about the implementation of indirect convolution. I list them below.

  1. in XNNPACK/bench/f32-dwconv.cc, line 69
    I think the step_height represents how many pointers are for one single row of the output. But I cannot understand why it is calculated as kernel_size + (output_width * step_width - 1) * kernel_height. If I understand it correctly, it should be kernel_size + ((output_width-1) * step_width) * kernel_height. The first kernel_size is for one complete convolution window and the following part computes how many new pointers are needed in each step. Please correct me if I do wrong.

  2. in XNNPACK/src/indirection.c: xnn_indirection_init_dwconv2d
    In this function, we compute the input spatial location (input_x, input_y). The code checks if it's outside the input(input_x < input_width, input_y < input_height) but it does not compare input_x and input_y to zero. For example, when we compute the input spatial location for the output location (0, 0) and the padding size is not zero(input_padding_top>0, input_padding_left>0), the input_x and input_y will be negative. Is the corresponding address used to set indirection_buffer right in this situation?

@Maratyszcza
Copy link
Contributor

I'd need to look closer into 1.

Regarding 2. input_x and input_y are computed in unsigned arithmetic, where negative numbers wrap and become very big positive numbers. Thus, the comparison input_x < input_width is correct: if input_x is negative when viewed as a signed number, it is a very large positive number when viewed as an unsigned number, and input_x < input_width (unsigned comparison) is false.

@dongxiao92
Copy link
Author

@Maratyszcza Thanks for your reply. I didn't notice the 'size_t' type. So using unsigned type here can reduce one branch which should have been added, right?

@Maratyszcza
Copy link
Contributor

Right, using unsigned type removes half of comparisons (and branches)

@dongxiao92
Copy link
Author

dongxiao92 commented Dec 5, 2019

Thansk, I get this brilliant trick. What about the first problem?

@Maratyszcza
Copy link
Contributor

The first problem is a bug. It should be kernel_size + (output_width-1) * step_width * kernel_height as you suggested.

@dongxiao92
Copy link
Author

dongxiao92 commented Dec 5, 2019

@Maratyszcza got it. Thanks!

@Maratyszcza
Copy link
Contributor

Fixed in 03ff294

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants