
Two questions about the library #148

Closed
RuABraun opened this issue Nov 14, 2019 · 12 comments

@RuABraun

  1. It seems the F32 GEMM implementation quantizes the input and output? I got that from the usage here, where one has to pass min/max values for the output. I'm worried that the approximation will degrade accuracy significantly (I'm already quantizing all the layers I can to int8). I still have to test that, but just to confirm: is there an SGEMM implementation that doesn't quantize the input and output?

  2. The readme says

XNNPACK is a highly optimized library of floating-point neural network inference operators

however, the code also seems to contain GEMM implementations with int8 weights, etc.? I'm using QNNPACK for that at the moment; would it make sense to switch to XNNPACK for int8 layers?

@Maratyszcza

Maratyszcza commented Nov 14, 2019

  1. F32 functions don't quantize computations. However, many F32 operators accept output_min and output_max arguments, which enable clamping the output to an arbitrary range (helpful e.g. for fusing simple activation functions). If you don't want to clamp the output, just set them to ±std::numeric_limits<float>::infinity() (a sketch follows after this list).
  2. XNNPACK is a fork of QNNPACK. Q8 operators in XNNPACK are remnants of QNNPACK code, but many internal optimizations were removed, and performance of these operators would be worse than in QNNPACK.
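
For concreteness, here is a minimal sketch (not from the thread) of creating an F32 Fully Connected operator with clamping effectively disabled by passing ±infinity. The argument order follows the late-2019 xnn_create_fully_connected_nc_f32 signature and may differ in newer releases; check xnnpack.h for your version.

```cpp
// Sketch only: assumes xnn_initialize() has already been called successfully.
#include <cstddef>
#include <limits>
#include <xnnpack.h>

xnn_operator_t create_unclamped_fc(size_t input_dim, size_t output_dim,
                                   const float* kernel,  // [output_dim * input_dim]
                                   const float* bias) {  // [output_dim]
  xnn_operator_t fc_op = nullptr;
  const xnn_status status = xnn_create_fully_connected_nc_f32(
      /*input_channels=*/input_dim, /*output_channels=*/output_dim,
      /*input_stride=*/input_dim, /*output_stride=*/output_dim,
      kernel, bias,
      /*output_min=*/-std::numeric_limits<float>::infinity(),
      /*output_max=*/+std::numeric_limits<float>::infinity(),
      /*flags=*/0, &fc_op);
  return status == xnn_status_success ? fc_op : nullptr;
}
```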

@RuABraun

Okay thank you!

@RuABraun

RuABraun commented Nov 14, 2019

@Maratyszcza one more question: is the kernel shaped (input_dim, output_dim) or (output_dim, input_dim)?

edit: seems to be (output_dim, input_dim)

@Maratyszcza

For the Fully Connected operator, it is (output_dim, input_dim). Generally, operators use the NHWC layout.
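
To make the layout concrete, a plain reference loop (a sketch, not XNNPACK code) over a kernel stored as (output_dim, input_dim) in row-major order would look like this:

```cpp
#include <cstddef>

// Row o of the kernel holds the weights for output channel o.
void fully_connected_reference(size_t input_dim, size_t output_dim,
                               const float* input,   // [input_dim]
                               const float* kernel,  // [output_dim * input_dim]
                               const float* bias,    // [output_dim]
                               float* output) {      // [output_dim]
  for (size_t o = 0; o < output_dim; o++) {
    float acc = bias[o];
    for (size_t i = 0; i < input_dim; i++) {
      acc += kernel[o * input_dim + i] * input[i];
    }
    output[o] = acc;
  }
}
```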

@RuABraun

Works great!

Two more questions: are there any plans to add an elementwise product operator? And is it somehow possible to make Fully Connected add to the output instead of overwriting it?

@Maratyszcza

Elementwise product (including broadcasting support) landed just yesterday; see xnn_create_multiply_nd_f32 and xnn_setup_multiply_nd_f32. Currently the Fully Connected operator doesn't support fused addition, but you can use a separate Add operator (see xnn_create_add_nc_f32 and xnn_setup_add_nc_f32).
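
A hedged sketch of the newly landed broadcasting multiply, based on the function names mentioned above. The argument lists follow the late-2019 API (create/setup split, threadpool passed to setup) and may have changed in later releases.

```cpp
// Multiply a [batch, channels] tensor by a per-channel [channels] vector,
// broadcasting the vector over the batch dimension.
// Assumes xnn_initialize() has already been called.
#include <cstddef>
#include <limits>
#include <xnnpack.h>

xnn_status multiply_broadcast(size_t batch, size_t channels,
                              const float* x, const float* scale, float* y) {
  xnn_operator_t mul_op = nullptr;
  xnn_status status = xnn_create_multiply_nd_f32(
      /*output_min=*/-std::numeric_limits<float>::infinity(),
      /*output_max=*/+std::numeric_limits<float>::infinity(),
      /*flags=*/0, &mul_op);
  if (status != xnn_status_success) return status;

  const size_t x_shape[2] = {batch, channels};
  const size_t scale_shape[1] = {channels};  // broadcast over the batch dim
  status = xnn_setup_multiply_nd_f32(mul_op, 2, x_shape, 1, scale_shape,
                                     x, scale, y, /*threadpool=*/nullptr);
  if (status != xnn_status_success) return status;

  status = xnn_run_operator(mul_op, /*threadpool=*/nullptr);
  xnn_delete_operator(mul_op);
  return status;
}
```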

@RuABraun

Haha perfect timing! Thanks

@RuABraun

RuABraun commented Nov 21, 2019

@Maratyszcza it seems broadcasting support is not available for the Add operation (for when one wants to implement batch normalization, for example)? Is my impression correct?

Could disabling the check and setting the bias/mean stride to 0 work as a quick hack?

edit: hah, looks like it does!
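
A sketch of the quick hack described above, assuming the late-2019 xnn_create_add_nc_f32 signature and that the library's internal stride validation has been relaxed as noted in the comment; not an officially supported usage.

```cpp
// Create an Add operator whose second input is a single [channels] row that is
// re-read for every batch element (b_stride == 0), i.e. a broadcasted
// per-channel bias. Assumes xnn_initialize() has already been called.
#include <cstddef>
#include <limits>
#include <xnnpack.h>

xnn_status create_bias_add(size_t channels, xnn_operator_t* add_op_out) {
  return xnn_create_add_nc_f32(
      channels,
      /*a_stride=*/channels, /*b_stride=*/0, /*sum_stride=*/channels,
      /*sum_min=*/-std::numeric_limits<float>::infinity(),
      /*sum_max=*/+std::numeric_limits<float>::infinity(),
      /*flags=*/0, add_op_out);
}
```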

@Maratyszcza

For batch normalization, I'd recommend converting it into a 1x1 depthwise convolution. XNNPACK has special optimizations for 1x1 DW convolution.
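
Background for this suggestion (a sketch, independent of XNNPACK): inference-time batch norm is an affine per-channel transform, y = gamma * (x - mean) / sqrt(var + eps) + beta = scale * x + shift, so it maps onto a 1x1 depthwise convolution whose per-channel weight is `scale` and whose bias is `shift`.

```cpp
#include <cmath>
#include <cstddef>

// Fold batch-norm parameters into a per-channel scale (the 1x1 DW kernel)
// and shift (the bias).
void fold_batch_norm(size_t channels, const float* gamma, const float* beta,
                     const float* mean, const float* var, float eps,
                     float* scale, float* shift) {
  for (size_t c = 0; c < channels; c++) {
    scale[c] = gamma[c] / std::sqrt(var[c] + eps);
    shift[c] = beta[c] - mean[c] * scale[c];
  }
}
```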

@RuABraun

RuABraun commented Nov 21, 2019

I'm not working with images, so my input/output matrices are only 2D. I assume xnn_create_convolution2d... would crash then (I don't see a 1D version)? I also don't see an option to set the stride to 0 in one direction.

@Maratyszcza

Maratyszcza commented Nov 21, 2019

You can set all height dimensions (input height, kernel height, height subsampling) to 1; this would be equivalent to a 1D convolution.
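
A hedged sketch combining the two suggestions: batch norm over a 2D [rows, channels] input expressed as a 1x1 depthwise convolution with all height dimensions set to 1. The argument order follows the late-2019 xnn_create_convolution2d_nhwc_f32 signature and may differ in newer releases.

```cpp
// Assumes xnn_initialize() has already been called, and that `scale`/`shift`
// were computed by folding the batch-norm parameters (see the earlier sketch).
#include <cstddef>
#include <limits>
#include <xnnpack.h>

xnn_status create_bn_as_dw_conv(size_t channels,
                                const float* scale,  // per-channel 1x1 DW kernel
                                const float* shift,  // per-channel bias
                                xnn_operator_t* conv_op_out) {
  return xnn_create_convolution2d_nhwc_f32(
      /*input_padding_top=*/0, /*input_padding_right=*/0,
      /*input_padding_bottom=*/0, /*input_padding_left=*/0,
      /*kernel_height=*/1, /*kernel_width=*/1,
      /*subsampling_height=*/1, /*subsampling_width=*/1,
      /*dilation_height=*/1, /*dilation_width=*/1,
      /*groups=*/channels,  // depthwise: one group per channel
      /*group_input_channels=*/1, /*group_output_channels=*/1,
      /*input_pixel_stride=*/channels, /*output_pixel_stride=*/channels,
      scale, shift,
      /*output_min=*/-std::numeric_limits<float>::infinity(),
      /*output_max=*/+std::numeric_limits<float>::infinity(),
      /*flags=*/0, conv_op_out);
}
```

At setup time the 2D case then maps to input_height = 1, with the row count carried by the width (or batch) argument, e.g. xnn_setup_convolution2d_nhwc_f32(op, batch_size, 1, num_rows, input, output, threadpool) in the late-2019 API.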

@RuABraun

Ah thanks, forgot to check the setup function.
