Two questions about the library #148
Comments
Okay, thank you!
@Maratyszcza one more question: is the kernel shaped …? edit: seems to be …
For the FullyConnected operator, it is …
Works great! Two more questions: any plans on adding an elementwise product operator? And is it somehow possible to make the FullyConnected operator add to the output instead of setting it?
Elementwise product (including broadcasting support) landed just yesterday, see …
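For concreteness, here is a minimal sketch of using the multiply operator with broadcasting, e.g. scaling a [batch, channels] matrix by a per-channel vector. The create/setup signatures follow the operator API from around the time of this thread and may differ in newer XNNPACK versions:

```c
#include <math.h>    /* INFINITY */
#include <stdio.h>
#include <xnnpack.h>

int main(void) {
  if (xnn_initialize(/*allocator=*/NULL) != xnn_status_success) return 1;

  enum { BATCH = 4, CHANNELS = 8 };
  float a[BATCH * CHANNELS], b[CHANNELS], y[BATCH * CHANNELS];
  for (int i = 0; i < BATCH * CHANNELS; i++) a[i] = (float) i;
  for (int c = 0; c < CHANNELS; c++) b[c] = 0.5f;

  /* No output clamping: pass infinite bounds. */
  xnn_operator_t mul_op = NULL;
  if (xnn_create_multiply_nd_f32(-INFINITY, INFINITY, /*flags=*/0, &mul_op)
      != xnn_status_success) return 1;

  const size_t a_shape[2] = { BATCH, CHANNELS };
  const size_t b_shape[1] = { CHANNELS };  /* broadcast across the batch dim */
  if (xnn_setup_multiply_nd_f32(mul_op, 2, a_shape, 1, b_shape,
                                a, b, y, /*threadpool=*/NULL)
      != xnn_status_success) return 1;

  if (xnn_run_operator(mul_op, /*threadpool=*/NULL) != xnn_status_success) return 1;
  xnn_delete_operator(mul_op);

  printf("y[CHANNELS] = %f (expect %f)\n", y[CHANNELS], 0.5f * CHANNELS);
  return 0;
}
```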
Haha, perfect timing! Thanks
@Maratyszcza it seems broadcasting support is not available for the add operation (for when one wants to implement batch normalization, for example)? Is my impression correct? Could disabling the check and setting the bias (mean) stride to 0 work as a quick hack? edit: ha, looks like it does!
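A note on why that stride-0 hack works: stepping along a dimension whose stride is 0 re-reads the same data, which is exactly broadcasting. A plain-C sketch of the effect (adding a [cols] vector to every row of a matrix):

```c
#include <stddef.h>

void add_broadcast_rows(size_t rows, size_t cols,
                        const float* a,  /* [rows, cols] */
                        const float* b,  /* [cols], treated as row stride 0 */
                        float* y) {
  const size_t b_stride = 0;  /* a full second matrix would use cols here */
  for (size_t i = 0; i < rows; i++) {
    for (size_t j = 0; j < cols; j++) {
      /* b[i * 0 + j] == b[j]: the same row of b is reused for every i. */
      y[i * cols + j] = a[i * cols + j] + b[i * b_stride + j];
    }
  }
}
```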
For batch normalization, I'd recommend converting it into a 1x1 depthwise convolution. XNNPACK has special optimizations for 1x1 DW convolution.
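To make that conversion concrete: at inference time, batch norm y = gamma * (x - mean) / sqrt(var + eps) + beta is a per-channel affine transform, so its parameters fold directly into the kernel and bias of a 1x1 depthwise convolution. A hedged sketch (the helper name is made up, not XNNPACK API; the resulting kernel and bias would then be passed to the depthwise convolution operator, i.e. groups = channels with a 1x1 kernel):

```c
#include <math.h>
#include <stddef.h>

/* Fold batch-norm parameters into a per-channel scale and shift:
 *   y = gamma * (x - mean) / sqrt(var + eps) + beta
 *     = kernel[c] * x + bias[c]
 */
void fold_bn_into_dw_conv1x1(size_t channels, float eps,
                             const float* gamma, const float* beta,
                             const float* mean, const float* var,
                             float* kernel /* [channels] */,
                             float* bias   /* [channels] */) {
  for (size_t c = 0; c < channels; c++) {
    const float scale = gamma[c] / sqrtf(var[c] + eps);
    kernel[c] = scale;
    bias[c]   = beta[c] - scale * mean[c];
  }
}
```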
I'm not working with images, so my input/output matrices are only 2D, I assume …
You can set all height dimensions (input height, kernel height, height subsampling) to 1; this is equivalent to a 1D convolution.
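Concretely, with the height dimensions set to 1 the NHWC layout coincides with an ordinary row-major [batch, length, channels] buffer, so the data can be fed to the 2D convolution unchanged. A small illustrative sketch (the helper name is made up, not an XNNPACK function):

```c
#include <stddef.h>

/* With H == 1, the NHWC index ((n * H + h) * W + w) * C + c collapses
 * to (n * W + w) * C + c, i.e. plain row-major [batch, length, channels]. */
size_t nhwc_index_h1(size_t n, size_t w, size_t c,
                     size_t width, size_t channels) {
  const size_t h = 0, height = 1;  /* degenerate height dimension */
  return ((n * height + h) * width + w) * channels + c;
}
```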
Ah thanks, forgot to check the setup function. |
It seems the F32 GEMM implementation quantizes the input and output? I got that from the usage here, where one has to pass min/max values for the output. I'm worried that the approximation will degrade accuracy significantly (I'm already quantizing all the layers I can to int8). I still have to test that, but just to confirm: is there no SGEMM implementation that doesn't quantize the input and output?
The readme says …
however, in the code there seems to be an implementation of GEMM with int8 weights etc.? I'm using QNNPACK for that at the moment; would it make sense to switch to XNNPACK for the int8 layers?
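On the first question: in XNNPACK's f32 operators, the output_min/output_max arguments are clamping bounds used to fuse activations (e.g. 0 and +inf for ReLU), not quantization parameters; passing -INFINITY and +INFINITY disables clamping entirely. A sketch of the per-element effect:

```c
#include <math.h>

/* Per-element effect of output_min/output_max on f32 operators:
 * a clamp, not quantization. Infinite bounds make it a no-op. */
static inline float clamp_output(float x, float output_min, float output_max) {
  return fminf(fmaxf(x, output_min), output_max);
}
```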