[8.0] Add NEON support for uint8/int8 [MOD-9081] #652
Merged
* Change subnet id * Now subnet * Change subnet * add subnet * Try group id * Change to vpc id * change subnet * Change ami * Try without subnet * add security group again * Change the subnets * Change to ids * Change sg * psubnet * Try different * different * to a file * print * p * leave empty * empty * Try different account * Run 2 arm machines * Move both to us-west-2 * Try workflow * Change name * Changes * Change the secrets * Add supprted arch * Add defaults * Support all * Change the jq * Change machine to t4g * Change the name * Change the machine * fix the stop * only benchamrk * add the secrets * region secret * benchmark region * Change timeout * Added support for arch name in benchamrks * change th json * changed to v9.0 * Change the check * add v9 * Check alt version of armv9 * added check * add arc_arch * changed to CONCAT_WITH_UNDERSCORE_ARCH * change the check * Add full check * fix the instruct * Added the cmake * fix the support * put it back to cmake * back * change the condition * No armpl for now * cland format * remove the opt * Changed to one machine * Added BENCHMARK_ARCH * fix endif * Remove secrets call * pr changes * Changes * change to compile * add sve * add #endif * add armpl * add to cmake * remove armpl * add install * Add ARCH=$(uname -m) * change the path to armpl * suuport check for armv7 * change the armpl * Change or OR * add neon supported for spaces * add sve * add support * align * format * change error * change * Removed the ifdef * Add comments * clang * Change names * format * Try fp32 neon simd * add l2 * add cmake * add SVE * fix sve l2 * PR changes * Change to 1 * fix the l2 * fix format * add desciriopn for chunk == 1 * Change functions * Add include * Change the cast * add resudual * formatting * Move th consexpt * remove template armpl * Back to armpl * back to armpl_neon * include * armnpl * add choose * fix the residual div * raise the residuals values * back to char * Remove prefetch * Revert implemetion chooser 
* Remove armpl * Revert remove error * Remove comment * Remove empty line * format * Add support macos * add sudo * Add absolute path * find all libs * Change folder * Now set for real * Remove armpl from pull * change the templates * change chunk size to 1 * Back to 4 * Removed the for * Change to 2 sums * SVE L2 * Changed * Add get opt func * Change the var name * format * Pr fixes * add int8neon * l2 * one * add baseline * support overflow * add remining chunks * remaining_chunks * final_residual * PR * SVE IP , SVE2 IP & L2 * UINT8 support, remove int8_ip_sve * format * int8 * pr * add uint8 * pr fix * 4 sum * bm_spaces * 4 loads * add uint * changes * PR * changes * remove the mix max * add 2 sum * added conversion * small dim for intel only * add * Test smallDimChooser only for intel * align offset * align const expression * align cpu features function * format * test spaces * Add dotprod * add the aarch file * Changes for pr * Changes * Chnages * change to svadd_f32_x where possible * change to _x where possible * move low dim check to intel only * format * fix IP * pr fix * pr fix * format * Optimize, convert on final step * format * chunking * change to inline * format * fix * format * PR changes * format * PR * format * similir to i8 * guy's comments * fix unit_test * format * reinterpet comment * change to vmlal_s16 * format * using dot * fix uint8 * SVE2 -> SVE * for mat * fix comments * format :( * illegal * add l2 dotpros * REmove the test * format * format * pr changes * Changes * change the residual * format * remove extra * extra * change to uint * formant --------- Co-authored-by: Omer <lerman25@gmail.com> (cherry picked from commit 91a7bbb)
dor-forer approved these changes on Apr 10, 2025
Codecov Report
All modified and coverable lines are covered by tests ✅

Additional details and impacted files
@@           Coverage Diff           @@
##              8.0     #652   +/-   ##
=======================================
  Coverage   96.99%   96.99%
=======================================
  Files         107      107
  Lines        5716     5716
=======================================
  Hits         5544     5544
  Misses        172      172
Description
Backport of #625 to 8.0.
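
For readers unfamiliar with what a NEON int8 kernel looks like, here is a rough sketch of an int8 inner product. It is not the code from this PR or from #625: the function name `int8_inner_product`, the 16-element chunk size, and the choice of `vmull_s8` with `vpadalq_s16` are all assumptions made for illustration (the commit log also mentions dot-product and `vmlal_s16` variants, which are not shown). A scalar loop handles the residual elements and serves as the fallback when NEON is unavailable.

```cpp
#include <cstddef>
#include <cstdint>
#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

// Hypothetical helper, not taken from the PR: sum of a[i] * b[i] over dim elements.
int32_t int8_inner_product(const int8_t *a, const int8_t *b, std::size_t dim) {
    int32_t sum = 0;
    std::size_t i = 0;
#if defined(__ARM_NEON)
    int32x4_t acc = vdupq_n_s32(0);
    // Process 16 int8 elements per iteration.
    for (; i + 16 <= dim; i += 16) {
        int8x16_t va = vld1q_s8(a + i);
        int8x16_t vb = vld1q_s8(b + i);
        // Widening multiply: int8 x int8 -> int16 (|product| <= 16384, so no overflow).
        int16x8_t lo = vmull_s8(vget_low_s8(va), vget_low_s8(vb));
        int16x8_t hi = vmull_s8(vget_high_s8(va), vget_high_s8(vb));
        // Pairwise add adjacent int16 products and accumulate into int32 lanes.
        acc = vpadalq_s16(acc, lo);
        acc = vpadalq_s16(acc, hi);
    }
    // Horizontal sum of the four int32 lanes (written to also compile on ARMv7).
    int32x2_t half = vadd_s32(vget_low_s32(acc), vget_high_s32(acc));
    sum = vget_lane_s32(vpadd_s32(half, half), 0);
#endif
    // Scalar residual loop; also the full computation when NEON is not available.
    for (; i < dim; ++i) {
        sum += static_cast<int32_t>(a[i]) * static_cast<int32_t>(b[i]);
    }
    return sum;
}
```

The int32 accumulator is an assumption about acceptable dimensions: each lane gains at most 4 * 16384 per iteration, so it stays in range for any realistic vector length, but a production kernel would document that bound explicitly.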