[8.0] Add NEON support for uint8/int8 [MOD-9081] #652

Merged (1 commit) on Apr 10, 2025

Conversation

github-actions[bot]

Description

Backport of #625 to 8.0.

* Change subnet id

* Now subnet

* Change subnet

* add subnet

* Try group id

* Change to vpc id

* change subnet

* Change ami

* Try without subnet

* add security group again

* Change the subnets

* Change to ids

* Change sg

* psubnet

* Try different

* different

* to a file

* print

* p

* leave empty

* empty

* Try different account

* Run 2 arm machines

* Move both to us-west-2

* Try workflow

* Change name

* Changes

* Change the secrets

* Add supprted arch

* Add defaults

* Support all

* Change the jq

* Change machine to t4g

* Change the name

* Change the machine

* fix the stop

* only benchamrk

* add the secrets

* region secret

* benchmark region

* Change timeout

* Added support for arch name in benchamrks

* change th json

* changed to v9.0

* Change the check

* add v9

* Check alt version of armv9

* added check

* add arc_arch

* changed to CONCAT_WITH_UNDERSCORE_ARCH

* change the check

* Add full check

* fix the instruct

* Added the cmake

* fix the support

* put it back to cmake

* back

* change the condition

* No armpl for now

* cland format

* remove the opt

* Changed to one machine

* Added BENCHMARK_ARCH

* fix endif

* Remove secrets call

* pr changes

* Changes

* change to compile

* add sve

* add #endif

* add armpl

* add to cmake

* remove armpl

* add install

* Add ARCH=$(uname -m)

* change the path to armpl

* suuport check for armv7

* change the armpl

* Change or OR

* add neon supported for spaces

* add sve

* add support

* align

* format

* change error

* change

* Removed the ifdef

* Add comments

* clang

* Change names

* format

* Try fp32 neon simd

* add l2

* add cmake

* add SVE

* fix sve l2

* PR changes

* Change to 1

* fix the l2

* fix format

* add desciriopn for chunk == 1

* Change functions

* Add include

* Change the cast

* add resudual

* formatting

* Move th consexpt

* remove template armpl

* Back to armpl

* back to armpl_neon

* include

* armnpl

* add choose

* fix the residual div

* raise the residuals values

* back to char

* Remove prefetch

* Revert implemetion chooser

* Remove armpl

* Revert remove error

* Remove comment

* Remove empty line

* format

* Add support macos

* add sudo

* Add absolute path

* find all libs

* Change folder

* Now set for real

* Remove armpl from pull

* change the templates

* change chunk size to 1

* Back to 4

* Removed the for

* Change to 2 sums

* SVE L2

* Changed

* Add get opt func

* Change the var name

* format

* Pr fixes

* add int8neon

* l2

* one

* add baseline

* support overflow

* add remining chunks

* remaining_chunks

* final_residual

* PR

* SVE IP , SVE2 IP & L2

* UINT8 support, remove int8_ip_sve

* format

* int8

* pr

* add uint8

* pr fix

* 4 sum

* bm_spaces

* 4 loads

* add uint

* changes

* PR

* changes

* remove the mix max

* add 2 sum

* added conversion

* small dim for intel only

* add

* Test smallDimChooser only for intel

* align offset

* align const expression

* align cpu features function

* format

* test spaces

* Add dotprod

* add the aarch file

* Changes for pr

* Changes

* Chnages

* change to svadd_f32_x where possible

* change to _x where possible

* move low dim check to intel only

* format

* fix IP

* pr fix

* pr fix

* format

* Optimize, convert on final step

* format

* chunking

* change to inline

* format

* fix

* format

* PR changes

* format

* PR

* format

* similir to i8

* guy's comments

* fix unit_test

* format

* reinterpet comment

* change to vmlal_s16

* format

* using dot

* fix uint8

* SVE2 -> SVE

* for mat

* fix comments

* format :(

* illegal

* add l2 dotpros

* REmove the test

* format

* format

* pr changes

* Changes

* change the residual

* format

* remove extra

* extra

* change to uint

* formant

---------

Co-authored-by: Omer <lerman25@gmail.com>
(cherry picked from commit 91a7bbb)
@GuyAv46 GuyAv46 marked this pull request as draft April 10, 2025 08:11
@GuyAv46 GuyAv46 marked this pull request as ready for review April 10, 2025 08:11
@GuyAv46 GuyAv46 requested a review from dor-forer April 10, 2025 08:11
@dor-forer dor-forer enabled auto-merge April 10, 2025 08:20
@dor-forer dor-forer added this pull request to the merge queue Apr 10, 2025

codecov bot commented Apr 10, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 96.99%. Comparing base (e8b3779) to head (fdd1ce1).
Report is 1 commit behind head on 8.0.

Additional details and impacted files
@@           Coverage Diff           @@
##              8.0     #652   +/-   ##
=======================================
  Coverage   96.99%   96.99%           
=======================================
  Files         107      107           
  Lines        5716     5716           
=======================================
  Hits         5544     5544           
  Misses        172      172           

Merged via the queue into 8.0 with commit c0c249e Apr 10, 2025
14 checks passed
@dor-forer dor-forer deleted the backport-625-to-8.0 branch April 10, 2025 10:07