neon mlal qs8 rsum use addw instead of mlal #6438

copybara-service · 2024-05-18T01:45:26Z

neon mlal qs8 rsum use addw instead of mlal

- Remove `vone`, the vector of ones that the input was multiplied by so that `mlal` would widen and accumulate it; `addw` widens and accumulates the input directly.
- Use sliced accumulators for 16-bit accumulation.
- Rename functions from `neon_mlal` to `neon_addw`.
- Add `const` to the mask variable in the remainder handler of the addw and neondot microkernels.

PiperOrigin-RevId: 635029345
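For readers unfamiliar with the two instructions, here is a rough sketch of the idea rather than the microkernel from this change. `rsum_s8_sketch` and `BLOCK_STEPS` are hypothetical names, and the blocked 16-bit accumulation with a periodic flush to 32 bits is an assumption about the general approach, not the accumulator layout used in the PR. On targets without NEON the scalar tail loop handles the whole input.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#endif

// Hypothetical block length: 128 widening adds of values in [-128, 127]
// keep every 16-bit lane within [-16384, 16256], so it cannot overflow.
enum { BLOCK_STEPS = 128 };

// Hypothetical helper (not an XNNPACK function): sums n signed 8-bit values.
int32_t rsum_s8_sketch(const int8_t* input, size_t n) {
  assert(input != NULL || n == 0);
  int32_t total = 0;
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
  int32x4_t vacc32 = vdupq_n_s32(0);
  while (n >= 8) {
    size_t steps = n / 8;
    if (steps > BLOCK_STEPS) steps = BLOCK_STEPS;
    n -= steps * 8;
    int16x8_t vacc16 = vdupq_n_s16(0);
    for (; steps != 0; steps--) {
      const int8x8_t vin = vld1_s8(input);
      input += 8;
      // mlal form described in the PR: vmlal_s8(vacc16, vin, vone),
      // with vone = vdup_n_s8(1). addw gives the same sum with no multiply.
      vacc16 = vaddw_s8(vacc16, vin);
    }
    // Flush the 16-bit lanes into the 32-bit accumulator (pairwise add).
    vacc32 = vpadalq_s16(vacc32, vacc16);
  }
  // Horizontal reduction of the four 32-bit lanes.
  total = vgetq_lane_s32(vacc32, 0) + vgetq_lane_s32(vacc32, 1) +
          vgetq_lane_s32(vacc32, 2) + vgetq_lane_s32(vacc32, 3);
#endif
  // Scalar tail; also the full path when NEON is unavailable.
  for (; n != 0; n--) {
    total += (int32_t) *input++;
  }
  return total;
}
```

Multiplying by a vector of ones and accumulating gives the same value as a widening add, so dropping `vone` should not change results; it only removes a register and a multiply.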

copybara-service bot force-pushed the test_634595235 branch 3 times, most recently from 385e9f9 to 26b51c7 Compare May 18, 2024 11:05

copybara-service bot force-pushed the test_634595235 branch from 26b51c7 to fcb3669 Compare May 18, 2024 11:51

copybara-service bot merged commit fcb3669 into master May 18, 2024

copybara-service bot deleted the test_634595235 branch May 18, 2024 11:51
