StdRS Unnecessary Sample Division by 10 #172

Open
aeoranday opened this issue Apr 16, 2024 · 0 comments
Labels
bug Something isn't working


aeoranday commented Apr 16, 2024

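In the standard running sum (StdRS) AVX2 implementation, the update divides the sum of the R-factor term and the incoming sample by 10. The sample data is therefore also divided by 10, and that scaling is never undone.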

__m256i first_part = _mm256_mullo_epi16(RS, R_factor);
//__m256i first_part_div = _mm256_div_epi16(RS, 10);
//__m256i second_part = s;
//__m256i second_part_div = _mm256_div_epi16(_mm256_abs_epi16(s), 10);
//RS = _mm256_div_epi16(_mm256_add_epi16(first_part, second_part), 10);
RS = swtpg_wibeth::_mm256_div_epi16(_mm256_add_epi16(first_part, s), 10);

Because every sample enters the running sum at a tenth of its value, the sum itself is smaller than expected, and an unexpectedly low threshold is required.

The division is needed only for the running sum R factor, so this can be resolved by dividing only that term by 10 and adding the sample undivided.
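To make the difference concrete, here is a minimal scalar sketch of one lane of the update, comparing the current behaviour with the proposed one. The function names are hypothetical, and the model makes two assumptions: that the helper's division truncates toward zero, and that the 16-bit wraparound of `_mm256_mullo_epi16` can be ignored. It is an illustration of the arithmetic, not the actual patch.

```cpp
#include <cstdint>

// Scalar model of one 16-bit lane (hypothetical helper names).
// Assumes truncating integer division and ignores 16-bit overflow.

// Current behaviour: the sample is added before the division,
// so it is scaled by 1/10 together with the R-factor term.
inline int16_t rs_update_current(int16_t rs, int16_t s, int16_t r_factor) {
  return static_cast<int16_t>((rs * r_factor + s) / 10);
}

// Proposed behaviour: only the R-factor term is divided by 10,
// and the sample is added at full scale.
inline int16_t rs_update_proposed(int16_t rs, int16_t s, int16_t r_factor) {
  return static_cast<int16_t>((rs * r_factor) / 10 + s);
}

// Untested guess at the equivalent AVX2 line, reusing the helper
// exactly as it is called in the snippet above:
//   RS = _mm256_add_epi16(swtpg_wibeth::_mm256_div_epi16(first_part, 10), s);
```

For example, with an illustrative `r_factor` of 8, a running sum of 0 and a sample of 100, the current update gives 10 while the proposed update gives 100.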
