Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Water - HOST Tensor AVX2 Support and Vectorized HIP support #126

Merged
merged 26 commits into from
Jul 18, 2023

Conversation

sampath1117
Copy link

*added for U8 , F32 , I8 , F16 variants
*added for PKD3,PLN3,PLN1 with toggle
*added test case for water in new test suite
*added golden outputs for water

@sampath1117
Copy link
Author

@r-abishek
For water the QA tests are failing currently. I need to debug more on this

Please note

src/include/cpu/rpp_cpu_simd.hpp Show resolved Hide resolved
src/modules/cpu/kernel/water.hpp Outdated Show resolved Hide resolved
{
pSrcY = _mm_fmadd_ps(pWaterParams[1], pCosFactor, pDstY);
pSrcX = _mm_fmadd_ps(pWaterParams[0], pSinFactor, pDstX);
pDstX = _mm_add_ps(pDstX, xmm_p4);
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you sure compute_water_src_loc_sse() and compute_water_src_loc() are doing the same thing? The sse seems to have an extra add_ps.

src/modules/cpu/kernel/water.hpp Outdated Show resolved Hide resolved
@r-abishek
Copy link
Owner

@sampath1117 Is this ready now?

@sampath1117
Copy link
Author

Hi @r-abishek
I have made all the changes and the PR is ready for internal review

p128[1] = _mm256_extractf128_ps(p[1], 0);
p128[2] = _mm256_extractf128_ps(p[2], 0);
_MM_TRANSPOSE4_PS(p128[0], p128[1], p128[2], p128[3]);
p128[0] = _mm256_castps256_ps128(pRow[0]);
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add some inline comments for better readability on vectorized code

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

__m128i p128[2];
const __m128i maskR1 = _mm_setr_epi8(0, 3, 6, 9, 12, 15, 0x80, 0x80, 0x80, 0x80, 0x80, 0x80, 0x80, 0x80, 0x80, 0x80);
const __m128i maskG1 = _mm_setr_epi8(1, 4, 7, 10, 13, 0x80, 0x80, 0x80, 0x80, 0x80, 0x80, 0x80, 0x80, 0x80, 0x80, 0x80);
const __m128i maskB1 = _mm_setr_epi8(2, 5, 8, 11, 14, 0x80, 0x80, 0x80, 0x80, 0x80, 0x80, 0x80, 0x80, 0x80, 0x80, 0x80);
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment here for reader to understand masks - say as to why maskR1 has 6 Rs and maskR2 has 2 Rs etc

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Rpp32f dstX, dstY, sinFactor;
__m256 pDstX, pDstY, pSinFactor;
dstY = (Rpp32f)i;
sinFactor= std::sin((freqX * dstY) + phaseX);
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

space before =

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@r-abishek
Copy link
Owner

Looks okay, but again, do we have CSVs here for passing QA tests with/without tolerance?

@r-abishek r-abishek added the enhancement New feature or request label Jul 14, 2023
@r-abishek r-abishek added this to the sow8ms4 milestone Jul 14, 2023
resolved spacing issues and added comments for AVX codes for better understanding

made changes to handle cases where QA Tests are not supported
@sampath1117
Copy link
Author

Looks okay, but again, do we have CSVs here for passing QA tests with/without tolerance?

@r-abishek
I have regenerated the golden outputs with latest codes and QA tests are passing for both HOST and HIP

@r-abishek r-abishek changed the base branch from master to ar/opt_water July 18, 2023 03:14
Copy link
Owner

@r-abishek r-abishek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@r-abishek r-abishek merged commit 9da6be1 into r-abishek:ar/opt_water Jul 18, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
3 participants