Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RPP Crop and Patch on HOST and HIP #314

Merged
merged 59 commits into from
Apr 12, 2024

Conversation

r-abishek
Copy link
Member

  • Adds Crop and Patch Tensor for U8/F16/F32/I8 optimized on AVX2 and HIP
  • Adds relevant QA/unit/performance tests

HazarathKumarM and others added 30 commits January 10, 2024 00:44
updated test suite to match with previous QA mode configuration for crop and patch
added description for crop and patch function

removed unnecessary spaces in HOST kernel
…tion (ROCm#293)

* change golden outputs from .csv files to .bin files

* Changed comparision funtions to use .bin files

* Address review comments

* minor change

* Address review comments

* minor change

---------

Co-authored-by: HazarathKumarM <hazarathkumar@multicorewareinc.com>
* Match all CMakeLists.txt license as per RPP's outermost LICENSE file

* Match all python files' license as per RPP's outermost LICENSE file

* Match all .hpp files' license as per RPP's outermost LICENSE file

* Match all .cpp files' license as per RPP's outermost LICENSE file

* Match all .h files' license as per RPP's outermost LICENSE file

* Remove all rights reserved as per LICENSE file

* Remove double space in "Copyright (c) 2019 - 2023 Advanced Micro Devices, Inc."

* Match all .cmake files' license as per RPP's outermost LICENSE file

* Match all .cpp.in files' license as per RPP's outermost LICENSE file

* Replace 283 occurrences in 282 files - 2023 to 2024

* Add "MIT License" title to 281 instances

* Add missing license
…inx (ROCm#301)

Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.33.0 to 0.33.1.
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](ROCm/rocm-docs-core@v0.33.0...v0.33.1)

---
updated-dependencies:
- dependency-name: rocm-docs-core[api_reference]
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…inx (ROCm#302)

Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.33.1 to 0.33.2.
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](ROCm/rocm-docs-core@v0.33.1...v0.33.2)

---
updated-dependencies:
- dependency-name: rocm-docs-core[api_reference]
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…4.0 in /docs/sphinx (ROCm#304)

Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.33.2 to 0.34.0.
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](ROCm/rocm-docs-core@v0.33.2...v0.34.0)

---
updated-dependencies:
- dependency-name: rocm-docs-core[api_reference]
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* experimental changes for adding qa mode for performance tests

* made changes to add display more information w.r.t QA results summary for performance tests

* minor changes

* Add changes to dump qa results to excel file

* Add performance QA for three new tensor functions

* update prerequisites in readme

* added changes to handle unsupported cases

* removed treshold dictionary and added performance Noise treshold
add new dataset for performance QA

* RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (ROCm#293)

* change golden outputs from .csv files to .bin files

* Changed comparision funtions to use .bin files

* Address review comments

* minor change

* Address review comments

* minor change

---------

Co-authored-by: HazarathKumarM <hazarathkumar@multicorewareinc.com>

* Changes to the performane summary dataframe

* minor changes

* Update CMakeLists.txt to add ${CMAKE_CURRENT_SOURCE_DIR} for CI

* Update CMakeLists.txt fix

* Update CMakeLists.txt fix

* remove tabulate dependency

* Update README.md to remove tabulate pip install

* Fix for CI machine failure

* Add note on performance

---------

Co-authored-by: sampath1117 <sampath.rachumallu@multicorewareinc.com>
Co-authored-by: HazarathKumarM <hazarathkumar@multicorewareinc.com>
Co-authored-by: Abishek <52214183+r-abishek@users.noreply.github.com>
Co-authored-by: Snehaa Giridharan <snehaa@multicorewareinc.com>
Co-authored-by: r-abishek <abishek@multicorewareinc.com>
* Initial commit - Color Temperature HOST Tensor

* Initial commit - Color Temperature HIP Tensor

* Add color temperature golden outputs

* address review comments

* Use reinterpret_cast instead of static_cast

* Combine templated functions to support all datatypes into one
(got minor perf difference of order 3%)

Also fixes indentation

* Fix i8 datatype

* Cleanup

* RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (ROCm#293)

* change golden outputs from .csv files to .bin files

* Changed comparision funtions to use .bin files

* Address review comments

* minor change

* Address review comments

* minor change

---------

Co-authored-by: HazarathKumarM <hazarathkumar@multicorewareinc.com>

* Fix PLN3 variant outputs

Also modifies reference outputs

* Update color_temperature.hpp license

* Delete color_temperature_u8_Tensor_PKD3.csv

* Delete color_temperature_u8_Tensor_PLN3.csv

---------

Co-authored-by: snehaa8 <snehaa@multicorewareinc.com>
Co-authored-by: HazarathKumarM <hazarathkumar@multicorewareinc.com>
Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com>
* added HOST support for voxel add kernel

* added HIP support for voxel add kernel

* added test suite support for add scalar

* added Doxygen support and modified hip kernel function names as per new standard

* added HOST support for voxel subtract kernel

* added HIP support for voxel subtract kernel

* added test suite support

* updated the golden outputs for subtract with correct values

* removed unnessary validation checks

* Remove double spaces

* Fix header

* Fix all retval docs

* Fix docs to add memory type

* Fix comment

* Add divider comment

* Use post-increment efficiently

* RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (ROCm#293)

* change golden outputs from .csv files to .bin files

* Changed comparision funtions to use .bin files

* Address review comments

* minor change

* Address review comments

* minor change

---------

Co-authored-by: HazarathKumarM <hazarathkumar@multicorewareinc.com>

* converted add and subtract scalar golden outputs to bin files

* changed copyright from 2023 to 2024

* Update add_scalar.hpp license

* Update subtract_scalar.hpp license

---------

Co-authored-by: sampath1117 <sampath.rachumallu@multicorewareinc.com>
Co-authored-by: HazarathKumarM <hazarathkumar@multicorewareinc.com>
@kiritigowda kiritigowda self-assigned this Feb 27, 2024
@kiritigowda kiritigowda added the enhancement New feature or request label Feb 27, 2024
@kiritigowda kiritigowda changed the base branch from master to develop March 5, 2024 18:59
@kiritigowda kiritigowda requested a review from a team as a code owner March 5, 2024 19:00
@LakshmiKumar23
Copy link
Contributor

@r-abishek Tried to run the python runTests.py with case 33. I do not see any output. Same issue with HIP and HOST


svcbuild@Legolas:~/work/rpp/utilities/test_suite/HIP$ python3 runTests.py --case_list 33
-- The C compiler identification is GNU 11.4.0
-- The CXX compiler identification is Clang 14.0.0
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /usr/bin/cc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/clang++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Performing Test HIP_CLANG_SUPPORTS_PARALLEL_JOBS
-- Performing Test HIP_CLANG_SUPPORTS_PARALLEL_JOBS - Failed
-- Using Turbo JPEG --
        Libraries:/lib/libturbojpeg.so
        Includes:/usr/include
-- Found OpenMP_C: -fopenmp (found version "4.5")
-- Found OpenMP_CXX: -fopenmp=libomp (found version "5.0")
-- Found OpenMP: TRUE (found version "4.5")
-- test_suite/HIP set to build with rpp and TurboJpeg
-- test_suite/HIP set to build with rpp, hip and OpenCV
-- Warning: libniftiio must be installed to install test_suite/HIP/Tensor_voxel_hip successfully!
-- Configuring done
-- Generating done
-- Build files have been written to: /home/svcbuild/work/rpp/utilities/test_suite/HIP/build
[ 50%] Building CXX object CMakeFiles/Tensor_hip.dir/Tensor_hip.cpp.o
[100%] Linking CXX executable Tensor_hip
[100%] Built target Tensor_hip






##########################################################################################
Running all layout Inputs...
##########################################################################################

@r-abishek
Copy link
Member Author

@r-abishek Tried to run the python runTests.py with case 33. I do not see any output. Same issue with HIP and HOST


svcbuild@Legolas:~/work/rpp/utilities/test_suite/HIP$ python3 runTests.py --case_list 33
-- The C compiler identification is GNU 11.4.0
-- The CXX compiler identification is Clang 14.0.0
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /usr/bin/cc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/clang++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Performing Test HIP_CLANG_SUPPORTS_PARALLEL_JOBS
-- Performing Test HIP_CLANG_SUPPORTS_PARALLEL_JOBS - Failed
-- Using Turbo JPEG --
        Libraries:/lib/libturbojpeg.so
        Includes:/usr/include
-- Found OpenMP_C: -fopenmp (found version "4.5")
-- Found OpenMP_CXX: -fopenmp=libomp (found version "5.0")
-- Found OpenMP: TRUE (found version "4.5")
-- test_suite/HIP set to build with rpp and TurboJpeg
-- test_suite/HIP set to build with rpp, hip and OpenCV
-- Warning: libniftiio must be installed to install test_suite/HIP/Tensor_voxel_hip successfully!
-- Configuring done
-- Generating done
-- Build files have been written to: /home/svcbuild/work/rpp/utilities/test_suite/HIP/build
[ 50%] Building CXX object CMakeFiles/Tensor_hip.dir/Tensor_hip.cpp.o
[100%] Linking CXX executable Tensor_hip
[100%] Built target Tensor_hip






##########################################################################################
Running all layout Inputs...
##########################################################################################

@LakshmiKumar23 This very much seems like a hang / machine dependent issue since we are able to run it. Is it replicable?

This command should work:
python3 runTests.py --case_list 33 --batch_size 3

@rrawther
Copy link
Contributor

@LakshmiKumar23 Can you please review this PR?

@rrawther
Copy link
Contributor

@r-abishek Can you please resolve conflicts in this PR

Copy link
Contributor

@LakshmiKumar23 LakshmiKumar23 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kiritigowda @r-abishek Test case 33 passes with both HIP and HOST. Ready to be merged

@kiritigowda kiritigowda merged commit ca1eb5e into ROCm:develop Apr 12, 2024
10 checks passed
kiritigowda added a commit that referenced this pull request Apr 12, 2024
* Add crop and patch HOST Tensor implementation

* Fix output mismatches

* Add golden outputs

* temporary commit

* fix patch region updation issue

* Crop and patch changes

* fixed the edge case for PLN variant

* added support for remianing layout variants for crop and patch HIP

* resolved bug in HOST kernel and updated golden outputs

updated test suite to match with previous QA mode configuration for crop and patch

* fixed the incorrect golden output for PKD3 variant

* fixed bug in non vector loop for HOST toggle variant

* added more comments in HIP kernel for better readability

added description for crop and patch function

removed unnecessary spaces in HOST kernel

* removed redundant files

* modified the kernel to use 1 pixel load and store due to performance issues with vectorized version

* RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293)

* change golden outputs from .csv files to .bin files

* Changed comparision funtions to use .bin files

* Address review comments

* minor change

* Address review comments

* minor change

---------

Co-authored-by: HazarathKumarM <hazarathkumar@multicorewareinc.com>

* converted crop and patch golden outputs to bin files

* changed copyright from 2023 to 2024

* License - updates to 2024 and consistency changes (#298)

* Match all CMakeLists.txt license as per RPP's outermost LICENSE file

* Match all python files' license as per RPP's outermost LICENSE file

* Match all .hpp files' license as per RPP's outermost LICENSE file

* Match all .cpp files' license as per RPP's outermost LICENSE file

* Match all .h files' license as per RPP's outermost LICENSE file

* Remove all rights reserved as per LICENSE file

* Remove double space in "Copyright (c) 2019 - 2023 Advanced Micro Devices, Inc."

* Match all .cmake files' license as per RPP's outermost LICENSE file

* Match all .cpp.in files' license as per RPP's outermost LICENSE file

* Replace 283 occurrences in 282 files - 2023 to 2024

* Add "MIT License" title to 281 instances

* Add missing license

* Test - Update README.md for test_suite (#299)

* Bump rocm-docs-core[api_reference] from 0.33.0 to 0.33.1 in /docs/sphinx (#301)

Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.33.0 to 0.33.1.
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](ROCm/rocm-docs-core@v0.33.0...v0.33.1)

---
updated-dependencies:
- dependency-name: rocm-docs-core[api_reference]
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump rocm-docs-core[api_reference] from 0.33.1 to 0.33.2 in /docs/sphinx (#302)

Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.33.1 to 0.33.2.
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](ROCm/rocm-docs-core@v0.33.1...v0.33.2)

---
updated-dependencies:
- dependency-name: rocm-docs-core[api_reference]
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Update doc codeowners (#303)

* Documentation - Bump rocm-docs-core[api_reference] from 0.33.2 to 0.34.0 in /docs/sphinx (#304)

Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.33.2 to 0.34.0.
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](ROCm/rocm-docs-core@v0.33.2...v0.34.0)

---
updated-dependencies:
- dependency-name: rocm-docs-core[api_reference]
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Test suite - upgrade 5 qa perf (#305)

* experimental changes for adding qa mode for performance tests

* made changes to add display more information w.r.t QA results summary for performance tests

* minor changes

* Add changes to dump qa results to excel file

* Add performance QA for three new tensor functions

* update prerequisites in readme

* added changes to handle unsupported cases

* removed treshold dictionary and added performance Noise treshold
add new dataset for performance QA

* RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293)

* change golden outputs from .csv files to .bin files

* Changed comparision funtions to use .bin files

* Address review comments

* minor change

* Address review comments

* minor change

---------

Co-authored-by: HazarathKumarM <hazarathkumar@multicorewareinc.com>

* Changes to the performane summary dataframe

* minor changes

* Update CMakeLists.txt to add ${CMAKE_CURRENT_SOURCE_DIR} for CI

* Update CMakeLists.txt fix

* Update CMakeLists.txt fix

* remove tabulate dependency

* Update README.md to remove tabulate pip install

* Fix for CI machine failure

* Add note on performance

---------

Co-authored-by: sampath1117 <sampath.rachumallu@multicorewareinc.com>
Co-authored-by: HazarathKumarM <hazarathkumar@multicorewareinc.com>
Co-authored-by: Abishek <52214183+r-abishek@users.noreply.github.com>
Co-authored-by: Snehaa Giridharan <snehaa@multicorewareinc.com>
Co-authored-by: r-abishek <abishek@multicorewareinc.com>

* RPP Color Temperature on HOST and HIP (#271)

* Initial commit - Color Temperature HOST Tensor

* Initial commit - Color Temperature HIP Tensor

* Add color temperature golden outputs

* address review comments

* Use reinterpret_cast instead of static_cast

* Combine templated functions to support all datatypes into one
(got minor perf difference of order 3%)

Also fixes indentation

* Fix i8 datatype

* Cleanup

* RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293)

* change golden outputs from .csv files to .bin files

* Changed comparision funtions to use .bin files

* Address review comments

* minor change

* Address review comments

* minor change

---------

Co-authored-by: HazarathKumarM <hazarathkumar@multicorewareinc.com>

* Fix PLN3 variant outputs

Also modifies reference outputs

* Update color_temperature.hpp license

* Delete color_temperature_u8_Tensor_PKD3.csv

* Delete color_temperature_u8_Tensor_PLN3.csv

---------

Co-authored-by: snehaa8 <snehaa@multicorewareinc.com>
Co-authored-by: HazarathKumarM <hazarathkumar@multicorewareinc.com>
Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com>

* RPP Voxel 3D Tensor Add/Subtract scalar on HOST and HIP (#272)

* added HOST support for voxel add kernel

* added HIP support for voxel add kernel

* added test suite support for add scalar

* added Doxygen support and modified hip kernel function names as per new standard

* added HOST support for voxel subtract kernel

* added HIP support for voxel subtract kernel

* added test suite support

* updated the golden outputs for subtract with correct values

* removed unnessary validation checks

* Remove double spaces

* Fix header

* Fix all retval docs

* Fix docs to add memory type

* Fix comment

* Add divider comment

* Use post-increment efficiently

* RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293)

* change golden outputs from .csv files to .bin files

* Changed comparision funtions to use .bin files

* Address review comments

* minor change

* Address review comments

* minor change

---------

Co-authored-by: HazarathKumarM <hazarathkumar@multicorewareinc.com>

* converted add and subtract scalar golden outputs to bin files

* changed copyright from 2023 to 2024

* Update add_scalar.hpp license

* Update subtract_scalar.hpp license

---------

Co-authored-by: sampath1117 <sampath.rachumallu@multicorewareinc.com>
Co-authored-by: HazarathKumarM <hazarathkumar@multicorewareinc.com>

* updated copyright for crop and patch

* RPP Magnitude on HOST and HIP (#278)

* Initial commit - Magnitude HOST Tensor

* Add QA reference outputs

* Update runTests.py

* Initial commit - Magnitude HIP Tensor

* Add dual input support in testsuite

* Optimize HOST kernel further

* Optimize i8 datatype further

* Modify comments

* RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293)

* change golden outputs from .csv files to .bin files

* Changed comparision funtions to use .bin files

* Address review comments

* minor change

* Address review comments

* minor change

---------

Co-authored-by: HazarathKumarM <hazarathkumar@multicorewareinc.com>

* Bump rocm-docs-core[api_reference] from 0.31.0 to 0.33.0 in /docs/sphinx (#294)

Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.31.0 to 0.33.0.
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](ROCm/rocm-docs-core@v0.31.0...v0.33.0)

---
updated-dependencies:
- dependency-name: rocm-docs-core[api_reference]
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Update Copywright year

* Combine templated functions to support all datatypes

* Modify format of reference outputs

* Update rppi_arithmetic_operations.h license

* Update rppt_tensor_arithmetic_operations.h license

* Update host_tensor_arithmetic_operations.hpp

* Update magnitude.hpp license

* Update hip_tensor_arithmetic_operations.hpp license

* Delete magnitude_u8_Tensor_PKD3.csv

* Delete magnitude_u8_Tensor_PLN1.csv

* Delete magnitude_u8_Tensor_PLN3.csv

* Update rpp_test_suite_common.h license

* Update runTests.py license

* Update Tensor_hip.cpp license

* Update runTests.py license

* Update Tensor_host.cpp license

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: Snehaa Giridharan <snehaa@multicorewareinc.com>
Co-authored-by: HazarathKumarM <hazarathkumar@multicorewareinc.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com>

* added memcpy + vectorized version for HIP non toggle layout variants

* modified toggle variants as per the latest changes

* modified variable names in hip kernels for better readability

categorized kernels into 2 sections

* minor change

* Bump rocm-docs-core[api_reference] from 0.34.0 to 0.34.2 in /docs/sphinx (#309)

Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.34.0 to 0.34.2.
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](ROCm/rocm-docs-core@v0.34.0...v0.34.2)

---
updated-dependencies:
- dependency-name: rocm-docs-core[api_reference]
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* RPP Tensor Audio Support - Down Mixing (#296)

* Initial commit - Non slient region detection

Includes unittest setup

* Initial commit - To Decibels

Includes unittest setup

* Intial commit - pre_emphasis_filter

* Intial commit - down_mixing

* Replace vectors with arrays

* Cleanup

* Minor cleanup

* Optimize downmixing Kernel

Includes cleanup

* Replace Rpp64s with Rpp32s

* Cleanup

* Optimize and precompute cutOff

* Fix buffer used

* Fix buffer used

* Additional Cleanup

* Optimize post incrmeent operation

* Optimize post increment operation

* Update testsuite for Audio

* code cleanup

* Add Readme file for Audio test suite

* changes based on review comments

* minor change

* Remove unittest folders and updated README.md

* Remove unit tests

* minor change

* code cleanup

* added common header file for audio helper functions

* removed unncessary audio wav files

fixed bug in ROI updation for audio test suite

resolved issue in summary generation for performance tests in python

* removed log file

* added doxygen support for audio

* added doxygen changes for to_decibels

* updated test suite support for to_decibels

* minor change

* added doxygen changes for preemphasis filter

* updated changes for preemphasis filter in test suite

* removed the usage of getMax function and used std::max_element

* modularized code in test suite

* merge with latest changes

* minor change

* minor change

* minor change

* resolved codacy warnings

* Codacy fix - Remove unused cpuTime

* CMakeLists - Version Update

1.5.0 - TOT Version

* CHANGELOG Updates

Version 1.5.0 placeholder

* resolved issue with file_system dependency in test suite

* Doxygen changes

changed malloc to new in NSR kernel

* RPP RICAP Tensor for HOST and HIP (#213)

* Initial commit - Ricap HOST Tensor

Includes testsuite changes

* Add QA tests for RICAP

Used three_images_224x224_src1 folder to create golden outputs

* Add three_images_224x224_src1 into TEST_IMAGES

* Support HIP Backend for RICAP

* Fix HIP pkd3->pkd3 variant

* regenerated golden outputs for RICAP

minor changes in HOST shell script for handling RICAP in QA mode

* minor bug fix in RICAP HIP kernels

* Improve readability and Cleanup

* Additional cleanup

* Cleanup testsuite

Includes new golden outputs

* Additional testuite fixes

* Minor cleanup

* Fix codacy warnings

* Address other codacy warnings

* Update ricap.hpp with reference paper

* Add RICAP dataset path in readme

* Make changes to error codes returned

* Modify roi crop region for unit and perf tests

* RPP Tensor Water Augmentation on HOST and HIP (#181)

* added water HOST and HIP codes

* added water case in test suite

* added golden outputs for water

* added omp thread changes for water augmentation

* experimental changes

* fixed output issue with AVX2 instructions

* added AVX2 support for PKD3 load function

minor changes in PLN variant load functions

* nwc commit - added avx2 changes for u8 layout toggle variants but need to add store functions for completion

* Add Avx2 implementation for F32 and U8 toggle variants

* Add AVX2 support for u8 pkd3-pln3 and i8 pkd3-pln3 for water augmentation

* change F32 load and store logic

* optimized the store function for F32 PLN3-PKD3

* reverted back irrelevant changes

* minor change

* optimized load and store functions for water U8 and F32 variants in host

removed commented code

* removed golden outputs for water

* minor changes

* renamed few functions and removed unused functions

updated i8 pln1 load as per the optimized u8 pln1 load

* fixed bug in i8 load function

* changed cast to c++ style

resolved spacing issues and added comments for AVX codes for better understanding

made changes to handle cases where QA Tests are not supported

* added golden outputs for water

* updated golden outputs with latest changes

* modified the u8, i8 pkd3-pln3 function and added comments for the vectorized code

* fixed minor bug in I8 variants

* made to changes to resolve codacy warnings

* changed cast to c++ style in hip kernel

* changed generic nn F32 loads using gather and setr instructions

* added comments for latest changes

* minor change

* added definition for storing 32 and 64 bits from a 128bit register

---------

Co-authored-by: sampath1117 <sampath.rachumallu@multicorewareinc.com>
Co-authored-by: HazarathKumarM <hazarathkumar@multicorewareinc.com>

* Fix build error

* CMakeLists - Version Update

1.5.0 - TOT Version

* CHANGELOG Updates

Version 1.5.0 placeholder

* Boost deps fix for test suite

---------

Co-authored-by: Snehaa Giridharan <snehaa@multicorewareinc.com>
Co-authored-by: sampath1117 <sampath.rachumallu@multicorewareinc.com>
Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com>
Co-authored-by: HazarathKumarM <hazarathkumar@multicorewareinc.com>
Co-authored-by: Kiriti Gowda <kiritigowda@gmail.com>

* Documentation - Readme & changelog updates (#251)

* readme and changelog updates for 6.0

* minor update

* added ctests for audio test suite for CI

made changes to add more clarity on the QA Tests results

* Cmake mods for ctest

* HOST-only build error bugfix

* added qa mode paramter to python audio script

added golden output map for QA testing of Non silent region detection

* minor change

* Documentation - Bump rocm-docs-core[api_reference] from 0.26.0 to 0.27.0 in /docs/sphinx (#253)

Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.26.0 to 0.27.0.
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases)
- [Changelog](https://github.com/RadeonOpenCompute/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](ROCm/rocm-docs-core@v0.26.0...v0.27.0)

---
updated-dependencies:
- dependency-name: rocm-docs-core[api_reference]
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* RPP Resize Mirror Normalize Bugfix (#252)

* added fix for hipMemset

* remove pixel check for U8-F32 and U8-F16 for HOST codes

---------

Co-authored-by: sampath1117 <sampath.rachumallu@multicorewareinc.com>

* added example for MMS calculation in comments for better understanding

* Sphinx - updates (#257)

* Sphinx - updates

* Doxygen - Updates

* Docs - Remove index.md

* updated info used to for running audio test suite

* removed bitdepth variable from audio test suite

* added more information on computing NSR outputs in the example added

* Fix doxygen for decibels

Also removes extra QA reference files

* move tensor_host_audio.cpp to host folder

* Fix build errors and qa tests in Audio Test suite

* Fix build errors and qa tests in Audio Test suite

* Add reference output and test samples for downmix

* Add down_mix in augmentation list and supported cases

* Remove auto-merge repeated funcs

* Improve clarity of header docs

* Remove blank line

* Improve clarity on header docs

* Add Doxygen comments

* minor change

* converted golden outputs to binary file for downmixing

* removed old golden output file for preemphasis and todecibels

* modified info for downmixing as per new changes

used handle memory for temporary buffers

* formatting changes

* moved the common code for SSE and AVX to outside

* Update down_mixing.hpp license

* Update rppt_tensor_audio_augmentations.h

* combined the srcLength and channels tensors into single tensor

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: Snehaa Giridharan <snehaa@multicorewareinc.com>
Co-authored-by: HazarathKumarM <hazarathkumar@multicorewareinc.com>
Co-authored-by: sampath1117 <sampath.rachumallu@multicorewareinc.com>
Co-authored-by: Kiriti Gowda <kiritigowda@gmail.com>
Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com>
Co-authored-by: Lisa <lisajdelaney@gmail.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Sundarrajan98 <sundarrajan@multicorewareinc.com>

* RPP Voxel 3D Tensor Multiply scalar on HOST and HIP (#306)

* added HIP support for voxel scalar multiply kernel

* added HOST support for voxel multiply kernel

added golden outputs for voxel multiply kernel

* merge with master

* RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293)

* change golden outputs from .csv files to .bin files

* Changed comparision funtions to use .bin files

* Address review comments

* minor change

* Address review comments

* minor change

---------

Co-authored-by: HazarathKumarM <hazarathkumar@multicorewareinc.com>

* converted multiply scalar voxel golden outputs to bin files

* changed copyright from 2023 to 2024

---------

Co-authored-by: sampath1117 <sampath.rachumallu@multicorewareinc.com>
Co-authored-by: HazarathKumarM <hazarathkumar@multicorewareinc.com>

* Test Suite Bugfix (#307)

* experimental changes for adding qa mode for performance tests

* made changes to add display more information w.r.t QA results summary for performance tests

* minor changes

* Add changes to dump qa results to excel file

* Add performance QA for three new tensor functions

* update prerequisites in readme

* added changes to handle unsupported cases

* removed treshold dictionary and added performance Noise treshold
add new dataset for performance QA

* RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293)

* change golden outputs from .csv files to .bin files

* Changed comparision funtions to use .bin files

* Address review comments

* minor change

* Address review comments

* minor change

---------

Co-authored-by: HazarathKumarM <hazarathkumar@multicorewareinc.com>

* Changes to the performane summary dataframe

* minor changes

* Update CMakeLists.txt to add ${CMAKE_CURRENT_SOURCE_DIR} for CI

* Update CMakeLists.txt fix

* Update CMakeLists.txt fix

* remove tabulate dependency

* Update README.md to remove tabulate pip install

* Fix for CI machine failure

* Add note on performance

* Fix segmentation fault

* Revert QAmode to restrict HIP bitdepths

* Use Rpp64u for HOST while comparing outputs

* Fix ambiguous abs call

* Fix for SLES CI HIP fail - error: incompatible pointer types assigning to 'unsigned long *' from 'unsigned long long *' - refOutput = TensorSumReferenceOutputs[numChannels].data();

---------

Co-authored-by: sampath1117 <sampath.rachumallu@multicorewareinc.com>
Co-authored-by: HazarathKumarM <hazarathkumar@multicorewareinc.com>
Co-authored-by: Snehaa Giridharan <snehaa@multicorewareinc.com>
Co-authored-by: Pavel Tcherniaev <Pavel.Tcherniaev@amd.com>

* code cleanup

* added device helpers for making the code modularized

added ifdefs for AVX2 part in HOST codes

* moved layout toggle kernels to hip commons

* revert incorrect changes happened with merge

* fix build issue in HOST test suite

* added missing clock timer for water in HOST test suite

* remove duplicate line in readme

* fixed the issue with roi updation for hip and host test suite

* added doxygen output for crop and patch

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: HazarathKumarM <hazarathkumar@multicorewareinc.com>
Co-authored-by: sampath1117 <sampath.rachumallu@multicorewareinc.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Sam Wu <sam.wu2@amd.com>
Co-authored-by: Kiriti Gowda <kiritigowda@gmail.com>
Co-authored-by: Snehaa Giridharan <snehaa@multicorewareinc.com>
Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com>
Co-authored-by: Lisa <lisajdelaney@gmail.com>
Co-authored-by: Sundarrajan98 <sundarrajan@multicorewareinc.com>
Co-authored-by: Pavel Tcherniaev <Pavel.Tcherniaev@amd.com>
Copy link
Contributor

@rrawther rrawther left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@r-abishek : added some comments. But it got merged. Please take a look


RpptROIPtr cropRoi = &cropRoiTensor[batchCount];
RpptROIPtr patchRoi = &patchRoiTensor[batchCount];
Rpp8u *srcPtr1Image, *srcPtr2Image, *dstPtrImage;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

move this to beginning after line #48

__m128 p[4];
rpp_simd_load(rpp_load12_f32pkd3_to_f32pln3, srcPtrTemp_ps, p); // simd loads
rpp_simd_store(rpp_store12_f32pln3_to_f32pln3, dstPtrTemp_ps, dstPtrTemp_ps + 4, dstPtrTemp_ps + 8, p); // simd stores
for(int cnt = 0; cnt < 4; cnt++)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why can't we store directly to dstPtrTempR, G and B and get rid of this loop here

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This has been the case only for f16 since pre-AVX512. Shall get back on this on a separate call

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants