Bugfix for Erase variant (#373) · sampath1117/rpp@5e5d891

Bugfix for Erase variant (ROCm#373)

* experimental changes for adding qa mode for performance tests

* made changes to display more information w.r.t. QA results summary for performance tests

* minor changes

* Add changes to dump qa results to excel file

* Add performance QA for three new tensor functions

* update prerequisites in readme

* added changes to handle unsupported cases

* removed threshold dictionary and added performance Noise threshold
add new dataset for performance QA

* RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293)

* change golden outputs from .csv files to .bin files

* Changed comparison functions to use .bin files

* Address review comments

* minor change

* Address review comments

* minor change

---------

Co-authored-by: HazarathKumarM <hazarathkumar@multicorewareinc.com>

* Changes to the performance summary dataframe

* minor changes

* Update CMakeLists.txt to add ${CMAKE_CURRENT_SOURCE_DIR} for CI

* Update CMakeLists.txt fix

* Update CMakeLists.txt fix

* remove tabulate dependency

* Update README.md to remove tabulate pip install

* License - updates to 2024 and consistency changes (#298)

* Match all CMakeLists.txt license as per RPP's outermost LICENSE file

* Match all python files' license as per RPP's outermost LICENSE file

* Match all .hpp files' license as per RPP's outermost LICENSE file

* Match all .cpp files' license as per RPP's outermost LICENSE file

* Match all .h files' license as per RPP's outermost LICENSE file

* Remove all rights reserved as per LICENSE file

* Remove double space in "Copyright (c) 2019 - 2023 Advanced Micro Devices, Inc."

* Match all .cmake files' license as per RPP's outermost LICENSE file

* Match all .cpp.in files' license as per RPP's outermost LICENSE file

* Replace 283 occurrences in 282 files - 2023 to 2024

* Add "MIT License" title to 281 instances

* Add missing license

* Test - Update README.md for test_suite (#299)

* Bump rocm-docs-core[api_reference] from 0.33.0 to 0.33.1 in /docs/sphinx (#301)

Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.33.0 to 0.33.1.
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.33.0...v0.33.1)

---
updated-dependencies:
- dependency-name: rocm-docs-core[api_reference]
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump rocm-docs-core[api_reference] from 0.33.1 to 0.33.2 in /docs/sphinx (#302)

Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.33.1 to 0.33.2.
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.33.1...v0.33.2)

---
updated-dependencies:
- dependency-name: rocm-docs-core[api_reference]
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Fix for CI machine failure

* Add note on performance

* Update doc codeowners (#303)

* Documentation - Bump rocm-docs-core[api_reference] from 0.33.2 to 0.34.0 in /docs/sphinx (#304)

Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.33.2 to 0.34.0.
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.33.2...v0.34.0)

---
updated-dependencies:
- dependency-name: rocm-docs-core[api_reference]
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Test suite - upgrade 5 qa perf (#305)

* experimental changes for adding qa mode for performance tests

* made changes to display more information w.r.t. QA results summary for performance tests

* minor changes

* Add changes to dump qa results to excel file

* Add performance QA for three new tensor functions

* update prerequisites in readme

* added changes to handle unsupported cases

* removed threshold dictionary and added performance Noise threshold
add new dataset for performance QA

* RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293)

* change golden outputs from .csv files to .bin files

* Changed comparison functions to use .bin files

* Address review comments

* minor change

* Address review comments

* minor change

---------

Co-authored-by: HazarathKumarM <hazarathkumar@multicorewareinc.com>

* Changes to the performance summary dataframe

* minor changes

* Update CMakeLists.txt to add ${CMAKE_CURRENT_SOURCE_DIR} for CI

* Update CMakeLists.txt fix

* Update CMakeLists.txt fix

* remove tabulate dependency

* Update README.md to remove tabulate pip install

* Fix for CI machine failure

* Add note on performance

---------

Co-authored-by: sampath1117 <sampath.rachumallu@multicorewareinc.com>
Co-authored-by: HazarathKumarM <hazarathkumar@multicorewareinc.com>
Co-authored-by: Abishek <52214183+r-abishek@users.noreply.github.com>
Co-authored-by: Snehaa Giridharan <snehaa@multicorewareinc.com>
Co-authored-by: r-abishek <abishek@multicorewareinc.com>

* RPP Color Temperature on HOST and HIP (#271)

* Initial commit - Color Temperature HOST Tensor

* Initial commit - Color Temperature HIP Tensor

* Add color temperature golden outputs

* address review comments

* Use reinterpret_cast instead of static_cast

* Combine templated functions to support all datatypes into one
(got minor perf difference of order 3%)

Also fixes indentation

* Fix i8 datatype

* Cleanup

* RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293)

* change golden outputs from .csv files to .bin files

* Changed comparison functions to use .bin files

* Address review comments

* minor change

* Address review comments

* minor change

---------

Co-authored-by: HazarathKumarM <hazarathkumar@multicorewareinc.com>

* Fix PLN3 variant outputs

Also modifies reference outputs

* Update color_temperature.hpp license

* Delete color_temperature_u8_Tensor_PKD3.csv

* Delete color_temperature_u8_Tensor_PLN3.csv

---------

Co-authored-by: snehaa8 <snehaa@multicorewareinc.com>
Co-authored-by: HazarathKumarM <hazarathkumar@multicorewareinc.com>
Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com>

* RPP Voxel 3D Tensor Add/Subtract scalar on HOST and HIP (#272)

* added HOST support for voxel add kernel

* added HIP support for voxel add kernel

* added test suite support for add scalar

* added Doxygen support and modified hip kernel function names as per new standard

* added HOST support for voxel subtract kernel

* added HIP support for voxel subtract kernel

* added test suite support

* updated the golden outputs for subtract with correct values

* removed unnecessary validation checks

* Remove double spaces

* Fix header

* Fix all retval docs

* Fix docs to add memory type

* Fix comment

* Add divider comment

* Use post-increment efficiently

* RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293)

* change golden outputs from .csv files to .bin files

* Changed comparison functions to use .bin files

* Address review comments

* minor change

* Address review comments

* minor change

---------

Co-authored-by: HazarathKumarM <hazarathkumar@multicorewareinc.com>

* converted add and subtract scalar golden outputs to bin files

* changed copyright from 2023 to 2024

* Update add_scalar.hpp license

* Update subtract_scalar.hpp license

---------

Co-authored-by: sampath1117 <sampath.rachumallu@multicorewareinc.com>
Co-authored-by: HazarathKumarM <hazarathkumar@multicorewareinc.com>

* RPP Magnitude on HOST and HIP (#278)

* Initial commit - Magnitude HOST Tensor

* Add QA reference outputs

* Update runTests.py

* Initial commit - Magnitude HIP Tensor

* Add dual input support in testsuite

* Optimize HOST kernel further

* Optimize i8 datatype further

* Modify comments

* RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293)

* change golden outputs from .csv files to .bin files

* Changed comparison functions to use .bin files

* Address review comments

* minor change

* Address review comments

* minor change

---------

Co-authored-by: HazarathKumarM <hazarathkumar@multicorewareinc.com>

* Bump rocm-docs-core[api_reference] from 0.31.0 to 0.33.0 in /docs/sphinx (#294)

Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.31.0 to 0.33.0.
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.31.0...v0.33.0)

---
updated-dependencies:
- dependency-name: rocm-docs-core[api_reference]
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Update Copyright year

* Combine templated functions to support all datatypes

* Modify format of reference outputs

* Update rppi_arithmetic_operations.h license

* Update rppt_tensor_arithmetic_operations.h license

* Update host_tensor_arithmetic_operations.hpp

* Update magnitude.hpp license

* Update hip_tensor_arithmetic_operations.hpp license

* Delete magnitude_u8_Tensor_PKD3.csv

* Delete magnitude_u8_Tensor_PLN1.csv

* Delete magnitude_u8_Tensor_PLN3.csv

* Update rpp_test_suite_common.h license

* Update runTests.py license

* Update Tensor_hip.cpp license

* Update runTests.py license

* Update Tensor_host.cpp license

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: Snehaa Giridharan <snehaa@multicorewareinc.com>
Co-authored-by: HazarathKumarM <hazarathkumar@multicorewareinc.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com>

* Initial commit - Erase HOST Tensor

* Add support for i8, f32 and f16 datatypes

Also fixed outputs of PKD3->PKD3 variant of u8.

* Add reference outputs

* Bump rocm-docs-core[api_reference] from 0.34.0 to 0.34.2 in /docs/sphinx (#309)

Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.34.0 to 0.34.2.
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.34.0...v0.34.2)

---
updated-dependencies:
- dependency-name: rocm-docs-core[api_reference]
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* RPP Tensor Audio Support - Down Mixing (#296)

* Initial commit - Non silent region detection

Includes unittest setup

* Initial commit - To Decibels

Includes unittest setup

* Initial commit - pre_emphasis_filter

* Initial commit - down_mixing

* Replace vectors with arrays

* Cleanup

* Minor cleanup

* Optimize downmixing Kernel

Includes cleanup

* Replace Rpp64s with Rpp32s

* Cleanup

* Optimize and precompute cutOff

* Fix buffer used

* Fix buffer used

* Additional Cleanup

* Optimize post increment operation

* Optimize post increment operation

* Update testsuite for Audio

* code cleanup

* Add Readme file for Audio test suite

* changes based on review comments

* minor change

* Remove unittest folders and updated README.md

* Remove unit tests

* minor change

* code cleanup

* added common header file for audio helper functions

* removed unnecessary audio wav files

fixed bug in ROI updation for audio test suite

resolved issue in summary generation for performance tests in python

* removed log file

* added doxygen support for audio

* added doxygen changes for to_decibels

* updated test suite support for to_decibels

* minor change

* added doxygen changes for preemphasis filter

* updated changes for preemphasis filter in test suite

* removed the usage of getMax function and used std::max_element

* modularized code in test suite

* merge with latest changes

* minor change

* minor change

* minor change

* resolved codacy warnings

* Codacy fix - Remove unused cpuTime

* CMakeLists - Version Update

1.5.0 - TOT Version

* CHANGELOG Updates

Version 1.5.0 placeholder

* resolved issue with file_system dependency in test suite

* Doxygen changes

changed malloc to new in NSR kernel

* RPP RICAP Tensor for HOST and HIP (#213)

* Initial commit - Ricap HOST Tensor

Includes testsuite changes

* Add QA tests for RICAP

Used three_images_224x224_src1 folder to create golden outputs

* Add three_images_224x224_src1 into TEST_IMAGES

* Support HIP Backend for RICAP

* Fix HIP pkd3->pkd3 variant

* regenerated golden outputs for RICAP

minor changes in HOST shell script for handling RICAP in QA mode

* minor bug fix in RICAP HIP kernels

* Improve readability and Cleanup

* Additional cleanup

* Cleanup testsuite

Includes new golden outputs

* Additional testsuite fixes

* Minor cleanup

* Fix codacy warnings

* Address other codacy warnings

* Update ricap.hpp with reference paper

* Add RICAP dataset path in readme

* Make changes to error codes returned

* Modify roi crop region for unit and perf tests

* RPP Tensor Water Augmentation on HOST and HIP (#181)

* added water HOST and HIP codes

* added water case in test suite

* added golden outputs for water

* added omp thread changes for water augmentation

* experimental changes

* fixed output issue with AVX2 instructions

* added AVX2 support for PKD3 load function

minor changes in PLN variant load functions

* nwc commit - added avx2 changes for u8 layout toggle variants but need to add store functions for completion

* Add Avx2 implementation for F32 and U8 toggle variants

* Add AVX2 support for u8 pkd3-pln3 and i8 pkd3-pln3 for water augmentation

* change F32 load and store logic

* optimized the store function for F32 PLN3-PKD3

* reverted back irrelevant changes

* minor change

* optimized load and store functions for water U8 and F32 variants in host

removed commented code

* removed golden outputs for water

* minor changes

* renamed few functions and removed unused functions

updated i8 pln1 load as per the optimized u8 pln1 load

* fixed bug in i8 load function

* changed cast to c++ style

resolved spacing issues and added comments for AVX codes for better understanding

made changes to handle cases where QA Tests are not supported

* added golden outputs for water

* updated golden outputs with latest changes

* modified the u8, i8 pkd3-pln3 function and added comments for the vectorized code

* fixed minor bug in I8 variants

* made changes to resolve codacy warnings

* changed cast to c++ style in hip kernel

* changed generic nn F32 loads using gather and setr instructions

* added comments for latest changes

* minor change

* added definition for storing 32 and 64 bits from a 128bit register

---------

Co-authored-by: sampath1117 <sampath.rachumallu@multicorewareinc.com>
Co-authored-by: HazarathKumarM <hazarathkumar@multicorewareinc.com>

* Fix build error

* CMakeLists - Version Update

1.5.0 - TOT Version

* CHANGELOG Updates

Version 1.5.0 placeholder

* Boost deps fix for test suite

---------

Co-authored-by: Snehaa Giridharan <snehaa@multicorewareinc.com>
Co-authored-by: sampath1117 <sampath.rachumallu@multicorewareinc.com>
Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com>
Co-authored-by: HazarathKumarM <hazarathkumar@multicorewareinc.com>
Co-authored-by: Kiriti Gowda <kiritigowda@gmail.com>

* Documentation - Readme & changelog updates (#251)

* readme and changelog updates for 6.0

* minor update

* added ctests for audio test suite for CI

made changes to add more clarity on the QA Tests results

* Cmake mods for ctest

* HOST-only build error bugfix

* added qa mode parameter to python audio script

added golden output map for QA testing of Non silent region detection

* minor change

* Documentation - Bump rocm-docs-core[api_reference] from 0.26.0 to 0.27.0 in /docs/sphinx (#253)

Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.26.0 to 0.27.0.
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases)
- [Changelog](https://github.com/RadeonOpenCompute/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.26.0...v0.27.0)

---
updated-dependencies:
- dependency-name: rocm-docs-core[api_reference]
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* RPP Resize Mirror Normalize Bugfix (#252)

* added fix for hipMemset

* remove pixel check for U8-F32 and U8-F16 for HOST codes

---------

Co-authored-by: sampath1117 <sampath.rachumallu@multicorewareinc.com>

* added example for MMS calculation in comments for better understanding

* Sphinx - updates (#257)

* Sphinx - updates

* Doxygen - Updates

* Docs - Remove index.md

* updated info used for running audio test suite

* removed bitdepth variable from audio test suite

* added more information on computing NSR outputs in the example added

* Fix doxygen for decibels

Also removes extra QA reference files

* move tensor_host_audio.cpp to host folder

* Fix build errors and qa tests in Audio Test suite

* Fix build errors and qa tests in Audio Test suite

* Add reference output and test samples for downmix

* Add down_mix in augmentation list and supported cases

* Remove auto-merge repeated funcs

* Improve clarity of header docs

* Remove blank line

* Improve clarity on header docs

* Add Doxygen comments

* minor change

* converted golden outputs to binary file for downmixing

* removed old golden output file for preemphasis and todecibels

* modified info for downmixing as per new changes

used handle memory for temporary buffers

* formatting changes

* moved the common code for SSE and AVX to outside

* Update down_mixing.hpp license

* Update rppt_tensor_audio_augmentations.h

* combined the srcLength and channels tensors into single tensor

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: Snehaa Giridharan <snehaa@multicorewareinc.com>
Co-authored-by: HazarathKumarM <hazarathkumar@multicorewareinc.com>
Co-authored-by: sampath1117 <sampath.rachumallu@multicorewareinc.com>
Co-authored-by: Kiriti Gowda <kiritigowda@gmail.com>
Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com>
Co-authored-by: Lisa <lisajdelaney@gmail.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Sundarrajan98 <sundarrajan@multicorewareinc.com>

* RPP Voxel 3D Tensor Multiply scalar on HOST and HIP (#306)

* added HIP support for voxel scalar multiply kernel

* added HOST support for voxel multiply kernel

added golden outputs for voxel multiply kernel

* merge with master

* RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293)

* change golden outputs from .csv files to .bin files

* Changed comparison functions to use .bin files

* Address review comments

* minor change

* Address review comments

* minor change

---------

Co-authored-by: HazarathKumarM <hazarathkumar@multicorewareinc.com>

* converted multiply scalar voxel golden outputs to bin files

* changed copyright from 2023 to 2024

---------

Co-authored-by: sampath1117 <sampath.rachumallu@multicorewareinc.com>
Co-authored-by: HazarathKumarM <hazarathkumar@multicorewareinc.com>

* Test Suite Bugfix (#307)

* experimental changes for adding qa mode for performance tests

* made changes to display more information w.r.t. QA results summary for performance tests

* minor changes

* Add changes to dump qa results to excel file

* Add performance QA for three new tensor functions

* update prerequisites in readme

* added changes to handle unsupported cases

* removed threshold dictionary and added performance Noise threshold
add new dataset for performance QA

* RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293)

* change golden outputs from .csv files to .bin files

* Changed comparison functions to use .bin files

* Address review comments

* minor change

* Address review comments

* minor change

---------

Co-authored-by: HazarathKumarM <hazarathkumar@multicorewareinc.com>

* Changes to the performance summary dataframe

* minor changes

* Update CMakeLists.txt to add ${CMAKE_CURRENT_SOURCE_DIR} for CI

* Update CMakeLists.txt fix

* Update CMakeLists.txt fix

* remove tabulate dependency

* Update README.md to remove tabulate pip install

* Fix for CI machine failure

* Add note on performance

* Fix segmentation fault

* Revert QAmode to restrict HIP bitdepths

* Use Rpp64u for HOST while comparing outputs

* Fix ambiguous abs call

* Fix for SLES CI HIP fail - error: incompatible pointer types assigning to 'unsigned long *' from 'unsigned long long *' - refOutput = TensorSumReferenceOutputs[numChannels].data();

---------

Co-authored-by: sampath1117 <sampath.rachumallu@multicorewareinc.com>
Co-authored-by: HazarathKumarM <hazarathkumar@multicorewareinc.com>
Co-authored-by: Snehaa Giridharan <snehaa@multicorewareinc.com>
Co-authored-by: Pavel Tcherniaev <Pavel.Tcherniaev@amd.com>

* Initial commit - Erase HIP Tensor

* Move hipHostMalloc outside perf iteration loop in HIP testsuite

* Bump rocm-docs-core[api_reference] from 0.35.0 to 0.35.1 in /docs/sphinx (#319)

Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.35.0 to 0.35.1.
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.35.0...v0.35.1)

---
updated-dependencies:
- dependency-name: rocm-docs-core[api_reference]
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump rocm-docs-core[api_reference] from 0.35.1 to 0.36.0 in /docs/sphinx (#322)

Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.35.1 to 0.36.0.
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.35.1...v0.36.0)

---
updated-dependencies:
- dependency-name: rocm-docs-core[api_reference]
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Docs - Bump rocm-docs-core[api_reference] from 0.36.0 to 0.37.0 in /docs/sphinx (#328)

Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.36.0 to 0.37.0.
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.36.0...v0.37.0)

---
updated-dependencies:
- dependency-name: rocm-docs-core[api_reference]
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Link cleanup (#326)

* link updates

* update tables

* pare down index

* API cleanup

* consistency

* verbiage

* Update notes

* Docs - Bump rocm-docs-core[api_reference] from 0.37.0 to 0.37.1 in /docs/sphinx (#329)

Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.37.0 to 0.37.1.
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.37.0...v0.37.1)

---
updated-dependencies:
- dependency-name: rocm-docs-core[api_reference]
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* RPP Voxel Flip on HIP and HOST (#285)

* added support for flip voxel

* added test suite support

* added golden outputs for flip voxel

made changes in test suite to run QA tests for flip

* updated golden outputs with correct values

* minor bug fix in the hip test suite

* made changes to variable names for better readability

fixed comments in test suite

minor cleanup

* combined the flip axis factor as ternary operator in HIP kernel

added new enum for error handling when source and destination layouts are not matching

* RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293)

* change golden outputs from .csv files to .bin files

* Changed comparison functions to use .bin files

* Address review comments

* minor change

* Address review comments

* minor change

---------

Co-authored-by: HazarathKumarM <hazarathkumar@multicorewareinc.com>

* converted flip voxel golden outputs to bin files

* changed copyright from 2023 to 2024

* Update flip_voxel.hpp license

* License - updates to 2024 and consistency changes (#298)

* Match all CMakeLists.txt license as per RPP's outermost LICENSE file

* Match all python files' license as per RPP's outermost LICENSE file

* Match all .hpp files' license as per RPP's outermost LICENSE file

* Match all .cpp files' license as per RPP's outermost LICENSE file

* Match all .h files' license as per RPP's outermost LICENSE file

* Remove all rights reserved as per LICENSE file

* Remove double space in "Copyright (c) 2019 - 2023 Advanced Micro Devices, Inc."

* Match all .cmake files' license as per RPP's outermost LICENSE file

* Match all .cpp.in files' license as per RPP's outermost LICENSE file

* Replace 283 occurrences in 282 files - 2023 to 2024

* Add "MIT License" title to 281 instances

* Add missing license

* Test - Update README.md for test_suite (#299)

* Bump rocm-docs-core[api_reference] from 0.33.0 to 0.33.1 in /docs/sphinx (#301)

Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.33.0 to 0.33.1.
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.33.0...v0.33.1)

---
updated-dependencies:
- dependency-name: rocm-docs-core[api_reference]
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump rocm-docs-core[api_reference] from 0.33.1 to 0.33.2 in /docs/sphinx (#302)

Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.33.1 to 0.33.2.
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.33.1...v0.33.2)

---
updated-dependencies:
- dependency-name: rocm-docs-core[api_reference]
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Update doc codeowners (#303)

* Documentation - Bump rocm-docs-core[api_reference] from 0.33.2 to 0.34.0 in /docs/sphinx (#304)

Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.33.2 to 0.34.0.
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.33.2...v0.34.0)

---
updated-dependencies:
- dependency-name: rocm-docs-core[api_reference]
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Test suite - upgrade 5 qa perf (#305)

* experimental changes for adding qa mode for performance tests

* made changes to display more information w.r.t. QA results summary for performance tests

* minor changes

* Add changes to dump qa results to excel file

* Add performance QA for three new tensor functions

* update prerequisites in readme

* added changes to handle unsupported cases

* removed threshold dictionary and added performance Noise threshold
add new dataset for performance QA

* RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293)

* change golden outputs from .csv files to .bin files

* Changed comparison functions to use .bin files

* Address review comments

* minor change

* Address review comments

* minor change

---------

Co-authored-by: HazarathKumarM <hazarathkumar@multicorewareinc.com>

* Changes to the performance summary dataframe

* minor changes

* Update CMakeLists.txt to add ${CMAKE_CURRENT_SOURCE_DIR} for CI

* Update CMakeLists.txt fix

* Update CMakeLists.txt fix

* remove tabulate dependency

* Update README.md to remove tabulate pip install

* Fix for CI machine failure

* Add note on performance

---------

Co-authored-by: sampath1117 <sampath.rachumallu@multicorewareinc.com>
Co-authored-by: HazarathKumarM <hazarathkumar@multicorewareinc.com>
Co-authored-by: Abishek <52214183+r-abishek@users.noreply.github.com>
Co-authored-by: Snehaa Giridharan <snehaa@multicorewareinc.com>
Co-authored-by: r-abishek <abishek@multicorewareinc.com>

* RPP Color Temperature on HOST and HIP (#271)

* Initial commit - Color Temperature HOST Tensor

* Initial commit - Color Temperature HIP Tensor

* Add color temperature golden outputs

* address review comments

* Use reinterpret_cast instead of static_cast

* Combine templated functions to support all datatypes into one
(got minor perf difference of order 3%)

Also fixes indentation

* Fix i8 datatype

* Cleanup

* RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293)

* change golden outputs from .csv files to .bin files

* Changed comparison functions to use .bin files

* Address review comments

* minor change

* Address review comments

* minor change

---------

Co-authored-by: HazarathKumarM <hazarathkumar@multicorewareinc.com>

* Fix PLN3 variant outputs

Also modifies reference outputs

* Update color_temperature.hpp license

* Delete color_temperature_u8_Tensor_PKD3.csv

* Delete color_temperature_u8_Tensor_PLN3.csv

---------

Co-authored-by: snehaa8 <snehaa@multicorewareinc.com>
Co-authored-by: HazarathKumarM <hazarathkumar@multicorewareinc.com>
Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com>

* RPP Voxel 3D Tensor Add/Subtract scalar on HOST and HIP (#272)

* added HOST support for voxel add kernel

* added HIP support for voxel add kernel

* added test suite support for add scalar

* added Doxygen support and modified hip kernel function names as per new standard

* added HOST support for voxel subtract kernel

* added HIP support for voxel subtract kernel

* added test suite support

* updated the golden outputs for subtract with correct values

* removed unnecessary validation checks

* Remove double spaces

* Fix header

* Fix all retval docs

* Fix docs to add memory type

* Fix comment

* Add divider comment

* Use post-increment efficiently

* RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293)

* change golden outputs from .csv files to .bin files

* Changed comparison functions to use .bin files

* Address review comments

* minor change

* Address review comments

* minor change

---------

Co-authored-by: HazarathKumarM <hazarathkumar@multicorewareinc.com>

* converted add and subtract scalar golden outputs to bin files

* changed copyright from 2023 to 2024

* Update add_scalar.hpp license

* Update subtract_scalar.hpp license

---------

Co-authored-by: sampath1117 <sampath.rachumallu@multicorewareinc.com>
Co-authored-by: HazarathKumarM <hazarathkumar@multicorewareinc.com>

* RPP Magnitude on HOST and HIP (#278)

* Initial commit - Magnitude HOST Tensor

* Add QA reference outputs

* Update runTests.py

* Initial commit - Magnitude HIP Tensor

* Add dual input support in testsuite

* Optimize HOST kernel further

* Optimize i8 datatype further

* Modify comments

* RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293)

* change golden outputs from .csv files to .bin files

* Changed comparison functions to use .bin files

* Address review comments

* minor change

* Address review comments

* minor change

---------

Co-authored-by: HazarathKumarM <hazarathkumar@multicorewareinc.com>

* Bump rocm-docs-core[api_reference] from 0.31.0 to 0.33.0 in /docs/sphinx (#294)

Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.31.0 to 0.33.0.
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.31.0...v0.33.0)

---
updated-dependencies:
- dependency-name: rocm-docs-core[api_reference]
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Update Copyright year

* Combine templated functions to support all datatypes

* Modify format of reference outputs

* Update rppi_arithmetic_operations.h license

* Update rppt_tensor_arithmetic_operations.h license

* Update host_tensor_arithmetic_operations.hpp

* Update magnitude.hpp license

* Update hip_tensor_arithmetic_operations.hpp license

* Delete magnitude_u8_Tensor_PKD3.csv

* Delete magnitude_u8_Tensor_PLN1.csv

* Delete magnitude_u8_Tensor_PLN3.csv

* Update rpp_test_suite_common.h license

* Update runTests.py license

* Update Tensor_hip.cpp license

* Update runTests.py license

* Update Tensor_host.cpp license

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: Snehaa Giridharan <snehaa@multicorewareinc.com>
Co-authored-by: HazarathKumarM <hazarathkumar@multicorewareinc.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com>

* Bump rocm-docs-core[api_reference] from 0.34.0 to 0.34.2 in /docs/sphinx (#309)

Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.34.0 to 0.34.2.
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.34.0...v0.34.2)

---
updated-dependencies:
- dependency-name: rocm-docs-core[api_reference]
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* RPP Tensor Audio Support - Down Mixing (#296)

* Initial commit - Non silent region detection

Includes unittest setup

* Initial commit - To Decibels

Includes unittest setup

* Initial commit - pre_emphasis_filter

* Initial commit - down_mixing

* Replace vectors with arrays

* Cleanup

* Minor cleanup

* Optimize downmixing Kernel

Includes cleanup

* Replace Rpp64s with Rpp32s

* Cleanup

* Optimize and precompute cutOff

* Fix buffer used

* Fix buffer used

* Additional Cleanup

* Optimize post increment operation

* Optimize post increment operation

* Update testsuite for Audio

* code cleanup

* Add Readme file for Audio test suite

* changes based on review comments

* minor change

* Remove unittest folders and updated README.md

* Remove unit tests

* minor change

* code cleanup

* added common header file for audio helper functions

* removed unnecessary audio wav files

fixed bug in ROI updation for audio test suite

resolved issue in summary generation for performance tests in python

* removed log file

* added doxygen support for audio

* added doxygen changes for to_decibels

* updated test suite support for to_decibels

* minor change

* added doxygen changes for preemphasis filter

* updated changes for preemphasis filter in test suite

* removed the usage of getMax function and used std::max_element

* modularized code in test suite

* merge with latest changes

* minor change

* minor change

* minor change

* resolved codacy warnings

* Codacy fix - Remove unused cpuTime

* CMakeLists - Version Update

1.5.0 - TOT Version

* CHANGELOG Updates

Version 1.5.0 placeholder

* resolved issue with file_system dependency in test suite

* Doxygen changes

changed malloc to new in NSR kernel

* RPP RICAP Tensor for HOST and HIP (#213)

* Initial commit - Ricap HOST Tensor

Includes testsuite changes

* Add QA tests for RICAP

Used three_images_224x224_src1 folder to create golden outputs

* Add three_images_224x224_src1 into TEST_IMAGES

* Support HIP Backend for RICAP

* Fix HIP pkd3->pkd3 variant

* regenerated golden outputs for RICAP

minor changes in HOST shell script for handling RICAP in QA mode

* minor bug fix in RICAP HIP kernels

* Improve readability and Cleanup

* Additional cleanup

* Cleanup testsuite

Includes new golden outputs

* Additional testsuite fixes

* Minor cleanup

* Fix codacy warnings

* Address other codacy warnings

* Update ricap.hpp with reference paper

* Add RICAP dataset path in readme

* Make changes to error codes returned

* Modify roi crop region for unit and perf tests

* RPP Tensor Water Augmentation on HOST and HIP (#181)

* added water HOST and HIP codes

* added water case in test suite

* added golden outputs for water

* added omp thread changes for water augmentation

* experimental changes

* fixed output issue with AVX2 instructions

* added AVX2 support for PKD3 load function

minor changes in PLN variant load functions

* nwc commit - added avx2 changes for u8 layout toggle variants but need to add store functions for completion

* Add Avx2 implementation for F32 and U8 toggle variants

* Add AVX2 support for u8 pkd3-pln3 and i8 pkd3-pln3 for water augmentation

* change F32 load and store logic

* optimized the store function for F32 PLN3-PKD3

* reverted back irrelevant changes

* minor change

* optimized load and store functions for water U8 and F32 variants in host

removed commented code

* removed golden outputs for water

* minor changes

* renamed few functions and removed unused functions

updated i8 pln1 load as per the optimized u8 pln1 load

* fixed bug in i8 load function

* changed cast to c++ style

resolved spacing issues and added comments for AVX codes for better understanding

made changes to handle cases where QA Tests are not supported

* added golden outputs for water

* updated golden outputs with latest changes

* modified the u8, i8 pkd3-pln3 function and added comments for the vectorized code

* fixed minor bug in I8 variants

* made changes to resolve codacy warnings

* changed cast to c++ style in hip kernel

* changed generic nn F32 loads using gather and setr instructions

* added comments for latest changes

* minor change

* added definition for storing 32 and 64 bits from a 128bit register

---------

Co-authored-by: sampath1117 <sampath.rachumallu@multicorewareinc.com>
Co-authored-by: HazarathKumarM <hazarathkumar@multicorewareinc.com>

* Fix build error

* CMakeLists - Version Update

1.5.0 - TOT Version

* CHANGELOG Updates

Version 1.5.0 placeholder

* Boost deps fix for test suite

---------

Co-authored-by: Snehaa Giridharan <snehaa@multicorewareinc.com>
Co-authored-by: sampath1117 <sampath.rachumallu@multicorewareinc.com>
Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com>
Co-authored-by: HazarathKumarM <hazarathkumar@multicorewareinc.com>
Co-authored-by: Kiriti Gowda <kiritigowda@gmail.com>

* Documentation - Readme & changelog updates (#251)

* readme and changelog updates for 6.0

* minor update

* added ctests for audio test suite for CI

made changes to add more clarity on the QA Tests results

* Cmake mods for ctest

* HOST-only build error bugfix

* added qa mode parameter to python audio script

added golden output map for QA testing of Non silent region detection

* minor change

* Documentation - Bump rocm-docs-core[api_reference] from 0.26.0 to 0.27.0 in /docs/sphinx (#253)

Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.26.0 to 0.27.0.
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases)
- [Changelog](https://github.com/RadeonOpenCompute/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.26.0...v0.27.0)

---
updated-dependencies:
- dependency-name: rocm-docs-core[api_reference]
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* RPP Resize Mirror Normalize Bugfix (#252)

* added fix for hipMemset

* remove pixel check for U8-F32 and U8-F16 for HOST codes

---------

Co-authored-by: sampath1117 <sampath.rachumallu@multicorewareinc.com>

* added example for MMS calculation in comments for better understanding

* Sphinx - updates (#257)

* Sphinx - updates

* Doxygen - Updates

* Docs - Remove index.md

* updated info used for running audio test suite

* removed bitdepth variable from audio test suite

* added more information on computing NSR outputs in the example added

* Fix doxygen for decibels

Also removes extra QA reference files

* move tensor_host_audio.cpp to host folder

* Fix build errors and qa tests in Audio Test suite

* Fix build errors and qa tests in Audio Test suite

* Add reference output and test samples for downmix

* Add down_mix in augmentation list and supported cases

* Remove auto-merge repeated funcs

* Improve clarity of header docs

* Remove blank line

* Improve clarity on header docs

* Add Doxygen comments

* minor change

* converted golden outputs to binary file for downmixing

* removed old golden output file for preemphasis and todecibels

* modified info for downmixing as per new changes

used handle memory for temporary buffers

* formatting changes

* moved the common code for SSE and AVX to outside

* Update down_mixing.hpp license

* Update rppt_tensor_audio_augmentations.h

* combined the srcLength and channels tensors into single tensor

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: Snehaa Giridharan <snehaa@multicorewareinc.com>
Co-authored-by: HazarathKumarM <hazarathkumar@multicorewareinc.com>
Co-authored-by: sampath1117 <sampath.rachumallu@multicorewareinc.com>
Co-authored-by: Kiriti Gowda <kiritigowda@gmail.com>
Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com>
Co-authored-by: Lisa <lisajdelaney@gmail.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Sundarrajan98 <sundarrajan@multicorewareinc.com>

* RPP Voxel 3D Tensor Multiply scalar on HOST and HIP (#306)

* added HIP support for voxel scalar multiply kernel

* added HOST support for voxel multiply kernel

added golden outputs for voxel multiply kernel

* merge with master

* RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293)

* change golden outputs from .csv files to .bin files

* Changed comparison functions to use .bin files

* Address review comments

* minor change

* Address review comments

* minor change

---------

Co-authored-by: HazarathKumarM <hazarathkumar@multicorewareinc.com>

* converted multiply scalar voxel golden outputs to bin files

* changed copyright from 2023 to 2024

---------

Co-authored-by: sampath1117 <sampath.rachumallu@multicorewareinc.com>
Co-authored-by: HazarathKumarM <hazarathkumar@multicorewareinc.com>

* Test Suite Bugfix (#307)

* experimental changes for adding qa mode for performance tests

* made changes to display more information w.r.t. QA results summary for performance tests

* minor changes

* Add changes to dump qa results to excel file

* Add performance QA for three new tensor functions

* update prerequisites in readme

* added changes to handle unsupported cases

* removed threshold dictionary and added performance noise threshold
add new dataset for performance QA

* RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293)

* change golden outputs from .csv files to .bin files

* Changed comparison functions to use .bin files

* Address review comments

* minor change

* Address review comments

* minor change

---------

Co-authored-by: HazarathKumarM <hazarathkumar@multicorewareinc.com>

* Changes to the performance summary dataframe

* minor changes

* Update CMakeLists.txt to add ${CMAKE_CURRENT_SOURCE_DIR} for CI

* Update CMakeLists.txt fix

* Update CMakeLists.txt fix

* remove tabulate dependency

* Update README.md to remove tabulate pip install

* Fix for CI machine failure

* Add note on performance

* Fix segmentation fault

* Revert QAmode to restrict HIP bitdepths

* Use Rpp64u for HOST while comparing outputs

* Fix ambiguous abs call

* Fix for SLES CI HIP fail - error: incompatible pointer types assigning to 'unsigned long *' from 'unsigned long long *' - refOutput = TensorSumReferenceOutputs[numChannels].data();

---------

Co-authored-by: sampath1117 <sampath.rachumallu@multicorewareinc.com>
Co-authored-by: HazarathKumarM <hazarathkumar@multicorewareinc.com>
Co-authored-by: Snehaa Giridharan <snehaa@multicorewareinc.com>
Co-authored-by: Pavel Tcherniaev <Pavel.Tcherniaev@amd.com>

* Bump rocm-docs-core[api_reference] from 0.34.2 to 0.35.0 in /docs/sphinx (#313)

Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.34.2 to 0.35.0.
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.34.2...v0.35.0)

---
updated-dependencies:
- dependency-name: rocm-docs-core[api_reference]
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* RPP Reduction - Tensor min and Tensor max on HOST and HIP (#260)

* Minor Change

* Add Validation check for DST_FOLDER path

* added water HOST and HIP codes

* added water case in test suite

* added golden outputs for water

* Add Validation checks for all options in testAllScript.sh

* Add sanity check for dual Input cases
Set Max Dimension and Max Image Dump
Replaced Fast DCT tag with Accurate DCT

* Regenerate golden outputs using accurate dct Flag
Add golden outputs for some new augmentations

* Fix Flip golden outputs mismatch
Fix PLN3 variants mismatch in QA mode

* Add MAX_BATCH_SIZE check
removed Augmentations function calls for failing Qa modes
code cleanup

* Add crop and gamma correction augmentations
code cleanup

* Add comments to functions in rpp_test_suite_common.h

* minor change

* code cleanup

* minor code changes

* Change roi and Image sizes for crop augmentation

* Change numIterations option to numRuns
Addressed PR comments

* added omp thread changes for water augmentation

* experimental changes

* fixed output issue with AVX2 instructions

* added AVX2 support for PKD3 load function

minor changes in PLN variant load functions

* Add turboJpeg header to update maxHeight and maxWidth values

* nwc commit - added avx2 changes for u8 layout toggle variants but need to add store functions for completion

* Change the performance Timings logic

* Add Avx2 implementation for F32 and U8 toggle variants

* minor change to support u8_f16 and u8_f32 cases

* Regenerate LUT golden outputs with ACCURATE_DCT tag

* Minor code changes

* Add AVX2 support for u8 pkd3-pln3 and i8 pkd3-pln3 for water augmentation

* Made changes to the runTests.py in Host to remove testAllScipts.sh

* Made changes to the runTests.py in HIP to remove testAllScipts.sh

* Initial commit - Image min and max Reduction kernel

Includes
* u8 datatype for both min and max HOST Tensor of all variants.
* Testsuite changes.

* NWC -initial code for min max PLN3 - PLN3

* made changes to split min and max kernels separately

* split kernels for min and max

* made changes to print final max/min in the R,G,B channels

* fixed inaccuracies in min/max computation

* made changes to typecast intermediate output to output requested by user

added comments for the code

code cleanup and minor changes in test suite

* fixed build issues

removed image folders used for min, max and sum

reverted unwanted file changes

* minor changes in test suite

* removed support for unwanted test case in Tensor_hip.cpp

* Adds new option roi

* remove testAllScripts.sh

* Adds roi Option in HIP backend

* Implement f32 variants

* Implement f16 and i8 datatype variants

* change F32 load and store logic

* Add build flags in CMakeLists.txt to set AVX/SSE flags based on the system configuration

* minor code changes

* Initial commit - Image sum Reduction kernel

Includes u8 PLN1 -> PLN1 conversion for HOST Tensor

* Implement PKD3 and PLN3 for Image sum Tensor HOST

* Support i8, f16 and f32 datatypes

* Initial commit - Image sum Reduction HIP kernel

Includes u8 PLN1 -> PLN1 conversion for Tensor

* Implement PKD3 and PLN3 for Image sum Tensor HIP

* Add support in testsuite

Revert normalization for i8 HOST Tensor variants

* Fix HIP testsuite

Remove additional blanks for 1 channel output

* Modify print statement in HIP testsuite

* Improve readability for testsuite outputs

* optimized the store function for F32 PLN3-PKD3

* reverted back irrelevant changes

* minor change

* Fix HIP to support larger inputs

* optimized load and store functions for water U8 and F32 variants in host

removed commented code

* Cleanup

* removed golden outputs for water

* minor changes

* Cleanup

Support Reduction QA test in testsuite

* renamed few functions and removed unused functions

updated i8 pln1 load as per the optimized u8 pln1 load

* fixed bug in i8 load function

* Remove unused variables and C style casting

* changed cast to c++ style

resolved spacing issues and added comments for AVX codes for better understanding

made changes to handle cases where QA Tests are not supported

* added golden outputs for water

* updated golden outputs with latest changes

* modified the u8, i8 pkd3-pln3 function and added comments for the vectorized code

* fixed minor bug in I8 variants

* Optimize u8 datatype further

* Fix static_cast

* made changes to resolve codacy warnings

* changed cast to c++ style in hip kernel

* Initial commit - Ricap HOST Tensor

Includes testsuite changes

* Add QA tests for RICAP

Used three_images_224x224_src1 folder to create golden outputs

* Add three_images_224x224_src1 into TEST_IMAGES

* added rotate case with golden outputs

changed generic bilinear HOST codes to match with HIP codes

* Add golden output for remaining all tensor augmentations

* fix python script issues

* Optimize u8 and i8 datatype

Uses uint and int internal processing instead of float

* Fix testsuite build errors

* minor change

* Fix QA check

* Modify api naming from image_sum to tensor_sum

Includes changes for both HOST and HIP

* Support HIP Backend for RICAP

* change rcm and rmn golden outputs

* Fix HIP pkd3->pkd3 variant

* changes based on review comments

* change test_suite folder to tests

* Optimize u8 and i8 datatype of HIP

Includes modification in naming of shared memory

* minor fix

* changed generic nn F32 loads using gather and setr instructions

* Optimize and cleanup U8 HIP

* regenerated golden outputs for RICAP

minor changes in HOST shell script for handling RICAP in QA mode

* minor bug fix in RICAP HIP kernels

* Fix i8 datatype variants

Includes cleanup

* Fix the issues with color_to_greyscale

* remove the empty folder creation

* reverting back the folder name change

* minor change

* added comments for latest changes

* minor change

* Improve readability and Cleanup

* Fix QA for HIP

Includes cleanup

* resolved review comments

* minor change

* Modify api naming from image_ to tensor_ for HOST

* Add support for QA tests

* removed range check for RMN U8-F32 and U8-F16 variants

changed from hipMemset to hipMemsetAsync for RMN HIP Kernel

removed multiplication by 255 for stdDev in RMN HOST U8-F16 and U8-F32 variants

* Modify naming of shared memory with _smem in HIP

Includes cleanup

* Typecast and reuse markArr for HIP U8 and I8

* Cleanup and minor optimization

* minor fix

* fix codacy warnings

* Additional cleanup

* Cleanup and move #define

* Changed the complexity of if statements in runTests.py

* Cleanup testsuite

Includes new golden outputs

* Additional testsuite fixes

* Minor cleanup

* Codacy fixes

* Fix codacy warnings

* Codacy fix

* Address other codacy warnings

* cleanup

* Change Image functions to generic

* Update ricap.hpp with reference paper

* resolved minor issues that happened with merge

* minor changes

* fixed minor issue with getting profiler times

* minor formatting changes

* resolved build issues in test suite

renamed the min and max kernel file names

* RPP RICAP Tensor for HOST and HIP (#213)

* Initial commit - Ricap HOST Tensor

Includes testsuite changes

* Add QA tests for RICAP

Used three_images_224x224_src1 folder to create golden outputs

* Add three_images_224x224_src1 into TEST_IMAGES

* Support HIP Backend for RICAP

* Fix HIP pkd3->pkd3 variant

* regenerated golden outputs for RICAP

minor changes in HOST shell script for handling RICAP in QA mode

* minor bug fix in RICAP HIP kernels

* Improve readability and Cleanup

* Additional cleanup

* Cleanup testsuite

Includes new golden outputs

* Additional testsuite fixes

* Minor cleanup

* Fix codacy warnings

* Address other codacy warnings

* Update ricap.hpp with reference paper

* Add RICAP dataset path in readme

* Make changes to error codes returned

* Modify roi crop region for unit and perf tests

* RPP Tensor Water Augmentation on HOST and HIP (#181)

* added water HOST and HIP codes

* added water case in test suite

* added golden outputs for water

* added omp thread changes for water augmentation

* experimental changes

* fixed output issue with AVX2 instructions

* added AVX2 support for PKD3 load function

minor changes in PLN variant load functions

* nwc commit - added avx2 changes for u8 layout toggle variants but need to add store functions for completion

* Add Avx2 implementation for F32 and U8 toggle variants

* Add AVX2 support for u8 pkd3-pln3 and i8 pkd3-pln3 for water augmentation

* change F32 load and store logic

* optimized the store function for F32 PLN3-PKD3

* reverted back irrelevant changes

* minor change

* optimized load and store functions for water U8 and F32 variants in host

removed commented code

* removed golden outputs for water

* minor changes

* renamed few functions and removed unused functions

updated i8 pln1 load as per the optimized u8 pln1 load

* fixed bug in i8 load function

* changed cast to c++ style

resolved spacing issues and added comments for AVX codes for better understanding

made changes to handle cases where QA Tests are not supported

* added golden outputs for water

* updated golden outputs with latest changes

* modified the u8, i8 pkd3-pln3 function and added comments for the vectorized code

* fixed minor bug in I8 variants

* made changes to resolve codacy warnings

* changed cast to c++ style in hip kernel

* changed generic nn F32 loads using gather and setr instructions

* added comments for latest changes

* minor change

* added definition for storing 32 and 64 bits from a 128bit register

---------

Co-authored-by: sampath1117 <sampath.rachumallu@multicorewareinc.com>
Co-authored-by: HazarathKumarM <hazarathkumar@multicorewareinc.com>

* Fix build error

* CMakeLists - Version Update

1.5.0 - TOT Version

* CHANGELOG Updates

Version 1.5.0 placeholder

* Boost deps fix for test suite

---------

Co-authored-by: Snehaa Giridharan <snehaa@multicorewareinc.com>
Co-authored-by: sampath1117 <sampath.rachumallu@multicorewareinc.com>
Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com>
Co-authored-by: HazarathKumarM <hazarathkumar@multicorewareinc.com>
Co-authored-by: Kiriti Gowda <kiritigowda@gmail.com>

* Documentation - Readme & changelog updates (#251)

* readme and changelog updates for 6.0

* minor update

* Documentation - Bump rocm-docs-core[api_reference] from 0.26.0 to 0.27.0 in /docs/sphinx (#253)

Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.26.0 to 0.27.0.
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases)
- [Changelog](https://github.com/RadeonOpenCompute/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.26.0...v0.27.0)

---
updated-dependencies:
- dependency-name: rocm-docs-core[api_reference]
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* RPP Resize Mirror Normalize Bugfix (#252)

* added fix for hipMemset

* remove pixel check for U8-F32 and U8-F16 for HOST codes

---------

Co-authored-by: sampath1117 <sampath.rachumallu@multicorewareinc.com>

* Cmake fix to prevent warning

* Fix paths in new python scripts

* Sphinx - updates (#257)

* Sphinx - updates

* Doxygen - Updates

* Docs - Remove index.md

* Test suite fixes after tensor_min / tensor_max HOST merge

* Fix max case

* QA tests fix for hip and host

* naming convention changes as per new std

* Substitute imagePartial with partial

* Substitute imageMin/imageMax with min/max

* Replace hipMemset with hipMemsetAsync, and replace hipDeviceSynchronize with hipStreamSynchronize

* Use variable instead of batchCount*4

* Use post increment effectively

* Resolve codacy warnings

* Additional cleanup

* remove unused variable

* Documentation - Bump rocm-docs-core[api_reference] from 0.28.0 to 0.29.0 in /docs/sphinx (#265)

Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.28.0 to 0.29.0.
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases)
- [Changelog](https://github.com/RadeonOpenCompute/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.28.0...v0.29.0)

---
updated-dependencies:
- dependency-name: rocm-docs-core[api_reference]
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Remove auto merge boost

* Spaces formatting

* Bump rocm-docs-core[api_reference] from 0.29.0 to 0.30.1 in /docs/sphinx (#268)

Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.29.0 to 0.30.1.
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases)
- [Changelog](https://github.com/RadeonOpenCompute/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.29.0...v0.30.1)

---
updated-dependencies:
- dependency-name: rocm-docs-core[api_reference]
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* add support for mi300 (#269)

* Documentation - Bump rocm-docs-core[api_reference] from 0.30.1 to 0.30.2 in /docs/sphinx (#273)

Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.30.1 to 0.30.2.
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases)
- [Changelog](https://github.com/RadeonOpenCompute/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.30.1...v0.30.2)

---
updated-dependencies:
- dependency-name: rocm-docs-core[api_reference]
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Cleanup by removing oneliner functions as inline

* RPP Tensor Audio…

16 people committed Jun 24, 2024

1 parent 82e47f5 commit 5e5d891
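
In the `erase.hpp` change that follows, the device-to-device copy is sized with `sizeof(T)` rather than a per-datatype constant such as `sizeof(Rpp8u)` or `sizeof(Rpp16f)`, so a single `hipMemcpyAsync` call can serve every instantiation of the templated function. A minimal host-only sketch of that byte-count computation is below. `DescSketch` and `copyBytes` are hypothetical names introduced for illustration and are not part of RPP; the sketch only assumes that `nStride` counts elements rather than bytes, as the diff's multiplication by the element size suggests.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for the two descriptor fields the diff reads
// (srcDescPtr->n and srcDescPtr->strides.nStride). Not the real RpptDesc.
struct DescSketch
{
    std::uint32_t n;        // number of images in the batch
    std::uint32_t nStride;  // elements (assumed, not bytes) per image
};

// Number of bytes to copy for a batch whose elements have type T.
// Deriving the element size from T keeps one code path for all datatypes.
template <typename T>
std::size_t copyBytes(const DescSketch &desc)
{
    // A zero stride would mean an uninitialised descriptor in this sketch.
    assert(desc.nStride > 0);
    // Widen before multiplying so large batches do not overflow 32 bits.
    return static_cast<std::size_t>(desc.n) * desc.nStride * sizeof(T);
}
```

For a batch of 2 images with 100 elements each, `copyBytes<std::uint8_t>` gives 200 bytes and `copyBytes<float>` gives 800 bytes, which matches what separate per-datatype copies would have requested.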

src/modules/hip/kernel/erase.hpp

            
                  
    @@ -117,12 +117,34 @@ RppStatus hip_exec_erase_tensor(T *srcPtr,
         int globalThreads_y = dstDescPtr->h;
         int globalThreads_z = handle.GetBatchSize();
    -    if ((srcDescPtr->layout == RpptLayout::NHWC) && (dstDescPtr->layout == RpptLayout::NHWC))
    +    if (dstDescPtr->layout == RpptLayout::NHWC)
         {
    -        if (srcDescPtr->dataType == RpptDataType::U8)
    +        // if src layout is NHWC, copy src to dst
    +        if (srcDescPtr->layout == RpptLayout::NHWC)
             {
    -            hipMemcpyAsync(dstPtr, srcPtr, static_cast<size_t>(srcDescPtr->n * srcDescPtr->strides.nStride * sizeof(Rpp8u)), hipMemcpyDeviceToDevice, handle.GetStream());
    +            hipMemcpyAsync(dstPtr, srcPtr, static_cast<size_t>(srcDescPtr->n * srcDescPtr->strides.nStride * sizeof(T)), hipMemcpyDeviceToDevice, handle.GetStream());
                 hipStreamSynchronize(handle.GetStream());
    +        }
    +        // if src layout is NCHW, convert src from NCHW to NHWC
    +        else if (srcDescPtr->layout == RpptLayout::NCHW)
    +        {
    +            globalThreads_x = (dstDescPtr->w + 7) >> 3;
    +            hipLaunchKernelGGL(convert_pln3_pkd3_hip_tensor,
    +                               dim3(ceil((float)globalThreads_x/LOCAL_THREADS_X), ceil((float)globalThreads_y/LOCAL_THREADS_Y), ceil((float)globalThreads_z/LOCAL_THREADS_Z)),
    +                               dim3(LOCAL_THREADS_X, LOCAL_THREADS_Y, LOCAL_THREADS_Z),
    +                               0,
    +                               handle.GetStream(),
    +                               srcPtr,
    +                               make_uint3(srcDescPtr->strides.nStride, srcDescPtr->strides.cStride, srcDescPtr->strides.hStride),
    +                               dstPtr,
    +                               make_uint2(dstDescPtr->strides.nStride, dstDescPtr->strides.hStride),
    +                               roiTensorPtrSrc);
    +            globalThreads_x = dstDescPtr->w;
    +            hipStreamSynchronize(handle.GetStream());
    +        }
    +        if (srcDescPtr->dataType == RpptDataType::U8)
    +        {
                 hipLaunchKernelGGL(erase_pkd_hip_tensor,
                                    dim3(ceil((float)globalThreads_x / LOCAL_THREADS_X), ceil((float)globalThreads_y / LOCAL_THREADS_Y), ceil((float)globalThreads_z / LOCAL_THREADS_Z)),
                                    dim3(LOCAL_THREADS_X, LOCAL_THREADS_Y, LOCAL_THREADS_Z),
    @@ -137,8 +159,6 @@ RppStatus hip_exec_erase_tensor(T *srcPtr,
             }
             else if (srcDescPtr->dataType == RpptDataType::F16)
             {
    -            hipMemcpyAsync(dstPtr, srcPtr, static_cast<size_t>(srcDescPtr->n * srcDescPtr->strides.nStride * sizeof(Rpp16f)), hipMemcpyDeviceToDevice, handle.GetStream());
    -            hipStreamSynchronize(handle.GetStream());
                 hipLaunchKernelGGL(erase_pkd_hip_tensor,
                                    dim3(ceil((float)globalThreads_x / LOCAL_THREADS_X), ceil((float)globalThreads_y / LOCAL_THREADS_Y), ceil((float)globalThreads_z / LOCAL_THREADS_Z)),
                                    dim3(LOCAL_THREADS_X, LOCAL_THREADS_Y, LOCAL_THREADS_Z),
    @@ -153,8 +173,6 @@ RppStatus hip_exec_erase_tensor(T *srcPtr,
             }
             else if (srcDescPtr->dataType == RpptDataType::F32)
             {
    -            hipMemcpyAsync(dstPtr, srcPtr, static_cast<size_t>(srcDescPtr->n * srcDescPtr->strides.nStride * sizeof(Rpp32f)), hipMemcpyDeviceToDevice, handle.GetStream());
    -            hipStreamSynchronize(handle.GetStream());
                 hipLaunchKernelGGL(erase_pkd_hip_tensor,
                                    dim3(ceil((float)globalThreads_x / LOCAL_THREADS_X), ceil((float)globalThreads_y / LOCAL_THREADS_Y), ceil((float)globalThreads_z / LOCAL_THREADS_Z)),
                                    dim3(LOCAL_THREADS_X, LOCAL_THREADS_Y, LOCAL_THREADS_Z),
    @@ -169,8 +187,6 @@ RppStatus hip_exec_erase_tensor(T *srcPtr,
             }
             else if (srcDescPtr->dataType == RpptDataType::I8)
             {
    -            hipMemcpyAsync(dstPtr, srcPtr, static_cast<size_t>(srcDescPtr->n * srcDescPtr->strides.nStride * sizeof(Rpp8s)), hipMemcpyDeviceToDevice, handle.GetStream());
    -            hipStreamSynchronize(handle.GetStream());
                 hipLaunchKernelGGL(erase_pkd_hip_tensor,
                                    dim3(ceil((float)globalThreads_x / LOCAL_THREADS_X), ceil((float)globalThreads_y / LOCAL_THREADS_Y), ceil((float)globalThreads_z / LOCAL_THREADS_Z)),
                                    dim3(LOCAL_THREADS_X, LOCAL_THREADS_Y, LOCAL_THREADS_Z),
    @@ -245,33 +261,6 @@ RppStatus hip_exec_erase_tensor(T *srcPtr,
                                    numBoxesTensor,
                                    roiTensorPtrSrc);
             }
    -        else if ((srcDescPtr->layout == RpptLayout::NCHW) && (dstDescPtr->layout == RpptLayout::NHWC))
    -        {
    -            globalThreads_x = (dstDescPtr->w + 7) >> 3;
    -            hipLaunchKernelGGL(convert_pln3_pkd3_hip_tensor,
    -                               dim3(ceil((float)globalThreads_x/LOCAL_THREADS_X), ceil((float)globalThreads_y/LOCAL_THREADS_Y), ceil((float)globalThreads_z/LOCAL_THREADS_Z)),
    -                               dim3(LOCAL_THREADS_X, LOCAL_THREADS_Y, LOCAL_THREADS_Z),
    -                               0,
    -                               handle.GetStream(),
    -                               srcPtr,
    -                               make_uint3(srcDescPtr->strides.nStride, srcDescPtr->strides.cStride, srcDescPtr->strides.hStride),
    -                               dstPtr,
    -                               make_uint2(dstDescPtr->strides.nStride, dstDescPtr->strides.hStride),
    -                               roiTensorPtrSrc);
    -            hipStreamSynchronize(handle.GetStream());
    -            globalThreads_x = dstDescPtr->w;
    -            hipLaunchKernelGGL(erase_pkd_hip_tensor,
    -                               dim3(ceil((float)globalThreads_x/LOCAL_THREADS_X), ceil((float)globalThreads_y/LOCAL_THREADS_Y), ceil((float)globalThreads_z/LOCAL_THREADS_Z)),
    -                               dim3(LOCAL_THREADS_X, LOCAL_THREADS_Y, LOCAL_THREADS_Z),
    -                               0,
    -                               handle.GetStream(),
    -                               dstPtr,
    -                               make_uint2(dstDescPtr->strides.nStride, dstDescPtr->strides.hStride),
    -                               anchorBoxInfoTensor,
    -                               colorsTensor,
    -                               numBoxesTensor,
    -                               roiTensorPtrSrc);
    -        }
         }
         return RPP_SUCCESS;
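
The size and launch arithmetic in the diff can be checked on the host. The following is a minimal sketch, not RPP code: `batchBytes`, `threadsFor8PixelChunks`, `blocksFor` and `kLocalThreadsX` are hypothetical names, the block size of 16 is an assumed value, and reading `(w + 7) >> 3` as "one thread per 8 pixels" is an inference from the expression rather than something the commit states.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumed block size for illustration only; RPP defines the real LOCAL_THREADS_* elsewhere.
constexpr unsigned kLocalThreadsX = 16;

// Bytes in a batch of n images whose starts are nStride elements of type T apart.
// Mirrors the n * nStride * sizeof(T) expression that replaces the per-type sizeof calls.
template <typename T>
std::size_t batchBytes(std::size_t n, std::size_t nStride)
{
    return n * nStride * sizeof(T);
}

// Threads needed along x when each thread covers 8 pixels: (w + 7) >> 3.
inline unsigned threadsFor8PixelChunks(unsigned width)
{
    return (width + 7u) >> 3;
}

// Integer equivalent of ceil((float)threads / localThreads).
inline unsigned blocksFor(unsigned threads, unsigned localThreads)
{
    assert(localThreads > 0);
    return (threads + localThreads - 1u) / localThreads;
}
```

For a 224-pixel-wide image, `threadsFor8PixelChunks(224)` is 28 and `blocksFor(28, kLocalThreadsX)` is 2. Because the byte count depends on `T`, a single copy before the data-type dispatch covers U8, F16, F32 and I8 alike.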

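
After this change, the NHWC destination path first brings the source into packed layout (a plain copy for NHWC input, a planar-to-packed conversion for NCHW input) and then runs the same packed erase step on the destination. The sketch below is a single-image CPU illustration of that control flow under stated assumptions, not the HIP implementation: `Layout`, `Box`, `toPacked`, `eraseBox` and `eraseToPacked` are hypothetical names, and the half-open box convention is an assumption.

```cpp
#include <cstddef>
#include <vector>

enum class Layout { NCHW, NHWC };  // stand-in for RpptLayout

// Hypothetical box with half-open ranges [x1, x2) and [y1, y2).
struct Box { int x1, y1, x2, y2; };

// Step 1: produce a packed (HWC) image, copying or converting as needed.
template <typename T>
std::vector<T> toPacked(const std::vector<T>& src, Layout srcLayout, int h, int w, int c)
{
    if (srcLayout == Layout::NHWC)
        return src;  // already packed: plain copy
    std::vector<T> dst(src.size());
    for (int ch = 0; ch < c; ++ch)
        for (int y = 0; y < h; ++y)
            for (int x = 0; x < w; ++x)
                dst[(static_cast<std::size_t>(y) * w + x) * c + ch] =
                    src[(static_cast<std::size_t>(ch) * h + y) * w + x];
    return dst;
}

// Step 2: overwrite every channel of every pixel inside the box.
template <typename T>
void eraseBox(std::vector<T>& img, int w, int c, const Box& b, T fill)
{
    for (int y = b.y1; y < b.y2; ++y)
        for (int x = b.x1; x < b.x2; ++x)
            for (int ch = 0; ch < c; ++ch)
                img[(static_cast<std::size_t>(y) * w + x) * c + ch] = fill;
}

// Both source layouts share the same erase step once dst is packed.
template <typename T>
std::vector<T> eraseToPacked(const std::vector<T>& src, Layout srcLayout,
                             int h, int w, int c, const Box& box, T fill)
{
    std::vector<T> dst = toPacked(src, srcLayout, h, w, c);
    eraseBox(dst, w, c, box, fill);
    return dst;
}
```

Sharing the erase step is what lets the commit delete the separate NCHW-to-NHWC branch at the end of the function.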