RPP Tensor Audio Support - MelFilterBank (#332) · ROCm/rpp@30ce1d6

RPP Tensor Audio Support - MelFilterBank (#332)

* Initial commit - Non silent region detection

Includes unittest setup

* Initial commit - To Decibels

Includes unittest setup

* Initial commit - pre_emphasis_filter

* Initial commit - down_mixing

* Initial commit - slice_audio

* Initial commit - mel_filter_bank

* Replace vectors with arrays

* Cleanup

* Minor cleanup

* Optimize downmixing Kernel

Includes cleanup

* Remove unused variables in header file

* Add axes parameter

* Replace Rpp64s with Rpp32s

* Replace vectors with arrays

Includes optimization

* Cleanup

* Optimize and precompute cutOff

* Fix buffer used

* Fix buffer used

* Additional Cleanup

* Fix buffer allocation

Includes minor optimization

* Optimize post increment operation

* Optimize post increment operation

* Optimize post increment operation

* Update testsuite for Audio

* code cleanup

* Add Readme file for Audio test suite

* changes based on review comments

* minor change

* Remove unittest folders and updated README.md

* Remove unit tests

* minor change

* code cleanup

* added common header file for audio helper functions

* removed unnecessary audio wav files

fixed bug in ROI update for audio test suite

resolved issue in summary generation for performance tests in python

* removed log file

* added doxygen support for audio

* added doxygen changes for to_decibels

* updated test suite support for to_decibels

* minor change

* added doxygen changes for preemphasis filter

* updated changes for preemphasis filter in test suite

* removed the usage of getMax function and used std::max_element

* modularized code in test suite

* merge with latest changes

* minor change

* minor change

* minor change

* resolved codacy warnings

* Codacy fix - Remove unused cpuTime

* CMakeLists - Version Update

1.5.0 - TOT Version

* CHANGELOG Updates

Version 1.5.0 placeholder

* resolved issue with file_system dependency in test suite

* Doxygen changes

changed malloc to new in NSR kernel

* RPP RICAP Tensor for HOST and HIP (#213)

* Initial commit - Ricap HOST Tensor

Includes testsuite changes

* Add QA tests for RICAP

Used three_images_224x224_src1 folder to create golden outputs

* Add three_images_224x224_src1 into TEST_IMAGES

* Support HIP Backend for RICAP

* Fix HIP pkd3->pkd3 variant

* regenerated golden outputs for RICAP

minor changes in HOST shell script for handling RICAP in QA mode

* minor bug fix in RICAP HIP kernels

* Improve readability and Cleanup

* Additional cleanup

* Cleanup testsuite

Includes new golden outputs

* Additional testsuite fixes

* Minor cleanup

* Fix codacy warnings

* Address other codacy warnings

* Update ricap.hpp with reference paper

* Add RICAP dataset path in readme

* Make changes to error codes returned

* Modify roi crop region for unit and perf tests

* RPP Tensor Water Augmentation on HOST and HIP (#181)

* added water HOST and HIP codes

* added water case in test suite

* added golden outputs for water

* added omp thread changes for water augmentation

* experimental changes

* fixed output issue with AVX2 instructions

* added AVX2 support for PKD3 load function

minor changes in PLN variant load functions

* nwc commit - added avx2 changes for u8 layout toggle variants but need to add store functions for completion

* Add Avx2 implementation for F32 and U8 toggle variants

* Add AVX2 support for u8 pkd3-pln3 and i8 pkd3-pln3 for water augmentation

* change F32 load and store logic

* optimized the store function for F32 PLN3-PKD3

* reverted back irrelevant changes

* minor change

* optimized load and store functions for water U8 and F32 variants in host

removed commented code

* removed golden outputs for water

* minor changes

* renamed few functions and removed unused functions

updated i8 pln1 load as per the optimized u8 pln1 load

* fixed bug in i8 load function

* changed cast to c++ style

resolved spacing issues and added comments for AVX codes for better understanding

made changes to handle cases where QA Tests are not supported

* added golden outputs for water

* updated golden outputs with latest changes

* modified the u8, i8 pkd3-pln3 function and added comments for the vectorized code

* fixed minor bug in I8 variants

* made changes to resolve codacy warnings

* changed cast to c++ style in hip kernel

* changed generic nn F32 loads using gather and setr instructions

* added comments for latest changes

* minor change

* added definition for storing 32 and 64 bits from a 128bit register

---------

Co-authored-by: sampath1117 <sampath.rachumallu@multicorewareinc.com>
Co-authored-by: HazarathKumarM <hazarathkumar@multicorewareinc.com>

* Fix build error

* CMakeLists - Version Update

1.5.0 - TOT Version

* CHANGELOG Updates

Version 1.5.0 placeholder

* Boost deps fix for test suite

---------

Co-authored-by: Snehaa Giridharan <snehaa@multicorewareinc.com>
Co-authored-by: sampath1117 <sampath.rachumallu@multicorewareinc.com>
Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com>
Co-authored-by: HazarathKumarM <hazarathkumar@multicorewareinc.com>
Co-authored-by: Kiriti Gowda <kiritigowda@gmail.com>

* Documentation - Readme & changelog updates (#251)

* readme and changelog updates for 6.0

* minor update

* added ctests for audio test suite for CI

made changes to add more clarity on the QA Tests results

* Cmake mods for ctest

* HOST-only build error bugfix

* added qa mode parameter to python audio script

added golden output map for QA testing of Non silent region detection

* minor change

* Documentation - Bump rocm-docs-core[api_reference] from 0.26.0 to 0.27.0 in /docs/sphinx (#253)

Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.26.0 to 0.27.0.
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases)
- [Changelog](https://github.com/RadeonOpenCompute/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.26.0...v0.27.0)

---
updated-dependencies:
- dependency-name: rocm-docs-core[api_reference]
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* RPP Resize Mirror Normalize Bugfix (#252)

* added fix for hipMemset

* remove pixel check for U8-F32 and U8-F16 for HOST codes

---------

Co-authored-by: sampath1117 <sampath.rachumallu@multicorewareinc.com>

* added example for MMS calculation in comments for better understanding

* Sphinx - updates (#257)

* Sphinx - updates

* Doxygen - Updates

* Docs - Remove index.md

* updated info used for running audio test suite

* removed bitdepth variable from audio test suite

* added more information on computing NSR outputs in the example added

* Fix doxygen for decibels

Also removes extra QA reference files

* move tensor_host_audio.cpp to host folder

* Fix build errors and qa tests in Audio Test suite

* Fix build errors and qa tests in Audio Test suite

* Add reference output and test samples for downmix

* Add down_mix in augmentation list and supported cases

* move Tensor_host_audio.cpp to host folder

* fix qa mismatches

* move Tensor_host_audio.cpp to host folder

* fix qa mismatches

* Remove auto-merge repeated funcs

* Improve clarity of header docs

* Remove blank line

* Improve clarity on header docs

* Add Doxygen comments

* Add Doxygen comments

* Add Doxygen comments

* minor change

* converted golden outputs to binary file for downmixing

* removed old golden output file for preemphasis and todecibels

* modified info for downmixing as per new changes

used handle memory for temporary buffers

* formatting changes

* Initial commit - Spectrogram

* Add QA .bin reference file

* License - updates to 2024 and consistency changes (#298)

* Match all CMakeLists.txt license as per RPP's outermost LICENSE file

* Match all python files' license as per RPP's outermost LICENSE file

* Match all .hpp files' license as per RPP's outermost LICENSE file

* Match all .cpp files' license as per RPP's outermost LICENSE file

* Match all .h files' license as per RPP's outermost LICENSE file

* Remove all rights reserved as per LICENSE file

* Remove double space in "Copyright (c) 2019 - 2023 Advanced Micro Devices, Inc."

* Match all .cmake files' license as per RPP's outermost LICENSE file

* Match all .cpp.in files' license as per RPP's outermost LICENSE file

* Replace 283 occurrences in 282 files - 2023 to 2024

* Add "MIT License" title to 281 instances

* Add missing license

* Test - Update README.md for test_suite (#299)

* Address internal review comments

* Modify cmakelist

* Fix QA mismatch

* Bump rocm-docs-core[api_reference] from 0.33.0 to 0.33.1 in /docs/sphinx (#301)

Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.33.0 to 0.33.1.
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.33.0...v0.33.1)

---
updated-dependencies:
- dependency-name: rocm-docs-core[api_reference]
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump rocm-docs-core[api_reference] from 0.33.1 to 0.33.2 in /docs/sphinx (#302)

Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.33.1 to 0.33.2.
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.33.1...v0.33.2)

---
updated-dependencies:
- dependency-name: rocm-docs-core[api_reference]
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Update doc codeowners (#303)

* Documentation - Bump rocm-docs-core[api_reference] from 0.33.2 to 0.34.0 in /docs/sphinx (#304)

Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.33.2 to 0.34.0.
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.33.2...v0.34.0)

---
updated-dependencies:
- dependency-name: rocm-docs-core[api_reference]
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Test suite - upgrade 5 qa perf (#305)

* experimental changes for adding qa mode for performance tests

* made changes to display more information w.r.t QA results summary for performance tests

* minor changes

* Add changes to dump qa results to excel file

* Add performance QA for three new tensor functions

* update prerequisites in readme

* added changes to handle unsupported cases

* removed threshold dictionary and added performance Noise threshold
add new dataset for performance QA

* RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293)

* change golden outputs from .csv files to .bin files

* Changed comparison functions to use .bin files

* Address review comments

* minor change

* Address review comments

* minor change

---------

Co-authored-by: HazarathKumarM <hazarathkumar@multicorewareinc.com>

* Changes to the performance summary dataframe

* minor changes

* Update CMakeLists.txt to add ${CMAKE_CURRENT_SOURCE_DIR} for CI

* Update CMakeLists.txt fix

* Update CMakeLists.txt fix

* remove tabulate dependency

* Update README.md to remove tabulate pip install

* Fix for CI machine failure

* Add note on performance

---------

Co-authored-by: sampath1117 <sampath.rachumallu@multicorewareinc.com>
Co-authored-by: HazarathKumarM <hazarathkumar@multicorewareinc.com>
Co-authored-by: Abishek <52214183+r-abishek@users.noreply.github.com>
Co-authored-by: Snehaa Giridharan <snehaa@multicorewareinc.com>
Co-authored-by: r-abishek <abishek@multicorewareinc.com>

* RPP Color Temperature on HOST and HIP (#271)

* Initial commit - Color Temperature HOST Tensor

* Initial commit - Color Temperature HIP Tensor

* Add color temperature golden outputs

* address review comments

* Use reinterpret_cast instead of static_cast

* Combine templated functions to support all datatypes into one
(got minor perf difference of order 3%)

Also fixes indentation

* Fix i8 datatype

* Cleanup

* RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293)

* change golden outputs from .csv files to .bin files

* Changed comparison functions to use .bin files

* Address review comments

* minor change

* Address review comments

* minor change

---------

Co-authored-by: HazarathKumarM <hazarathkumar@multicorewareinc.com>

* Fix PLN3 variant outputs

Also modifies reference outputs

* Update color_temperature.hpp license

* Delete color_temperature_u8_Tensor_PKD3.csv

* Delete color_temperature_u8_Tensor_PLN3.csv

---------

Co-authored-by: snehaa8 <snehaa@multicorewareinc.com>
Co-authored-by: HazarathKumarM <hazarathkumar@multicorewareinc.com>
Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com>

* RPP Voxel 3D Tensor Add/Subtract scalar on HOST and HIP (#272)

* added HOST support for voxel add kernel

* added HIP support for voxel add kernel

* added test suite support for add scalar

* added Doxygen support and modified hip kernel function names as per new standard

* added HOST support for voxel subtract kernel

* added HIP support for voxel subtract kernel

* added test suite support

* updated the golden outputs for subtract with correct values

* removed unnecessary validation checks

* Remove double spaces

* Fix header

* Fix all retval docs

* Fix docs to add memory type

* Fix comment

* Add divider comment

* Use post-increment efficiently

* RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293)

* change golden outputs from .csv files to .bin files

* Changed comparison functions to use .bin files

* Address review comments

* minor change

* Address review comments

* minor change

---------

Co-authored-by: HazarathKumarM <hazarathkumar@multicorewareinc.com>

* converted add and subtract scalar golden outputs to bin files

* changed copyright from 2023 to 2024

* Update add_scalar.hpp license

* Update subtract_scalar.hpp license

---------

Co-authored-by: sampath1117 <sampath.rachumallu@multicorewareinc.com>
Co-authored-by: HazarathKumarM <hazarathkumar@multicorewareinc.com>

* RPP Magnitude on HOST and HIP (#278)

* Initial commit - Magnitude HOST Tensor

* Add QA reference outputs

* Update runTests.py

* Initial commit - Magnitude HIP Tensor

* Add dual input support in testsuite

* Optimize HOST kernel further

* Optimize i8 datatype further

* Modify comments

* RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293)

* change golden outputs from .csv files to .bin files

* Changed comparison functions to use .bin files

* Address review comments

* minor change

* Address review comments

* minor change

---------

Co-authored-by: HazarathKumarM <hazarathkumar@multicorewareinc.com>

* Bump rocm-docs-core[api_reference] from 0.31.0 to 0.33.0 in /docs/sphinx (#294)

Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.31.0 to 0.33.0.
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.31.0...v0.33.0)

---
updated-dependencies:
- dependency-name: rocm-docs-core[api_reference]
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Update Copyright year

* Combine templated functions to support all datatypes

* Modify format of reference outputs

* Update rppi_arithmetic_operations.h license

* Update rppt_tensor_arithmetic_operations.h license

* Update host_tensor_arithmetic_operations.hpp

* Update magnitude.hpp license

* Update hip_tensor_arithmetic_operations.hpp license

* Delete magnitude_u8_Tensor_PKD3.csv

* Delete magnitude_u8_Tensor_PLN1.csv

* Delete magnitude_u8_Tensor_PLN3.csv

* Update rpp_test_suite_common.h license

* Update runTests.py license

* Update Tensor_hip.cpp license

* Update runTests.py license

* Update Tensor_host.cpp license

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: Snehaa Giridharan <snehaa@multicorewareinc.com>
Co-authored-by: HazarathKumarM <hazarathkumar@multicorewareinc.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com>

* Bump rocm-docs-core[api_reference] from 0.34.0 to 0.34.2 in /docs/sphinx (#309)

Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.34.0 to 0.34.2.
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.34.0...v0.34.2)

---
updated-dependencies:
- dependency-name: rocm-docs-core[api_reference]
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* RPP Tensor Audio Support - Down Mixing (#296)

* Initial commit - Non silent region detection

Includes unittest setup

* Initial commit - To Decibels

Includes unittest setup

* Initial commit - pre_emphasis_filter

* Initial commit - down_mixing

* Replace vectors with arrays

* Cleanup

* Minor cleanup

* Optimize downmixing Kernel

Includes cleanup

* Replace Rpp64s with Rpp32s

* Cleanup

* Optimize and precompute cutOff

* Fix buffer used

* Fix buffer used

* Additional Cleanup

* Optimize post increment operation

* Optimize post increment operation

* Update testsuite for Audio

* code cleanup

* Add Readme file for Audio test suite

* changes based on review comments

* minor change

* Remove unittest folders and updated README.md

* Remove unit tests

* minor change

* code cleanup

* added common header file for audio helper functions

* removed unnecessary audio wav files

fixed bug in ROI update for audio test suite

resolved issue in summary generation for performance tests in python

* removed log file

* added doxygen support for audio

* added doxygen changes for to_decibels

* updated test suite support for to_decibels

* minor change

* added doxygen changes for preemphasis filter

* updated changes for preemphasis filter in test suite

* removed the usage of getMax function and used std::max_element

* modularized code in test suite

* merge with latest changes

* minor change

* minor change

* minor change

* resolved codacy warnings

* Codacy fix - Remove unused cpuTime

* CMakeLists - Version Update

1.5.0 - TOT Version

* CHANGELOG Updates

Version 1.5.0 placeholder

* resolved issue with file_system dependency in test suite

* Doxygen changes

changed malloc to new in NSR kernel

* RPP RICAP Tensor for HOST and HIP (#213)

* Initial commit - Ricap HOST Tensor

Includes testsuite changes

* Add QA tests for RICAP

Used three_images_224x224_src1 folder to create golden outputs

* Add three_images_224x224_src1 into TEST_IMAGES

* Support HIP Backend for RICAP

* Fix HIP pkd3->pkd3 variant

* regenerated golden outputs for RICAP

minor changes in HOST shell script for handling RICAP in QA mode

* minor bug fix in RICAP HIP kernels

* Improve readability and Cleanup

* Additional cleanup

* Cleanup testsuite

Includes new golden outputs

* Additional testsuite fixes

* Minor cleanup

* Fix codacy warnings

* Address other codacy warnings

* Update ricap.hpp with reference paper

* Add RICAP dataset path in readme

* Make changes to error codes returned

* Modify roi crop region for unit and perf tests

* RPP Tensor Water Augmentation on HOST and HIP (#181)

* added water HOST and HIP codes

* added water case in test suite

* added golden outputs for water

* added omp thread changes for water augmentation

* experimental changes

* fixed output issue with AVX2 instructions

* added AVX2 support for PKD3 load function

minor changes in PLN variant load functions

* nwc commit - added avx2 changes for u8 layout toggle variants but need to add store functions for completion

* Add Avx2 implementation for F32 and U8 toggle variants

* Add AVX2 support for u8 pkd3-pln3 and i8 pkd3-pln3 for water augmentation

* change F32 load and store logic

* optimized the store function for F32 PLN3-PKD3

* reverted back irrelevant changes

* minor change

* optimized load and store functions for water U8 and F32 variants in host

removed commented code

* removed golden outputs for water

* minor changes

* renamed few functions and removed unused functions

updated i8 pln1 load as per the optimized u8 pln1 load

* fixed bug in i8 load function

* changed cast to c++ style

resolved spacing issues and added comments for AVX codes for better understanding

made changes to handle cases where QA Tests are not supported

* added golden outputs for water

* updated golden outputs with latest changes

* modified the u8, i8 pkd3-pln3 function and added comments for the vectorized code

* fixed minor bug in I8 variants

* made changes to resolve codacy warnings

* changed cast to c++ style in hip kernel

* changed generic nn F32 loads using gather and setr instructions

* added comments for latest changes

* minor change

* added definition for storing 32 and 64 bits from a 128bit register

---------

Co-authored-by: sampath1117 <sampath.rachumallu@multicorewareinc.com>
Co-authored-by: HazarathKumarM <hazarathkumar@multicorewareinc.com>

* Fix build error

* CMakeLists - Version Update

1.5.0 - TOT Version

* CHANGELOG Updates

Version 1.5.0 placeholder

* Boost deps fix for test suite

---------

Co-authored-by: Snehaa Giridharan <snehaa@multicorewareinc.com>
Co-authored-by: sampath1117 <sampath.rachumallu@multicorewareinc.com>
Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com>
Co-authored-by: HazarathKumarM <hazarathkumar@multicorewareinc.com>
Co-authored-by: Kiriti Gowda <kiritigowda@gmail.com>

* Documentation - Readme & changelog updates (#251)

* readme and changelog updates for 6.0

* minor update

* added ctests for audio test suite for CI

made changes to add more clarity on the QA Tests results

* Cmake mods for ctest

* HOST-only build error bugfix

* added qa mode parameter to python audio script

added golden output map for QA testing of Non silent region detection

* minor change

* Documentation - Bump rocm-docs-core[api_reference] from 0.26.0 to 0.27.0 in /docs/sphinx (#253)

Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.26.0 to 0.27.0.
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases)
- [Changelog](https://github.com/RadeonOpenCompute/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.26.0...v0.27.0)

---
updated-dependencies:
- dependency-name: rocm-docs-core[api_reference]
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* RPP Resize Mirror Normalize Bugfix (#252)

* added fix for hipMemset

* remove pixel check for U8-F32 and U8-F16 for HOST codes

---------

Co-authored-by: sampath1117 <sampath.rachumallu@multicorewareinc.com>

* added example for MMS calculation in comments for better understanding

* Sphinx - updates (#257)

* Sphinx - updates

* Doxygen - Updates

* Docs - Remove index.md

* updated info used for running audio test suite

* removed bitdepth variable from audio test suite

* added more information on computing NSR outputs in the example added

* Fix doxygen for decibels

Also removes extra QA reference files

* move tensor_host_audio.cpp to host folder

* Fix build errors and qa tests in Audio Test suite

* Fix build errors and qa tests in Audio Test suite

* Add reference output and test samples for downmix

* Add down_mix in augmentation list and supported cases

* Remove auto-merge repeated funcs

* Improve clarity of header docs

* Remove blank line

* Improve clarity on header docs

* Add Doxygen comments

* minor change

* converted golden outputs to binary file for downmixing

* removed old golden output file for preemphasis and todecibels

* modified info for downmixing as per new changes

used handle memory for temporary buffers

* formatting changes

* moved the common code for SSE and AVX to outside

* Update down_mixing.hpp license

* Update rppt_tensor_audio_augmentations.h

* combined the srcLength and channels tensors into single tensor

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: Snehaa Giridharan <snehaa@multicorewareinc.com>
Co-authored-by: HazarathKumarM <hazarathkumar@multicorewareinc.com>
Co-authored-by: sampath1117 <sampath.rachumallu@multicorewareinc.com>
Co-authored-by: Kiriti Gowda <kiritigowda@gmail.com>
Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com>
Co-authored-by: Lisa <lisajdelaney@gmail.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Sundarrajan98 <sundarrajan@multicorewareinc.com>

* RPP Voxel 3D Tensor Multiply scalar on HOST and HIP (#306)

* added HIP support for voxel scalar multiply kernel

* added HOST support for voxel multiply kernel

added golden outputs for voxel multiply kernel

* merge with master

* RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293)

* change golden outputs from .csv files to .bin files

* Changed comparison functions to use .bin files

* Address review comments

* minor change

* Address review comments

* minor change

---------

Co-authored-by: HazarathKumarM <hazarathkumar@multicorewareinc.com>

* converted multiply scalar voxel golden outputs to bin files

* changed copyright from 2023 to 2024

---------

Co-authored-by: sampath1117 <sampath.rachumallu@multicorewareinc.com>
Co-authored-by: HazarathKumarM <hazarathkumar@multicorewareinc.com>

* Test Suite Bugfix (#307)

* experimental changes for adding qa mode for performance tests

* made changes to display more information w.r.t QA results summary for performance tests

* minor changes

* Add changes to dump qa results to excel file

* Add performance QA for three new tensor functions

* update prerequisites in readme

* added changes to handle unsupported cases

* removed threshold dictionary and added performance Noise threshold
add new dataset for performance QA

* RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293)

* change golden outputs from .csv files to .bin files

* Changed comparison functions to use .bin files

* Address review comments

* minor change

* Address review comments

* minor change

---------

Co-authored-by: HazarathKumarM <hazarathkumar@multicorewareinc.com>

* Changes to the performance summary dataframe

* minor changes

* Update CMakeLists.txt to add ${CMAKE_CURRENT_SOURCE_DIR} for CI

* Update CMakeLists.txt fix

* Update CMakeLists.txt fix

* remove tabulate dependency

* Update README.md to remove tabulate pip install

* Fix for CI machine failure

* Add note on performance

* Fix segmentation fault

* Revert QAmode to restrict HIP bitdepths

* Use Rpp64u for HOST while comparing outputs

* Fix ambiguous abs call

* Fix for SLES CI HIP fail - error: incompatible pointer types assigning to 'unsigned long *' from 'unsigned long long *' - refOutput = TensorSumReferenceOutputs[numChannels].data();

---------

Co-authored-by: sampath1117 <sampath.rachumallu@multicorewareinc.com>
Co-authored-by: HazarathKumarM <hazarathkumar@multicorewareinc.com>
Co-authored-by: Snehaa Giridharan <snehaa@multicorewareinc.com>
Co-authored-by: Pavel Tcherniaev <Pavel.Tcherniaev@amd.com>

* Fix build errors on OCL backend

* Fix spectrogram

Removes slice kernel

* Cleanup

Modify reference outputs

* Merge remote-tracking branch 'origin' into sn/audio_spectrogram_master_merge

* Fix build error in tensor testsuite

* Bump rocm-docs-core[api_reference] from 0.35.0 to 0.35.1 in /docs/sphinx (#319)

Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.35.0 to 0.35.1.
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.35.0...v0.35.1)

---
updated-dependencies:
- dependency-name: rocm-docs-core[api_reference]
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump rocm-docs-core[api_reference] from 0.35.1 to 0.36.0 in /docs/sphinx (#322)

Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.35.1 to 0.36.0.
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.35.1...v0.36.0)

---
updated-dependencies:
- dependency-name: rocm-docs-core[api_reference]
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Docs - Bump rocm-docs-core[api_reference] from 0.36.0 to 0.37.0 in /docs/sphinx (#328)

Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.36.0 to 0.37.0.
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.36.0...v0.37.0)

---
updated-dependencies:
- dependency-name: rocm-docs-core[api_reference]
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Link cleanup (#326)

* link updates

* update tables

* pare down index

* API cleanup

* consistency

* verbiage

* Change to camelCase for variable naming

Also includes cleanup

* Cleanup testsuite for MFB

* Update notes

* Address review comments

* Revert change in runTests.py

* Modified codes to use handle memory

Also fixes reference output file

* Docs - Bump rocm-docs-core[api_reference] from 0.37.0 to 0.37.1 in /docs/sphinx (#329)

Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.37.0 to 0.37.1.
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.37.0...v0.37.1)

---
updated-dependencies:
- dependency-name: rocm-docs-core[api_reference]
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* RPP Voxel Flip on HIP and HOST (#285)

* added support for flip voxel

* added test suite support

* added golden outputs for flip voxel

made changes in test suite to run QA tests for flip

* updated golden outputs with correct values

* minor bug fix in the hip test suite

* made changes to variable names for better readability

fixed comments in test suite

minor cleanup

* combined the flip axis factor as ternary operator in HIP kernel

added new enum for error handling when source and destination layouts are not matching

* RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293)

* change golden outputs from .csv files to .bin files

* Changed comparison functions to use .bin files

* Address review comments

* minor change

* Address review comments

* minor change

---------

Co-authored-by: HazarathKumarM <hazarathkumar@multicorewareinc.com>

* converted flip voxel golden outputs to bin files

* changed copyright from 2023 to 2024

* Update flip_voxel.hpp license

* License - updates to 2024 and consistency changes (#298)

* Match all CMakeLists.txt license as per RPP's outermost LICENSE file

* Match all python files' license as per RPP's outermost LICENSE file

* Match all .hpp files' license as per RPP's outermost LICENSE file

* Match all .cpp files' license as per RPP's outermost LICENSE file

* Match all .h files' license as per RPP's outermost LICENSE file

* Remove all rights reserved as per LICENSE file

* Remove double space in "Copyright (c) 2019 - 2023 Advanced Micro Devices, Inc."

* Match all .cmake files' license as per RPP's outermost LICENSE file

* Match all .cpp.in files' license as per RPP's outermost LICENSE file

* Replace 283 occurrences in 282 files - 2023 to 2024

* Add "MIT License" title to 281 instances

* Add missing license

* Test - Update README.md for test_suite (#299)

* Bump rocm-docs-core[api_reference] from 0.33.0 to 0.33.1 in /docs/sphinx (#301)

Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.33.0 to 0.33.1.
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.33.0...v0.33.1)

---
updated-dependencies:
- dependency-name: rocm-docs-core[api_reference]
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump rocm-docs-core[api_reference] from 0.33.1 to 0.33.2 in /docs/sphinx (#302)

Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.33.1 to 0.33.2.
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.33.1...v0.33.2)

---
updated-dependencies:
- dependency-name: rocm-docs-core[api_reference]
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Update doc codeowners (#303)

* Documentation - Bump rocm-docs-core[api_reference] from 0.33.2 to 0.34.0 in /docs/sphinx (#304)

Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.33.2 to 0.34.0.
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.33.2...v0.34.0)

---
updated-dependencies:
- dependency-name: rocm-docs-core[api_reference]
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Test suite - upgrade 5 qa perf (#305)

* experimental changes for adding qa mode for performance tests

* made changes to display more information w.r.t. QA results summary for performance tests

* minor changes

* Add changes to dump qa results to excel file

* Add performance QA for three new tensor functions

* update prerequisites in readme

* added changes to handle unsupported cases

* removed threshold dictionary and added performance Noise threshold
add new dataset for performance QA

* RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293)

* change golden outputs from .csv files to .bin files

* Changed comparison functions to use .bin files

* Address review comments

* minor change

* Address review comments

* minor change

---------

Co-authored-by: HazarathKumarM <hazarathkumar@multicorewareinc.com>

* Changes to the performance summary dataframe

* minor changes

* Update CMakeLists.txt to add ${CMAKE_CURRENT_SOURCE_DIR} for CI

* Update CMakeLists.txt fix

* Update CMakeLists.txt fix

* remove tabulate dependency

* Update README.md to remove tabulate pip install

* Fix for CI machine failure

* Add note on performance

---------

Co-authored-by: sampath1117 <sampath.rachumallu@multicorewareinc.com>
Co-authored-by: HazarathKumarM <hazarathkumar@multicorewareinc.com>
Co-authored-by: Abishek <52214183+r-abishek@users.noreply.github.com>
Co-authored-by: Snehaa Giridharan <snehaa@multicorewareinc.com>
Co-authored-by: r-abishek <abishek@multicorewareinc.com>

* RPP Color Temperature on HOST and HIP (#271)

* Initial commit - Color Temperature HOST Tensor

* Initial commit - Color Temperature HIP Tensor

* Add color temperature golden outputs

* address review comments

* Use reinterpret_cast instead of static_cast

* Combine templated functions to support all datatypes into one
(got minor perf difference of order 3%)

Also fixes indentation

* Fix i8 datatype

* Cleanup

* RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293)

* change golden outputs from .csv files to .bin files

* Changed comparison functions to use .bin files

* Address review comments

* minor change

* Address review comments

* minor change

---------

Co-authored-by: HazarathKumarM <hazarathkumar@multicorewareinc.com>

* Fix PLN3 variant outputs

Also modifies reference outputs

* Update color_temperature.hpp license

* Delete color_temperature_u8_Tensor_PKD3.csv

* Delete color_temperature_u8_Tensor_PLN3.csv

---------

Co-authored-by: snehaa8 <snehaa@multicorewareinc.com>
Co-authored-by: HazarathKumarM <hazarathkumar@multicorewareinc.com>
Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com>

* RPP Voxel 3D Tensor Add/Subtract scalar on HOST and HIP (#272)

* added HOST support for voxel add kernel

* added HIP support for voxel add kernel

* added test suite support for add scalar

* added Doxygen support and modified hip kernel function names as per new standard

* added HOST support for voxel subtract kernel

* added HIP support for voxel subtract kernel

* added test suite support

* updated the golden outputs for subtract with correct values

* removed unnecessary validation checks

* Remove double spaces

* Fix header

* Fix all retval docs

* Fix docs to add memory type

* Fix comment

* Add divider comment

* Use post-increment efficiently

* RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293)

* change golden outputs from .csv files to .bin files

* Changed comparison functions to use .bin files

* Address review comments

* minor change

* Address review comments

* minor change

---------

Co-authored-by: HazarathKumarM <hazarathkumar@multicorewareinc.com>

* converted add and subtract scalar golden outputs to bin files

* changed copyright from 2023 to 2024

* Update add_scalar.hpp license

* Update subtract_scalar.hpp license

---------

Co-authored-by: sampath1117 <sampath.rachumallu@multicorewareinc.com>
Co-authored-by: HazarathKumarM <hazarathkumar@multicorewareinc.com>

* RPP Magnitude on HOST and HIP (#278)

* Initial commit - Magnitude HOST Tensor

* Add QA reference outputs

* Update runTests.py

* Initial commit - Magnitude HIP Tensor

* Add dual input support in testsuite

* Optimize HOST kernel further

* Optimize i8 datatype further

* Modify comments

* RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293)

* change golden outputs from .csv files to .bin files

* Changed comparison functions to use .bin files

* Address review comments

* minor change

* Address review comments

* minor change

---------

Co-authored-by: HazarathKumarM <hazarathkumar@multicorewareinc.com>

* Bump rocm-docs-core[api_reference] from 0.31.0 to 0.33.0 in /docs/sphinx (#294)

Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.31.0 to 0.33.0.
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.31.0...v0.33.0)

---
updated-dependencies:
- dependency-name: rocm-docs-core[api_reference]
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Update Copyright year

* Combine templated functions to support all datatypes

* Modify format of reference outputs

* Update rppi_arithmetic_operations.h license

* Update rppt_tensor_arithmetic_operations.h license

* Update host_tensor_arithmetic_operations.hpp

* Update magnitude.hpp license

* Update hip_tensor_arithmetic_operations.hpp license

* Delete magnitude_u8_Tensor_PKD3.csv

* Delete magnitude_u8_Tensor_PLN1.csv

* Delete magnitude_u8_Tensor_PLN3.csv

* Update rpp_test_suite_common.h license

* Update runTests.py license

* Update Tensor_hip.cpp license

* Update runTests.py license

* Update Tensor_host.cpp license

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: Snehaa Giridharan <snehaa@multicorewareinc.com>
Co-authored-by: HazarathKumarM <hazarathkumar@multicorewareinc.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com>

* Bump rocm-docs-core[api_reference] from 0.34.0 to 0.34.2 in /docs/sphinx (#309)

Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.34.0 to 0.34.2.
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.34.0...v0.34.2)

---
updated-dependencies:
- dependency-name: rocm-docs-core[api_reference]
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* RPP Tensor Audio Support - Down Mixing (#296)

* Initial commit - Non silent region detection

Includes unittest setup

* Initial commit - To Decibels

Includes unittest setup

* Initial commit - pre_emphasis_filter

* Initial commit - down_mixing

* Replace vectors with arrays

* Cleanup

* Minor cleanup

* Optimize downmixing Kernel

Includes cleanup

* Replace Rpp64s with Rpp32s

* Cleanup

* Optimize and precompute cutOff

* Fix buffer used

* Fix buffer used

* Additional Cleanup

* Optimize post increment operation

* Optimize post increment operation

* Update testsuite for Audio

* code cleanup

* Add Readme file for Audio test suite

* changes based on review comments

* minor change

* Remove unittest folders and updated README.md

* Remove unit tests

* minor change

* code cleanup

* added common header file for audio helper functions

* removed unnecessary audio wav files

fixed bug in ROI updation for audio test suite

resolved issue in summary generation for performance tests in python

* removed log file

* added doxygen support for audio

* added doxygen changes for to_decibels

* updated test suite support for to_decibels

* minor change

* added doxygen changes for preemphasis filter

* updated changes for preemphasis filter in test suite

* removed the usage of getMax function and used std::max_element

* modularized code in test suite

* merge with latest changes

* minor change

* minor change

* minor change

* resolved codacy warnings

* Codacy fix - Remove unused cpuTime

* CMakeLists - Version Update

1.5.0 - TOT Version

* CHANGELOG Updates

Version 1.5.0 placeholder

* resolved issue with file_system dependency in test suite

* Doxygen changes

changed malloc to new in NSR kernel

* RPP RICAP Tensor for HOST and HIP (#213)

* Initial commit - Ricap HOST Tensor

Includes testsuite changes

* Add QA tests for RICAP

Used three_images_224x224_src1 folder to create golden outputs

* Add three_images_224x224_src1 into TEST_IMAGES

* Support HIP Backend for RICAP

* Fix HIP pkd3->pkd3 variant

* regenerated golden outputs for RICAP

minor changes in HOST shell script for handling RICAP in QA mode

* minor bug fix in RICAP HIP kernels

* Improve readability and Cleanup

* Additional cleanup

* Cleanup testsuite

Includes new golden outputs

* Additional testsuite fixes

* Minor cleanup

* Fix codacy warnings

* Address other codacy warnings

* Update ricap.hpp with reference paper

* Add RICAP dataset path in readme

* Make changes to error codes returned

* Modify roi crop region for unit and perf tests

* RPP Tensor Water Augmentation on HOST and HIP (#181)

* added water HOST and HIP codes

* added water case in test suite

* added golden outputs for water

* added omp thread changes for water augmentation

* experimental changes

* fixed output issue with AVX2 instructions

* added AVX2 support for PKD3 load function

minor changes in PLN variant load functions

* nwc commit - added avx2 changes for u8 layout toggle variants but need to add store functions for completion

* Add Avx2 implementation for F32 and U8 toggle variants

* Add AVX2 support for u8 pkd3-pln3 and i8 pkd3-pln3 for water augmentation

* change F32 load and store logic

* optimized the store function for F32 PLN3-PKD3

* reverted back irrelevant changes

* minor change

* optimized load and store functions for water U8 and F32 variants in host

removed commented code

* removed golden outputs for water

* minor changes

* renamed few functions and removed unused functions

updated i8 pln1 load as per the optimized u8 pln1 load

* fixed bug in i8 load function

* changed cast to c++ style

resolved spacing issues and added comments for AVX codes for better understanding

made changes to handle cases where QA Tests are not supported

* added golden outputs for water

* updated golden outputs with latest changes

* modified the u8, i8 pkd3-pln3 function and added comments for the vectorized code

* fixed minor bug in I8 variants

* made changes to resolve codacy warnings

* changed cast to c++ style in hip kernel

* changed generic nn F32 loads using gather and setr instructions

* added comments for latest changes

* minor change

* added definition for storing 32 and 64 bits from a 128bit register

---------

Co-authored-by: sampath1117 <sampath.rachumallu@multicorewareinc.com>
Co-authored-by: HazarathKumarM <hazarathkumar@multicorewareinc.com>

* Fix build error

* CMakeLists - Version Update

1.5.0 - TOT Version

* CHANGELOG Updates

Version 1.5.0 placeholder

* Boost deps fix for test suite

---------

Co-authored-by: Snehaa Giridharan <snehaa@multicorewareinc.com>
Co-authored-by: sampath1117 <sampath.rachumallu@multicorewareinc.com>
Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com>
Co-authored-by: HazarathKumarM <hazarathkumar@multicorewareinc.com>
Co-authored-by: Kiriti Gowda <kiritigowda@gmail.com>

* Documentation - Readme & changelog updates (#251)

* readme and changelog updates for 6.0

* minor update

* added ctests for audio test suite for CI

made changes to add more clarity on the QA Tests results

* Cmake mods for ctest

* HOST-only build error bugfix

* added qa mode parameter to python audio script

added golden output map for QA testing of Non silent region detection

* minor change

* Documentation - Bump rocm-docs-core[api_reference] from 0.26.0 to 0.27.0 in /docs/sphinx (#253)

Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.26.0 to 0.27.0.
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases)
- [Changelog](https://github.com/RadeonOpenCompute/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.26.0...v0.27.0)

---
updated-dependencies:
- dependency-name: rocm-docs-core[api_reference]
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* RPP Resize Mirror Normalize Bugfix (#252)

* added fix for hipMemset

* remove pixel check for U8-F32 and U8-F16 for HOST codes

---------

Co-authored-by: sampath1117 <sampath.rachumallu@multicorewareinc.com>

* added example for MMS calculation in comments for better understanding

* Sphinx - updates (#257)

* Sphinx - updates

* Doxygen - Updates

* Docs - Remove index.md

* updated info used for running audio test suite

* removed bitdepth variable from audio test suite

* added more information on computing NSR outputs in the example added

* Fix doxygen for decibels

Also removes extra QA reference files

* move tensor_host_audio.cpp to host folder

* Fix build errors and qa tests in Audio Test suite

* Fix build errors and qa tests in Audio Test suite

* Add reference output and test samples for downmix

* Add down_mix in augmentation list and supported cases

* Remove auto-merge repeated funcs

* Improve clarity of header docs

* Remove blank line

* Improve clarity on header docs

* Add Doxygen comments

* minor change

* converted golden outputs to binary file for downmixing

* removed old golden output file for preemphasis and todecibels

* modified info for downmixing as per new changes

used handle memory for temporary buffers

* formatting changes

* moved the common code for SSE and AVX to outside

* Update down_mixing.hpp license

* Update rppt_tensor_audio_augmentations.h

* combined the srcLength and channels tensors into single tensor

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: Snehaa Giridharan <snehaa@multicorewareinc.com>
Co-authored-by: HazarathKumarM <hazarathkumar@multicorewareinc.com>
Co-authored-by: sampath1117 <sampath.rachumallu@multicorewareinc.com>
Co-authored-by: Kiriti Gowda <kiritigowda@gmail.com>
Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com>
Co-authored-by: Lisa <lisajdelaney@gmail.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Sundarrajan98 <sundarrajan@multicorewareinc.com>

* RPP Voxel 3D Tensor Multiply scalar on HOST and HIP (#306)

* added HIP support for voxel scalar multiply kernel

* added HOST support for voxel multiply kernel

added golden outputs for voxel multiply kernel

* merge with master

* RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293)

* change golden outputs from .csv files to .bin files

* Changed comparison functions to use .bin files

* Address review comments

* minor change

* Address review comments

* minor change

---------

Co-authored-by: HazarathKumarM <hazarathkumar@multicorewareinc.com>

* converted multiply scalar voxel golden outputs to bin files

* changed copyright from 2023 to 2024

---------

Co-authored-by: sampath1117 <sampath.rachumallu@multicorewareinc.com>
Co-authored-by: HazarathKumarM <hazarathkumar@multicorewareinc.com>

* Test Suite Bugfix (#307)

* experimental changes for adding qa mode for performance tests

* made changes to display more information w.r.t. QA results summary for performance tests

* minor changes

* Add changes to dump qa results to excel file

* Add performance QA for three new tensor functions

* update prerequisites in readme

* added changes to handle unsupported cases

* removed threshold dictionary and added performance Noise threshold
add new dataset for performance QA

* RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293)

* change golden outputs from .csv files to .bin files

* Changed comparison functions to use .bin files

* Address review comments

* minor change

* Address review comments

* minor change

---------

Co-authored-by: HazarathKumarM <hazarathkumar@multicorewareinc.com>

* Changes to the performance summary dataframe

* minor changes

* Update CMakeLists.txt to add ${CMAKE_CURRENT_SOURCE_DIR} for CI

* Update CMakeLists.txt fix

* Update CMakeLists.txt fix

* remove tabulate dependency

* Update README.md to remove tabulate pip install

* Fix for CI machine failure

* Add note on performance

* Fix segmentation fault

* Revert QAmode to restrict HIP bitdepths

* Use Rpp64u for HOST while comparing outputs

* Fix ambiguous abs call

* Fix for SLES CI HIP fail - error: incompatible pointer types assigning to 'unsigned long *' from 'unsigned long long *' - refOutput = TensorSumReferenceOutputs[numChannels].data();

---------

Co-authored-by: sampath1117 <sampath.rachumallu@multicorewareinc.com>
Co-authored-by: HazarathKumarM <hazarathkumar@multicorewareinc.com>
Co-authored-by: Snehaa Giridharan <snehaa@multicorewareinc.com>
Co-authored-by: Pavel Tcherniaev <Pavel.Tcherniaev@amd.com>

* Bump rocm-docs-core[api_reference] from 0.34.2 to 0.35.0 in /docs/sphinx (#313)

Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.34.2 to 0.35.0.
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.34.2...v0.35.0)

---
updated-dependencies:
- dependency-name: rocm-docs-core[api_reference]
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* RPP Reduction - Tensor min and Tensor max on HOST and HIP (#260)

* Minor Change

* Add Validation check for DST_FOLDER path

* added water HOST and HIP codes

* added water case in test suite

* added golden outputs for water

* Add Validation checks for all options in testAllScript.sh

* Add sanity check for dual Input cases
Set Max Dimension and Max Image Dump
Replaced Fast DCT tag with Accurate DCT

* Regenerate golden outputs using accurate dct Flag
Add golden outputs for some new augmentations

* Fix Flip golden outputs mismatch
Fix PLN3 variants mismatch in QA mode

* Add MAX_BATCH_SIZE check
removed Augmentations function calls for failing Qa modes
code cleanup

* Add crop and gamma correction augmentations
code cleanup

* Add comments to functions in rpp_test_suite_common.h

* minor change

* code cleanup

* minor code changes

* Change roi and Image sizes for crop augmentation

* Change numIterations option to numRuns
Addressed PR comments

* added omp thread changes for water augmentation

* experimental changes

* fixed output issue with AVX2 instructions

* added AVX2 support for PKD3 load function

minor changes in PLN variant load functions

* Add turboJpeg header to update maxHeight and maxWidth values

* nwc commit - added avx2 changes for u8 layout toggle variants but need to add store functions for completion

* Change the performance Timings logic

* Add Avx2 implementation for F32 and U8 toggle variants

* minor change to support u8_f16 and u8_f32 cases

* Regenerate LUT golden outputs with ACCURATE_DCT tag

* Minor code changes

* Add AVX2 support for u8 pkd3-pln3 and i8 pkd3-pln3 for water augmentation

* Made changes to the runTests.py in Host to remove testAllScipts.sh

* Made changes to the runTests.py in HIP to remove testAllScipts.sh

* Initial commit - Image min and max Reduction kernel

Includes
* u8 datatype for both min and max HOST Tensor of all variants.
* Testsuite changes.

* NWC -initial code for min max PLN3 - PLN3

* made changes to split min and max kernels separately

* split kernels for min and max

* made changes to print final max/min in the R,G,B channels

* fixed inaccuracies in min/max computation

* made changes to typecast intermediate output to output requested by user

added comments for the code

code cleanup and minor changes in test suite

* fixed build issues

removed image folders used for min, max and sum

reverted unwanted file changes

* minor changes in test suite

* removed support for unwanted test case in Tensor_hip.cpp

* Adds new option roi

* remove testAllScripts.sh

* Adds roi Option in HIP backend

* Implement f32 variants

* Implement f16 and i8 datatype variants

* change F32 load and store logic

* Add build flags in CMakeLists.txt to set AVX/SSE flags based on the system configuration

* minor code changes

* Initial commit - Image sum Reduction kernel

Includes u8 PLN1 -> PLN1 conversion for HOST Tensor

* Implement PKD3 and PLN3 for Image sum Tensor HOST

* Support i8, f16 and f32 datatypes

* Initial commit - Image sum Reduction HIP kernel

Includes u8 PLN1 -> PLN1 conversion for Tensor

* Implement PKD3 and PLN3 for Image sum Tensor HIP

* Add support in testsuite

Revert normalization for i8 HOST Tensor variants

* Fix HIP testsuite

Remove additional blanks for 1 channel output

* Modify print statement in HIP testsuite

* Improve readability for testsuite outputs

* optimized the store function for F32 PLN3-PKD3

* reverted back irrelevant changes

* minor change

* Fix HIP to support larger inputs

* optimized load and store functions for water U8 and F32 variants in host

removed commented code

* Cleanup

* removed golden outputs for water

* minor changes

* Cleanup

Support Reduction QA test in testsuite

* renamed few functions and removed unused functions

updated i8 pln1 load as per the optimized u8 pln1 load

* fixed bug in i8 load function

* Remove unused variables and C style casting

* changed cast to c++ style

resolved spacing issues and added comments for AVX codes for better understanding

made changes to handle cases where QA Tests are not supported

* added golden outputs for water

* updated golden outputs with latest changes

* modified the u8, i8 pkd3-pln3 function and added comments for the vectorized code

* fixed minor bug in I8 variants

* Optimize u8 datatype further

* Fix static_cast

* made changes to resolve codacy warnings

* changed cast to c++ style in hip kernel

* Initial commit - Ricap HOST Tensor

Includes testsuite changes

* Add QA tests for RICAP

Used three_images_224x224_src1 folder to create golden outputs

* Add three_images_224x224_src1 into TEST_IMAGES

* added rotate case with golden outputs

changed generic bilinear HOST codes to match with HIP codes

* Add golden output for remaining all tensor augmentations

* fix python script issues

* Optimize u8 and i8 datatype

Uses uint and int internal processing instead of float

* Fix testsuite build errors

* minor change

* Fix QA check

* Modify api naming from image_sum to tensor_sum

Includes changes for both HOST and HIP

* Support HIP Backend for RICAP

* change rcm and rmn golden outputs

* Fix HIP pkd3->pkd3 variant

* changes based on review comments

* change test_suite folder to tests

* Optimize u8 and i8 datatype of HIP

Includes modification in naming of shared memory

* minor fix

* changed generic nn F32 loads using gather and setr instructions

* Optimize and cleanup U8 HIP

* regenerated golden outputs for RICAP

minor changes in HOST shell script for handling RICAP in QA mode

* minor bug fix in RICAP HIP kernels

* Fix i8 datatype variants

Includes cleanup

* Fix the issues with color_to_greyscale

* remove the empty folder creation

* reverting back the folder name change

* minor change

* added comments for latest changes

* minor change

* Improve readability and Cleanup

* Fix QA for HIP

Includes cleanup

* resolved review comments

* minor change

* Modify api naming from image_ to tensor_ for HOST

* Add support for QA tests

* removed range check for RMN U8-F32 and U8-F16 variants

changed from hipMemset to hipMemsetAsync for RMN HIP Kernel

removed multiplication by 255 for stdDev in RMN HOST U8-F16 and U8-F32 variants

* Modify naming of shared memory with _smem in HIP

Includes cleanup

* Typecast and reuse markArr for HIP U8 and I8

* Cleanup and minor optimization

* minor fix

* fix codacy warnings

* Additional cleanup

* Cleanup and move #define

* Changed the complexity of if statements in runTests.py

* Cleanup testsuite

Includes new golden outputs

* Additional testsuite fixes

* Minor cleanup

* Codacy fixes

* Fix codacy warnings

* Codacy fix

* Address other codacy warnings

* cleanup

* Change Image functions to generic

* Update ricap.hpp with reference paper

* resolved minor issues that happened during merge

* minor changes

* fixed minor issue with getting profiler times

* minor formatting changes

* resolved build issues in test suite

renamed the min and max kernel file names

* RPP RICAP Tensor for HOST and HIP (#213)

* Initial commit - Ricap HOST Tensor

Includes testsuite changes

* Add QA tests for RICAP

Used three_images_224x224_src1 folder to create golden outputs

* Add three_images_224x224_src1 into TEST_IMAGES

* Support HIP Backend for RICAP

* Fix HI…

15 people committed May 8, 2024

1 parent 2decd32 commit 30ce1d6

include/rppdefs.h

@@ -369,10 +369,13 @@ typedef enum
      */
     typedef enum
     {
-        NCHW,
-        NHWC,
-        NCDHW,
-        NDHWC
+        NCHW,   // BatchSize-Channels-Height-Width
+        NHWC,   // BatchSize-Height-Width-Channels
+        NCDHW,  // BatchSize-Channels-Depth-Height-Width
+        NDHWC,  // BatchSize-Depth-Height-Width-Channels
+        NHW,    // BatchSize-Height-Width
+        NFT,    // BatchSize-Frequency-Time -> Frequency Major used for Spectrogram / MelfilterBank
+        NTF     // BatchSize-Time-Frequency -> Time Major used for Spectrogram / MelfilterBank
     } RpptLayout;
     /*! \brief RPPT Tensor 2D ROI type enum
@@ -434,6 +437,15 @@ typedef enum
         TF,      //Time Major
     } RpptSpectrogramLayout;
+    /*! \brief RPPT Mel Scale Formula
+     * \ingroup group_rppdefs
+     */
+    typedef enum
+    {
+        SLANEY = 0,  // Follows Slaney’s MATLAB Auditory Modelling Work behavior
+        HTK,         // Follows O’Shaughnessy’s book formula, consistent with Hidden Markov Toolkit(HTK), m = 2595 * log10(1 + (f/700))
+    } RpptMelScaleFormula;
     /*! \brief RPPT Tensor 2D ROI LTRB struct
      * \ingroup group_rppdefs
      */
@@ Expand Down @@
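
The `HTK` comment above states the formula with a base-10 logarithm, while the `HtkMelScale` struct in `mel_filter_bank.hpp` from this commit uses the natural logarithm with a constant of 1127. The two forms agree because 2595 / ln(10) is approximately 1127. A minimal standalone sketch of that equivalence follows; the function names are illustrative and are not part of RPP.

```cpp
#include <cassert>
#include <cmath>

// HTK mel scale as written in the RpptMelScaleFormula comment (base-10 log).
// Illustrative helper, not an RPP function.
inline float htk_hz_to_mel_log10(float hz)
{
    assert(hz >= 0.0f);
    return 2595.0f * std::log10(1.0f + hz / 700.0f);
}

// Same scale expressed with the natural log, mirroring HtkMelScale::hz_to_mel.
inline float htk_hz_to_mel_ln(float hz)
{
    assert(hz >= 0.0f);
    return 1127.0f * std::log(1.0f + hz / 700.0f);
}

// Inverse of the natural-log form, mirroring HtkMelScale::mel_to_hz.
inline float htk_mel_to_hz(float mel)
{
    return 700.0f * (std::exp(mel / 1127.0f) - 1.0f);
}
```

Both forward functions should return values that differ only by rounding of the constant, and `htk_mel_to_hz(htk_hz_to_mel_ln(f))` should recover `f` up to floating-point error.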

include/rppt_tensor_audio_augmentations.h

@@ Expand Up @@
      */
     RppStatus rppt_spectrogram_host(RppPtr_t srcPtr, RpptDescPtr srcDescPtr, RppPtr_t dstPtr, RpptDescPtr dstDescPtr, Rpp32s *srcLengthTensor, bool centerWindows, bool reflectPadding, Rpp32f *windowFunction, Rpp32s nfft, Rpp32s power, Rpp32s windowLength, Rpp32s windowStep, RpptSpectrogramLayout layout, rppHandle_t rppHandle);
+    /*! \brief Mel filter bank augmentation HOST backend
+     * \details Mel filter bank augmentation for audio data
+     * \param[in] srcPtr source tensor in HOST memory
+     * \param[in] srcDescPtr source tensor descriptor (Restrictions - numDims = 3, offsetInBytes >= 0, dataType = F32, layout - NFT / NTF)
+     * \param[out] dstPtr destination tensor in HOST memory
+     * \param[in] dstDescPtr destination tensor descriptor (Restrictions - numDims = 3, offsetInBytes >= 0, dataType = F32, layout - NFT / NTF)
+     * \param[in] srcDimsTensor source spectrogram dimensions per sample, number of frequency bins followed by number of frames (1D tensor in HOST memory, of size batchSize * 2)
+     * \param[in] maxFreq maximum frequency; if not provided, maxFreq = sampleRate / 2
+     * \param[in] minFreq minimum frequency
+     * \param[in] melFormula formula used to convert frequencies from hertz to mel and from mel to hertz (SLANEY / HTK)
+     * \param[in] numFilter number of mel filters
+     * \param[in] sampleRate sampling rate of the audio
+     * \param[in] normalize boolean that determines whether to normalize the filter weights
+     * \param[in] rppHandle RPP HOST handle created with <tt>\ref rppCreateWithBatchSize()</tt>
+     * \return A <tt> \ref RppStatus</tt> enumeration.
+     * \retval RPP_SUCCESS Successful completion.
+     * \retval RPP_ERROR* Unsuccessful completion.
+     */
+    RppStatus rppt_mel_filter_bank_host(RppPtr_t srcPtr, RpptDescPtr srcDescPtr, RppPtr_t dstPtr, RpptDescPtr dstDescPtr, Rpp32s *srcDimsTensor, Rpp32f maxFreq, Rpp32f minFreq, RpptMelScaleFormula melFormula, Rpp32s numFilter, Rpp32f sampleRate, bool normalize, rppHandle_t rppHandle);
     /*! \brief Resample augmentation on HOST backend
     * \details Resample augmentation for audio data
     * \param[in] srcPtr source tensor in HOST memory
@@ Expand Down @@
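
The `melFormula` parameter selects between the `SLANEY` and `HTK` conversions. The Slaney scale is piecewise: linear below 1000 Hz and logarithmic above it. The sketch below mirrors the constants used by `SlaneyMelScale` in this commit (200/3 Hz per mel in the linear part, a break at 1000 Hz, and a log step of ln(6.4) / 27); the free-function names are illustrative assumptions, not RPP API.

```cpp
#include <cassert>
#include <cmath>

// Constants taken from SlaneyMelScale in this commit; names are illustrative.
const float kHzPerMel = 200.0f / 3.0f;            // slope of the linear region
const float kBreakHz  = 1000.0f;                  // start of the log region
const float kBreakMel = kBreakHz / kHzPerMel;     // 15 mel at the break point
const float kLogStep  = std::log(6.4f) / 27.0f;   // about 0.068751777

// Hz -> mel: linear below the break, logarithmic at or above it.
inline float slaney_hz_to_mel(float hz)
{
    assert(hz >= 0.0f);
    if (hz >= kBreakHz)
        return kBreakMel + std::log(hz / kBreakHz) / kLogStep;
    return hz / kHzPerMel;
}

// mel -> Hz: exact inverse of the function above.
inline float slaney_mel_to_hz(float mel)
{
    if (mel >= kBreakMel)
        return kBreakHz * std::exp(kLogStep * (mel - kBreakMel));
    return mel * kHzPerMel;
}
```

The two branches meet at 1000 Hz (15 mel), so the mapping is continuous, and a round trip through both functions should return the input frequency up to floating-point error.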

src/modules/cpu/host_tensor_audio_augmentations.hpp

@@ -30,6 +30,7 @@ SOFTWARE.
     #include "kernel/pre_emphasis_filter.hpp"
     #include "kernel/down_mixing.hpp"
     #include "kernel/spectrogram.hpp"
+    #include "kernel/mel_filter_bank.hpp"
     #include "kernel/resample.hpp"
     #endif // HOST_TENSOR_AUDIO_AUGMENTATIONS_HPP

src/modules/cpu/kernel/mel_filter_bank.hpp

@@ -0,0 +1,252 @@
+    /*
+    MIT License
+    Copyright (c) 2019 - 2024 Advanced Micro Devices, Inc.
+    Permission is hereby granted, free of charge, to any person obtaining a copy
+    of this software and associated documentation files (the "Software"), to deal
+    in the Software without restriction, including without limitation the rights
+    to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+    copies of the Software, and to permit persons to whom the Software is
+    furnished to do so, subject to the following conditions:
+    The above copyright notice and this permission notice shall be included in all
+    copies or substantial portions of the Software.
+    THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+    IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+    FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+    AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+    LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+    OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+    SOFTWARE.
+    */
+    #include "rppdefs.h"
+    #include "rpp_cpu_simd.hpp"
+    #include "rpp_cpu_common.hpp"
+    struct BaseMelScale
+    {
+        public:
+            virtual Rpp32f hz_to_mel(Rpp32f hz) = 0;
+            virtual Rpp32f mel_to_hz(Rpp32f mel) = 0;
+            virtual ~BaseMelScale() = default;
+    };
+    struct HtkMelScale : public BaseMelScale
+    {
+        Rpp32f hz_to_mel(Rpp32f hz) { return 1127.0f * std::log(1.0f + (hz / 700.0f)); }
+        Rpp32f mel_to_hz(Rpp32f mel) { return 700.0f * (std::exp(mel / 1127.0f) - 1.0f); }
+        public:
+            ~HtkMelScale() {};
+    };
+    struct SlaneyMelScale : public BaseMelScale
+    {
+        const Rpp32f freqLow = 0;
+        const Rpp32f fsp = 200.0 / 3.0;
+        const Rpp32f minLogHz = 1000.0;
+        const Rpp32f minLogMel = (minLogHz - freqLow) / fsp;
+        const Rpp32f stepLog = 0.068751777;  // Equivalent to std::log(6.4) / 27.0;
+        const Rpp32f invMinLogHz = 1.0f / 1000.0;
+        const Rpp32f invStepLog = 1.0f / stepLog;
+        const Rpp32f invFsp = 1.0f / fsp;
+        Rpp32f hz_to_mel(Rpp32f hz)
+        {
+            Rpp32f mel = 0.0f;
+            if (hz >= minLogHz)
+                mel = minLogMel + std::log(hz * invMinLogHz) * invStepLog;
+            else
+                mel = (hz - freqLow) * invFsp;
+            return mel;
+        }
+        Rpp32f mel_to_hz(Rpp32f mel)
+        {
+            Rpp32f hz = 0.0f;
+            if (mel >= minLogMel)
+                hz = minLogHz * std::exp(stepLog * (mel - minLogMel));
+            else
+                hz = freqLow + mel * fsp;
+            return hz;
+        }
+        public:
+            ~SlaneyMelScale() {};
+    };
+    RppStatus mel_filter_bank_host_tensor(Rpp32f *srcPtr,
+                                          RpptDescPtr srcDescPtr,
+                                          Rpp32f *dstPtr,
+                                          RpptDescPtr dstDescPtr,
+                                          Rpp32s *srcDimsTensor,
+                                          Rpp32f maxFreqVal,    // check unused
+                                          Rpp32f minFreqVal,
+                                          RpptMelScaleFormula melFormula,
+                                          Rpp32s numFilter,
+                                          Rpp32f sampleRate,
+                                          bool normalize,
+                                          rpp::Handle& handle)
+    {
+        BaseMelScale *melScalePtr;
+        switch(melFormula)
+        {
+            case RpptMelScaleFormula::HTK:
+                melScalePtr = new HtkMelScale;
+                break;
+            case RpptMelScaleFormula::SLANEY:
+            default:
+                melScalePtr = new SlaneyMelScale();
+                break;
+        }
+        Rpp32u numThreads = handle.GetNumThreads();
+        Rpp32u batchSize = srcDescPtr->n;
+        Rpp32f *scratchMem = handle.GetInitHandle()->mem.mcpu.scratchBufferHost;
+        Rpp32f maxFreq = sampleRate / 2;
+        Rpp32f minFreq = minFreqVal;
+        // Convert lower, higher frequencies to mel scale and find melStep
+        Rpp64f melLow = melScalePtr->hz_to_mel(minFreq);
+        Rpp64f melHigh = melScalePtr->hz_to_mel(maxFreq);
+        Rpp64f melStep = (melHigh - melLow) / (numFilter + 1);
+        omp_set_dynamic(0);
+    #pragma omp parallel for num_threads(numThreads)
+        for(int batchCount = 0; batchCount < batchSize; batchCount++)
+        {
+            Rpp32f *srcPtrTemp = srcPtr + batchCount * srcDescPtr->strides.nStride;
+            Rpp32f *dstPtrTemp = dstPtr + batchCount * dstDescPtr->strides.nStride;
+            // Extract nfft, number of Frames, numBins
+            Rpp32s nfft = (srcDimsTensor[batchCount * 2] - 1) * 2;
+            Rpp32s numBins = nfft / 2 + 1;
+            Rpp32s numFrames = srcDimsTensor[batchCount * 2 + 1];
+            // Find hzStep
+            Rpp64f hzStep = static_cast<Rpp64f>(sampleRate) / nfft;
+            Rpp64f invHzStep = 1.0 / hzStep;
+            // Find fftBinStart and fftBinEnd
+            Rpp32s fftBinStart = std::ceil(minFreq * invHzStep);
+            Rpp32s fftBinEnd = std::ceil(maxFreq * invHzStep);
+            fftBinEnd = std::min(fftBinEnd, numBins);
+            // Set/Fill normFactors, weightsDown and intervals
+            Rpp32f *normFactors = scratchMem + (batchCount * numFilter);
+            std::fill(normFactors, normFactors + numFilter, 1.f);           // normFactors contain numFilter values of type float
+            Rpp32f *weightsDown = scratchMem + (batchSize * numFilter) + (batchCount * numBins);
+            memset(weightsDown, 0, numBins * sizeof(Rpp32f));       // weightsDown contain numBins values of type float
+            Rpp32s *intervals = reinterpret_cast<Rpp32s *>(weightsDown + (batchSize * numBins));
+            std::fill(intervals, intervals + numBins, -1);                  // intervals contain numBins values of type integer
+            Rpp32s fftBin = fftBinStart;
+            Rpp64f mel0 = melLow, mel1 = melLow + melStep;
+            Rpp64f fIter = fftBin * hzStep;
+            for (int interval = 0; interval < numFilter + 1; interval++, mel0 = mel1, mel1 += melStep)
+            {
+                Rpp64f f0 = melScalePtr->mel_to_hz(mel0);
+                Rpp64f f1 = melScalePtr->mel_to_hz(interval == numFilter ? melHigh : mel1);
+                Rpp64f slope = 1. / (f1 - f0);
+                if (normalize && interval < numFilter)
+                {
+                    Rpp64f f2 = melScalePtr->mel_to_hz(mel1 + melStep);
+                    normFactors[interval] = 2.0 / (f2 - f0);
+                }
+                for (; fftBin < fftBinEnd && fIter < f1; fftBin++, fIter = fftBin * hzStep)
+                {
+                    weightsDown[fftBin] = (f1 - fIter) * slope;
+                    intervals[fftBin] = interval;
+                }
+            }
+            Rpp32u maxFrames = std::min(static_cast<Rpp32u>(numFrames + 8), dstDescPtr->strides.hStride);
+            Rpp32u maxAlignedLength = maxFrames & ~7;
+            Rpp32u vectorIncrement = 8;
+            // Set ROI values in dst buffer to 0.0
+            for(int i = 0; i < numFilter; i++)
+            {
+                Rpp32f *dstPtrRow = dstPtrTemp + i * dstDescPtr->strides.hStride;
+                Rpp32u vectorLoopCount = 0;
+                for(; vectorLoopCount < maxAlignedLength; vectorLoopCount += 8)
+                {
+                    _mm256_storeu_ps(dstPtrRow, avx_p0);
+                    dstPtrRow += 8;
+                }
+                for(; vectorLoopCount < maxFrames; vectorLoopCount++)
+                    *dstPtrRow++ = 0.0f;
+            }
+            Rpp32u alignedLength = numFrames & ~7;
+            __m256 pSrc, pDst;
+            Rpp32f *srcRowPtr = srcPtrTemp + fftBinStart * srcDescPtr->strides.hStride;
+            for (int64_t fftBin = fftBinStart; fftBin < fftBinEnd; fftBin++)
+            {
+                auto filterUp = intervals[fftBin];
+                auto weightUp = 1.0f - weightsDown[fftBin];
+                auto filterDown = filterUp - 1;
+                auto weightDown = weightsDown[fftBin];
+                if (filterDown >= 0)
+                {
+                    Rpp32f *dstRowPtrTemp = dstPtrTemp + filterDown * dstDescPtr->strides.hStride;
+                    Rpp32f *srcRowPtrTemp = srcRowPtr;
+                    if (normalize)
+                        weightDown *= normFactors[filterDown];
+                    __m256 pWeightDown = _mm256_set1_ps(weightDown);
+                    int vectorLoopCount = 0;
+                    for(; vectorLoopCount < alignedLength; vectorLoopCount += vectorIncrement)
+                    {
+                        pSrc = _mm256_loadu_ps(srcRowPtrTemp);
+                        pSrc = _mm256_mul_ps(pSrc, pWeightDown);
+                        pDst = _mm256_loadu_ps(dstRowPtrTemp);
+                        pDst = _mm256_add_ps(pDst, pSrc);
+                        _mm256_storeu_ps(dstRowPtrTemp, pDst);
+                        dstRowPtrTemp += vectorIncrement;
+                        srcRowPtrTemp += vectorIncrement;
+                    }
+                    for (; vectorLoopCount < numFrames; vectorLoopCount++)
+                        (*dstRowPtrTemp++) += weightDown * (*srcRowPtrTemp++);
+                }
+                if (filterUp >= 0 && filterUp < numFilter)
+                {
+                    Rpp32f *dstRowPtrTemp = dstPtrTemp + filterUp *  dstDescPtr->strides.hStride;
+                    Rpp32f *srcRowPtrTemp = srcRowPtr;
+                    if (normalize)
+                        weightUp *= normFactors[filterUp];
+                    __m256 pWeightUp = _mm256_set1_ps(weightUp);
+                    int vectorLoopCount = 0;
+                    for(; vectorLoopCount < alignedLength; vectorLoopCount += vectorIncrement)
+                    {
+                        pSrc = _mm256_loadu_ps(srcRowPtrTemp);
+                        pSrc = _mm256_mul_ps(pSrc, pWeightUp);
+                        pDst = _mm256_loadu_ps(dstRowPtrTemp);
+                        pDst = _mm256_add_ps(pDst, pSrc);
+                        _mm256_storeu_ps(dstRowPtrTemp, pDst);
+                        dstRowPtrTemp += vectorIncrement;
+                        srcRowPtrTemp += vectorIncrement;
+                    }
+                    for (; vectorLoopCount < numFrames; vectorLoopCount++)
+                        (*dstRowPtrTemp++) += weightUp * (*srcRowPtrTemp++);
+                }
+                srcRowPtr += srcDescPtr->strides.hStride;
+            }
+        }
+        delete melScalePtr;
+        return RPP_SUCCESS;
+    }
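
The kernel above stores one falling-slope weight and one interval index per FFT bin, then accumulates each bin into at most two adjacent filters. As a reading aid, the same triangular weights can be written out as a dense `numFilter x numBins` matrix. This is a scalar sketch under stated assumptions: `build_mel_weights` is a hypothetical helper, not an RPP function; it uses the HTK scale only; and it takes `maxFreq` as an argument, whereas the kernel in this commit always uses `sampleRate / 2`.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical reference helper (not part of RPP): dense triangular mel weights.
std::vector<std::vector<float>> build_mel_weights(int nfft, float sampleRate,
                                                  float minFreq, float maxFreq,
                                                  int numFilter, bool normalize)
{
    assert(nfft > 0 && numFilter > 0 && maxFreq > minFreq);
    auto hzToMel = [](double hz) { return 1127.0 * std::log(1.0 + hz / 700.0); };
    auto melToHz = [](double mel) { return 700.0 * (std::exp(mel / 1127.0) - 1.0); };

    const int numBins = nfft / 2 + 1;
    const double hzStep = static_cast<double>(sampleRate) / nfft;
    const double melLow = hzToMel(minFreq);
    const double melStep = (hzToMel(maxFreq) - melLow) / (numFilter + 1);

    // numFilter + 2 band edges, equally spaced on the mel axis.
    std::vector<double> edges(numFilter + 2);
    for (int i = 0; i < numFilter + 2; i++)
        edges[i] = melToHz(melLow + i * melStep);

    // Area normalization factor for filter k, as in normFactors[] of the kernel.
    auto norm = [&](int k) { return normalize ? 2.0 / (edges[k + 2] - edges[k]) : 1.0; };

    std::vector<std::vector<float>> weights(numFilter, std::vector<float>(numBins, 0.0f));
    for (int bin = 0; bin < numBins; bin++)
    {
        const double f = bin * hzStep;
        if (f < edges.front() || f >= edges.back())
            continue;                                   // bin lies outside all filters

        // Locate the interval [edges[i], edges[i + 1]) that contains f.
        int interval = 0;
        while (interval < numFilter && f >= edges[interval + 1])
            interval++;

        const double f0 = edges[interval], f1 = edges[interval + 1];
        const double down = (f1 - f) / (f1 - f0);       // falling slope of filter interval - 1
        const double up = 1.0 - down;                   // rising slope of filter interval
        if (interval >= 1)
            weights[interval - 1][bin] = static_cast<float>(down * norm(interval - 1));
        if (interval < numFilter)
            weights[interval][bin] = static_cast<float>(up * norm(interval));
    }
    return weights;
}
```

With `normalize` set to `false`, the rising and falling weights of a bin between two filter centers sum to 1, so each column of the matrix sums to at most 1. Multiplying this matrix by a spectrogram column gives the same kind of result the kernel accumulates row by row with AVX loads and stores.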

src/modules/rppt_tensor_audio_augmentations.cpp

@@ -197,6 +197,46 @@ RppStatus rppt_spectrogram_host(RppPtr_t srcPtr,
         }
     }
+    /******************** mel_filter_bank ********************/
+    RppStatus rppt_mel_filter_bank_host(RppPtr_t srcPtr,
+                                        RpptDescPtr srcDescPtr,
+                                        RppPtr_t dstPtr,
+                                        RpptDescPtr dstDescPtr,
+                                        Rpp32s* srcDimsTensor,
+                                        Rpp32f maxFreq,
+                                        Rpp32f minFreq,
+                                        RpptMelScaleFormula melFormula,
+                                        Rpp32s numFilter,
+                                        Rpp32f sampleRate,
+                                        bool normalize,
+                                        rppHandle_t rppHandle)
+    {
+        if (srcDescPtr->layout != RpptLayout::NFT) return RPP_ERROR_INVALID_SRC_LAYOUT;
+        if (dstDescPtr->layout != RpptLayout::NFT) return RPP_ERROR_INVALID_DST_LAYOUT;
+        if ((srcDescPtr->dataType == RpptDataType::F32) && (dstDescPtr->dataType == RpptDataType::F32))
+        {
+            mel_filter_bank_host_tensor(static_cast<Rpp32f*>(srcPtr),
+                                        srcDescPtr,
+                                        static_cast<Rpp32f*>(dstPtr),
+                                        dstDescPtr,
+                                        srcDimsTensor,
+                                        maxFreq,
+                                        minFreq,
+                                        melFormula,
+                                        numFilter,
+                                        sampleRate,
+                                        normalize,
+                                        rpp::deref(rppHandle));
+            return RPP_SUCCESS;
+        }
+        else
+        {
+            return RPP_ERROR_NOT_IMPLEMENTED;
+        }
+    }
     /******************** resample ********************/
     RppStatus rppt_resample_host(RppPtr_t srcPtr,
@@ Expand Down @@
