SSE optimization for flip and rotate #10

Closed
wants to merge 4 commits into from

Conversation

rrawther
Contributor

Added AVX SIMD optimization for flip.
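For readers unfamiliar with the technique, here is a minimal sketch of what a SIMD horizontal flip of one image row can look like. It is not the code from this PR: the function name `flip_row_f32` is hypothetical, it assumes single-channel `float` pixels and non-overlapping buffers, and it uses 128-bit SSE (4 pixels per step) rather than AVX. A scalar loop handles the tail and any target without SSE.

```cpp
#include <cassert>
#include <cstddef>
#if defined(__SSE__)
#include <xmmintrin.h>
#endif

// Hypothetical helper (not taken from the RPP sources): mirror one row of
// single-channel float pixels left-to-right. src and dst must not overlap.
void flip_row_f32(const float* src, float* dst, std::size_t width) {
    assert(src != dst);
    std::size_t i = 0;
#if defined(__SSE__)
    // Load 4 pixels, reverse their lane order, and store them at the
    // mirrored position in the destination row.
    for (; i + 4 <= width; i += 4) {
        __m128 v = _mm_loadu_ps(src + i);
        v = _mm_shuffle_ps(v, v, _MM_SHUFFLE(0, 1, 2, 3));
        _mm_storeu_ps(dst + (width - 4 - i), v);
    }
#endif
    // Scalar tail (and full fallback when SSE is unavailable).
    for (; i < width; ++i) {
        dst[width - 1 - i] = src[i];
    }
}
```

An AVX variant would follow the same load, reverse, mirrored-store pattern on 8 floats at a time; packed 3-channel data needs a different shuffle so that the channels of each pixel stay together.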

@rrawther rrawther changed the title Rr/simd adv SSE optimization for flip Nov 15, 2019
@rrawther rrawther changed the title SSE optimization for flip SSE optimization for flip and rotate Nov 15, 2019
@LakshmiKumar23
Contributor

@Reza-Najafi, waiting for @rrawther to fix the conflicts.

@Reza-Najafi

Reza-Najafi commented Nov 15, 2019 via email

LokeshBonta pushed a commit to LokeshBonta/rpp that referenced this pull request Aug 6, 2020
* Fix Bug in ColorTwist (ROCm#6)

* Fix Bug in ColorTwist (ROCm#6) (ROCm#8) (ROCm#9)

* Update
kiritigowda pushed a commit that referenced this pull request Aug 25, 2020
* Changed Channel extract and channel combine function call

* updated erode dilate kernels [OCL]

* Non Working [FULLY BUILD] code for min_max_loc and mean_stddev

* Updated Rain GPU kernel for multiple destination image calls [OCL]

* Updated Median, Non Max and Histogram and added support for mean

* Updated tensor [OCL]

* updated table lookup [OCL]

* small updates in mean and stddev [OCL]

* Full functioning code for mean and standard deviation [OCL]

* Added Support to Min Max Location [OCL]

* Added support for gaussian_image_pyramid [OCL]

* Added support for laplacian_image_pyramid [OCL]

* small modification in LIP [OCL]

* small modification in Min Max Location and Mean stddev [OCL]

* box filter hisEq [OCL]

* Added support for gaussian filter

* Added support for bin in Histogram [OCL]

* updated sobel [OCL]

* Update in Temperature [CPU]

* FIX SNP CPU half noise issue [OCL]

* fin small change in Absolute difference [OCL]

* Small changes in Custom convolution and table lookup [OCL]

* Fix regressions due to scripting [cl & CPU].

* fix histogram [OCL]

* Updated snow [OCL]

* updated snow [CPU]

* small update in Snow [OCL]

* Modify filter_operations to add gaussian_filter with same backend as blur

* Fix issue with rain Grey Scale [OCL]

* Fix Rain GPU Transparency [OCL]

* Add Kernel Caching using Map/Kernelmanger

* Resolved histogram grayscale issue in GPU

* Resolved histogram grayscale issue in GPU

* Fix the bug in warp affine planar call

* Fix issue with resize crop validation [cl & CPU].

* Cl_enque_buffer, the argument is set to CL_FALSE

* Fix Gamma correction [OCL]

* minor changes to gamma_correction, vignette commons, flip functionalities

* Fix Jitter with new Implementation [CPU & OCL]

* Modify brightness bug that gave patches in output

* Fix the bug with Lens correction [OCL].

* Fix Median filter issue

* Modify rotate to match GPU functionality

* Fix median filter

* merge abi-dev-host-ms4 to main-hipcl-dev

* Fix a round about fix for Hue and Saturation Shift

* Modify scale to match GPU functionality

* Fix a round about fix for Hue and Saturation Shift

* Fix syntax error in hsvkernel

* changed CL_False to CL_True in minmax location

* Resolve merg

* Modify warp affine to match GPU - inversion exists

* Add validation for Warp Affine Matrix

* changes in Warp Affine

* Add Blocking calls [CL_TRUE flag is on]

* Removed validation printf statements in the library

* Removed a syntax error

* Add extra validation for contrast

* Fix issue with rain [CPU]

* Added support to new Pixelate [OCL & CPU]

* Fix issue with Fish eye [OCL]

* Modify Histogram Implementation

* Histogram Balance Fix

* Update Readme.md

Amended the list

* Update Readme.md

* Fix Histogram Planar Version

* Add new support to Histogram [OCL]

* Remove all files to include batch version

* Move Mem-Mgmt_HIP branch files to master

* Update Readme.md

* Put all the recent changes or RPP here

* Fix Border issues in crop mirror normalize and crop

* Fix Crop mirror normalize border issue

* Add RPP UnitTests

* Add f32 support for crop_mirror_normalize

* Add f32 support for crop

* Add f32 support for resize_crop_mirror

* Add f32 support for resize and resize_crop

* Add f32 support for color_twist

* Correct blur

* Add f32 support for rotate

* Add f16 host support for rotate, resize, resize_crop, crop, resize_crop_mirror, crop_mirror_normalize, color_twist

* Major changes to host test suite

* Separate host test suites for pkd3 and pln1

* modify rpp_unittests host

* correct additional folder creation and readme

* Minor correction in pln1/pkd3 host test scripts

* Add basic float tensor support

* Add FP32 and FP16 support for Crop function

* Fix bug in crop

* crop mirror normalize report

* Float Support for Rotate GPU

* Add Kernel Support in OCL for colorTwist and resize functionalities

* Add float support for ColorTwist and Resize Crop Mirror - FP16 and FP32

* Code Refactoring and Rotate Support for FP16 and FP32

* Fix Rotate Float issue

* Fix FP32 Rotate Issue

* Add Resize Function

* Add Resize Crop Mirror in GPU OCL

* Fix Typo

* Add Resize Crop GPU FP16 and FP32 support

* Update rppdefs.h

* Crop Mirror Normalize Support is added

* Support for ColorTwist in Float space

* Update Colortwwist.cl - temp

* Remove MIOPEN dependency in RPP build set-up

* Update colortwist.cl

* Fix Bug in ColorTwist

* Fix Bug in ColorTwist (#6)

* API refactoring for fused_functions

* Fix make_data-type bug and code formatting

* Testsuite for Float Support Functions

* Removed the brace in switchcase

* Add free statements for unreleased memory and f16 fix for colortwist

* rename folders

* Fix Resize for U8 case

* minor change in BatchPD host

* Fix type error in resize.cl

* Fix float errors for resize functions

* format file

* Fix Bug in ColorTwist (#6) (#8)

* Fix Bug in ColorTwist (#6) (#8) (#9)

* Update

* update (#10)

* Fix Bug in ColorTwist (#6)

* Fix Bug in ColorTwist (#6) (#8) (#9)

* Update

* Format files

* New Changes (#11)

* Fix Bug in ColorTwist (#6)

* Fix Bug in ColorTwist (#6) (#8) (#9)

* Update

* Format files

* Correct f16 color twist host bug

* Change test suite to input 0-1 normalized values for all f16/f32 functionalities

* Refactor API code for geometry_transforms

* Added Testsuite for Float Functions in OCL

* AMD Docs

* Create install.rst

* Update index.rst

* Add host support for u8->f16 and u8->f32 for resize, crop, crop_mirror_normalize

* Add host support in test suite for u8->f16 and u8->f32 for resize, crop, crop_mirror_normalize

* Add host support for i8 in resize, crop, cmn, rotate, resize_crop, resize_crop_mirror and color_twist

* Add host test suite support for i8

* Add host support for u8->i8 in crop, resize, crop_mirror_normalize

* modify test suite

* Add host plan1 test suite to SOW3_HOST

* crop mirror normalize full support in w.r.t type change and layout change

* Add API calls for CMN function for new set of variations

* Fix bug with respect to I8

* change type info in kernels

* Fix cmn bug

* Support I8 for Rotate

* Int 8 support for colortwist and code refactoring

* Add int8 support for resize crop mirror function

* resize crop mirror int8 support is added

* Crop various variations are added

* Add crop support for all the conversions

* Add host support for resize outputFormatToggle

* Add host support for crop outputFormatToggle

* Add host support for rotate outputFormatToggle

* Add host support for resize_crop outputFormatToggle

* Add host support for resize_crop_mirror outputFormatToggle

* Add host support for crop_mirror_normalize outputFormatToggle

* Add host support for color_twist outputFormatToggle and all other pln->pkd support

* Add missing pln3 API for crop host

* Major modifications in test suite and ReadMe for pkd3, pln3 and pln1 inputs for host

* Modify resize kernel

* Add outputtoggle in the API and functions

* Add new changes to all the fused function w.r.t to outputFormatToggle

* Add pln3 api for Crop on GPU

* add missing API for resize cro

* Fix compilation bugs

* Remove unnecessary functions  and fix build bug

* Add ocl testing framework

* Fix bug in rotate helper

* Minor temp changes in test code to accommodate PKD3 input U8 cases with toggle format

* Correct resize_u8_i8_pkd

* Fix resize kernel issues for output toggle change

* colortwist bug fix

* Fix colortwist bug

* resize tensor fix

* Minor mods to both pln3 and pkd3 test suite to accommodate CMN's ability to do U8 format toggles

* Corrections in PLN3 input functions for host

* Fix bugs in Fused function new code

* Add changes related to planar format in padded

* Fix issues with pln3 colortwist

* Fix issue with test suite

* Add pln3 testing and fix issues

* Modify a few things in test script

* Fix pln3 issue for FP16 for Rotate

* Fix index issues with Test suite

* Add output layout toggle for host API

* Fix pln3 issues in test suite
Fix pln1 issues in test suite
Fix other minor bugs

* Change parameter order in resize pd pln host

* remove print statements

* Update README.MD

* Codacy issues corrections in utilities/rpp-unittests

* Codacy issues corrections for resize kernel

* Codacy issues corrections in utilities/rpp-unittests OCL/HIP

* Codacy issues corrections in utilities/rpp-unittests

* Codacy issues corrections in utilities/rpp-unittests

* Fix some Codacy issues

* Remove some Codacy issues in rpp unittests

* Remove a few Codacy issues

* Remove Print statements

Co-authored-by: Muthukumaravel <muthukumaravel@multicorewareinc.com>
Co-authored-by: shobana-mcw <shobana@multicorewareinc.com>
Co-authored-by: r-abishekmcw <abishek@multicorewareinc.com>
Co-authored-by: LokeshBonta <you@example.com>
Co-authored-by: Reza <Seyedreza.Najafi@amd.com>
Co-authored-by: Swetha B S <swetha@multicorewareinc.com>
kiritigowda pushed a commit that referenced this pull request Oct 29, 2020
* Modify phase for visualization

* Pre-MS4 optimizations on arithmetic_operations

* Pre-MS4 optimizations on arithmetic_operations

* Pre-MS4 optimizations on morphological_transforms

* Added support for table lookup [OCL]

* Fix issues with pixelate greyscale.

* Pre-MS4 optimizations on color_model_conversions

* Modify sobel_filter functionality to match GPU impl.

* mean and stddev base function [OCL]

* Changed Channel extract and channel combine function call

* updated erode dilate kernels [OCL]

* Non Working [FULLY BUILD] code for min_max_loc and mean_stddev

* Updated Rain GPU kernel for multiple destination image calls [OCL]

* Updated Median, Non Max and Histogram and added support for mean

* Updated tensor [OCL]

* updated table lookup [OCL]

* small updates in mean and stddev [OCL]

* Full functioning code for mean and standard deviation [OCL]

* Added Support to Min Max Location [OCL]

* Added support for gaussian_image_pyramid [OCL]

* Added support for laplacian_image_pyramid [OCL]

* small modification in LIP [OCL]

* small modification in Min Max Location and Mean stddev [OCL]

* box filter hisEq [OCL]

* Added support for gaussian filter

* Added support for bin in Histogram [OCL]

* updated sobel [OCL]

* Update in Temperature [CPU]

* FIX SNP CPU half noise issue [OCL]

* fin small change in Absolute difference [OCL]

* Small changes in Custom convolution and table lookup [OCL]

* Fix regressions due to scripting [cl & CPU].

* fix histogram [OCL]

* Updated snow [OCL]

* updated snow [CPU]

* small update in Snow [OCL]

* Modify filter_operations to add gaussian_filter with same backend as blur

* Fix issue with rain Grey Scale [OCL]

* Fix Rain GPU Transparency [OCL]

* Add Kernel Caching using Map/Kernelmanger

* Resolved histogram grayscale issue in GPU

* Resolved histogram grayscale issue in GPU

* Fix the bug in warp affine planar call

* Fix issue with resize crop validation [cl & CPU].

* Cl_enque_buffer, the argument is set to CL_FALSE

* Fix Gamma correction [OCL]

* minor changes to gamma_correction, vignette commons, flip functionalities

* Fix Jitter with new Implementation [CPU & OCL]

* Modify brightness bug that gave patches in output

* Fix the bug with Lens correction [OCL].

* Fix Median filter issue

* Modify rotate to match GPU functionality

* Fix median filter

* merge abi-dev-host-ms4 to main-hipcl-dev

* Fix a round about fix for Hue and Saturation Shift

* Modify scale to match GPU functionality

* Fix a round about fix for Hue and Saturation Shift

* Fix syntax error in hsvkernel

* changed CL_False to CL_True in minmax location

* Resolve merg

* Modify warp affine to match GPU - inversion exists

* Add validation for Warp Affine Matrix

* changes in Warp Affine

* Add Blocking calls [CL_TRUE flag is on]

* Removed validation printf statements in the library

* Removed a syntax error

* Add extra validation for contrast

* Fix issue with rain [CPU]

* Added support to new Pixelate [OCL & CPU]

* Fix issue with Fish eye [OCL]

* Modify Histogram Implementation

* Histogram Balance Fix

* Update Readme.md

Amended the list

* Update Readme.md

* Fix Histogram Planar Version

* Add new support to Histogram [OCL]

* Remove all files to include batch version

* Move Mem-Mgmt_HIP branch files to master

* Update Readme.md

* Put all the recent changes or RPP here

* Fix Border issues in crop mirror normalize and crop

* Fix Crop mirror normalize border issue

* Add RPP UnitTests

* Add f32 support for crop_mirror_normalize

* Add f32 support for crop

* Add f32 support for resize_crop_mirror

* Add f32 support for resize and resize_crop

* Add f32 support for color_twist

* Correct blur

* Add f32 support for rotate

* Add f16 host support for rotate, resize, resize_crop, crop, resize_crop_mirror, crop_mirror_normalize, color_twist

* Major changes to host test suite

* Separate host test suites for pkd3 and pln1

* modify rpp_unittests host

* correct additional folder creation and readme

* Minor correction in pln1/pkd3 host test scripts

* Add basic float tensor support

* Add FP32 and FP16 support for Crop function

* Fix bug in crop

* crop mirror normalize report

* Float Support for Rotate GPU

* Add Kernel Support in OCL for colorTwist and resize functionalities

* Add float support for ColorTwist and Resize Crop Mirror - FP16 and FP32

* Code Refactoring and Rotate Support for FP16 and FP32

* Fix Rotate Float issue

* Fix FP32 Rotate Issue

* Add Resize Function

* Add Resize Crop Mirror in GPU OCL

* Fix Typo

* Add Resize Crop GPU FP16 and FP32 support

* Update rppdefs.h

* Crop Mirror Normalize Support is added

* Support for ColorTwist in Float space

* Update Colortwwist.cl - temp

* Update colortwist.cl

* Fix Bug in ColorTwist

* Fix Bug in ColorTwist (#6)

* API refactoring for fused_functions

* Fix make_data-type bug and code formatting

* Testsuite for Float Support Functions

* Removed the brace in switchcase

* Add free statements for unreleased memory and f16 fix for colortwist

* rename folders

* Fix Resize for U8 case

* minor change in BatchPD host

* Fix type error in resize.cl

* Fix float errors for resize functions

* format file

* Fix Bug in ColorTwist (#6) (#8)

* Fix Bug in ColorTwist (#6) (#8) (#9)

* Update

* update (#10)

* Fix Bug in ColorTwist (#6)

* Fix Bug in ColorTwist (#6) (#8) (#9)

* Update

* Format files

* New Changes (#11)

* Fix Bug in ColorTwist (#6)

* Fix Bug in ColorTwist (#6) (#8) (#9)

* Update

* Format files

* Correct f16 color twist host bug

* Change test suite to input 0-1 normalized values for all f16/f32 functionalities

* Refactor API code for geometry_transforms

* Added Testsuite for Float Functions in OCL

* Add host support for u8->f16 and u8->f32 for resize, crop, crop_mirror_normalize

* Add host support in test suite for u8->f16 and u8->f32 for resize, crop, crop_mirror_normalize

* Add host support for i8 in resize, crop, cmn, rotate, resize_crop, resize_crop_mirror and color_twist

* Add host test suite support for i8

* Add host support for u8->i8 in crop, resize, crop_mirror_normalize

* modify test suite

* Add host plan1 test suite to SOW3_HOST

* crop mirror normalize full support in w.r.t type change and layout change

* Add API calls for CMN function for new set of variations

* Fix bug with respect to I8

* change type info in kernels

* Fix cmn bug

* Support I8 for Rotate

* Int 8 support for colortwist and code refactoring

* Add int8 support for resize crop mirror function

* resize crop mirror int8 support is added

* Crop various variations are added

* Add crop support for all the conversions

* Add host support for resize outputFormatToggle

* Add host support for crop outputFormatToggle

* Add host support for rotate outputFormatToggle

* Add host support for resize_crop outputFormatToggle

* Add host support for resize_crop_mirror outputFormatToggle

* Add host support for crop_mirror_normalize outputFormatToggle

* Add host support for color_twist outputFormatToggle and all other pln->pkd support

* Add missing pln3 API for crop host

* Major modifications in test suite and ReadMe for pkd3, pln3 and pln1 inputs for host

* Modify resize kernel

* Add outputtoggle in the API and functions

* Add new changes to all the fused function w.r.t to outputFormatToggle

* Add pln3 api for Crop on GPU

* add missing API for resize cro

* Fix compilation bugs

* Remove unnecessary functions  and fix build bug

* Add ocl testing framework

* Fix bug in rotate helper

* Minor temp changes in test code to accommodate PKD3 input U8 cases with toggle format

* Correct resize_u8_i8_pkd

* Fix resize kernel issues for output toggle change

* colortwist bug fix

* Fix colortwist bug

* resize tensor fix

* Minor mods to both pln3 and pkd3 test suite to accommodate CMN's ability to do U8 format toggles

* Corrections in PLN3 input functions for host

* Fix bugs in Fused function new code

* Add changes related to planar format in padded

* Fix issues with pln3 colortwist

* Fix issue with test suite

* Add pln3 testing and fix issues

* Modify a few things in test script

* Fix pln3 issue for FP16 for Rotate

* Fix index issues with Test suite

* Add output layout toggle for host API

* Fix pln3 issues in test suite
Fix pln1 issues in test suite
Fix other minor bugs

* Change parameter order in resize pd pln host

* remove print statements

* Add unittest

* Fix HIP backend issues

* able to build hip

* Changed cmakelists for linking issues

* Change include hip/hip_hcc.h to hip/hip_ext.h to avoid warning

Co-authored-by: Muthukumaravel <muthukumaravel@multicorewareinc.com>
Co-authored-by: shobana-mcw <shobana@multicorewareinc.com>
Co-authored-by: LokeshBonta <you@example.com>
Co-authored-by: Reza <Seyedreza.Najafi@amd.com>
Co-authored-by: LokeshBonta <lokeshpsn93@gmail.com>
Co-authored-by: Lokesh Bonta <lokeswara@multicorewareinc.com>
Co-authored-by: Swetha B S <swetha@multicorewareinc.com>
@rrawther rrawther deleted the rr/simd_adv branch April 27, 2021 17:22

3 participants