AVX2 intrinsic optimization #71

Merged
merged 8 commits into from Feb 21, 2019

@AkilRavi
Contributor

AkilRavi commented Jan 11, 2019

BiPredClipping_AVX2
BiPredClippingOnTheFly_AVX2
invTransform32

@hassount hassount requested a review from ttrigui Jan 11, 2019

@ttrigui

ttrigui left a comment

Code reviewed (removed unused transform SSE code and added a C kernel for BiPredClippingOnTheFly).
Code tested and ready to be merged.
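
For readers unfamiliar with the kernel being discussed, here is a minimal scalar sketch of what a bi-prediction clipping routine typically does. It is not the code from this PR: the function name `bipred_clipping_c`, its parameter list, and the constants are assumptions, based on the default HEVC bi-prediction average for 8-bit output with 14-bit intermediate samples (shift = 15 - 8 = 7, rounding offset = 64).

```c
#include <assert.h>
#include <stdint.h>

/* Assumed constants: 8-bit output, 14-bit intermediate predictions. */
enum { BIPRED_SHIFT = 7, BIPRED_OFFSET = 1 << (BIPRED_SHIFT - 1) };

/* Round, shift, and clamp a summed prediction to the 8-bit sample range.
 * Negative sums are handled before shifting so the code never relies on
 * right-shifting a negative value. */
static uint8_t round_shift_clip_u8(int sum)
{
    if (sum < 0)
        return 0;
    int v = sum >> BIPRED_SHIFT;
    return (uint8_t)(v > 255 ? 255 : v);
}

/* Hypothetical scalar kernel: average two 16-bit prediction blocks
 * (list 0 and list 1) and clip the result to 8 bits.
 * Strides are counted in samples, not bytes. */
static void bipred_clipping_c(const int16_t *list0, const int16_t *list1,
                              int src_stride, uint8_t *dst, int dst_stride,
                              int width, int height)
{
    assert(width > 0 && height > 0);
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            int sum = list0[x] + list1[x] + BIPRED_OFFSET;
            dst[x] = round_shift_clip_u8(sum);
        }
        list0 += src_stride;
        list1 += src_stride;
        dst += dst_stride;
    }
}
```

A scalar version like this is mainly useful as a reference to check a vectorized kernel against: the inner loop is independent per sample, which is what makes it a natural fit for 16-bit SIMD lanes.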

@hassount hassount requested a review from tianjunwork Jan 14, 2019

agopikrishna13 and others added some commits Jan 9, 2019

Add AVX2 intrinsic code for invTransform32
  EstimateInvTransform32x32_AVX2   7.83x    44473.62        348382.25
Add BiPredClipping_AVX2 intrinsic code
HEVC_BiPredClipping width 4     2.23x    259.01          576.49
HEVC_BiPredClipping width 8     8.55x    247.48          2116.88
HEVC_BiPredClipping width 16     13.71x   608.02          8337.47
HEVC_BiPredClipping width 32     31.22x   2011.41         62787.17
HEVC_BiPredClipping width 64     65.04x   8295.15         539476.88
+ remove unused transform SSE code
+ add c kernel for BiPredClippingOnTheFly

@AkilRavi AkilRavi force-pushed the AkilRavi:master branch from 1544158 to 855a5b2 Jan 21, 2019

@tianjunwork

Contributor

tianjunwork commented Jan 23, 2019

Hi @ttrigui, I am not familiar enough with ASM to do a code review. Could you let me know how I can help get this patch merged? Thanks.

add avx2 code for transform8x8
hevc_fwd_txfm8     4.22x    1220.40         5150.50

@AkilRavi

Contributor Author

AkilRavi commented Jan 23, 2019

Added AVX2 intrinsic code for:

  • EncodeQuantizedCoefficients_SSE2
  • EstimateQuantizedCoefficients_Lossy_SSE2
  • Transform8x8_SSE4_1_INTRIN

@AkilRavi

Contributor Author

AkilRavi commented Jan 25, 2019

Fixed Linux build error.

@tianjunwork tianjunwork merged commit fa0702c into OpenVisualCloud:master Feb 21, 2019

tianjunwork added a commit that referenced this pull request Feb 21, 2019
