Alf filter avx2 #42

Merged
merged 3 commits into from Mar 4, 2023

Commits on Feb 10, 2023

  1. a1b02cf

Commits on Mar 4, 2023

  1. vvcdec: alf, add avx2 for luma and chroma filter

    Decoding is 11% to 26% faster on 1080p and 4K video.
    
    clip                                        before      after   delta
    RitualDance_1920x1080_60_10_420_32_LD.26        35          43    22.8%
    RitualDance_1920x1080_60_10_420_37_RA.266       43          48    11.6%
    Tango2_3840x2160_60_10_420_27_LD.266            7.9         10    26.5%
    nuomi2021 committed Mar 4, 2023 (e7d3ef9)
  2. checkasm: add support for vvc alf

    checkasm: all 128 tests passed
    vvc_alf_filter_chroma_4x4_10_c: 657.0
    vvc_alf_filter_chroma_4x4_10_avx2: 138.0
    vvc_alf_filter_chroma_4x8_10_c: 1264.7
    vvc_alf_filter_chroma_4x8_10_avx2: 253.5
    vvc_alf_filter_chroma_4x12_10_c: 1841.7
    vvc_alf_filter_chroma_4x12_10_avx2: 375.5
    vvc_alf_filter_chroma_4x16_10_c: 2442.7
    vvc_alf_filter_chroma_4x16_10_avx2: 491.7
    vvc_alf_filter_chroma_4x20_10_c: 3057.0
    vvc_alf_filter_chroma_4x20_10_avx2: 607.2
    vvc_alf_filter_chroma_4x24_10_c: 3667.0
    vvc_alf_filter_chroma_4x24_10_avx2: 747.5
    vvc_alf_filter_chroma_4x28_10_c: 4286.7
    vvc_alf_filter_chroma_4x28_10_avx2: 849.0
    vvc_alf_filter_chroma_4x32_10_c: 4886.0
    vvc_alf_filter_chroma_4x32_10_avx2: 967.5
    vvc_alf_filter_chroma_8x4_10_c: 1250.5
    vvc_alf_filter_chroma_8x4_10_avx2: 261.0
    vvc_alf_filter_chroma_8x8_10_c: 2430.7
    vvc_alf_filter_chroma_8x8_10_avx2: 494.7
    vvc_alf_filter_chroma_8x12_10_c: 3631.2
    vvc_alf_filter_chroma_8x12_10_avx2: 734.5
    vvc_alf_filter_chroma_8x16_10_c: 13675.7
    vvc_alf_filter_chroma_8x16_10_avx2: 972.0
    vvc_alf_filter_chroma_8x20_10_c: 6212.0
    vvc_alf_filter_chroma_8x20_10_avx2: 1211.0
    vvc_alf_filter_chroma_8x24_10_c: 7440.7
    vvc_alf_filter_chroma_8x24_10_avx2: 1447.0
    vvc_alf_filter_chroma_8x28_10_c: 8460.5
    vvc_alf_filter_chroma_8x28_10_avx2: 1682.5
    vvc_alf_filter_chroma_8x32_10_c: 9665.2
    vvc_alf_filter_chroma_8x32_10_avx2: 1917.7
    vvc_alf_filter_chroma_12x4_10_c: 1865.2
    vvc_alf_filter_chroma_12x4_10_avx2: 391.7
    vvc_alf_filter_chroma_12x8_10_c: 3625.2
    vvc_alf_filter_chroma_12x8_10_avx2: 739.0
    vvc_alf_filter_chroma_12x12_10_c: 5427.5
    vvc_alf_filter_chroma_12x12_10_avx2: 1094.2
    vvc_alf_filter_chroma_12x16_10_c: 7237.7
    vvc_alf_filter_chroma_12x16_10_avx2: 1447.2
    vvc_alf_filter_chroma_12x20_10_c: 9035.2
    vvc_alf_filter_chroma_12x20_10_avx2: 1805.2
    vvc_alf_filter_chroma_12x24_10_c: 11135.7
    vvc_alf_filter_chroma_12x24_10_avx2: 2158.2
    vvc_alf_filter_chroma_12x28_10_c: 12644.0
    vvc_alf_filter_chroma_12x28_10_avx2: 2511.2
    vvc_alf_filter_chroma_12x32_10_c: 14441.7
    vvc_alf_filter_chroma_12x32_10_avx2: 2888.0
    vvc_alf_filter_chroma_16x4_10_c: 2410.0
    vvc_alf_filter_chroma_16x4_10_avx2: 251.7
    vvc_alf_filter_chroma_16x8_10_c: 4943.0
    vvc_alf_filter_chroma_16x8_10_avx2: 479.0
    vvc_alf_filter_chroma_16x12_10_c: 7235.5
    vvc_alf_filter_chroma_16x12_10_avx2: 9751.0
    vvc_alf_filter_chroma_16x16_10_c: 10142.7
    vvc_alf_filter_chroma_16x16_10_avx2: 935.5
    vvc_alf_filter_chroma_16x20_10_c: 12029.0
    vvc_alf_filter_chroma_16x20_10_avx2: 1174.5
    vvc_alf_filter_chroma_16x24_10_c: 14414.2
    vvc_alf_filter_chroma_16x24_10_avx2: 1410.5
    vvc_alf_filter_chroma_16x28_10_c: 16813.0
    vvc_alf_filter_chroma_16x28_10_avx2: 1713.0
    vvc_alf_filter_chroma_16x32_10_c: 19228.5
    vvc_alf_filter_chroma_16x32_10_avx2: 2256.0
    vvc_alf_filter_chroma_20x4_10_c: 3015.2
    vvc_alf_filter_chroma_20x4_10_avx2: 371.7
    vvc_alf_filter_chroma_20x8_10_c: 6170.2
    vvc_alf_filter_chroma_20x8_10_avx2: 721.0
    vvc_alf_filter_chroma_20x12_10_c: 9019.7
    vvc_alf_filter_chroma_20x12_10_avx2: 1102.7
    vvc_alf_filter_chroma_20x16_10_c: 12040.2
    vvc_alf_filter_chroma_20x16_10_avx2: 1422.5
    vvc_alf_filter_chroma_20x20_10_c: 15010.7
    vvc_alf_filter_chroma_20x20_10_avx2: 1765.7
    vvc_alf_filter_chroma_20x24_10_c: 18017.7
    vvc_alf_filter_chroma_20x24_10_avx2: 2124.7
    vvc_alf_filter_chroma_20x28_10_c: 21025.5
    vvc_alf_filter_chroma_20x28_10_avx2: 2488.2
    vvc_alf_filter_chroma_20x32_10_c: 31128.5
    vvc_alf_filter_chroma_20x32_10_avx2: 3205.2
    vvc_alf_filter_chroma_24x4_10_c: 3701.2
    vvc_alf_filter_chroma_24x4_10_avx2: 494.7
    vvc_alf_filter_chroma_24x8_10_c: 7613.0
    vvc_alf_filter_chroma_24x8_10_avx2: 957.2
    vvc_alf_filter_chroma_24x12_10_c: 10816.7
    vvc_alf_filter_chroma_24x12_10_avx2: 1427.7
    vvc_alf_filter_chroma_24x16_10_c: 14390.5
    vvc_alf_filter_chroma_24x16_10_avx2: 1948.2
    vvc_alf_filter_chroma_24x20_10_c: 17989.5
    vvc_alf_filter_chroma_24x20_10_avx2: 2363.7
    vvc_alf_filter_chroma_24x24_10_c: 21581.7
    vvc_alf_filter_chroma_24x24_10_avx2: 2839.7
    vvc_alf_filter_chroma_24x28_10_c: 25179.2
    vvc_alf_filter_chroma_24x28_10_avx2: 3313.2
    vvc_alf_filter_chroma_24x32_10_c: 28776.2
    vvc_alf_filter_chroma_24x32_10_avx2: 4154.7
    vvc_alf_filter_chroma_28x4_10_c: 4331.2
    vvc_alf_filter_chroma_28x4_10_avx2: 624.2
    vvc_alf_filter_chroma_28x8_10_c: 8445.0
    vvc_alf_filter_chroma_28x8_10_avx2: 1197.7
    vvc_alf_filter_chroma_28x12_10_c: 12684.5
    vvc_alf_filter_chroma_28x12_10_avx2: 1786.7
    vvc_alf_filter_chroma_28x16_10_c: 16924.5
    vvc_alf_filter_chroma_28x16_10_avx2: 2378.7
    vvc_alf_filter_chroma_28x20_10_c: 38361.0
    vvc_alf_filter_chroma_28x20_10_avx2: 2967.0
    vvc_alf_filter_chroma_28x24_10_c: 25329.0
    vvc_alf_filter_chroma_28x24_10_avx2: 3564.2
    vvc_alf_filter_chroma_28x28_10_c: 29514.0
    vvc_alf_filter_chroma_28x28_10_avx2: 4151.7
    vvc_alf_filter_chroma_28x32_10_c: 33673.2
    vvc_alf_filter_chroma_28x32_10_avx2: 5125.0
    vvc_alf_filter_chroma_32x4_10_c: 4945.2
    vvc_alf_filter_chroma_32x4_10_avx2: 485.7
    vvc_alf_filter_chroma_32x8_10_c: 9658.7
    vvc_alf_filter_chroma_32x8_10_avx2: 943.7
    vvc_alf_filter_chroma_32x12_10_c: 16177.7
    vvc_alf_filter_chroma_32x12_10_avx2: 1443.7
    vvc_alf_filter_chroma_32x16_10_c: 19336.0
    vvc_alf_filter_chroma_32x16_10_avx2: 1876.0
    vvc_alf_filter_chroma_32x20_10_c: 24153.0
    vvc_alf_filter_chroma_32x20_10_avx2: 2323.0
    vvc_alf_filter_chroma_32x24_10_c: 28917.7
    vvc_alf_filter_chroma_32x24_10_avx2: 2806.2
    vvc_alf_filter_chroma_32x28_10_c: 33738.7
    vvc_alf_filter_chroma_32x28_10_avx2: 3454.0
    vvc_alf_filter_chroma_32x32_10_c: 38531.5
    vvc_alf_filter_chroma_32x32_10_avx2: 4103.2
    vvc_alf_filter_luma_4x4_10_c: 1076.2
    vvc_alf_filter_luma_4x4_10_avx2: 240.0
    vvc_alf_filter_luma_4x8_10_c: 2113.2
    vvc_alf_filter_luma_4x8_10_avx2: 454.5
    vvc_alf_filter_luma_4x12_10_c: 3179.2
    vvc_alf_filter_luma_4x12_10_avx2: 669.0
    vvc_alf_filter_luma_4x16_10_c: 4146.5
    vvc_alf_filter_luma_4x16_10_avx2: 885.0
    vvc_alf_filter_luma_4x20_10_c: 5168.2
    vvc_alf_filter_luma_4x20_10_avx2: 1106.0
    vvc_alf_filter_luma_4x24_10_c: 6168.2
    vvc_alf_filter_luma_4x24_10_avx2: 1357.0
    vvc_alf_filter_luma_4x28_10_c: 7330.0
    vvc_alf_filter_luma_4x28_10_avx2: 1539.5
    vvc_alf_filter_luma_4x32_10_c: 8202.0
    vvc_alf_filter_luma_4x32_10_avx2: 1803.7
    vvc_alf_filter_luma_8x4_10_c: 2100.5
    vvc_alf_filter_luma_8x4_10_avx2: 479.7
    vvc_alf_filter_luma_8x8_10_c: 4079.5
    vvc_alf_filter_luma_8x8_10_avx2: 898.2
    vvc_alf_filter_luma_8x12_10_c: 6209.2
    vvc_alf_filter_luma_8x12_10_avx2: 1328.7
    vvc_alf_filter_luma_8x16_10_c: 8177.5
    vvc_alf_filter_luma_8x16_10_avx2: 1765.0
    vvc_alf_filter_luma_8x20_10_c: 10400.5
    vvc_alf_filter_luma_8x20_10_avx2: 2196.2
    vvc_alf_filter_luma_8x24_10_c: 12222.7
    vvc_alf_filter_luma_8x24_10_avx2: 2626.0
    vvc_alf_filter_luma_8x28_10_c: 14235.5
    vvc_alf_filter_luma_8x28_10_avx2: 3065.2
    vvc_alf_filter_luma_8x32_10_c: 16702.2
    vvc_alf_filter_luma_8x32_10_avx2: 3494.2
    vvc_alf_filter_luma_12x4_10_c: 3142.0
    vvc_alf_filter_luma_12x4_10_avx2: 699.5
    vvc_alf_filter_luma_12x8_10_c: 6093.2
    vvc_alf_filter_luma_12x8_10_avx2: 1335.5
    vvc_alf_filter_luma_12x12_10_c: 9098.7
    vvc_alf_filter_luma_12x12_10_avx2: 1988.5
    vvc_alf_filter_luma_12x16_10_c: 12237.5
    vvc_alf_filter_luma_12x16_10_avx2: 2635.0
    vvc_alf_filter_luma_12x20_10_c: 15240.7
    vvc_alf_filter_luma_12x20_10_avx2: 3289.5
    vvc_alf_filter_luma_12x24_10_c: 18262.0
    vvc_alf_filter_luma_12x24_10_avx2: 3937.2
    vvc_alf_filter_luma_12x28_10_c: 21283.0
    vvc_alf_filter_luma_12x28_10_avx2: 4585.2
    vvc_alf_filter_luma_12x32_10_c: 24299.7
    vvc_alf_filter_luma_12x32_10_avx2: 5333.5
    vvc_alf_filter_luma_16x4_10_c: 5729.7
    vvc_alf_filter_luma_16x4_10_avx2: 446.2
    vvc_alf_filter_luma_16x8_10_c: 8256.5
    vvc_alf_filter_luma_16x8_10_avx2: 876.7
    vvc_alf_filter_luma_16x12_10_c: 12178.7
    vvc_alf_filter_luma_16x12_10_avx2: 1332.7
    vvc_alf_filter_luma_16x16_10_c: 16262.5
    vvc_alf_filter_luma_16x16_10_avx2: 1734.5
    vvc_alf_filter_luma_16x20_10_c: 20263.7
    vvc_alf_filter_luma_16x20_10_avx2: 2147.2
    vvc_alf_filter_luma_16x24_10_c: 24789.7
    vvc_alf_filter_luma_16x24_10_avx2: 2591.7
    vvc_alf_filter_luma_16x28_10_c: 28894.5
    vvc_alf_filter_luma_16x28_10_avx2: 3228.7
    vvc_alf_filter_luma_16x32_10_c: 33360.0
    vvc_alf_filter_luma_16x32_10_avx2: 4117.5
    vvc_alf_filter_luma_20x4_10_c: 5076.0
    vvc_alf_filter_luma_20x4_10_avx2: 674.2
    vvc_alf_filter_luma_20x8_10_c: 10138.2
    vvc_alf_filter_luma_20x8_10_avx2: 1323.5
    vvc_alf_filter_luma_20x12_10_c: 15171.5
    vvc_alf_filter_luma_20x12_10_avx2: 2026.5
    vvc_alf_filter_luma_20x16_10_c: 20315.0
    vvc_alf_filter_luma_20x16_10_avx2: 2611.0
    vvc_alf_filter_luma_20x20_10_c: 25367.0
    vvc_alf_filter_luma_20x20_10_avx2: 3259.5
    vvc_alf_filter_luma_20x24_10_c: 30443.5
    vvc_alf_filter_luma_20x24_10_avx2: 3898.5
    vvc_alf_filter_luma_20x28_10_c: 35439.7
    vvc_alf_filter_luma_20x28_10_avx2: 4645.5
    vvc_alf_filter_luma_20x32_10_c: 40609.0
    vvc_alf_filter_luma_20x32_10_avx2: 5849.0
    vvc_alf_filter_luma_24x4_10_c: 6245.5
    vvc_alf_filter_luma_24x4_10_avx2: 901.2
    vvc_alf_filter_luma_24x8_10_c: 12166.7
    vvc_alf_filter_luma_24x8_10_avx2: 1754.7
    vvc_alf_filter_luma_24x12_10_c: 18223.2
    vvc_alf_filter_luma_24x12_10_avx2: 2621.5
    vvc_alf_filter_luma_24x16_10_c: 24287.2
    vvc_alf_filter_luma_24x16_10_avx2: 3474.2
    vvc_alf_filter_luma_24x20_10_c: 38042.2
    vvc_alf_filter_luma_24x20_10_avx2: 4335.7
    vvc_alf_filter_luma_24x24_10_c: 36462.0
    vvc_alf_filter_luma_24x24_10_avx2: 5199.5
    vvc_alf_filter_luma_24x28_10_c: 42502.7
    vvc_alf_filter_luma_24x28_10_avx2: 6133.5
    vvc_alf_filter_luma_24x32_10_c: 48675.5
    vvc_alf_filter_luma_24x32_10_avx2: 7575.0
    vvc_alf_filter_luma_28x4_10_c: 7101.5
    vvc_alf_filter_luma_28x4_10_avx2: 1128.2
    vvc_alf_filter_luma_28x8_10_c: 14185.7
    vvc_alf_filter_luma_28x8_10_avx2: 2189.0
    vvc_alf_filter_luma_28x12_10_c: 21278.7
    vvc_alf_filter_luma_28x12_10_avx2: 3347.2
    vvc_alf_filter_luma_28x16_10_c: 28338.2
    vvc_alf_filter_luma_28x16_10_avx2: 4462.7
    vvc_alf_filter_luma_28x20_10_c: 37076.7
    vvc_alf_filter_luma_28x20_10_avx2: 5729.0
    vvc_alf_filter_luma_28x24_10_c: 42612.2
    vvc_alf_filter_luma_28x24_10_avx2: 6508.7
    vvc_alf_filter_luma_28x28_10_c: 49686.0
    vvc_alf_filter_luma_28x28_10_avx2: 7666.0
    vvc_alf_filter_luma_28x32_10_c: 65345.2
    vvc_alf_filter_luma_28x32_10_avx2: 9330.2
    vvc_alf_filter_luma_32x4_10_c: 8329.5
    vvc_alf_filter_luma_32x4_10_avx2: 887.7
    vvc_alf_filter_luma_32x8_10_c: 16941.7
    vvc_alf_filter_luma_32x8_10_avx2: 1736.0
    vvc_alf_filter_luma_32x12_10_c: 73347.7
    vvc_alf_filter_luma_32x12_10_avx2: 2584.2
    vvc_alf_filter_luma_32x16_10_c: 32359.5
    vvc_alf_filter_luma_32x16_10_avx2: 3442.7
    vvc_alf_filter_luma_32x20_10_c: 40482.5
    vvc_alf_filter_luma_32x20_10_avx2: 4318.5
    vvc_alf_filter_luma_32x24_10_c: 48674.7
    vvc_alf_filter_luma_32x24_10_avx2: 5174.2
    vvc_alf_filter_luma_32x28_10_c: 56715.7
    vvc_alf_filter_luma_32x28_10_avx2: 6124.5
    vvc_alf_filter_luma_32x32_10_c: 66720.0
    vvc_alf_filter_luma_32x32_10_avx2: 7577.2
    nuomi2021 committed Mar 4, 2023 (f94f48d)
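
The checkasm figures above are easier to compare as per-size speedup ratios. A minimal sketch in Python, assuming each line has the form `<test>_<impl>: <value>` and that lower values mean faster code. The helper names `parse_checkasm` and `speedups` are hypothetical and are not part of checkasm or this PR:

```python
import re

# Matches lines such as "vvc_alf_filter_chroma_4x4_10_avx2: 138.0".
# The lazy group stops at the final "_c:" or "_avx2:" suffix.
LINE_RE = re.compile(r"^\s*(\w+?)_(c|avx2):\s*([0-9.]+)\s*$")


def parse_checkasm(text):
    """Return {test_name: {"c": value, "avx2": value}} from benchmark text."""
    results = {}
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if m:  # non-benchmark lines (e.g. the "tests passed" summary) are skipped
            name, impl, value = m.groups()
            results.setdefault(name, {})[impl] = float(value)
    return results


def speedups(results):
    """Return {test_name: c / avx2} for tests that have both timings."""
    return {
        name: t["c"] / t["avx2"]
        for name, t in results.items()
        if "c" in t and "avx2" in t and t["avx2"] > 0
    }


# Four lines copied from the benchmark output above.
sample = """\
checkasm: all 128 tests passed
vvc_alf_filter_chroma_4x4_10_c: 657.0
vvc_alf_filter_chroma_4x4_10_avx2: 138.0
vvc_alf_filter_luma_32x32_10_c: 66720.0
vvc_alf_filter_luma_32x32_10_avx2: 7577.2
"""

for name, ratio in speedups(parse_checkasm(sample)).items():
    print(f"{name}: {ratio:.2f}x")
```

Feeding the full log through the same two functions gives one ratio per block size, which makes isolated outliers in either column easy to spot.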