Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

30% performance regression on Krita with AVX since 1.4.0 #283

Closed
amyspark opened this issue Jun 29, 2021 · 1 comment
Closed

30% performance regression on Krita with AVX since 1.4.0 #283

amyspark opened this issue Jun 29, 2021 · 1 comment

Comments

@amyspark
Copy link
Contributor

amyspark commented Jun 29, 2021

Vc version / revision Operating System Compiler & Version Compiler Flags Assembler & Version CPU
1afaf68 Windows MinGW GCC 7.3.0 -O2 -g -DNDEBUG GNU assembler (GNU Binutils) 2.30 Ryzen 7 2700x

Testcase

N/A. This is an occurrence throughout the benchmark suite of Krita as of KDE/krita@2943863a2f. This is being tracked in a MR of mine, graphics/krita!933.

Actual Results

Upgrading Vc results in ~30% less single threaded performance, down to ~9% in maximum concurrence (which caps out at 8 threads in my system).

Expected Results

Upgrading Vc version should result in same or faster performance.

Remarks

I've bisected the tree starting with the release of 1.3.3 up to the present head. I've needed to keep the patch in #241 applied throughout the process because otherwise it crashes on an uint_v aligned store.

The regression was introduced in:

11a93f026faadfd903394cf6131e6ddaedd13b37 is the first bad commit
commit 11a93f026faadfd903394cf6131e6ddaedd13b37
Author: Matthias Kretz <kretz@kde.org>
Date:   Fri Sep 21 11:59:23 2018 +0200

    * Use fixed_size_simd<int, Size> as IndexType for all ABIs
    * Define integral AVX compares only when AVX2 is available
    * Refactor binary operators to handle (mixed) fixed_size_simd(_mask) and
      Simd(Mask)Array better (or at all)
    * Modify Simd(Mask)Array operations to return fixed_size_simd(_mask)
    * Refactor some enable_ifs to move from default arg to template
      parameter list. This is especially important to work around
      constructor inheritance bugs in GCC.
    * Make sure fixed_size unconditionally passes via the stack (to solve
      potential ABI breakage on linking TUs compiled with different -march
      flags).
    * Disallow implicit narrowing conversions on binary operators (e.g.
      `simd + long`)

    Signed-off-by: Matthias Kretz <kretz@kde.org>

 Vc/avx/vector.h                    |  12 +-
 Vc/avx/vector.tcc                  |  28 ++--
 Vc/common/operators.h              | 109 ++++++-------
 Vc/common/simd_cast_caller.tcc     |  40 ++---
 Vc/common/simdarray.h              | 326 +++++++++++++++++++++----------------        
 Vc/common/simdarrayfwd.h           |  68 ++++++--
 Vc/common/simdmaskarray.h          | 128 +++++++++------
 Vc/common/vector.h                 |   2 +-
 Vc/scalar/vector.h                 |   2 +-
 Vc/sse/vector.h                    |   4 +-
 tests/gather.cpp                   |   2 +-
 tests/gatherinterleavedmemory.cpp  |   2 +-
 tests/mask.cpp                     |   2 +-
 tests/scatterinterleavedmemory.cpp |  14 +-
 tests/simdize.cpp                  |  22 +--
 tests/virtest                      |   2 +-
 16 files changed, 423 insertions(+), 340 deletions(-)

Even when reverting this commit (which can only be done, so far as I tried, in 1.4.0), there's still a ~9% performance loss. Further bisecting between 1.3.3 and 1.4.0 (including git revert 11a93f026faadfd903394cf6131e6ddaedd13b37) reveals the offending commit:

dcef14be9a674555c6b19e05b987a92e29d9ec99 is the first bad commit
commit dcef14be9a674555c6b19e05b987a92e29d9ec99
Author: Matthias Kretz <kretz@kde.org>
Date:   Tue Sep 18 09:39:16 2018 +0200

    GCC: Optimize constant folding of compares

    Signed-off-by: Matthias Kretz <kretz@kde.org>

 Vc/avx/intrinsics.h | 60 ++++++++++++++++++++++++++++++++++-------------------        
 1 file changed, 39 insertions(+), 21 deletions(-)

Note: I apologize in advance if this isn't the correct place to report this, I've not found where I can contact you.

@amyspark amyspark changed the title 30% performance regression on Krita since Vc 1.3.3 (or commit 9586fce4) 30% performance regression on Krita with AVX since commit 11a93f026faadfd903394cf6131e6ddaedd13b37 Jun 30, 2021
@amyspark amyspark changed the title 30% performance regression on Krita with AVX since commit 11a93f026faadfd903394cf6131e6ddaedd13b37 30% performance regression on Krita with AVX since 1.4.0 Jun 30, 2021
@bernhardmgruber
Copy link
Collaborator

Thank you for your amazing effort to nail down this performance regression!

Unfortunately, Vc's author and maintainer moved on to work on standardization work around std::simd and Vc is now maintained by a small group of Vc users around CERN. This maintenance mainly includes collecting and reviewing patches from Vc's community. I can only offer you to try to fix the regression yourself and create a PR. We will try to review and get it in.

I am sorry if that is not the answer you were hoping for!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants