Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[TableGen][NFCI] Speed up generating *GenRegisterInfo.inc files on builds with expensive checks. #67340

Merged
merged 1 commit into from
Sep 25, 2023

Conversation

kosarev
Copy link
Collaborator

@kosarev kosarev commented Sep 25, 2023

This is mostly AMDGPU-specific. When the expensive checks are enabled, generating of AMDGPUGenRegisterInfo.inc currently takes about 20 minutes on my machine for release+asserts builds, which effectively prevents such testing from regular use. This patch fixes this by reducing the time to about 2 minutes.

Generation times for AMDGPUGenRegisterInfo.inc without expensive checks and other *GenRegisterInfo.inc files with and without the expensive checks remain approximately the same.

The patch doesn't cause any changes in the contents of the generated files.

The root cause of the current poor performance is that where glibcxx is used, enabling the expensive checks defines _GLIBCXX_DEBUG, which enables various consistency checks in the library. One such check is in std::binary_search() to make sure the range is ordered. As CodeGenRegisterClass::contains() relies on std::binary_search() and it is called very a large number of times from within CodeGenRegBank::inferMatchingSuperRegClass(), the libcxx checks heavily affect the runtimes.

…ilds with expensive checks.

This is mostly AMDGPU-specific. When the expensive checks are enabled,
generating of AMDGPUGenRegisterInfo.inc currently takes about 20 minutes
on my machine for release+asserts builds, which effectively prevents such
testing from regular use. This patch fixes this by reducing the time to
about 2 minutes.

Generation times for AMDGPUGenRegisterInfo.inc without expensive checks
and other *GenRegisterInfo.inc files with and without the expensive
checks remain approximately the same.

The patch doesn't cause any changes in the contents of the generated
files.

The root cause of the current poor performance is that where glibcxx is
used, enabling the expensive checks defines _GLIBCXX_DEBUG, which enables
various consistency checks in the library. One such check is in
std::binary_search() to make sure the range is ordered. As
CodeGenRegisterClass::contains() relies on std::binary_search() and it is
called very a large number of times from within
CodeGenRegBank::inferMatchingSuperRegClass(), the libcxx checks heavily
affect the runtimes.
@kosarev kosarev added enhancement Improving things as opposed to bug fixing, e.g. new or missing feature infrastructure Bugs about LLVM infrastructure tablegen labels Sep 25, 2023
@kosarev kosarev requested review from resistor, arsenm, dwblaikie and a team September 25, 2023 15:06
@kosarev
Copy link
Collaborator Author

kosarev commented Sep 25, 2023

This helped working on #67245.

Note that the Debian expensive-checks bot, https://lab.llvm.org/buildbot/#/builders/16, explicitly undefines _GLIBCXX_DEBUG, which is a bit unfortunate because when the checks are not disabled, they do catch some problems on the current ToT:

Failed Tests (16):
  Clang :: Misc/warning-flags-tree.c
  Clang :: Misc/warning-wall.c
  LLVM :: CodeGen/PowerPC/fp-strict.ll
  LLVM :: CodeGen/PowerPC/pr55463.ll
  LLVM :: CodeGen/PowerPC/register-pressure.ll
  LLVM :: CodeGen/PowerPC/spe.ll
  LLVM :: CodeGen/X86/code_placement_ext_tsp.ll
  LLVM :: CodeGen/X86/code_placement_ext_tsp_large.ll
  LLVM :: Transforms/Inline/nested-inline.ll
  LLVM :: Transforms/SampleProfile/csspgo-import-list.ll
  LLVM-Unit :: ADT/./ADTTests/14/47
  LLVM-Unit :: ADT/./ADTTests/22/47
  LLVM-Unit :: ADT/./ADTTests/30/47
  LLVM-Unit :: ADT/./ADTTests/38/47
  LLVM-Unit :: ADT/./ADTTests/46/47
  LLVM-Unit :: ADT/./ADTTests/7/47

#67337 should fix the ADT failures.

As a side note, I don't know why AMDGPU is that much different in regards to the number of (generated) register classes; it might be worth looking into that at some point.

Tagging Galina @gkistanova for awarness.

Copy link
Collaborator

@dwblaikie dwblaikie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good to me (if this restores sufficient performance, perhaps you can re-enable the glibcxx debug checks on the buildbot?)

@kosarev kosarev merged commit f7f8bae into llvm:main Sep 25, 2023
5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement Improving things as opposed to bug fixing, e.g. new or missing feature infrastructure Bugs about LLVM infrastructure tablegen
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants