Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Static constructors should be purged from LLVM #12316

Open
lattner opened this issue Feb 8, 2012 · 17 comments
Open

Static constructors should be purged from LLVM #12316

lattner opened this issue Feb 8, 2012 · 17 comments
Labels
bugzilla Issues migrated from bugzilla code-cleanup llvm:core

Comments

@lattner
Copy link
Collaborator

lattner commented Feb 8, 2012

Bugzilla Link 11944
Version 1.0
OS All
CC @asl,@d0k,@dwblaikie,@jryans,@stoklund,@yuanfang-chen

Extended Description

See http://llvm.org/docs/CodingStandards.html#ci_static_ctors. We should really purge all the remaining static constructors and destructors, then turn on the clang -Wglobal-constructors warning flag to prevent regressing in the future.

@d0k
Copy link
Member

d0k commented Feb 8, 2012

The biggest static ctor contributor in LLVM is the cl::opt command line interface. If we want to remove static ctors from LLVM we will need a replacement for that library.

@dwblaikie
Copy link
Collaborator

dwblaikie commented Feb 8, 2012

The biggest static ctor contributor in LLVM is the cl::opt command line
interface. If we want to remove static ctors from LLVM we will need a
replacement for that library.

Any idea how we do that? Given that cl::opt is a registration system so that parsing command line arguments can populate/use those registrations - the trick (trivial type for the global, converting ctor for the 'real' type) done with the global Attributes doesn't apply here - because without a real ctor the registration would not occur.

The only other trick I know of is pinning data into particular linker sections that could then be walked at runtime - it'd remove the static ctor but I've only ever done something like that on Windows & not sure if/how we'd do it in a portable way for LLVM.

Other ideas/thoughts?

@lattner
Copy link
Collaborator Author

lattner commented Feb 8, 2012

There are a couple of ways to handle cl::opt, but they're pretty ugly and will require some widespread API changes. That's probably ok though, because cl::opt really needs a redesign for performance and functionality anyway.

Anyway, I'd suggest starting with everything that is not cl::opt. Last I looked, tablegen was emitting some very large arrays that had static constructors into each target. There are probably a bunch of other similar accidental ones.

@dwblaikie
Copy link
Collaborator

dwblaikie commented Feb 8, 2012

Current -Wglobal-constructors violations

Anyway, I'd suggest starting with everything that is not cl::opt.

Hmm - was hoping to ensure we had a plan/something that works for the hard case before worrying about the mechanical/relatively trivial stuff.

Last I
looked, tablegen was emitting some very large arrays that had static
constructors into each target. There are probably a bunch of other similar
accidental ones.

Yeah, there's a lot in the tablegen'd target stuff. I've attached a file with all the current -Wglobal-constructors violations in LLVM+Clang as of r150077. I haven't categorized the different kinds of violations (separating opt from other things, etc) but "grep warning: errors.log | sort | uniq" gives a general idea of where the violations are.

@d0k
Copy link
Member

d0k commented Feb 8, 2012

Most of the tblgen'd madness went away during the MCTargetDesc refactoring. The remaining bit are the TargetRegisterClasses and the MVT arrays that accompany them.

The meat of it is already migrated to MC, but TargetRegisterClass itself is tricky because it contains virtual methods. Jakob wanted to fix this eventually, but I don't know if there's any short term plan.

@lattner
Copy link
Collaborator Author

lattner commented Feb 9, 2012

The way to handle the virtual method issue is to have tblgen generate a POD struct, and then have the "virtual classes that the code generator actually forwards to" just contain the instance that tblgen poops out.

@lattner
Copy link
Collaborator Author

lattner commented Feb 9, 2012

There are two pieces to the cl::opt refactor that I can think of. The first (easier) half of the issue it that cl::opt (and cl::list) all contain an instance of a non-pod type. A straight-forward way to fix the string case is to rework all the cl::opt stuff so that ParseCommandLineOptions makes a copy of "argv" in a BumpPointerAllocator (which is then released at llvm_shutdown time) and each cl::opt contains a StringRef that points back to these strings. Each time the "get" method is called on cl::opt, it would do an "atop" to reparse the StringRef. cl::list would be handled in a similar way, where it would actually contain an ArrayRef and the array data for the list of entries is created as ParseCommandLineOptions time, and owned by the same BumpPointerAllocator.

@lattner
Copy link
Collaborator Author

lattner commented Feb 9, 2012

The second half of the cl::opt problem (and the more significant one) is the "static constructor to register it" problem. This dovetails (tangentially) with the problem that cl::opts are in a global namespace and that they can collide. It would be "really great" if cl::opts in the X86 target where in some x86-specific namespace and could not collide with ppc target options, for example. Similarly for mid-level optimizer passes: the loop unroll pass should just have a "threshold" cl::opt, and it would be accessible as "-loop-unroll-threshold" and the inliner would have a "threshold" cl::opt which would be accessible as "-inline-threshold".

The best way that I can think of to handle this and the registration problem together is to change cl::opt into a macro that expands into the existing global variables, as well as an initialization function for each. This initialization function would then be called by the existing pass initialization logic and would take the pass name to scope the option with. This way, the cl::opts would be initialized iff the passes and targets are initialized.

@lattner
Copy link
Collaborator Author

lattner commented Feb 9, 2012

Looking at David's list (thanks!) I'd prioritize fixing these classes of issues (since they are auto generated causing us to have a LOT of them):

build/lib/Target/X86/X86GenRegisterInfo.inc:3113:20: warning: declaration requires a global constructor [-Wglobal-constructors]
static const EVT GR32_NOAX_and_GR32_NOREXVTs[] = {

This should be an easy fix to use MVT::SimpleValueType. These should never contain unusual types that would require EVT.

build/lib/Target/X86/X86GenRegisterInfo.inc:3275:12: warning: declaration requires a global constructor [-Wglobal-constructors]
GR8Class GR8RegClass;
^~~~~~~~~~~

This can be fixed by splitting the auto generated portion out into a separate POD type generated by tblgen.

It's worth mentioning that I don't actually care about gtest static ctors, or static ctors in llvm/tools, since they don't affect clients linking to llvm libraries.

@dwblaikie
Copy link
Collaborator

dwblaikie commented Feb 9, 2012

Thanks for all your thoughts/notes (& any more you provide), Chris - I'll certainly take a look at the cases/approaches you've suggested (they haven't quite distilled in my head yet, but I'm getting there)

It's worth mentioning that I don't actually care about gtest static ctors, or
static ctors in llvm/tools, since they don't affect clients linking to llvm
libraries.

Right - we'll figure out how to turn on flags for only the right projects when we reach that point. I'm not such a purist as to think it best to remove them rather than bend the build system to our whim.

@d0k
Copy link
Member

d0k commented Feb 9, 2012

This should be an easy fix to use MVT::SimpleValueType. These should never
contain unusual types that would require EVT.

Done in r150173. The VT tables have hardly any users so this was surprisingly simple.

build/lib/Target/X86/X86GenRegisterInfo.inc:3275:12: warning: declaration
requires a global constructor [-Wglobal-constructors]
GR8Class GR8RegClass;
^~~~~~~~~~~

This can be fixed by splitting the auto generated portion out into a separate
POD type generated by tblgen.

Most of the stuff is pImpl'd into MCRegisterClass which is POD. TargetRegisterClass itself contains a virtual method (getRawAllocationOrder) which gets overriden by TableGen'd code. I don't see any way to do this via value initialization. The virtual method has to be removed before this can be fixed.

Another thing I noticed is that TargetRegisterClass also has a virtual destructor, presumably to silence a warning that GCC emits. The destructor should be made protected & non-virtual (the warning was fixed to accommodate this in GCC 4.3 or 4.4) so we don't have static dtors.

@d0k
Copy link
Member

d0k commented Feb 9, 2012

WRT virtual dtors: LLVM is smart enough to devirtualize them here and with r150174 globalopt properly removes the calls to it, so it's not a big deal, just one extra pointer in the vtable

@lattner
Copy link
Collaborator Author

lattner commented Feb 9, 2012

Jakob, do you have any suggestions on how we can eliminate the regclass static ctors?

@dwblaikie
Copy link
Collaborator

dwblaikie commented Feb 9, 2012

Current -Wglobal-constructors violations
Here's the updated violations 2196 down from 2383 (-8%).

I'm still not sure I understand the TableGen code enough to track down/fix issues there, but I'll keep poking at whatever I can get my teeth into.

@stoklund
Copy link
Mannequin

stoklund mannequin commented Feb 9, 2012

First of all, please make sure you are not chasing windmills here. With a hot buffer cache, a noop llc invocation runs in 0.5 ms, including static constructors and all those pesky data relocations. I do see the problem when linking LLVM statically into a larger application and more code needs to be paged in, though.

Some of these data structures are very hot, so be careful to not regress LLVM performance by adding extra indirections.

That said, I would be perfectly happy if TargetRegisterClass didn't have a type list at all. Types have no business in codegen after isel. We only have a couple of places that use the list, and performance isn't critical. A slightly more expensive interface based on a SimpleTypeValue list would be fine.

We only subclass TargetRegisterClass to provide specializations of getRawAllocationOrder(), and many classes simply use the default implementation. I don't plan to add new virtual functions, so we could replace the vtable pointer with a function pointer:

class TargetRegisterClass {
ArrayRef (*OrderFunc)(const MachineFunction&);
...

ArrayRef getRawAllocationOrder(const MachineFunction &MF) const {
return OrderFunc ? OrderFunc(MF) : makeArrayRef(begin(), getNumRegs());
}
}

Then subclassing is no longer necessary, and TargetRegisterClass can be non-polymorphic and list-initialized.

@d0k
Copy link
Member

d0k commented Mar 1, 2012

TargetRegisterClasses are now value-initialized (r151806).

@llvmbot
Copy link
Collaborator

llvmbot commented Mar 4, 2014

I wonder how much of these static constructors could be removed by the switch to C++11 and constexpr: Change to and make the constructor as well, and initialization happens at compile-time. Most likely this would fix tablegen issues such as those mentioned in comment 9.

@llvmbot llvmbot transferred this issue from llvm/llvm-bugzilla-archive Dec 3, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bugzilla Issues migrated from bugzilla code-cleanup llvm:core
Projects
None yet
Development

No branches or pull requests

4 participants