Add compression mangling for versioning namespaces in std #69
The intent is to shorten mangled names for common types shipped by libc++.
That seems fine. @rjmccall mentioned something about even and odd numbering, though I'm not sure that's necessary since, as far as I'm aware, only libc++ uses versioned namespaces (do correct me if I'm wrong, I have little experience with GCC's libstdcxx).
This would obviously be ABI-breaking, which is fine since ABI version 2 is already ABI-breaking. (I think that part goes without saying: enabling the unstable ABI, which basically switches to v2, without specifying the ABI version will still use the v1 namespace, breaking any application that dynamically links against it early on.)
Enabling said ABI (and, if agreed upon, the shorthand mangling) does require explicitly opting in to a different ABI version via LIBCXX_ABI_VERSION=2 (for libc++).
That wasn't me, that was JF. And no, we shouldn't be making an assumption that there are only two standard library implementations or requiring that the implementations coordinate to keep versioning unique.
Would you suggest explicitly requesting short mangling via some commandline option to |
::std::__[0-9]+::allocator<char> >
<substitution> ::= Siv<inline-ns>v # ::std::__[0-9]+::basic_istream<char, std::__[0-9]+::char_traits<char> >
<substitution> ::= Sov<inline-ns>v # ::std::__[0-9]+::basic_ostream<char, std::__[0-9]+::char_traits<char> >
<substitution> ::= Sdv<inline-ns>v # ::std::__[0-9]+::basic_iostream<char, std::__[0-9]+::char_traits<char> >
Optional suffixes like turning Ss into Ssv0v don't work in the mangling grammar; types have to be self-limiting or else you get ambiguities. For example, you want _Z3fooISsv1vEv to demangle to foo<std::__1::string>(), but it already demangles to foo<std::string,void,v>(). You have to move this earlier in the production, like Sv0_s.
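To make the ambiguity concrete, here is a minimal Python sketch of a toy parser for the template-argument part of a mangled name. It is not a real demangler: `parse_template_args` is a hypothetical helper that understands only the three encodings needed for this example (`Ss`, `v`, and a length-prefixed source name), which is an assumption made purely for illustration.

```python
# Toy parser for the template-argument characters of an Itanium-style mangled
# name. Hypothetical helper for illustration only; it handles just:
#   "Ss"              -> std::string (an existing standard substitution)
#   "v"               -> void (a builtin type)
#   <n><n characters> -> a source name, e.g. "1v" is an identifier named "v"

def parse_template_args(encoded):
    args = []
    i = 0
    while i < len(encoded):
        if encoded.startswith("Ss", i):
            args.append("std::string")
            i += 2
        elif encoded[i] == "v":
            args.append("void")
            i += 1
        elif encoded[i].isdigit():
            # Read the decimal length, then that many characters of name.
            j = i
            while j < len(encoded) and encoded[j].isdigit():
                j += 1
            length = int(encoded[i:j])
            name = encoded[j:j + length]
            if len(name) != length:
                raise ValueError("truncated source name")
            args.append(name)
            i = j + length
        else:
            raise ValueError("unsupported encoding at offset %d" % i)
    return args


# The characters between "I" and "E" in _Z3fooISsv1vEv are "Ssv1v".
print(parse_template_args("Ssv1v"))
```

Under the existing rules, `Ssv1v` already splits into three complete arguments, so a parser has no way to tell that `v1v` was meant as a suffix on `Ss`. Putting the version marker before the final letter, as in `Sv0_s`, avoids this because the parser knows what it is reading before it reaches the end of the substitution.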
As a general matter, I'm comfortable with opening up the substitution namespace fairly liberally. What I think we can do here is this:
The big question here is what counts as a "library". If we reserve this for just standard libraries, with the expectation that those libraries will just have a handful of substitutions each, then it's not out of the question for manglers and demanglers alike to continue to hard-code all those substitutions indefinitely. On the other hand:
For the time being, I think we should constrain this to major standard library implementations. [footnote] I'm sure maintainers have a better idea of this than I do, but off the top of my head:
Yes this is fine from
I don't suggest any library aside from a C++ standard library (aka, This proposal is merely to tackle the loss of short mangling schemes caused by inline namespace-based versioning being introduced.
Don't forget debug data especially on embedded devices where because of the naming, with |
Right, I was just thinking it through, not trying to say that you'd suggested any of that.
There may be some room to add more shorthand manglings if we're going for an ABIv2 change, though personally I find it difficult to think of possible combinations. I guess @ldionne would be the best person to advise on that, given his experience with idiomatic C++ and his familiarity with how frequently certain standard library constructs are used.
Remember that substitutions don't have to be fully applied: it might not be useful to have substitutions for any exact specializations of |
Also to clarify, would you not be okay with mangler/demangler drawing assumptions about ABI versions from symbols and using compressed mangling by default for ABIv2 and above? This is not a breaking change for anyone aside from Fuchsia but they are happy with breaking ABI and they don't even mind testing those new breaking changes out in practice. This would only apply to ABIv2 and up and only when But I personally don't see a reason to not make this the default for libc++, I think Eric and Louis are both on board with this idea (though it would still be great to introduce new shorthand forms since I'd rather break the unstable ABI once, so it would be a good idea to have all of that ready prior to actually breaking it). Would be nice to hear any suggestion from C++ experts like Louis who are familiar with what would be of most benefit or whoever else we can CC on this. Obviously not everything needs a short mangling but I think now, as C++ standards advance and bring more into the stdlibs, there's no better time for extending the short mangling forms. And ultimately it's a win-win situation for pretty much everyone, and it's one of the reasons I'm trying to push for those changes as they're of massive help for semi-embedded systems that may still have a full I do apologize for it being |
I'm still not entirely sure what you're proposing. The ABI is allowed to add abbreviations for declarations which it can assume haven't previously existed. That applies regardless of any notion of ABI versioning: if the committee adds a That means it's fine for us to add new abbreviations for entities in I would strongly object to the mangling rule being target-dependent. Fuchsia is of course welcome to say that it uses a specific ABI version of libc++, just like it's welcome to tailor all the other symbols in its ABI in a myriad of other ways. That decision should just mean that its ABI is expressed using different entities — which happen to take advantage of some new target-independent abbreviations — not that Fuchsia's compilers actually change the standard ABI rules. For example, Fuchsia is welcome to make |
Libstdc++ has a non-default (and not widely used) configuration that uses versioned namespaces. Past versions used |
@rjmccall Well I was proposing changing the rules for ABIv2 and up which would imply compiler changes. As @jwakely said this isn't going to clash with libstdcxx since they use a different subset of the versioning namespace. I was just wondering how you felt about making the shorthand mangling default for ABIv2 (ie. if So this should not affect libstdcxx until they're ready to roll out support for new ABI at which point we can swap opt-in for opt-out once they're happy and we know which ABI version we should assume is going to support these proposed new mangling schemes (since it looks like it's going to be different for libstdcxx). Same goes for all other stdlib vendors, I wouldn't suggest enabling it by default without getting a blessing from the stdlib vendor. However as far as Does that make sense? Again if you see issues with that I won't push for implicitly enabling this for anything yet, especially considering we still need to have a concrete proposal up. I agree with your changes with regarding to abbreviations that you mentioned in review, that's not the issue I was raising. Anyway, before discussing all this, I think it's best to come up with a proposal first and see what could be added to the new scheme. A few extra characters because of the versioned namespaces aren't a problem, I wasn't implying it is (sorry if it seemed like I had an issue with that). I understand the necessity of it, and I definitely wasn't trying to suggest having this be in the root, unversioned namespace. With regards to Fuchsia all I was trying to say is that they don't mind testing breaking ABI changes, not that the project needs special ways of mangling that would deviate from the to-be ABIv2 short manglings, they're happy with just using ABIv2/Unstable ABI as is for now, regardless of what direction it goes. Thank you. |
"ABIv2" is a collection of ideas, not a concrete variant ABI. Up to now, none of the "v2" ideas have introduced semantic inconsistencies with the "v1" mangling rules. There are two major reasons that I know of. The first reason is that such changes would subtly break all the existing tooling which assumes that the mangling rules are the same across all platforms. Consider something like The second reason is that there are several other major flaws in the mangling grammar, so if you're truly interested in getting optimal manglings on a new target, you shouldn't stop at changing the standard substitutions. Decreasing the size of an individual symbol is much less important than decreasing the overall size of the symbol table, and the best way to do that is to take advantage of common substrings between symbols, and particularly common prefixes. A prefix-tree symbol table is a good idea just given common C idioms (consider So if you're really interested in using a "v2" mangling scheme that abandons consistency with the old scheme, I would recommend deeper changes than just tinkering with substitutions. If you're interested in something that maintains consistency with the old scheme, you should add new substitutions, not repurpose the existing ones. If you're dead-set on getting 2-byte substitutions instead of 3/4-byte substitutions, and you don't care about supporting existing standard libraries on your target, you should just have your standard library declare its entities directly in namespace |
No no I wasn't implying that I was dead set on that aspect, in fact I said if it's just a few characters I don't see the problem. Sorry I think I'm hard to understand sometimes. I absolutely do care about compatibility and am happy with having slightly longer "short" mangling because of the namespaces, I just want to eliminate major sources of cruft in debug info like what Again big sorry if I gave the wrong impression, I absolutely support the idea of having slightly longer manglings to not violate compatibility, regardless of whether it's 3-4 or 5 or 6 extra characters it's still a huge win as far as I'm concerned since at the moment I still think we're at a disagreement regarding the ABI break as far as I don't want to spend too much time discussing this aspect yet so I'm happy to make it strictly opt-in for now since I think we're going a bit off track here and I think we should focus on the actual scheme and possible inclusions as well. This will be a break in unstable ABI though I'd like to keep it to a single break hence me wanting to move onto the actual proposal, and I don't really want to waste anyone's time. and the more I talk the more confusion there seems to be :( Thank you and big apologies for any misunderstandings, I'm generally not great with RFCs so please excuse any ambiguities that may have caused confusion. Also I think as far as demanglers go, it should be possible to apply that logic in reverse providing the mangling scheme is unambiguous enough and extrapolate the ABI version from symbol names. |
Okay, I think we're on the same page here. When you're talking about "ABIv2", you're talking about proposals to revamp the ABI of libc++. That's generally up to libc++ as a project and is ultimately off-topic here. If you're okay with slightly longer abbreviations, then I think the path forward here is quite straightforward:
I think And while Clang has reserved manglings for certain very niche things like SEH on IA64 filters, as one example, something more trivial and common like standards for mangling Since you have a better understanding of mangling schemes, what prefix would be safe (to avoid conflicts) and yet most compact to use for ABI version |
Okay. Let me try to be very concrete, because apparently we are not communicating well. What I would like to do is add a prefix, let's say Separately, the Itanium ABI will remember that you are using the namespaces I'll talk to the rest of the Itanium ABI project to summarize that idea and hopefully get consensus on it. |
Actually, I now have a somewhat more systematized idea for what the suffix following |
One option that has not been considered yet is to counteract the std inline namespace with an std inline namespace context. For example, if This seems ideal when different |
This PR was prompted by a thread on libcxx-dev where I did actually bring up something like that as a "mode prefix". There are two basic problems with it:
Alright, understood, that sounds reasonable if the rest of the IA64 ABI committee can reach a consensus on it and regarding the part that follows the suffix (which you said is currently being discussed). Thank you for the clear explanation and apologies for dragging this out due to various misunderstandings.
How is this work going now? Can we have an option to disable the inline namespace in libcxx?
This is a strawman proposal to add substitutions for inline namespaces (fixing #42). I've never touched the Itanium ABI before so this is most likely wrong and/or too naive, but this is at least something concrete to get us started with.
A couple of notes on my approach:
std::__1, because that's already in use.
v on both sides to avoid clashing with things like St3. There may be a better way of doing this.
@zygoloid In #42, you say:
Can you explain what you mean by that? I suspect this will throw my approach down the drain (but that's fine).
Fixes #42