New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix Issue 15831 - Mangle back references to avoid huge symbol names #5855

Merged
merged 9 commits into from Jul 16, 2017

Conversation

Projects
None yet
@rainers
Member

rainers commented Jun 9, 2016

This is another implementation of a mangling scheme that uses back references for reoccuring types and identifiers. It also get's rid of an ambiguity in the existing mangling and encapsulates mangling better in dmangle.d.

see https://issues.dlang.org/show_bug.cgi?id=15831.

Updated:
The bad symbol 1.s.s.s.s.s.foo mangles as _D13testexpansion__T1sTSQw__TQjTSQBf__TQtTSQBp__TQBdTSQCa__TQBoTiZQBuFiZ6ResultZQCiFQBfZQqZQCtFQCbZQBbZQDfFQCxZQBnZQDrFQDsZQBz3fooMFNaNfZv with 138 characters (instead of 1153). Please also note some extra characters due to the gratuitously long __T prefix.

The worse symbol 1.s.s.s.s.s.s.s.s.s.s.foo grows to 253 characters (instead of 37640).

Here is a rough sketch of the implementation:

  • for every type that has been mangled before in the same symbol, encode it as QAAa, where AAa is the relative position of the first occurrence encoded as a base 26 number with the last character using a lower case letter instead of upper case letters to detect the end of the number
  • for every identifier that has been mangled before, encode it as QAAa aswell. To distinguish the two types of back references the demangler needs to lookup the character at the referenced position.

To capture all back references it is no longer allowed to build the mangled string from mangled strings of its sub components. It's also not feasible to insert characters into the mangled strings (currently done for the length of a mangled template instance) as it can invalidate back references.

Accompanying PRs: dlang/druntime#1830 and dlang/dlang.org#1664

@UplinkCoder

This comment has been minimized.

Show comment
Hide comment
@UplinkCoder

UplinkCoder Jun 9, 2016

Member

Seems we had the same idea.

Member

UplinkCoder commented Jun 9, 2016

Seems we had the same idea.

@UplinkCoder

This comment has been minimized.

Show comment
Hide comment
@UplinkCoder

UplinkCoder Jun 9, 2016

Member

What was wrong with the encapsulation ?

Member

UplinkCoder commented Jun 9, 2016

What was wrong with the encapsulation ?

@rainers rainers referenced this pull request Jun 9, 2016

Closed

[WIP] back-refernce mangle #5853

@rainers

This comment has been minimized.

Show comment
Hide comment
@rainers

rainers Jun 9, 2016

Member

What was wrong with the encapsulation ?

Mangling was also done in dtemplate.d (the code versioned out in this PR).

Member

rainers commented Jun 9, 2016

What was wrong with the encapsulation ?

Mangling was also done in dtemplate.d (the code versioned out in this PR).

@rainers

This comment has been minimized.

Show comment
Hide comment
@rainers

rainers Jun 9, 2016

Member

Seems we had the same idea.

gcc developers had this idea 20+ years ago.

Member

rainers commented Jun 9, 2016

Seems we had the same idea.

gcc developers had this idea 20+ years ago.

Show outdated Hide outdated src/dmangle.d
@@ -88,6 +94,9 @@ private immutable char[TMAX] mangleChar =
// X // variadic T t...)
// Y // variadic T t,...)
// Z // not variadic, end of parameters
// # // Type backward reference

This comment has been minimized.

@UplinkCoder

UplinkCoder Jun 9, 2016

Member

the backrefernce can distinguish itself
no need to add so many prefixes I think.

@UplinkCoder

UplinkCoder Jun 9, 2016

Member

the backrefernce can distinguish itself
no need to add so many prefixes I think.

This comment has been minimized.

@rainers

rainers Jun 9, 2016

Member

Yeah, that might work. There are some ambiguities in the current name mangling, though, so I'm not 100% sure we'd need more context.

@rainers

rainers Jun 9, 2016

Member

Yeah, that might work. There are some ambiguities in the current name mangling, though, so I'm not 100% sure we'd need more context.

This comment has been minimized.

@UplinkCoder

UplinkCoder Jun 9, 2016

Member

ambiguities ?
Do they pose a problem ?

@UplinkCoder

UplinkCoder Jun 9, 2016

Member

ambiguities ?
Do they pose a problem ?

This comment has been minimized.

@rainers

rainers Jun 9, 2016

Member

ambiguities ?

A simple bugzilla search already lists 4: https://issues.dlang.org/buglist.cgi?quicksearch=mangl%20ambig

Do they pose a problem ?

Sure. Not sure whether it gets worse with back references.

@rainers

rainers Jun 9, 2016

Member

ambiguities ?

A simple bugzilla search already lists 4: https://issues.dlang.org/buglist.cgi?quicksearch=mangl%20ambig

Do they pose a problem ?

Sure. Not sure whether it gets worse with back references.

Show outdated Hide outdated src/dmangle.d
buf.printf("$%d", cast(int)*p);
return true;
}
symbols[cast(void*)s] = symbols.length;

This comment has been minimized.

@UplinkCoder

UplinkCoder Jun 9, 2016

Member

use the position in the mangle string instead.
the the de-mangler does not have to keep state.

@UplinkCoder

UplinkCoder Jun 9, 2016

Member

use the position in the mangle string instead.
the the de-mangler does not have to keep state.

This comment has been minimized.

@rainers

rainers Jun 9, 2016

Member

I thought about that, too. Could work, but again, we'll have to revisit the grammar for ambiguities. The type/symbol index yields smaller numbers and might be a bit more readable.

@rainers

rainers Jun 9, 2016

Member

I thought about that, too. Could work, but again, we'll have to revisit the grammar for ambiguities. The type/symbol index yields smaller numbers and might be a bit more readable.

This comment has been minimized.

@UplinkCoder

UplinkCoder Jun 9, 2016

Member

mangles are not meant to be readable.
let's fix ddmangle next.

@UplinkCoder

UplinkCoder Jun 9, 2016

Member

mangles are not meant to be readable.
let's fix ddmangle next.

This comment has been minimized.

@rainers

rainers Jun 9, 2016

Member

let's fix ddmangle next.

Let's fix the build and unittests first ;-)

@rainers

rainers Jun 9, 2016

Member

let's fix ddmangle next.

Let's fix the build and unittests first ;-)

This comment has been minimized.

@rainers

rainers Jun 9, 2016

Member

BTW: Let's rather wait for some more feedback before putting too much work into this approach.

@rainers

rainers Jun 9, 2016

Member

BTW: Let's rather wait for some more feedback before putting too much work into this approach.

This comment has been minimized.

@UplinkCoder

UplinkCoder Jun 9, 2016

Member

This approach is efficient. compressing after the fact is not.
I doubt that there are more alternatives.

@UplinkCoder

UplinkCoder Jun 9, 2016

Member

This approach is efficient. compressing after the fact is not.
I doubt that there are more alternatives.

Show outdated Hide outdated src/dmangle.d
buf.printf("@%d", cast(int)*p);
return true;
}
idents[cast(void*)id] = idents.length;

This comment has been minimized.

@UplinkCoder

UplinkCoder Jun 9, 2016

Member

ditto

@UplinkCoder
@UplinkCoder

This comment has been minimized.

Show comment
Hide comment
@UplinkCoder

UplinkCoder Jun 9, 2016

Member

I see.
Let's collaborate on this PR then.

Member

UplinkCoder commented Jun 9, 2016

I see.
Let's collaborate on this PR then.

@rainers

This comment has been minimized.

Show comment
Hide comment
@rainers

rainers Jun 10, 2016

Member

Fixed compilation, but still has some link errors.

Member

rainers commented Jun 10, 2016

Fixed compilation, but still has some link errors.

@UplinkCoder

This comment has been minimized.

Show comment
Hide comment
@UplinkCoder

UplinkCoder Jun 10, 2016

Member

I'll check it out

Member

UplinkCoder commented Jun 10, 2016

I'll check it out

@UplinkCoder

This comment has been minimized.

Show comment
Hide comment
@UplinkCoder

UplinkCoder Jun 10, 2016

Member

How would you de-mangle this ?

Member

UplinkCoder commented Jun 10, 2016

How would you de-mangle this ?

@rainers

This comment has been minimized.

Show comment
Hide comment
@rainers

rainers Jun 11, 2016

Member

How would you de-mangle this ?

Do you expect any specific trouble? I'll switch to buffer positions for a simpler demangler once the link problems are solved. AFAICT the mangling fails for named template mixins.

Member

rainers commented Jun 11, 2016

How would you de-mangle this ?

Do you expect any specific trouble? I'll switch to buffer positions for a simpler demangler once the link problems are solved. AFAICT the mangling fails for named template mixins.

@UplinkCoder

This comment has been minimized.

Show comment
Hide comment
@UplinkCoder

UplinkCoder Jun 11, 2016

Member

The link problem that I can see occur when dmd tries to link to druntime because suddenly the mangle is different.
About the demangler the trouble with using a counter of encouterd symbols over the position.
Is that you need to build a table that translates from counter to symbolname index.
This would not be necsessary If we just use the position.

Member

UplinkCoder commented Jun 11, 2016

The link problem that I can see occur when dmd tries to link to druntime because suddenly the mangle is different.
About the demangler the trouble with using a counter of encouterd symbols over the position.
Is that you need to build a table that translates from counter to symbolname index.
This would not be necsessary If we just use the position.

@JackStouffer

This comment has been minimized.

Show comment
Hide comment
@JackStouffer

JackStouffer Jun 21, 2016

Member

What's the status of this?

Member

JackStouffer commented Jun 21, 2016

What's the status of this?

@rainers

This comment has been minimized.

Show comment
Hide comment
@rainers

rainers Jun 21, 2016

Member

What's the status of this?

AFAICT, it mostly works on Windows (but a unittest in std.traits for mangledName), but fails on linux. I suspect that it has to do with shared library generation. I also noticed a few possible improvements regarding symbol back references.

The identifier encoding and using positional back references will make it difficult to implement std.traits.mangledName and core.demangle.mangle. BTW: can these two be merged?

Member

rainers commented Jun 21, 2016

What's the status of this?

AFAICT, it mostly works on Windows (but a unittest in std.traits for mangledName), but fails on linux. I suspect that it has to do with shared library generation. I also noticed a few possible improvements regarding symbol back references.

The identifier encoding and using positional back references will make it difficult to implement std.traits.mangledName and core.demangle.mangle. BTW: can these two be merged?

@UplinkCoder

This comment has been minimized.

Show comment
Hide comment
@UplinkCoder

UplinkCoder Jun 23, 2016

Member

@rainers
It it quite useful to handle mangle-parent specially as symbols tend to have the same parents over and over again.
This has quite an impact!

Member

UplinkCoder commented Jun 23, 2016

@rainers
It it quite useful to handle mangle-parent specially as symbols tend to have the same parents over and over again.
This has quite an impact!

@rainers

This comment has been minimized.

Show comment
Hide comment
@rainers

rainers Jun 24, 2016

Member

It it quite useful to handle mangle-parent specially as symbols tend to have the same parents over and over again.

Yes, this was one of the "possible improvements" I mentioned. With just a positional back reference, it's a bit difficult to detect where the parent symbol ends, though. We could reverse the order of qualified names to figure that.

Member

rainers commented Jun 24, 2016

It it quite useful to handle mangle-parent specially as symbols tend to have the same parents over and over again.

Yes, this was one of the "possible improvements" I mentioned. With just a positional back reference, it's a bit difficult to detect where the parent symbol ends, though. We could reverse the order of qualified names to figure that.

@UplinkCoder

This comment has been minimized.

Show comment
Hide comment
@UplinkCoder

UplinkCoder Jun 24, 2016

Member

I'd rather use either a length.

Member

UplinkCoder commented Jun 24, 2016

I'd rather use either a length.

@UplinkCoder

This comment has been minimized.

Show comment
Hide comment
@UplinkCoder

UplinkCoder Jul 5, 2016

Member

@rainers I finally figure out what a mangle really is :)
a mangle is a string-representation of a symbol-resolve query against a type resolved AST.
I think soon I'll have a more refined approach then the current one.

The Problem is not only string-compression, but also query optimization.

Member

UplinkCoder commented Jul 5, 2016

@rainers I finally figure out what a mangle really is :)
a mangle is a string-representation of a symbol-resolve query against a type resolved AST.
I think soon I'll have a more refined approach then the current one.

The Problem is not only string-compression, but also query optimization.

@rainers

This comment has been minimized.

Show comment
Hide comment
@rainers

rainers Jul 23, 2016

Member

Sorry, accidentally pushed a corresponding druntime branch to the wrong repository: https://github.com/dlang/druntime/tree/mangle_backref

Before making things worse: what's the correct way to get rid of it?

Member

rainers commented Jul 23, 2016

Sorry, accidentally pushed a corresponding druntime branch to the wrong repository: https://github.com/dlang/druntime/tree/mangle_backref

Before making things worse: what's the correct way to get rid of it?

@rainers

This comment has been minimized.

Show comment
Hide comment
@rainers

rainers Jul 23, 2016

Member

Thanks for the quick answer. git push --force fails, though:

Total 0 (delta 0), reused 0 (delta 0)
remote: error: GH006: Protected branch update failed for refs/heads/master.
remote: error: Cannot force-push to a protected branch
To https://github.com/dlang/druntime.git
 ! [remote rejected] master -> master (protected branch hook declined)
error: failed to push some refs to 'https://github.com/dlang/druntime.git'

That's probably a good sign, so I cannot really ruin master by accident.

Member

rainers commented Jul 23, 2016

Thanks for the quick answer. git push --force fails, though:

Total 0 (delta 0), reused 0 (delta 0)
remote: error: GH006: Protected branch update failed for refs/heads/master.
remote: error: Cannot force-push to a protected branch
To https://github.com/dlang/druntime.git
 ! [remote rejected] master -> master (protected branch hook declined)
error: failed to push some refs to 'https://github.com/dlang/druntime.git'

That's probably a good sign, so I cannot really ruin master by accident.

@rainers

This comment has been minimized.

Show comment
Hide comment
@rainers

rainers Jul 23, 2016

Member

I've just removed it in the github branch list: https://github.com/dlang/druntime/branches

Member

rainers commented Jul 23, 2016

I've just removed it in the github branch list: https://github.com/dlang/druntime/branches

@MartinNowak

This comment has been minimized.

Show comment
Hide comment
@MartinNowak

MartinNowak Jul 25, 2016

Member

Seems we had the same idea.

Have you guys talked about this earlier and/or did both of you work on an implementation?

Member

MartinNowak commented Jul 25, 2016

Seems we had the same idea.

Have you guys talked about this earlier and/or did both of you work on an implementation?

@MartinNowak

This comment has been minimized.

Show comment
Hide comment
@MartinNowak

MartinNowak Jul 25, 2016

Member

Just to bring up a similar point. While this seems like a good fix the technical liability of huge symbol names, it doesn't make them more displayable. Could we possibly come up w/ better human readable symbols that are still unique? That for sure would also solve the technical issue.
If not we could of course go the route of this PR and come up w/ some sensible (non-unique abbreviations) for pretty printing.

Member

MartinNowak commented Jul 25, 2016

Just to bring up a similar point. While this seems like a good fix the technical liability of huge symbol names, it doesn't make them more displayable. Could we possibly come up w/ better human readable symbols that are still unique? That for sure would also solve the technical issue.
If not we could of course go the route of this PR and come up w/ some sensible (non-unique abbreviations) for pretty printing.

@schveiguy

This comment has been minimized.

Show comment
Hide comment
@schveiguy

schveiguy Jul 25, 2016

Member

@MartinNowak I had suggested this at some point:

auto foo(T)(T t) {...}
foo(MyComplexType); // => foo!(T = MyComplexType)(T)

Having an implementation like Reiner and Stefan have tried will help with this mechanism as well, as it does a very similar thing.

Member

schveiguy commented Jul 25, 2016

@MartinNowak I had suggested this at some point:

auto foo(T)(T t) {...}
foo(MyComplexType); // => foo!(T = MyComplexType)(T)

Having an implementation like Reiner and Stefan have tried will help with this mechanism as well, as it does a very similar thing.

@rainers

This comment has been minimized.

Show comment
Hide comment
@rainers

rainers Jul 25, 2016

Member

Have you guys talked about this earlier and/or did both of you work on an implementation?

Well, I got involved because I saw this going the wrong direction (Walter's compression) or having incomplete implementations (too many missed back references).

Member

rainers commented Jul 25, 2016

Have you guys talked about this earlier and/or did both of you work on an implementation?

Well, I got involved because I saw this going the wrong direction (Walter's compression) or having incomplete implementations (too many missed back references).

@rainers

This comment has been minimized.

Show comment
Hide comment
@rainers

rainers Jul 25, 2016

Member

Could we possibly come up w/ better human readable symbols that are still unique?

What do you have in mind? Detect and encode recursion somehow? The example from the bug report 1.s.s.s.s.s isn't a recursive template, but a chain of calls.

Member

rainers commented Jul 25, 2016

Could we possibly come up w/ better human readable symbols that are still unique?

What do you have in mind? Detect and encode recursion somehow? The example from the bug report 1.s.s.s.s.s isn't a recursive template, but a chain of calls.

@mathias-lang-sociomantic

This comment has been minimized.

Show comment
Hide comment
@mathias-lang-sociomantic

mathias-lang-sociomantic Jul 25, 2016

Member

That's probably a good sign, so I cannot really ruin master by accident.

That's what you get for running random command :P
For reference the command given to you deleted the local branch then force push your current branch to the tracked upstream (that also mean you are tracking the blessed repo, which is rarely a good thing).
The correct command to remove a branch in a repo is git push remote-name :branch-name (but the Github UI is also a perfect fit).

it doesn't make them more displayable.

I'm not sure how much more displayable we can make them. There is always going to be unreadable symbols. IMHO time would be better invested into tooling support than something human-readable.

Member

mathias-lang-sociomantic commented Jul 25, 2016

That's probably a good sign, so I cannot really ruin master by accident.

That's what you get for running random command :P
For reference the command given to you deleted the local branch then force push your current branch to the tracked upstream (that also mean you are tracking the blessed repo, which is rarely a good thing).
The correct command to remove a branch in a repo is git push remote-name :branch-name (but the Github UI is also a perfect fit).

it doesn't make them more displayable.

I'm not sure how much more displayable we can make them. There is always going to be unreadable symbols. IMHO time would be better invested into tooling support than something human-readable.

@rainers

This comment has been minimized.

Show comment
Hide comment
@rainers

rainers Jul 25, 2016

Member

auto foo(T)(T t) {...}
foo(MyComplexType); // => foo!(T = MyComplexType)(T)

I like that. It avoids the exponential explosion of the symbol length for demangled names, too.

Member

rainers commented Jul 25, 2016

auto foo(T)(T t) {...}
foo(MyComplexType); // => foo!(T = MyComplexType)(T)

I like that. It avoids the exponential explosion of the symbol length for demangled names, too.

@rainers

This comment has been minimized.

Show comment
Hide comment
@rainers

rainers Jul 25, 2016

Member

That's what you get for running random command :P

Yeah, I wasn't really sure I should try that and should have waited for more comments... Fortunately, it didn't break anything.

Member

rainers commented Jul 25, 2016

That's what you get for running random command :P

Yeah, I wasn't really sure I should try that and should have waited for more comments... Fortunately, it didn't break anything.

@wilzbach

This comment has been minimized.

Show comment
Hide comment
@wilzbach

wilzbach Aug 7, 2016

Member

That's what you get for running random command :P
Yeah, I wasn't really sure I should try that and should have waited for more comments... Fortunately, it didn't break anything.

A bit OT, but we do have force-push protection for master and stable on all three repos enabled, right? O_o

Member

wilzbach commented Aug 7, 2016

That's what you get for running random command :P
Yeah, I wasn't really sure I should try that and should have waited for more comments... Fortunately, it didn't break anything.

A bit OT, but we do have force-push protection for master and stable on all three repos enabled, right? O_o

Show outdated Hide outdated src/toir.c
"4simd10__simd_stoFNaNbNiNfE2$7$3XMMfNhG16vZ%4d",
"4simd6__simdFNaNbNiNfE2$7$3XMMNhG16v%3hZ%3h",
"4simd6__simdFNaNbNiNfE2$7$3XMMNhG16v%3hhZ%3h",
"4simd6__simdFNaNbNiNfE2$7$3XMMNhG16vZ%3h",

This comment has been minimized.

@UplinkCoder

UplinkCoder Sep 4, 2016

Member

how can it do backrefs here ?
AFAICS it looses the ´core´

@UplinkCoder

UplinkCoder Sep 4, 2016

Member

how can it do backrefs here ?
AFAICS it looses the ´core´

This comment has been minimized.

@rainers

rainers Sep 4, 2016

Member

how can it do backrefs here ?
AFAICS it looses the ´core´

these strings are assumed to be prefixed by _D4core, see the comparisons below.

@rainers

rainers Sep 4, 2016

Member

how can it do backrefs here ?
AFAICS it looses the ´core´

these strings are assumed to be prefixed by _D4core, see the comparisons below.

@UplinkCoder

This comment has been minimized.

Show comment
Hide comment
Member

UplinkCoder commented Sep 4, 2016

@rainers

This comment has been minimized.

Show comment
Hide comment
@rainers

rainers Sep 5, 2016

Member

@rainers you might want to use fixwhitespace.
https://gist.github.com/UplinkCoder/2a8fd659eef9f3a2fdf8032f700bde80

Thanks. Usually my editor takes care of that, but most of this PR was done on another system with a non-configured emacs. I constantly have to fight its formatting rules...

Member

rainers commented Sep 5, 2016

@rainers you might want to use fixwhitespace.
https://gist.github.com/UplinkCoder/2a8fd659eef9f3a2fdf8032f700bde80

Thanks. Usually my editor takes care of that, but most of this PR was done on another system with a non-configured emacs. I constantly have to fight its formatting rules...

@rainers

This comment has been minimized.

Show comment
Hide comment
@rainers

rainers Sep 5, 2016

Member

This should now pass with https://github.com/rainers/druntime/tree/mangle_backref and https://github.com/rainers/phobos/tree/mangle_backref. I've changed the travis script accordingly to give it a try.

Member

rainers commented Sep 5, 2016

This should now pass with https://github.com/rainers/druntime/tree/mangle_backref and https://github.com/rainers/phobos/tree/mangle_backref. I've changed the travis script accordingly to give it a try.

@MartinNowak

This comment has been minimized.

Show comment
Hide comment
@MartinNowak

MartinNowak Oct 3, 2016

Member

This is lingering around for quite a while already, but it's actually an important fix.
Let's try to finish that for 2.073 (end of 2016).
I've listed a few point that still should be addressed/discussed https://issues.dlang.org/show_bug.cgi?id=15831#c13.
The most important one, can we only backref-abbreviate very long symbols to mitigate the impact of this change?

We definitely need to get @ibuclaw into the discussion, he rewrote gdb's D demangler.

Issue 15831 - Abbreviate long O(N²) manglings on Active | Trello

Member

MartinNowak commented Oct 3, 2016

This is lingering around for quite a while already, but it's actually an important fix.
Let's try to finish that for 2.073 (end of 2016).
I've listed a few point that still should be addressed/discussed https://issues.dlang.org/show_bug.cgi?id=15831#c13.
The most important one, can we only backref-abbreviate very long symbols to mitigate the impact of this change?

We definitely need to get @ibuclaw into the discussion, he rewrote gdb's D demangler.

Issue 15831 - Abbreviate long O(N²) manglings on Active | Trello

@MartinNowak

This comment has been minimized.

Show comment
Hide comment
@MartinNowak

MartinNowak Oct 3, 2016

Member

Would also be nice to start the discussion w/ a more formal spec update, rather than a low-level code review.

Member

MartinNowak commented Oct 3, 2016

Would also be nice to start the discussion w/ a more formal spec update, rather than a low-level code review.

@MartinNowak

This comment has been minimized.

Show comment
Hide comment
@MartinNowak

MartinNowak Oct 29, 2016

Member

@rainers what problems are still left w/ this?

Member

MartinNowak commented Oct 29, 2016

@rainers what problems are still left w/ this?

@dlang-bot

This comment has been minimized.

Show comment
Hide comment
@dlang-bot

dlang-bot Jul 6, 2017

Contributor

Thanks for your pull request, @rainers!

Bugzilla references

Fix Bugzilla Description
15831 IFTI voldemort type exploding bloat
Contributor

dlang-bot commented Jul 6, 2017

Thanks for your pull request, @rainers!

Bugzilla references

Fix Bugzilla Description
15831 IFTI voldemort type exploding bloat
@andralex

This comment has been minimized.

Show comment
Hide comment
@andralex

andralex Jul 15, 2017

Member

@MartinNowak are we good to go?

Member

andralex commented Jul 15, 2017

@MartinNowak are we good to go?

@wilzbach

This comment has been minimized.

Show comment
Hide comment
@wilzbach

wilzbach Jul 15, 2017

Member

It would be great if someone actually hurt by the huge symbols could build dmd with dlang/dmd#5855 and the runtime with dlang/druntime#1830 and see how it works out.

FYI, building druntime on my Linux instance spiked from 8.5s to 98s (reproducible) with your druntime changes. I haven't had a chance to look into details.

What master (f189b2d) dmd/5855
libphobos2-ut.so 279M 171M
std.algorithm object test files 118M 45M
std.range object test files 50M 34M
phobos unittest build + execute 65s 65s

Plots

I have thrown two quick plots together because I prefer visual graphics over looking at raw data. They aren't fancy, but I think they show @rainers great work (I had to use log-scale on the y-axis and different x-scales to make it look nice)

image

As the longest symbol with pr/5855 is 1142 it makes sense to look closer:

image

image

(the dark red area at the bottom is the common overlap)

(full Jupyter notebook with the before/after symbol files).

Member

wilzbach commented Jul 15, 2017

It would be great if someone actually hurt by the huge symbols could build dmd with dlang/dmd#5855 and the runtime with dlang/druntime#1830 and see how it works out.

FYI, building druntime on my Linux instance spiked from 8.5s to 98s (reproducible) with your druntime changes. I haven't had a chance to look into details.

What master (f189b2d) dmd/5855
libphobos2-ut.so 279M 171M
std.algorithm object test files 118M 45M
std.range object test files 50M 34M
phobos unittest build + execute 65s 65s

Plots

I have thrown two quick plots together because I prefer visual graphics over looking at raw data. They aren't fancy, but I think they show @rainers great work (I had to use log-scale on the y-axis and different x-scales to make it look nice)

image

As the longest symbol with pr/5855 is 1142 it makes sense to look closer:

image

image

(the dark red area at the bottom is the common overlap)

(full Jupyter notebook with the before/after symbol files).

@andralex

This comment has been minimized.

Show comment
Hide comment
@andralex

andralex Jul 15, 2017

Member

FYI, building druntime on my Linux instance spiked from 8.5s to 98s

Thanks for the analysis! Is that ninety-eight seconds?

Member

andralex commented Jul 15, 2017

FYI, building druntime on my Linux instance spiked from 8.5s to 98s

Thanks for the analysis! Is that ninety-eight seconds?

@wilzbach

This comment has been minimized.

Show comment
Hide comment
@wilzbach

wilzbach Jul 15, 2017

Member

Thanks for the analysis! Is that ninety-eight seconds?

Yes, btw we could create a mangle branch at dmd and druntime (like e.g. newCTFE), s.t.

  • we can easily check this on the auto-tester without needing to merge the druntime branch
  • @MartinNowak's nightly build server can build the feature branch and this option is as available via our install script: curl https://i.dlang.io | bash -s dmd-mangle (like curl https://i.dlang.io | bash -s dmd-newCTFE), s.t. more people can test this?
Member

wilzbach commented Jul 15, 2017

Thanks for the analysis! Is that ninety-eight seconds?

Yes, btw we could create a mangle branch at dmd and druntime (like e.g. newCTFE), s.t.

  • we can easily check this on the auto-tester without needing to merge the druntime branch
  • @MartinNowak's nightly build server can build the feature branch and this option is as available via our install script: curl https://i.dlang.io | bash -s dmd-mangle (like curl https://i.dlang.io | bash -s dmd-newCTFE), s.t. more people can test this?
@andralex

This comment has been minimized.

Show comment
Hide comment
@andralex

andralex Jul 15, 2017

Member

OK, sadly that won't work out. Even 9.8 seconds would be a problem. @rainers do you know what the speed bottleneck is?

Member

andralex commented Jul 15, 2017

OK, sadly that won't work out. Even 9.8 seconds would be a problem. @rainers do you know what the speed bottleneck is?

@rainers

This comment has been minimized.

Show comment
Hide comment
@rainers

rainers Jul 15, 2017

Member

OK, sadly that won't work out. Even 9.8 seconds would be a problem. @rainers do you know what the speed bottleneck is?

I just tried it in a linux VM and the build time of the druntime PR without this dmd PR raised the compilation time from about 10s to 70s. It seems similar on Windows, but not as extreme. I suspect that this might be caused by usage of externDFunc that now uses the demangler at compile time to construct the mangled name. If that's the cause I can create a short cut for most symbols, but I'll have to investigate tomorrow.

As Sebastian, I don't see that rise in build time for phobos unittests, though it was a few percent slower as shown in #5855 (comment).

Member

rainers commented Jul 15, 2017

OK, sadly that won't work out. Even 9.8 seconds would be a problem. @rainers do you know what the speed bottleneck is?

I just tried it in a linux VM and the build time of the druntime PR without this dmd PR raised the compilation time from about 10s to 70s. It seems similar on Windows, but not as extreme. I suspect that this might be caused by usage of externDFunc that now uses the demangler at compile time to construct the mangled name. If that's the cause I can create a short cut for most symbols, but I'll have to investigate tomorrow.

As Sebastian, I don't see that rise in build time for phobos unittests, though it was a few percent slower as shown in #5855 (comment).

@JackStouffer

This comment has been minimized.

Show comment
Hide comment
@JackStouffer

JackStouffer Jul 15, 2017

Member

If push comes to shove, we can always put this behind a DMD flag.

Member

JackStouffer commented Jul 15, 2017

If push comes to shove, we can always put this behind a DMD flag.

@rainers

This comment has been minimized.

Show comment
Hide comment
@rainers

rainers Jul 16, 2017

Member

but I'll have to investigate tomorrow.

It turned out this was just another instance of the dmd inliner going wild to an extent that the optimizer can't handle reasonably. I've disabled inlining some functions (dlang/druntime@81a06e7) and it now builds slightly faster than master for me.

Member

rainers commented Jul 16, 2017

but I'll have to investigate tomorrow.

It turned out this was just another instance of the dmd inliner going wild to an extent that the optimizer can't handle reasonably. I've disabled inlining some functions (dlang/druntime@81a06e7) and it now builds slightly faster than master for me.

@andralex

This comment has been minimized.

Show comment
Hide comment
@andralex

andralex Jul 16, 2017

Member

Awesome! Fix the failing tests?

Member

andralex commented Jul 16, 2017

Awesome! Fix the failing tests?

@rainers

This comment has been minimized.

Show comment
Hide comment
@rainers

rainers Jul 16, 2017

Member

Fix the failing tests?

The tests need the corresponding druntime change: dlang/druntime#1830

Member

rainers commented Jul 16, 2017

Fix the failing tests?

The tests need the corresponding druntime change: dlang/druntime#1830

@wilzbach wilzbach changed the base branch from master to mangle Jul 16, 2017

@dlang-bot

This comment has been minimized.

Show comment
Hide comment
@dlang-bot

dlang-bot Jul 16, 2017

Contributor

Thanks for your pull request, @rainers!

Bugzilla references

Auto-close Bugzilla Description
15831 IFTI voldemort type exploding bloat
Contributor

dlang-bot commented Jul 16, 2017

Thanks for your pull request, @rainers!

Bugzilla references

Auto-close Bugzilla Description
15831 IFTI voldemort type exploding bloat

@wilzbach wilzbach closed this Jul 16, 2017

@wilzbach wilzbach reopened this Jul 16, 2017

@dlang-bot

This comment has been minimized.

Show comment
Hide comment
@dlang-bot

dlang-bot Jul 16, 2017

Contributor

Thanks for your pull request, @rainers!

Bugzilla references

Auto-close Bugzilla Description
15831 IFTI voldemort type exploding bloat
Contributor

dlang-bot commented Jul 16, 2017

Thanks for your pull request, @rainers!

Bugzilla references

Auto-close Bugzilla Description
15831 IFTI voldemort type exploding bloat
@wilzbach

This comment has been minimized.

Show comment
Hide comment
@wilzbach

wilzbach Jul 16, 2017

Member

The tests need the corresponding druntime change: dlang/druntime#1830

Created a mangle branch and changed the target on both PRs, s.t. the auto-tester and other CIs can test this without needing to merge the druntime PR (hence also the rebase)

Member

wilzbach commented Jul 16, 2017

The tests need the corresponding druntime change: dlang/druntime#1830

Created a mangle branch and changed the target on both PRs, s.t. the auto-tester and other CIs can test this without needing to merge the druntime PR (hence also the rebase)

@wilzbach wilzbach referenced this pull request Jul 16, 2017

Open

Prevent double posts #141

@andralex

This comment has been minimized.

Show comment
Hide comment
@andralex

andralex Jul 16, 2017

Member

@wilzbach good to know! Well I've pulled dlang/druntime#1830 already because it was passing tests.

Member

andralex commented Jul 16, 2017

@wilzbach good to know! Well I've pulled dlang/druntime#1830 already because it was passing tests.

@andralex

This comment has been minimized.

Show comment
Hide comment
@andralex

andralex Jul 16, 2017

Member

@wilzbach is there a way now to kick off the tests again?

Member

andralex commented Jul 16, 2017

@wilzbach is there a way now to kick off the tests again?

@rainers

This comment has been minimized.

Show comment
Hide comment
@rainers

rainers Jul 16, 2017

Member

Created a mangle branch and changed the target on both PRs

Thanks.

is there a way now to kick off the tests again?

I see a new test failure, will fix it in a few minutes.

Member

rainers commented Jul 16, 2017

Created a mangle branch and changed the target on both PRs

Thanks.

is there a way now to kick off the tests again?

I see a new test failure, will fix it in a few minutes.

@MartinNowak

This comment has been minimized.

Show comment
Hide comment
@MartinNowak

MartinNowak Jul 16, 2017

Member

Let's maybe go with __traits(__externDMangle) as it's internal only and might get replaced with a bit of druntime CTFE logic.
Already done in dlang/druntime#1830, thanks @rainers :).

Member

MartinNowak commented Jul 16, 2017

Let's maybe go with __traits(__externDMangle) as it's internal only and might get replaced with a bit of druntime CTFE logic.
Already done in dlang/druntime#1830, thanks @rainers :).

@rainers

This comment has been minimized.

Show comment
Hide comment
@rainers

rainers Jul 16, 2017

Member

Let's maybe use __traits(__externDMangle)

The new trait is completely gone in this version.

Member

rainers commented Jul 16, 2017

Let's maybe use __traits(__externDMangle)

The new trait is completely gone in this version.

@andralex andralex added the auto-merge label Jul 16, 2017

@dlang-bot dlang-bot merged commit 16101ef into dlang:mangle Jul 16, 2017

0 of 7 checks passed

codecov/patch CI failed.
Details
codecov/project CI failed.
Details
continuous-integration/jenkins/pr-merge This commit cannot be built
Details
auto-tester Fail: 1, Pending: 9
Details
ci/circleci Your tests failed on CircleCI
Details
CyberShadow/DAutoTest Building documentation
Details
continuous-integration/travis-ci/pr The Travis CI build is in progress
Details
@rainers

This comment has been minimized.

Show comment
Hide comment
@rainers

rainers Jul 16, 2017

Member

I see a new test failure, will fix it in a few minutes.

Works for me locally. I suspect I still looked at the test results of building against the master druntime.

I'm slightly confused by all the new branches, though. Is there anywhere we can check the build with both dmd/druntime patches.

Member

rainers commented Jul 16, 2017

I see a new test failure, will fix it in a few minutes.

Works for me locally. I suspect I still looked at the test results of building against the master druntime.

I'm slightly confused by all the new branches, though. Is there anywhere we can check the build with both dmd/druntime patches.

@andralex

This comment has been minimized.

Show comment
Hide comment
@andralex

andralex Jul 16, 2017

Member

@rainers probably #6997 is to look at

Member

andralex commented Jul 16, 2017

@rainers probably #6997 is to look at

@rainers

This comment has been minimized.

Show comment
Hide comment
@rainers

rainers Jul 16, 2017

Member

probably #6997 is to look at

@andralex Nah, that's merge_stable, not merge_mangle.

Member

rainers commented Jul 16, 2017

probably #6997 is to look at

@andralex Nah, that's merge_stable, not merge_mangle.

@wilzbach

This comment has been minimized.

Show comment
Hide comment
@wilzbach

wilzbach Jul 16, 2017

Member

I'm slightly confused by all the new branches, though. Is there anywhere we can check the build with both dmd/druntime patches.

That was the idea behind the branches, sorry for the confusion. In general, for a PR to branch X the same branch will be checked out when building druntime and phobos (master is used as soft-fallback).

@rainers probably #6997 is to look at

I think #6998 is the right one (it will checkout master at druntime as it is targeted at master)

Member

wilzbach commented Jul 16, 2017

I'm slightly confused by all the new branches, though. Is there anywhere we can check the build with both dmd/druntime patches.

That was the idea behind the branches, sorry for the confusion. In general, for a PR to branch X the same branch will be checked out when building druntime and phobos (master is used as soft-fallback).

@rainers probably #6997 is to look at

I think #6998 is the right one (it will checkout master at druntime as it is targeted at master)

@quickfur

This comment has been minimized.

Show comment
Hide comment
@quickfur

quickfur Jul 17, 2017

Member

Cool beans! I just checked out the mangle branch (for others who may be interested: you have to checkout mangle for both dmd and druntime, and then recompile dmd, druntime, phobos) and tested it on my template-heavy project. The results are very promising: my executable sizes are down to about 4MB now (down from about 8MB for the version with OO type erasure, or 20MB for the version with no OO type erasure).

  • With the horcrux version of Phobos (i.e., dlang/phobos#5610 and dlang/phobos#5611), one executable has size 4751768 bytes (4.75 MB).
  • With Phobos git HEAD (i.e., without the horcrux trick), the same executable has size 4785096 (4.79 MB), about 30KB larger.

I also tested two versions of the same program (from git history):

  • A version with type erasure (in the middle of the UFCS chain, insert an OO wrapper that erases the types of previous components of the chain), which now has all symbols less than 1000 characters;
  • A version without type erasure, where the largest symbol is about 1200 characters, a noticeable but mostly negligible difference.

So this looks very promising indeed.

However, compile times have noticeably slowed down, so that will have to be addressed before we merge this to master. In any case, many thanks to @rainers for pulling this off!

Member

quickfur commented Jul 17, 2017

Cool beans! I just checked out the mangle branch (for others who may be interested: you have to checkout mangle for both dmd and druntime, and then recompile dmd, druntime, phobos) and tested it on my template-heavy project. The results are very promising: my executable sizes are down to about 4MB now (down from about 8MB for the version with OO type erasure, or 20MB for the version with no OO type erasure).

  • With the horcrux version of Phobos (i.e., dlang/phobos#5610 and dlang/phobos#5611), one executable has size 4751768 bytes (4.75 MB).
  • With Phobos git HEAD (i.e., without the horcrux trick), the same executable has size 4785096 (4.79 MB), about 30KB larger.

I also tested two versions of the same program (from git history):

  • A version with type erasure (in the middle of the UFCS chain, insert an OO wrapper that erases the types of previous components of the chain), which now has all symbols less than 1000 characters;
  • A version without type erasure, where the largest symbol is about 1200 characters, a noticeable but mostly negligible difference.

So this looks very promising indeed.

However, compile times have noticeably slowed down, so that will have to be addressed before we merge this to master. In any case, many thanks to @rainers for pulling this off!

@rainers

This comment has been minimized.

Show comment
Hide comment
@rainers

rainers Jul 17, 2017

Member

I just checked out the mangle branch (for others who may be interested: you have to checkout mangle for both dmd and druntime

Thank you for testing. The druntime changes are already merged to master, and the "official" PR for dmd is now #6998.

However, compile times have noticeably slowed down

Is your project available for testing somewhere? I've been using phobos unittests for benchmrking, and that showed an increase of compilation times of a few percent. Sebastians test above didn't show a difference at all.

Member

rainers commented Jul 17, 2017

I just checked out the mangle branch (for others who may be interested: you have to checkout mangle for both dmd and druntime

Thank you for testing. The druntime changes are already merged to master, and the "official" PR for dmd is now #6998.

However, compile times have noticeably slowed down

Is your project available for testing somewhere? I've been using phobos unittests for benchmrking, and that showed an increase of compilation times of a few percent. Sebastians test above didn't show a difference at all.

@quickfur

This comment has been minimized.

Show comment
Hide comment
@quickfur

quickfur Jul 17, 2017

Member

Actually, I take back my comment about compile times. I just re-tested on two versions of my program:

  • With OO type erasure (smaller symbols lengths overall), compilation is about 13 seconds on git HEAD, but about 11 seconds on the mangle branch.
  • Without OO type erasure (longer symbols), compilation is about 40 seconds on git HEAD, but about 12 seconds on the mangle branch.

So actually, there's a decrease in compilation times, not an increase.

Member

quickfur commented Jul 17, 2017

Actually, I take back my comment about compile times. I just re-tested on two versions of my program:

  • With OO type erasure (smaller symbols lengths overall), compilation is about 13 seconds on git HEAD, but about 11 seconds on the mangle branch.
  • Without OO type erasure (longer symbols), compilation is about 40 seconds on git HEAD, but about 12 seconds on the mangle branch.

So actually, there's a decrease in compilation times, not an increase.

@rainers

This comment has been minimized.

Show comment
Hide comment
@rainers

rainers Jul 17, 2017

Member

Without OO type erasure (longer symbols), compilation is about 40 seconds, but about 12 seconds on the mangle branch.

Cool. That's what I was hoping for ;-) It also happens in https://issues.dlang.org/show_bug.cgi?id=15831 if symbols get really long.

Member

rainers commented Jul 17, 2017

Without OO type erasure (longer symbols), compilation is about 40 seconds, but about 12 seconds on the mangle branch.

Cool. That's what I was hoping for ;-) It also happens in https://issues.dlang.org/show_bug.cgi?id=15831 if symbols get really long.

{
mangleTemplateInstance(ti);
}
else if (p.getIdent())

This comment has been minimized.

@ibuclaw

ibuclaw Jun 24, 2018

Member

This change caused a regression (or exposed an underlying bug).

Type merging of immutable!T -> T is now broken, if T is nested in a struct S(alias fun). Calling mangleToBuffer is called before and after fun has completed its semantic pass now yields two different results. So there are two copies of T floating around the AST with pointers to immutable!T.

struct MultiwayMerge()
{ 
    bool compFront()
    {
        return true;
    }

    struct BinaryHeap(alias less)
    {
        // Both:
        // __T13MultiwayMergeZQq__T10BinaryHeapS_DQBu__TQBqZQBu9compFrontMFZbZQBr4Impl
        // __T13MultiwayMergeZQq__T10BinaryHeapS_DQBu__TQBqZQBu9compFrontMFNaNbNiNfZbZQBz4Impl
        struct Impl
        {
            int _payload;
        }

        void initialize()
        {
            immutable Impl init;
        }   
    }

    BinaryHeap!(compFront) _heap;
}

MultiwayMerge!() multiwayMerge;
@ibuclaw

ibuclaw Jun 24, 2018

Member

This change caused a regression (or exposed an underlying bug).

Type merging of immutable!T -> T is now broken, if T is nested in a struct S(alias fun). Calling mangleToBuffer is called before and after fun has completed its semantic pass now yields two different results. So there are two copies of T floating around the AST with pointers to immutable!T.

struct MultiwayMerge()
{ 
    bool compFront()
    {
        return true;
    }

    struct BinaryHeap(alias less)
    {
        // Both:
        // __T13MultiwayMergeZQq__T10BinaryHeapS_DQBu__TQBqZQBu9compFrontMFZbZQBr4Impl
        // __T13MultiwayMergeZQq__T10BinaryHeapS_DQBu__TQBqZQBu9compFrontMFNaNbNiNfZbZQBz4Impl
        struct Impl
        {
            int _payload;
        }

        void initialize()
        {
            immutable Impl init;
        }   
    }

    BinaryHeap!(compFront) _heap;
}

MultiwayMerge!() multiwayMerge;

This comment has been minimized.

@rainers

rainers Jun 25, 2018

Member

Calling mangleToBuffer is called before and after fun has completed its semantic pass now yields two different results.

AFAICT calling mangleToBuffer on an incompletely analyzed symbol should never happen. The old implementation reused the first (incorrect) result, so the issue could have passed unnoticed.

@rainers

rainers Jun 25, 2018

Member

Calling mangleToBuffer is called before and after fun has completed its semantic pass now yields two different results.

AFAICT calling mangleToBuffer on an incompletely analyzed symbol should never happen. The old implementation reused the first (incorrect) result, so the issue could have passed unnoticed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment