-
-
Notifications
You must be signed in to change notification settings - Fork 606
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[REG2.067] Issue 14828 - duplicate symbol __ModuleInfoZ depending on ordering of files passed to dmd #4851
Conversation
Seems to me that the helper functions should be generated when the moduleinfo is, and only then. I don't understand why foo.d's object code should include helper functions for bar.d. |
By #3552, the module helper functions are also placed in COMDAT, when they're used by non-root template instantiations. |
Module *mx = Module::amodules[j]; | ||
if (mx != m && mx->importedFrom == m && (mx->marray || mx->massert || mx->munittest)) | ||
genhelpers(mx, true); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand why foo.d's object code should include helper functions for bar.d.
@WalterBright Here was the problematic part. If global.params.multiobj == false
, it's not a problem.
But when global.params.multiobj == true
, in the library compilation:
all1_order_flipped:
${dmd} -oflibproj6.a -lib -g project2.d foo2.d foo1.d
Both foo2
and foo1
imports stdio
, then stdio
->importedFrom is set to foo2
. And this code has emit the stdio
module helpers into the libraried foo2.obj
, which also contains foo2.__ModuleInfoZ
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The stdio module helpers should be there with the stdio.obj file, along with stdio's moduleinfo. There should be no reason for any other module to generate either.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The stdio module helpers should be there with the stdio.obj file, along with stdio's moduleinfo.
Yes, it's still correct. When we compile stdio.d, stdio.obj will contain both non-COMDAT moduleinfo and three helper functions.
There should be no reason for any other module to generate either.
No, it's necessary to fix issue 846. When a user instantiate writeln!(), it will implicitly use the helper stdio.__array
for array bounds check. But, if he don't link stdio.obj, it had got the linker error "undefined symbol stdio.__array". In #3552, I've applied those changes to fix 846:
- All module helper functions had changed to not use ModuleInfo. It was just for getting module file name, so I changed to use bare module file name string as the alternative.
- After that, each helper functions has changed to not dependent to the corresponding ModuleInfo. When they're required for a non-root template instance, they will be also generated and placed in COMDAT section. Finally, the use of them won't cause link failures, even if the non-COMDAT versions are not linked.
If you want to know the details, please also check the change in #3552. Honestly this PR is a supplemental change of that. Honestly, when I wrote that PR, I didn't think well about the library generation. So the COMDAT helpers had placed in incorrect places. This PR will fix that issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, so #3552 is where it came from. Thanks for pointing it out. I did not review that PR. I believe it is the wrong solution, however. A very simple solution is to compile the helpers and moduleinfo into stdio.obj. Then, you have to link with stdio.obj. I do not think it should be a surprise to anyone that if they import stdio, they have to link with stdio.obj. (The stdio.obj should be in the library, too, meaning it will get pulled in automatically.)
This solution resolves https://issues.dlang.org/show_bug.cgi?id=846 in a simple, easy to understand, and easy to implement manner. Furthermore, it eliminates all that .obj file bloat I've noticed where the helper functions for every import get generated into every .obj file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Furthermore, it eliminates all that .obj file bloat I've noticed where the helper functions for every import get generated into every .obj file.
Bloat is an exaggeration, each of those functions only adds about 50 bytes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are 3 per import, and then repeated endlessly for each object file. For:
import std.stdio;
void test() { }
the following are generated:
_D6object7__arrayZ COMDAT flags=x0 attr=x10 align=x0
_D6object8__assertFiZv COMDAT flags=x0 attr=x10 align=x0
_D6object15__unittest_failFiZv COMDAT flags=x0 attr=x10 align=x0
_D4core4stdc6stdint7__arrayZ COMDAT flags=x0 attr=x10 align=x0
_D4core4stdc6stdint8__assertFiZv COMDAT flags=x0 attr=x10 align=x0
_D4core4stdc6stdint15__unittest_failFiZv COMDAT flags=x0 attr=x10 align=x0
_D3std8typecons7__arrayZ COMDAT flags=x0 attr=x10 align=x0
_D3std8typecons8__assertFiZv COMDAT flags=x0 attr=x10 align=x0
_D3std8typecons15__unittest_failFiZv COMDAT flags=x0 attr=x10 align=x0
_D3std6traits7__arrayZ COMDAT flags=x0 attr=x10 align=x0
_D3std6traits8__assertFiZv COMDAT flags=x0 attr=x10 align=x0
_D3std6traits15__unittest_failFiZv COMDAT flags=x0 attr=x10 align=x0
_D3std4meta7__arrayZ COMDAT flags=x0 attr=x10 align=x0
_D3std4meta8__assertFiZv COMDAT flags=x0 attr=x10 align=x0
_D3std4meta15__unittest_failFiZv COMDAT flags=x0 attr=x10 align=x0
and:
db 066h,06fh,06fh,000h,063h,03ah,05ch,063h ;foo.c:\c
db 062h,078h,05ch,06dh,061h,072h,073h,05ch ;bx\mars\
db 064h,072h,075h,06eh,074h,069h,06dh,065h ;druntime
db 05ch,069h,06dh,070h,06fh,072h,074h,05ch ;\import\
db 06fh,062h,06ah,065h,063h,074h,02eh,064h ;object.d
db 000h,063h,03ah,05ch,063h,062h,078h,05ch ;.c:\cbx\
db 06dh,061h,072h,073h,05ch,064h,072h,075h ;mars\dru
db 06eh,074h,069h,06dh,065h,05ch,069h,06dh ;ntime\im
db 070h,06fh,072h,074h,05ch,063h,06fh,072h ;port\cor
db 065h,05ch,073h,074h,064h,063h,05ch,073h ;e\stdc\s
db 074h,064h,069h,06eh,074h,02eh,064h,000h ;tdint.d.
db 063h,03ah,05ch,063h,062h,078h,05ch,06dh ;c:\cbx\m
db 061h,072h,073h,05ch,070h,068h,06fh,062h ;ars\phob
db 06fh,073h,05ch,073h,074h,064h,05ch,074h ;os\std\t
db 079h,070h,065h,063h,06fh,06eh,073h,02eh ;ypecons.
db 064h,000h,063h,03ah,05ch,063h,062h,078h ;d.c:\cbx
db 05ch,06dh,061h,072h,073h,05ch,070h,068h ;\mars\ph
db 06fh,062h,06fh,073h,05ch,073h,074h,064h ;obos\std
db 05ch,074h,072h,061h,069h,074h,073h,02eh ;\traits.
db 064h,000h,063h,03ah,05ch,063h,062h,078h ;d.c:\cbx
db 05ch,06dh,061h,072h,073h,05ch,070h,068h ;\mars\ph
db 06fh,062h,06fh,073h,05ch,073h,074h,064h ;obos\std
db 05ch,06dh,065h,074h,061h,02eh,064h,000h ;\meta.d.
used only by those helper functions.
The object file is 2,229 bytes, although the generated code for test(){}
is one byte.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generating __unittest_fail
is just a bug. I'm not adding improvement for that in this PR, but I can easily kill it.
@WalterBright I also think this is the root of issue 14748. When someone tried to use the COMDAT helpers that is placed in wrong object file, the linker would pull the whole object file itself that might contain huge code blocks. |
It might be simpler to always create the helper functions (even for release code) and avoid the COMDATs for non-root modules. That's an alternative fix to Issue 846 that should also fix this issue. |
We already have code that tries this, but it was moved from TemplateDeclaration to TemplateInstance and no longer works. Can inlined code also use those helper? If so we'd need to unconditionally generate them. |
Alternatively we might drop the helpers and always directly call _d_arrayassert pass module name + line number. That would increase code size for each assertion a little bit though. |
Exactly. |
I think we should favor the human friendly compiler behavior than compiler-friendly requirement, as far as possible. Please consider the pragmatic situation of issue 846. For example: "I want to use a template function std.algorithm.map for the stack allocated arrays in my embedded system program. But to reduce binary size, I don't link the phobos library. But it would cause undefined symbol link failure: std.range.primitives,__array. "What's the __array? Who use that symbol in std.range.primitives? I cannot understand..." Walter, you should quickly understand the following actual reason, and explain to him.
The little complexity in compiler code will avoid such the error. I think it's huge benefit. |
I also considered the idea to fix issue 846. But as you say, it will increase binary size per each |
Unfortunately, it produces regressions and bloat. It's always been true that if you import stdio.d, you have to link with stdio.obj. It's a simple rule. Trying to make it work otherwise is just going to lead to endless problems.
I don't agree. #3552 adds 60 lines and touches 9 files, and this fix for it adds 200 lines of code and touches 10 files.
It obviously depends on std.range.
I do not. Trying to import Phobos without linking with Phobos is doable, but only for people who really know what they're doing and can deal with issues like this, i.e. they have to be familiar with how things actually work. For those who aren't ready to look under the hood, they need to link with Phobos, as that is how things are designed. The solution I propose:
Problem solved! |
Without having looked at what is proposed here in detail:
Not quite. One of the requirements for Deimos bindings is that they are written in a way that avoids having to link against the D modules. Not supporting this would certainly make LDC's life easier; I'm just saying that many people expect DMD to behave differently. If we do not want to support this, it should be prominently documented. |
The simple rule will disallow template library idiom forever. It's possible with C++, but not with D. Isn't D a better designed C++?
+44 line comes from the test case. So, 60-44 == 16 lines is enough small.
Sorry, my words was not precise. I meant: It looks like not to depend on other non-template symbols in Phobos.
That's not only with Phobos issue. If a user write his own template library and use it without linking the module object file, same error happens.
As I understand, it's already done. ( |
If the |
That works if the import is declarations only. It doesn't work if you start having definitions with arrays. I.e. as I mentioned before you have to know what you're doing. |
So link with the template library .obj file. Not a big deal.
That's right, he has to link with the module object file. This is not a terrible burden.
Anyone writing a bare metal system will have to learn about object files, linking, etc., regardless. Also, generating the helper files into every object file does NOT solve the problem. What about struct initializers? Symbolic debug info? Class vtables? Static variables? This PR is not only complex, it doesn't work - it's a bandaid and will lead to false expectations and further complaints. |
The original idea of #3552 is that: if the module helpers are used from instantiated templates, those should also be "instantiated" (generated and placed in COMDAT section). Indeed, it would increase the size of object code a little, but does not affect to the final executable size. I believe it's still simple and good behavior. |
I agree that they should have the general knowlege about object files and linking. But requiring knowledge about dmd internals is unnecessary constraint. I've argue that such the module helpers belong to the dmd implementation detail.
If they're instantiated things, they're placed in the instantiated object code. If not, the requires to link corresponding object file. Every thing can follow the simple "template instantiation" model. I believe making an exception for the module helpers is bad idea. |
Plain old link command might also pull all symbols other than helpers in that module. If we force the link, anymore we would not be able to shrink final executable size in that case. In general, such the enforcement is called bad know-how. I don't want to left it in the future of D. |
If it does, then the code in that module contains more than templates. |
What do you mean? |
I've built embedded systems, standalone systems, etc. The idea that one can do so and have no knowledge of details of how the compiler you're using generates symbols is just not going to work. That includes using C and C++.
The only way there'd be other code in the modules .obj file and references to more external symbols is if and only if there are more than just templates in the module. The compiler does not insert random references to other symbols in object files. |
If you're a poor compiler's user, such the defensive stance is not bad, but we're now working as compiler developers. Making the compiler smarter is our duty. |
Very sorry, I still don't understand about the explained situation. (I'm Japanese. One longer statement is harder to read) At least I'm talking about the situation to use a fully templated utility (that does not depend on any non-template symbols). Le's show a minimized code: module a; import b;
void main() { int[3] arr; int n = foo(arr[], 1); } module b;
T foo(T)(T[] arr, size_t i) { return arr[i]; }
When However, the old dmd had implicitly added the dependency from I think the old behavior had been contrary to the general mental model about the "template instantiation". Most of D programmers don't know well about dmd-internals, and don't know what the missing of Again, the current As we already does in git-head, the "correct way" is not expensive excepting for Walter. I wish to keep it.keep it. |
I understand, I know you want to avoid having to link with the module's .obj file. But your solution causes more problems, and it DOES NOT solve the problem in general as detailed above. I don't believe that linking in the module's .obj file is some awful problem. Just tell the user to link it in. This is how the D compilation system is designed to work. |
At least this PR is not for general issues. It's for the problematic object file generation around module helpers.
What will be caused? This is specific for module helpers. And actually fixes the regression issue. If you have concern about this PR, please explain it.
So, please document it as a part of D specification. Then, I'll add a section to changelog that, we decided not to support "template library" idiom - it's a D defeat from C++. |
It's the same issue for struct initializers, etc. The same. This design does not solve the problem, it's a bandaid for just one aspect of it, and it does not scale.
Again:
It's implicit in needing to compile all modules in a project and then link them together.
C++ doesn't even have modules. Use |
What's accidental complexity and increases this PR's diff a lot is to have 3 different object compilation modes, each with it's own shortcomings. |
if (entrypoint && m == rootHasMain) | ||
genObjFile(entrypoint, global.params.multiobj); | ||
genObjFile(entrypoint, false); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
genObjFile
, what a misnomer, it doesn't actually generate a file.
…g of files passed to dmd
continue; | ||
if (!mx->marray && !mx->massert && !mx->munittest) | ||
continue; | ||
genhelpers(mx, true); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still emitting weak duplicates of those helper functions into other modules, means it won't solve Issue 14748 for separate compilation.
This indicates a severe flaw of this approach.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But then again, other than libraries, objects passed to the linker aren't optional by default, so it might not be a problem.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right.
Rest LGTM, moving those helpers to separate library objects solves the problem without adding linkage requirements to header only libraries. |
@MartinNowak Thanks for your detailed review! |
Not if it references a struct declared in the same module that isn't part of the template. Or a class' vtable. Or any references to static data.
It's hard to see why one would need templates to interface with C .h files. But be that as it may, again, this does NOT solve the problem in general as noted above. "can't" is too strong of a word, as well. In the rare case where one might be indexing arrays in a Deimos template, and I mean rare, there are simple ways to do it that will not invoke the calls to __array and do not affect the user. I disagree strongly with this approach. I feel we've all had our say, and it's holding up 2.068. I wish we could reach consensus, but it has reached a point where we need have a decision, and that falls to me. Let's go with the 2 step approach I outlined above. Please close this. The two bugzilla issues on this do present an opportunity for us to improve the documentation. |
Again, this is not for the general problem. This is specific fix for the module helpers.
No, I cannot close this PR yet. Your argument is still unreasonable to me. I organize and provide my point again: The module helpers are not visible from dmd users. No spec documentation explains about that. Neither druntime nor phobos code doesn't contain their implementations. It's just belong to the compiler implementation. Therefore if any link failures happen due to them, it must be merely a compiler's bug. I doubt that you would needlessly stick the 'simple rule'. |
P.S. Rewriting from a small C macro code to the D template function is not uncommon idea. I think that's merely unused in most actual projects, because of issue 846 that was existed until 2.065. |
But, it's rare that one needs to index an array in C library macro (in fact it only applies to static arrays), and one can work around the bounds checking by using Let's not conflate the discussion with C++'s template header libraries. We have enough important issues with our existing compilation modes, that we should not start to support another one, and as Walter said, the rule is simple, compile and link the module. |
Indexing and slicing requires
It will make the indexing access unsafe automatically. Isn't memory safety is one of the advantage of D? I am not thinking D should support full template library idiom. Separate compilation model never allow that. I know that. module b;
int g;
struct S { ... }
auto foo(T)(T t) { ++g; return t; }
auto foo()() { return typeof(S); }
template Foo(T) {
static this() { }
auto bar() { return 1; }
} For that, understanding the explanation "D relies on separate compilation model" is easy, But again, module helpers are just the dmd's implementation detail. Consider: module b;
dmd is a reference compiler among the variation of D compilers. It's implementation details should be hidden as far as possible, to keep its portability maximum. |
If we reintroduce the link-failure, the complaint about that in the learn forum will be endless. We don't have enough man power to answer it. But I also think it's not good to document the internal name |
You don't have D arrays in C header files. |
It'll remain a rare problem. |
Disallowing trivial rewriting from
'rare' is not same with nothing. |
I implemented the fix I suggested in #4859 |
I defeated in the debate, with the score 1 VS 2. So I'll forget the issue. |
Thank you, Kenji. I appreciate it. |
It's often useful to be pragmatic, b/c what seems like an important issue from a close distance often is of little importance in a bigger scope. |
I think I am one of the top pull request sender, but most of them are for the small bug fixes. I'm really sad so you're saying small change is closer distance issue. It sounds to me that my small maintenance bugfixes are less important than other committers' big pictured changes. |
No, this is not what I meant. Your changes have the biggest impact and all the small fixes are of utmost importance, particularly b/c dmd's implementation tended to ignore many important details. What I was trying to say is the following. Sometimes a decision between A and B is hard to make and will fuel a lengthy debate. Often that decision (A or B) is of little importance in the long-term, b/c either both work or it's simple to change the decision later on. |
https://issues.dlang.org/show_bug.cgi?id=14828