0.198 no longer builds on armv7hl #3639

Closed
belegdol opened this Issue Jun 6, 2018 · 22 comments

Comments

Projects
None yet
5 participants
@belegdol
Contributor

belegdol commented Jun 6, 2018

0.198 no longer builds on armv7hl because the linker runs out of memory. The errors are:
/usr/bin/ld: cannot size stub section: Memory exhausted
or: /usr/bin/ld: failed to set dynamic section sizes: Memory exhausted
0.197 was building fine using the same set-up. You may find detailed logs in koji:
https://koji.fedoraproject.org/koji/taskinfo?taskID=27319209
https://koji.fedoraproject.org/koji/taskinfo?taskID=27319211
https://koji.fedoraproject.org/koji/taskinfo?taskID=27319237

Member

cuavas commented Jun 6, 2018

I'm not sure there's a lot we can do about this. The number of symbols will likely continue to rise as we migrate more stuff from macros and glue code to C++ templates. We're still building successfully on 32-bit x86 platforms, including MinGW with a 2GB process limit.
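
To illustrate why templates drive the symbol count up, here is a minimal sketch. The names `finder`, `cart_interface` and `other_interface` are hypothetical stand-ins, not MAME's real classes. Each distinct template argument list names a distinct polymorphic type, and the compiler typically emits a separate vtable, typeinfo object and typeinfo name for every one that is used:

```cpp
#include <cassert>
#include <typeinfo>

// Hypothetical stand-ins for illustration only; not MAME's real classes.
struct cart_interface { };
struct other_interface { };

// A small polymorphic class template, loosely in the spirit of a "finder".
template <typename T, bool Required>
class finder
{
public:
    virtual ~finder() { }
    bool required() const { return Required; }
};

// Every distinct argument list is a distinct type with its own type_info,
// so three instantiations mean three sets of class data for the linker.
inline bool instantiations_are_distinct_types()
{
    return typeid(finder<cart_interface, false>) != typeid(finder<cart_interface, true>)
        && typeid(finder<cart_interface, false>) != typeid(finder<other_interface, false>);
}
```

Calling `instantiations_are_distinct_types()` returns `true`; multiply that by every device and interface combination in a large code base and the linker has a lot more to track.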

Contributor

balr0g commented Jun 6, 2018

@belegdol can a 64-bit ld be used to build a 32-bit binary (e.g. using -m elf_i386)?

Contributor

belegdol commented Jun 6, 2018

@balr0g, not for package-building purposes; the builds are required to be native.

Contributor

balr0g commented Jun 6, 2018

@belegdol are we already passing -Wl,--no-keep-memory -Wl,--reduce-memory-overheads to the linker?
Alternatively, is it possible to use the gold linker? Apparently this issue has hit LLVM in the past, and that was one suggested workaround.

Contributor

belegdol commented Jun 7, 2018

@balr0g, thanks for the suggestion! I tried the additional linker flags but it unfortunately did not help:
https://koji.fedoraproject.org/koji/taskinfo?taskID=27460043
The log is full of lines like

../../../../../scripts/mame_mame/liboptional.a(coco_gmc.o):(.rodata+0x6c): multiple definition of `typeinfo name for device_finder<device_cococart_interface, false>'
../../../../../scripts/mame_mame/liboptional.a(coco_dcmodem.o):(.rodata+0x6c): first defined here

Could this be somehow related?

Contributor

drencorxeen commented Jun 7, 2018

@belegdol ,
There are a few of us who have had the linker issue as well.
#3605

One of the workarounds that @mrgw454 found is described in that issue.

Though @cuavas says it is a linker bug and should be reported to the developers of binutils.

So far I have tried GCC 5.3, 6.3, 6.4, 7.3, and 8.1 in combination with the default ld and ld.gold on Raspbian, as well as downloading and building the current binutils sources. We get the same issue with ld and ld.gold from binutils 2.30.

Good luck and hope you have better luck getting this issue resolved.

Contributor

belegdol commented Jun 7, 2018

@drencorxeen, it could be a problem in the ARM version of ld: I am only seeing this issue on armv7hl, but not on aarch64, s390x, i686 or x86_64. In issue #3605 you are also getting the errors on 32-bit ARM.

Contributor

drencorxeen commented Jun 7, 2018

@belegdol ,

Yes, Raspbian is 32-bit ARM only. So far the Raspberry Pi Foundation hasn't released anything 64-bit for the Raspberry Pis that use 64-bit ARM CPUs. From what I last saw, they had no plans to make a 64-bit distro.

As for the build issues, I have tried building MAME on Raspbian Wheezy, Jessie, and Stretch, and I get the same build issues on all of them.

Contributor

belegdol commented Jun 9, 2018

For the record, using gold does not help:
https://koji.fedoraproject.org/koji/taskinfo?taskID=27501609

Contributor

belegdol commented Jun 10, 2018

It turns out I was not looking closely enough: using -Wl,--no-keep-memory -Wl,--reduce-memory-overheads actually does solve the OOM problem, but linking still fails due to multiple definitions. The same happens when using ld.gold together with -Os optimisation.

Contributor

belegdol commented Jun 12, 2018

Hello,
I got the following feedback from the Fedora ld maintainer:

I would like to fix the multiple definition problem if possible, since that seems like it is a real bug. Is there a way to produce a reduced testcase that reproduces the problem? (I am not a C++ expert, so I am hoping that someone else will be able to reduce the problem down to a more manageable size...)

Would anybody watching this bug be able to produce one? Thank you!

@belegdol belegdol closed this Jun 12, 2018

@belegdol belegdol reopened this Jun 12, 2018

Member

cuavas commented Jun 16, 2018

The simplest test case would be to create files like this, and try to compile/link the C++ source files with the header present:

// tmpl.h
template <typename T, T V>
class test_template
{
public:
    virtual ~test_template();
    T value();
};
template <typename T, T V> test_template<T, V>::~test_template() { }
template <typename T, T V> T test_template<T, V>::value() { return V; }
extern template class test_template<int, 1>;
extern template class test_template<int, 2>;
extern template class test_template<long, 1>;
extern template class test_template<long, 2>;
// inst1.cpp
#include "tmpl.h"
template class test_template<int, 1>;
template class test_template<int, 2>;
template class test_template<long, 1>;
template class test_template<long, 2>;
// inst2.cpp
#include "tmpl.h"
template class test_template<int, 1>;
template class test_template<int, 2>;
template class test_template<long, 1>;
template class test_template<long, 2>;
// main.cpp
#include "tmpl.h"
int main(int argc, char const *argv[])
{
    test_template<int, 1> a;
    test_template<int, 2> b;
    test_template<long, 1> c;
    test_template<long, 2> d;
    return 0;
}

Contributor

drencorxeen commented Jun 16, 2018

@cuavas ,
Using your example, it builds fine on 64-bit Ubuntu 18.04 LTS and does in fact fail to build on Raspbian Jessie and Stretch.

Contributor

belegdol commented Jun 16, 2018

Thank you @cuavas! I have relayed this to the Fedora ld maintainer.

Contributor

belegdol commented Jun 19, 2018

We have the final conclusion:

It turns out that the multiple symbol definition problem is an artefact of the default ARM ABI. Specifically the default ABI (AAPCS) says:
3.2.5.4 of the ARM C++ ABI says that class data only has vague linkage if the class has no key function.
Which translates into a requirement for only one typeinfo definition for a given template for the entire program.
Other architectures have a more sane ABI, which allows for multiple definitions, one per compilation unit.
If you use an alternative ARM ABI then you can get the behaviour you desire. For example if you compile the testcase with the "-mabi=apcs-gnu" option then it will compile, assemble and link correctly. Of course the program may not run correctly because the libraries involved have presumably all been compiled with the default ABI.
Anyway, I think that this is as far as we can take this particular issue. It seems that MAME is just too big for the ARM, and the default ARM ABI is too broken to support it. Sorry. :-(
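
For readers unfamiliar with the terms "key function" and "vague linkage", a minimal sketch follows. It uses the generic Itanium C++ ABI definition of a key function (the first non-pure virtual function that is not inline), on which the ARM C++ ABI is based. The class names are hypothetical, and the comments describe typical compiler behaviour rather than anything confirmed in this thread:

```cpp
#include <cassert>
#include <string>

// Has a key function: speak() is the first non-pure virtual function that
// is not inline. The vtable and typeinfo are typically emitted only in the
// translation unit that defines speak().
class with_key
{
public:
    virtual ~with_key() = default;       // inline, so not the key function
    virtual std::string speak() const;   // declared here, defined out of line
};
std::string with_key::speak() const { return "key"; }

// No key function: every virtual function is inline. The class data gets
// vague linkage, so it may be emitted in each translation unit that needs
// it and the linker is expected to keep a single copy.
class without_key
{
public:
    virtual ~without_key() = default;
    virtual std::string speak() const { return "vague"; }
};

// Small helper so both classes are actually used through their vtables.
inline std::string describe(with_key const &a, without_key const &b)
{
    return a.speak() + "/" + b.speak();
}
```

Both classes behave identically at run time; the difference is only in where the compiler places the vtable and typeinfo, which is what the quoted ABI section is about.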

Member

cuavas commented Jun 20, 2018

So how does it work at all for implicit template instantiation? I mean, it shouldn't be doing anything different for explicit vs implicit template instantiation. It really does sound insane for an ABI requirement.

Contributor

drencorxeen commented Jun 24, 2018

Yes, that does sound strange.
Especially with more and more devices coming out that use ARM CPUs. This could potentially be an issue in the future.

Contributor

StefanBruens commented Jun 29, 2018

Strictly speaking, the current code violates the C++ standard, see e.g.
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=608029#10

It is actually trivial to fix: just instantiate each definition once; see PR #3715.
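
As a rough sketch of the "instantiate once" pattern, here is cuavas's `test_template` from earlier in the thread, collapsed into a single file for brevity. This shows the general technique only and is not a description of what PR #3715 actually changes. The helper `sum_of_values` is a hypothetical addition, and the comments mark which parts would live in which file in a real project:

```cpp
#include <cassert>

// --- would live in tmpl.h ---
template <typename T, T V>
class test_template
{
public:
    virtual ~test_template();
    T value();
};
template <typename T, T V> test_template<T, V>::~test_template() { }
template <typename T, T V> T test_template<T, V>::value() { return V; }

// Explicit instantiation declarations: translation units that include the
// header rely on an instantiation provided elsewhere.
extern template class test_template<int, 1>;
extern template class test_template<long, 2>;

// --- would live in exactly ONE source file (e.g. inst1.cpp, not inst2.cpp) ---
// Explicit instantiation definitions: one per specialisation per program.
template class test_template<int, 1>;
template class test_template<long, 2>;

// --- any other source file simply uses the types ---
inline long sum_of_values()
{
    test_template<int, 1> a;
    test_template<long, 2> b;
    return a.value() + b.value();
}
```

With the duplicate definitions from `inst2.cpp` dropped, each specialisation's class data is defined in one object file only, which is what the ARM ABI behaviour described above requires.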

Contributor

belegdol commented Jul 25, 2018

0.200 builds on armv7hl.

@belegdol belegdol closed this Jul 25, 2018
