dynamic_cast triggering DSI exception with -use-dynld #77

Closed
capehill opened this issue Oct 27, 2019 · 67 comments

Comments

@capehill

capehill commented Oct 27, 2019

There was a report ( http://www.amigans.net/modules/xforum/viewtopic.php?post_id=116034#forumpost116034 ) that dynamic_cast crashes with DSI when using shared objects. I was trying to create a simpler example to reproduce the crash but got a different symptom.

The example crashes only when two conditions hold: linking with -use-dynld and using virtual inheritance. Linking statically or removing the "virtual" keyword seems to make the DSI exception go away.

// g++ dyncast.cpp -use-dynld -athread=native -gstabs

class Base
{
public:
    virtual int f() = 0;
};

class Derived : virtual public Base
{
public:
    int f() { return 3; }
};

int main()
{
    Base* b = new Derived();

    Derived* d = dynamic_cast<Derived*>(b);

    return 0;
}

GCC 10.2.0.

EDIT: simplified example more and added -gstabs. Added binary archive too.
dyncast_crash.zip

EDIT2: added DSI into title.

EDIT3: removed ISI from title, updated GCC version from 8 to 10.

dyncast_new_serial_log.txt

@afxgroup

This looks like the classic C++ problem that occurs when a class with virtual functions doesn't define a virtual destructor, which causes problems when the object is destroyed.
But maybe I'm wrong.
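
A minimal sketch of the rule mentioned here, using only standard C++. The names Interface, Impl, impl_destroyed and run_example are made up for illustration and do not come from the report. Deleting an object through a base pointer is only well-defined when the base class destructor is virtual:

```cpp
#include <cassert>

int impl_destroyed = 0;  // counts ~Impl() calls, for illustration only

class Interface
{
public:
    virtual ~Interface() {}   // virtual: makes "delete" through Interface* safe
    virtual int f() = 0;
};

class Impl : virtual public Interface
{
public:
    ~Impl() { ++impl_destroyed; }
    int f() { return 3; }
};

// Creates an Impl, casts down from the virtual base, and destroys it again.
int run_example()
{
    Interface* b = new Impl();

    // A virtual base can only be cast down with dynamic_cast.
    Impl* d = dynamic_cast<Impl*>(b);
    assert(d != nullptr);

    const int value = d->f();
    delete b;                 // runs ~Impl() first, then ~Interface()
    return value;
}
```

Note that the example in the first post never deletes the object, so this rule alone would not explain a crash inside dynamic_cast.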

@raziel-

raziel- commented Oct 28, 2019

@afxgroup

Any quick fix/workaround available?

@sba1
Owner

sba1 commented Oct 28, 2019 via email

@raziel-

raziel- commented Oct 28, 2019

That isn't possible, I'm afraid.
I can't build the app statically anymore because of its size.
I'm hitting the ram barrier (2 GB installed, only approx. 1.5 GB usable), will run out of memory and ultimately crash on linking.

Also, if there would be an up to date libstdc++.so, it wouldn't be a hassle to provide it, since I'm using a local sobjs dir with all mandatory (shared) libraries together with the app.

Is the problem known and maybe already fixed?

@sba1
Owner

sba1 commented Oct 28, 2019 via email

@capehill
Author

@sba1 For me, the purpose of using dynamic linkage would be the size of the binary and sometimes also the licenses. I'm also wondering whether it would be easy to support the "gold" linker, and whether that would help with Raziel's scenario: https://en.wikipedia.org/wiki/Gold_(linker)

I created a local "sobjs" directory and copied libstdc++.so and libgcc.so there. According to GrimReaper, libstdc++.so was used from the local directory. I made another test where I assigned SOBJS: to RAM disk, and copied libc.so, libstdc++.so and libgcc.so (from GCC:lib) to RAM: and it still crashes.

@sba1
Owner

sba1 commented Oct 28, 2019 via email

@raziel-

raziel- commented Oct 28, 2019

That isn't possible, I'm afraid. I can't build the app statically anymore because of its size. I'm hitting the ram barrier (2 GB installed, only approx. 1.5 GB usable), will run out of memory and ultimately crash on linking.

Linking statically or shared against libstdc++.a should not have such a dramatic effect (unless you use lto).

I'm afraid it is... see, the shared objects that get linked into the executable aren't the problem; they are small enough to be linked.
The problem is the engines the app uses to provide game support.
Those engines keep growing, and the total size increases with every new engine that is added.

If linking against .so really helps you out of this problem it would hit the same limit sooner than later anyway.

Actually no.
Since with static, everything is crammed into the main app, which grows and grows and now exceeds my available ram on linking.

The solution is to build every engine as a "plugin", which is nothing but a shared "library" that is loaded only if the main app requests a game from that engine.
Those engines are stored in their own subdir and as such don't add to the size of the main app.
It's effectively the only way to build scummvm natively now. Static isn't possible anymore, maybe with a few tricks, but only for another few months; as soon as the next engine is added, it will be over.

Also, if there would be an up to date libstdc++.so, it wouldn't be a hassle to provide it, since I'm using a local sobjs dir with all mandatory (shared) libraries together with the app. Is the problem known and maybe already fixed?

Linking against libstdc++.a in a shared fashion is not recommended because of the unclear ABI (-athread=native is nothing official).

Hmm, maybe I misunderstand, but i don't link shared to the static libstdc++.a.
That wouldn't work anyway, let alone all the linker errors I would get, if I'd try to do so.

If you supply your own .so objects you can do it as you like of course, but there won't be any libstdc++.so from my side for end users because the new ones are incompatible with the old ones and this will create a mess.

Again, maybe I misunderstand, so don't get me wrong, but I don't need a libstdc++.so... I'd need a fix for the obvious and reproducible crash inside libstdc++.

And one more thing I don't really understand...if a program links with libstdc++.so and no end user version will ever be available how would such a program work for end users in any way without providing the shared object?

As long as there is no official channel for this, and no official solution for different flavours of shared objects, you should not use it in a shared fashion. If the purpose of using libstdc++.so is just a workaround for some other problem, it is much better to address the other problem. Regarding the problem, have you checked that the proper "libstdc++.so" is used (and not the one that sits in SOBJS:)?

Yes, I renamed the old libstdc++.so and only use gcc8.3.0's libstdc++.so everywhere, so it is picked up.

Overall your answer is sadly very discouraging.
I know that resources are scarce and the overall outlook is grim, but I had hoped for something to look forward to.
Unfortunately that will mean 2.1.0 will be the last release of scummvm for amigaos4...

@sba1
Owner

sba1 commented Oct 29, 2019 via email

@raziel-

raziel- commented Oct 29, 2019

Regarding the reported problem it is simply not clear where the problem is. It could be

  1. Mixup of shared objects (clib2 vs newlib2)
  2. Usage of clib2 shared objects (I'm not sure how well this works with a thread model as clib2 is not threadsafe)
  3. Bug in binutils
  4. Bug in elf.library
  5. Some other weird thing.

I'm not yet convinced that 1) and 2) are unlikely causes. 3), 4), and 5) are much harder to research, and 4) in particular would make it nearly impossible to see a fix even in the distant future. I also think that we can rule out a bug in libstdc++ because it works with static linking.

1+2) I'm not using clib2.
If libstdc++.so provided with gcc8.3.0 is newlib, then I'm using that.
I've never used clib2 (people told me to avoid that, so I did).
Of course, I could try to rename sdk:local/clib2/lib and see if it is picked up while linking, but I doubt it, since at no point I'm using clib2...

Yes, I renamed the old libstdc++.so and only use gcc8.3.0's libstdc++.so everywhere, so it is picked up.

Which ones (newlib vs clib2)? Does this also hold for all the other involved shared objects?

Newlib only

Yes, I wrote a script that extracts the names of all needed sobjs from the main app (readelf -d) and copies them from sdk:local/newlib/lib to the app's own sobjs drawer.
So, everything that was used while compiling is also used while running.
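
As a rough illustration of the extraction step described here (not the actual script, which is not shown in this comment), the following C++ sketch pulls the library names out of `readelf -d` output. The function name needed_libraries is hypothetical, and the sketch assumes the "(NEEDED) Shared library: [name]" line format that readelf prints:

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Collect the names listed in "(NEEDED)" entries of `readelf -d` output.
// Assumes each such line carries the library name in square brackets.
std::vector<std::string> needed_libraries(const std::string& readelf_output)
{
    std::vector<std::string> names;
    std::istringstream lines(readelf_output);
    std::string line;

    while (std::getline(lines, line)) {
        if (line.find("(NEEDED)") == std::string::npos)
            continue;                       // not a dependency entry

        const std::string::size_type open = line.find('[');
        if (open == std::string::npos)
            continue;                       // unexpected format, skip

        const std::string::size_type close = line.find(']', open + 1);
        if (close == std::string::npos)
            continue;                       // unexpected format, skip

        names.push_back(line.substr(open + 1, close - open - 1));
    }
    return names;
}
```

Each returned name would then be copied from the directory used at link time into the application's local sobjs drawer, so that the same files are used when building and when running.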

@sba1
Owner

sba1 commented Oct 29, 2019 via email

@raziel-

raziel- commented Oct 29, 2019

Yes, I wrote a script that extracts the names of all needed sobjs from the main app (readelf -d) and copies them from sdk:local/newlib/lib to the app's own sobjs drawer. So, everything that was used while compiling is also used while running.

Just for the record: You tried the same minimal example as given by capehill and observe the crash with all the correct .so files? You don't observe the crash if you use some other functions, like std::cout etc.?

No, not yet.
I'm at work with no access to my system, I will try and report back once I can.

All the information given is with scummvm.

You can still try to link it via a cross compiler, disable a few of the seldom-used engines, or mix static and shared linking. For instance, you could try to link libstdc++ statically into the main program and still keep these plugins as .so. However, I don't know anything about the structure of scummvm.

I don't have a cross compiler set up, I don't even have a suitable system to do so and honestly, I would like to avoid using a foreign system to build amigaos4 binaries.

The second idea is interesting, though.
If that would work, it would indeed be a way to keep scummvm available, without the hassle of shared libstdc++.so (and libc.so, for that matter) bugs.
Plus the main binary wouldn't really grow in size, as only the mandatory shared libraries would be built in as static again.

The problem is, that i already tried building scummvm that way and it errored on linking, due to the mix-up of shared and static.
Then again, i was trying to build it with -fPIC, -use-dynld and -shared in place.
If I understand correctly, then -use-dynld is used to create a binary with (a) linked shared object(s).
So, if i leave -use-dynld out, the binary should be linked with static libraries, while the engines are kept and used as shared objects, shouldn't they?
I hope that works, will try that.

@raziel-

raziel- commented Oct 29, 2019

Thinking some more about it...would it also work if I simply remove/move libstdc++.so, so it isn't found by the linker (sdk: and sobjs:) and only provide libstdc++.a in the search path?

Would that work and would libstdc++ then be linked statically?
That way I could still link everything else shared, but could circumvent the libstdc++.so bug?!

@sba1
Owner

sba1 commented Oct 30, 2019 via email

@afxgroup

However i've compiled and tested on Sam460 with all latest beta updates and that example doesn't crash. With or without virtual. I've compiled it with ubuntu cross compiler 8.3.0

@raziel-

raziel- commented Oct 30, 2019

There should be a special option: -static-libstdc++ which should force the static linking of the library.

Oh?
Well, that would be ideal, then.
Is that a linker flag, or where would I put it?

Thank you

@afxgroup

I've tried that option but it doesn't seem to work. The shared object is always linked.

@raziel-

raziel- commented Oct 31, 2019

I've tried that option but it doesn't seem to work. The shared object is always linked.

:-(

@raziel-

raziel- commented Nov 1, 2019

@capehill
@sba1

I don't get a crash with the minimal example from post #1.

i do get an output in the shell which reads
3 4
but there is no crash for me.

readelf tells me this

Dynamic section at offset 0x40cc contains 18 entries:
Tag        Type                 Name/Value
0x00000001 (NEEDED)             Shared library: [libstdc++.so]
0x00000001 (NEEDED)             Shared library: [libgcc.so]
0x00000001 (NEEDED)             Shared library: [libc.so]
0x00000004 (HASH)               0x1000108
0x00000005 (STRTAB)             0x10004d8
0x00000006 (SYMTAB)             0x1000248
0x0000000a (STRSZ)              763 (bytes)
0x0000000b (SYMENT)             16 (bytes)
0x00000015 (DEBUG)              0x0
0x00000003 (PLTGOT)             0x1003000
0x00000002 (PLTRELSZ)           72 (bytes)
0x00000014 (PLTREL)             RELA
0x00000017 (JMPREL)             0x1000804
0x00000007 (RELA)               0x10007d4
0x00000008 (RELASZ)             120 (bytes)
0x00000009 (RELAENT)            12 (bytes)
0x6000000e (AMIGAOS_DYNVERSION) 0x2
0x00000000 (NULL)               0x0

and the example was linked with

version SOBJS:libc.so file full
newlib.library 53.30 (12.02.2014)

libgcc.so 894549 ----rw-d 23-Oct-19 21:47:14
libstdc++.so 11145432 ----rwed 08-Mar-19 11:13:20
while libstdc++.so is picked up from
SDK:gcc/lib/libstdc++.so
(all of them are the same as in SOBJS:)

@capehill
Author

capehill commented Nov 1, 2019

@raziel- @afxgroup I would like to test your binaries. I attached mine in the first post.

@raziel-

raziel- commented Nov 1, 2019

@capehill

Well, the "simplified" one doesn't link for me, I get
g++ dyncast.gcc -use-dynld -athread=native -gstabs
Development:Coding/SDK/gcc/ppc-amigaos/bin/ld:dyncast.gcc: file format not recognized; treating as linker script
Development:Coding/SDK/gcc/ppc-amigaos/bin/ld:dyncast.gcc:1: syntax error

The "normal" one is attached

@raziel-

raziel- commented Nov 1, 2019

dyncast_normal.lha.zip
It's a .lha, really

@capehill
Author

capehill commented Nov 1, 2019

Well, the "simplified" one doesn't link for me, I get
g++ dyncast.gcc -use-dynld -athread=native -gstabs
Development:Coding/SDK/gcc/ppc-amigaos/bin/ld:dyncast.gcc: file format not recognized; treating

It should be .cpp. Not sure what is going on there. Please note that I had to zip my lha (Github..) but when I download my attachment, it's .cpp file for sure.

@raziel-

raziel- commented Nov 1, 2019

@capehill

Yep, i was c&ping your code, seems there is a stray char somewhere.

Downloaded and i get a crash
Crashlog_a.out_2019-11-01_20-31-41.txt
Debug_Serial.txt
dyncast_simplified.lha.zip

@raziel-

raziel- commented Nov 2, 2019

@sba1
@afxgroup

Should I report another bug about the non-working -static-libc, -static-libgcc and -static-libstdc++ switches?

@raziel-

raziel- commented Nov 6, 2019

@capehill
@sba1
@afxgroup

So, i "fixed" it, if you can call that fixing.

Short:

The reason of the crash is not some error in libstdc++ (at least, i don't think so anymore), but because of a "faulty/foreign/whatever" directory structure on my end.
I use SFS2 on all my partitions.

Long:

I've lost too much hair and nerves over that one, so bear with me as you all should share some of the misery i was going through ;-)

The luck i was granted was the fact that i still had ONE scummvm source dir which produced a working binary (no crash, no oddities, all of the games running and the best thing...it was a shared build!!!)
Now, since i absolutely resent Illogic, i tried to find out the reason why one of my source dirs worked and the others didn't.

First step:
I moved the plugins directories around, same with the sobjs dir, i moved the binaries and data directories around, all to no avail.
One scummvm binary worked, the other didn't.

Second step:
I tried to compare the two directories, well, one directory (the good) was with sources from October 23, the other (the bad) from today.
So, i thought it might have been something that was added to the code.
I backed up my good directory, pulled the current sources to it and let it build...
It worked...and gone were a few more hairs (I still have some, do not fear).
Illogical, i tell you!
Another directory comparison brought nothing new, just a few github packs that were missing, but that wasn't the reason.

Third step:
Sooo, then i remembered something. Also something very weird.
Some months before, I did betatesting for Olaf's (Barthel) smbfs and there I encountered an oddity where I wasn't able to access an smbfs share from within ScummVM's launcher (e.g. save files on my NAS).
ScummVM was erroring out and complaining about something, i think it was complaining about the directory not being readable or writable or something. (Olaf fixed it inside smbfs, but i guess it wasn't smbfs' fault in the first place)

I also remembered that I had to move the good directory across partitions when I wanted to copy it the other day (WB copy complained about a locked file and refused to "copy clone").

I did a test by "copy clone"ing the good directory and IT CRASHED...What the heck???
That's even more illogical since the original directory sits beside the copied one and IT STILL WORKS!!!

Next i copied the good directory across partitions and back to where it was (renaming it and deleting all of the other [bad] directories).
Guess what happened?...IT WORKED!!!
What the heck, again!
Why would it do that, it's so illogical!!!

Last step:

Having cornered the problem to being something wrong with the directory structure, i now blame ScummVM's filesystem access.
I don't know why and i don't know how, but for some reason ScummVM's internal filesystem handling does not like any directory structure that has been created with (tested) copy (directory creating), makedir and gnu's mkdir.
What works, and these are actually the ONLY two things I have found working so far, are ASyncWB's directory creation (I guess ASyncWB does the directory creation when stuff is copied over to another partition) and the WB PD menu's create dir (which might also be ASyncWB).

I didn't test any more possibilities, but i guess that already gives an idea on what might be going wrong.
The mkdir crash especially, i witnessed a few days ago when i completely set up the scummvm install directory from scratch and let the amigaos.mk script (from the sources) do all the installing.
It will crash even a working static build (when I copy the binary inside the evil directory structure), but of course somewhere else in the code because it now can't access its theme files (which sit in their own subdirectory).

I do have a request now and i know this is asking very much, given your time constraints and dedication to other projects.
Would any of you be so kind and browse over the code that handle filesystems, check for errors and point them out?

It's here and it's hopefully not very big.
https://github.com/scummvm/scummvm/blob/master/backends/fs/amigaos4/amigaos4-fs.cpp#L1

I will do any changes locally and test myself of course.

Pretty please?
Thank you very much and sorry for falsely claiming a bug in libstdc++.so :-(

@afxgroup

afxgroup commented Nov 7, 2019

I had the same problem in many of my ports, and it happens with the filesystem. Most of the time the culprit was JXFS, although in my opinion it is a mix of file system and newlib.

@raziel-

raziel- commented Nov 7, 2019

@afxgroup

sigh :-(
So not really a fix available...

Did you find a workaround for your ports (apart from my hack to manually create the directory structures)?
I still get that dynamic cast crash on one game/engine so far (haven't tested all of them yet).
Now that I know what the culprit is, I will (hopefully) track down where it breaks, but it's really intimidating to know that this kind of crash could creep up on me every now and again.

Are those problems still present with the beta NGFS (you are a betatester, I assume)?

As much as i like to have and work with these shared builds (now that they work), i feel like i'm constantly treading on thin ice on a warm spring day...and it already crackles all around me.

Thank you for the feedback

@afxgroup

afxgroup commented Nov 8, 2019

No. No solution except to test everything first on FFS (or at least SFS). JXFS is one of the fastest file systems we have, but it is buggy and no longer updated by Joerg.

@raziel-

raziel- commented Nov 10, 2019

@capehill

Confirmed crashing in ram: when using a directory structure created with mkdir

@capehill
Author

So the crash still points to dynamic_cast, right? Of all the dynamic_casts in ScummVM, the majority seem to be in the "titanic" engine according to GitHub search.

@raziel-

raziel- commented Nov 10, 2019

Yep, it still does

@raziel-

raziel- commented Apr 10, 2021

@capehill
@sba1
@afxgroup

Update:
The crash has changed.

I don't get it in dynamic_cast anymore and i also don't get DAR zero accesses (updated newlib fixed that)
I temporarily removed the workaround for it in amigaos-fs.cpp and before and after the change the crash entry is the same now (see below).

Progress?

This is where i get the crash now, everytime
btw, normally on start i'd get a line telling me "WARNING: Could not parse GLSL version 'Huh?'!", but i never reach that point.

Anyone can give me a new hint, maybe?

btw2: The workaround/hack with the "sane" directory structure doesn't work anymore either.

Crash log for task "scummvm"
Generated by GrimReaper 53.19
Crash occured in module scummvm at address 0x7E960D10
Type of crash: DSI (Data Storage Interrupt) exception
Alert number: 0x80000003

Register dump:
GPR (General Purpose Registers):
0: D0B26E6C 494C19E0 0000003C 4CEE9558 00000D33 4CEE9560 00002880 00000028
8: FFFFFFFF 00000000 00000000 00000004 00000000 4F49189C 5F661FA0 00000001
16: 54FB8470 DFFF4240 53DC59D0 00000000 00000000 00000000 00000000 00000000
24: 00000000 00000000 00000000 494C1A50 494C19E8 4CEE9558 4F487AFC 494C1A30

FPR (Floating Point Registers, NaN = Not a Number):
0: nan 1 1 1
4: 1 4.5036e+15 310 140
8: 170 0.04 1 4.5036e+15
12: 16 4.5036e+15 0 0
16: 0 0 0 0
20: 0 0 0 0
24: 0 0 0 0
28: 0 0 0 0

FPSCR (Floating Point Status and Control Register): 0x82004000

SPRs (Special Purpose Registers):
Machine State (msr) : 0x0200B030
Condition (cr) : 0x5F600000
Instruction Pointer (ip) : 0x7E960D10
Xtended Exception (xer) : 0x510F2134
Count (ctr) : 0x00570001
Link (lr) : 0x00000000
DSI Status (dsisr) : 0x85027002
Data Address (dar) : 0x5F5FFA44

680x0 emulated registers:
DATA: 95C19300 00000000 00000000 00000000 00000000 00000000 00000000 00000000
ADDR: 6FFA4000 95C58200 00000000 00000000 00000000 00000000 62293CAE 494C1330
FPU0: 0 0 0 0
FPU4: 0 0 0 0

Symbol info:
Instruction pointer 0x7E960D10 belongs to module "scummvm" (PowerPC)
Symbol: _ZN6OpenGL9ContextGL10initializeENS_14ContextOGLTypeE + 0x8C in section 12 offset 0x00274978

Stack trace:
[graphics/opengl/context.cpp:73] scummvm:_ZN6OpenGL9ContextGL10initializeENS_14ContextOGLTypeE()+0x8c (section 12 @ 0x274978)
[graphics/opengl/context.cpp:59] scummvm:_ZN6OpenGL9ContextGL10initializeENS_14ContextOGLTypeE()+0x78 (section 12 @ 0x274964)
[backends/platform/sdl/sdl.cpp:237] scummvm:_ZN11OSystem_SDL11initBackendEv()+0x9c (section 12 @ 0x7DA8)
[base/main.cpp:475] scummvm:scummvm_main()+0x7a4 (section 12 @ 0xAE50)
[backends/platform/sdl/amigaos/amigaos-main.cpp:75] scummvm:main()+0x1cc (section 12 @ 0x93AC)
native kernel module newlib.library.kmod+0x000025fc
native kernel module newlib.library.kmod+0x00003328
native kernel module newlib.library.kmod+0x0000384c
scummvm:_start()+0x1e0 (section 12 @ 0x328C)
native kernel module dos.library.kmod+0x0002a458
native kernel module kernel+0x00059e04
native kernel module kernel+0x00059e7c

PPC disassembly:
7e960d08: 93fd0004 stw r31,4(r29)
7e960d0c: 3be10050 addi r31,r1,80
*7e960d10: 807a0000 lwz r3,0(r26)
7e960d14: 812300f8 lwz r9,248(r3)
7e960d18: 7d2903a6 mtctr r9

System information:

CPU
Model: P.A. Semi PWRficient PA6T-1682M VB1
CPU speed: 1800 MHz
FSB speed: 900 MHz
Extensions: altivec

Machine
Machine name: AmigaOne X1000
Memory: 2097152 KB
Extensions: bus.pci bus.pcie

@capehill
Author

capehill commented Apr 10, 2021

@raziel- that is a zero page hit, please check disassembly + registers. I think DAR display is wrong (IIRC there is some difference depending whether you look at the Grim Reaper display vs. serial line?). R26 is zero anyway.
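
To make the register check concrete, here is a small sketch that decodes the faulting instruction word from the crash log (0x807A0000) as a PowerPC D-form instruction. DForm, decode_dform and effective_address are names invented for this illustration; the field layout (6-bit primary opcode, two 5-bit register fields, signed 16-bit displacement) is the standard D-form encoding, and primary opcode 32 is lwz:

```cpp
#include <cassert>
#include <cstdint>

// Fields of a PowerPC D-form instruction such as "lwz rT,d(rA)".
struct DForm
{
    unsigned opcode;     // primary opcode (top 6 bits); 32 means lwz
    unsigned rt;         // target register number
    unsigned ra;         // base register number
    std::int16_t disp;   // signed 16-bit displacement
};

// Split a 32-bit instruction word into its D-form fields.
DForm decode_dform(std::uint32_t word)
{
    DForm f;
    f.opcode = word >> 26;
    f.rt = (word >> 21) & 0x1F;
    f.ra = (word >> 16) & 0x1F;
    f.disp = static_cast<std::int16_t>(word & 0xFFFF);
    return f;
}

// Address accessed by a D-form load. For loads, rA == 0 means the
// literal value 0 rather than the contents of r0.
std::uint32_t effective_address(const DForm& f, const std::uint32_t gpr[32])
{
    const std::uint32_t base = (f.ra == 0) ? 0u : gpr[f.ra];
    return base + static_cast<std::uint32_t>(static_cast<std::int32_t>(f.disp));
}
```

Decoding 0x807A0000 this way gives lwz r3,0(r26). With r26 = 0 in the register dump, the load address is 0, which matches the zero page hit described here.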

So I fetched the latest ScummVM and built it (gmake clean; gmake) with GCC 10.2. With -use-dynld I get a dynamic_cast-related crash, while the binary loads fine with static linkage. I copied libstdc++.so and libgcc.so to a local SOBJS/ directory.

I get the crash when loading the ScummVM launcher, no games needed.

Here is what I have on serial now:

scummvm_dyncast_crash_serial_log.txt

@raziel-

raziel- commented Apr 10, 2021

Ok...standing by.

Thank you

@afxgroup

i would try to compile it with my clib2 and SDL2 (i've successfully compiled) but as i told you i have a strange error on ScummVM compilation and so i can't check what is wrong

capehill changed the title from "dynamic_cast triggering DSI (ISI) exception with -use-dynld" to "dynamic_cast triggering DSI exception with -use-dynld" on Apr 11, 2021
@capehill
Author

@raziel- https://www.amigans.net/modules/xforum/viewtopic.php?post_id=121395#forumpost121395 Okay so I'm not building with "plugins", I have only enabled -use-dynld in ScummVM makefile to trigger the dynamic_cast-related DSI. I can try to build the plugins versions too.

@raziel-

raziel- commented Apr 11, 2021

@capehill

Just to rule out every possibility, yes that would be nice

Btw: if you use the inst_ext_so.rexx script from dists/amigaos you don't need to go hunting for the sobjs.

Inst_ext_so.rexx your-shared-exe full-path-to-your-shared-exe

and it will gather and copy all needed sobjs to a subdir called sobjs where your binary is.

@raziel-

raziel- commented Apr 11, 2021

@afxgroup

I don't have your emails anymore, please could you post your error here?
Maybe capehill has an idea on why it's not working?

@capehill
Author

@capehill

Just to rule out every possibility, yes that would be nice

Btw: if you use the inst_ext_so.rexx script from dists/amigaos you don't need to go hunting for the sobjs.

Inst_ext_so.rexx your-shared-exe full-path-to-your-shared-exe

and it will gather and copy all needed sobjs to a subdir called sobjs where your binary is.

I got your crash reproduced. I don't understand it completely, but it seems to me that the symbol called "mini_CurrentContext" (exported by SDL2) is not visible or accessible to ScummVM when the plugin build is used. I tried to print the pointer but it just crashed(!). The idea is that minigl.library sets this pointer and the application then implicitly uses it when making those gl* calls.

So I added a hack on ScummVM side, like:

extern "C" {
void* mini_CurrentContext; // HACK
}

And then I was able to print the pointer and ScummVM continued until the DSI caused by dynamic_cast ;)

So regarding this OpenGL-related issue, we need to understand what the difference is between the plugin build and the normal build, but we should probably discuss it somewhere else and leave this thread for dynamic_cast-related puzzles.

@raziel-

raziel- commented Apr 11, 2021

Ok,

At least you found an inconsistency, if not a bug, so that's a good thing.
Maybe use the forums for my further ramblings?

@afxgroup

I had the same problem with SDL and mini_CurrentContext on clib2. I had to add -lGL to get it, but that sometimes causes duplicate function definitions because the functions are defined in both SDL and GL.

@raziel-

raziel- commented Apr 11, 2021

But wouldn't that mean that SDL is simply missing a definition GL has?
Would it fix, at least, that specific crash by defining it in SDL?

Bear with me if i ask stupid questions...

@afxgroup

SDL doesn't miss a definition. It seems to miss the implementation.

Take a look at SDL code

/* The client program needs access to this context pointer
 * to be able to make GL calls. This presents no problems when
 * it is statically linked against libSDL, but when linked
 * against a shared library version this will require some
 * trickery.
 */
DECLSPEC struct GLContextIFace *mini_CurrentContext = 0;

This means that it is defined there but never assigned, so you would have to add the code that assigns it to your program. However, I don't know why it is defined in SDL but never assigned. Maybe @capehill knows the SDL code better than us.

@capehill
Author

SDL doesn't miss a definition. It seems to miss the implementation.

Take a look at SDL code

/* The client program needs access to this context pointer
 * to be able to make GL calls. This presents no problems when
 * it is statically linked against libSDL, but when linked
 * against a shared library version this will require some
 * trickery.
 */
DECLSPEC struct GLContextIFace *mini_CurrentContext = 0;

This means that it is defined there but never assigned, so you would have to add the code that assigns it to your program. However, I don't know why it is defined in SDL but never assigned. Maybe @capehill knows the SDL code better than us.

MiniGL sets the pointer during MakeCurrent. This is an old mechanism (SDL1 has it) and not invented by me so I don't know the detailed history.

minigl.h:

#define CC mini_CurrentContext

MGLAPI void mglMakeCurrent(void *context)
{
    CC = (struct GLContextIFace *)context;
}

It works okay in normal scenarios with libSDL2.a and libSDL2.so. However, now there is something new which triggers issues in the ScummVM plugin build. But this is off-topic until someone points out a connection to C++ dynamic_cast.
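
For readers unfamiliar with the mechanism, here is a simplified sketch of the "global current context pointer" pattern described above. It is not MiniGL or SDL code: ContextIFace, current_context, make_current and get_error are hypothetical stand-ins, and the real interface structure is far larger.

```cpp
#include <cassert>

// Hypothetical stand-in for the real interface structure: a table of
// function pointers that every GL-style call is routed through.
struct ContextIFace
{
    int (*GetError)(ContextIFace* self);
};

// One global pointer that the library and the client program must share.
// If each side ends up with its own copy, the client sees a null pointer.
ContextIFace* current_context = nullptr;

// Counterpart of a "make current" call: records the active context.
void make_current(ContextIFace* context)
{
    current_context = context;
}

// A GL-style wrapper that implicitly uses the global pointer.
// Calling it before make_current() would dereference a null pointer.
int get_error()
{
    assert(current_context != nullptr);
    return current_context->GetError(current_context);
}
```

The pattern only works if the pointer set by the library and the pointer read by the wrappers are the same object, which is what the comment in the SDL source hints at for shared builds.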

@afxgroup

Yes, indeed: MiniGL, not SDL. That's why you get undefined references if you don't link the exe with MiniGL. But libGL.a is a static library and may cause problems with shared objects if linked against it.

@lephilousophe
Contributor

lephilousophe commented May 15, 2021

Hello there,

I spent the last weeks trying to debug this with @raziel- for our new cross-compilation toolchain.
Thanks to the minimalist example posted in this issue, I managed to find the problem and I have a fix for it.

Long story short, the ELF dynamic linker (elf.library) of AmigaOS doesn't seem to handle R_PPC_ADDR32 for dynamic symbols.
The patch submitted in referenced PR #101 forces the linker to emit a R_PPC_COPY which fixes relocation and avoids crash.

For the problem specifically described here, one needs to be quite fluent with GCC's handling of RTTI (more information is in the Itanium C++ ABI, which GCC follows).

The crash comes from dynamic_cast when it tries to execute whole_type->__do_dyncast.
whole_type is provided at compilation phase when dynamic_cast call is handled by compiler.
In the example above, whole_type is the RTTI class for Base class.
This object (named Base::typeinfo) is created by the compiler in our executable and is an instance of class abi::__class_type_info. It is a structure composed of two fields: the vtable address and the type name. The vtable address should point at the array of virtual functions, and the type name points to the constant Base::typeinfo-name.
In our case, Base::typeinfo doesn't override any function provided by abi::__class_type_info. The two vtables are identical, so the compiler makes the Base::typeinfo vtable address point to abi::__class_type_info::vtable. This one resides in libstdc++.so.
A relocation of type R_PPC_ADDR32 is emitted for this. When loading our executable, the dynamic linker is then supposed to put the address of abi::__class_type_info::vtable in Base::typeinfo, but AmigaOS fails to do so.
Forcing the linker to use an R_PPC_COPY relocation ensures the vtable is properly populated and dynamic_cast works.
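
The RTTI classes mentioned above can be observed from ordinary code. The sketch below assumes GCC with libstdc++, whose <cxxabi.h> exposes abi::__class_type_info and abi::__vmi_class_type_info; other toolchains may not make these classes visible. The helper rtti_is is a made-up name, while Base and Derived mirror the example from the first post:

```cpp
#include <cassert>
#include <cxxabi.h>
#include <typeinfo>

class Base
{
public:
    virtual int f() = 0;
};

class Derived : virtual public Base
{
public:
    int f() { return 3; }
};

// True if the type_info object the compiler emitted for T is exactly
// an instance of the ABI class Info. std::type_info is polymorphic, so
// typeid(typeid(T)) reads its dynamic type through its vtable pointer,
// the same field that the relocation discussed above has to fill in.
template <typename Info, typename T>
bool rtti_is()
{
    return typeid(typeid(T)) == typeid(Info);
}
```

On such a toolchain, rtti_is<abi::__class_type_info, Base>() is expected to be true because Base has no base classes, while Derived, which has a virtual base, is described by abi::__vmi_class_type_info.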

@raziel-
Copy link

raziel- commented May 15, 2021

/worship @lephilousophe

@raziel-

raziel- commented May 25, 2021

I can confirm that the patch fixes DSI crash in dynamic_cast with native builds.

@capehill
Author

@raziel- @lephilousophe

I recompiled simple example using https://github.com/sodero/adtools/releases/tag/10.3.0_2 and it worked without crash. Closing this ticket. Thank you.

@sba1

Do you think this issue should be checked further by elf.library maintainers?

@sba1
Owner

sba1 commented May 26, 2021

Do you think this issue should be checked further by elf.library maintainers?

Yes, it should be checked and possibly fixed there.

@lephilousophe
Contributor

@capehill, I don't understand. The sodero tag doesn't include my patch.
So if it works it's not due to my PR.

@raziel-

raziel- commented May 26, 2021

Funny, regarding this
https://github.com/sodero/adtools/tree/10.3.0_2/binutils/2.23.2/patches
it actually isn't :-)

I built my local shared scummvm based on this release, I think, and it works as well... intriguing.

@sodero
Contributor

sodero commented May 26, 2021

@raziel- and @lephilousophe

I just did a quick one-off build and shared the binary on GitHub. The patch is included in the binary in other words.

Instead of committing I created this PR. Much better than having two repos.

@raziel-

raziel- commented May 26, 2021

@sodero

Ah, that explains it.

Thank you

@lephilousophe
Contributor

@sodero thanks for the explanation.
