Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix aliasing bug in GOB engine #3778

Closed
wants to merge 1 commit into from
Closed

Conversation

snaphat
Copy link

@snaphat snaphat commented Apr 2, 2022

This aliasing bug crashes gob3 in retroarch when compiled with -O3 optimization after the intro is shown and the goblin stretches in the first scene of the game.

This is due to gcc treating the local pointers oldNestLevel, oldBreakFrom, and oldCaptureCounter as aliases of the pointers in the _vm structure when -O3 optimization is enabled.

However, it appears to only lead to fault behavior within the libretro_scummvm core. After investigation, I determined the crash is due to these pointers being backed by xmm registers (specifically oldBreakFrom in the build I was using) which get clobbered in some cases due to the aliasing optimizations.

More specifically, within Retroarch one of the xmm registers that is used as backing for this variable get clobbered when the slowmotion_ratio is set on this line of code. This is then treated as the 'original' pointer address which ultimately leads to a deference of an invalid memory address later.

For whatever reason, these xmm registers don't appear to get clobbered in standalone scummvm. I speculate that it may only result in a fault when LIBCO is utilized for threading -- which is the case with the libretro core. I did disassemble the standalone version of scummvm as well, and it does use the xmm registers as backing; however, they don't end up clobbered.

The issue should be replicable using gob3 with:

  • Stock x86_64 build of retroarch and libretro_scummvm.
  • mingw-w64 compiled version of the libretro_scummvm core using -O3 optimization.
  • mingw-w64 compiled version of this alternative wrapper here.

Note, I did not test a Linux build or other architectures.

This bug crashes gob3 in retroarch when compiled with -O3 optimization.
@lephilousophe
Copy link
Member

This doesn't look like an aliasing problem.
The bug you describe is that oldBreakFrom is stored in some xmm register (which one btw, because it matters on Windows) and some code outside of scummvm erases this register.
It's either a bug in the compiler or in retroarch code.
Adding the volatile keyword doesn't seem a proper fix for this but more like some workaround.

Aliasing bugs occur when accessing memory through 2 different pointers. This is not the case here: oldBreakFrom is written at the start of the function and read at the end. No dereferencing is done with it.

@snaphat
Copy link
Author

snaphat commented Apr 3, 2022

I appreciate the input. I investigated further, and you are correct, it is indeed not an aliasing issue. The actual issue is a bug in libco that I've documented here.

I initially thought it was an aliasing issue was because both playTot and runloop_iterate were properly spilling and restoring the callee-saved xmm registers as per the Microsoft ABI. I had observed both xmm6 and xmm8 corruption causing the issue, but both were properly spilled/restored in the aforementioned functions. It turned out there was another entry/exit point in the libco library that performs the actual swapping between retroarch and scummvm code. This contains faulty handcoded 'machine code' for restoring all callee-saved xmm registers. I should have suspected this given that the issue didn't occur outside of a libco build.

In any case, fixing this libco issue will squash many game compatibility issues with the scummvm core on Windows. This PR can be closed. Thanks for your help!

@lephilousophe
Copy link
Member

You are welcome.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants