-
Notifications
You must be signed in to change notification settings - Fork 375
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Sync dh fpu state with normal fpu #3965
Conversation
When the dynamic core uses normal core to run instructions, we need to: 1. Copy the dh FPU state to normal FPU before entering the normal core. 2. After returning from normal core, copy the normal FPU state to dh FPU. Known affected programs (when using "core = dynamic"): - DOS Nagivator when repeating running bad commands on the command prompt. - Turbo Debugger when showing the FPU state window and stepping forward line-by-line. This only happens if the dynamic core is using the dh FPU implementation. There are three different FPU code paths in dosbox, the normal core that interprets everything, the dynamic core without dh FPU that runs recompiled code but calls the normal FPU functions, and then the new dh FPU that does not share the same functions and data structures with normal core. This normal core run branch is triggered when the code block was modified by the running program for more than 4 times. The first 3 times the dyn core will just recompile the guest code. Starting from the 4th time it will decide this part is frequently changed so there is no point to refill the dynrec cache and just call the normal core to run it. Performance impact and testing: - DOS games: Quake, Doom, Elite II (and more) plus all DOS benchmarks in PERFDOS2 did not follow this code path (presumably because they're not repeatedly modifying the same clode block), and therefore there is no performance impact. - Win98 tests by @rderooy using DOSBox-X show a ~20% and 5% impact. joncampbell123/dosbox-x#3965 (comment) Future optimizations: - The first syncing (from dh FPU to normal FPU) be made conditional based on the dynamic core's "state_used" flag (however, this flag is currently reset by the prior hardware FPU saver, so that needs fixing too). - The second syncing (from normal FPU back to dh FPU) cannot be made conditional yet beacuse there is no flag indicating if the normal FPU was used, therefore this would need to be added. Signed-off-by: kcgen <kcgen@users.noreply.github.com>
Regarding the performance hit, I believe the emulation first needs to be correct, then fast. This PR and the next one (#3969) fixe a legit bug that causes certain programs to not function properly. Given the focus of the X project being "accurate" and to be "a platform for running DOS applications", being correct is definitely more important here than other dosbox forks. An example of production application not functioning properly is that on x86 dynamic core, debuggers won't be able to see FPU states when debugging code. This is now fixed. That said, the root cause of the bug and the need for "syncing the 2 fpu states" is because we have 2 fpu states in the first place, which I suppose is due to historical reason. I believe we have an opportunity to refactor the fpu emulation code and unify the 2 states into a single, lean data structure and this will eliminate the need for the syncing and get the speed back, not to mention the code will be shorter and simpler. |
@LiFengSC -> Nice! This is exactly what @finalpatch also mentioned - Is this something you want to tackle? (Or are you @finalpatch ?) |
Sorry that was me. I accidentally commented through my work account. Rewriting the fpu code is a fairly big chunk of work and has some risk. At this point I feel I'm not super motivated to do this, especially when things are now working. |
hey np! maybe it's something you'll want to chip away at, even tiny bits at a time. It's great having your help make DOSBox (as a whole) better! |
What issue(s) does this PR address?
This is the same change as in dosbox-staging/dosbox-staging#2248
Basically this fixes the fpu state out of sync problem when dynamic core calls normal core to handle self modification code.
Does this PR introduce new feature(s)?
No
Does this PR introduce any breaking change(s)?
No
Additional information
I tested this with dos navigator:
I used mingw build because the visual studio build uses 64-bit floats and won't run dos navigator.