Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Jit: Remove unsafe MOV optimization #3968

Merged
merged 1 commit into from Jun 30, 2016
Merged

Conversation

hthh
Copy link

@hthh hthh commented Jun 30, 2016

This optimization broke arithXex in rare cases by emitting XOR where MOV was expected.

We could instead enforce locking of flags around MOV operations, but I'd prefer for the low level XEmitter interface to not optimize things.

Optimistically, this might fix https://dolp.in/i9661 but I haven't tried it.


This change is Reviewable

@delroth
Copy link
Member

delroth commented Jun 30, 2016

@dolphin-emu-bot rebuild

I'd prefer if we didn't add another method for this if possible. Can we somehow detect when the XOR optimization is safe? @mmastrac

@hthh
Copy link
Author

hthh commented Jun 30, 2016

Thanks for the build :)

There are two other options, I think:

First, we could remove the optimization and function altogether. I don't know if it has much impact. I'd guess it just saves around three bytes in a handful of places?

Alternately, we can leave the optimization as-is, and require a lock around each sensitive MOV:

    if (d != b)
    {
      LockFlags();
      MOV(32, gpr.R(d), gpr.R(b));
      UnlockFlags();
    }

I'm not a fan of this option, because I suspect it will cause very subtle errors of a similar nature in the future (and there might be unnoticed parts of the code that need locks added already).

If you'd prefer a different approach, let me know and I'll remove the xor reg, reg optimization, or add the locks to arithXex.

@mmastrac
Copy link
Member

mmastrac commented Jun 30, 2016

I found this bug in the C++ register cache and ended up locking the carry flag for the block. mmastrac@aef8ada. I should have moved that change out when I split the MOV_sum code out.

I will double check to see if there were any other cases where I needed to lock the carry flag.

The best way to fix this would be to add an assertion that specifies that the carry flag is indeterminate after a MOV. That would surface any assumptions very quickly.

@hthh
Copy link
Author

hthh commented Jun 30, 2016

I agree that you could add a system for tracking flags, which could find uses of the overflow, sign, zero, adjust, parity and carry flags after a MOV, but before they are redefined by another instruction... This would probably find most issues (not sure how it'd work across branches). I'm less convinced by the tradeoff between the complexity/maintainability of such a system and the benefits of generating "xor eax, eax" instead of "mov eax, 0".

FWIW, at the moment, I think this MOV is unsafe in the C++ register cache branch: https://github.com/mmastrac/dolphin/blob/69916f2ea64123ce15b1c0e47b8418c970e57ded/Source/Core/Core/PowerPC/Jit64/Jit_Integer.cpp#L1431

edit: Okay, I'm going to stop rambling about this and go with whatever you prefer, but on your branch it does seem pretty confusing that gpr.LockCarry() and LockFlags() are entirely separate things, and that while carry is locked, the actual flag can be modified deliberately by CMC and may be accidentally modified by MOV (because it doesn't just emit mov like the other instruction functions).

@mmastrac
Copy link
Member

It's really just a performance win from making register zeros a one-byte instruction rather than five, saving us some space in the CPU cache. Unfortunately there are no quick-zero instructions that don't stomp on flags.

I think there are two options here. Either we lock the flags around this function to prevent us from generating XOR, or we remove the optimization entirely until we have some flag-instrumenting code in x64Emitter. I don't think the indeterminate flag code would be terribly difficult, but it would require instrumenting each opcode-generating function with its flag effects.

I would lean towards removing the optimization for now, since there is probably also an issue w/the register cache accidentally generating XOR in this code path as well.

This optimization broke arithXex in rare cases by
emitting XOR where MOV was expected.
@mmastrac
Copy link
Member

LGTM

@mmastrac
Copy link
Member

@dolphin-emu-bot rebuild

@hthh
Copy link
Author

hthh commented Jun 30, 2016

@mmastrac thanks :)

I think flag instrumentation code in x64Emitter might open up more interesting possibilities (although as the instrumentation gradually gets more complex I wonder if it'd be easier to start a new JitIL).

@delroth delroth merged commit 4888470 into dolphin-emu:master Jun 30, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
3 participants