Jit: Remove unsafe MOV optimization #3968

hthh · 2016-06-30T12:05:34Z

This optimization broke arithXex in rare cases by emitting XOR where MOV was expected.

We could instead enforce locking of flags around MOV operations, but I'd prefer for the low level XEmitter interface to not optimize things.

Optimistically, this might fix https://dolp.in/i9661 but I haven't tried it.

This change is

delroth · 2016-06-30T12:07:29Z

@dolphin-emu-bot rebuild

I'd prefer if we didn't add another method for this if possible. Can we somehow detect when the XOR optimization is safe? @mmastrac

hthh · 2016-06-30T13:27:38Z

Thanks for the build :)

There are two other options, I think:

First, we could remove the optimization and function altogether. I don't know if it has much impact. I'd guess it just saves around three bytes in a handful of places?

Alternately, we can leave the optimization as-is, and require a lock around each sensitive MOV:

    if (d != b)
    {
      LockFlags();
      MOV(32, gpr.R(d), gpr.R(b));
      UnlockFlags();
    }

I'm not a fan of this option, because I suspect it will cause very subtle errors of a similar nature in the future (and there might be unnoticed parts of the code that need locks added already).

If you'd prefer a different approach, let me know and I'll remove the xor reg, reg optimization, or add the locks to arithXex.

mmastrac · 2016-06-30T14:41:44Z

I found this bug in the C++ register cache and ended up locking the carry flag for the block. mmastrac@aef8ada. I should have moved that change out when I split the MOV_sum code out.

I will double check to see if there were any other cases where I needed to lock the carry flag.

The best way to fix this would be to add an assertion that specifies that the carry flag is indeterminate after a MOV. That would surface any assumptions very quickly.

hthh · 2016-06-30T15:15:11Z

I agree that you could add a system for tracking flags, which could find uses of the overflow, sign, zero, adjust, parity and carry flags after a MOV, but before they are redefined by another instruction... This would probably find most issues (not sure how it'd work across branches). I'm less convinced by the tradeoff between the complexity/maintainability of such a system and the benefits of generating "xor eax, eax" instead of "mov eax, 0".

FWIW, at the moment, I think this MOV is unsafe in the C++ register cache branch: https://github.com/mmastrac/dolphin/blob/69916f2ea64123ce15b1c0e47b8418c970e57ded/Source/Core/Core/PowerPC/Jit64/Jit_Integer.cpp#L1431

edit: Okay, I'm going to stop rambling about this and go with whatever you prefer, but on your branch it does seem pretty confusing that gpr.LockCarry() and LockFlags() are entirely separate things, and that while carry is locked, the actual flag can be modified deliberately by CMC and may be accidentally modified by MOV (because it doesn't just emit mov like the other instruction functions).

mmastrac · 2016-06-30T15:32:26Z

It's really just a performance win from making register zeros a one-byte instruction rather than five, saving us some space in the CPU cache. Unfortunately there are no quick-zero instructions that don't stomp on flags.

I think there are two options here. Either we lock the flags around this function to prevent us from generating XOR, or we remove the optimization entirely until we have some flag-instrumenting code in x64Emitter. I don't think the indeterminate flag code would be terribly difficult, but it would require instrumenting each opcode-generating function with its flag effects.

I would lean towards removing the optimization for now, since there is probably also an issue w/the register cache accidentally generating XOR in this code path as well.

This optimization broke arithXex in rare cases by emitting XOR where MOV was expected.

mmastrac · 2016-06-30T15:50:53Z

LGTM

mmastrac · 2016-06-30T15:51:19Z

@dolphin-emu-bot rebuild

hthh · 2016-06-30T15:59:36Z

@mmastrac thanks :)

I think flag instrumentation code in x64Emitter might open up more interesting possibilities (although as the instrumentation gradually gets more complex I wonder if it'd be easier to start a new JitIL).

Jit: Remove unsafe MOV optimization

e57333a

This optimization broke arithXex in rare cases by emitting XOR where MOV was expected.

hthh force-pushed the mov-bug branch from 161a025 to e57333a Compare June 30, 2016 15:44

delroth merged commit 4888470 into dolphin-emu:master Jun 30, 2016

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Jit: Remove unsafe MOV optimization #3968

Jit: Remove unsafe MOV optimization #3968

hthh commented Jun 30, 2016 •

edited by delroth

delroth commented Jun 30, 2016

hthh commented Jun 30, 2016

mmastrac commented Jun 30, 2016 •

edited

hthh commented Jun 30, 2016 •

edited

mmastrac commented Jun 30, 2016

mmastrac commented Jun 30, 2016

mmastrac commented Jun 30, 2016

hthh commented Jun 30, 2016

Jit: Remove unsafe MOV optimization #3968

Jit: Remove unsafe MOV optimization #3968

Conversation

hthh commented Jun 30, 2016 • edited by delroth

delroth commented Jun 30, 2016

hthh commented Jun 30, 2016

mmastrac commented Jun 30, 2016 • edited

hthh commented Jun 30, 2016 • edited

mmastrac commented Jun 30, 2016

mmastrac commented Jun 30, 2016

mmastrac commented Jun 30, 2016

hthh commented Jun 30, 2016

hthh commented Jun 30, 2016 •

edited by delroth

mmastrac commented Jun 30, 2016 •

edited

hthh commented Jun 30, 2016 •

edited