New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Jit: Load the memory register only when the msr bits have changed and do not use jumps to load it. #12048
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was looking into the same thing in PR #11958, but in the end I never actually took it to the conclusion you did in your last commit here. Good to see it implemented in Jit64 too.
| LDR(IndexType::Unsigned, ARM64Reg::W27, PPC_REG, PPCSTATE_OFF(msr)); | ||
| TST(ARM64Reg::W27, LogicalImm(1 << (31 - 27), 32)); | ||
| MOVP2R(MEM_REG, jo.fastmem_arena ? memory.GetLogicalBase() : memory.GetLogicalPageMappingsBase()); | ||
| MOVP2R(ARM64Reg::X25, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please put the TST after the MOVP2Rs (both here and in the other instances of code like this). Putting the TST right after the LDR means that in-order CPUs won't be able to run the MOVP2Rs until the load completes.
| MOV(64, R(RMEM), ImmPtr(memory.GetLogicalBase())); | ||
| MOV(64, R(RSCRATCH2), ImmPtr(memory.GetPhysicalBase())); | ||
| CMOVcc(64, RMEM, R(RSCRATCH2), CC_Z); | ||
| MOV(64, PPCSTATE(mem_ptr), R(RMEM)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This code is repeated between here and Jit_Branch.cpp. Please split it out to a separate function, like you did with UpdateMembase.
| "PC {:#018x}, access address {:#018x}, memory base {:#018x}, MSR.DR {}", | ||
| ctx->CTX_PC, access_address, memory_base, m_ppc_state.msr.DR); | ||
| "PC {:#018x}, access address {:#018x}, memory base {:#018x}, MSR.DR {}, memptr {:#018x}, pbase {:#018x}, lbase {:#018x}", | ||
| ctx->CTX_PC, access_address, memory_base, m_ppc_state.msr.DR, (u64)m_ppc_state.mem_ptr, (u64)memory.GetPhysicalBase(), (u64)memory.GetLogicalBase()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We prefer C++ style casts (static_cast) over C-style casts.
Actually, there's something even better you can do here: You can use fmt::ptr. Example:
dolphin/Source/Core/Common/JitRegister.cpp
Line 108 in 7f40c6f
| const auto entry = fmt::format("{} {:x} {}\n", fmt::ptr(base_address), code_size, symbol_name); |
| @@ -311,6 +311,9 @@ class JitArm64 : public JitBase, public Arm64Gen::ARM64CodeBlock, public CommonA | |||
| void BeginTimeProfile(JitBlock* b); | |||
| void EndTimeProfile(JitBlock* b); | |||
|
|
|||
| // RMEM | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is nothing called "RMEM" in JitArm64. I assume you mean MEM_REG. Though with that said, I'm not sure how much information this comment really imparts...
| ARM64Reg msr = ARM64Reg::W27; | ||
| LDR(IndexType::Unsigned, msr, PPC_REG, PPCSTATE_OFF(msr)); | ||
|
|
||
| UpdateMembase(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we really need to do UpdateMembase each time we're in the dispatcher? If you instead update the value of MEM_REG every time the msr changes, you shouldn't need to update MEM_REG here, because MEM_REG is callee-saved and isn't used for anything else in the JIT, and so should never get overwritten.
(I'm not sure if Jit64's RMEM is callee-saved.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah you're right, we should only need it when going through enter_code right?
I tired that but it didn't work, i'll look into why eventually but for now i need to catch a plane :).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, you'll need it for enter_code, but other than that it should just be needed when msr is changed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(I'm not sure if Jit64's RMEM is callee-saved.)
Jit64's RMEM is callee-saved.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, now i've finally figured this out, i forgot that GlobalAdvance also checks exceptions and missed the load there. Now it works for both x86_64 and aarch64 to only load the memory register when needed.
I'm a bit puzzled why it worked on x86_64 without that, but i guess the fastmem implementation there might be a bit more forgiving.
| #endif | ||
| #ifdef _M_ARM_64 | ||
| ppc_state.mem_ptr = m_jit->jo.fastmem_arena ? memory.GetLogicalBase() : memory.GetLogicalPageMappingsBase(); | ||
| #endif |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could probably just use the AArch64 variant of this on all CPUs. Jit64 shouldn't rely on RMEM containing the memory base when there is no fastmem_arena, since in that case there is no memory base.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall the code is starting to look pretty good, but the commit history isn't very clean. The following commits should be melded into the commits that are the cause of the problems they're fixing:
- JitArm64/Jit64: some cleanups for the previous changes.
- JitArm64/Jit64: Fix linter errors.
- JitArm64/Jit64: Remove now unused variable.
- JitArm64/Jit64: Don't update the mem_ptr in JitInterface if m_jit is …
Or, alternatively... The approach this PR is taking has changed a bit since the PR was started, and some of the earlier changes in the PR are effectively made obsolete by later changes in the PR, especially "JitArm64/Jit64: Some minor optimizations to how msr is compared and l…". Maybe you could just squash all the commits into one?
| jo.fastmem_arena ? memory.GetLogicalBase() : memory.GetLogicalPageMappingsBase()); | ||
| SetJumpTarget(membaseend); | ||
| ARM64Reg msr = ARM64Reg::W27; | ||
| LDR(IndexType::Unsigned, msr, PPC_REG, PPCSTATE_OFF(msr)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This LDR can be moved back to where it was before this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This has now been moved, but not to where it was before this PR. Was that done intentionally, for better codegen?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sorry, i moved it back
Hi, sorry for that, i was once told it's easier to follow a PR when the changes are incremental and only squashed once the PR is ready to go since it should make seeing what actually changed in between pushes easier. I'll squash everything. |
|
Well, yes and no. We like to have multiple commits when they're cleanly separated different steps, but fixes to minor problems that were introduced earlier in the PR are best amended in the commit they were introduced. That way, reading through the whole PR becomes easier, though it does come with the tradeoff of making it harder to track what has changed between different versions of a PR, like you say. |
|
Let's get this merged after the next beta comes out. |
|
@JosJuice Do you want to merge this now? |
Not sure if this is worth the time, but i made it so here we go.
Basically turns the membase switch into a single instruction (except when msr changes, then it's more).