Skip to content
This repository has been archived by the owner on Jan 28, 2023. It is now read-only.

Fix TEST instruction emulation #95

Merged
merged 2 commits into from
Sep 12, 2018
Merged

Fix TEST instruction emulation #95

merged 2 commits into from
Sep 12, 2018

Conversation

raphaelning
Copy link
Contributor

@raphaelning raphaelning commented Sep 7, 2018

This PR fixes two issues with the instruction emulator:

  1. Many unsupported MMIO instructions (e.g. TEST r/mN, rN, XCHG r/mN, rN, etc.) lead to a host crash.
  2. TEST r/mN, rN is not supported.

@@ -1956,6 +1956,10 @@ static int vcpu_emulate_insn(struct vcpu_t *vcpu)
uint64 rip = vcpu->state->_rip;
uint64 va;

// Clean up the emulation context of the previous MMIO instruction, so that
// even if things go wrong, the behavior will still be predictable.
vcpu_init_emulator(vcpu);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

vcpu_init_emulator only needs to be called in EM_ERROR case?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we failed to decode/emulate the previous instruction, we should have terminated the guest already and wouldn't have arrived here. The intention here is to always reset the emulation context (vcpu->emulate_ctxt) before we start working on a new instruction, because emulate_ctxt is shared among all MMIO instructions executed by this vCPU, and we don't want to accidentally reuse anything from the previous context.

BTW, I have made some changes to this patch, and will upload the new version later on.

@@ -1991,8 +1995,11 @@ static int vcpu_emulate_insn(struct vcpu_t *vcpu)
em_ctxt->rip = rip;
rc = em_decode_insn(em_ctxt, instr);
if (rc < 0) {
hax_panic_vcpu(vcpu, "%s: em_decode_insn() failed: vcpu_id=%u",
__func__, vcpu->vcpu_id);
hax_panic_vcpu(vcpu, "em_decode_insn() failed: vcpu_id=%u,"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we print these info as well?
ctxt->mode, ctxt->override_operand_size

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, sorry I missed this comment. I'll instead add a call to dump_vmcs(), which will give us (among other info) guest CR0 (GUEST_CR0 of VMCS) and CS (GUEST_CS_AR of VMCS), from which we'll be able to derive ctxt->mode, etc.

@raphaelning raphaelning force-pushed the insn-emu-test branch 2 times, most recently from c4bcf1d to de3967d Compare September 10, 2018 06:14
When decoding an instruction with an unsupported opcode (indicated
by the INSN_NOTIMPL flag), em_decode_insn() does not fail, which
can lead to a disaster in em_emulate_insn(), e.g. calling an
invalid handler function (soft_handler == NULL) and causing a host
kernel panic (#93).

1. In em_decode_insn(), check if the opcode is unsupported, i.e.
   the INSN_NOTIMPL flag is set or there is no emulation handler.
   If so, return a fatal error, raise a vCPU panic, and log the
   raw bytes of the instruction.
2. Before decoding a new instruction, reset the emulation context,
   so the old context is not accidentally referred to.
3. Add unit tests for two unsupported opcode cases. This requires
   refactoring EmulatorTest::run() first.

Signed-off-by: Yu Ning <yu.ning@intel.com>
Thanks to the scalable emulator design, implementing opcode 0x84
and 0x85 (TEST r/mN, rN) is quite simple. In addition, add a unit
test for the TEST instruction, and update one of the unimplemented
opcode unit tests to use XCHG instead of TEST.

Fixes #93.

Signed-off-by: Yu Ning <yu.ning@intel.com>
@mborgerson
Copy link
Contributor

mborgerson commented Sep 10, 2018

@raphaelning Thanks for addressing both of these issues! I'll try to test these changes within the next couple of days and verify (a) the kernel panic no longer happens and (b) this variant of the test instruction works as expected.

Copy link
Contributor

@junxiaoc junxiaoc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just went through the change, it looks good, one small comment is for commit comment. I think it is better to change "Thanks.... quite simple" to "Implemented optcode 0x84 and 0x85(TEST r/mN, rN)."?

@raphaelning
Copy link
Contributor Author

@junxiaoc Ah, right, the wording isn't the best. Let's wait for the test result, in case there's also something in the code that needs to be fixed as well.

opcode_group = &opcode->group[ctxt->modrm.opc];
opcode->handler = opcode_group->handler;
if (!opcode_group->handler) {
const struct em_opcode_t *group_opcode;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moving the variable declaration here isn't necessary, plus it's probably declared above for maximum compatibly across compilers

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My thinking was to limit the scope of this variable: it is not used outside this if block, so there's no need to make it available for a larger scope. I thought this was a common practice in C programming, e.g. here's an advocate for it:

https://wiki.sei.cmu.edu/confluence/display/c/DCL19-C.+Minimize+the+scope+of+variables+and+functions

In addition, I thought moving the variable declaration closer to where it's used would also make the code easier to read. What compiler error can this change possibly cause?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

C89-compilant compilers and gcc -pendantic would fail on local-scoped variable declarations.

Googling around, most people, like you and me, do seem to prefer local-scoped variable declarations. (Especially when C++ variables are initialized with constructor calls, limiting scope can be advantageous.) However, one remark that did stand out was "Reusing variable names with a different type is prevented".

Other than that, most compilers will try to limit scope and move variables into registers anyway, so it's a moot point.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, that's interesting. Although I do have some vague memory of C89, I guess C99 was already prevalent when I learned coding. The compilers we use for HAXM (MSVC 2017 and Clang) are new enough to accept C99 syntax, so I wouldn't worry about compatibility with C89. In fact, we may consider enabling C11 support in these compilers some day.

However, one remark that did stand out was "Reusing variable names with a different type is prevented"

To achieve that, I'd rather avoid writing long functions, so we could easily catch variable naming conflicts in code review ;-)

@mborgerson
Copy link
Contributor

Confirmed (a) in my comment above, that the new updated driver no longer causes a kernel panic. Have not explicitly verified correctness of (b) yet, but my target appears to be getting much farther.

@raphaelning
Copy link
Contributor Author

@mborgerson Thanks! If you have time for one more test, could you apply only the first patch 4033099 and capture HAXM logs? The purpose of this patch is not only to avoid a host kernel panic, but also to log useful information about the unsupported MMIO instruction.

You can run the following command to catch live HAXM output:

log stream --predicate 'eventMessage contains "hax"' --info

@wcwang wcwang merged commit 876886c into master Sep 12, 2018
@wcwang wcwang deleted the insn-emu-test branch September 12, 2018 05:03
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants