New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactored x86 instruction emulator #42

Merged
merged 4 commits into from May 24, 2018

Conversation

Projects
None yet
3 participants
@AlexAltea
Copy link
Contributor

AlexAltea commented Apr 6, 2018

I started rewriting the instruction emulator, as the current one is oversimplified to support a small subset of common instructions, but can't easily be extended to support others. This will fix issue #12 as well as many issues emulation-related in my guests.

Please let me know your thoughts/questions on the matter, and report any design issue, so that I can modify it in time. I'll keep updating this post as I progress with this task.

Design

The new emulator I'm working on is heavily inspired by KVM's x86 emulator [1], following these design choices (description is oversimplified but gives should give a rough idea):

  • Decoupled instruction decoder and instruction emulator. The decoder parses the instruction, storing the corresponding emulator handler and properties of destination/source operands in a context-like structure. The emulator uses this context to dispatch HAX_EXIT_FAST_MMIO to handle destination/source memory-operands and will execute the handler.

  • Emulator handlers will come in two flavors:

    1. Most instructions, e.g. ALU-related, will be functions using inlined assembly (see remark 1) with just the emulated instruction where all operands have been replaced by fixed register-operands. These instructions are followed of additional code to save EFLAGS/RFLAGS.
    2. Some instructions, e.g. branching-related, will be handled by custom functions specifically written for that particular instruction.
  • As expected, this emulator will be decoupled from vcpu.c into its own file emulate.c.

A high-level overview+comparison of the proposed changes can be found below (apologies for the bad handwriting but that was faster to sketch everything):

  • HAXM (old): haxm-old
  • HAXM (new): haxm-new

Remarks

  1. MSVC doesn't support inlined assembly on x64, and (correct me if I'm mistaken) Xcode/LLVM doesn't support MASM syntax that we could use as alternative. This leaves us with two choices:

    1. Duplicate the code for macOS and Windows driver-specific code, and let the linker solve the issue. This will be around 400 lines of code of otherwise unnecessary platform-specific code.
    2. Write the assembly emulator handlers in (say) NASM, and configure Windows/macOS projects to use this external tool. This will add an extra dependency, but the code will be cleaner.
  2. As far as I've analyzed the situation, none of these changes will require changes to the API that might break clients like QEMU.

@AlexAltea AlexAltea force-pushed the AlexAltea:master branch from 98fb6bb to 1f3d2db Apr 6, 2018

@AlexAltea AlexAltea referenced this pull request Apr 7, 2018

Closed

Add support for BMI1 instructions in emulated MMIO #9

3 of 3 tasks complete
@raphaelning

This comment has been minimized.

Copy link
Contributor

raphaelning commented Apr 9, 2018

First of all, I'm really glad to see this pull request, and I want to thank you for your interest in contributing to HAXM. We welcome all contributions, especially something like this, which I think will enable more guest OSes to run under HAXM.

I completely agree with your assessment of the current instruction emulation implementation in HAXM. The vcpu_simple_decode() function you see now in vcpu.c used to be indeed simple as it name suggests, and I think it only supported a few variants of the MOV instruction at the time. One of the first patches I developed for HAXM was to expand it to support more MMIO instructions. But I did it in a hurry without referring to KVM code, and only came up with a flawed design, which is pretty much limited to the few types of MMIO instructions HAXM can emulate now. And I also came to realize that adding hundreds of lines of code to the already very long vcpu.c was a bad idea. So I'd support your design even if it was only for the refactoring.

On the design itself:

  1. Putting the new code in a separate source file (emulate.c) is definitely the way to go. At some point I think you'll need to integrate the new instruction emulator with vcpu.c, and that may not be a trivial change. Since vcpu.c is like the hub of the HAXM source tree that gets updated pretty often, in order to avoid rebase headaches, maybe you could design a clean interface between vcpu.c and the instruction emulator first and upload it for review. It would be great if the interface is applicable to the current emulator too, so we could refactor vcpu.c early, but that's not a requirement.
  2. Utilizing the hardware for instruction emulation is a clever and scalable approach. The only issue I can see now is that if the instruction is not supported on all CPU models (e.g. bextr as you mentioned in AlexAltea/orbital#9 is only available on Haswell and later), we'll need to check CPUID to avoid running into an invalid opcode error.
  3. As to where to put the new assembly code, I'd prefer the second solution which avoids code duplication. Right now as you can see, HAXM has already implemented the same low-level functions with different assembly flavors for Windows x86, Windows x64 and macOS. We could do the same for the new code, but that doesn't sound as easy as adding a build-time dependency. What do you think?
  4. I've taken a quick look at the MASM code in the PR and I think it looks pretty clean. When you later have an updated patch that you consider ready to be merged, please let us know how it can be tested if our usual test cases (booting Linux guests in QEMU) doesn't cover the new code.
  5. Your Orbital project looks very interesting. Is HAXM relevant to you for both Windows and macOS?

@raphaelning raphaelning added the feature label Apr 9, 2018

@AlexAltea

This comment has been minimized.

Copy link
Contributor

AlexAltea commented Apr 10, 2018

As to where to put the new assembly code, I'd prefer the second solution which avoids code duplication. [...] What do you think?

Completely agreed. I will do that then.

Your Orbital project looks very interesting. Is HAXM relevant to you for both Windows and macOS?

Thank you. For now, my project is Windows/Linux-only for technical reasons unrelated to HAXM. Nevertheless, I'll make sure the emulator redesign doesn't break macOS (or anything else).

@AlexAltea AlexAltea force-pushed the AlexAltea:master branch from 58fe255 to c1781e6 Apr 10, 2018

@raphaelning
Copy link
Contributor

raphaelning left a comment

I suppose this PR is still an RFC, so I focused on reviewing the interface (emulate.h). It looks pretty clean, but please take a look at my comments.

@@ -134,8 +134,7 @@
</PropertyGroup>
<!-- Needed by any VcxProj -->
<Import Project="$(VCTargetsPath)\Microsoft.Cpp.props" />
<ImportGroup Label="ExtensionSettings">
</ImportGroup>
<ImportGroup Label="ExtensionSettings" />

This comment has been minimized.

@raphaelning

raphaelning Apr 11, 2018

Contributor

Just curious, is this generated by Visual Studio? If so, what version are you using?

This comment has been minimized.

@AlexAltea

AlexAltea Apr 16, 2018

Contributor

Visual Studio 2017 (v15.6.4). It was auto-generated, but presumably has no effect, I can revert this part to avoid polluting the history.

This comment has been minimized.

@raphaelning

raphaelning Apr 20, 2018

Contributor

OK don't worry about this. As long as it's auto-generated, it's fine. We created the VC++ project using VS2015, but we can upgrade to VS2017.

@@ -180,31 +179,49 @@
<ClCompile>
<SDLCheck>true</SDLCheck>
</ClCompile>
<MASM>
<UseSafeExceptionHandlers>true</UseSafeExceptionHandlers>

This comment has been minimized.

@raphaelning

raphaelning Apr 11, 2018

Contributor

Same here, I suppose this is this auto-generated?

This comment has been minimized.

@AlexAltea

AlexAltea Apr 16, 2018

Contributor

I added this manually since it's required under the current settings in order to build .asm files. Same setting had also been applied by you guys in IntelHaxm.vcxproj.

Anyway, we still need to check whether we can use MASM for the macOS builds and if not, then switch to something else instead (NASM?), which will remove all of these settings in VS.

This comment has been minimized.

@raphaelning

raphaelning Apr 20, 2018

Contributor

I see. So basically this change is a result of putting MASM code in core/. I'd still prefer core/ to host arch-independent code only.

we still need to check whether we can use MASM for the macOS builds and if not, then switch to something else instead (NASM?)

Right. I doubt there's an assembler that understands MASM syntax and works on macOS, so maybe we should switch to a more portable syntax sooner rather than later? I've confirmed that NASM is supported on macOS: http://brewformulas.org/Nasm

But I also understand that your priority is Windows (and probably x86_64 only). If you want to keep the MASM code for now, could you move it to windows/? Eventually, we have to make the new instruction emulator work on Mac before we can release it.

@@ -239,6 +256,8 @@
<ItemGroup>
<!-- We only add items (e.g. form ClSourceFiles) that do not already exist (e.g in the ClCompile list), this avoids duplication -->
<ClCompile Include="@(ClSourceFiles)" Exclude="@(ClCompile)" />
<ClCompile Include="emulate.c" />

This comment has been minimized.

@raphaelning

raphaelning Apr 11, 2018

Contributor

What about these two lines? I'd actually prefer that they were auto-generated by Visual Studio. But I wonder why they are not grouped together with the existing .c and .asm files.

This comment has been minimized.

@AlexAltea

AlexAltea Apr 16, 2018

Contributor

Interesting, this was unintended. I'll fix this.

@@ -239,6 +256,8 @@
<ItemGroup>
<!-- We only add items (e.g. form ClSourceFiles) that do not already exist (e.g in the ClCompile list), this avoids duplication -->
<ClCompile Include="@(ClSourceFiles)" Exclude="@(ClCompile)" />
<ClCompile Include="emulate.c" />
<MASM Include="emulate_ops.asm" />

This comment has been minimized.

@raphaelning

raphaelning Apr 11, 2018

Contributor

The relative path is not given here, but I see you put it in core/. Since this is a Windows-specific source file, a better place for it would be windows/ (or maybe windows/amd64/? Not sure if this is x64-specific), and you can feel free to create a subdirectory. But if you plan to rewrite this for NASM so it can be shared by Windows and Mac, sure it can stay in core/.

This comment has been minimized.

@AlexAltea

AlexAltea Apr 16, 2018

Contributor

As mentioned above, the .asm may need to be rewritten in a language that we can easily integrate both in Xcode and Visual Studio build systems.


/* Emulate accesses to guest memory */
static int segmented_read(struct em_context_t *ctxt,
struct segmented_addr_t *addr, void *data, unsigned size)

This comment has been minimized.

@raphaelning

raphaelning Apr 11, 2018

Contributor

nit: Just a note on coding style. When wrapping a long line, we usually indent each new line with 8 spaces, or align the parameters:

static int segmented_read(struct em_context_t *ctxt,
                          struct segmented_addr_t *addr,
                          void *data, unsigned size)
#define HAX_CORE_EMULATE_OPS_H_

/* Instruction handlers */
typedef void(*em_handler_t);

This comment has been minimized.

@raphaelning

raphaelning Apr 11, 2018

Contributor

nit: I guess this means typedef void (*em_handler_t)();? IMHO it would be more readable if it resembled a function declaration.

#define EM_CONTINUE 0

struct em_operand_t {
unsigned int bytes;

This comment has been minimized.

@raphaelning

raphaelning Apr 11, 2018

Contributor

nit: Any reason not to use a fixed width integer type like uint32_t? I actually don't know what bytes refers to here.

OP_IMM,
} type;
struct segmented_addr_t {
HAX_VADDR_T addr;

This comment has been minimized.

@raphaelning

raphaelning Apr 11, 2018

Contributor

Is HAX_VADDR_T defined anywhere? Do you mean vaddr_t?

This comment has been minimized.

@AlexAltea

AlexAltea Apr 16, 2018

Contributor

#if defined(_X86_)
typedef uint32_t mword;
#endif
#if defined (_AMD64_)
typedef uint64_t mword;
#endif
typedef mword HAX_VADDR_T;

But probably vaddr_t is more convenient.

This comment has been minimized.

@raphaelning

raphaelning Apr 20, 2018

Contributor

Right! Sorry I missed the definition (never used it myself).

} type;
struct segmented_addr_t {
HAX_VADDR_T addr;
unsigned segment;

This comment has been minimized.

@raphaelning

raphaelning Apr 11, 2018

Contributor

nit: Consider using a fixed width type?

uint32_t eflags;
};

int decode_insn(struct em_context_t *ctxt);

This comment has been minimized.

@raphaelning

raphaelning Apr 11, 2018

Contributor

Is this the function to be called from outside the emulator module, e.g. vcpu.c? If so, how to pass in the raw bytes of the instruction?

@AlexAltea

This comment has been minimized.

Copy link
Contributor

AlexAltea commented Apr 19, 2018

Testing

Developing the decoder/emulator will be easier with proper tests: running Linux guests might omits issues, and furthermore it doesn't reports the cause of an issue at all. For unit testing, I'm planning to use Microsoft's Native Unit Test Framework since it's integrated nicely into Visual Studio, but I'm open to any suggestions. If you prefer another framework or want the tests to run in macOS too, let me know.

image

The idea is mapping each instruction to a single TEST_METHOD where every instruction variant will be tested, e.g. and reg,mem, and mem,reg, and reg,imm, etc. In every variant, we will specify the initial state, the instruction will be assembled (using Keystone), then decoded+execute by the HAXM emulator and the output state will be verified.

An example of similar unit tests I've written to test emulators can be found here: https://github.com/AlexAltea/nucleus/blob/master/tests/cpu/ppc/ppc_integer.cpp.
Let me know if you agree with the overall design.

@raphaelning

This comment has been minimized.

Copy link
Contributor

raphaelning commented Apr 20, 2018

I really like the idea of using a unit testing framework! It would surely benefit not only the new instruction emulator component, but the entire HAXM project :)

I've never used the Microsoft Unit Testing Framework. Here are my questions (some may be dumb):

  1. Can we really run all tests in user space? That would be so cool, but I think most existing HAXM headers include hax_types_windows.h or hax_types_mac.h, which, in turn, includes headers from Windows DDK or Apple I/O Kit, so would there be a problem if you had to include such a header in your unit test .cpp (e.g. when you needed some data structures like vcpu_t)?
  2. Is it possible to run these unit tests outside Visual Studio, so that they can be automated?
  3. According to this article, Visual Studio 2017 also supports other unit testing frameworks like Google Test. I wonder if switching to Google Test now would make it easier to port the unit tests to Mac later (as well as to automate them)?

The idea of using an assembler API like Keystone is also brilliant. But does that mean we'll add a build-time dependency on Keystone (although only for the unit tests)? According to their GitHub page, Keystone is licensed under GPLv2, so I'm not sure if we have a license issue here...

@raphaelning

This comment has been minimized.

Copy link
Contributor

raphaelning commented Apr 20, 2018

I forgot to mention your sketch of the MMIO workflow - that's very nice! One of the biggest changes introduced by the new design is the "re-entry" arrow, so I'd like to understand it better. Basically, what is it about? What about the three small, black arrows?

I like the fact that the new design handles both vcpu_setup_fastmmio() and handle_mmio_post() in em_emulate_insn(), which is more consistent, probably because you now have a better per-vcpu emulation context.

@AlexAltea

This comment has been minimized.

Copy link
Contributor

AlexAltea commented Apr 20, 2018

Can we really run all tests in user space?

Yes, the emulator is completely isolated from the rest of the driver thanks to a very simple API [1] to perform simple tasks like reading/writing registers, memory, etc. The only external dependency would be some helper types. With few preprocessor lines we can alternate between hax_types* headers for kernel drivers and just a regular <stdint.h> typedef's. These are the only lines that we need in emulate.h to do this:

#ifdef HAX_TESTS
#include <stdint.h>
#else
#include "hax_types.h"
#endif

Is it possible to run these unit tests outside Visual Studio, so that they can be automated?

Good point. It might require plenty of manual work to do so (I haven't found any article on the matter), so maybe using Google Test might be more convenient. I will look into it. :-)

But does that mean we'll add a build-time dependency on Keystone (although only for the unit tests)?

Yes, I'd recommend using vcpkg, so all you need to do is to run vcpkg install keystone and you should be ready to build the tests without polluting the solution/project.

Keystone is licensed under GPLv2, so I'm not sure if we have a license issue here...

Apparently, they explicitly allowed compatibility with BSD licenses:
https://github.com/keystone-engine/keystone/blob/master/EXCEPTIONS-CLIENT

One of the biggest changes introduced by the new design is the "re-entry" arrow, so I'd like to understand it better. Basically, what is it about? What about the three small, black arrows?

Some instructions might require multiple reads/writes, which cannot be handled in a single request to QEMU. For instance, after decoding neg [addr], the emulator needs to: (1) read addr, (2) negate the value and (3) write it back to addr.

When the emulator needs to do a special operation, e.g. read/write MMIO, an "emulator exit" return code will signal every function in the driver call stack to return immediately (that's the small black arrows), until we are back in userland/QEMU to handle that request.

Then, once we return to the driver, specifically in vcpu_execute, we will check if the emulator had pending work, so we can continue with it. This process will repeat as many times as necessary. This is possible with relatively low overhead since the emulator context stores the decoded instruction, and other intermediates.

@AlexAltea AlexAltea force-pushed the AlexAltea:master branch from 9350425 to d191a3a Apr 22, 2018

@AlexAltea

This comment has been minimized.

Copy link
Contributor

AlexAltea commented Apr 22, 2018

I finished removing all Microsoft-specific code and using instead Google Test and NASM. The current revision builds now on Windows and macOS (of course, it's unusable at the current state since the decoder/emulator isn't finished yet).

@raphaelning
Copy link
Contributor

raphaelning left a comment

Thanks for switching to NASM and GTest! It seems you have got NASM integrated into Mac build too? That's awesome. I assume there's more effort needed to enable tests/ on Mac, but we can worry about it later.

@@ -135,6 +135,7 @@
<!-- Needed by any VcxProj -->
<Import Project="$(VCTargetsPath)\Microsoft.Cpp.props" />
<ImportGroup Label="ExtensionSettings">
<Import Project="$(VCTargetsPath)\BuildCustomizations\nasm.props" />

This comment has been minimized.

@raphaelning

raphaelning Apr 24, 2018

Contributor

Is this file missing from the git repo?

This comment has been minimized.

@AlexAltea

AlexAltea Apr 24, 2018

Contributor

This requires manually installing: https://github.com/ShiftMediaProject/VSNASM.

This is somewhat annoying. Although there is a NASM package for NuGet, unfortunately its build customizations do not work on Visual Studio 2017. I will consider updating that package, so that we don't have to deal with 3 separate sources (vcpkg, nuget, manual installation) for build-time dependencies.

This comment has been minimized.

@raphaelning

raphaelning Apr 25, 2018

Contributor

I see. We can live with the current approach, but it would be helpful if you could summarize the additional environment setup steps required to build and test this PR.

@@ -269,4 +274,7 @@
</ItemGroup>
<!-- /Necessary to pick up proper files from local directory when in the IDE-->
<Import Project="$(VCTargetsPath)\Microsoft.Cpp.targets" />
<ImportGroup Label="ExtensionTargets">
<Import Project="$(VCTargetsPath)\BuildCustomizations\nasm.targets" />

This comment has been minimized.

@raphaelning

raphaelning Apr 24, 2018

Contributor

Is this file missing from the git repo?

This comment has been minimized.

@AlexAltea

AlexAltea Apr 24, 2018

Contributor

(See answer above)

(RFLAGS_CF | RFLAGS_PF | RFLAGS_AF | RFLAGS_ZF | RFLAGS_SF | RFLAGS_OF)

typedef struct em_vcpu_ops_t {
uint64_t(*read_gpr)(void *vcpu, uint32_t reg_index);

This comment has been minimized.

@raphaelning

raphaelning Apr 24, 2018

Contributor

nit: Insert a space after the return type: uint64_t (*read_gpr)(void *vcpu, uint32_t reg_index);


typedef struct em_vcpu_ops_t {
uint64_t(*read_gpr)(void *vcpu, uint32_t reg_index);
void(*write_gpr)(void *vcpu, uint32_t reg_index, uint64_t value);

This comment has been minimized.

@raphaelning

raphaelning Apr 24, 2018

Contributor

nit: Insert a space after the return type: void (*write_gpr)(void *vcpu, uint32_t reg_index, uint64_t value);

void(*write_gpr)(void *vcpu, uint32_t reg_index, uint64_t value);
} em_vcpu_ops_t;

typedef void(em_operand_decoder_t)(em_context_t *ctxt,

This comment has been minimized.

@raphaelning

raphaelning Apr 24, 2018

Contributor
  1. nit: Insert a space after the return type.
  2. nit: Any reason not to declare em_operand_decoder_t as a function pointer like read_gpr and write_gpr above? That way you can use it later without *, e.g. em_operand_decoder_t decode_dst.
core/vcpu.c Outdated

static uint64_t vcpu_read_gpr(struct vcpu_t *vcpu, uint32_t reg_index)
{
switch (reg_index) {

This comment has been minimized.

@raphaelning

raphaelning Apr 24, 2018

Contributor

nit: It would be nice if we could replace this switch statement with just return vcpu->state->_regs[reg_index].

core/vcpu.c Outdated

static void vcpu_write_gpr(struct vcpu_t *vcpu, uint32_t reg_index, uint64_t value)
{
switch (reg_index) {

This comment has been minimized.

@raphaelning

raphaelning Apr 24, 2018

Contributor

nit: It would be nice if we could replace this switch statement with just vcpu->state->_regs[reg_index] = value.

@@ -0,0 +1,314 @@
/*

This comment has been minimized.

@raphaelning

raphaelning Apr 24, 2018

Contributor

Something to think about: Will this source file one day grow to 1000+ LoC as we add more tests to it? At some point maybe we need to split it into multiple .cpps.

This comment has been minimized.

@AlexAltea

AlexAltea Apr 24, 2018

Contributor

My plans are implementing all instructions currently supported, plus the ones needed for my guest kernel. Their corresponding tests will take around 1000 LoC, so I won't split anything.

A full test suite for the entire x86 instruction set will certainly be much larger, so at some point the file will need to be split. Can be considered future work. :-)

This comment has been minimized.

@raphaelning

raphaelning Apr 25, 2018

Contributor

Sounds good!

};

void test_f6alu_i08(const char* insn_name,
const std::vector<test_f6alu_t>& tests) {

This comment has been minimized.

@raphaelning

raphaelning Apr 24, 2018

Contributor

nit: We put the { that marks the beginning of a function body on its own line, but that's C coding style. This is C++ so it can follow a different style, but please be consistent and make SetUp() and run() do the same :)

nit x2: The same could be said about pointer declarations, i.e. type *var (C style) vs. type* var (C++ style).

@raphaelning

This comment has been minimized.

Copy link
Contributor

raphaelning commented Apr 24, 2018

I now understand the arrows in the illustration, thanks!

Glad to know that Keystone is compatible with our license. vcpkg seems to be a good way to add Keystone to Windows build, although it means we are using more than one tool for dependency management (GTest seems to be managed by NuGet). That's not a big issue, especially if getting rid of NuGet would break the integration of GTest with VS2017. On Mac, I assume we could just install the Homebrew package for Keystone.

@AlexAltea AlexAltea force-pushed the AlexAltea:master branch from 28c1c00 to 09dd2a4 Apr 27, 2018

@raphaelning
Copy link
Contributor

raphaelning left a comment

Sorry for the long wait! So the last patch of this series has enabled the new instruction emulator, but have you run an integration test? It's okay if it's not fully ready yet, and we can focus on unit testing.

core/vcpu.c Outdated
if (!em_ctxt->finished) {
switch (em_emulate_insn(em_ctxt)) {
case EM_EXIT_MMIO:
return HAX_EXIT_FAST_MMIO;

This comment has been minimized.

@raphaelning

raphaelning May 2, 2018

Contributor

Hmm, this may not be a proper return value. This function is basically the handler for HAX_VCPU_IOCTL_RUN, and the callers seem to expect no more than success/failure: 0 or HAX_EXIT indicates success, all other values (including HAX_RESUME - I have no idea why it was named like that) failure.

If you want to exit to QEMU with a HAX_EXIT_xxx status code, you should probably set htun->_exit_status = HAX_EXIT_FAST_MMIO, err = HAX_EXIT and goto out. We can't use return since we're still holding vcpu->tmutex.

core/vcpu.c Outdated
case EM_EXIT_MMIO:
return HAX_EXIT_FAST_MMIO;
case EM_ERROR:
return HAX_EXIT;

This comment has been minimized.

@raphaelning

raphaelning May 2, 2018

Contributor

HAX_EXIT indicates success. In case of an error, probably you should call hax_panic_vcpu, set err to some error code, and goto out?

core/vcpu.c Outdated
rc = em_emulate_insn(em_ctx);
switch (rc) {
case EM_EXIT_MMIO:
return HAX_EXIT_FAST_MMIO;

This comment has been minimized.

@raphaelning

raphaelning May 2, 2018

Contributor

This is not a valid return value. I've explained it in another comment above.

core/vcpu.c Outdated
}
rc = em_decode_insn(em_ctx, instr);
if (rc != EM_CONTINUE) {
return HAX_EXIT;

This comment has been minimized.

@raphaelning

raphaelning May 2, 2018

Contributor

This function is part of the HAX_VCPU_IOCTL_RUN handler where HAX_EXIT actually indicates success. See my other comment above on error handling.

core/vcpu.c Outdated
case EM_EXIT_MMIO:
return HAX_EXIT_FAST_MMIO;
case EM_ERROR:
return HAX_EXIT;

This comment has been minimized.

@raphaelning

raphaelning May 2, 2018

Contributor

This actually indicates success. See my other comment above on error handling.

core/vcpu.c Outdated
}
return HAX_RESUME;

This comment has been minimized.

@raphaelning

raphaelning May 2, 2018

Contributor

This actually indicates failure. See my other comment above.

@AlexAltea

This comment has been minimized.

Copy link
Contributor

AlexAltea commented May 8, 2018

Sorry for the long wait!

No worries, I also went away last week to take a small break. I've addressed the issues you mentioned. Emulator functions, in particular em_emulate_insn return EM_CONTINUE (= 0) on success, EM_EXIT* (> 0) when QEMU handling is required, EM_ERROR* (< 0) on errors. Therefore I've mapped:

  • EM_CONTINUE: Maps to HAX_EXIT, except when it's being called inside vcpu_execute as it means emulation is finished and we can continue executing the VM.
  • EM_EXIT* (> 0): Maps to HAX_EXIT.
  • EM_ERROR* (< 0): Maps to HAX_RESUME.

So the last patch of this series has enabled the new instruction emulator, but have you run an integration test?

Not yet, and I'm not sure what would be the best way to make integration tests to test all of HAXM at once. Is it alright if I simply try to launch Debian (as shown in [1]) and Android image?

[1] https://www.qemu.org/2017/11/22/haxm-usage-windows/

@raphaelning
Copy link
Contributor

raphaelning left a comment

Thanks! The mappings are correct, but sometimes we can't just do a simple return HAX_RESUME - see my new comments.

Is it alright if I simply try to launch Debian (as shown in [1]) and Android image?

Yes, both of these are important test cases. I'd also like to add a simple one: qemu-system-x86_64.exe -accel hax, which does not boot any guest OS but runs several ROMs, e.g. SeaBIOS, SeaVGABIOS, iPXE, etc. In fact, most of the code in vcpu_simple_decode() was verified with this boot ROM test. We have more test cases on our side, but as long as this PR passes the 3 basic tests, there won't be any major effort required to get it merged.

How much gap, in terms of functionality, is there between the old decoder/emulator and the new framework? At least, support for the REP prefix seems not ready yet (per a comment in decode_prefixes()), so the boot ROM test will probably fail.

core/vcpu.c Outdated
if (!em_ctxt->finished) {
rc = em_emulate_insn(em_ctxt);
if (rc < 0) {
return HAX_RESUME;

This comment has been minimized.

@raphaelning

raphaelning May 9, 2018

Contributor

Again, it's not safe to return from here because we are still holding a lock (vcpu->tmutex). Quoting one of my comments from last week:

In case of an error, probably you should call hax_panic_vcpu, set err to some error code, and goto out

hax_panic_vcpu() not only logs an error message, but also makes the vcpu panic, so QEMU can shut down the guest ASAP.

core/vcpu.c Outdated
return HAX_RESUME;
}
if (rc > 0) {
return HAX_EXIT;

This comment has been minimized.

@raphaelning

raphaelning May 9, 2018

Contributor

Can't just return here with lock held. Should set err = HAX_EXIT and goto out.

When rc == EM_EXIT_MMIO, is it safe to assume em_emulate_insn() has already set up htun for MMIO, e.g. htun->_exit_status = HAX_EXIT_FAST_MMIO? Probably so, but adding an assert() here wouldn't hurt.

This comment has been minimized.

@AlexAltea

AlexAltea May 9, 2018

Contributor

When rc == EM_EXIT_MMIO, is it safe to assume em_emulate_insn() has already set up htun for MMIO, e.g. htun->_exit_status = HAX_EXIT_FAST_MMIO?

Yes, EM_EXIT_MMIO is only returned after HAX_EXIT_FAST_MMIO is set, see e.g. vcpu_read_memory and vcpu_write_memory. The same can be assumed for any other EM_EXIT_* event.

but adding an assert() here wouldn't hurt.

Sure, you mean something like this, right?

if (rc == EM_EXIT_MMIO)
    assert(htun->_exit_status == HAX_EXIT_FAST_MMIO);

This comment has been minimized.

@raphaelning

raphaelning May 9, 2018

Contributor

Yes, exactly. Also I guess rc == EM_EXIT_MMIO itself could be asserted, since we expect no other value when rc > 0.

core/vcpu.c Outdated
dump_vmcs(vcpu);
return -1;
}
#else // !CONFIG_HAX_EPT2
if (!vcpu_read_guest_virtual(vcpu, va, &instr, INSTR_MAX_LEN, INSTR_MAX_LEN,
0)) {
0)) {

This comment has been minimized.

@raphaelning

raphaelning May 9, 2018

Contributor

nit: I'd prefer the original indentation style (consistent with how we wrap a long function call everywhere else). Or you could indent with 8 spaces if that's easier for you.

This comment has been minimized.

@AlexAltea

AlexAltea May 9, 2018

Contributor

Sorry, this might have been my IDE. I switch between few computers, and some editors aren't properly configured. I'll fix these (and other) lines.

This comment has been minimized.

@raphaelning

raphaelning May 9, 2018

Contributor

Thanks! We don't have a script to check coding style or fix issues automatically, so I wouldn't push too hard on this :)

core/vcpu.c Outdated
hax_panic_vcpu(vcpu, "Error reading instruction at 0x%llx for decoding"
" (CS:IP=0x%llx:0x%llx)\n", va, cs_base, rip);
" (CS:IP=0x%llx:0x%llx)\n", va, cs_base, rip);

This comment has been minimized.

@raphaelning

raphaelning May 9, 2018

Contributor

nit: Same here.

core/vcpu.c Outdated
}
rc = em_decode_insn(em_ctx, instr);
if (rc != EM_CONTINUE) {
return HAX_RESUME;

This comment has been minimized.

@raphaelning

raphaelning May 9, 2018

Contributor

Call hax_panic_vcpu() before return so QEMU will not try to run this vcpu over and again.

core/vcpu.c Outdated
num++;
rc = em_emulate_insn(em_ctx);
if (rc == EM_ERROR) {
return HAX_RESUME;

This comment has been minimized.

@raphaelning

raphaelning May 9, 2018

Contributor

Same here.

core/vcpu.c Outdated
uint64_t *value, uint32_t size)
{
struct vcpu_t *vcpu = obj;
struct hax_tunnel *htun;

This comment has been minimized.

@raphaelning

raphaelning May 9, 2018

Contributor

nit: Move this declaration into the if block below.

core/vcpu.c Outdated
{
struct vcpu_t *vcpu = obj;
struct hax_tunnel *htun;
struct hax_fastmmio *hft;

This comment has been minimized.

@raphaelning

raphaelning May 9, 2018

Contributor

nit: Same here.

core/vcpu.c Outdated
}
}
struct vcpu_t *vcpu = obj;
struct hax_tunnel *htun;

This comment has been minimized.

@raphaelning

raphaelning May 9, 2018

Contributor

nit: Move this declaration into the if block below.

core/vcpu.c Outdated
}
struct vcpu_t *vcpu = obj;
struct hax_tunnel *htun;
struct hax_fastmmio *hft;

This comment has been minimized.

@raphaelning

raphaelning May 9, 2018

Contributor

nit: Same here.

@AlexAltea AlexAltea force-pushed the AlexAltea:master branch from a578b70 to 188ecd5 May 9, 2018

@AlexAltea

This comment has been minimized.

Copy link
Contributor

AlexAltea commented May 10, 2018

Ok. Thanks for your help, I've addressed all your comments.

How much gap, in terms of functionality, is there between the old decoder/emulator and the new framework?

I've finished porting missing features today, including support for REP* prefixes. Now we should have the same functionality as the old emulator (plus a couple of new instructions). Regarding features, this PR can be considered "finished".

Now I'm testing Debian, but I'm still stuck at the BIOS stage: the OS freezes immediately after starting the QEMU and is followed by a kernel panic. There might be other issues like this, so I'm going to test and fix them all. I will let you know once I consider this PR to be "ready to merge".

@raphaelning
Copy link
Contributor

raphaelning left a comment

Thanks! You are doing all the hard work :) We really look forward to merging this PR, so please let us know whenever you feel you could use our help.

Here's a debugging tip which you may already know: before you load a BSOD crash dump (C:\Windows\Minidump\*.dmp) into WinDbg for analysis, you can modify the list of symbol search paths so WinDbg can find IntelHaxm.pdb and show you a better stack trace. I use this global environment variable: _NT_SYMBOL_PATH=srv*c:\Symbols*http://msdl.microsoft.com/download/symbols;X:\path\to\haxm\obj\out

/* 0x80 - 0x8F */
G(opcode_group1, op_modrm_rm, op_imm, op_none, INSN_BYTEOP),
G(opcode_group1, op_modrm_rm, op_imm, op_none, 0),
G(opcode_group1, op_modrm_rm, op_imm, op_none, INSN_BYTEOP),

This comment has been minimized.

@raphaelning

raphaelning May 11, 2018

Contributor

nit: I don't see 0x82 documented in the current Intel SDM. Some claim it's an alias for 0x80, but not valid in 64-bit mode:

http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20141229/250249.html

Omitting the 64-bit mode check should be okay for now, since 0x82 is a corner case.

I2_BV(em_mov, op_acc, op_di, op_none, INSN_MODRM | INSN_MOV), /* stos{b,w,d,q} */
I2_BV(em_mov, op_si, op_acc, op_none, INSN_MODRM | INSN_MOV), /* lods{b,w,d,q} */
X2(N),
/* 0xB0 - 0xFF */

This comment has been minimized.

@raphaelning

raphaelning May 11, 2018

Contributor

nit: 0xFF => 0xBF


static const struct em_opcode_t opcode_group3[8] = {
F(em_test, op_modrm_rm, op_imm, op_none, 0),
F(em_test, op_modrm_rm, op_imm, op_none, 0),

This comment has been minimized.

@raphaelning

raphaelning May 11, 2018

Contributor

nit: I don't see opcode 0xF6 /1 or 0xF7 /1 documented in the current Intel SDM, but they may exist in AMD spec:

http://ref.x86asm.net/coder32.html#gen_note_TEST_F6_1_F7_1

(Just an interesting observation, no action required)


/* Emulate accesses to guest memory */
static uint64_t get_canonical_address(struct em_context_t *ctxt,
uint64_t addr, uint32_t vaddr_bits)

This comment has been minimized.

@raphaelning

raphaelning May 11, 2018

Contributor

nit: Any good reason to use a fixed-width integer type for something as simple as vaddr_bits? I'd just use int or uint (unsigned int).

if (ctxt->mode == EM_MODE_PROT64) {
*la = get_canonical_address(ctxt, mem->ea, 48);
}
else {

This comment has been minimized.

@raphaelning

raphaelning May 11, 2018

Contributor

nit: Do not insert a line break before else: } else {


if (opcode->decode_dst == decode_op_di) {
register_add(ctxt, REG_RDI, ctxt->operand_size *
(ctxt->eflags & RFLAGS_DF) ? -1 : +1);

This comment has been minimized.

@raphaelning

raphaelning May 11, 2018

Contributor

nit: I'm not sure if operator precedence rules require using an extra pair of parentheses for this ?: expression, but I'd add it for readability: ((ctxt->eflags & RFLAGS_DF) ? -1 : +1)

}
if (opcode->decode_src1 == decode_op_si) {
register_add(ctxt, REG_RSI, ctxt->operand_size *
(ctxt->eflags & RFLAGS_DF) ? -1 : +1);

This comment has been minimized.

@raphaelning

raphaelning May 11, 2018

Contributor

nit: Same here.

core/vcpu.c Outdated
}
}
em_ctxt->rip = rip;
em_ctxt->eflags = vcpu->state->_rflags;

This comment has been minimized.

@raphaelning

raphaelning May 11, 2018

Contributor

In case of an ALU (fastop) instruction, I guess we also need to write the updated RFLAGS back to vcpu after em_emulate_insn()? Maybe we could add read_rflags and write_rflags to em_vcpu_ops_t, so as to simply the logic here.

ctxt->insn = insn;
ctxt->lock = 0;
ctxt->rep = 0;
ctxt->len = 0;

This comment has been minimized.

@raphaelning

raphaelning May 11, 2018

Contributor

nit: Avoid trailing whitespace. I also see this issue elsewhere in the file since my editor is set to highlight trailing whitespace. You may consider doing the same.

global fastop_dispatch
%ifidn __CONV__, x32_cdecl
fastop_dispatch:
; TODO: Unimplemented

This comment has been minimized.

@raphaelning

raphaelning May 11, 2018

Contributor

Eventually we need this implemented, since we still support 32-bit Windows.

@AlexAltea

This comment has been minimized.

Copy link
Contributor

AlexAltea commented May 12, 2018

Getting there... :-)

image

Side note: While I can successfully build HAXM on a macOS VM, I can't/won't test it. I assume that's fine since my changes are only in the generic haxlib core, so it suffices to prove it working under Windows.

EDIT: Turns out I'm also unable to check if Android works. I typically install Android Studio, but somehow this is forcing me to install the stock HAXM, rather than just using the test-signed drivers I installed. Is there any alternative to test Android?

@AlexAltea

This comment has been minimized.

Copy link
Contributor

AlexAltea commented May 12, 2018

Another, question: do you want me to squash all commits into a single one, or are you fine with keeping the Git history as-is in this pull request? Do you need any special Signed-off-by: ... message?

@raphaelning

This comment has been minimized.

Copy link
Contributor

raphaelning commented May 14, 2018

Getting there... :-)

Awesome, congrats!!

While I can successfully build HAXM on a macOS VM, I can't/won't test it.

No problem. You probably don't have the environment to test the 32-bit Windows driver either, so we'll run tests on these hosts.

I was trying to build this PR and ran into issues on both Windows and Mac. I'll describe them in a separate post.

Is there any alternative to test Android?

Hmm, you could install the stock HAXM first, and then unload it before loading the test-signed driver: sc stop intelhaxm. Alternatively, it's possible to run Android Emulator from the command line without installing the Android Studio IDE:

  1. Download Android SDK tools from here (under Command line tools only)
  2. Unzip the package.
  3. Use sdkmanager (in %ANDROID_SDK_ROOT%\tools\bin\) to install emulator and at least one Android system image, e.g. system-images;android-27;google_apis_playstore;x86. (I don't think either of these would result in the force installation of stock HAXM.)
  4. Use avdmanager (in %ANDROID_SDK_ROOT%\tools\bin\) to create an AVD (VM) configuration.
  5. cd emulator\
  6. List all AVD configurations: emulator -list-avds
  7. Launch the AVD: emulator -avd <avd_name> -verbose

Do you need any special Signed-off-by: ... message?

Probably not this time, since we haven't defined our sign-off policy yet. But thanks for your consideration!

do you want me to squash all commits into a single one, or are you fine with keeping the Git history as-is in this pull request?

This is a great question. On one hand, having 20 commits for a big PR like this is better than having one huge commit. On the other hand, there's probably a better way to split the PR, so that each commit is self-contained and does not break the build. I'm thinking about squashing and re-splitting the PR into maybe 3 commits:

a) Upgrade the project to VS2017 format.
b) Implement the new instruction emulator.
c) Enable the new instruction emulator.

But I'm not sure if this idea is feasible, since I haven't given it much thought. E.g. should the emulator-vcpu interface go to b or c? (I guess the former.) I'm open to suggestions, and I'm happy to help you with the rebasing and re-splitting. Bottom line: the code itself (and the feature it enables) is more important than the look of the git history. We can make a reasonable effort to improve the git history, but at the end of the day, as long as the code works well, we'll accept it :)

@raphaelning

This comment has been minimized.

Copy link
Contributor

raphaelning commented May 14, 2018

On Mac I have just one issue, so let me describe that first.

The build was successful, but when I load the signed kext (we have an Apple-approved code-signing certificate) by sudo kextutil intelhaxm.kext, I get:

(kernel) kxld[com.intel.kext.intelhaxm]: The following symbols are unresolved for this kext:
(kernel) kxld[com.intel.kext.intelhaxm]:        _em_adc
(kernel) kxld[com.intel.kext.intelhaxm]:        _em_add
(kernel) kxld[com.intel.kext.intelhaxm]:        _em_and
(kernel) kxld[com.intel.kext.intelhaxm]:        _em_cmp
(kernel) kxld[com.intel.kext.intelhaxm]:        _em_dec
(kernel) kxld[com.intel.kext.intelhaxm]:        _em_inc
(kernel) kxld[com.intel.kext.intelhaxm]:        _em_neg
(kernel) kxld[com.intel.kext.intelhaxm]:        _em_not
(kernel) kxld[com.intel.kext.intelhaxm]:        _em_or
(kernel) kxld[com.intel.kext.intelhaxm]:        _em_sbb
(kernel) kxld[com.intel.kext.intelhaxm]:        _em_sub
(kernel) kxld[com.intel.kext.intelhaxm]:        _em_test
(kernel) kxld[com.intel.kext.intelhaxm]:        _em_xor
(kernel) kxld[com.intel.kext.intelhaxm]:        _fastop_dispatch
(kernel) Can't load kext com.intel.kext.intelhaxm - link failed.
(kernel) Failed to load executable for kext com.intel.kext.intelhaxm.
(kernel) Kext com.intel.kext.intelhaxm failed to load (0xdc008016).
(kernel) Failed to load kext com.intel.kext.intelhaxm (error 0xdc008016).

These symbols don't really exist in the code. Apparently they are derived from the NASM routines (defined in core/emulate_ops.asm, e.g. em_add) by adding the _ prefix. It seems only the routines referenced by core/emulate.c have this problem. Any idea how to fix this?

You may not be able to reproduce the above error with an unsigned kext. But you can try this:

$ kextlibs -undef-symbols /path/to/intelhaxm.kext
For all architectures:
    com.apple.kpi.bsd = 17.5
    com.apple.kpi.iokit = 17.5
    com.apple.kpi.libkern = 17.5
    com.apple.kpi.mach = 17.5

For x86_64:
    17 symbols not found in any library kext:
        _em_add
        _em_xor
        _em_and
        _real_ncpus
        _em_test
        _em_neg
        _em_sbb
        _em_not
        _fastop_dispatch
        _em_dec
        _em_inc
        _em_or
        _em_cmp
        _cpu_number
        _em_sub
        _em_adc
        _mp_rendezvous_no_intrs

Note that a few symbols listed in this output (e.g. _mp_rendezvous_no_intrs) actually exist in the macOS kernel and can be ignored (they have nothing to do with this PR).

One more experiment:

$ nm -a build/intelhaxm.build/Release/intelhaxm.build/DerivedSources/emulate_ops.o
0000000000000180 T em_adc
0000000000000100 T em_add
0000000000000200 T em_and
0000000000000780 T em_andn
0000000000000740 T em_bextr
...
0000000000000280 T em_xor
00000000000007b6 T fastop_dispatch

(only original symbol names)

whereas:

$ nm -a build/intelhaxm.build/Release/intelhaxm.build/Objects-normal/x86_64/ia32.o
00000000000001ef T ___fls
00000000000001c5 T ___handle_cpuid
00000000000001e7 T ___nmi
000000000000015a T _btr
000000000000016d T _bts
0000000000000180 T _cpu_has_emt64_support
00000000000001ae T _cpu_has_nx_support
0000000000000197 T _cpu_has_vmx_support
0000000000000151 T _fxrstor
0000000000000148 T _fxsave
0000000000000000 T _get_cr0
0000000000000009 T _get_cr2
0000000000000012 T _get_cr4
...
00000000000000a2 T _set_dr7
00000000000000ef T _set_kernel_ds
00000000000000f7 T _set_kernel_es
00000000000000ff T _set_kernel_fs
00000000000000e7 T _set_kernel_gs

(only mangled symbol names)

So it seems we need to make NASM add the _ prefix?

<PropertyGroup Label="Globals">
<ProjectGuid>{0458723c-5c3f-4509-bdae-35fae2a90c19}</ProjectGuid>
<Keyword>Win32Proj</Keyword>
<WindowsTargetPlatformVersion>10.0.16299.0</WindowsTargetPlatformVersion>

This comment has been minimized.

@raphaelning

raphaelning May 14, 2018

Contributor

I'm building with the latest Enterprise WDK release (RS4), which includes a different (newer) version Windows SDK, so I had to delete this line. This project doesn't really depend on a specific version of Windows SDK anyway.

@@ -15,38 +15,42 @@ Project("{8BC9CEB8-8B4A-11D0-8D11-00A0C91BC942}") = "IntelHaxm", "windows\IntelH
EndProject
Project("{8BC9CEB8-8B4A-11D0-8D11-00A0C91BC942}") = "haxlib", "core\haxlib.vcxproj", "{BC80D1E0-5738-4048-A742-8A20949A6587}"
EndProject
Project("{2150E333-8FDC-42A3-9474-1A3956D46DE8}") = "Tests", "Tests", "{4FF4BA9A-1A63-4CBF-A6C0-1D2D83B2146E}"
EndProject
Project("{8BC9CEB8-8B4A-11D0-8D11-00A0C91BC942}") = "tests", "tests\tests.vcxproj", "{0458723C-5C3F-4509-BDAE-35FAE2A90C19}"

This comment has been minimized.

@raphaelning

raphaelning May 14, 2018

Contributor

There doesn't seem to be an easy way to exclude tests from the VS solution build yet. I wonder how we can add a "no tests" option to the build. It's okay if tests is built by default, but it's not really a strict requirement for the driver build, so we should make it optional.

Actually I'm having a lot of trouble installing all the dependencies for tests... Still haven't got keystone installed via vcpkg due to a CMake failure :(

This comment has been minimized.

@AlexAltea

AlexAltea May 15, 2018

Contributor

Actually I'm having a lot of trouble installing all the dependencies for tests... Still haven't got keystone installed via vcpkg due to a CMake failure :(

It seems that my vcpkg port hasn't been reviewed at all: Microsoft/vcpkg#3273

Maybe it will make sense that I port keystone to NuGet instead and fix nasm NuGet package to work with the latest Visual Studio, so we fetch all dependencies from a single source under Windows (instead of using 3 different sources).

This comment has been minimized.

@raphaelning

raphaelning May 15, 2018

Contributor

That sounds great! Hopefully NuGet maintainers are more responsive. I like the idea of vcpkg, but right now I don't think it's mature enough for our project to depend on...

This comment has been minimized.

@ras0219-msft

ras0219-msft May 18, 2018

Sorry for dropping that PR on the floor! With our recent support for Linux and OSX, we built up a backlog of PRs that we're actively trying to clear.

Vcpkg can also be used to produce NuGet packages containing all your dependencies automatically: vcpkg export --nuget keystone gtest. The packages are "freestanding" from vcpkg, so your users don't need to interact with vcpkg at all if that's preferable.

This comment has been minimized.

@AlexAltea

AlexAltea May 21, 2018

Contributor

@ras0219-msft No worries, thanks for the export --nuget hint. :-)

Include paths are not working with the exported package. This are the steps to repeat it:

  1. vcpkg export keystone:x86-windows keystone:x64-windows --nuget --nuget-id=KeystoneEngine --nuget-version=0.9.1
  2. Update the *.nupkg file just to add a description, icon, etc. to the *.nuspec file.
  3. Create a new VS solution+project, and install the *.nupkg file from a local repository.
  4. Add #include <keystone/keystone.h> and build the project.

Somehow MSVC complains that the header wasn't found, regardless of {Debug,Release} {Win32,x86,x64} modes. This is strange considering that the *.vcxproj file automatically added the packages\keystoneengine.0.9.1\build\native\keystoneengine.targets file which already takes care of handling include/lib paths.

keystoneengine.0.9.1.nupkg.zip

This comment has been minimized.

@raphaelning

raphaelning May 22, 2018

Contributor

I noticed something last week, although I'm not sure if it's related to this problem: HaxmDriver.sln uses build configurations like Win7 Debug, Win7 Release, etc., but not Debug/Release. However, vcpkg seems to look for the latter:

> msbuild HaxmDriver.sln /p:Configuration="Win7 Debug" /p:Platform="x64"
...
VcpkgTripletSelection:
  Using triplet "x64-windows" from "<redacted>\vcpkg\installed\x64-windows\"
  Vcpkg is unable to link because we cannot decide between Release and Debug libraries. Please define the property VcpkgConfiguration to be 'Release' or 'Debug' (currently 'Win7 Debug').
...

Although these warnings do not fail the build.

This PR has actually added Debug and Release to HaxmDriver.sln, but I don't think they are usable now. Probably the right thing to do is map them to Win7 Debug and Win7 Release, respectively. Then I guess our VS solution will work better with vcpkg.

@raphaelning

This comment has been minimized.

Copy link
Contributor

raphaelning commented May 14, 2018

I just tested this PR on Windows 10 x64. It seems there's still some debugging work to do ;-)

  1. I also see the Debian boot menu as in your screenshot, but after I make a selection, the guest is stuck and never boots to shell or GUI.
  2. The boot ROM test (qemu-system-x86_64 -accel hax) fails: the QEMU VGA screen stays black.
  3. Android Emulator also boots to a hang (black screen).
@raphaelning
Copy link
Contributor

raphaelning left a comment

So it seems we need to make NASM add the _ prefix?

It turns out that's pretty easy to do and indeed fixes the kext load error :)

I just tested Debian i386 boot on Mac and got the same hang after seeing the GRUB menu.

outputFiles = (
"$(DERIVED_FILE_DIR)/${INPUT_FILE_BASE}.o",
);
script = "/usr/local/bin/nasm -f macho64 ${INPUT_FILE_PATH} -o ${SCRIPT_OUTPUT_FILE_0}\n";

This comment has been minimized.

@raphaelning

raphaelning May 15, 2018

Contributor
  1. Add --prefix _ to generate symbol names with a leading underscore, so as to avoid symbol resolution errors when we load intelhaxm.kext. (Reference)
  2. nit: Move the required parameter (input file) to the end.

In summary:

script = "/usr/local/bin/nasm -f macho64 --prefix _ -o ${SCRIPT_OUTPUT_FILE_0} ${INPUT_FILE_PATH}\n";
} sib;
} em_context_t;

em_status_t __stdcall em_decode_insn(struct em_context_t *ctxt, const uint8_t *insn);

This comment has been minimized.

@raphaelning

raphaelning May 21, 2018

Contributor

The __stdcall decoration introduces a build warning on macOS:

warning: calling convention '__stdcall' ignored for this target [-Wignored-attributes]

So basically, do we have to decorate any C function that calls any of the ASM routines? Why?

This comment has been minimized.

@AlexAltea

AlexAltea May 21, 2018

Contributor

The problem is that haxlib.vcxproj and IntelHaxm.vcxproj are stdcall by default, and tests.vcxproj is cdecl by default.

As mentioned earlier, the solution would be to explicitly stating the correct calling convention. However, changing tests.vcxproj to stdcall causes an issue with Google Test, and changing haxlib/IntelHaxm to cdecl causes an issue with MASM code.

The only solution left was explicitly stating that these two emulator functions are stdcall, but that's a suboptimal fix. I'm still thinking how to properly deal with this situation.

This comment has been minimized.

@raphaelning

raphaelning May 22, 2018

Contributor

Thanks. IIUC, your other proposal about replacing existing MASM code with NASM should be a proper fix. But for this PR, we can accept the current solution. The only issue is the macOS build warning, which I think can be fixed by something like:

#ifdef __i386__
#define EMCALL __stdcall
#else  // !__i386__
#define EMCALL
#endif  // __i386__

em_status_t EMCALL em_decode_insn(..);
em_status_t EMCALL em_emulate_insn(..)
@AlexAltea

This comment has been minimized.

Copy link
Contributor

AlexAltea commented May 21, 2018

One thing we should investigate in advance is whether NASM supports VMX instructions, e.g. VMLAUNCH, INVEPT, etc.,

Yes, they are supported by NASM: https://www.nasm.us/xdoc/2.13.03/html/nasmdocb.html#section-B.1.13

EDIT: As of 8acf0a6, NASM is a NuGet dependency. Meaning that as soon as you press "Build solution", Visual Studio will download and add build customizations automatically, if needed. I had to create another package for NASM: https://www.nuget.org/packages/nasm2/, since the original one is outdated, broken and unmaintained (the owners don't reply). Not sure how NuGet works with EWDK (I've never used it).

@raphaelning

This comment has been minimized.

Copy link
Contributor

raphaelning commented May 22, 2018

Wow, I've been testing the latest PR (8acf0a6), and so far I haven't run into any issue (in terms of functionality). In particular, the Android Emulator issues that I saw yesterday are gone!

I don't know how much of this massive improvement is due to the use of VMCS GPA (e6ff1d3), and how much due to other changes ;-) It should be pretty easy to find out, if you restore the GPA translation step and print a warning when the result doesn't match the cached GPA.

We'll run more tests in the next few days. Meanwhile, let's try to finalize the code so you can start reorganizing the commits as planned. I'll post more review comments.

@raphaelning

This comment has been minimized.

Copy link
Contributor

raphaelning commented May 22, 2018

Yes, they are supported by NASM: https://www.nasm.us/xdoc/2.13.03/html/nasmdocb.html#section-B.1.13

Great! Currently, INVEPT is not implemented for Win32, since it is not supported by Win32 MASM. So rewriting the existing assembly code in NASM has the extra benefit of getting that fixed :-)

As of 8acf0a6, NASM is a NuGet dependency. Meaning that as soon as you press "Build solution", Visual Studio will download and add build customizations automatically

This works perfectly, thanks! Previously, I had some trouble with the VSNASM installation script, and had to resort to manual installation.

Not sure how NuGet works with EWDK (I've never used it).

Once I've resolved the NuGet dependencies in VS2017, I can also build the solution with EWDK, so it was really easy. It seems possible to manually install a NuGet dependency from outside VS2017 as well, using the NuGet CLI tool, and I'll give it a try. Either way, if we leave aside the optional tests project, where the Keystone dependency is yet to be converted to NuGet, setting up the build environment for this PR is a smooth process now.

if (opcode->flags & INSN_TWOMEM) {
return true;
}
if (opcode->flags & INSN_STRING && ctxt->rep) {

This comment has been minimized.

@raphaelning

raphaelning May 22, 2018

Contributor

Why do REP instructions require translation? E.g. if the destination of REP STOS is an MMIO region, we'll get as many EPT violations as the repetition count (*CX), right? Each time we should be able to just use the VMCS GPA instead of translating ES:*DI to GPA, no?

This comment has been minimized.

@AlexAltea

AlexAltea May 22, 2018

Contributor

Why do REP instructions require translation?

Because with each iteration ES:*DI changes (+= (!DF ? +1 : -1) * ctxt->address_size), we will have access memory at different GPA's, but the optimization only works when the instruction accesses a single GPA. Additionally these GPAs can't be trivially computed by just increasing/decreasing the first GPA, since we might hit different pages.

This comment has been minimized.

@raphaelning

raphaelning May 22, 2018

Contributor

I see. I was assuming we would resume guest execution at each iteration, which is what the old instruction emulator used to do. But it seems we are now bouncing back and forth between QEMU and the new emulator without entering guest mode at all, until we finish emulating the entire REP instruction. The two approaches differ in how many times we ask the hardware to execute the instruction for us (previously *CX times, now only once). Both work, but the new approach, as I now realize, is exactly the design you illustrated (i.e. the little arrows) ;-)

core/vcpu.c Outdated
len++;
if (has_sib) {
len++;
if (is_mmio_address(vcpu, pa)) {

This comment has been minimized.

@raphaelning

raphaelning May 22, 2018

Contributor

nit: When we skip translation, we should skip this check as well, or assert it is true, because we assume the GPA is an MMIO address.

core/vcpu.c Outdated
vcpu_translate(vcpu, ea, 0, &pa, NULL, false);
}

if (is_mmio_address(vcpu, pa)) {

This comment has been minimized.

@raphaelning

raphaelning May 22, 2018

Contributor

nit: Same here.

G(opcode_group1, op_modrm_rm, op_imm, op_none, INSN_BYTEOP),
G(opcode_group1, op_modrm_rm, op_imm, op_none, 0),
G(opcode_group1, op_modrm_rm, op_imm, op_none, INSN_BYTEOP),
G(opcode_group1, op_modrm_rm, op_simm8, op_none, 0),

This comment has been minimized.

@raphaelning

raphaelning May 22, 2018

Contributor

I checked all 8 instructions in this group. 0x80-0x83 all require the immediate operand to be sign-extended to the operand size. So instead of op_simm8, we probably need op_simm?

@AlexAltea

This comment has been minimized.

Copy link
Contributor

AlexAltea commented May 22, 2018

I've published a Keystone package at NuGet: https://www.nuget.org/packages/keystoneengine/.
Now building on Windows, including tests, is completely straightforward.

Regarding your earlier comment: While it's true that we have Configurations that don't match naming conventions expected by vcpkg.targets, the project tests.vcxproj (which is the only consumer of Keystone) only uses Debug and Release configurations, so we are good.

@raphaelning

This comment has been minimized.

Copy link
Contributor

raphaelning commented May 23, 2018

Now building on Windows, including tests, is completely straightforward.

Indeed, thanks! This time I didn't rely on VS2017 to resolve the new NuGet dependency. It's very simple:

  1. Download nuget.exe from https://www.nuget.org/downloads (no installation required).
  2. Run \path\to\nuget.exe restore from the root of the HAXM source tree.

NuGet basically creates a packages folder in the HAXM source tree, and installs the dependencies there. That's a clean approach, which works for both VS and EWDK.

@raphaelning

This comment has been minimized.

Copy link
Contributor

raphaelning commented May 23, 2018

The new patch looks good, thanks! The only remaining issue is the Clang warning about __stdcall being ignored for Mac (x86_64) build. I've suggested a possible fix:

#42 (comment)

Could you take care of it? Then we can go ahead with the Git history revamp.

@AlexAltea

This comment has been minimized.

Copy link
Contributor

AlexAltea commented May 23, 2018

I've suggested a possible fix:

Sorry, with a PR as massive in number of comments, sometimes I lose track of your replies (and sometimes this page even fails to load). :-)

Could you take care of it? Then we can go ahead with the Git history revamp.

Sure! I'll do it right now and proceed to cleanup the history. If you find any other issue, I could always amend the very last commit and force-push it.

@raphaelning
Copy link
Contributor

raphaelning left a comment

Thanks. It turns out there are 2 more warnings, due to the use of __stdcall in emulate.c, where these functions are defined. Could you fix those with EMCALL as well?

[...] and sometimes this page even fails to load

Same here! I've seen the pink unicorn a lot lately. Let's wrap this PR up ;-)

@raphaelning

This comment has been minimized.

Copy link
Contributor

raphaelning commented May 23, 2018

It turns out there are 2 more warnings, due to the use of __stdcall in emulate.c

Actually, there's no need to fix this now if you've already started cleaning up the history. Please feel free to address it in one of the final commits.

@AlexAltea AlexAltea force-pushed the AlexAltea:master branch from a89d572 to 029e876 May 23, 2018

AlexAltea added some commits May 23, 2018

Update solution to VS2017
Additionally, extending .gitignore with entries for the corresponding IDE/OS-generated temporary files.

@AlexAltea AlexAltea force-pushed the AlexAltea:master branch from 029e876 to 274e80b May 23, 2018

@AlexAltea

This comment has been minimized.

Copy link
Contributor

AlexAltea commented May 23, 2018

@raphaelning Done! :-)

I've checked that every commit along the way builds successfully on 32-bit and 64-bit mode (under Windows, but macOS should be fine as well), so that we don't break git bisect or anything.

@raphaelning

This comment has been minimized.

Copy link
Contributor

raphaelning commented May 24, 2018

Thank you so much! The final patches look great, and there's no warning from the macOS build. I'll proceed with the merge.

@raphaelning raphaelning merged commit 9f05a23 into intel:master May 24, 2018

raphaelning added a commit that referenced this pull request May 24, 2018

README.md: Update build instructions
Update the instructions for setting up the build environment and
running the build, which have changed a lot due to #42.

raphaelning added a commit that referenced this pull request May 24, 2018

README.md: Update build instructions
Update the instructions for setting up the build environment and
running the build, which have changed a lot due to #42.

raphaelning added a commit that referenced this pull request May 24, 2018

README.md: Update build instructions
Update the instructions for setting up the build environment and
running the build, which have changed a lot due to #42.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment