Added Intel x86-64 encodings for 64bit loads and store instructions #127

denismerigoux · 2017-07-31T17:21:24Z

Encoding recipes factored with 32 bits instructions, corresponding tests added.

stoklund

Thanks for working on this!

It generally looks good, you just need to add 32-bit tests and deal with the weirdness around the ABCD constraints.

stoklund · 2017-07-31T17:26:59Z

lib/cretonne/meta/base/instructions.py

@@ -271,7 +271,7 @@
        'istore16', r"""
        Store the low 16 bits of ``x`` to memory at ``p + Offset``.

-        This is equivalent to ``ireduce.i16`` followed by ``store.i8``.
+        This is equivalent to ``ireduce.i16`` followed by ``store.i16``.


You're welcome.

stoklund · 2017-07-31T17:32:48Z

filetests/isa/intel/binary64.cton

+    ; asm: movzwl (%rcx), %edi
+    [-,%rdi]            v402 = uload16.i32 v1   ; bin: 40 0f b7 39
+    ; asm: movzwl (%rsi), %edx
+    [-,%rdx]            v403 = uload16.i32 v2   ; bin: 40 0f b7 16


I think the loads producing a 32-bit value should be in the %I32 function below.

Here, you want to test the loads producing 64-bit values: uload32.i64 and uload16.i64.

Oups, my mistake.

stoklund · 2017-07-31T17:33:48Z

filetests/isa/intel/binary64.cton

+    ; asm: movq (%rcx), %rdi
+    [-,%rdi]            v460 = load.i64 v1      ; bin: 48 8b 39
+    ; asm: movq (%rsi), %rdx
+    [-,%rdx]            v461 = load.i64 v2      ; bin: 48 8b 16


I think it's best to keep the value numbers monotonically increasing, if possible.

Indeed, I'll do a final value renumbering pass when all modifications are done.

Thanks. It's fine to leave gaps, BTW.

stoklund · 2017-07-31T17:34:58Z

filetests/isa/intel/binary64.cton

+    ; asm: movsbq 50(%rcx), %rdi
+    [-,%rdi]            v418 = sload8.i64 v1+50         ; bin: 48 0f be 79 32
+    ; asm: movsbq -50(%rsi), %rdx
+    [-,%rdx]            v419 = sload8.i64 v2-50         ; bin: 48 0f be 56 ce


Exactly, this is the right set of instructions to test here. You should do the same for the no-displacement ones above.

stoklund · 2017-07-31T17:37:37Z

filetests/isa/intel/binary64.cton

+    ; asm: movsbq 50000(%rcx), %rdi
+    [-,%rdi]            v428 = sload8.i64 v1+50000         ; bin: 48 0f be b9 0000c350
+    ; asm: movsbq -50000(%rsi), %rdx
+    [-,%rdx]            v429 = sload8.i64 v2-50000         ; bin: 48 0f be 96 ffff3cb0


This is great, but please add similar tests for *.i32 instructions to the %I32 function below.

stoklund · 2017-07-31T17:38:42Z

lib/cretonne/meta/isa/intel/encodings.py

+    """
+    Add encodings for `inst.i32` to I32.
+    Add encodings for `inst.i32` to I64 with and without REX.
+    Add encodings for `inst.i64` to I64 with a REX.W prefix.


This is a good idea. Please add an explanation for the w_bit argument.

stoklund · 2017-07-31T17:40:18Z

lib/cretonne/meta/isa/intel/encodings.py

+
+I64.enc(base.istore32.i64.any, *r.st.rex(0x89))
+I64.enc(base.istore32.i64.any, *r.stDisp8.rex(0x89))
+I64.enc(base.istore32.i64.any, *r.stDisp32.rex(0x89))


This is a nice simplification. Very cool!

Do you think it makes sense to handle the three types of displacement that way too?

I didn't understand your question. Is there another way of handling the three types of displacement?

All I meant is that the code is very repetitive, so maybe a function call could reduce the redundancy. It's not necessary.

stoklund · 2017-07-31T17:56:17Z

lib/cretonne/meta/isa/intel/encodings.py

+
+enc_i32_i64_ld_st(base.istore8, 0, r.st_abcd, 0x88)
+enc_i32_i64_ld_st(base.istore8, 0, r.stDisp8_abcd, 0x88)
+enc_i32_i64_ld_st(base.istore8, 0, r.stDisp32_abcd, 0x88)


Aha, so this part is extra tricky because you're using 8-bit registers.

In 32-bit mode, and in 64-bit instructions without a REX prefix, the following 8-bit registers can be addressed: AL, CL, DL, BL, AH, CH, DH, BH. The last 4 are the "high" byte registers, corresponding to bits 8-15 of the ABCD GPRs.

In 64-bit mode with a REX prefix, the accessible 8-bit registers are: AL, CL, DL, BL, SPL, BPL, SIL, DIL, R8B-R15B. Note the difference in registers 4-7 — they are the low bytes of GPRs instead of the high bytes of ABCD.

In Cretonne, we don't model the high-byte registers. They are never used. This is the reason the I32 byte store instructions have an ABCD register constraint instead of GPR like the other instructions. When storing a byte, the value to store must be in one of the ABCD registers because without a REX prefix, we can't store the low byte of %esi for example.

However, with a REX prefix, a byte store can access the low byte of all 16 registers, so we just want a normal GPR register constraint which is much easier for the register allocator to satisfy.

The end result is:

When accessing 8-bit registers without a REX prefix, use the ABCD register constraint.

When accessing 8-bit registers with a REX prefix, use the GPR register constraint.

Ok, got it. I won't use enc_i32_i64_ld_st for istore8 but rather declare the I32.enc and I64.enc directly with the correct recipes.

sunfishcode · 2017-07-31T18:11:52Z

In the binary64.cton, several instructions appear to have REX prefixes that don't require them, like this:

; asm: movl %ecx, (%rsi)
istore32 v1, v2                             ; bin: 40 89 0e

Is this a known limitation?

stoklund · 2017-07-31T18:13:58Z

Removing redundant REX prefixes is treated as an optimization in Cretonne. Such an instruction will have two valid encodings: With and without a REX prefix. The encoding process goes like this:

Before register allocation: Pick the first valid encoding for each instruction. The encodings should be arranged such that this constraints the register allocator as little as possible.
After register allocation: Pick the smallest valid encoding for each instruction that has register constraints compatible with the register allocator's choices.

This approach means that redundant REX prefixes are removed after register allocation. On ARM, Thumb compression works the same way.

The test binemit mode could test either choice. Currently it picks the first valid encoding, which is the encoding that would be used before register allocation. You could argue that it shold pick the compressed encoding instead since it knows the register assignments.

stoklund · 2017-07-31T18:19:20Z

Ideally, I would like to be able to test both:

; asm: movl %ecx, (%rsi)
istore32 v1, v2                             ; bin[early]: 40 89 0e
                                            ; bin[late]:  89 0e

Fixed testing of 64bit intel encoding

denismerigoux · 2017-07-31T20:24:24Z

Tell me if it's good before merging and I will renumber the values in binary64.cton.

stoklund · 2017-07-31T20:26:23Z

Looks good!

denismerigoux · 2017-07-31T20:30:10Z

For 64bits instructions for which the REX prefix is optional (i.e. the REX.W bit is set to 0), should I generate only the encoding with REX prefix, only the encoding without or both ?

stoklund · 2017-07-31T20:32:11Z

You should generate both, the first one without REX, and the second one with REX.

You will only be able to test the first encoding, per the comments above and #128.

denismerigoux · 2017-07-31T20:32:37Z

Ok, got it.

Value renumbering in binary64.cton

Added Intel x86-64 encodings for 64bit loads and store instructions

d402c07

stoklund suggested changes Jul 31, 2017

View reviewed changes

stoklund mentioned this pull request Jul 31, 2017

Test compressed encodings #128

Closed

Using GPR registers instead of ABCD for istore8 with REX prefix

296bd2a

Fixed testing of 64bit intel encoding

Emit REX and REX-less encodings for optional REX prefix

fb7ba9d

Value renumbering in binary64.cton

denismerigoux force-pushed the load-store-64 branch from 14a0f1f to fb7ba9d Compare July 31, 2017 21:09

stoklund approved these changes Jul 31, 2017

View reviewed changes

stoklund merged commit b74723c into bytecodealliance:master Jul 31, 2017

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Added Intel x86-64 encodings for 64bit loads and store instructions #127

Added Intel x86-64 encodings for 64bit loads and store instructions #127

denismerigoux commented Jul 31, 2017

stoklund left a comment

stoklund Jul 31, 2017

denismerigoux Jul 31, 2017

stoklund Jul 31, 2017

denismerigoux Jul 31, 2017

stoklund Jul 31, 2017

denismerigoux Jul 31, 2017

stoklund Jul 31, 2017

stoklund Jul 31, 2017

denismerigoux Jul 31, 2017

stoklund Jul 31, 2017

denismerigoux Jul 31, 2017

stoklund Jul 31, 2017

denismerigoux Jul 31, 2017

stoklund Jul 31, 2017

denismerigoux Jul 31, 2017 •

edited

Loading

stoklund Jul 31, 2017

stoklund Jul 31, 2017

denismerigoux Jul 31, 2017

sunfishcode commented Jul 31, 2017

stoklund commented Jul 31, 2017

stoklund commented Jul 31, 2017

denismerigoux commented Jul 31, 2017

stoklund commented Jul 31, 2017

denismerigoux commented Jul 31, 2017

stoklund commented Jul 31, 2017

denismerigoux commented Jul 31, 2017

Added Intel x86-64 encodings for 64bit loads and store instructions #127

Added Intel x86-64 encodings for 64bit loads and store instructions #127

Conversation

denismerigoux commented Jul 31, 2017

stoklund left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

denismerigoux Jul 31, 2017 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

sunfishcode commented Jul 31, 2017

stoklund commented Jul 31, 2017

stoklund commented Jul 31, 2017

denismerigoux commented Jul 31, 2017

stoklund commented Jul 31, 2017

denismerigoux commented Jul 31, 2017

stoklund commented Jul 31, 2017

denismerigoux commented Jul 31, 2017

denismerigoux Jul 31, 2017 •

edited

Loading