x86_64: Implement integer saturating left shifting codegen #22529

xtexx · 2025-01-18T14:06:03Z

Simliarly to shl_with_overflow, we first SHL/SAL the integer, then SHR/SAR it back to compare if overflow happens.
If overflow happened, set result to the upper limit to make it saturating.

Theoretically, if the left shifting instruction is lowered as a single SHL/SAL opcode, and the left operand fits into a register (so no truncation is needed), the CF flag can be used to check for overflow. However the optimization is not implemented right now (for my laziness).

Bug: #17645

xtexx · 2025-01-19T03:26:09Z

What's the CI error. It seems that I didn't touch that region of code.

mlugg · 2025-01-19T03:32:21Z

Don't mind that failure -- it's an inconsistent issue which is being debugged as I type this. Once it's fixed, a rebase should solve that error.

jacobly0 · 2025-01-19T03:43:40Z

I don't like that this introduces a miscomp, I would rather the backend error on unsupported types rather than silently produce incorrect code.

xtexx · 2025-01-19T04:31:29Z

@jacobly0 which unsupported types? there is already a compiler error when unsupported type is hit.

https://github.com/ziglang/zig/blob/ec2df79cbf25e53c79d53dd6d461bd9d7457bd85/src/arch/x86_64/CodeGen.zig#L12944-L12946

jacobly0 · 2025-01-19T17:33:37Z

$ cat check.zig
const std = @import("std");

fn shlSat(x: anytype, y: std.math.Log2Int(@TypeOf(x))) @TypeOf(x) {
    return x <<| y;
}

fn testType(comptime T: type) bool {
    var ok = true;
    comptime var rhs: std.math.Log2Int(T) = 0;
    inline while (true) : (rhs += 1) {
        comptime var lhs: T = std.math.minInt(T);
        inline while (true) : (lhs += 1) {
            ok = shlSat(lhs, rhs) == lhs <<| rhs and ok;
            if (lhs == std.math.maxInt(T)) break;
        }
        if (rhs == @bitSizeOf(T) - 1) break;
    }
    return ok;
}

pub fn main() void {
    var ok = true;
    ok = testType(i2) and ok;
    ok = testType(u2) and ok;
    ok = testType(i3) and ok;
    ok = testType(u3) and ok;
    ok = testType(i4) and ok;
    ok = testType(u4) and ok;
    std.debug.print("{s}\n", .{if (ok) "ok" else "bad"});
}
$ zig run -fllvm check.zig
ok
$ zig run -fno-llvm check.zig
bad

xtexx · 2025-01-20T02:06:18Z

@jacobly0 ahh! Sorry for my mistake and thanks for your careful review!

I forgot that negative values need the minimum value to saturate and it seems that the saturating_arithmetic.zig behavior test is not strong enough. The force-push above should solve the error and the check you have given is passing successfully.

However, x86 does not have a saturating shifting instruction so we have to emit up to two shifting and two conditional branches, totally ~10 MIR instructions, I doubt if the shl_sat grammar and IR is useful, as it seems to be a pretty rare case.

jacobly0

Looks good enough to get in the next release, if I don't manage to finish the rewrite before then.

src/arch/x86_64/CodeGen.zig

jacobly0 · 2025-01-28T09:13:32Z

While rewriting comparisons, I have added Temp.cmpInts in #22637. If you want, you can see other uses in the file for how to integrate it with other legacy code. It handles turning any passed in condition into a functional result flag.

jacobly0

A missing instruction is causing a miscomp of:

const std = @import("std");
var lhs: i256 = -0x53d4148cee74ea43477a65b3daa7b8fdadcbf4508e793f4af113b8d8da5a7eb6;
var rhs: u8 = 0x91;
pub fn main() void {
    std.debug.print("{} <<| {} == {}\n", .{ lhs, rhs, lhs <<| rhs });
}

$ zig run -fllvm repro.zig
-37916679844576018420322620862767642147218730393747648456860519430618437287606 <<| 145 == -57896044618658097711785492504343953926634992332820282019728792003956564819968
$ zig run -fno-llvm repro.zig
-37916679844576018420322620862767642147218730393747648456860519430618437287606 <<| 145 == 57896044618658097711785492504343953926634992332820282019728792003956564819967

Fixed by:

@@ -88544,7 +88548,7 @@ fn genShiftBinOpMir(
 ) !void {
     const pt = self.pt;
     const zcu = pt.zcu;
-    const abi_size: u32 = @intCast(lhs_ty.abiSize(zcu));
+    const abi_size: u31 = @intCast(lhs_ty.abiSize(zcu));
     const shift_abi_size: u32 = @intCast(rhs_ty.abiSize(zcu));
     try self.spillEflagsIfOccupied();
 
@@ -88728,7 +88732,17 @@ fn genShiftBinOpMir(
                 .immediate => {},
                 else => self.performReloc(skip),
             }
-        }
+        } else try self.asmRegisterMemory(.{ ._, .mov }, temp_regs[2].to64(), .{
+            .base = .{ .frame = lhs_mcv.load_frame.index },
+            .mod = .{ .rm = .{
+                .size = .qword,
+                .disp = switch (tag[0]) {
+                    ._l => lhs_mcv.load_frame.off,
+                    ._r => lhs_mcv.load_frame.off + abi_size - 8,
+                    else => unreachable,
+                },
+            } },
+        });
         switch (rhs_mcv) {
             .immediate => |shift_imm| try self.asmRegisterImmediate(
                 tag,

Also, there is no handling for shift amounts larger than the bit width:

const std = @import("std");
var lhs: u8 = 0xab;
var rhs: u8 = 32;
pub fn main() void {
    std.debug.print("{} <<| {} == {}\n", .{ lhs, rhs, lhs <<| rhs });
}

$ zig run -fllvm repro.zig
171 <<| 32 == 255
$ zig run -fno-llvm repro.zig
171 <<| 32 == 171

Normally I would suggest looking at the llvm backend lowering, but even that miscomps. As usual, adding a check for this case to emit an error instead of a miscomp would make this PR mergeable.

src/arch/x86_64/CodeGen.zig

xtexx · 2025-03-01T13:46:03Z

Applied you fix for the missing instruction. Thank you!
While RHS may not be an immediate/literal constant, how can the miscomp case be checked at compile-time?

jacobly0 · 2025-03-01T13:57:38Z

Based on x86_64 semantics, you have only implemented the following cases:

lhs_bits <= 32 and rhs_bits <= 5
lhs_bits > 32 and lhs_bits <= 64 and rhs_bits <= 6
rhs_bits <= log2(lhs_bits)

xtexx · 2025-03-01T14:05:49Z

The genShiftBinOpMir diff mentioned above sometimes fails to encode:

error(x86_64_encoder): no encoding found for: none mov m128 imm8s none none

jacobly0 · 2025-03-01T14:07:23Z

That's unrelated to my diff, the instruction I added is neither .size = .xword nor contains an immediate.

xtexx · 2025-03-01T14:13:09Z

oops, sorry. It is my mistake.

src/arch/x86_64/CodeGen.zig

jacobly0 · 2025-03-02T00:22:33Z

Just a heads up, given all of the miscomps I have already found, this is going to need more behavior test coverage than just enabling an already existing test. At least regression tests covering the examples that have been mentioned should be added near the enabled behavior test.

xtexx · 2025-03-02T01:02:42Z

Just out of curiosity, how do you find some many cases (which looked like random to me) for testing?

jacobly0 · 2025-03-02T02:11:28Z

The bits are just copied from other random values, not very relevant to the miscomps. I just tried various combinations of type widths, signedness, and shift amounts that either overflow or not.

jacobly0

Otherwise looks good, I know some people have been waiting for this operation for a while.

test/behavior/bit_shifting.zig

Simliarly to shl_with_overflow, we first SHL/SAL the integer, then SHR/SAR it back to compare if overflow happens. If overflow happened, set result to the upper limit to make it saturating. Bug: ziglang#17645 Co-authored-by: Jacob Young <jacobly0@users.noreply.github.com> Signed-off-by: Bingwu Zhang <xtex@aosc.io>

Co-authored-by: Jacob Young <jacobly0@users.noreply.github.com> Signed-off-by: Bingwu Zhang <xtex@aosc.io>

xtexx force-pushed the x86-64/shl-sat-int branch from c87f153 to ebace39 Compare January 18, 2025 14:08

xtexx changed the title ~~x86_64: Implement integer saturing left shifting codegen~~ x86_64: Implement integer saturating left shifting codegen Jan 18, 2025

xtexx force-pushed the x86-64/shl-sat-int branch from ebace39 to ec2df79 Compare January 18, 2025 15:21

andrewrk requested a review from jacobly0 January 19, 2025 03:29

xtexx force-pushed the x86-64/shl-sat-int branch from ec2df79 to 9494329 Compare January 20, 2025 01:53

jacobly0 requested changes Jan 20, 2025

View reviewed changes

src/arch/x86_64/CodeGen.zig Outdated Show resolved Hide resolved

src/arch/x86_64/CodeGen.zig Outdated Show resolved Hide resolved

src/arch/x86_64/CodeGen.zig Outdated Show resolved Hide resolved

src/arch/x86_64/CodeGen.zig Outdated Show resolved Hide resolved

xtexx force-pushed the x86-64/shl-sat-int branch 2 times, most recently from 0f90ddb to 0d040ec Compare January 20, 2025 06:56

xtexx force-pushed the x86-64/shl-sat-int branch from 0d040ec to 7206677 Compare March 1, 2025 05:58

xtexx requested a review from jacobly0 March 1, 2025 06:00

jacobly0 requested changes Mar 1, 2025

View reviewed changes

src/arch/x86_64/CodeGen.zig Outdated Show resolved Hide resolved

src/arch/x86_64/CodeGen.zig Outdated Show resolved Hide resolved

src/arch/x86_64/CodeGen.zig Outdated Show resolved Hide resolved

xtexx force-pushed the x86-64/shl-sat-int branch 2 times, most recently from 788831b to bf4615d Compare March 1, 2025 14:40

jacobly0 reviewed Mar 1, 2025

View reviewed changes

src/arch/x86_64/CodeGen.zig Outdated Show resolved Hide resolved

xtexx force-pushed the x86-64/shl-sat-int branch from bf4615d to 4051039 Compare March 1, 2025 15:31

jacobly0 reviewed Mar 2, 2025

View reviewed changes

src/arch/x86_64/CodeGen.zig Show resolved Hide resolved

xtexx force-pushed the x86-64/shl-sat-int branch from 4051039 to 6b824df Compare March 2, 2025 01:07

jacobly0 approved these changes Mar 2, 2025

View reviewed changes

test/behavior/bit_shifting.zig Outdated Show resolved Hide resolved

xtexx force-pushed the x86-64/shl-sat-int branch from 6b824df to c2692a7 Compare March 2, 2025 02:30

xtexx and others added 2 commits March 2, 2025 10:30

x86_64: add behavior tests for saturating shift left

1da909a

Co-authored-by: Jacob Young <jacobly0@users.noreply.github.com> Signed-off-by: Bingwu Zhang <xtex@aosc.io>

xtexx force-pushed the x86-64/shl-sat-int branch from c2692a7 to 1da909a Compare March 2, 2025 02:30

jacobly0 enabled auto-merge March 2, 2025 02:32

jacobly0 merged commit a6525c1 into ziglang:master Mar 2, 2025
9 checks passed

xtexx deleted the x86-64/shl-sat-int branch March 15, 2025 01:38

Uh oh!

x86_64: Implement integer saturating left shifting codegen #22529

x86_64: Implement integer saturating left shifting codegen #22529

Uh oh!

Conversation

xtexx commented Jan 18, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

xtexx commented Jan 19, 2025

Uh oh!

mlugg commented Jan 19, 2025

Uh oh!

jacobly0 commented Jan 19, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

xtexx commented Jan 19, 2025

Uh oh!

jacobly0 commented Jan 19, 2025

Uh oh!

xtexx commented Jan 20, 2025

Uh oh!

jacobly0 left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

jacobly0 commented Jan 28, 2025

Uh oh!

jacobly0 left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

xtexx commented Mar 1, 2025

Uh oh!

jacobly0 commented Mar 1, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

xtexx commented Mar 1, 2025

Uh oh!

jacobly0 commented Mar 1, 2025

Uh oh!

xtexx commented Mar 1, 2025

Uh oh!

Uh oh!

Uh oh!

jacobly0 commented Mar 2, 2025

Uh oh!

xtexx commented Mar 2, 2025

Uh oh!

jacobly0 commented Mar 2, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

jacobly0 left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

xtexx commented Jan 18, 2025 •

edited

Loading

jacobly0 commented Jan 19, 2025 •

edited

Loading

jacobly0 commented Mar 1, 2025 •

edited

Loading

jacobly0 commented Mar 2, 2025 •

edited

Loading