Add AoCO 2025 Day 06 Study Notes #36

Merged
gapry merged 3 commits into main from AoCO-2025-Day06
Feb 28, 2026

@gapry gapry commented Feb 28, 2026

No description provided.

@gemini-code-assist

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a comprehensive study note document that explores the intricate ways modern compilers optimize integer division operations, particularly when the divisor is a power of two. It delves into the architectural specifics of x86 and AArch64, demonstrating how signed and unsigned division are transformed into efficient bitwise shifts and conditional logic to maintain C's defined behavior, offering valuable insights into low-level compiler optimizations.

Highlights

  • New Study Notes Added: A new document has been added detailing compiler optimizations for integer division, specifically focusing on how C's division by a power of two is translated into bit shift operations on x86 and AArch64 architectures.
  • Signed vs. Unsigned Division: The notes thoroughly explain the differences in optimization strategies for signed and unsigned integer division, highlighting the adjustments compilers make for negative numbers to ensure correct truncation towards zero.
  • Architecture-Specific Assembly Analysis: The document provides concrete examples and assembly output (llvm-objdump) for both x86 and AArch64, illustrating the sarl, shrl, asr, and lsr instructions, along with conditional moves (cmovnsl, csel) used for signed division.

Changelog

  • _posts/2026-02-28-Advent-of-Compiler-Optimisations-Study-Notes-06.md
    • Added new study notes on integer division compiler optimizations.
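
The transformation summarized above can be sketched in C as follows. This is a minimal illustration, not the code from the post: the function names are hypothetical, and the divisor 512 is assumed from the `0x1ff` bias quoted later in this thread. Each division is paired with a shift-based form that should return the same value.

```c
#include <stdint.h>

/* What the source code says. */
uint32_t udiv512(uint32_t x) { return x / 512u; }
int32_t  sdiv512(int32_t x)  { return x / 512; }

/* Unsigned: dividing by 2^9 is a plain logical right shift. */
uint32_t udiv512_shift(uint32_t x) { return x >> 9; }

/* Signed: C truncates toward zero, so negative inputs get a bias of
 * 2^9 - 1 before the arithmetic right shift. */
int32_t sdiv512_shift(int32_t x) {
    int32_t adjusted = (x < 0) ? x + 511 : x;
    /* Arithmetic shift right by 9, spelled so it does not depend on the
     * implementation-defined result of >> on a negative value. */
    return adjusted < 0 ? ~(~adjusted >> 9) : adjusted >> 9;
}
```

For example, `sdiv512_shift(-1)` biases the input to 510 and shifts it down to 0, which matches `-1 / 512` in C.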

@amazon-q-developer amazon-q-developer bot left a comment

This PR adds study notes for Day 6 of Advent of Compiler Optimisations 2025, covering division optimizations. I've identified several issues that need correction:

Critical Issues:

  • Logic error in line 160: The shift direction for the shrl instruction is documented incorrectly as a left shift (<<) when it should be a right shift (>>)
  • Incorrect instruction name on line 132: References cmovens instead of cmovns
  • Incorrect register on line 209: Comment states w8 when it should reference w0

Minor Issues:

  • Three typos: "consdier" → "consider" (line 108), "filp" → "flip" (line 119), "Unsign" → "Unsigned" (line 136)

Please address these corrections to ensure the technical accuracy of the documentation.
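
To make the shift-direction issue concrete, here is a small sketch of the two right shifts involved, `shrl` (logical) and `sarl` (arithmetic). The helper names are hypothetical, and the arithmetic variant is written so that it does not depend on how a given compiler treats `>>` on negative values.

```c
#include <stdint.h>

/* shrl: logical right shift, vacated high bits are filled with zeros.
 * Assumes 0 <= n < 32. */
uint32_t shr_logical(uint32_t v, unsigned n) {
    return v >> n;
}

/* sarl: arithmetic right shift, vacated high bits copy the sign bit.
 * For negative v, ~v is non-negative, so the inner shift is well defined.
 * Assumes 0 <= n < 32. */
int32_t sar_arithmetic(int32_t v, unsigned n) {
    return v < 0 ? ~(~v >> n) : v >> n;
}
```

With these, `shr_logical(1024, 9)` gives 2, while `sar_arithmetic(-1024, 9)` gives -2 because the sign bit is replicated into the upper bits.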

Instruction:
```
shrl <imm>, <Rd> ; shr := Shift Right Logical, that is <Rd> = <Rd> << <imm>

🛑 Logic Error: The shift direction is incorrect. shrl performs a right shift (>>), not a left shift (<<). This documentation error could mislead readers about how the instruction works.

Suggested change
- shrl <imm>, <Rd> ; shr := Shift Right Logical, that is <Rd> = <Rd> << <imm>
+ shrl <imm>, <Rd> ; shr := Shift Right Logical, that is <Rd> = <Rd> >> <imm>

Instructions:
```
- add <Wd>, <Wn>, #imm ; w8 = w0 + 0x1ff
- cmp <Wn>, #imm ; Compares w8 with #0x0, and update the processor flags NZCV

Incorrect register referenced in comment. The cmp instruction compares w0 (not w8) with #0x0.

Suggested change
- - cmp <Wn>, #imm ; Compares w8 with #0x0, and update the processor flags NZCV
+ - cmp <Wn>, #imm ; Compares w0 with #0x0, and update the processor flags NZCV
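
The register matters beyond the wording of the comment. As a rough sketch, assuming the AArch64 sequence is `add`, `cmp`, `csel` with an `lt` condition, then `asr` (only `add` and `cmp` are quoted in this thread, so the `csel` operands are an assumption), the two hypothetical functions below compare selecting on the input with selecting on the biased sum.

```c
#include <stdint.h>

/* Arithmetic shift right (0 <= n < 64) that avoids the
 * implementation-defined >> on negative values. */
static int64_t asr(int64_t v, unsigned n) {
    return v < 0 ? ~(~v >> n) : v >> n;
}

/* Selects on the sign of the ORIGINAL input, as in cmp w0, #0x0. */
int32_t div512_cmp_input(int32_t w0) {
    int64_t w8  = (int64_t)w0 + 0x1ff;   /* add w8, w0, #0x1ff (widened so C cannot overflow) */
    int64_t sel = (w0 < 0) ? w8 : w0;    /* cmp w0, #0x0 then csel (assumed condition: lt) */
    return (int32_t)asr(sel, 9);         /* asr by 9 */
}

/* Hypothetical variant that tests the biased sum w8 instead. */
int32_t div512_cmp_sum(int32_t w0) {
    int64_t w8  = (int64_t)w0 + 0x1ff;
    int64_t sel = (w8 < 0) ? w8 : w0;    /* the comparison the original comment described */
    return (int32_t)asr(sel, 9);
}
```

For inputs from -511 to -1 the biased sum is already non-negative, so `div512_cmp_sum` shifts the unbiased value and returns -1 where C's `x / 512` gives 0.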


Hence, the compiler does `(-1 + 511) / 512 = 510 / 512 = 510 >> 9 = 0`, we get the correct result.

In summary, the compiler use the `cmovens` and `test` instructions to detect whether `x` is positive.

Incorrect instruction name. The assembly output shows cmovnsl (Conditional Move if Not Sign), not cmovens.

Suggested change
- In summary, the compiler use the `cmovens` and `test` instructions to detect whether `x` is positive.
+ In summary, the compiler use the `cmovns` and `test` instructions to detect whether `x` is positive.
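
As an illustration of what `test` followed by `cmovns` decides, the sketch below models the sign flag in C. It assumes the biased value `x + 511` is computed first and that `cmovns` then restores `x` when the sign flag is clear; the function name and the exact instruction order are assumptions, not taken from the post's listing.

```c
#include <stdbool.h>
#include <stdint.h>

/* Arithmetic shift right (0 <= n < 64) that avoids the
 * implementation-defined >> on negative values. */
static int64_t sar(int64_t v, unsigned n) {
    return v < 0 ? ~(~v >> n) : v >> n;
}

/* Hypothetical C model of the signed x / 512 sequence on x86. */
int32_t div512_x86_model(int32_t x) {
    int64_t biased = (int64_t)x + 511;   /* x + (2^9 - 1), widened so C cannot overflow */
    bool sf = x < 0;                     /* test x, x: SF copies the sign bit of x */
    int64_t chosen = sf ? biased : x;    /* cmovns: keep x when SF is clear (x >= 0) */
    return (int32_t)sar(chosen, 9);      /* sar by 9 */
}
```

Note that `cmovns` fires whenever the sign flag is clear, so zero takes the same path as positive inputs.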

To resolve this problem, the compiler adds `2^n - 1` to the negative number.
In this case, n = 9, hence it is `2^9 - 1 = 512 - 1 = 511 = 0x1FF`

Why `2^n - 1` ? Let's consdier `n = 9`

Typo: "consdier" should be "consider".

Suggested change
- Why `2^n - 1` ? Let's consdier `n = 9`
+ Why `2^n - 1` ? Let's consider `n = 9`

0 1 1 1 1 1 1 1 1 1
```

It can help us to filp the bit of `x`. For example, `x = -1`

Typo: "filp" should be "flip".

Suggested change
- It can help us to filp the bit of `x`. For example, `x = -1`
+ It can help us to flip the bit of `x`. For example, `x = -1`

If `x` is a positive number, shift it. Otherwise, use `2^n - 1` to create an `n` one mask.
Then we can use the shift operator to achieve the same result as we use the division operator.

## x86 Unsign Integer Division

Typo: "Unsign" should be "Unsigned" for consistency with the previous section "x86 Signed Integer Division".

Suggested change
- ## x86 Unsign Integer Division
+ ## x86 Unsigned Integer Division

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request adds a new blog post with study notes on compiler optimizations for integer division. The notes are well-structured and cover both x86 and ARM architectures. I've found a few typos, grammatical errors, and some technical inaccuracies in the explanations. My review includes suggestions to correct these issues to improve the clarity and correctness of the post. The most critical ones are an incorrect explanation of the shrl instruction and an inaccurate diagram for two's complement addition.

```
Position | 32 (Sign Bit) 10 1
| v v v
Carry | 1 1 1 1 1 ... 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0

high

The 'Carry' row in this diagram appears to be incorrect and could be confusing. For example, when adding the least significant bits (1 + 1), the result is 0 with a carry of 1. However, your diagram shows a carry of 0 for this position. It might be clearer to either correct the carry propagation or remove this row entirely to avoid misleading readers.
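
The point about the least significant bit can be checked mechanically. Assuming the diagram adds -1 (`0xFFFFFFFF`) and the bias `0x1FF`, which is an inference from the `x = -1` example elsewhere in the post, the hypothetical helper below computes the carry out of every bit position of a 32-bit addition.

```c
#include <stdint.h>

/* Bit i of the result is the carry OUT of bit position i when adding
 * a + b as 32-bit values. A position carries when both inputs are 1,
 * or when at least one input is 1 and the sum bit came out 0. */
uint32_t carry_out_bits(uint32_t a, uint32_t b) {
    uint32_t sum = a + b;                /* wraps modulo 2^32 */
    return (a & b) | ((a | b) & ~sum);
}
```

Under that assumption, `carry_out_bits(0xFFFFFFFFu, 0x1FFu)` is `0xFFFFFFFF`: every position, including the least significant one, produces a carry, and the wrapped sum is `0x1FE` (510).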


Instruction:
```
shrl <imm>, <Rd> ; shr := Shift Right Logical, that is <Rd> = <Rd> << <imm>

high

The explanation for shrl is incorrect. It states that it's a left shift (<<), but shrl performs a logical right shift (>>). This is a critical error in the explanation that could confuse readers.

Suggested change
- shrl <imm>, <Rd> ; shr := Shift Right Logical, that is <Rd> = <Rd> << <imm>
+ shrl <imm>, <Rd> ; shr := Shift Right Logical, that is <Rd> = <Rd> >> <imm>

To resolve this problem, the compiler adds `2^n - 1` to the negative number.
In this case, n = 9, hence it is `2^9 - 1 = 512 - 1 = 511 = 0x1FF`

Why `2^n - 1` ? Let's consdier `n = 9`

medium

There's a typo here. 'consdier' should be 'consider'.

Suggested change
- Why `2^n - 1` ? Let's consdier `n = 9`
+ Why `2^n - 1` ? Let's consider `n = 9`


Why `2^n - 1` ? Let's consdier `n = 9`

In binary, `2^9 - 1` creates the exactly 9 ones:

medium

There's a minor grammatical error here. 'the exactly' should be 'exactly'.

Suggested change
- In binary, `2^9 - 1` creates the exactly 9 ones:
+ In binary, `2^9 - 1` creates exactly 9 ones:

0 1 1 1 1 1 1 1 1 1
```

It can help us to filp the bit of `x`. For example, `x = -1`

medium

There's a typo here ('filp' should be 'flip'). Additionally, the phrase 'flip the bit' might be misleading. This operation doesn't flip bits but adds a 'bias' to negative numbers to ensure the subsequent arithmetic right shift correctly truncates towards zero, matching C's division behavior.

Suggested change
- It can help us to filp the bit of `x`. For example, `x = -1`
+ It can help us to flip the bit of `x`. For example, `x = -1`
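
The "bias" reading can be checked over a range of inputs. The sketch below uses a branchless formulation that derives the bias from the sign mask, which is an alternative spelling chosen for illustration and not necessarily what the post's assembly does; all names are hypothetical.

```c
#include <stdint.h>

/* Arithmetic shift right (0 <= n < 32) that avoids the
 * implementation-defined >> on negative values. */
static int32_t asr(int32_t v, unsigned n) {
    return v < 0 ? ~(~v >> n) : v >> n;
}

/* A bare arithmetic shift rounds toward negative infinity (floor). */
int32_t shift_only(int32_t x) {
    return asr(x, 9);
}

/* Sign mask is -1 for negative x and 0 otherwise, so the bias is
 * 511 for negative inputs and 0 for the rest. */
int32_t shift_with_bias(int32_t x) {
    int32_t bias = asr(x, 31) & 511;
    return asr(x + bias, 9);
}

/* Counts inputs in [lo, hi] where f disagrees with C's x / 512. */
int count_mismatches(int32_t (*f)(int32_t), int32_t lo, int32_t hi) {
    int n = 0;
    for (int64_t x = lo; x <= hi; ++x)
        if (f((int32_t)x) != (int32_t)x / 512) ++n;
    return n;
}
```

Over the range -2048 to 2048, `shift_only` disagrees with `x / 512` on every negative input that is not a multiple of 512, while `shift_with_bias` agrees everywhere.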


Hence, the compiler does `(-1 + 511) / 512 = 510 / 512 = 510 >> 9 = 0`, we get the correct result.

In summary, the compiler use the `cmovens` and `test` instructions to detect whether `x` is positive.

medium

There are a couple of small issues here: a typo in the instruction name (cmovens should be cmovnsl or cmovns), and a grammatical error ('use' should be 'uses').

Suggested change
- In summary, the compiler use the `cmovens` and `test` instructions to detect whether `x` is positive.
+ In summary, the compiler uses the `cmovnsl` and `test` instructions to detect whether `x` is positive.

If `x` is a positive number, shift it. Otherwise, use `2^n - 1` to create an `n` one mask.
Then we can use the shift operator to achieve the same result as we use the division operator.

## x86 Unsign Integer Division

medium

There's a typo in the section title. 'Unsign' should be 'Unsigned'.

Suggested change
- ## x86 Unsign Integer Division
+ ## x86 Unsigned Integer Division

Instructions:
```
- add <Wd>, <Wn>, #imm ; w8 = w0 + 0x1ff
- cmp <Wn>, #imm ; Compares w8 with #0x0, and update the processor flags NZCV

medium

The comment for the cmp instruction is incorrect. It says it compares w8 with #0x0, but the instruction itself (cmp w0, #0x0) compares w0.

Suggested change
- - cmp <Wn>, #imm ; Compares w8 with #0x0, and update the processor flags NZCV
+ - cmp <Wn>, #imm ; Compares w0 with #0x0, and update the processor flags NZCV

```

## References
- https://developer.arm.com/documentation/ddi0601/2025-12/AArch64-Registers/NZCV--Condition-Flags

medium

This link to the ARM documentation appears to be broken because of the future date 2025-12 in the URL, which causes a 404 error. You can fix this by replacing the date with latest to point to the most recent version of the documentation.

Suggested change
- - https://developer.arm.com/documentation/ddi0601/2025-12/AArch64-Registers/NZCV--Condition-Flags
+ - https://developer.arm.com/documentation/ddi0601/latest/AArch64-Registers/NZCV--Condition-Flags

@gapry gapry merged commit c14f682 into main Feb 28, 2026
1 check passed
@gapry gapry deleted the AoCO-2025-Day06 branch February 28, 2026 14:09