Try to avoid promoting to int (16-bit) when doing operations with char (8-bit) #14

Open
hikari-no-yume opened this issue Jun 12, 2023 · 0 comments

Comments

@hikari-no-yume
Collaborator

Currently we do all arithmetic in 16-bit, even when the operands and destination are 8-bit (i.e. char types). This is a very single-pass-compiler and C thing to do (surely this is what the usual arithmetic conversions were designed for), but since uxn has native 8-bit operations and a limited stack size, it's neither efficient nor makes for particularly æsthetically-pleasing assembly. This will get worse if we start doing sign extension when promoting char to int (see #9).

So, it would be nice if (char)(some_char * 2 + 3) could be codegen'd as #02 MUL #03 ADD rather than #0002 MUL2 #0003 ADD2. As I see it there are two ways this could be done: in a “single-pass” fashion by changing the codegen step, or with some sort of later optimisation pass.

I am optimistic about the former approach. I think we could do it by propagating cast/conversion information downwards when doing codegen for expressions.

Currently the codegen behaviour is something like:

  • When encountering (char)(some_char * 2 + 3):
    • Recurse to generate (some_char * 2 + 3)
      • Recurse to generate some_char * 2
        • Recurse to generate some_char
          • Output code for loading some_char
          • Output code to extend to int (something like 00 SWP)
        • Recurse to generate 2
          • Output 0002
        • Output MUL2
      • Recurse to generate 3
        • Output 0003
      • Output ADD2
    • Output code to truncate to char (something like NIP)
    • Output code to extend to int (something like 00 SWP)

In the new system there would be a new flag used in expression codegen, something like truncate_to_byte. Now the behaviour would look something like:

  • When encountering (char)(some_char * 2 + 3):
    • Recurse to generate (some_char * 2 + 3) with truncate_to_byte set
      • Recurse to generate some_char * 2 with truncate_to_byte set
        • Recurse to generate some_char with truncate_to_byte set
          • Output code for loading some_char
          • Skip the code to extend to int (something like 00 SWP)
        • Recurse to generate 2 with truncate_to_byte set
          • Output 02 instead of 0002
        • Output MUL instead of MUL2
      • Recurse to generate 3 with truncate_to_byte set
        • Output 03 instead of 0003
      • Output ADD instead of ADD2
    • Skip the code to truncate to char (something like NIP)
    • Output code to extend to int (something like 00 SWP) but only if truncate_to_byte is not set
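
The walk above could be sketched roughly as follows. This is only an illustration under assumptions, not the compiler's actual code: the Node type, the N_* kinds, the emit helper, the out buffer and the ";name LDA" load sequence are all hypothetical stand-ins, and only the truncate_to_byte flag comes from the proposal itself.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical miniature AST; these are not the compiler's real node types. */
typedef enum { N_NUM, N_CHARVAR, N_ADD, N_MUL, N_CAST_CHAR } Kind;

typedef struct Node {
  Kind kind;
  int val;                 /* N_NUM: constant value */
  const char *name;        /* N_CHARVAR: label of an 8-bit variable (assumed) */
  struct Node *lhs, *rhs;  /* operands; N_CAST_CHAR uses lhs only */
} Node;

static char out[256];      /* emitted uxntal text, space separated */

static void emit(const char *s) {
  assert(strlen(out) + strlen(s) + 2 <= sizeof out);
  if (out[0]) strcat(out, " ");
  strcat(out, s);
}

/* truncate_to_byte: the caller only needs the low 8 bits of the result. */
static void gen(const Node *n, int truncate_to_byte) {
  char buf[64];
  switch (n->kind) {
  case N_NUM:
    if (truncate_to_byte)
      snprintf(buf, sizeof buf, "#%02x", (unsigned)n->val & 0xffu);
    else
      snprintf(buf, sizeof buf, "#%04x", (unsigned)n->val & 0xffffu);
    emit(buf);
    break;
  case N_CHARVAR:
    snprintf(buf, sizeof buf, ";%s LDA", n->name);  /* assumed load sequence */
    emit(buf);
    if (!truncate_to_byte) emit("#00 SWP");         /* extend to int */
    break;
  case N_ADD:
  case N_MUL:
    /* The low byte of a sum or product depends only on the operands' low
       bytes, so the flag is passed down unchanged. */
    gen(n->lhs, truncate_to_byte);
    gen(n->rhs, truncate_to_byte);
    if (n->kind == N_ADD) emit(truncate_to_byte ? "ADD" : "ADD2");
    else emit(truncate_to_byte ? "MUL" : "MUL2");
    break;
  case N_CAST_CHAR:
    /* The operand is generated in byte mode, so no NIP is needed. */
    gen(n->lhs, 1);
    if (!truncate_to_byte) emit("#00 SWP");         /* extend only if needed */
    break;
  }
}
```

With a tree for (char)(some_char * 2 + 3), calling gen with the flag set leaves ";some_char LDA #02 MUL #03 ADD" in out; calling it with the flag clear appends "#00 SWP". Note that the node types are untouched: only the emitted instructions change.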

A more tricky case is something like (char)((1 << 8) >> 8), where we can't use 8-bit operations all the way down. That can be handled by not propagating truncate_to_byte when dealing with such operators.
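
To see why the flag can pass through + and * but not through >>, here is a small host-side check in plain C. It models the arithmetic only, not uxn code, and the function names are made up for illustration. Addition and multiplication are modular, so the low byte of the result is the same whether the work is done in 8 or 16 bits. A right shift pulls high bits down into the low byte, so its operand has to be computed at full width.

```c
#include <assert.h>
#include <stdint.h>

/* 16-bit evaluation followed by truncation, like the current codegen. */
static uint8_t wide_then_truncate(uint8_t c) {
  uint16_t r = (uint16_t)((uint16_t)c * 2u + 3u);
  return (uint8_t)r;
}

/* 8-bit evaluation at every step, like the proposed codegen. */
static uint8_t narrow_all_the_way(uint8_t c) {
  uint8_t r = (uint8_t)(c * 2u);
  r = (uint8_t)(r + 3u);
  return r;
}

/* (char)((1 << 8) >> 8) with the shifts done in 16 bits: the result is 1. */
static uint8_t shift_wide(void) {
  uint16_t x = (uint16_t)(1u << 8);  /* 0x0100 */
  return (uint8_t)(x >> 8);
}

/* The same expression with every intermediate cut to 8 bits: the set bit is
   lost before the right shift can bring it back, so the result is 0. */
static uint8_t shift_narrow(void) {
  uint8_t x = (uint8_t)(1u << 8);    /* 0x00 */
  return (uint8_t)(x >> 8);
}
```

The two arithmetic functions agree for every possible input byte, while shift_wide() and shift_narrow() disagree, which is the reason not to propagate truncate_to_byte into the operand of >>. The same reasoning suggests division, remainder and comparison operands would also need the full-width value.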

An important thing to note here is that it's strictly a codegen optimisation: the type information isn't affected. I don't think it's possible to do the same trick during type assignment instead; it would break things.
