Strange invalid 16-bit optimization #140

Open
joshop opened this issue Dec 8, 2023 · 1 comment

joshop commented Dec 8, 2023

I ran across this while working on my Millfork NES game, and initially patched it up with some inline assembly because I wanted to keep going, but it's getting to the point of being annoying. I can try to create a minimal reproducible example if that's necessary, but this is my code:

// i'm using addresses for these so my emulator script for drawing a debug overlay knows where to find them
array(word) enemy_xs[12] align(fast) @ $a8
array enemy_ys[12] @ $62
array enemy_types[12] @ $580
array enemy_healths[12] @ $ca
array enemy_data1[12] @ $f1
array enemy_data2[12] @ $6f
array enemy_data3[12] @ $1b
array enemy_tags[12] @ $9d
byte enemy_count @ $9a
// [...]
// part of the code for enemies dying
if (enemy_healths[i] == 0 && enemy_flags[enemy_types[i]] & $20 == 0) {
    create_effect($80, enemy_xs[i], enemy_ys[i], $20, 0) // enemy dying from damage
}
destroy_enemy(i)
i += 1

On optimization level 1 (which I don't want to use unless I have to), those arguments are compiled into

    LDA #$80
    STA create_effect$type
    LDA main$i
    ASL
    TAY
    LDA $A8, Y
    INY
    STA create_effect$x
    LDA $A8, Y
    STA create_effect$x + 1
    LDY main$i
    LDA $62, Y
    STA create_effect$y
    LDA #$20
    STA create_effect$timer
    LDA #0
    STA create_effect$flags

On O2 and up, we instead get

    LDA main$i
    ASL
    TAY
    LDA $A8, Y
    STA create_effect$x + 1
    STA create_effect$x
    LDY main$i
    LDA $62, Y
    STA create_effect$y
    LDA #$80

Some of the arguments are optimized into registers (yay!) but the sketchy thing here is those two STAs back to back: the INY and the second LDA are gone, so the low byte gets stored into both halves of create_effect$x. The low and high bytes of entries of enemy_xs are NOT the same. This has shown up before in other situations involving loads from an array of words.
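To make the difference between the two listings concrete, here is a minimal Python sketch that models only the word load from each sequence. The function names and sample values are made up for illustration; only the instruction order is taken from the listings above, and `LDA $A8, Y` is assumed to be an absolute,Y load with no zero-page wraparound.

```python
def load_word_o1(mem, base, i):
    """Model of the O1 sequence: two loads with an INY in between."""
    y = (i << 1) & 0xFF       # LDA main$i ; ASL ; TAY
    lo = mem[base + y]        # LDA $A8, Y ; STA create_effect$x
    y = (y + 1) & 0xFF        # INY
    hi = mem[base + y]        # LDA $A8, Y ; STA create_effect$x + 1
    return lo | (hi << 8)


def load_word_o2(mem, base, i):
    """Model of the O2 sequence: one load stored to both bytes."""
    y = (i << 1) & 0xFF       # LDA main$i ; ASL ; TAY
    a = mem[base + y]         # LDA $A8, Y
    hi = a                    # STA create_effect$x + 1
    lo = a                    # STA create_effect$x
    return lo | (hi << 8)


if __name__ == "__main__":
    mem = bytearray(0x10000)
    # Hypothetical entry: enemy_xs[3] = $1234, stored little-endian.
    mem[0xA8 + 6] = 0x34
    mem[0xA8 + 7] = 0x12
    print(hex(load_word_o1(mem, 0xA8, 3)))
    print(hex(load_word_o2(mem, 0xA8, 3)))
```

With any entry whose two bytes differ, the second function returns the low byte duplicated into the high byte, which matches the symptom described here.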


joshop commented Dec 20, 2023

I'm hesitant to make a PR since I don't understand most of the source (nor do I know Scala), but I looked into it and it seems to be trying a "Double load to different registers" optimization (it initially uses A and X respectively for each byte of the value, and removes that later). Some of the rules under that optimization include DoesntChangeIndexingInAddrMode, but the one that fires here doesn't. The code initially has an INY between the two loads to move to the high byte, but the rule doesn't realize that the INY changes the address of the second load, so it treats both loads as reading the same location. It removes the second LDA, later removes the now-useless INY, and the result is the output above. If you add DoesntChangeIndexingInAddrMode to that rule, it stops making this mistaken optimization on a small reproduction example I made.
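As a rough illustration of why the missing condition matters, here is a toy Python model of such a check. It is not Millfork's actual rule code: `Insn`, `WRITES_INDEX` and `second_load_is_redundant` are hypothetical names, the opcode table is a small subset, and the sketch ignores stores that could overwrite the loaded location. The `check_index` flag stands in loosely for what DoesntChangeIndexingInAddrMode appears to guard against.

```python
from typing import NamedTuple, Optional


class Insn(NamedTuple):
    op: str
    operand: Optional[str] = None   # e.g. "$A8"
    index: Optional[str] = None     # "X", "Y", or None


# Opcodes that modify each index register (subset, enough for this sketch).
WRITES_INDEX = {
    "X": {"INX", "DEX", "TAX", "LDX", "TSX"},
    "Y": {"INY", "DEY", "TAY", "LDY"},
}


def changes_indexing(insn, index):
    """True if insn modifies the register used to index the load."""
    return index is not None and insn.op in WRITES_INDEX[index]


def second_load_is_redundant(first, between, second, check_index=True):
    """Decide whether `second` reloads the same location as `first`."""
    same_text = (
        first.op == second.op == "LDA"
        and first.operand == second.operand
        and first.index == second.index
    )
    if not same_text:
        return False
    # Without this check, identical operand text is mistaken for an
    # identical effective address even if the index register moved.
    if check_index and any(changes_indexing(i, first.index) for i in between):
        return False
    return True


if __name__ == "__main__":
    first = Insn("LDA", "$A8", "Y")
    between = [Insn("INY"), Insn("STA", "create_effect$x")]
    second = Insn("LDA", "$A8", "Y")
    print(second_load_is_redundant(first, between, second, check_index=False))
    print(second_load_is_redundant(first, between, second))
```

With the check disabled, the two `LDA $A8, Y` instructions look identical and the second is treated as redundant; with it enabled, the intervening INY blocks the rewrite.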
