Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Disable ld -> ldh optimization #243

Closed
AntonioND opened this issue Feb 23, 2018 · 30 comments
Closed

Disable ld -> ldh optimization #243

AntonioND opened this issue Feb 23, 2018 · 30 comments

Comments

@AntonioND
Copy link
Member

@AntonioND AntonioND commented Feb 23, 2018

In the Game Boy you usually want to have as much control over the final output of the ROM as you can, specially in disassemblies. This automatic optimization at assembly time is pointless, there are only a couple of cases in which it is useful, and it's not even consistently useful (if you declare a label inside a HRAM section in a different object file, rgbasm can't optimize anything).

I'm thinking about removing it. I'm not sure if it is worth keeping a command line flag to disable it, doing that by default or simply remove it forever.

Opinions?

@aaaaaa123456789
Copy link
Member

@aaaaaa123456789 aaaaaa123456789 commented Feb 23, 2018

Since this would break legacy codebases, I'd hardly see the benefit. While it would certainly be useful to be able to opt out of this optimization, it would probably do more harm than good to do it for ld itself at this point — hence my request in #230.

@AntonioND
Copy link
Member Author

@AntonioND AntonioND commented Feb 23, 2018

In that case, I think that we should have a command line option to disable it in disassemblies. I really don't like the idea of adding aliases for instructions when the issue is not the mnemonic, but some strange behaviour of the tool. I'd rather be able to disable the strange behaviour.

@aaaaaa123456789
Copy link
Member

@aaaaaa123456789 aaaaaa123456789 commented Feb 23, 2018

You could call the absolute to relative conversion in jr "a strange behavior of the tool" as well.

(And it's easy to solve all of these issues with macros, but it would be much better if we didn't have to.)

@AntonioND
Copy link
Member Author

@AntonioND AntonioND commented Feb 23, 2018

It's not really strange, it's what you expect as a developer. When you use jr with a label you expect it to jump to the label.

What's the real use-case for jr in your case? I can't see any right now. You can always define a label to be at a specific address.

@aaaaaa123456789
Copy link
Member

@aaaaaa123456789 aaaaaa123456789 commented Feb 23, 2018

I've used db $18, $01 to skip over a return. Sure, you can define a label, but the code is cleaner without it at times. (Would jr @+1 work? Or is it jr @+3? The explicit form is clearer.)

And I've run into corner cases in macro definitions that would be handled much more cleanly without extra labels.

You normally want the conversion, but there are odd edge cases when you don't. You normally want ld [$ff80], a to be two bytes, but there are odd edge cases when you don't. It's the same problem.

@AntonioND
Copy link
Member Author

@AntonioND AntonioND commented Feb 23, 2018

I don't like that idea. A label is always clear. a jr @+3 forces you to think about the size of the following instructions. Maybe you think that removing the label makes the code nicer, but the person who has to read it after you has to spend more time that necessary to understand it.

Regarding the ld optimization, I just think that the assembler should be predictable. This kind of behaviours make it difficult to predict what's going to happen. Simply moving some code from one file to another could remove the optimization, for example. If you have a label in a HRAM section above a ROM section that uses the label, and you move the section to a different object file, the optimization will disappear.

@aaaaaa123456789
Copy link
Member

@aaaaaa123456789 aaaaaa123456789 commented Feb 23, 2018

I think the argument regarding jr falls dangerously close to a "my coding style is better than yours" argument, which is an endless debate. It doesn't change that there are legitimate use cases for it (assuming the tools' purpose isn't to police coding style).

As far as the ld optimization goes, I thought it was wholly disabled for labels? That's why we don't use HRAM sections at all — we use constants (meaning that compilation would simply fail if we forgot to include them before they are used).

@AntonioND
Copy link
Member Author

@AntonioND AntonioND commented Feb 23, 2018

But the current tool doesn't prevent you from doing what you want. You can always use '@' or a label. In any case, using '@' is unmaintainable:

    jr z, skip
    ret
skip:
    inc a
    inc b

The previous one is obviously more clear than this:

    jr z, @+2
    ret
    inc a
    inc b

And if you want to change the number of instructions that it skips, you have to calculate it by hand, which is a very easy way to introduce bugs. And if you are taking a look at the code for the first time, you will have to think about it a bit before understanding where it is jumping. And when you understand it, you will start to think "why doesn't this code have a simple label that anyone can understand?".

Regarding the ld.

Symbols can be relocatable or not. A relocatable symbol won't be optimized, a constant one will. Something like this (all the code is in the same file) should make rgbasm optimize the ld:

    SECTION "HRAM", HRAM[$FF80]
variable: DS 1

    SECTION "ROM0", ROM0
function:
    ld a, [variable]
    ret

If you have the sections in two different object files, it won't be optimized.

That's why I want to disable this optimization, a simple reorganization of the code might make the assembler optimize them or not, which is inconsistent, and not obvious to someone using the tool.

@aaaaaa123456789
Copy link
Member

@aaaaaa123456789 aaaaaa123456789 commented Feb 23, 2018

Regarding ld, the simpler solution would be to simply disable the optimization for symbols and expressions involving symbols. That would preserve backwards compatibility in most relevant cases.

Regarding jr, that discussion is fully based on coding style — and since there's no objective way to determine a better coding style, it's a mostly pointless debate. (What is unclear to you is clear to me; what is unnecessary to me is informational to you. It's purely subjective.) The one real difference between jr @+expr and db $18, expr is that the former absorbs a -2 into the relative calculation — which results in an awkward jr @+2+expr when trying to use the supported form.

Since there are at least two real codebases that use db $18, expr (mine and the one from that guy who opened an issue), the point about whether it is useful or not seems moot. Supporting the explicit form from the assembler would simply standardize what would otherwise require a macro (or remembering what opcode $18 converts to).

@AntonioND
Copy link
Member Author

@AntonioND AntonioND commented Feb 23, 2018

But that makes ld have yet another unpredictable behaviour. How is this supposed to behave?

rP1 EQU $FF00
ld a, [rP1]

It's much clearer to have a flag to just keep it working as it is, or don't do anything. Old code will behave the same way as it has always done, and new disassemblies know that they will always get the opcode they want.


But that's another problem. If suddenly there is a new opcode that doesn't have that -2 offset, you need to know that the new opcode doesn't have that -2 offset. It forces you to check the documentation... and that's just to see how a jump is supposed to work. It's just not a good idea. Assembly is supposed to have a simple syntax because it's hard to understand.

This suggestion looks like a hack that is going to make things even harder to understand because now you need to know which one of the 2 behaviours you are going to get, the one of '@' or the one you are proposing. This is not obvious to someone looking at the code unless that person has done that before. Labels are unambiguous, and literally anyone that has ever written code would understand that jr label means jump to label, whereas jr +1 forces you to stop and think about what the code means, the size of instructions, etc, not about what the code does.

2 codebases is anecdotal evidence. I actually discussed this with a colleague of mine and he immediately said that labels were the only way to do this properly and that if he had found your code in a code review he would have rejected it right away. And I would have done the same. If I can't see where a jump is jumping right away, it's not clear enough. And what you are asking for forces you to know the size of the instructions, which isn't impossible in the case of the GB, but is completely unnecessary when you can achieve the same effect with a simple label.

In short, if you want to do this in your projects, feel free to create a macro based on jr @+x, but this is not going to be part of this repository.

Also, regarding the other codebase... He is only using it because jr wasn't allowed between sections, and he has documented it pretty well so that it is clear where it is jumping to. He has 5 lines of comment for that one hacky line. He isn't doing it because he likes it better that way, and he understands that that code is hard to understand, so he explains exactly how it works... Which is yet another reason in my favor to avoid it now that jr works between sections. https://github.com/ekimekim/gbos/blob/1827f12573abce231cc05470acf2ee0ef8c027b4/header.asm#L42-L46

@BenHetherington
Copy link
Contributor

@BenHetherington BenHetherington commented Feb 23, 2018

Sorry to interrupt, but back to the original issue at hand...

I appreciate that disassemblies require as much control as possible, so explicit ldhs would be more desirable than auto-optimised lds for these projects, e.g. for scenarios where the original developers used the less efficient instruction for HRAM access.

However, when developing new projects, I think it’s advantageous to have this optimisation; for example, if you decide to later optimise some code by moving a commonly-used variable from WRAM to HRAM, you would have to manually go through and change all the lds manually. While it arguably makes the code more explicit, I think that having a declaration in HRAM is itself an explicit statement that we want fast access to the memory at that label. This certainly feels like the kind of scenario the assembler should be able to help with.

I agree that the current behaviour across files is inconsistent, but I think it’d be a bit of a cop-out if we fixed the inconsistency by just removing the optimisation. Making a ld to ldh pass at link time might not be as simple, but I think it’d be worth it.

So:

  • I agree it’d be sensible for us to add a flag to disable this, designed for disassemblies (or people who want to be explicit, and are willing to risk potentially picking the ‘wrong’ instruction).
  • For when the flag is absent, I think we should continue to peruse (or at least think about) adding these optimisations across files (maybe reopen #229?).

(I hope this is less provocative than the discussion above! 😆)

@AntonioND
Copy link
Member Author

@AntonioND AntonioND commented Feb 24, 2018

The problem with #229 is that the linker has to be involved, sections would need the ability to be resized, and resizing sections creates all sort of problems because the linker was never meant to do that (make patches moveable, implement recursive placement of sections...). It's simply not worth it when the coder can just use ldh to read from I/O registers and the few variables that are actually stored in HRAM, even if actually changing the code is a bit painful (you can always write a script to replace ld by ldh in all lines that contain the name of that variable).

That's why what I want to do is to simply have a flag to disable/enable the current behaviour, making it even more clever involves a huge change in the linker.

That is also why I've decided to drop any kind of optimization related to jr and jp and support for "alignment within sections", they all involve drastic changes to the linker. At that point, it would be easier to rewrite most of it from scratch.

@ISSOtm
Copy link
Member

@ISSOtm ISSOtm commented Feb 26, 2018

The most important thing with this is keeping backwards-compatibility ("LIBRUUUL"). Given the large amount of projects using RGBDS, I believe it's required to keep the old behavior by default.
As for the rest, using flags means the programmer takes responsibility for what will be done

Disabling behavior is OK for disasms, yadda yadda. Nothing to add there.

Enforcing behavior is better for development (as @Ben10do said) AND disasms (the example I have in mind right now is pokered, which defines all HRAM as constants so they are auto-optimized, with the nasty side-effect of leaving them all out of the .sym file).
I want to insist on something I said : the ld -> ldh optimization does not imply recursive section placement, since it only implies an additional pass before any linking process takes place. The point was to implement it transparently, so the core of the linking process would remain unchanged.

Furthermore, toggling between ld and ldh is a tedious process (I have several files into several folders, with up to 3 folders of recursion), and sometimes impossible because macros. It would be possible to make one macro for WRAM and another one for HRAM, but this further complicates the RAM swapping process.
A solution against that would be to have only one macro, that acts differently if the label is located in W or HRAM. But that would mean making these known to the assembler...

@aaaaaa123456789
Copy link
Member

@aaaaaa123456789 aaaaaa123456789 commented Feb 26, 2018

Ultimately, that falls back into the "section size not fixed" pitfall.

Do remember that the address passed to ld needn't be a plain label; something like ld [wFoo + $1c00], a is perfectly valid (and it wouldn't be too shocking to see it after macro expansion and constant folding). There is no way to know whether that expression will result in a $FFxx address or not without linking, and thus the only sane thing the assembler can do is assume that it doesn't.

@ISSOtm
Copy link
Member

@ISSOtm ISSOtm commented Feb 26, 2018

I agree that the assumption should be a 3-byte ld. Then, if that can resolve to a ldh, it would be replaced with it.
By the way, this "assembler assumption failure" is already what's happening with ldh : the assembler assumes the expression will resolve to a $FFXX, and the linker errs if the assumption doesn't hold true in the end.
And while the section size isn't fixed, it is known when linking passes begin.

@aaaaaa123456789
Copy link
Member

@aaaaaa123456789 aaaaaa123456789 commented Feb 26, 2018

Replacing an ld with ldh after linking begins requires resizing a section.
In order to do it before linking begins, the address passed to ld must be a constant, not affected by linking.

Which is the current behavior.

@ISSOtm
Copy link
Member

@ISSOtm ISSOtm commented Feb 26, 2018

No, the current behavior is if the address passed is a constant known at compile time. My suggestion is to extend that behavior to addresses that are known at link time. This would make the current behavior more consistent (namely, the optimization would be independent from the file HRAM variables are declared in).
As for what isn't known prior to linking, this would be indeed left as a 3-byte ld. Most use cases that I have seen so far would be covered by this.

@aaaaaa123456789
Copy link
Member

@aaaaaa123456789 aaaaaa123456789 commented Feb 26, 2018

@ISSOtm
Copy link
Member

@ISSOtm ISSOtm commented Feb 26, 2018

What I meant is, a huge limitation currently is that if HRAM labels are declared in another file, the assembler will optimize. Otherwise, it won't, since it doesn't have the information.
My point is, the linker has information from all the files, so it can do more than the assembler.

@AntonioND
Copy link
Member Author

@AntonioND AntonioND commented Feb 26, 2018

But that is the problem, any optimization done by the linker implies resizing sections.

@BenHetherington
Copy link
Contributor

@BenHetherington BenHetherington commented Feb 26, 2018

…And we're back at square one.

I don't deny that having the linker be able to resize sections, and therefore do any space-saving optimisation, will be tricky. As a result, I completely understand why @AntonioND thinks it's 'simply not worth it' (especially in the short term).

However, as I think many users (myself included) would find such optimisations useful, I don't think that this means we should rule it out for RGBDS's future.

I'd gladly look into doing this, if I find time between work and my own projects; but there's nothing to stop any other contributor from familiarising themselves with rgblink's current design and what's necessary to allow it to resize sections.

@aaaaaa123456789
Copy link
Member

@aaaaaa123456789 aaaaaa123456789 commented Feb 26, 2018

Since we're back at square one, I think it's important to emphasise at this point that such behavior should always require an enabling flag. The linker resizing sections would be both backwards-incompatible and very non-intuitive, which is why it should not happen by default.

That being said, optimizations are always helpful, of course.

@AntonioND
Copy link
Member Author

@AntonioND AntonioND commented Feb 26, 2018

But it's not even as easy as just resizing sections. For example:

    DW function_end - function_start

function_start:
    ld [variable_in_hram],a
    ret
function_end:

For this to work, the assembler needs to know that the subtraction isn't defined at assembly time because the linker may optimize it. You need to make rgbasm create a patch for that expression instead of calculating the difference at assembly time. Which means that if you enable the optimization in the linker but not in the assembler you will have trouble.

And it's not easy to detect when the patch is needed...

@ISSOtm
Copy link
Member

@ISSOtm ISSOtm commented Feb 27, 2018

Given that there are already cases where the assembler cannot compute such an expression, for example :

func:
  ld a, [func_len]
  ds labelInOtherFile1 - labelInOtherFile2
end:

func_len:
  dw end - func

It's not a big stretch, imo.

@aaaaaa123456789
Copy link
Member

@aaaaaa123456789 aaaaaa123456789 commented Feb 27, 2018

As far as I recall from my experiences with Prism, that just fails if the assembler cannot figure it out.

@AntonioND
Copy link
Member Author

@AntonioND AntonioND commented Feb 27, 2018

That's the resizing sections problem again.

@ISSOtm
Copy link
Member

@ISSOtm ISSOtm commented Feb 27, 2018

@aaaaaa123456789 -> I find it bumming that this fails to compile. It's valid, there are uses cases, etc.

@AntonioND
Copy link
Member Author

@AntonioND AntonioND commented Feb 27, 2018

It fails to compile because it assumes that the assembler can leave that value as undetermined for the linker to calculate. And sure, the linker can do that, but the only changes that the linker can do is to replace values in a section. For example, for DW label2 - label1, if rgbasm detects that the labels are unknown, it outputs 2 zero bytes and creates a patch for that section. The patch says "at this address there are 2 bytes that have to be overwritten with the value of this calculation" (a RPN expression). This is pretty straightforward.

What you want to happen to DS is pretty complicated.

  1. You need to be able to move sections around because a resized section may not fit where it is placed.
  2. The result of the DS label2 - label1 may change after moving a section, which can happen after step 1. This means that you need to iterate until you find a stable situation, which may not happen at all.
  3. While doing this, all the changes done to a section that has been resized involve modifying the address of all patches that affect that section. This is easy compared to the other steps.

In short, this is not a simple addition, this is a fundamental change to the linker. I don't know of any toolchain that allows this kind of behaviour (WLA DX doesn't, for sure, but I'm talking about any assembler for any platform, I'm curious).

@aaaaaa123456789
Copy link
Member

@aaaaaa123456789 aaaaaa123456789 commented Feb 27, 2018

I remember running head-first into this issue in #149, when the behavior wasn't a hard failure and the assembler tried its best to figure out the size — and obviously got it wrong. I wonder if the CONTINUES keyword proposed in #136 would make this possible.

That being said, this chain effect is exactly why letting the linker resize sections at all is a bad idea — I've seen a macro expand into ds foo - bar 20 times in a row, meaning that small resizes can snowball into large ones.

@AntonioND
Copy link
Member Author

@AntonioND AntonioND commented Mar 3, 2018

Well, seeing that this discussion has died, I think it's a good time to close this issue (it has been fixed in #246). If someone wants to keep discussing, feel free, of course.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
4 participants