16-bit optimization error uses uninitialized carry? #895
|
Now that I discovered --debug-opt-output I can see where it goes wrong: So, the initial generated line at L0006 is processing the carry of two different comparison results. OptCmp8 appears to assume carry is not needed after a branch, so it makes no attempt to preserve it and eliminates the carry incorrectly. So... either the assumption made by OptCmp8 is invalid and it needs to be revised or removed, or the continued/combined use of the carry result produced by the generator is invalid? |
|
Just as a test, commenting out the line that removes the comparison: https://github.com/cc65/cc65/blob/master/src/cc65/coptcmp.c#L890 The So, even with OptCmp8 only replacing the branch, it seems to be useful. It feels like replacing the branch with a Not sure if this would still work as well with more complicated examples, but I think at least OptCmp8 by itself isn't able to produce invalid code if it doesn't eliminate the compare. |
Generates incorrect code for some 16-bit cases. See: cc65#895
|
The bug is that |
|
After writing a test for this against various types (signed/unsigned * char/int/long), it seems to fail only against unsigned int, and even then only in the < and <= cases, and would only visibly manifest at runtime if the carry happened to be clear as a pre-condition. All other types and cases seem to be immune to the bug, as far as I've noticed. So it seems that only the code generated for unsigned int < or <= does the kind of shared branch thing that OptCmp8 would fail on? Disabling the removal of the compare instruction (as in #899) seems to be a simple and safe way to correct OptCmp8, but it does seem like its effectiveness is reduced a bit by that, since the removal was valid in other cases. It might be possible to reinstate the compare removal for the other cases if there was a way for OptCmp8 to know a little more about the context? On the other hand, maybe this was the wrong place to try to optimize branching on a constant. Could it be done instead at the higher level when the code is generated in the first place? That might be safer and more effective. At any rate, #899 appears to fix the invalid code, at the expense of weakening this optimization slightly. Not sure if it's important to preserve/restore that lost optimization, but the test will probably help make that easier if one of us attempts it. |
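For readers who want to try this, here is a minimal sketch of what such a type-by-type test could look like. It is not the test from the thread: the helper names (expect, DEFINE_CHECKS, run_all_checks), the constants 5 and 10, and the use of a local variable with a known value are all assumptions chosen only to exercise < and <= against a constant for each type.

```c
#include <stdio.h>

/* Hypothetical helper state: number of checks whose result was wrong. */
static unsigned failures = 0;

/* Hypothetical helper: report a mismatch between actual and expected result. */
static void expect(const char *what, unsigned char got, unsigned char want)
{
    if (got != want) {
        printf("FAIL: %s (got %u, want %u)\n", what, (unsigned)got, (unsigned)want);
        ++failures;
    }
}

/* Generates one check function per type. Each check compares a local with a
 * known value against a constant, which is the assumed trigger for the
 * constant-branch optimization discussed in this thread. */
#define DEFINE_CHECKS(type, suffix)                                    \
    static void check_##suffix(void)                                   \
    {                                                                  \
        type a = 5;                                                    \
        unsigned char b;                                               \
                                                                       \
        b = 0; if (a < 10) { b = 1; } expect(#type " 5 < 10", b, 1);   \
        b = 0; if (a <= 5) { b = 1; } expect(#type " 5 <= 5", b, 1);   \
        b = 0; if (a < 5)  { b = 1; } expect(#type " 5 < 5", b, 0);    \
        b = 0; if (a <= 4) { b = 1; } expect(#type " 5 <= 4", b, 0);   \
    }

DEFINE_CHECKS(signed char, schar)
DEFINE_CHECKS(unsigned char, uchar)
DEFINE_CHECKS(int, sint)
DEFINE_CHECKS(unsigned int, uint)
DEFINE_CHECKS(long, slong)
DEFINE_CHECKS(unsigned long, ulong)

/* Runs every check and returns the number of failures (0 means all passed). */
unsigned run_all_checks(void)
{
    failures = 0;
    check_schar();
    check_uchar();
    check_sint();
    check_uint();
    check_slong();
    check_ulong();
    return failures;
}
```

Calling run_all_checks() from main and returning its result as the exit status would make this usable as a pass/fail test. On a compiler without the bug every check should pass; whether this exact shape reproduces the failure on an affected cc65 build with -O is not confirmed here.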
|
I don't know how much it helps but I tried to quantify the impact of #899. I built the Contiki 80-column webbrowser (with So the size increase is:
|
|
Hmm, well if we want to think more specifically about the cases where it fails, each time it seemed to have jumped directly to either:
The latter seems like it can be detected in the same way that OptBoolTrans detects it, I think, so it does appear possible to check for that. I'm not confident that I've encountered all ways that this can fail, but I think suppressing the removal of the compare in only these two cases should solve the known problems at least, without hamstringing the other cases. |
|
According to my understanding this seems like a viable approach. In case later on you/someone else discovers a 3.) we can still be optimistic that this 3.) will be detectable too. Or am I missing the point? |
|
Well, I tried this in bc3cc99 and it seems to pass the test I had created. On the example case, even though it retains the cpx on the first OptCmp8 pass, it actually gets removed in a later OptCmp8 pass once the intervening comparison's dead code has been removed. I don't have extensive knowledge of the code generator or optimizer, but it seems to be working OK in the cases I know about at least. |
|
Thanks for your quick reaction on my last comment :-)
I'd say that's true for everyone participating at cc65@GitHub. Therefore, I personally at least appreciate that you nevertheless try to fix bugs in that area! I want to give others a little time to comment. If no one vetoes, I'll merge... |
Generates incorrect code for some 16-bit cases. See: #895
|
Fixed with #899 - thanks :-) |
The following produces invalid code when -O is used:
Resulting assembly:
Note that the comparison has entirely disappeared. With an unsigned char for a, the optimizer seems to be able to determine if the condition is always true or false, and can even remove the whole assignment to b. But, with unsigned int, it bungles the optimization, and simultaneously leaves the assignment in there, and branches over it (or fails to branch) with an uninitialized carry? Very odd.
Source: https://forums.nesdev.com/viewtopic.php?f=2&t=18843&p=238785