Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

lcc: compiler hangs on complex expression #76

Open
kervinck opened this issue May 17, 2019 · 33 comments
Open

lcc: compiler hangs on complex expression #76

kervinck opened this issue May 17, 2019 · 33 comments
Assignees

Comments

@kervinck
Copy link
Owner

int f(int a)
{ 
  return 16807 * (a % 27773) - 2836 * (a / 27773);
}
@kervinck kervinck added the bug label May 17, 2019
@kervinck
Copy link
Owner Author

Simplification that still hangs the compiler:

int main(int a)
{ 
  return 1 * (a / 2);
}

@kervinck
Copy link
Owner Author

kervinck commented May 17, 2019

We seem stuck in a never ending recursion:

ASGNI2(ADDRLP2(29), LOAD(INDIRI2(VREGP(vAC))))
dumpcover(1a005e00) = stmt: ASGNI2(ADDRLP2,LOADI2(INDIRI2(VREGP))) / (action)
(listing 1a005e00)
(rallocing 1a005e00)
(replacing 1a005130 with a reload from 56)
(genreload: INDIRI2(ADDRLP2(27)))
INDIRI2(ADDRLP2(29))
dumpcover(1a006028) = stmt: reg / # 
dumpcover(1a006028) =  reg: INDIRI2(ADDRLP2) / (action)
(reprune changes 190071a0 from 1a005130 to 1a006028)
(inserting load from vAC to r1)
(targeting eb98b078->x.kids[0]=1a0060e8 to r1)
(listing 1a006028)
(freeing vAC)
(free[0]=ffffffff)
(free[1]=ffffffff)
(allocating vAC to node 1a0059c0)
(free[0]=fffffffe)
(free[1]=ffffffff)
(rallocing 1a006028)
local 30 of size 2 @ offset 58
(spilling vAC to local 58)
(genspill: INDIRI2(ADDRLP2(28)))
(targeting 1a006348->x.kids[0]=1a0062b8 to vAC)
(inserting load from vAC to vAC)
(targeting 1a006468->x.kids[1]=1a0064f8 to vAC)
ASGNI2(ADDRLP2(30), LOAD(INDIRI2(VREGP(vAC))))
dumpcover(1a006468) = stmt: ASGNI2(ADDRLP2,LOADI2(INDIRI2(VREGP))) / (action)
(listing 1a006468)
(rallocing 1a006468)
(replacing 1a0059c0 with a reload from 58)
(genreload: INDIRI2(ADDRLP2(28)))
INDIRI2(ADDRLP2(30))
dumpcover(1a006690) = stmt: reg / # 
dumpcover(1a006690) =  reg: INDIRI2(ADDRLP2) / (action)
(reprune changes 190071a8 from 1a0059c0 to 1a006690)
(inserting load from vAC to r1)
(targeting eb98b078->x.kids[0]=1a006750 to r1)
(listing 1a006690)
(freeing vAC)
(free[0]=ffffffff)
(free[1]=ffffffff)
(allocating vAC to node 1a006028)
(free[0]=fffffffe)
(free[1]=ffffffff)
(rallocing 1a006690)
local 31 of size 2 @ offset 60
(spilling vAC to local 60)
(genspill: INDIRI2(ADDRLP2(29)))
(targeting 1a0069b0->x.kids[0]=1a006920 to vAC)
(inserting load from vAC to vAC)
(targeting 1a006ad0->x.kids[1]=1a006b60 to vAC)
ASGNI2(ADDRLP2(31), LOAD(INDIRI2(VREGP(vAC))))
dumpcover(1a006ad0) = stmt: ASGNI2(ADDRLP2,LOADI2(INDIRI2(VREGP))) / (action)

@kervinck
Copy link
Owner Author

kervinck commented May 17, 2019

This looks funny:

(inserting load from vAC to vAC)

These vAC's are different symbol objects with the same name. The rightmost is created by mkreg(). The left one by genspill(). "Our" inset() treats them as different. (But so does the original LCC code before inset() was added.)

The same hangup issue occurs with the shift operators:

int main(int a)
{ 
  return 1 << (a >> 2);
}

Interestingly, these operators get the same treatment in gt1.c:

        case LSH:
        case RSH:
                // LSH reg, [1,8] are unary operators. LSH reg, reg is a helper call.
                if (generic(p->kids[1]->op) == CNST) {
                        unsigned amt = p->kids[1]->syms[0]->u.c.v.u;
                        if ((generic(p->op) == LSH && amt <= 4) || amt == 8) {
                                rtarget(p, 0, wregs[0]);
                                break;
                        }
                }
                // fallthrough
        case MUL:
        case MOD:
        case DIV:
                // Helper calls. We can pick the targets to help avoid spills. We'll put the LHS in r0 and the RHS in vAC.
                // Note that we first target the LHS to vAC in order to ensure that a copy-through-vAC is inserted.
                rtarget(p, 0, wregs[0]);
                rtarget(p, 0, wregs[1]);
                rtarget(p, 1, wregs[0]);
                break;

@kervinck
Copy link
Owner Author

This is getting a bit outside my comfort zone.
@pgavlin, is this related to #73 in some way?

@kervinck
Copy link
Owner Author

If I replace

int inset(Symbol rs, Symbol r) {
        int i;

        if (!rs->x.wildcard) {
                return rs == r;
        }

with

int inset(Symbol rs, Symbol r) {
        int i;

        if (!rs->x.wildcard) {
                return !strcmp(rs->name, r->name);
        }

then the weird 'vAC to vAC' messages are gone. But the compiler still enters an unbounded recursion :-(

@kervinck
Copy link
Owner Author

LCC gets in an infinite loop of spilling and reloading vAC. Initially it spills vAC at some point, which is fine and understandable. Later it reloads, but at that point vAC is allocated, so it wants to spill it again. etc

(The message “inserting load from vAC to vAC” looks like a red herring we can ignore for now.)

@pgavlin
Copy link
Collaborator

pgavlin commented May 20, 2019

LCC gets in an infinite loop of spilling and reloading vAC. Initially it spills vAC at some point, which is fine and understandable. Later it reloads, but at that point vAC is allocated, so it wants to spill it again. etc

Yep. I attempted to address this by extending the RA's support for register sets to include partial sets as well as complete sets and ensuring that all nodes that require one operand in vAC require the other operand in anything besides vAC. My guess is that I missed a spot somewhere.

It would be great to have a generalized solution for the infinite spill/reload cycle, though I'm not sure what that would be. I believe that the approach I described above will work, but it is more ad-hoc than I would like.

@kervinck
Copy link
Owner Author

It’s a nice brain teaser now that I understand what happens. It indeed begs for a simple generalised solution, but I don't "see" it (yet).

@kervinck
Copy link
Owner Author

kervinck commented May 20, 2019

int main(void)
{ 
  int a, b, c;
  return a * (b / c);
}

Annotated simplified debug output

(rallocing 4b80d298 = LOAD(INDIRI2(ADDRLP2(a))) )                   r1 <- vAC <- argument a
(spilling r1 to local 1)                                            spilling r1, because need again for compiling DIV
ASGNI2(ADDRLP2(1), LOAD(INDIRI2(VREGP(r1))))                        local 1 <- vAC <- r1. vAC is free, so no problem
(genreload: LOAD(INDIRI2(ADDRLP2(a))))                              what's in r1
INDIRI2(ADDRLP2(1))                                                 addressing local 1
(spilling vAC to local 2)                                           now we spill the result of DIV, for the reload of 'a'
ASGNI2(ADDRLP2(2), INDIRI2(VREGP(vAC)))                             local 2 <- vAC
(genreload: DIVI2(LOAD(INDIRI2(ADDRLP2(b))), INDIRI2(ADDRLP2(c))))  what's in vAC
INDIRI2(ADDRLP2(2))                                                 addressing local 2
(targeting ed2a6078->x.kids[0]=4b808de0 to r1)                      the reload of r1 (LHS for MUL) ??? See below
(allocating vAC to node 4b808760)
(rallocing 4b808d20 = INDIRI2(ADDRLP2(2)) )                         the reload of vAC (RHS for MUL)
(spilling vAC to local 3)                                           ehhh, no! vAC *should* be available but it aint!

The above annotation is not quite right. Some speculation: the MUL needs two reloads (LHS and RHS), but the compiler tries to emit them in reverse order?

@pgavlin
Copy link
Collaborator

pgavlin commented May 20, 2019

What does the full tree look like post-canonicalization?

@kervinck
Copy link
Owner Author

kervinck commented May 20, 2019

It is called only at the beginning:

post-canon: RETI2(MULI2(INDIRI2(ADDRLP2(a)), DIVI2(INDIRI2(ADDRLP2(b)), INDIRI2(ADDRLP2(c)))))

But after that the tree gets modified.

@pgavlin
Copy link
Collaborator

pgavlin commented May 20, 2019

Ah, I always forget that the post-canon trees aren't that helpful for diagnosing RA issues.

Immediately prior to the register allocator, we should see INDIRI2(ADDRLP2(a)) replaced by either LOAD(INDIRI2(ADDRLP2(a))) or LOAD(LOAD(INDIRI2(ADDRLP2(a)))), and INDIRI2(ADDRLP2(b)) replaced by a similar tree. These loads should be inserted as a result of this code snippet from earlier:

        case MUL:
        case MOD:
        case DIV:
                // Helper calls. We can pick the targets to help avoid spills. We'll put the LHS in r0 and the RHS in vAC.
                // Note that we first target the LHS to vAC in order to ensure that a copy-through-vAC is inserted.
                rtarget(p, 0, wregs[0]);
                rtarget(p, 0, wregs[1]);
                rtarget(p, 1, wregs[0]);
                break;

Do those LOAD nodes--which represent vreg->vAC or vAC->vreg copiess--appear to be present?

@kervinck
Copy link
Owner Author

Yes, I believe that part works as intended. I must admit that my hunt is based on partial understanding. Especially the wildcard stuff is black magic for me, and the book hasn't arrived. So it's more throwing sticks at the problem until it goes away.

BTW, I replaced this in inset() in my local copy:

        if (!rs->x.wildcard) {
-               return rs == r;
+               //return rs == r;
+               return rs->x.regnode == r->x.regnode;
        }

It looks like a genuine bug to me also present in the original LCC. And it's probably completely harmless.

Right now I'm staring now at how spillr() operates. But I should go to sleep.

@pgavlin
Copy link
Collaborator

pgavlin commented May 20, 2019

inset is all my doing, so that's my bug. Thanks for fixing it :)

@kervinck
Copy link
Owner Author

Don't worry: the original LCC expression has the same equality operator.

The reloads appear to be created in-place in the tree. That disproofs my reversal theory.

@pgavlin
Copy link
Collaborator

pgavlin commented May 20, 2019

Ahh, now this is starting to make more sense. Here's the output of dumpcover from the debug logs:

RETI2(MULI2(LOAD(INDIRI2(ADDRLP2(a))), DIVI2(LOAD(INDIRI2(ADDRLP2(b))), INDIRI2(ADDRLP2(c)))))
dumpcover(f080cdf8) = stmt: RETI2(reg) / (action)
dumpcover(f080cd68) =  reg: MULI2(reg,reg) / (action)
dumpcover(f080d298) =   reg: LOADI2(reg) / (action)
dumpcover(f080ca08) =    reg: INDIRI2(ADDRLP2) / (action)
dumpcover(f080ccd8) =   reg: DIVI2(reg,reg) / (action)
dumpcover(f080d208) =    reg: LOADI2(reg) / (action)
dumpcover(f080cb28) =     reg: INDIRI2(ADDRLP2) / (action)
dumpcover(f080cc48) =    reg: INDIRI2(ADDRLP2) / (action)

The nodes will execute in the order produced by a left-to-right postorder visit of the tree, so that tree would execute like so:

1. INDIRI2(ADDRLP2(a)) ; load local a into vAC (req'd due to the ldloc helper)
2. LOADI2(reg)         ; copy vAC into r1 for the eventual helper call to mul
3. INDIRI2(ADDRLP2(b)) ; load local b into vAC (req'd due to the ldloc helper)
4. LOADI2(reg)         ; copy vAC into r1 for the eventual helper call to div 
5. INDIRI2(ADDRLP2(c)) ; load local c into vAC (req'd due to the ldloc helper)
6. DIVI2(reg, reg)     ; call the div helper, with the result in vAC
7. MULI2(reg, reg)     ; call the mul helper, with the result in vAC

Because the result of (2) is still live in r1 at (4), r1 must be spilled. lcc attacks the spill problem by creating a tree of the form ASGNI2(ADDRLP2(spill temp), LOAD(INDIRI2(VREGP(spillee)))) at the site of the spill, a tree of the form INDIRI2(ADDRLP2(1)) at the site of the reload, and running these through the usual codegen pipeline. I added patterns to recognize the spill code, but the reloads are so generic that I don't think it's possible to match and handle them specially. As a result, the usual targeting kicks in and asks for vAC all over the place, which triggers more spills and reloads ad nauseum.

It is possible that we will need to change the lcc spiller to do something gt1-specific. I'm not sure that the architecture-independent spiller will actually work out.

@pgavlin
Copy link
Collaborator

pgavlin commented May 20, 2019

We may want to try adding a RELOAD node to the lcc IR. We could then pattern-match against this node and use the additional information to generate appropriate code that avoids trashing vAC when reloading a virtual register.

@kervinck
Copy link
Owner Author

kervinck commented May 20, 2019

I goes wrong when it wants to inject a 3rd spill. The debug output is hard to read for me, even having stared at it for almost 20 hours now. Conceptually, there's no reason to use more than 2 temporary variables. And even if vAC is constantly allocated and freed, there shouldn't arise a lock if all goes in the right order. But somehow it does...

@kervinck
Copy link
Owner Author

kervinck commented May 21, 2019

This is the sequence I think should be emitted. There is no need for special treatment when tracking the occupancy of vAC. From what I can see, the compiler is pretty close to emitting this. It just trips over its own feet at some point. Maybe it is releasing vAC just a bit later than really necessary. Something small.

1a. vAC <- local a
1b. temp 1 <- vAC
2a. vAC <- local b
2b. r1 <- vAC
3.  vAC <- local c
4.  call mul
5.  temp 2 <- vAC
6a. vAC <- temp 1
6b. r1 <- vAC
7.  vAC <- temp 2
8.  call div

@pgavlin
Copy link
Collaborator

pgavlin commented May 21, 2019

From the logs, I think that the RA is creating this sequence when it spills r1:

1. INDIRI2(ADDRLP2(a)) ; load local a into vAC (req'd due to the ldloc helper)
2. LOADI2(reg)         ; copy vAC into r1 for the eventual helper call to mul
2a. ASGNI2(ADDRLP2(1), LOAD(INDIRI2(VREGP(r1)))) ; spill r1 to t1
3. INDIRI2(ADDRLP2(b)) ; load local b into vAC (req'd due to the ldloc helper)
4. LOADI2(reg)         ; copy vAC into r1 for the eventual helper call to div 
5. INDIRI2(ADDRLP2(c)) ; load local c into vAC (req'd due to the ldloc helper)
6. DIVI2(reg, reg)     ; call the div helper, with the result in vAC
2b. INDIRI2(ADDRLP2(1)) ; reload t1 into vAC (req'd due to the ldloc helper)
7. MULI2(reg, reg)     ; call the mul helper, with the result in vAC

(2b) forces a spill of vAC, so the sequence then changes to this:

1. INDIRI2(ADDRLP2(a)) ; load local a into vAC (req'd due to the ldloc helper)
2. LOADI2(reg)         ; copy vAC into r1 for the eventual helper call to mul
2a. ASGNI2(ADDRLP2(1), LOAD(INDIRI2(VREGP(r1)))) ; spill r1 to t1
3. INDIRI2(ADDRLP2(b)) ; load local b into vAC (req'd due to the ldloc helper)
4. LOADI2(reg)         ; copy vAC into r1 for the eventual helper call to div 
5. INDIRI2(ADDRLP2(c)) ; load local c into vAC (req'd due to the ldloc helper)
6. DIVI2(reg, reg)     ; call the div helper, with the result in vAC
6a. ASGNI2(ADDRLP2(2), LOAD(INDIRI2(VREGP(vAC)))) ; spil vAC to t2
2b. INDIRI2(ADDRLP2(1)) ; reload t1 into vAC (req'd due to the ldloc helper)
6b. INDIRI2(ADDRLP2(2)) ; reload t2 into vAC (req'd due to the ldloc helper)
7. MULI2(reg, reg)     ; call the mul helper, with the result in vAC

This sequence contains an unresolvable register conflict: (2b) and (6b) both require vAC at the same point, so the code is not allocatable. I think that what is missing is a copy of the result of (2b) into r1. Though I am not entirely sure why that isn't being inserted, I suspect that it is because target is not being called again on (7). I'll have to check the book for a better explanation of the difference between kids and x.kids and the behavior of genreload.

@kervinck
Copy link
Owner Author

kervinck commented May 21, 2019

You're onto something. The genreload seems to mix something up. I now also dump the global 'head' tree at the beginning and end of genreload. At the critical stage, it transforms the head from

RETI2(MULI2(INDIRI2(ADDRLP2(1)), DIVI2(LOAD(INDIRI2(ADDRLP2(b))), INDIRI2(ADDRLP2(c)))))

into this:

RETI2(MULI2(INDIRI2(ADDRLP2(1)), INDIRI2(ADDRLP2(2))))

But here we expect a LOAD from vAC to r1 in the first argument to MUL and it isn't there. (Your 2b line).

However, during this transformation there is this line being logged:

(inserting load from vAC to r1)
(targeting ed6d4098->x.kids[0]=8a808de0 to r1)

From this I assumed all the time it was doing the right thing. But we don't see that LOAD node actually appear in the tree. This debug logging is so hard to follow...

@pgavlin
Copy link
Collaborator

pgavlin commented May 21, 2019

Here is reprune, which is called to refill the contents of kids with the appropriate entries from x.kids:

static int reprune(Node *pp, int k, int n, Node p) {
	struct node x, *q = *pp;

	if (q == NULL || k > n)
		return k;
	else if (q->x.inst == 0)
		return reprune(&q->kids[1],
			reprune(&q->kids[0], k, n, p), n, p);
	if (k == n) {
		debug(fprint(stderr, "(reprune changes %x from %x to %x)\n", pp, *pp, p->x.kids[n]));
		*pp = p->x.kids[n];
		x = *p;
		(IR->x.target)(&x);
	}
	return k + 1;
}

The call to target is what is triggering those inserting load/targeting log entries. The copies are being created, but they are immediately discarded due to the fact that the parent node (in our case, the MUL) is copied to a local variable which is then passed to target. Simply removing that copy seems like it is the first step towards fixing this issue, but it is certainly not the last: anytime a new node is inserted, it needs to be passed through rewrite. However, that may not be as simple as it sounds, as rewrite is recursive and already-rewritten nodes are not really supposed to be run through rewrite a second time as per an assertion in that function.

pgavlin added a commit to pgavlin/gigatron-rom that referenced this issue May 21, 2019
The results of targeting inserted reloads were being silently discarded,
which resulted in kervinck#76. These changes stop discarding these results and
label/reduce any resulting copies.
@pgavlin
Copy link
Collaborator

pgavlin commented May 21, 2019

A set of changes that fixes this specific instance is here: master...pgavlin:pgavlin/spill

I am not terribly confident in these changes, however. I think that the result of target needs to be checked so both kids can be labeled and reduced as necessary.

@kervinck
Copy link
Owner Author

I have some doubts as well. It feels not quite right to protrude Gigatron changes outside gt1.md all the way into lburg...

@pgavlin
Copy link
Collaborator

pgavlin commented May 23, 2019

The lburg changes are pretty innocuous. I had already made some target-independent changes to make it possible to generate code from callbacks based on a pattern match; the changes in that branch simply enhance the logging to make it clearer which callback is being invoked.

This small set of edits is the fix: master...pgavlin:pgavlin/spill#diff-26094ddff7f8a9016fd6d0f421cb26d2R830

The fix should be target-independent as well, but I'm not totally confident in that.

@kervinck
Copy link
Owner Author

For my understanding (my book hasn't arrived yet, and I dearly start to need it):

  1. what's the difference between 'action' and 'emit2()'? Are these called at a different stage?
  2. in the patch I spotted it can now generate moves through 'ht'. Does it generate such moves now for this specific case? (Because I think it's never needed as long as vAC is the last argument to such operators).

I tried to test, but when I clone your repo en checkout 'spill', 'make lcc' works but 'make ctest' crashes:

make ctest
Utils/lcc/build/lcc -ILibs -c Libs/Example.c -o Libs/Example.o
Utils/lcc/build/lcc -ILibs -c Libs/Gigatron/BusyWait.c -o Libs/Gigatron/BusyWait.o
Utils/lcc/build/lcc -ILibs -c Libs/Gigatron/ClearScreen.c -o Libs/Gigatron/ClearScreen.o
Utils/lcc/build/lcc -ILibs -c Libs/Gigatron/Newline.c -o Libs/Gigatron/Newline.o
Utils/lcc/build/lcc -ILibs -c Libs/Gigatron/PutChar.c -o Libs/Gigatron/PutChar.o
Utils/lcc/build/lcc -ILibs -c Libs/Gigatron/Random.c -o Libs/Gigatron/Random.o
Utils/lcc/build/lcc -ILibs -c Libs/Gigatron/WaitKey.c -o Libs/Gigatron/WaitKey.o
Utils/lcc/build/lcc -ILibs -c Libs/ctype/isdigit.c -o Libs/ctype/isdigit.o
Utils/lcc/build/lcc -ILibs -c Libs/errno/errno.c -o Libs/errno/errno.o
Utils/lcc/build/lcc -ILibs -c Libs/stdio/fclose.c -o Libs/stdio/fclose.o
Utils/lcc/build/lcc: fatal error in Utils/lcc/build/rcc
make: *** [Libs/stdio/fclose.o] Error 1

I will be offline for a couple of days for surgery. Please feel free to commit to the main repo if you think it's good. You're the best judge of that :-)

@kervinck
Copy link
Owner Author

kervinck commented Jun 1, 2019

The book was supposed to arrive today, but it didn't. I've been doing some testing instead, focusing on the effect of just the proposed reprune() change for now (that seems to be the meat of the fix):

int main(void)
{
  int a, b, c;
  return a * (b / c);
}

Gives:

asm.defun('_main')
asm.push()
asm.ldwi(0x2)
asm.call('enter')
asm.ldw('sp')
asm.subi(10)
asm.stw('sp')
asm.ldi(2)
asm.call('ldloc')
asm.stw('r1')           
asm.stw('ha')           <-- spilling r1, but how does it know vAC equals r1?
asm.ldi(8)
asm.call('stloc')
asm.ldi(4)
asm.call('ldloc')
asm.stw('r1')
asm.ldi(6)
asm.call('ldloc')
asm.call('div')
asm.stw('ha')
asm.ldi(10)
asm.call('stloc')
asm.ldi(8)
asm.call('ldloc')
asm.stw('r1')
asm.ldi(10)
asm.call('ldloc')
asm.call('mul')
asm.stw('rv')
asm.label('.L1')
asm.ldi(10)
asm.addw('sp')
asm.stw('sp')
asm.ldwi(0x8000)
asm.call('leave')
asm.ldw('rv')
asm.popret()

The above works, and is roughly what you expect. The asm.stw('r1') is unneeded, but fine, I can imagine why it puts it there. However, I mistrust the code that follows and spills r1, annotated in the above. Even if it works out here, it looks "funny": I would expect a simple and naive asm.ldw('r1') in there before the asm.stw('ha') etc. But this ldw() isn't there. My confusion is "How does it know at that point, when spilling r1, that vAC already contains the value of r1?". It looks like a shortcut I haven't seen it take elsewhere. Hence my mistrust.

@kervinck
Copy link
Owner Author

kervinck commented Jun 1, 2019

Just for laughs, the following from #77 results in a spilling fest.

int main(int c)
{
  if (c == -1)
    return 1;
}

It generates no less than 3 temporary variables.

asm.defun('_main')
asm.push()
asm.ldwi(0x80)
asm.call('enter')
asm.ldw('sp')
asm.subi(6)
asm.stw('sp')
asm.ldi(10)
asm.call('ldloc')
asm.stw('ha')
asm.ldi(2)
asm.call('stloc') <-- one
asm.ldwi(-1)
asm.stw('ha')
asm.ldi(4)
asm.call('stloc') <-- two
asm.ldi(2)
asm.call('ldloc')
asm.stw('ha')
asm.ldi(6)
asm.call('stloc') <-- three
asm.ldi(4)
asm.call('ldloc')
asm.stw('r7')
asm.ldi(6)
asm.call('ldloc')
asm.subw('r7')
asm.jne('.L2')
asm.ldi(1)        <-- return 1
asm.stw('rv')
asm.ldwi('.L1')
asm.jr()
asm.label('.L2')
asm.ldi(0)        <-- an overzealous LCC. It could just as well leave 'rv' uninitialised.
asm.stw('rv')
asm.label('.L1')
asm.ldi(6)
asm.addw('sp')
asm.stw('sp')
asm.ldwi(0x200)
asm.call('leave')
asm.ldi(2)
asm.addw('sp')
asm.stw('sp')
asm.ldw('rv')
asm.popret(

One temp is to store the constant -1. The others for getting the operands for subtraction in the right order. Ironically, the order doesn't matter, because we're only testing for equality. One mitigation is to replace SUB with BXOR here. (I already have that prepared in another workspace).

@kervinck
Copy link
Owner Author

kervinck commented Jun 1, 2019

I found this article by Fraser and Hanson on LCC's register spilling:
Simple Register Spilling in a Retargetable Compiler
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.14.6848&rep=rep1&type=pdf

This might be helpful to understand what is going on in gen.c

@kervinck
Copy link
Owner Author

kervinck commented Jun 1, 2019

This is a nice reference article on the code generation:
The lcc 4.x Code-Generation Interface
https://drhanson.s3.amazonaws.com/storage/documents/interface4.pdf

And here a report on retargeting LCC to a DSP. It might be of interest as they had to overcome register constraints as well. LCC is introduced in chapter 4.
Retargeting a C Compiler for a DSP Processor
http://www.diva-portal.org/smash/get/diva2:19902/FULLTEXT01

@kervinck
Copy link
Owner Author

kervinck commented Jun 8, 2019

Party time: my LCC book finally arrived

@kervinck
Copy link
Owner Author

kervinck commented Jun 27, 2019

Due to personal circumstances I'm unable to focus on register allocation. Too difficult.

[ As a side project, we have 6502 emulation in a pretty advanced state now. This might open up a path to use cc65 as a C compiler. ]

@lb3361
Copy link
Contributor

lb3361 commented Aug 25, 2022

This issue should be closed since glcc fixes it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants