lcc: optimisations #70

Open
kervinck opened this issue May 9, 2019 · 9 comments

@kervinck
Owner

kervinck commented May 9, 2019

[Note: This issue is an aggregation for optimisations of the emitted code. I want to park all ideas here for future reference. Regarding priorities in LCC the order should be 1. Correctness, 2. Usability, 3. Optimisations.]

Ideas (some simpler than others; some realistic, some nonsense):

  1. POKE is often preceded by ANDI 0xff, but this is almost never needed
  2. entermask/leavemask can sometimes use LDI
  3. many cases of stw(x) + ldw(x) or other way around. Can be optimised
  4. eliminate ldw(vAC) and stw(vAC)
  5. if we have a known value in vAC, use SUBI to get a small negative number
  6. comparisons eq/ne with small negative constants can avoid LDWI
  7. don't 'pusha' each argument, but allocate the argument area in one go
  8. option to use vCPU stack as data stack (lots of work, unclear if it will bring anything)
  9. use DEF for string pointer initialisers
  10. is it feasible to use INC more often?
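
As an illustration of idea 3, here is a minimal peephole sketch. It assumes the emitted code is available as a list of (mnemonic, operand) tuples, which is a hypothetical representation and not what the lcc back end actually uses. It also assumes that no label or branch target sits between the two instructions and that `x` is ordinary memory.

```python
def peephole(code):
    """Drop a load or store that merely repeats the previous instruction's effect.

    `code` is a list of (mnemonic, operand) tuples (hypothetical format).
    """
    out = []
    for op, arg in code:
        prev = out[-1] if out else None
        # After stw(x), vAC still holds the stored value, so ldw(x) is redundant.
        if op == 'ldw' and prev == ('stw', arg):
            continue
        # After ldw(x), writing vAC back into x changes nothing.
        if op == 'stw' and prev == ('ldw', arg):
            continue
        out.append((op, arg))
    return out


example = [('ldwi', 0x1234), ('stw', 'x'), ('ldw', 'x'), ('addw', 'y'), ('stw', 'z')]
optimised = peephole(example)
```

A real pass would also have to stop at labels, since a jump could arrive at the ldw with a different value in vAC.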
@kervinck
Owner Author

Low priority

@kervinck
Owner Author

kervinck commented May 15, 2019

Two ideas to reduce the number of thunk functions in rt.py:

  1. Move the start of the pixel lines to offset 0x60:

videoTable[1] = 0x60;

There is no need for code to do this at run time: it can be a 1-byte segment somewhere in the .gt1 file (near the end?). Pixels then run from 0x60 to 0xff, while code can live at offset 0 instead of 0xa0. This eliminates the need for thunk1. Adjust PutChar, Newline and ClearScreen accordingly. These actually become slightly simpler, because the end-of-line test is simpler.

  2. thunk2 hops over from the end of page 4 into page 8. We can do that by placing thunk2 at the beginning of page 5 (C stack area), and jump in there with a CALL thunk0 from page 4.
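0500 2b tt     STW  tt      ; save vAC in a temporary
0502 11 00 08  LDWI $0800   ; load the address of page 8
0505 2b 1a     STW  vLR     ; make it the return address
0507 21 tt     LDW  tt      ; restore vAC
0509 ff        RET          ; continue at $0800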

In essence, the above eliminates four zero page bytes and one helper function.
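
To double-check the addresses in the thunk2 listing, here is a small sketch that lays out the same five instructions from 0x0500. The opcode bytes are the ones shown in the listing; the instruction with opcode 0x11 takes a 16-bit operand (LDWI in the vCPU instruction set). `TT` is a placeholder zero page address picked for illustration only, and the `assemble` helper is hypothetical, not part of rt.py.

```python
# Opcode byte and operand size, taken from the listing above.
OPCODES = {
    'STW': (0x2b, 1),
    'LDWI': (0x11, 2),
    'LDW': (0x21, 1),
    'RET': (0xff, 0),
}

VLR = 0x1a  # zero page address of vLR, as shown in the listing
TT = 0x30   # placeholder for the temporary 'tt' (assumption)


def assemble(program, origin):
    """Return (address, encoded bytes) pairs for (mnemonic, operand) tuples."""
    listing = []
    addr = origin
    for mnemonic, operand in program:
        opcode, nbytes = OPCODES[mnemonic]
        # Operands are stored low byte first.
        tail = operand.to_bytes(nbytes, 'little') if nbytes else b''
        encoded = bytes([opcode]) + tail
        listing.append((addr, encoded))
        addr += len(encoded)
    return listing


thunk2 = [('STW', TT), ('LDWI', 0x0800), ('STW', VLR), ('LDW', TT), ('RET', None)]
listing = assemble(thunk2, 0x0500)
for addr, encoded in listing:
    print('%04x %s' % (addr, encoded.hex(' ')))
```

The whole thunk occupies ten bytes at the start of page 5.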

@kervinck
Owner Author

kervinck commented May 17, 2019

  1. Use XORW instead of SUBW before the == and != operators.

(Sometimes the compiler even juggles the order of SUBW operands...)
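
A quick sanity check of why XORW can stand in for SUBW here, modelled on 16-bit words in Python. The function names are made up for this sketch. Both results are zero exactly when the operands are equal, and XOR is commutative, so operand order stops mattering.

```python
MASK = 0xffff  # vAC is a 16-bit register


def equal_via_subw(a, b):
    """Equality test as done today: subtract, then compare against zero."""
    return ((a - b) & MASK) == 0


def equal_via_xorw(a, b):
    """Proposed equality test: exclusive-or, then compare against zero."""
    return ((a ^ b) & MASK) == 0


samples = [0x0000, 0x0001, 0x00ff, 0x7fff, 0x8000, 0xffff]
agree = all(
    equal_via_subw(a, b) == equal_via_xorw(a, b) == (a == b)
    for a in samples
    for b in samples
)
```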

@kervinck
Owner Author

  1. MULI2(CNSTI2(1), ...

Simplify. Same for CNSTI2(0) and MULU2, DIVXX etc
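
A rough sketch of the simplification, using nested tuples as a stand-in for lcc's tree nodes. The tuple format and the `simplify` helper are hypothetical, and DIVI2, DIVU2 and CNSTU2 are assumed spellings for the "DIVXX etc" cases. Folding `x * 0` to 0 is only valid when `x` has no side effects, which this sketch simply assumes.

```python
MUL_OPS = {'MULI2', 'MULU2'}
DIV_OPS = {'DIVI2', 'DIVU2'}      # assumed names for the "DIVXX" cases
CONST_OPS = {'CNSTI2', 'CNSTU2'}


def const_value(node):
    """Return the value of a constant node, or None for anything else."""
    return node[1] if node[0] in CONST_OPS else None


def simplify(node):
    """Fold multiplications by 0 or 1 and divisions by 1.

    Only multiply and divide nodes are rewritten; other nodes are returned
    unchanged and are not descended into.
    """
    op = node[0]
    if op not in MUL_OPS | DIV_OPS:
        return node
    left, right = simplify(node[1]), simplify(node[2])
    if op in MUL_OPS:
        if const_value(left) == 1:
            return right
        if const_value(right) == 1:
            return left
        # x * 0 == 0, assuming x has no side effects.
        if const_value(left) == 0:
            return left
        if const_value(right) == 0:
            return right
    elif const_value(right) == 1:
        # x / 1 == x
        return left
    return (op, left, right)


tree = ('MULI2', ('CNSTI2', 1), ('INDIRI2', 'x'))
folded = simplify(tree)
```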

@kervinck
Owner Author

  1. Remove 'rv' from rt.py, use 'ha' instead for return values

@kervinck
Owner Author

kervinck commented Jun 1, 2019

  1. More aggressive purging of unused library functions. For example:
int main(void)
{
  return 0;
}

This still gives a .gt1 file of more than 4 KB in size. It seems that references from other functions are not purged (e.g. div in rt.py references divu, and divu is never purged because of this?). Also, some references come from the data space, such as the flush methods in the FILE objects of stdin.c/stdout.c.
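
One way to get more aggressive purging is a reachability pass that starts from main and keeps only what it can reach, so a reference from div to divu counts only if div itself is kept. This is a minimal sketch: the `references` table and its symbol names are made up for illustration and do not reflect how rt.py records references.

```python
def reachable(roots, references):
    """Return every symbol reachable from `roots` by following references."""
    keep = set()
    stack = list(roots)
    while stack:
        name = stack.pop()
        if name in keep:
            continue
        keep.add(name)
        stack.extend(references.get(name, ()))
    return keep


# Hypothetical reference table: symbol -> symbols it refers to.
references = {
    'main': [],
    'div': ['divu'],
    'divu': [],
    'stdout': ['flush'],  # a data object that points at a function
    'flush': [],
}

kept = reachable(['main'], references)
purged = set(references) - kept
```

Data objects are treated like any other symbol here, so a function referenced only from an unused FILE object is purged together with that object.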

@kervinck
Owner Author

kervinck commented Jun 1, 2019

  1. LCC inserts explicit return values in places where "don't care" will work. This makes some code larger than needed.

See this comment for an example: #76 (comment)

@kervinck
Owner Author

  1. Some C11 code crept into src/gt1.md. This was worked around in "Fix lcc compiler compile error" (#97). It would be nicer to make it all ANSI C compliant (aka C89).

@lb3361
Contributor

lb3361 commented Aug 25, 2022

This issue should be closed since glcc already does such optimizations (when they make sense in the new code)
