(classic) Complete gameboy support #1287

Open
suborb opened this issue Sep 28, 2019 · 17 comments

Comments

@suborb
Member

@suborb suborb commented Sep 28, 2019

The following need to be completed for a full target:

  • Basic gencon (tile)
  • "Native" terminal output
  • joystick() maps into GBDK joypad code
  • Selection of font at compile time by defining CRT_FONT
  • Setting of font/udgs via ioctl()
  • Gencon mode switching (tile, APA)
  • Gencon "colour" setting (APA)
  • Lores plotting via gencon
  • [ ] Get MBF32 running (tricky due to lack of JP P/M/PO)
  • [ ] Hires plotting via APA - accessible via gbdk, so hold off
  • Fixup return values to match SDCC abi (or fix up in compiler for the __smallc case?)
    • ctype
    • stdio
    • string
    • stdlib
  • Import SDCC crt0 library code
  • Get SDCC compiles working
  • Banking
  • 32 bit multiplication

The following should be done:

  • Ticks: Properly emulate the flags behaviour
@suborb suborb added this to Todo in z88dk 2.0 via automation Sep 28, 2019
@suborb suborb moved this from Todo to In progress in z88dk 2.0 Sep 30, 2019
@suborb
Member Author

@suborb suborb commented Oct 1, 2019

Re banking. It looks like sdcc supports #pragma bank NN which sets the code segment to _CODE_NN and leaves the bss, data and rodata sections as usual.

Within z88dk we use BANK_NN as the section name for the SMS target and I've copied that over to the Gameboy target. These are targeted with the appropriate pragma to change the section names. This suggests that sccz80 should support #pragma bank as a shortcut for setting code and rodata sections.

I think placing all z88dk library code within the always paged in bank makes life easier so we'll only end up with user code being manually placed within banks.

Conventionally, GBDK used the __banked annotation to support banked calls; this resulted in the following code being emitted:

call banked_call
defw [function address]
defw [bank]

Here banked_call switches pages and offsets the stack; this stack offset can be handled by zsdcc and sccz80 with the annotation __z88dk_params_offset(X).

Function address can be populated using a regular z80asm patch expression. The bank could be populated by an appmake stage as follows:

  1. Lookup the address of banked_call in the .map file: XXYY
  2. Search the binary for sequences of: CDYYXX [AA BB] [00 00] where BBAA will be the address of a function.
  3. Reverse search for BBAA in the .map file, find the section name, parse and deduce the bank.

This does, however, feel fragile. I'd be worried about functions having clashing addresses in different banks (which will happen for functions at the start of a bank unless we fudge the first usable address within a bank).
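
The three steps above, and the clash they run into, can be sketched roughly as follows. This is only a model: the map layout (a dict keyed by section and symbol name), the helper names and the example symbols are all assumptions made for illustration, not appmake's actual code or map format.

```python
import re

def find_banked_call_sites(binary: bytes, banked_call_addr: int):
    """Yield (offset, function_address) for each CD lo hi aa bb 00 00 pattern."""
    lo, hi = banked_call_addr & 0xFF, banked_call_addr >> 8
    pattern = bytes([0xCD, lo, hi])          # call banked_call
    pos = binary.find(pattern)
    while pos != -1:
        operands = binary[pos + 3:pos + 7]   # defw function, defw bank
        if len(operands) == 4 and operands[2:] == b"\x00\x00":
            yield pos, operands[0] | (operands[1] << 8)
        pos = binary.find(pattern, pos + 1)

def banks_for_address(symbols, address):
    """Return every bank whose BANK_NN section holds a symbol at `address`."""
    banks = []
    for (section, _name), addr in symbols.items():
        match = re.fullmatch(r"BANK_(\d+)", section)
        if match and addr == address:
            banks.append(int(match.group(1)))
    return sorted(banks)

# Hypothetical map contents: two functions that both start at 0x4000.
symbols = {
    ("code", "banked_call"): 0x0150,
    ("BANK_01", "draw_title"): 0x4000,
    ("BANK_02", "draw_level"): 0x4000,
    ("BANK_02", "play_tune"): 0x4123,
}

# call banked_call / defw 0x4123 / defw 0, then call banked_call / defw 0x4000 / defw 0
binary = bytes([0xCD, 0x50, 0x01, 0x23, 0x41, 0x00, 0x00,
                0xCD, 0x50, 0x01, 0x00, 0x40, 0x00, 0x00])

sites = list(find_banked_call_sites(binary, symbols[("code", "banked_call")]))
resolved = {hex(addr): banks_for_address(symbols, addr) for _, addr in sites}
print(resolved)
```

An address that resolves to more than one bank cannot be patched reliably, which is the fragility described above. A byte scan like this could also match data that merely looks like a call.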

Thus, I think we will need z80asm support for this: effectively a way of parsing the section name of a symbol and allowing that to be used within source code. Thoughts, @pauloscustodio?

@feilipu
Collaborator

@feilipu feilipu commented Oct 2, 2019

Re Banking.

This matches the way that it is done on YAZ180 with the __call_far function. Though I use call, defw, defb (signed relative), for the bank call.

Using a signed bank definition allows easy relative bank calls (i.e. call a bank above, or a bank below the current bank), and zero refers to the current bank. But this is not baked in concrete. Happy to align to whatever becomes the standard way to do this.
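
As a rough illustration of the signed relative bank byte, here is a small sketch. The function names, the bank_count parameter and the range check are assumptions for the example; it does not describe how the YAZ180 __call_far code actually behaves.

```python
def to_signed8(value: int) -> int:
    """Interpret a byte (0..255) as a two's-complement signed value."""
    value &= 0xFF
    return value - 0x100 if value & 0x80 else value

def resolve_bank(current_bank: int, rel_byte: int, bank_count: int) -> int:
    """Apply a signed relative bank byte to the current bank.

    A relative value of zero means "stay in the current bank".
    """
    target = current_bank + to_signed8(rel_byte)
    if not 0 <= target < bank_count:
        raise ValueError(f"relative bank {to_signed8(rel_byte)} leaves the valid range")
    return target

print(resolve_bank(3, 0x00, 16))  # same bank
print(resolve_bank(3, 0x01, 16))  # one bank above
print(resolve_bank(3, 0xFF, 16))  # one bank below
```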

Would it be sensible to make the addressing somewhat linear with a defq address definition, where the lower 16 bits are addressing and the upper 16 are either bank identifier or linear address space, depending on the platform implementation?

The patching mechanism for banks looks quite like the REL format, with the bitmap attached to indicate items (call, jp, etc) to be patched.

@pauloscustodio
Member

@pauloscustodio pauloscustodio commented Oct 2, 2019

Re Banking.

Two options below. Please comment or suggest alternatives.

Option 1: Extend z80asm to handle 24- or 32-bit addresses, where the lower 16 bits are the address seen by the CPU, and the upper 8 or 16 bits are the bank id (platform dependent, e.g. value to be written to the bank register). CALL, JP, ... need to accept the 32-bit address and ignore the upper 16 bits. For the Spectrum 128K (which I know better than the GameBoy), one could write:

  section xxx1		; name is not relevant
  org $00C000       ; select page 0 at address $C000
  
  public func1
  func1: ...
  
  section xxx2
  org $01C000        ; select page 1 at address $C000
  
  extern func1
  ...
  call banked_call   ; platform-dependent function that switches banks and calls the function
  defp func1         ; patch in a 24-bit address (Spectrum 128k); use defq for a 32-bit address

This solution is easy to implement in z80asm; all the infrastructure is in place, we just need to handle the 32-bit addresses gracefully.
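
To make the address layout concrete, here is a small sketch of how such a combined value splits into bank id and CPU address, and what bytes a 24- or 32-bit patch would contain. The helper names are made up for the example, and little-endian storage (as for defw) is an assumption.

```python
def far_address(bank: int, cpu_address: int) -> int:
    """Combine a bank id (upper bits) with the 16-bit address the CPU sees."""
    return (bank << 16) | (cpu_address & 0xFFFF)

def split_far(address: int):
    """Return (bank, cpu_address) for a combined address."""
    return address >> 16, address & 0xFFFF

def defp_bytes(address: int) -> bytes:
    """24-bit patch value, assumed little-endian."""
    return (address & 0xFFFFFF).to_bytes(3, "little")

def defq_bytes(address: int) -> bytes:
    """32-bit patch value, assumed little-endian."""
    return (address & 0xFFFFFFFF).to_bytes(4, "little")

target = far_address(1, 0xC000)           # page 1 at address $C000, i.e. $01C000
bank, cpu = split_far(target)
print(hex(target), bank, hex(cpu))
print(defp_bytes(target).hex(" "))        # what a 24-bit patch would hold
print(hex(split_far(target)[1]))          # what a plain CALL/JP operand would use
```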

Option 2: Let the linker automatically resolve banked calls.

Same as above, use the upper 16 bits of a 32-bit address as the bank id. Create new opcodes for banked calls (call24, call32) that reserve space in the object code for the call to the banked_call function (a platform-dependent function, part of the library) and the 24- or 32-bit address.
At link time, include either the call to banked_call and a defp/defq with the called address, or, if the target is in the same bank, a regular call followed by 3 or 4 nops.

  section xxx1		; name is not relevant
  org $00C000       ; select page 0 at address $C000
  
  public func1
  func1: ...
  
  section xxx2
  org $01C000        ; select page 1 at address $C000
  
  extern func1
  
  func2: ...
  ...
  call24 func1       ; assembled & linked as: call banked_call : defp func1
  call24 func2       ; assembled & linked as: call func2 : nop : nop : nop

This solution is a bit more complex, but gives additional flexibility in arranging the code in banks.
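
A rough model of the link-time decision for call24 is shown below. It is only a sketch of the idea: the function name, the address chosen for banked_call and the little-endian byte order are assumptions, and nothing here reflects real z80asm internals.

```python
CALL = 0xCD
NOP = 0x00

def emit_call24(target: int, caller_bank: int, banked_call: int) -> bytes:
    """Return the 6 bytes a linker might emit for `call24 target`.

    `target` carries the bank id above bit 15; `banked_call` is the
    address of the platform trampoline.
    """
    target_bank, target_cpu = target >> 16, target & 0xFFFF
    if target_bank == caller_bank:
        # Same bank: ordinary call, then pad the reserved space with nops.
        return bytes([CALL]) + target_cpu.to_bytes(2, "little") + bytes([NOP] * 3)
    # Different bank: call the trampoline, followed by the 24-bit address.
    return (bytes([CALL])
            + (banked_call & 0xFFFF).to_bytes(2, "little")
            + (target & 0xFFFFFF).to_bytes(3, "little"))

BANKED_CALL = 0x8000          # hypothetical trampoline address
func1 = 0x00C000              # lives in page 0
func2 = 0x01C010              # lives in page 1, same page as the caller

print(emit_call24(func1, caller_bank=1, banked_call=BANKED_CALL).hex(" "))
print(emit_call24(func2, caller_bank=1, banked_call=BANKED_CALL).hex(" "))
```

Both forms occupy the same number of bytes, which is what lets the choice be deferred to link time.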

@suborb
Member Author

@suborb suborb commented Oct 2, 2019

I quite like option 1 since it offers the compiler (or assembler author) more control over how to invoke a function. I suspect there may well be many ways of actually invoking the trampoline (via a rst is going to be an obvious option, for example), so the pseudo-opcode doesn't feel quite right.

The automatic conversion in this case:

call24 func2       ; assembled & linked as: call func2 : nop : nop : nop

would be incorrect if func2 was a C function with parameters (the parameters wouldn't be at the expected stack offset).

pauloscustodio added a commit that referenced this issue Oct 2, 2019
As a side-effect of this change, z80asm no longer emits warnings for 16-bit values out of range.
@pauloscustodio
Member

@pauloscustodio pauloscustodio commented Oct 2, 2019

Option 1 is implemented in the z80asm_32bit_addresses branch. Please let me know of any issues.

@suborb
Member Author

@suborb suborb commented Oct 3, 2019

Thank you Paulo, I'll give it a try this evening I hope.

@suborb
Member Author

@suborb suborb commented Oct 3, 2019

The z80asm changes work for my requirements. I've successfully had a banked call execute and return a value.

I don't think losing the range checking is too much bother so please feel free to merge.

pauloscustodio added a commit that referenced this issue Oct 3, 2019
Allow 32-bit addresses in z80asm (see #1287, #1292, #1290)
@suborb suborb moved this from In progress to Done in z88dk 2.0 Feb 3, 2020
@basxto

@basxto basxto commented Jul 27, 2020

It looks like sdcc supports #pragma bank NN which sets the code segment to _CODE_NN

That’s correct. -bo also generates _CODE_N and -ba generates _DATA_N (SRAM banks).

Inside the linker _CODE_N becomes 0xN4000 and _DATA_N becomes 0xNA000, which get mapped to the real rom addresses when the IHX gets created.
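
For the ROM side, the mapping can be sketched as below. It assumes the usual Game Boy layout of 16 KiB ROM banks switched into 0x4000-0x7FFF and stored consecutively in the ROM file; whether the IHX conversion does exactly this is an assumption, and the function names are made up for the example.

```python
ROM_BANK_SIZE = 0x4000        # 16 KiB window at 0x4000-0x7FFF

def code_virtual_address(bank: int, offset: int = 0) -> int:
    """Linker-side address of _CODE_<bank>: the bank number sits above bit 15."""
    return (bank << 16) | (0x4000 + offset)

def rom_file_offset(virtual: int) -> int:
    """Map a banked linker address to an offset in the ROM image."""
    bank, cpu = virtual >> 16, virtual & 0xFFFF
    if not 0x4000 <= cpu < 0x8000:
        raise ValueError("address is outside the switchable ROM window")
    return bank * ROM_BANK_SIZE + (cpu - 0x4000)

start_of_bank_2 = code_virtual_address(2)
print(hex(start_of_bank_2))                  # linker view of _CODE_2
print(hex(rom_file_offset(start_of_bank_2))) # where those bytes would sit in the ROM file
```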

Adding bank and address after the call became --legacy-banking in SDCC trunk.

EDIT:

Fixup return values to match SDCC abi

What does that mean? gbdk-n followed the return registers (e, de, hlde) that SDCC uses on gbz80.

@suborb
Member Author

@suborb suborb commented Jul 27, 2020

Fixup return values to match SDCC abi

What does that mean? gbdk-n followed the return registers (e, de, hlde) that SDCC uses on gbz80.

The z88dk libraries and the z80 targets return values in l, hl and dehl, so for the libraries to work with zsdcc (gbz80), 8/16-bit library functions return their value in both de and hl.

@basxto

@basxto basxto commented Jul 28, 2020

Does that also mean that you have __z88dk_fastcall on gbz80 now?

@suborb
Member Author

@suborb suborb commented Jul 28, 2020

We haven’t changed anything in sdcc regarding gbz80. However, sccz80 supports fastcall and callee functions for gbz80.

@basxto

@basxto basxto commented Jul 28, 2020

Well, if __z88dk_fastcall were to be implemented in sdcc, it should be compatible enough to call functions compiled with z88dk.

I was nearly done with implementing __z88dk_fastcall for gbz80 so that parameters are put into the registers used for return values. But I’ll try to implement a different kind of fastcall then; l for 8-bit parameters and return values is quite inconvenient if it’s paired with bigger functions, which have variables on the stack.

@basxto

@basxto basxto commented Mar 10, 2021

Well, back to this.
Last week I finally understood that z88dk has its own compiler, assembler etc. besides its patched sdcc.
Loading values into four registers for an 8-bit return value is indeed quite undesirable.
I suspect __smallc is solely for SCCZ80 compatibility?

I could try to implement __smallc with return registers hl, dehl, the way SCCZ80 generates them. (Even though __smallc gets accepted, it does not push 8b as 16b currently.)
And __z88dk_fastcall using return registers as parameters:

  • __z88dk_fastcall using e, de, hlde
  • __z88dk_fastcall __smallc using hl, dehl
@suborb
Member Author

@suborb suborb commented Mar 10, 2021

I'm all over the place at the moment, working on far too many things at the same time, so I've not had a chance to fix the other issue - apologies.

Yes, __smallc is purely for sccz80 compatibility (likewise sccz80 has __z88dk_sdccdecl for sdcc compatibility).

There are two cases to consider for mixing/matching: libraries and user code. Although it's theoretically possible to mix-and-match compilers for user code in classic, it's not particularly well tested and there are caveats (there's a wiki page somewhere but I can't find it at the moment).

Getting library interop working is important though. To get sdcc to work together with the libraries I had to make the following modifications:

  • The libraries never take a char parameter - everything got promoted up to an int
  • The library routines that return an int, return it in hl as well as de.
  • The library routines that return a long just don't work (there's not many of them though)

As an explanatory note, for library routines, sdcc enters via the label _strlen and sccz80 via strlen, which does allow the library to handle the register requirements. Fixing everything up is obviously extremely tedious, though, so the fewer times we need to do that the better.

My feeling is that the priority order is:

  1. Fixing up __smallc with return registers in hl, dehl. That will allow the library routines that return a long value to work correctly with sdcc without needing any workarounds.

  2. The input registers for __z88dk_fastcall. For library work (which is in asm) the input registers could be worked around with an ex de, hl equivalent, which isn't particularly onerous/expensive.

  3. The promotion from char to int on library functions is suboptimal but is an easy workaround, so it comes last.

@basxto

@basxto basxto commented Mar 10, 2021

I'm all over the place at the moment, working on far too many things at the same time, so I've not had a chance to fix the other issue - apologies.

No problem, it's not urgent, I was just playing around a bit with it and noticed that stuff.

The libraries never take a char parameter - everything got promoted up to an int

Is that a general rule for sdcc or for gbz80 specifically?

3 is part of 1, sdcc user guide explicitly says that __smallc is left to right and that 1 byte arguments are passed as 2 bytes, with the value in the lower byte. Though it does not say anything about the return value. Do they never return <2B?

Do I have to care about the upper byte of chars or can I just push trash into them?

The equivalent to ex de, hl is

ld a, d
ld d, h
ld h, a
ld a, e
ld e, l
ld l, a

which are 6 bytes and 24 cycles wasted.
Or if you go for size and do

push de
push hl
pop de
pop hl

it would be 4 bytes and 56 cycles wasted.
Fetching a 16b argument from stack is just 5b and 32c (+1b 16c for the push):

ldhl sp, #2
ld a, (hl+)
ld h, (hl)
ld l, a
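
The byte and cycle figures for the three sequences above are easy to re-check with a small tally. The per-instruction costs below are taken from commonly published Game Boy CPU timing tables and should be treated as an assumption to verify; the instruction labels are just keys for this sketch.

```python
# (bytes, cycles) per instruction class
COST = {
    "ld r,r": (1, 4),
    "push rr": (1, 16),
    "pop rr": (1, 12),
    "ldhl sp,#n": (2, 12),
    "ld a,(hl+)": (1, 8),
    "ld r,(hl)": (1, 8),
}

def total(sequence):
    """Sum the size and cycle count of a list of instruction classes."""
    size = sum(COST[op][0] for op in sequence)
    cycles = sum(COST[op][1] for op in sequence)
    return size, cycles

swap_via_a = ["ld r,r"] * 6                                      # six register moves through a
swap_via_stack = ["push rr", "push rr", "pop rr", "pop rr"]      # push de/hl, pop de/hl
fetch_argument = ["ldhl sp,#n", "ld a,(hl+)", "ld r,(hl)", "ld r,r"]

for name, seq in [("swap via a", swap_via_a),
                  ("swap via stack", swap_via_stack),
                  ("fetch 16b argument", fetch_argument)]:
    print(name, total(seq))
```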

My main interest is indeed to have a e,de,hlde __z88dk_fastcall, but I also want z88dk to work somehow.

And this is probably a bug

_get_mode:
ld hl,__mode
ld e,(hl)
ld l,a
ld h,0
ret

@suborb
Member Author

@suborb suborb commented Mar 10, 2021

The libraries never take a char parameter - everything got promoted up to an int

Is that a general rule for sdcc or for gbz80 specifically?

It's not desirable but came out of necessity for the interop - the z80 1b pushing was only fixed a couple of years ago.

| 3 is part of 1, sdcc user guide explicitly says that __smallc is left to right and that 1 byte arguments are passed as 2 bytes, with the value in the lower byte. Though it does not say anything about the return value. Do they never return <2B?

I suspect that everything in the libraries is rounded up to be 2b return value (of which only 1b is of significance).

| The equivalent to ex de, hl is...

Yes, it's not pretty; for single parameter entry it's just 3 bytes since we just need to do ld hl, de. I was just mentioning it since the solution can be staged and can deliver value without having to do everything all at once.

| And this is probably a bug

Oh yes, thank you.

@basxto

@basxto basxto commented Mar 11, 2021

It looks like __smallc already successfully pushes 1B values as 2B; I just did not recognize them.
It does

dec	sp
ld	a, #0x04
push	af
inc	sp

which is a weird way of doing

ld	e, #0x04
push	de

So it really only needs different return registers.
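
A tiny stack model shows why the two sequences above are equivalent from the callee's point of view. This is a sketch only: the Stack class, the starting stack pointer and the placeholder values for the flags, d and uninitialised memory are all assumptions chosen for illustration.

```python
class Stack:
    def __init__(self, sp=0xFFFE, fill=0xEE):
        self.mem = {}
        self.sp = sp
        self.fill = fill          # stand-in for whatever was already in memory

    def read(self, addr):
        return self.mem.get(addr, self.fill)

    def push16(self, high, low):
        """Push a register pair: high byte first, so low ends up at sp."""
        self.sp -= 1
        self.mem[self.sp] = high
        self.sp -= 1
        self.mem[self.sp] = low

def sdcc_sequence(value, flags=0xB0):
    s = Stack()
    s.sp -= 1                     # dec sp
    a = value                     # ld a, #value
    s.push16(a, flags)            # push af (a is the high byte of af)
    s.sp += 1                     # inc sp
    return s

def simple_sequence(value, d=0x77):
    s = Stack()
    e = value                     # ld e, #value (d holds something unrelated)
    s.push16(d, e)                # push de
    return s

for s in (sdcc_sequence(0x04), simple_sequence(0x04)):
    print(hex(s.sp), hex(s.read(s.sp)), hex(s.read(s.sp + 1)))
```

In both cases the argument occupies two bytes with the value in the lower byte, and the upper byte is whatever happened to be there.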

Implementing fastcall would be trivial, the return would probably only need to care about __smallc.

It's not designed for changing the registers of calling conventions dynamically, but I can maybe treat them as completely new calling conventions (smallc_return, smallc_fastcall).
