Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[PCE] Add new platform. #121

Merged
merged 39 commits into from Jun 25, 2023
Merged

[PCE] Add new platform. #121

merged 39 commits into from Jun 25, 2023

Conversation

asiekierka
Copy link
Contributor

@asiekierka asiekierka commented Jun 18, 2023

The PCE platform corresponds to the NEC PC Engine, also known as the TurboGrafx-16 in the US. It is an interesting target for llvm-mos due to its particularly fast (7.16MHz) 6502 CPU, as well as large (up to 1MB; up to 8.5MB with extra mapper logic - not implemented here) ROM size and flexible banking.

The pull request will be marked as ready once I've collected enough feedback on it. Feel free to comment on what's already here, though - I'd be more than happy to receive feedback!

TODO (for this PR):

  • Provide a solution for banking.
  • Provide a solution for interrupt handling, modeled after the NES target.
  • Add just enough VDC/VCE functions to allow bringing up a screen. Maybe.

Prerequisite PRs:

Documentation: https://llvm-mos.org/wiki/PCE_target

Banking is not yet implemented, so limited to eight-kilobyte ROMs.
This will be improved soon...
@asiekierka asiekierka force-pushed the upstream/pce branch 2 times, most recently from c417b2f to fba0866 Compare June 18, 2023 13:26
- Provide missing one-bank getters and setters.
- Use C where using ASM provides no "bank chain" size advantage to allow for
  inlining opportunities.
@asiekierka asiekierka force-pushed the upstream/pce branch 2 times, most recently from 12a1df2 to 6690434 Compare June 18, 2023 13:28
@asiekierka asiekierka changed the title Add PCE platform. [PCE] Add new platform. Jun 19, 2023
@mysterymath
Copy link
Contributor

The PCE platform corresponds to the NEC PC Engine, also known as the TurboGrafx-16 in the US. It is an interesting target for llvm-mos due to its particularly fast (7.16MHz) 6502 CPU, as well as large (up to 1MB; up to 8.5MB with extra mapper logic - not implemented here) ROM size and flexible banking.

The pull request will be marked as ready once I've collected enough feedback on it. Feel free to comment on what's already here, though - I'd be more than happy to receive feedback!

TODO (for this PR):

  • Provide a solution for banking.
  • Provide a solution for interrupt handling, modeled after the NES target.
  • Add just enough VDC/VCE functions to allow bringing up a screen. Maybe.

Prerequisite PRs:

LMA setup:

Bits Description
31 - 24 Group. 0x00 => base PCE address space; 0x01 => PCE card RAM address space (CD-ROM, Populous); 0x02 - 0x11 => "Street Fighter 2" mapper address space
23 - 16 Bank. For group 0x00: 0x00 - 0x7F => ROM, 0xF7 => CD backup RAM, 0xF8 => RAM, 0xF9 - 0xFB => SuperGrafx RAM, 0xFF => I/O. For group 0x01: 0x40 - 0x5F => RAM (Populous), 0x68 - 0x7F => RAM (Super System Card), 0x80 - 0x87 => RAM (PC Engine CD). For groups 0x02-0x11: 0x40 - 0x7F => ROM.
15 - 0 VMA

Is there a quick write-up I could read on the semantics of the vbank system? It would be useful to have documentation of the form avaiable on the Wiki for the NES targets; this would make it much easier to review this, and should help folks build a mental model of how the target works.

E.g. see https://llvm-mos.org/wiki/NES_targets

.global pce_memmove
.global pce_memset

.section .text.pce_rmemcpy, "ax", @progbits
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the little self-modifying assembly I've written in llvm-mos, I just wrote an assembly language routine, and added labels to each instruction to be modified, but placed it in .data rather than .text. Then, another routine (or itself, I suppose) can use those labels to adjust the routine, then simply jump to it. This minimizes the overhead of calling the routine, since only the portions that change between calls are overwritten, and the result is very readable, being actual assembly.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Six of the eight bytes are caller-provided; as such, I believe the performance gain from doing so would be marginal (especially as the routine is laid out in such a way that - at least for memcpy() - two of the six caller-provided bytes are already in the right place at the moment of call), while the RAM use would increase by eight bytes (and we only have 8KB!).

rmemcpy() is a little more complicated as it has to set operands source + length - 1, destination + length - 1, length; rather than source, destination, length; likewise, memset() is a little more complicated as it has to set the first byte manually, then set operations destination, destination + 1, length - 1.

In addition, pce_memop() (which exists to allow access to the alternate transfer modes Increment Alternate and Alternate Increment - see memory.h) uses the opcode as a type operand (there are five variants total) - which would itself prohibit it being "very readable assembly".

The "ideal" way I'd like to resolve this is to emit the block copy instructions (TII/TDD) at the compilation stage if all the operands are known, minimizing the usage of these routines.

It could also be worthwhile to move some of the operand computation to C and rely more on pce_memop() itself, taking advantage of LLVM's optimization passes - but this is out of scope for this PR, which is meant to provide basic functionality.

@asiekierka
Copy link
Contributor Author

Initial wiki page: https://llvm-mos.org/wiki/PCE_target

@asiekierka asiekierka marked this pull request as ready for review June 21, 2023 15:39
@asiekierka
Copy link
Contributor Author

asiekierka commented Jun 21, 2023

Okay, I'm done with this PR for now.

Note: While the target appears generally stable, I can't promise the lack of short-term API breakage as more veteran PC Engine developers look into it and provide their two cents. At the same time, I'd rather wait for the existence of some kind of binary builds to ask a wider audience to look into it, as to not ruin a first impression - I could provide my own, of course.

mos-platform/pce/libpce/include/pce/hardware.h Outdated Show resolved Hide resolved
mos-platform/pce/libpce/include/pce/system.h Outdated Show resolved Hide resolved
* @brief Set the VDC's copy method to copy every row.
* Alias for @ref pce_vdc_set_copy_word .
*/
#define pce_vdc_set_copy_row() pce_vdc_set_copy_word()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was there a specific reason to use the function call version here? This will prevent taking the address of pce_vdc_set_copy_row, which is surprising, (but unlikely).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similarly many of these might be better served by static inline functions; they'll almost certainly get inlined away, and if they aren't, then its likely because they've completely supplanted the thing they call, which saves space. That is, if a program only calls pce_vdc_set_width_tiles, then pce_vdc_set_width may get inlined into pce_vdc_set_width_tiles and deleted from the program. Then each use is just a JSR to set_width_tiles, rather then setting up the args for a call to pce_vdc_set_width.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually I can never keep the semantics of C inline in my head; I've no idea whether it'll actually de-dup static inline versions of these during LTO. I've been in the habit of just making these kinds of things real functions in the .c file; it's pretty reasonable to assume that if you want fast code on llvm-mos you're doing LTO.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Admittedly, I've been in the habit of doing weird things like this because most previous platforms I worked with either lacked LTO or had buggy LTO.

Do you think it's a good idea to just standardize on non-static/non-inline functions across the library, then?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I think that's the best default policy. Numeric constants are one thing (no consternation in C), but the simplicity of semantics for regular functions is compelling, and it gives rhe inliner the most freedom.

mos-platform/pce/libpce/src/memory-c.c Outdated Show resolved Hide resolved
* @brief Set the VDC's copy method to copy every row.
* Alias for @ref pce_vdc_set_copy_word .
*/
#define pce_vdc_set_copy_row() pce_vdc_set_copy_word()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I think that's the best default policy. Numeric constants are one thing (no consternation in C), but the simplicity of semantics for regular functions is compelling, and it gives rhe inliner the most freedom.

mos-platform/pce/libpce/src/memory.S Show resolved Hide resolved
mos-platform/pce/libpce/src/memory.S Show resolved Hide resolved
mos-platform/pce/libpce/src/memory.S Show resolved Hide resolved
mos-platform/pce/libpce/src/psg.c Outdated Show resolved Hide resolved
mos-platform/pce/link.ld Show resolved Hide resolved
…ions.

They are not suitable as a general replacement for memcpy/memmove/memset, as
they stall interrupts while they are being executed.

These can be reinstated in a separate PR, once the best way to expose this
to the end user is evaluated.
@asiekierka
Copy link
Contributor Author

Done. I've also removed the following features (in separate commits, in case we want to revert those in the future):

  • pce_memcpy/pce_memmove/pce_memset - useful as they are, they currently block interrupts for the duration of executing the block transfer; this is good enough for a low-level function like pce_memop, but might need to be reconsidered for higher-level wrappers that replace C memcpy/memmove/memset.
  • the virtual bank system - as per a lengthy discussion on Discord, it's better to add a feature in the future than to remove one, and virtual banks have the following two problems:
    • they are likely to be wholly superseded by automatic bank layouting,
    • mos24bank(vbank_symbol) wouldn't always return the expected physical bank - if the virtual bank was over 8KB, and the symbol was more than 8KB into the virtual bank, it would still return the first physical bank of the virtual bank, not the one the symbol actually resides in. This is a minor issue, but frustrating if far pointers were to be implemented.

@mysterymath mysterymath merged commit d3caa19 into llvm-mos:main Jun 25, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants