New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[PCE] Add new platform. #121
Conversation
Banking is not yet implemented, so limited to eight-kilobyte ROMs. This will be improved soon...
c417b2f
to
fba0866
Compare
- Provide missing one-bank getters and setters. - Use C where using ASM provides no "bank chain" size advantage to allow for inlining opportunities.
12a1df2
to
6690434
Compare
Is there a quick write-up I could read on the semantics of the vbank system? It would be useful to have documentation of the form avaiable on the Wiki for the NES targets; this would make it much easier to review this, and should help folks build a mental model of how the target works. |
mos-platform/pce/libpce/src/memory.S
Outdated
.global pce_memmove | ||
.global pce_memset | ||
|
||
.section .text.pce_rmemcpy, "ax", @progbits |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the little self-modifying assembly I've written in llvm-mos, I just wrote an assembly language routine, and added labels to each instruction to be modified, but placed it in .data
rather than .text
. Then, another routine (or itself, I suppose) can use those labels to adjust the routine, then simply jump to it. This minimizes the overhead of calling the routine, since only the portions that change between calls are overwritten, and the result is very readable, being actual assembly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Six of the eight bytes are caller-provided; as such, I believe the performance gain from doing so would be marginal (especially as the routine is laid out in such a way that - at least for memcpy()
- two of the six caller-provided bytes are already in the right place at the moment of call), while the RAM use would increase by eight bytes (and we only have 8KB!).
rmemcpy()
is a little more complicated as it has to set operands source + length - 1
, destination + length - 1
, length
; rather than source
, destination
, length
; likewise, memset()
is a little more complicated as it has to set the first byte manually, then set operations destination
, destination + 1
, length - 1
.
In addition, pce_memop()
(which exists to allow access to the alternate transfer modes Increment Alternate
and Alternate Increment
- see memory.h
) uses the opcode as a type operand (there are five variants total) - which would itself prohibit it being "very readable assembly".
The "ideal" way I'd like to resolve this is to emit the block copy instructions (TII/TDD) at the compilation stage if all the operands are known, minimizing the usage of these routines.
It could also be worthwhile to move some of the operand computation to C and rely more on pce_memop()
itself, taking advantage of LLVM's optimization passes - but this is out of scope for this PR, which is meant to provide basic functionality.
Initial wiki page: https://llvm-mos.org/wiki/PCE_target |
Okay, I'm done with this PR for now. Note: While the target appears generally stable, I can't promise the lack of short-term API breakage as more veteran PC Engine developers look into it and provide their two cents. At the same time, I'd rather wait for the existence of some kind of binary builds to ask a wider audience to look into it, as to not ruin a first impression - I could provide my own, of course. |
* @brief Set the VDC's copy method to copy every row. | ||
* Alias for @ref pce_vdc_set_copy_word . | ||
*/ | ||
#define pce_vdc_set_copy_row() pce_vdc_set_copy_word() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Was there a specific reason to use the function call version here? This will prevent taking the address of pce_vdc_set_copy_row, which is surprising, (but unlikely).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similarly many of these might be better served by static inline functions; they'll almost certainly get inlined away, and if they aren't, then its likely because they've completely supplanted the thing they call, which saves space. That is, if a program only calls pce_vdc_set_width_tiles, then pce_vdc_set_width may get inlined into pce_vdc_set_width_tiles and deleted from the program. Then each use is just a JSR to set_width_tiles, rather then setting up the args for a call to pce_vdc_set_width.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually I can never keep the semantics of C inline in my head; I've no idea whether it'll actually de-dup static inline versions of these during LTO. I've been in the habit of just making these kinds of things real functions in the .c file; it's pretty reasonable to assume that if you want fast code on llvm-mos you're doing LTO.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Admittedly, I've been in the habit of doing weird things like this because most previous platforms I worked with either lacked LTO or had buggy LTO.
Do you think it's a good idea to just standardize on non-static/non-inline functions across the library, then?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I think that's the best default policy. Numeric constants are one thing (no consternation in C), but the simplicity of semantics for regular functions is compelling, and it gives rhe inliner the most freedom.
* @brief Set the VDC's copy method to copy every row. | ||
* Alias for @ref pce_vdc_set_copy_word . | ||
*/ | ||
#define pce_vdc_set_copy_row() pce_vdc_set_copy_word() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I think that's the best default policy. Numeric constants are one thing (no consternation in C), but the simplicity of semantics for regular functions is compelling, and it gives rhe inliner the most freedom.
…ank offset mapping system.
…ions. They are not suitable as a general replacement for memcpy/memmove/memset, as they stall interrupts while they are being executed. These can be reinstated in a separate PR, once the best way to expose this to the end user is evaluated.
Done. I've also removed the following features (in separate commits, in case we want to revert those in the future):
|
The PCE platform corresponds to the NEC PC Engine, also known as the TurboGrafx-16 in the US. It is an interesting target for llvm-mos due to its particularly fast (7.16MHz) 6502 CPU, as well as large (up to 1MB; up to 8.5MB with extra mapper logic - not implemented here) ROM size and flexible banking.
The pull request will be marked as ready once I've collected enough feedback on it. Feel free to comment on what's already here, though - I'd be more than happy to receive feedback!
TODO (for this PR):
Prerequisite PRs:
Documentation: https://llvm-mos.org/wiki/PCE_target