-
Notifications
You must be signed in to change notification settings - Fork 53
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
NES: Rework CNROM, UNROM and UNROM-512 to properly handle NMIs. #198
NES: Rework CNROM, UNROM and UNROM-512 to properly handle NMIs. #198
Conversation
EDIT: I withdraw most of this; I noticed this was based off of jroweboy's work w/ Action53. I need to read that more carefully and fold this into my opinion. |
Perhaps it'd help to get more eyeballs on the PR? There are a few NES LLVM-MOS-SDK users by now; they might have their own opinions to contribute. |
Yeah, paging @jroweboy , if you get the chance. |
mos-platform/nes-cnrom/bank.c
Outdated
@@ -8,10 +8,37 @@ | |||
|
|||
__attribute__((section(".zp.bss"))) char _BANK_SHADOW; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should always be a real shadow of the register; CNROM shouldn't be the only mapper without this. The other mappers have another byte to deal with this, and one should really be spent here too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be volatile, similarly for the other shadows. They're written in regular contexts and read in interrupt contexts; that's classic volatile.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should always be a real shadow of the register
UNROM also uses only one byte; it just so happens that it doesn't have any CHR/video-related functionality. UNROM-512 uses two bytes, but that's because it has both CHR/video-related and PRG-related mapper configuration functionality, so it'd require two bytes regardless.
Is there a developer-side reason to allow reading back the CNROM shadow for CHR? It's not the kind of thing where you'd typically have a trampoline/"push-pop" pattern for development, like with doing things with PRG data... In addition, having a separate real shadow requires more cycles spent on upkeeping them (as now NMI can get fired between writing to shadow A and shadow B, as well as shadow B and the actual MMIO register).
Overall, I'm not quite convinced.
This should be volatile, similarly for the other shadows. They're written in regular contexts and read in interrupt contexts; that's classic volatile.
Volatile will force the bank shadow to be present even if the code never reads from it; it is possible to construct a hypothetical NES game's code in such a way that the bank shadow would be unnecessary (NMI using fixed bank only + careful PRG bank arrangement), thus making MMIO register writes atomic and speeding up the whole ordeal a bit. The reason I opted for memory barriers instead is to allow the compiler to optimize _BANK_SHADOW
out entirely, but that is currently not possible as the NMI handler cannot be expressed with C code alone; even then, I don't think the compiler would allow modeling what I'm trying to achieve here, so might be best to just move back to volatile.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another EDIT: My original reply was substantially quite incorrect; I had misremembered why I added these.
This should always be a real shadow of the register
Is there a developer-side reason to allow reading back the CNROM shadow for CHR? It's not the kind of thing where you'd typically have a trampoline/"push-pop" pattern for development, like with doing things with PRG data... In addition, having a separate real shadow requires more cycles spent on upkeeping them (as now NMI can get fired between writing to shadow A and shadow B, as well as shadow B and the actual MMIO register).
I had originally thought that I had copied these over from nesdoug, but I went back to the original sources, and it looks like they were inventions of mine. The reason for their inclusion was then certainly because they seemed simple to include and for orthogonality with PRG-related shadows, but the closer examination done here reveals that they're not actually very simple to maintain.
There is still an orthogonality argument; if we pull out CHR-ROM getters, we should do it across mappers and take the ABI hit. Increasingly, I think that's what we should do; there's nearly always a good reason to maintain careful PRG state shadows (banked_call
, but the case is harder to make for CHR, and it's something that a user can take care of themselves to their desired degree of fidelity. Library code is usually worse than user code, since we have to defensively deal with situations that the user can know never happen, so I'd be in favor of ripping all this out.
Volatile will force the bank shadow to be present even if the code never reads from it; it is possible to construct a hypothetical NES game's code in such a way that the bank shadow would be unnecessary (NMI using fixed bank only + careful PRG bank arrangement), thus making MMIO register writes atomic and speeding up the whole ordeal a bit. The reason I opted for memory barriers instead is to allow the compiler to optimize
_BANK_SHADOW
out entirely, but that is currently not possible as the NMI handler cannot be expressed with C code alone; even then, I don't think the compiler would allow modeling what I'm trying to achieve here, so might be best to just move back to volatile.
Lost the original comment on this one, but the gist was this: the memory barriers are quite a bit harder to read than volatile would be, and it doesn't appear we get any gain from them today. When we can know for sure that there's a design that can use them to provide tangible benefit, that's a good time to put that in, but the best way to know something like that is to have an implementation of it in-hand. Taking the opposite policy tends to cause strange code to accumulate in projects like this, and it's difficult to root it out once it's forgotten whether or not its quirks are load-bearing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think pulling out the CHR getters is fairly separate from this review, so we can table that discussion for this one and introduce a disparity between the mappers. The set/swap/split API seems worth including regardless. I'd also like to see it backported to MMC1/MMC3; we need to figure out the right way forward there, trading off nesdoug tutorial compatibility and internal SDK consistency.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think pulling out the CHR getters for now is a good idea; it's less breaking to add an API some time later than to remove an API some time later.
91530a1
to
591197b
Compare
@@ -6,12 +6,32 @@ | |||
#include <rompoke.h> | |||
#include "bank.h" | |||
|
|||
__attribute__((section(".zp.bss"))) char _BANK_SHADOW; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be called something like _BANK_NEXT
, since it entirely controls the bank as of next NMI, and it has no relation to the current bank state.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
274c13b
to
5bf99b1
Compare
Fixes #195 for these three mappers. I think this leaves only MMC3 as the remaining mapper without NMI handling (MMC1 and Action53 already had their own; NROM does not require it).
I've implemented my own approach to handling NMIs:
For CNROM and UNROM-512, I've extended this with additional functionality:
As this may not be immediately intuitive to someone tinkering with the code in the future, I have added a test for these two mappers which automatically validates it.
This required removing
get_chr_bank
on CNROM - well, it didn't strictly require it, but then it would require two RAM bytes instead of one; I expect the functionality to be sufficiently unnecessary as to not be worth the added cost. Likewise, the public API of UNROM-512 has changed; as neither API has yet shipped in a production build of the SDK, this is currently not a breaking change.