MEGA65: introduce faster, shorter, more compatible, easy to maintain/understand/modify/follow memory and I/O decoding #209

Closed
lgblgblgb opened this issue Nov 10, 2020 · 6 comments

Comments

@lgblgblgb
Owner

lgblgblgb commented Nov 10, 2020

Current memory decoding inside the MEGA65 emulator is horrible: slow, overcomplicated, and almost impossible to maintain and modify, especially for outsiders, but even for myself ... Fun fact: it was written in the hope of being a better approach than the previous one, but overall it became a "monster", maybe even worse than its predecessor :( The current plan:

  • "KISS" - keep it simple, stupid ... at least initially. That was the problem in the past too: trying to mix in a huge amount of "optimizations" at a very early stage, which did not help at all in the longer term. Let's first build a scheme that performs acceptably but is simple enough.
  • Do not use function pointers (at least not in the general case), they are slow ... In most cases (normal RAM) a memory access can be decoded as a single pointer access, which can be tested with a single "if". (The current solution has the mental defect that it wants to avoid the conditional check, thus it always goes through a function pointer array, which has the opposite effect on performance than what was hoped for ... not to mention the complex calculations done on each memory access after calling the function pointer.)
  • Do not try to "over-optimize" by filling tables ahead of time; resolve entries on demand instead. The current solution, again, tries to do optimizations which are in fact much slower than they should be ...
  • If everything turns out to be OK, try to optimize SOME things, but not blindly: always test the impact on performance, and judge carefully whether the gain is worth the extra complexity, so as not to make the same mistakes again.
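
The plan above (direct data pointer tested with a single "if", function pointers only as a fallback, slots resolved on demand) could be sketched roughly as follows. This is only an illustration and not Xemu's actual decoder: the 256-byte slot size, the `struct slot` layout, the fixed I/O window at $D000-$DFFF and every function name here are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* All names and sizes below are hypothetical, chosen for the sketch. */
#define SLOT_BITS  8u       /* 256-byte slots (assumption) */
#define SLOT_COUNT 256u     /* 64K CPU address space / 256 bytes */
#define SLOT_MASK  0xFFu

typedef uint8_t (*slow_reader_t)(uint32_t linear_addr);

struct slot {
    const uint8_t *data;    /* non-NULL: plain memory, read it directly */
    slow_reader_t  reader;  /* fallback, used only when data is NULL */
    uint32_t       linear;  /* linear base address handed to the reader */
};

static uint8_t ram[0x10000];
static struct slot slots[SLOT_COUNT];   /* zeroed: every slot unresolved */

/* Stand-in for a real I/O handler: returns the low byte of the address. */
static uint8_t io_read(uint32_t linear_addr)
{
    return (uint8_t)(linear_addr & 0xFFu);
}

/* Resolve one slot when it is first touched ("on demand"). */
static void resolve_slot(unsigned n)
{
    assert(n < SLOT_COUNT);
    struct slot *s = &slots[n];
    if (n >= 0xD0u && n <= 0xDFu) {     /* assumed I/O window at $D000 */
        s->data = NULL;
        s->reader = io_read;
        s->linear = 0xFFD0000u + ((uint32_t)(n - 0xD0u) << SLOT_BITS);
    } else {                            /* plain RAM: direct pointer */
        s->data = ram + ((size_t)n << SLOT_BITS);
        s->reader = NULL;
        s->linear = (uint32_t)n << SLOT_BITS;
    }
}

/* Called when the mapping changes: forget everything, re-resolve lazily. */
static void invalidate_all(void)
{
    for (unsigned n = 0; n < SLOT_COUNT; n++) {
        slots[n].data = NULL;
        slots[n].reader = NULL;
    }
}

/* Slow path: unresolved slots and non-RAM slots end up here. */
static uint8_t slow_read(unsigned n, uint16_t addr)
{
    struct slot *s = &slots[n];
    if (!s->data && !s->reader)
        resolve_slot(n);
    if (s->data)
        return s->data[addr & SLOT_MASK];
    return s->reader(s->linear + (addr & SLOT_MASK));
}

/* Fast path: one pointer test covers the common (normal RAM) case. */
static uint8_t cpu_read(uint16_t addr)
{
    const unsigned n = addr >> SLOT_BITS;
    const uint8_t *p = slots[n].data;
    if (p)
        return p[addr & SLOT_MASK];
    return slow_read(n, addr);
}
```

With this layout, a mapping change only has to call `invalidate_all()`; nothing is recomputed until a slot is actually accessed again, and RAM reads after the first access cost one comparison plus one indexed load.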

Things like issue #202 (emulation of 32-bit opcodes) are hard now, because they would be extremely slow and also very complex, partly because of the issues described here.

Also, #29 (VIC-IV stuff) is better done after a significant amount of the code base has been messed up ;) [DONE]

Breaking news: new findings on how memory management works on the C65, and how MEGA65 should behave as well: #378

@lgblgblgb lgblgblgb added this to the MEGA65 new mem+I/O+VIC milestone Nov 10, 2020
@lgblgblgb lgblgblgb self-assigned this Nov 10, 2020
@lgblgblgb lgblgblgb added the WIP label Nov 12, 2020
@lgblgblgb lgblgblgb changed the title MEGA65: introduce faster, shorter, more easy to maintain memory and I/O decoding MEGA65: introduce faster, shorter, more easy to maintain/understand/modify/follow memory and I/O decoding Nov 13, 2020
@lgblgblgb
Owner Author

lgblgblgb commented Nov 13, 2020

There are some questions here that I was always unsure about (or forgot the answers to?). Maybe these things weren't even correct in Xemu until now, so a memory decoding rewrite is a nice opportunity to sort them out as well. And they should be sorted out, by the way, since it's always harder to patch things later than to know all the constraints beforehand, plan, and then implement accordingly :)

Questions to clear up

  1. Can ROM protection only be turned on/off from hypervisor mode? I'm aware that we should use a hypervisor trap to turn ROM protection on/off, but would it actually work to do it directly outside of hypervisor mode (by modifying the corresponding register)? If it's only allowed in hypervisor mode, that would help simplify the logic, as hypervisor calls clear all Xemu mapping tables anyway. But if it's allowed outside of hypervisor mode as well, the logic needs further consideration!
  2. Does colour RAM in VIC-II I/O mode really act like on a C64? I.e., do you always see the upper 4 bits of colour RAM set to '1', with writes affecting only the lower 4 bits? (A second question: what happens if you write, say, $55? I guess you then read back $F5, but if you switch to VIC-III or VIC-IV I/O mode, what will you see when reading the byte that was written in VIC-II I/O mode?) And vice versa. This is a really important question, since it decides many implementation details in Xemu about how colour RAM access is actually implemented. I guess that in the 2K colour RAM mapped C65-style (@ $1F800) all the bits can always be seen directly; what I mean here is the colour RAM visible in the legacy I/O area at $D000, or MAYBE even at $FFD0xxx??? I think it's kind of important from a C64 compatibility point of view, as on the C64 the upper four bits (IIRC) seem to be always set when read back, and there may be C64 software that depends on this feature!
  3. Is there a memory region in the last Mbyte slice which is the actual "legacy" I/O space? I.e., there is one for VIC-II, one for VIC-III and one for VIC-IV, but there is an unused area as well. Does that correspond to the legacy I/O at $D000 (if enabled ...)? Or can the legacy I/O ONLY ever be accessed at $D000, if enabled? This question helps decide how to implement the legacy I/O in Xemu: is it mapped to a real region of the M65 address space, or is it truly virtual, so that the right one of the three I/O modes is what you view as legacy I/O at $D000? I mean these:
  • $FFD0xxx - I/O area for C64 (VIC-II) I/O mode
  • $FFD1xxx - I/O area for C65 (VIC-III) I/O mode
  • $FFD2xxx - I/O area for M65 (VIC-IV) plus Ethernet I/O mode, the second 2K of I/O space is "covered" by the Ethernet buffers
  • $FFD3xxx - I/O area for M65 (VIC-IV) I/O mode
  4. Do things like the 2K colour RAM mode in legacy I/O (and other M65-specific things! If I remember correctly, there may be something for mapping the Ethernet buffer this way, and also the SD-card buffer ... or whatever it was) alter the VIC-II, VIC-III and VIC-IV I/O areas as well, or only the legacy I/O "view" at $D000, corresponding to the actual I/O mode?
  5. More on the last question: if 2K video RAM is enabled (or something similar which maps extra things into the I/O area), what happens when someone changes the VIC I/O mode? Will that feature remain, or will it be reset automatically?
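
To make question 2 concrete, here is a small sketch of the behaviour being asked about: in VIC-II I/O mode reads force the upper nibble to '1' and writes touch only the lower nibble, while the other I/O modes see all 8 bits. This models the hypothesis in the question only, not confirmed MEGA65 behaviour; in particular, preserving the stored upper nibble on a VIC-II mode write is just one possible answer. The function names and the `enum io_mode` values (chosen to match the $FFD0xxx to $FFD3xxx list above) are made up for the example.

```c
#include <assert.h>
#include <stdint.h>

#define COLOUR_RAM_SIZE 2048u           /* 2K colour RAM window (assumption) */

/* Hypothetical enum; values follow the $FFDyxxx list in question 3. */
enum io_mode { IO_VIC2 = 0, IO_VIC3 = 1, IO_ETH = 2, IO_VIC4 = 3 };

static uint8_t colour_ram[COLOUR_RAM_SIZE];

/* Write: in VIC-II mode only the lower 4 bits are affected (hypothesis). */
static void colour_write(enum io_mode mode, unsigned offset, uint8_t value)
{
    assert(offset < COLOUR_RAM_SIZE);
    if (mode == IO_VIC2)
        colour_ram[offset] = (uint8_t)((colour_ram[offset] & 0xF0u) | (value & 0x0Fu));
    else
        colour_ram[offset] = value;
}

/* Read: in VIC-II mode the upper 4 bits always read back as '1' (hypothesis). */
static uint8_t colour_read(enum io_mode mode, unsigned offset)
{
    assert(offset < COLOUR_RAM_SIZE);
    if (mode == IO_VIC2)
        return (uint8_t)(colour_ram[offset] | 0xF0u);
    return colour_ram[offset];
}
```

Under this model, writing $55 in VIC-II mode reads back as $F5 in VIC-II mode, while a later read in VIC-IV mode shows whatever upper nibble was stored before plus the new lower nibble. Whether the real hardware does this is exactly what the question asks.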

@lgblgblgb lgblgblgb changed the title MEGA65: introduce faster, shorter, more easy to maintain/understand/modify/follow memory and I/O decoding MEGA65: introduce faster, shorter, more compatible, easy to maintain/understand/modify/follow memory and I/O decoding Nov 13, 2020
@gardners

Q1. The $D640 -- $D67F registers that set things like ROM protection are only visible in the Hypervisor mode.
Q2. At the moment, all 8 bits of colour RAM are always visible as far as I can tell. I vaguely recall that this created some compatibility problem somewhere, and that I did something to fix that, but I can't remember what, or whether I actually did.
Q3. $Dxxx IO resolution is implemented by mapping to $FFDyxxx, where y is provided by the IO mode.
Q4. Ethernet buffer at $Dxxx is IO mode 2 (i.e., y=2 in Q3), which is accessed via a different $D02F knock ($45, $54)
Q5. CRAM@DC00 is controlled by a bit that is separate from the IO mode (y in the above)
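
The rule from Q3 ($Dxxx resolves to $FFDyxxx, with y taken from the IO mode) comes down to a little bit arithmetic. A minimal sketch, where the function name and the `enum io_mode` are hypothetical and the separate CRAM@DC00 control bit from Q5 is deliberately not modelled:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical enum; the value is the "y" digit in $FFDyxxx. */
enum io_mode { IO_VIC2 = 0, IO_VIC3 = 1, IO_ETH = 2, IO_VIC4 = 3 };

/* Translate a CPU address in the legacy $D000-$DFFF window into the
 * 28-bit linear address of the matching I/O area. */
static uint32_t legacy_io_to_linear(uint16_t cpu_addr, enum io_mode mode)
{
    assert(cpu_addr >= 0xD000u && cpu_addr <= 0xDFFFu);
    return 0xFFD0000u | ((uint32_t)mode << 12) | (cpu_addr & 0x0FFFu);
}
```

For example, $D020 resolves to $FFD0020 in VIC-II mode and to $FFD3020 in VIC-IV mode, so an emulator only needs one set of I/O handlers keyed by linear address.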

@lgblgblgb
Owner Author

New branch "mem" has been created for the work.

@lgblgblgb
Owner Author

Breaking news: new findings how memory management works on C65, and how MEGA65 should behave as well: #378

lgblgblgb added a commit that referenced this issue Sep 7, 2023
So far, "ethernet I/O mode" (much like VIC-4 I/O mode but with Ethernet
registers in the high 2K of I/O space) was largely ignored by Xemu. This
commit tries to introduce more-or-less correct support. Note that this
alone is still not enough to have working ethernet emulation again
(see issue #242 for that ...).
@lgblgblgb
Owner Author

Branch mem is not sync'ed with next, and I'm trying to sketch a bare-bones, "clean" but working version. Let's see what it leads to ...

lgblgblgb added a commit that referenced this issue Feb 13, 2024
Plus a 'shade' command to set the transparency of the matrix background.
lgblgblgb added a commit that referenced this issue Mar 15, 2024
It seems to work, without too much testing yet. The bad news though: it
seems to be a bit slower than the old one :( I expected the opposite.

This also uses the baking/mapping behaviour as mentioned in #378
lgblgblgb added a commit that referenced this issue Mar 19, 2024
More optimizations and modifications:

* It seems the old method is superior: using linear addresses to the
  actual slot backend readers/writers
* Introducing optional data-pointer driven mechanism for further
  emulation performance
* Unfinished "memory-watch" mode
* Various minor modifications and/or optimizations
@lgblgblgb lgblgblgb removed the WIP label Mar 25, 2024
@lgblgblgb
Owner Author

It seems to work, currently in the dev branch only but evolving and under testing. Let's assume the basic foundation is now OK, which was the scope of this issue, thus let's close this for now.

@lgblgblgb lgblgblgb unpinned this issue Mar 25, 2024
lgblgblgb added a commit that referenced this issue May 12, 2024
This quite big change rewrites the MEGA65 emulation memory decoder
subsystem. Some of the reasons:

* Create a new, not overcomplicated decoder which can be understood
  without a major headache, and which can also be faster, as per #209
* Take into account some new findings about how the C65 worked (both
  Xemu and MEGA65 were wrong here), since adopted by MEGA65, as per #378
* More about the future (not ready in this commit): allow a "watchmem"
  function for the debugger implementation, which can monitor any
  memory r/w event the debugger wants to check, as per #11

This commit is quite big, as was mentioned, and may introduce some
regressions, so it's important to have feedback in those cases. It also
changes the behaviour of memory decoding (#378). The new decoder seems
to be already faster than the old one, and easier to understand and
maintain. There is also still some room for future improvements in
every area, including optimizations. The "memwatch" part is currently
not usable though, but the basis is there to be able to implement it at
all (it wouldn't be possible with the older scheme).