Query about contention delays #10
I think what happens here is that we implement the zx contention model defined in terms of memory and port cycles rather than T-states, primarily because this is how it has usually been described in the Spectrum world. For example, the timing paper of the zx project, https://github.com/kosarev/zx/blob/master/test/screen_timing/SCREEN_TIMING.md, gives the start ticks for whole cycles and at the same time takes into account that the actual access takes place one T-state later.
Yes, that's my reading of the timing diagrams in the Z80 User Manual as well. But this z80 emulator doesn't imply any specific approach to implementing contention. If for your needs it works better to apply contention ticks on T2 of the cycle, then you can just say so in your definitions of the cycle handlers. Or maybe it's me missing something?
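The suggestion above — applying contention in a custom cycle handler rather than in the core — can be sketched roughly as follows. This is a toy illustration only: the names (`base_machine`, `on_read_cycle`, `ticks`) and the 4-tick alignment rule are assumptions for the example, not the z80 library's actual API.

```cpp
#include <cstdint>

// Illustrative sketch: a derived machine overrides the base read-cycle
// handler to apply contention ticks at the start of T2.
struct base_machine {
    uint64_t ticks = 0;
    uint8_t memory[0x10000] = {};
    virtual ~base_machine() = default;
    virtual uint8_t on_read_cycle(uint16_t addr) {
        ticks += 3;              // uncontended memory read: 3 T-states
        return memory[addr];
    }
};

struct contended_machine : base_machine {
    uint8_t on_read_cycle(uint16_t addr) override {
        ticks += 1;                    // T1: address placed on the bus
        ticks += (4 - ticks % 4) % 4;  // toy rule: stall until T2 hits a 4-tick boundary
        ticks += 2;                    // T2 and T3 complete the cycle
        return memory[addr];
    }
};
```

The point is that the base core need not know about contention at all; the derived handler decides where within the cycle the stall lands.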
Ah, that explains it! I think it also explains the curious -1 offset I mentioned was part of 'zx' but not my implementation. I was finding that once the timings locked on to a contention point they remained correct, but at the boundaries to events with a different alignment there was a timing mismatch. Interrupts are one of the unaligned events on my target machine, and the result was that the interrupt duration was sometimes too long or too short. Test programs that rely on cycle-accurate timings were failing and I couldn't explain why, until now.

I could probably copy and modify the base implementation for some extra cases, as I have for the fetch/read/write/input/output cycle handlers. Though that would still require correcting for the offset in uses of the tick counter, such as screen drawing, raster position registers, etc. It feels like a Spectrum-specific optimisation has crept into the generic CPU core implementation, which is affecting the derived tick count value that the base class doesn't control. As a possible alternative, could the base implementation maintain a pure tick counter? Having a pure tick counter benefits unit testing, where tests can be run from specific time points. That's more difficult to do at the moment without applying the offset of 1 in various places, which could be error-prone.
I've created a small patch to implement the suggested changes for testing. To apply contention I just provide implementations of the new hooks. The contention hooks are called after T2 for MREQ and T4 for IORQ, and currently only for the Z80 core, not the 8080. It now passes my timing test suite, so it fixes the last of the differences I was seeing. These positions need checking in case there's an offset bias in my own code that means they are a cycle too late.

I think the mechanism should be general enough to be in the base code for any system wanting contention. The extra ULA contention that depends on the address bus value may be fairly Spectrum-specific and still need the extra cycle handlers to be overridden, though if the mechanism is general enough for more hooks with an empty base handler, that would further simplify zx too.
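The hook-with-empty-base-handler idea described above could look something like this. All names here (`z80_core`, `on_mreq_wait`, `read_cycle`) are illustrative stand-ins, not the patch's actual API, and the 8-tick alignment is an invented example rule.

```cpp
#include <cstdint>

// Sketch of the proposed mechanism: the core calls an empty base handler at
// a fixed point within each machine cycle; systems without contention pay
// nothing, and a derived machine overrides it to insert wait states.
struct z80_core {
    uint64_t ticks = 0;
    uint8_t memory[0x10000] = {};
    virtual ~z80_core() = default;

    virtual void on_mreq_wait(uint16_t addr) {}  // empty in the base core
    virtual void on_iorq_wait(uint16_t port) {}  // empty in the base core

    uint8_t read_cycle(uint16_t addr) {
        ticks += 2;            // T1 and T2
        on_mreq_wait(addr);    // hook point: wait states applied after T2
        ticks += 1;            // T3
        return memory[addr];
    }
};

struct sam_like_machine : z80_core {
    void on_mreq_wait(uint16_t) override {
        ticks += (8 - ticks % 8) % 8;  // toy rule: stall to an 8-tick boundary
    }
};
```

Address-dependent ULA contention would fit the same shape, since the hook receives the address bus value.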
I've just noticed my patch causes a test mismatch due to some re-ordering of operations. Zaks page 70 shows T1: PC OUT, T2: PC = PC + 1, T3: INST into IR. Incrementing PC was previously performed last, so it has been moved up before the IR assignment. The tests will need to be updated to match if the other changes remain.
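For clarity, the ordering from Zaks page 70 can be written out as a minimal sketch. The names (`fetch_state`, `fetch_opcode`) are illustrative, not the emulator's real code.

```cpp
#include <cstdint>

// Fetch ordering per Zaks p.70: address out on T1, PC incremented on T2,
// opcode latched into IR on T3 (T4 refresh is not modelled here).
struct fetch_state {
    uint16_t pc = 0;
    uint16_t addr_bus = 0;
    uint8_t ir = 0;
};

inline void fetch_opcode(fetch_state& s, const uint8_t* memory) {
    s.addr_bus = s.pc;          // T1: PC placed on the address bus
    s.pc += 1;                  // T2: PC = PC + 1
    s.ir = memory[s.addr_bus];  // T3: instruction latched into IR
}
```

Note that because the address is captured on T1, moving the PC increment before the IR assignment does not change which byte is fetched, only the intermediate state that tests may observe.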
I see. So if the SAM Coupé relies on the CPU's wait states, then implementing the cycle handlers would indeed require applying offsets to the current tick. It's not the same for the ZX, as its ULA just stops the CPU when a clash on the address bus is possible -- an implementation of that wouldn't seem to benefit from a lower-level interface. Revisiting the situation with the existing tick offsets is on my TODO list, though.
The Z80 emulator was actually written with the idea of providing a simulator interface in mind, which is why in early commits it was referred to as a simulator. So the idea is that by giving the cycle handlers custom implementations, it is possible to implement a pin-level interface by queuing operations. So yes, I absolutely agree on both points -- adding lower-level handlers and getting rid of the tick offsets is the way to go.
+1
Looks great! If you don't mind, I will borrow this, and also re-check on my side that it matches the diagrams.
No worries. I will take care of this. (BTW, can you give a link to that Zaks page? I'm not sure I have it.) Thanks!
Hmm, it seems ~MREQ goes active on T1, and for ~IORQ it's T2, so that doesn't look like the right time for wait states?
I do suspect the offsets might be wrong. When integrating your CPU core I experimented with my memory/IO wait-state alignment until all the tests passed, so my figures might be skewed by the same error. You're welcome to use any parts of the patch that still apply.

Someone sent me an analyzer trace showing OUT (C),B [ED 41] being executed on real SAM hardware to change the border colour. All memory accesses are contended, and the I/O access is also contended (but with different wait-state rules). MREQ does go low during T1, and WAIT is pulled low during the same tick if memory contention applies. It looks like T1 completes, but since WAIT is still low at the start of the next rising CPU edge, memory wait states are applied there (just 1 in this case). I think that means T2 should start at the next rising CPU edge at which WAIT is also high. The 'ED' opcode seems to stabilise on the address bus during the 3rd tick, which should be T2, when PC is incremented.

The trace actually shows MREQ being pulled low 4 times, with a memory refresh after each instruction fetch. I'm not taking refresh into account for contention, despite the trace showing 1 wait cycle on 3 of the 4 accesses. I don't know whether that means it only applies to RD/WR cases or whether the wait-state rounding rules happen to cover it.

Does the trace make sense for your understanding of executing ED 41? Pages 70-71 of this book show the details I was looking at:
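The ~WAIT behaviour described above — T2 effectively repeating while the line is held low — can be modelled with a short sketch. `wait_low` is a stand-in for whatever the target machine's contention logic decides at a given tick; the function name and the 4-tick contention pattern in the usage below are assumptions for illustration.

```cpp
#include <cstdint>
#include <functional>

// One memory cycle with ~WAIT sampling: the address goes out on T1, then
// one wait state is inserted per tick for as long as ~WAIT stays low, and
// only then do T2 and T3 complete. Returns the tick count after the cycle.
inline uint64_t run_memory_cycle(uint64_t tick,
                                 const std::function<bool(uint64_t)>& wait_low) {
    tick += 1;              // T1: address out, ~MREQ goes low
    while (wait_low(tick))  // ~WAIT sampled on each edge; T2 repeats while low
        tick += 1;          // one wait state per contended tick
    tick += 2;              // T2 completes, then T3
    return tick;
}
```

This matches the trace's behaviour of T1 always completing first, with the stall landing between T1 and T2.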
Oh, that's perfect. So my reading of this is, here we have:
Then if your aim is to catch the moment of sampling ~WAIT, shouldn't we just call something like |
Or maybe it should be just |
[#10] Introduce a handler for sampling the ~WAIT line and applying wait states. Based on patches by Simon Owen.
In the zx project memory contention is applied in places such as on_fetch_cycle before any base operations are performed. That would imply that wait states happen even before the address has been placed on the bus. Is that correct?

I would expect T1 to always execute first to place the address on the bus, before device contention is considered. Applying contention relies on the current tick count too, so it seems important to have applied the initial cycle(s) before any rounding up to the next uncontended position. The Zaks book doesn't cover it in detail, but other online sources suggest that T2 might effectively repeat for as long as WAIT is held low. I'd guess that means the T2 tick is added after the contention.
For example, if contention only allows memory reads at multiples of 4 ticks (0, 4, 8, ...):
Executing two NOPs using the current code from T=0 takes 8 cycles, even though the memory access points within the instruction aren't aligned to the access boundaries. I feel the timing should perhaps be 11 cycles:
T1 C C C T2 T3 T4 T1 T2 T3 T4
I may be missing something, but does what I'm asking make sense?
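The arithmetic behind the 11-cycle figure above can be checked with a short sketch, assuming the toy rule from the example (a 4-T-state fetch whose memory access on T2 must land on a multiple of 4, with wait ticks inserted between T1 and T2). The function name is illustrative.

```cpp
#include <cstdint>

// Total tick count after executing `nops` NOP fetches starting at `tick`,
// under the toy rule: T1 first, then waits until T2 hits a 4-tick boundary,
// then T2/T3/T4.
inline uint64_t nop_ticks(uint64_t tick, unsigned nops) {
    for (unsigned i = 0; i < nops; ++i) {
        tick += 1;                    // T1: address on the bus
        tick += (4 - tick % 4) % 4;   // waits so the T2 access is aligned
        tick += 3;                    // T2, T3, T4
    }
    return tick;
}
```

From T=0 the first NOP lands T1 at 0, three waits, then T2 at tick 4, finishing at 7; the second NOP's T2 falls at tick 8, already aligned, so no waits are added and the pair totals 11 ticks, matching the diagram.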