Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Errors with Krikzz's Mega-CD Verificator #105

Closed
benderscruffy opened this issue Jun 25, 2024 · 6 comments
Closed

Errors with Krikzz's Mega-CD Verificator #105

benderscruffy opened this issue Jun 25, 2024 · 6 comments
Labels
bug Something isn't working fixed-next-release Fixed on master, will go into next release md Genesis / Mega Drive

Comments

@benderscruffy
Copy link

mcd-verificator.zip

@jsgroth
Copy link
Owner

jsgroth commented Jun 25, 2024

I used this when I was first working on Sega CD and it helped fix bugs in a bunch of games that depend on obscure behaviors, but I obviously didn't get all of the tests passing.

Current results with a disc in the drive:
mcd-verificator

VAR tests error 02, IRQ test error 09, and REG 8030 error 07 are all timing errors that seem to indicate that the Genesis 68000 is running slightly too fast relative to the Sega CD hardware. All three of these pass if I change the SCD oscillator frequency from 50 MHz to 50.52 MHz but I don't think that's right.

Word RAM error 20 is caused by not emulating some of the word RAM access behaviors. Based on this test, if the sub CPU accesses word RAM while it's in 2M mode and owned by the main CPU, the sub CPU should halt until the main CPU returns word RAM. This is currently not emulated at all.

CDC flags error 22 is because of incorrect behavior when CDC DMA is set to transfer data to one of the CPUs directly via the CDC host data register. The CDC interrupt should fire when the final word is moved to the host data register, not when the CPU reads it.

CDC DMA2 error 04 looks like it's testing an obscure behavior where CDC DMA to PRG RAM will only ever transfer an even number of bytes, so if the DMA length is odd then it will transfer one less byte than requested.

Finally, CDC DMA3 error 04 is caused by the EDT bit in the CDC status register not being set correctly when one of the CPUs is reading words via the CDC host data register and there is only one word left to read. This is closely related to the CDC flags error.

The CDC DMA tests check lots of other things after where they're currently failing so there will probably be other issues after fixing the current errors.

@jsgroth jsgroth added bug Something isn't working md Genesis / Mega Drive labels Jun 25, 2024
@jsgroth
Copy link
Owner

jsgroth commented Jun 25, 2024

Stalling the main CPU for 2 out of every 128 mclk cycles fixes the VAR tests but doesn't fix the IRQ test or the REG 8030 test. It causes them to fail in the other direction, where it looks like the main CPU is running very slightly too slow compared to the Sega CD hardware:

mcd-verificator

Expected value ranges are 224-226 for the IRQ test and 1286-1288 for the 8030 test. The IRQ test counts how many timer interrupts occur with interval=1 in the time that it takes the main CPU to generate 1024 software interrupts for the sub CPU, and the 8030 test counts how many iterations of a simple loop pass before a timer interrupt triggers with interval=255.

I suspect the refresh delay is slightly different compared to the standalone Genesis when the main CPU is executing out of Sega CD BIOS ROM, but this aspect of the hardware is not documented at all and is very hard to test. All three of these tests pass if I stall the main CPU for 2 out of every 172 mclk cycles instead of 2/128:

mcd-verificator

@jsgroth
Copy link
Owner

jsgroth commented Jun 25, 2024

IRQ error 0E is caused by a timing/synchronization issue between the two CPUs. This test has the main CPU set the software interrupt flag 3 times in rapid succession and it only expects the sub CPU to execute its INT2 handler twice. Because of how execution between the two CPUs is interleaved right now, the sub CPU ends up handling all 3 interrupts which is wrong.

Part of the reason this is so tight is that the main CPU routine has this C code:

    ga->IFL2 = 1;
    ga->IFL2 = 1;
    ga->IFL2 = 1;

Which compiles to this:

move.l ($0196A4), a1
move.b #1, (a1)
move.l ($0196A4), a1
move.b #1, (a1)
move.l ($0196A4), a1
move.b #1, (a1)

Those MOVE.L instructions are very slow and give the sub CPU a lot of time to handle the interrupt and execute its very short INT2 routine, which is just this:

move.w #2, ($8026)
add.w #1, ($8028)
rte

Right now the sub CPU ends up acknowledging the second interrupt 7 CPU cycles before the main CPU sets the flag for the third time.

Based on this, when the 68000 handles an interrupt, it doesn't begin to acknowledge the interrupt until 10 cycles into its 44-cycle interrupt handling process. My 68000 implementation doesn't currently support any level of sub-instruction timing, but buffering and delaying the sub CPU interrupt acknowledge by 10 cycles fixes this test:
mcd-verificator

@jsgroth
Copy link
Owner

jsgroth commented Jun 26, 2024

The word RAM and CDC flags tests now pass:
mcd-verificator

Word RAM 20 was fixed by halting the sub CPU if it accesses word RAM in 2M mode while it's owned by the main CPU. The sub CPU remains halted until the main CPU writes DMNA=1. This also fixes #101

Actual hardware halts the CPU mid-instruction, but that's very difficult to emulate efficiently with my 68000 implementation, so instead I halt it immediately after the instruction that accessed word RAM. If that instruction wrote to word RAM then the write is buffered until the sub CPU is unhalted. This isn't completely accurate but it's good enough to pass the test and to fix Marko's Magic Football.

There were a number of bugs revealed by the CDC flags tests:

  • 22: When DMA destination is set to the host data register (for either CPU), the transfer end interrupt should fire when there is one word left for the CPU to read, not after the CPU reads the last word. This also fixes CDC DMA3 test 04
  • 26: Transfer end interrupts should not generate INT5 for the sub CPU if the previous transfer end interrupt was not acknowledged by writing to DTACK
  • 30: Decoder interrupts should trigger at 75Hz when enabled (DECEN + DECIEN), even if the decoder is not receiving new sectors from the CDD
  • 32: This is actually a bad test - it reads from STAT3 (register 15) and then immediately reads the next CDC register, expecting to read IFSTAT (register 1). Making invalid register and COMIN reads return 0xFF instead of 0 gets the test to pass
  • 34: Decoder interrupts should not trigger INT5 if there is an unacknowledged transfer end interrupt pending
  • 35 (supposed to be 36): Transfer end interrupts should not trigger INT5 if there is an unacknowledged decoder interrupt pending
  • 40: The decoder interrupt flag should automatically clear about 40% of the way through a 75Hz frame
  • 44: The decoder interrupt flag should always get set in IFSTAT at the appropriate times even if decoder interrupts are not enabled
  • 46: Dumb bug, $FF800A (CDC DMA address) was not correctly mapped for reads in the sub CPU memory map

@jsgroth
Copy link
Owner

jsgroth commented Jun 26, 2024

All tests now pass:
mcd-verificator

The various CDC DMA issues were:

  • DMA2 04: If DMA length is odd, DMAs to PRG RAM and word RAM should skip the last byte (because transfers are word-size)
  • DMA2 12: DMAs to PCM RAM should not skip the last byte if DMA length is odd, because transfers to PCM RAM are byte-size
  • DMA3 28: If one of the CPUs reads from the host data register while it's assigned to the other CPU, it should read the current contents of the register without progressing the DMA
  • DMA3 44: DMAs to word RAM should halt while word RAM is in 2M mode and owned by the main CPU
  • DMA3 48: DMAs to PRG RAM should halt while the sub CPU is removed from the bus via BUSREQ
  • DMA3 50: Changing device destination mid-DMA should only reset the DMA address to 0, not terminate the DMA
  • DMA3 60: Writing to the host data register should progress host data DMA, same as reading

I don't love emulating memory refresh delay as stalling the main CPU for 2 out of every 172 mclk cycles, but it shouldn't hurt anything since it's less of a delay than games are likely to encounter (and there was no delay emulation at all before).

@jsgroth jsgroth added the fixed-next-release Fixed on master, will go into next release label Jun 26, 2024
@benderscruffy
Copy link
Author

great job

@jsgroth jsgroth closed this as completed Sep 16, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working fixed-next-release Fixed on master, will go into next release md Genesis / Mega Drive
Projects
None yet
Development

No branches or pull requests

2 participants