"MAD EFFECT" demo doesn't work correctly #699

tomcw opened this issue Oct 6, 2019 · 3 comments

@tomcw tomcw commented Oct 6, 2019

Download the demo and source code from here.
NB. You'll need to have a Mockingboard (in slot-4) and be running in 50Hz(PAL) mode.

@Archange427 emailed me with these details:

Just a quick note to let you know that the new French Touch release (MAD EFFECT) breaks AppleWin again.
The good news is that it is easy to make a special/fixed version for AppleWin (and AIPC, which has the same issue).

The bad news is that to make a fixed version for AW and AIPC, I have to add 6 cycles to my precise counting!
So there is a difference of 6 cycles between a real PAL Apple II and AppleWin.

A few more details:
I finally have a routine that gives precise horizontal synchronization on every frame (without, obviously, counting each cycle during the complete frame).
This makes it possible to play music whose CPU usage is not the same on every tick.

I use only one interrupt BUT I rely on several 6502 behaviours that require real precision at the emulation level.

  • how an interrupt occurs during a 6502 instruction (and how many cycles will have elapsed depending on the case)
  • at which cycle the actual read happens during an LDA-style instruction. I know we've talked about this problem many times, but I think this time it's a really important issue.
  • and maybe other things (the number of cycles taken by a 6502 interrupt).

I will of course release the source code at the same time as the DSK.
And I have documented it in particular detail at the cycle level. This should make it possible to see at which point the differences appear.

@tomcw tomcw added the bug label Oct 6, 2019

@tomcw tomcw commented Oct 9, 2019

I asked:

Are you using any different opcode addressing modes for reading the floating bus or video mode changing? EG. You discussed using different opcode addressing modes in #665

Arnaud replied:

No, I used Absolute mode only this time.


@tomcw tomcw commented Oct 9, 2019

In MAD EFFECT, the problematic part is the horizontal synchronization that I use.

As that is the part that causes the problem with the emulators (AppleWin/AIPC), I will only cover it. All the rest of the code is handled perfectly by AppleWin/AIPC, so no need to talk about it :)

So the main goal of this part is to get horizontal synchronization at the start of each frame.
As I switch modes/pages during each line of the display to do the "main effect", I need a precise sync.
And as I play music (and do other things) during VBL, I don't want to have to count cycles, and especially not to have to write constant-cycle code during VBL.
It's such a pain (I did it so often before) and totally archaic, I think. It was time to do better!

  • The first step is to get precise synchronization at the beginning of the last line of VBL.
    As I later want to be precisely at cycle 0 of the first line of display, I need a little time to run some code. Hence the choice of the last line of VBL.
    To do that, I used the beautiful (and crazy) code from Jim Sather. I don't think we need to get into that. But this code, alone, is more wonderful than anything else here!
    But alas, it is impossible to use it on its own for the main goal, because you never know how many frames it will take to synchronize.
    So I use it only once, during the first step.
    So once I am synchronized at the beginning of the display (technically it triggers at cycle #52 of the last line of VBL), I spend time doing... nothing, in order to come back to the beginning of the (next) last line of VBL.
    Where I wanted to be!

  • Second step: I set up my interrupt here, starting from cycle #0 of the last line of VBL. It takes me 14 cycles. It could be done in fewer, but the important thing is to know precisely how many cycles it takes.
    If I count 14, it's because the effective WRITE of the STY opcode occurs at the 4th cycle of this opcode. And it is when the WRITE to T1C_1-High (6522) occurs that the counter is started.
    There is potentially a first problem with emulators here. But since, with emulators, the VBL will also have been detected at cycle 1 (and not the 4th), maybe not...

So the main part is done ;) Now I just need to wait until the INT occurs for the first time. These first two steps will only be used once.
The goal was to set up the interrupt. The delay is obviously one entire PAL frame (20280 cycles).
Note: even if MAD EFFECT only works in PAL (because I do not have enough time during an NTSC VBL to play the music, do what I have to do for the display, etc.), the synchronization code works with NTSC too.
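
As a quick check of the frame length quoted above, here is a minimal sketch of the arithmetic. It assumes 65 cycles per scanline (consistent with the cycle counts used later in this post) and the usual 312 (PAL) and 262 (NTSC) scanlines per frame; the constant and function names are made up for illustration and do not come from the demo source.

```python
# Frame-length arithmetic behind the 20280-cycle figure.
CYCLES_PER_LINE = 65   # assumption: one scanline lasts 65 CPU cycles
PAL_LINES = 312        # assumption: scanlines per PAL frame
NTSC_LINES = 262       # assumption: scanlines per NTSC frame
DISPLAY_LINES = 192    # display lines, as stated in the post


def frame_cycles(total_lines: int) -> int:
    """CPU cycles in one complete video frame."""
    return total_lines * CYCLES_PER_LINE


def vbl_cycles(total_lines: int) -> int:
    """CPU cycles spent outside the 192 display lines."""
    return (total_lines - DISPLAY_LINES) * CYCLES_PER_LINE


if __name__ == "__main__":
    print("PAL frame :", frame_cycles(PAL_LINES))    # 20280
    print("NTSC frame:", frame_cycles(NTSC_LINES))   # 17030
    print("PAL VBL   :", vbl_cycles(PAL_LINES))      # 7800
    print("NTSC VBL  :", vbl_cycles(NTSC_LINES))     # 4550
```

Under these assumptions the NTSC VBL is a little over half the length of the PAL one, which fits the remark that there is not enough time in an NTSC VBL for the music and display preparation.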

So here the main trick...
As you know, when an interrupt occurs, you never know how many cycles will have elapsed by the time you arrive at the beginning of the interrupt code!
The 6502 internal operations use 7 cycles (this part is constant). But depending on which opcode is executing when the INT occurs, a variable number of cycles is added.
If I believe some documents (taking all the possible 6502 opcodes), this number is between 2 and 9!

  • The main idea for the third step is therefore to reduce this uncertainty!

So I build a large table of NOP opcodes to force the interrupt to occur during a NOP! This is the third step.
With a 2-cycle opcode like NOP, when the INT occurs, there are only 2 possible cycle counts to add: 2 or 3.
Alas, this number is still not fixed...
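
To illustrate why the NOP table narrows the uncertainty, here is a deliberately simplified model, not a description of real 6502 interrupt sampling. It assumes the CPU finishes the instruction in progress and then spends 7 cycles on the interrupt sequence. It reproduces only the spread of the entry time, not the exact "2 or 3" figures quoted above, which depend on how the cycles are counted. All names are hypothetical.

```python
def entry_latencies(instr_cycles: int, irq_sequence_cycles: int = 7) -> set:
    """Possible delays from IRQ assertion to the first handler instruction
    when the CPU is running an endless run of one instruction.

    Simplified model: the IRQ may be asserted during any cycle of the
    instruction, the instruction always completes, then the fixed
    interrupt sequence runs.
    """
    latencies = set()
    for cycle_in_instr in range(instr_cycles):
        remaining = instr_cycles - cycle_in_instr  # cycles left in the instruction
        latencies.add(remaining + irq_sequence_cycles)
    return latencies


def jitter(instr_cycles: int) -> int:
    """Spread (max - min) of the entry delay for the given instruction length."""
    latencies = entry_latencies(instr_cycles)
    return max(latencies) - min(latencies)


if __name__ == "__main__":
    print("2-cycle NOP sled, jitter:", jitter(2))          # 1 cycle: two possible outcomes
    print("7-cycle instructions, jitter:", jitter(7))      # 6 cycles
```

In this model a run of 2-cycle instructions leaves exactly two possible arrival times, one cycle apart, which is the residual uncertainty that the next step has to remove.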

So in my interrupt code, I must have a way to further reduce this uncertainty. ...

  • Fourth step: when the INT occurs...
    Cycles elapsed here since the beginning of the last VBL line:
    7 cycles (6502 INT internal operations) <- fixed
    2 OR 3 cycles (INT during NOP) <- variable (but not by much ;)
    14 cycles (remember, the INT was set up at cycle 14 of the last line of VBL...) <- fixed

So the second trick: run until cycle #52 (or #53) of the last VBL line and check whether we are already in display!
-> at cycle #53 if 3 cycles elapsed after the NOP/INT occurred
-> at cycle #52 if only 2 cycles elapsed after the NOP/INT occurred.

And an LDA VERTBLANK / BPL does the trick...
If BPL is taken: +3 cycles (we were at cycle #52).
If BPL is not taken: +2 cycles (we were at cycle #53).

After this BPL, we're at cycle #55, whatever happened!

So I just need to finish the last VBL line with 10 more cycles and... I am synchronized at cycle 0 of the first line of display!
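
The cycle accounting of the fourth step can be checked with a small sketch. It takes the figures from the post at face value (14 + 7 + 2 or 3, BPL costing 3 cycles when taken and 2 when not, 10 trailing cycles) and assumes 65 cycles per line. The 29 cycles of fixed handler code before the BPL are derived here as 52 - (14 + 7 + 2); that value and all names are illustrative, not taken from the demo source.

```python
CYCLES_PER_LINE = 65          # assumption: one scanline lasts 65 CPU cycles
TIMER_SETUP = 14              # from the post: INT set up at cycle 14 of the line
IRQ_SEQUENCE = 7              # from the post: fixed 6502 interrupt sequence
FIXED_HANDLER_CYCLES = 29     # derived: 52 - (14 + 7 + 2), not from the source
DISPLAY_SEEN_FROM = 53        # from the post: arriving at #53 means BPL is not taken


def cycle_after_bpl(nop_extra: int) -> int:
    """Cycle within the last VBL line once the BPL has executed.

    nop_extra is the 2 or 3 extra cycles caused by the INT landing in a NOP.
    """
    if nop_extra not in (2, 3):
        raise ValueError("the post only allows 2 or 3 extra cycles")
    position = TIMER_SETUP + IRQ_SEQUENCE + nop_extra + FIXED_HANDLER_CYCLES
    # Still in VBL -> bit 7 reads clear -> BPL taken (3 cycles).
    # Already in display -> BPL not taken (2 cycles).
    still_in_vbl = position < DISPLAY_SEEN_FROM
    return position + (3 if still_in_vbl else 2)


def cycle_in_next_line(nop_extra: int, tail_cycles: int = 10) -> int:
    """Cycle within the first display line after the trailing padding."""
    return (cycle_after_bpl(nop_extra) + tail_cycles) % CYCLES_PER_LINE


if __name__ == "__main__":
    for extra in (2, 3):
        print(extra, cycle_after_bpl(extra), cycle_in_next_line(extra))
```

Both cases come out at cycle #55 after the BPL and at cycle 0 of the next line, so the branch absorbs the one-cycle uncertainty left by the NOP table.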

Obviously there is no RTI here. So beforehand, I pulled the return address pushed by the 6502 INT internal operations and added a CLI to re-enable interrupts.
And I reset the interrupt flag. I had time to spare before cycle #52 (or #53)...

So we are now at cycle 0 of the first line of display.
During the 192 lines, I display my stuff.
During the next VBL, I play music and modify some code for the next display... and JMP to my NOP table to wait for the next INT... (third step)

Then on to the fourth step, and so on...

You will note that with AppleWin, when the fourth step occurs, the elapsed cycle count is not 7+14+(2 or 3) at all.
That's why I need to add 6 cycles here to compensate, so that the "LDA VERTBLANK/BPL" does the trick...

And maybe there is another issue here with the famous effective read at the 4th cycle (or the first with emulators...)...

Just for information, and to end this lonnnng text, I already have a functional v2 of the synchronization code.
It no longer uses the NOP table, just the TIMER1 counter to know where I am when the INT occurs...
(as the counter never stops, I can know exactly how many cycles have elapsed since the INT occurred by reading the counter).
So I just need to compensate again and there we are!
But unsurprisingly, it does not work with emulators either. I think it's even worse, because this time emulators will have to emulate the exact cycles of each opcode to determine exactly how many cycles elapse when an interrupt is raised, and this for every opcode! The famous 2 to 9 cycles...

But the main advantage of this v2 is that it does not use this big NOP table and therefore saves space. And it is smarter ;)
(If it interests you, I have a DSK with a simple test program using this v2.)
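
The v2 idea of reading TIMER1 to find out how late the handler is running can be sketched as follows. This is an idealized model under loud assumptions: the counter is taken to be reloaded with a known value at the moment the interrupt fires and to decrement once per cycle. A real 6522 is likely to need a small constant correction, which is left out here. The function names and the example values are hypothetical.

```python
def cycles_since_irq(counter_value: int, reload_value: int) -> int:
    """Cycles elapsed since the interrupt fired, in the idealized model.

    Assumes the counter was reloaded with reload_value when the interrupt
    fired and has decremented once per CPU cycle since then.
    """
    if not 0 <= counter_value <= reload_value:
        raise ValueError("counter value outside the expected range")
    return reload_value - counter_value


def padding_needed(counter_value: int, reload_value: int, target_elapsed: int) -> int:
    """Cycles of delay to add so the code continues exactly
    target_elapsed cycles after the interrupt fired."""
    elapsed = cycles_since_irq(counter_value, reload_value)
    if elapsed > target_elapsed:
        raise ValueError("already past the target cycle")
    return target_elapsed - elapsed


if __name__ == "__main__":
    # Hypothetical numbers: the handler read the counter 10 or 13 cycles
    # after the interrupt and wants to continue exactly 40 cycles after it.
    print(padding_needed(20270, 20280, 40))  # 30
    print(padding_needed(20267, 20280, 40))  # 27
```

The padding varies with the measured delay, so the handler always continues at the same cycle, whatever instruction was interrupted. That only holds if the emulator reports the same counter value as real hardware at the moment of the read.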


@tomcw tomcw commented Nov 18, 2019

Closing as fixed in AppleWin here.
