Support for Serial Service Request In for User Port #7

Closed
dtimber opened this issue Aug 5, 2022 · 30 comments

@dtimber

dtimber commented Aug 5, 2022

Hi,

I'm currently designing a new revision of the SNAC2IEC adapter together with venice1200. For the new revision I want to implement the Serial Service Request In line (C128 IEC port pin 1, used for the Fast Serial Clock). Because the C64 didn't use this signal, there is currently no user IO pin defined for it. If I've checked correctly, IO6 on the user port should be free to use. So my question is: can you assign user port IO6 to the Serial Service Request In line, even if the Fast Serial Clock isn't implemented in your core yet?

Thanks for the reply in advance.

@eriks5
Collaborator

eriks5 commented Aug 5, 2022

Fast serial is implemented, just pushing the commits (branch iec) for that right now. It even already has SRQ on IO6. I did not have a way to test it, so keep me posted!

I'll post the rbf on the MiSTer forum soon.

@dtimber
Author

dtimber commented Aug 5, 2022

Thanks for the fast feedback. It will take some time to finish the design and assemble the PCB. I will keep you updated. Thank you also for the effort you put into the core. I wish you a pleasant weekend.

@thierer

thierer commented Aug 27, 2022

Just as quick feedback (I'll dig deeper when I find time): If I connect the SRQ line and use a 1571 as an external IEC device, the C128 hangs for me during the auto boot probe, so something in the communication doesn't seem to be working (loading doesn't work either).

The same drive works fine as an external device if I don't connect the SRQ line (or when connected to a real C128, for that matter).

@eriks5
Collaborator

eriks5 commented Aug 29, 2022

That's unfortunate, but not really unexpected since I could not test it. I'll have a look through the code again to see if I can find an obvious mistake, but I probably need to get a hardware setup going to debug. I'm patiently waiting for my MiSTer IronClad plus to arrive, until then I'm running the DE10 bare so no way to easily hook the drive up. I already have a C64 userport IEC adapter which I think is easily converted to add the SRQ line by soldering on a patch wire (very retro ;)

@thierer

thierer commented Aug 29, 2022

I played around a bit and I think the problem is that the CIA SR doesn't seem to honor the SR's data direction flag (CRA bit 6), so sp_out pulls DATA low if the last bit transferred was a 0, even if the SR is switched to input again.

You seem to have logic to handle that for the internal 1571 and 1581, but it doesn't seem to affect the external IEC and imho this belongs in the CIA model, anyway. If the SR is in input mode, sp_out should be 1'b1.
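
To make the suggested behavior concrete, here is a minimal behavioral sketch in Python (not the core's Verilog). It assumes DATA is an open-collector line that reads low whenever any driver pulls it low, and it ignores the FSDIR gating in the external logic. The function names `sp_out` and `data_line` are hypothetical and only used for illustration.

```python
def sp_out(cra: int, shifted_bit: int) -> int:
    """Level the CIA contributes on SP (1 = line released)."""
    sr_is_output = bool(cra & 0x40)  # CRA bit 6: 1 = SR output, 0 = SR input
    # In input mode the SR must not drive the line, whatever bit was sent last.
    return shifted_bit if sr_is_output else 1


def data_line(*drivers: int) -> int:
    """Open-collector bus: DATA is low if any participant pulls it low."""
    return int(all(drivers))


# The last bit shifted out was a 0 and the drive is not pulling DATA.
stuck = data_line(sp_out(0x40, 0), 1)     # SR still in output mode
released = data_line(sp_out(0x00, 0), 1)  # SR switched back to input
```

With the SR in output mode the line stays low; once CRA bit 6 is cleared the line is released, which is what the real C128 is reported to do later in this thread.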

[Edit]

> I already have a C64 userport IEC adapter which I think is easily converted to add the SRQ line by soldering on a patch wire

You probably know this, but just in case: The IEC bus is 5V, so you also need some kind of level shifter to safely connect it to the MiSTer board.
[/Edit]

@eriks5
Collaborator

eriks5 commented Aug 29, 2022

The data direction switching is external logic using a couple of 74LS chips (U58 with a bunch of inverters and pull-up resistors around it); it's not handled by the CIA directly. In the C128 the fast serial direction bit is actually provided by the MMU, probably to hide it in C64 mode. I haven't had time to look, but my suspicion was indeed that there's something wrong in my translation of that logic circuit. The schematics were tough to understand initially (for me).

> You probably know this, but just in case: The IEC bus is 5V

The Ironclad has the DB9 style SNAC connector which handles the level shifting afaik. Not looked into it in detail since I don't have it yet, but the SNAC/DB9 to IEC adapter I have is passive, so I think a single wire on IO6 to the correct pin on the DIN connector should do the trick.

@thierer

thierer commented Aug 29, 2022

> The data direction switching is external logic using a couple of 74LS chips (U58 with a bunch of inverters and pull-up resistors around it); it's not handled by the CIA directly. In the C128 the fast serial direction bit is actually provided by the MMU, probably to hide it in C64 mode.

I suspected this part first, but it seems to work as intended.

That SP isn't disconnected from the bus when CRA bit 6 is cleared is imho a bug: If I clear this bit on my C128 with FSDIR set and the last bit sent from the SP was a 0, then the DATA line goes high. If I do the same on the MiSTer, DATA stays low.

That said, it might not be the only problem, because it seems to affect receiving as well, when FSDIR should be cleared.

This is from trying to load a directory (the part where the actual content starts to be transferred, that's why the first two bytes are $01 and $04):

C128

image

MiSTer

image

DATA should be released after every byte, the listener (in this case the host) should then set it to acknowledge the byte. This is what happens with the C128, but with the MiSTer it keeps the value of the last byte received in the SR.

This isn't where the transfer hangs, one more byte is transferred:

C128

image

MiSTer [Edit]Sorry, posted the C128 image twice, first[/Edit]

image

I'm not really sure what could happen here, I'll have to check the protocol a bit better, but my guess is that the host waits for DATA high after the second byte, which only happens after the drive starts sending the next byte, and then the communication is somehow mixed up and stalls.

> The Ironclad has the DB9 style SNAC connector which handles the level shifting afaik.

I haven't heard about the "DB9-MiSTer" before, but I would indeed guess, that the models mentioned ("SLIM, DRIVE, MINI and PLUS") do level shifting between the board and the DB9 connector, so if you have one of these you should be fine. I just assumed you use a barebone de10 nano.

@eriks5
Collaborator

eriks5 commented Aug 29, 2022

Found the possible issue by reviewing the code. The incoming SRQ line is not fully connected internally. The (missing) DATA high is an ack that fast serial clk is detected. That ack is not sent because the incoming SRQ is not received. Since SRQ out is properly connected, one side tries to communicate using fast protocol while the other side uses the slow protocol...

The issue is on line 1189 in c128.sv. It should read similar to the line above.

Done all this on my phone, don't have access to my pc with Quartus right now, I'll try to get the fix in and an .rbf for you to test tomorrow

@thierer

thierer commented Aug 29, 2022

> Done all this on my phone, don't have access to my pc with Quartus right now, I'll try to get the fix in and an .rbf for you to test tomorrow

Oh wow, sorry, it's not that important. I didn't mean to cause any pressure!

Of course I'm happy to test, but no hurry! Unfortunately I haven't looked into building the core myself, yet, so yes, I would need a new .rbf.

@eriks5
Collaborator

eriks5 commented Aug 31, 2022

Please try this release when you have the time.

Made several changes including the one you suggested for sp_out. Somehow it didn't initially register with me you were referencing the 6526, but I figured it out :)

Not sure if it is the issue though, since the 1571/1581 in the MiSTer had no problem with this. If it doesn't solve this, it could be a timing issue. In the MiSTer everything is clocked from a single master clock, but with a real 1571 the clocks aren't in sync.

@thierer

thierer commented Aug 31, 2022

Thanks!

This looks better, but still doesn't work. This is the part where reading the directory starts, as above:

image

Now DATA goes high after both bytes, as with the C128.

Then some more bytes are transferred, but after a few bytes it always hangs with DATA high, which makes me suspect the host is missing an SRQ signal and keeps waiting for a byte:

image

I wrote a small test program to run on the C128 which waits for a byte to arrive in the SR and then changes the background color and displays the received byte value in the upper left corner of the screen.

When I send data, there are both cases where bits are apparently missed (the background color doesn't change after the byte has been sent) and cases where the wrong bit value is sampled (the color changes but the byte value is wrong).

I'll attach the program in case you'd like to play with it (cl65 -t none srtest.s -o srtest with cc65, but should be easy to port to a different assembler. SYS 8192 to start). I used a digital discovery adapter to send the data, but any (preferably SPI-capable) MCU board connected to SRQ and DATA should do.

srtest.zip

@eriks5
Collaborator

eriks5 commented Sep 2, 2022

I suspect noise on the SRQ and/or DATA lines is causing this. In the C128 circuitry there is a 74LS14 Schmitt-trigger inverter (U16 on this diagram) on the input side of DATA and SRQ going to the SP and CNT pins of the CIA, which is probably there to filter out noise on these lines.

Right now, any low-to-high transition on SRQ causes a shift-in of the next DATA bit. If SRQ is noisy it could register multiple transitions where there should have been one, causing the observed issues.

Is there any way you can test for this? I'll look into creating a digital filter in the FPGA on the input lines to test the theory, shouldn't be too hard.

--edit

Doesn't even have to be noise. If the signals rise or fall too slow, it could trigger a false transition as right now the SRQ line is sampled at 32 MHz. That probably should be on the CIA clock of 1 MHz, but even then it could occasionally trigger a false transition
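
As an illustration of the theory, here is a small Python sketch of how a slow or noisy edge can register as several rising edges when sampled fast, and how a simple digital filter collapses them into one. This is only a model, not the filter that ended up in the core; the `stable` parameter (how many consecutive samples must agree before the output changes) and the sample pattern are assumptions made up for the example.

```python
def count_rising_edges(samples):
    """Count 0 -> 1 transitions in a sequence of 0/1 samples."""
    return sum(1 for prev, cur in zip(samples, samples[1:]) if not prev and cur)


def glitch_filter(samples, stable=4, initial=0):
    """Change the output only after `stable` consecutive samples of the new level."""
    out, level, run = [], initial, 0
    for s in samples:
        if s != level:
            run += 1              # one more sample in favor of the new level
            if run >= stable:
                level, run = s, 0
        else:
            run = 0               # back at the old level: start over
        out.append(level)
    return out


# A single SRQ rising edge that bounces a few times while crossing the threshold.
noisy = [0] * 8 + [1, 0, 1, 0, 1, 1] + [1] * 8
raw_edges = count_rising_edges(noisy)
filtered_edges = count_rising_edges(glitch_filter(noisy))
```

For this pattern the raw signal shows three rising edges, so three bits would be shifted in, while the filtered signal shows one. The price is a delay of `stable` samples on every genuine edge.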

@thierer

thierer commented Sep 2, 2022

I can't rule that out, but I'm using the external IEC with the C64 core a lot (obviously without the SRQ line) and never found any issues.

I'm measuring the signals before the level shifting, so in theory if the noise is added later, it could affect the core without being visible on the trace. To test that, I could directly interface the digital discovery to the FPGA pins, which should remove that potential problem. I'll do that.

> Doesn't even have to be noise. If the signals rise or fall too slow, it could trigger a false transition as right now the SRQ line is sampled at 32 MHz

Could you maybe for a test version route the internal CNT and the "data ready" signal from the CIA to some other FPGA pins (maybe to the USER_IO 0 and 1 pins normally used for UART, but really any signal that's accessible on the GPIO pins should be fine) so I could include these signals in the LA trace? (I sample at 12 MHz, so the signal shouldn't be too short, maybe just toggle the output on every change?).

[Edit]
What also could be useful to check if there are spurious signals on the SRQ line is if you use an internal counter to measure the time between positive CNT edges and if it's less than ~7us (the 1571 sends with a cycle time of 7 us) then raise a signal on a GPIO pin. I could then sample with a much higher rate and trigger on that signal.
[/Edit]
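
The interval check could be prototyped offline on a captured trace before putting it into the FPGA. The Python sketch below assumes the positive CNT edge times have already been extracted from the logic analyzer capture as a list of microsecond timestamps; the function name and the `tolerance_us` margin are hypothetical.

```python
def short_cnt_intervals(edge_times_us, min_period_us=7.0, tolerance_us=1.0):
    """Return (edge index, interval) for CNT edges that follow the previous one too quickly."""
    flagged = []
    for i in range(1, len(edge_times_us)):
        interval = edge_times_us[i] - edge_times_us[i - 1]
        if interval < min_period_us - tolerance_us:
            flagged.append((i, interval))
    return flagged


# Regular 7 us bit cells with one spurious edge 0.5 us after the third edge.
edges = [0.0, 7.0, 14.0, 14.5, 21.0]
suspects = short_cnt_intervals(edges)
```

Here only the edge at index 3 is flagged. In hardware the same comparison would use a counter running on the sampling clock instead of timestamps.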

@eriks5
Collaborator

eriks5 commented Sep 2, 2022

The C128 does debouncing of the clk and data line for the slow protocol in software:

debpia
	lda d2pra	;debounce the pia
	cmp d2pra
	bne debpia
	asl a		;shift the data bit into the carry...
	rts		;...and the clock into neg flag

I'm sure the C64 and the drives do it as well. The CIA fast serial is pure hardware, so any debouncing done by the C128 hardware needs to be in the MiSTer.
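
For readers who don't read 6502 assembly, here is a rough Python equivalent of the debpia routine above. It is a sketch only: `read_port` is a hypothetical callable standing in for a read of d2pra, and the bit positions follow from the comments in the routine (`asl a` moves bit 7 into the carry and the old bit 6 into bit 7, i.e. the negative flag).

```python
def debounced_read(read_port):
    """Re-read the port until two consecutive reads agree, then split off DATA and CLK."""
    while True:
        a = read_port()          # lda d2pra
        if a == read_port():     # cmp d2pra / bne debpia
            break
    data = (a >> 7) & 1          # asl a: bit 7 ends up in the carry flag
    clock = (a >> 6) & 1         # old bit 6 becomes bit 7, i.e. the N flag
    return data, clock


# The port changes between the first two reads, so the routine reads again.
reads = iter([0x80, 0x40, 0x40, 0x40])
result = debounced_read(lambda: next(reads))
```

The first pair of reads disagrees (`$80` vs `$40`), so the loop repeats and returns the stable value. The fast serial path has no such loop, because the CIA shifts bits in hardware on every CNT edge.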

If you had Quartus you could use Signal Tap to look at the internal signals. I'll look into exporting some signal.

@thierer

thierer commented Sep 2, 2022

> The C128 does debouncing of the clk and data line for the slow protocol in software

But the fastloaders generally don't. But I agree, the situation is different, as the edge detection is done in hardware.

> If you had Quartus you could use Signal Tap to look at the internal signals.

I do have Quartus (13.1 and 20.1), I just couldn't yet be bothered to install 17.0 as a third version :) Maybe 20.1 would do too, I haven't looked into that yet.

Then again, my knowledge of Verilog / VHDL is only very basic, anyway. Plus I've never even used Signal Tap. So there would be a rather steep learning curve.

@thierer

thierer commented Sep 2, 2022

> To test that, I could directly interface the digital discovery to the FPGA pins, which should remove that potential problem.

Did that. I connected the digital discovery directly to the FPGA GPIO SRQ/DATA pins and sent data while running my "srtest" program on the MiSTer core.

I also changed the data rate to 1 bit/s, so I hoped I could see if the core received a spurious CNT signal, because then the background color would change before the complete byte is transferred.

With this setup I could still easily reproduce the transfer errors, but without any indication that the byte arrived early because of a spurious CNT (but, see below).

As a test, I also once sent only 7 bits and waited to see how likely an extra CNT would be, because this would complete the byte and change the background color. It sat for minutes without completing the byte (but CNT is high in this case; maybe noise is more likely while CNT is low, or, as you suggested, during an edge).

Now, in one case I actually observed an early byte transfer, probably caused by some noise. So, yes, there should be some filtering equivalent to the Schmitt-Trigger, but it doesn't seem to be the main problem.

Another thing I noticed (probably not related) was that in this one error case, the SR stayed out of sync in following transfers.

I would have expected that toggling the SR direction should take care of that? Am I missing something?

https://github.com/eriks5/C128_MiSTer/blob/ee8d38774c4095ea70b2fd0dd34e8278649aa999/rtl/mos6526_8520.v#L545

What I do:

    ; reset SR
    lda CRA
    ora #$40
    sta CRA
    and #<~$40
    sta CRA

[Edit]I think I know what's happening: I'm doing this immediately after a byte has been received in the SR. If the transfer is out of sync, bits are still transferred at this time. I would have to wait for 8s to be sure the current transfer is done before clearing the SR.[/Edit]
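
The effect described in the edit can be reproduced with a toy model in Python. It assumes only what the core's code appears to do, namely that the bit counter restarts whenever CRA bit 6 changes, and that bits are shifted in MSB first. The class and helper names are hypothetical, and whether the shift register contents are also cleared on a direction change is not modeled.

```python
class ShiftRegisterModel:
    """Toy model of the SR input side: 8 CNT edges complete a byte."""

    def __init__(self):
        self.shift = 0
        self.bits = 0
        self.received = []

    def cnt_edge(self, sp):
        self.shift = ((self.shift << 1) | sp) & 0xFF
        self.bits += 1
        if self.bits == 8:
            self.received.append(self.shift)
            self.bits = 0

    def toggle_direction(self):
        self.bits = 0  # setting and clearing CRA bit 6 restarts the bit counter


def bits_of(value):
    return [(value >> i) & 1 for i in range(7, -1, -1)]


def receive(stream):
    sr = ShiftRegisterModel()
    for item in stream:
        if item == "reset":
            sr.toggle_direction()
        else:
            sr.cnt_edge(item)
    return sr.received


spurious = [1]  # one extra CNT edge before the first byte
# Reset right after the (misframed) byte completes: one real bit is still pending.
early = receive(spurious + bits_of(0x01)[:7] + ["reset"] + bits_of(0x01)[7:] + bits_of(0x04))
# Reset only after the sender has finished all 8 bits.
late = receive(spurious + bits_of(0x01) + ["reset"] + bits_of(0x04))
```

With the early reset both bytes come out wrong, because the leftover bit of the first byte is counted as the first bit of the next one. With the late reset the second byte is received as `$04` again.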

@thierer

thierer commented Sep 2, 2022

> Maybe 20.1 would do too, I haven't looked into that yet.

That wasn't too bad. Just copied pll_q17.qip to pll_q20.qip and off it went. And the resulting rbf even at least seems to work.

Now I only have to figure out how to do something useful :)

@eriks5
Collaborator

eriks5 commented Sep 3, 2022

Signal Tap needs a lot of memory cells in the FPGA to be useful; unfortunately the C128 fills them up almost to the limit. I use the changes in this patch to free up some memory cells so I can do longer traces: reduce-mem.zip. It disables the C128 standard romset and reduces VDC memory to 16k.

The following .stp file traces IEC data I/O: iec_stp.zip

If it all works, you can get the following trace when loading something on the internal 1581:

iec

To see the external IEC signals, you'll need to tap emu:emu|ext_iec_*

@eriks5
Collaborator

eriks5 commented Sep 3, 2022

Could you try the new version in the iec branch?

@thierer

thierer commented Sep 3, 2022

> Could you try the new version in the iec branch?

I just did a quick test trying to load the directory, but unfortunately it still doesn't work. It still gets stuck with DATA high after a few bytes.

I guess the best way forward is to familiarise myself with Signal Tap. It seems to be the appropriate tool for the job.

@thierer

thierer commented Sep 3, 2022

I haven't yet gotten Signal Tap to do what I want, but made an interesting (unless I'm making some stupid mistake) observation: What actually does help is your "reduce-mem" patch 🤡.

I couldn't believe it and went back to the unpatched version, but it seems to be reproducible.

It's still not perfect; I can load the directory, but some characters are garbled.

@thierer

thierer commented Sep 3, 2022

Registering cnt_in seems to fix it for me; even without be1d159e and 5a6141b0. (I haven't done extensive tests yet, but it looks way better than everything else so far).

I'm not sure this is the correct way to do it and cnt_in_r should probably replace cnt_in in other places, too, but you get the idea.

diff --git a/rtl/mos6526_8520.v b/rtl/mos6526_8520.v
index 88b9a94..0657e66 100644
--- a/rtl/mos6526_8520.v
+++ b/rtl/mos6526_8520.v
@@ -83,6 +83,7 @@ reg        sp_transmit;
 reg [ 7:0] sp_shiftreg;
 reg        icr3;
 
+reg        cnt_in_r;
 reg        cnt_in_prev;
 reg        cnt_out_r;
 reg [ 2:0] cnt_pulsecnt;
@@ -502,7 +503,7 @@ always @(posedge clk) begin
     end
 
     if (!cra[6]) begin // input
-      if (cnt_in && !cnt_in_prev) begin
+      if (cnt_in_r && !cnt_in_prev) begin
         sp_shiftreg = { sp_shiftreg[6:0], sp_in };
         if (cnt_pulsecnt == 3'h0) begin
           sdr  <= sp_shiftreg;
@@ -532,18 +533,21 @@ end
 always @(posedge clk) begin
   reg cra6_prev;
   if (!res_n) begin
+    cnt_in_r     <= 1'b1;
+    cnt_in_prev  <= 1'b1;
     cnt_out_r    <= 1'b1;
     cnt_out      <= 1'b1;
     cnt_pulsecnt <= 3'h0;
     cra6_prev    <= 1'b0;
   end
   else begin
-    cra6_prev <= cra[6];
-    cnt_in_prev <= cnt_in;
-    cnt_out <= cnt_out_r;
+    cra6_prev   <= cra[6];
+    cnt_in_r    <= cnt_in;
+    cnt_in_prev <= cnt_in_r;
+    cnt_out     <= cnt_out_r;
 
     if (cra[6] != cra6_prev) cnt_pulsecnt <= 3'h0;
-    if (cra[6] ? (!cnt_out_r && cnt_out) : (!cnt_in && cnt_in_prev)) cnt_pulsecnt <= cnt_pulsecnt + 1'b1;
+    if (cra[6] ? (!cnt_out_r && cnt_out) : (!cnt_in_r && cnt_in_prev)) cnt_pulsecnt <= cnt_pulsecnt + 1'b1;
 
     if (phi2_p) begin
       if (!cra[6]) cnt_out_r <= 1'b1;
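
A cycle-by-cycle Python model of the patched edge detection may help when reading the diff. It mirrors only the two registers and the rising-edge test from the patch, with the reset values shown there, and it treats `cnt_in_samples` as one value of cnt_in per clk edge (an assumption of this sketch). It cannot capture FPGA timing effects, so it shows what the patch changes logically, not why it helps.

```python
def detect_rising_edges(cnt_in_samples):
    """Return the clock indices at which the patched logic would shift in a bit."""
    cnt_in_r = 1      # reset values from the patch
    cnt_in_prev = 1
    shifts = []
    for clk, cnt_in in enumerate(cnt_in_samples):
        # The test sees the register values from before this clock edge.
        if cnt_in_r and not cnt_in_prev:
            shifts.append(clk)
        # Non-blocking assignments: both registers update from the old values.
        cnt_in_r, cnt_in_prev = cnt_in, cnt_in_r
    return shifts


# cnt_in goes high in sample 3; the shift happens one clock later, at index 4.
shifts = detect_rising_edges([1, 0, 0, 1, 1, 1])
```

Each rising edge is still detected exactly once, just one clk cycle later than with the unregistered cnt_in.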

@eriks5
Collaborator

eriks5 commented Sep 4, 2022

Great! However I don't understand why this works but the debouncing didn't. The debouncing effectively also latches cnt_in, but a bit earlier in the chain. The only difference is that cnt_in is delayed 1 clock cycle now. We might be running into Quartus doing a weird optimization that is defeated with the extra register.

Anyway, I'll commit these changes and will take another look at it to see if I can figure out why this works when I get the hardware to connect a 1571 to the MiSTer myself.

Do you know if it works with 5a6141b in? That fix should probably stay in as the 6526 would also sample cnt at the phi2 clock.

@thierer

thierer commented Sep 5, 2022

> However I don't understand why this works but the debouncing didn't.

Me neither. I plan to investigate it, it's just that I thought I'd start with the completely broken version, because triggering on random transfer errors is harder.

> Do you know if it works with 5a6141b in?

Just checked and for some reason it doesn't. But 5a6141b alone doesn't work any more, either. I don't know if I'm doing something wrong or it's just that the whole timing is very unreliable. The effect I observed from the "reduce-mem" patch would suggest that.

> That fix should probably stay in as the 6526 would also sample cnt at the phi2 clock.

Good point. But I think this still differs from what the 6526 does. I did some tests a while ago, and from what I remember CNT gets latched in the phi2 high phase, but SR is sampled only 2 phi2 cycles later.

The implementation should probably reflect that, but I'm not sure what would be the best way. (I don't know what's exactly happening in the first phi2 cycle following the positive CNT edge, either). I'd have to look into what I did back then.
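
To show why the extra delay matters, here is a rough Python sketch that samples SP a configurable number of phi2 cycles after each rising CNT edge. It takes the "2 phi2 cycles later" figure from my recollection above as an assumption, works on one sample per phi2 cycle, and is not a verified model of the 6526. The function name and the example waveforms are made up.

```python
def sample_sr(cnt_per_cycle, sp_per_cycle, delay_cycles=2):
    """Return the SP values shifted in, sampled `delay_cycles` after each rising CNT edge."""
    bits = []
    prev = cnt_per_cycle[0]
    for n, cnt in enumerate(cnt_per_cycle):
        if cnt and not prev:
            m = n + delay_cycles
            if m < len(sp_per_cycle):
                bits.append(sp_per_cycle[m])
        prev = cnt
    return bits


# SP changes one cycle after the CNT edge, so the two sampling points disagree.
cnt = [0, 1, 1, 1, 0, 0, 0, 0]
sp = [1, 1, 0, 0, 0, 0, 0, 0]
immediate = sample_sr(cnt, sp, delay_cycles=0)
delayed = sample_sr(cnt, sp)
```

If the sender changes SP shortly after the CNT edge, sampling at the edge and sampling two cycles later yield different bits, so a core that samples at the edge could disagree with real hardware on such a sender.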

@eriks5
Collaborator

eriks5 commented Sep 7, 2022

> I did some tests a while ago, and from what I remember CNT gets latched in the phi2 high phase, but SR is sampled only 2 phi2 cycles later.

I read about that a while back! And then promptly forgot about it again 🤣 Can't remember where I read it and can't find it again.

> The implementation should probably reflect that, but I'm not sure what would be the best way. (I don't know what's exactly happening in the first phi2 cycle following the positive CNT edge, either)

Yes, should have that too if it's to properly communicate with real hardware. It might just be due to the way things are implemented internally. Or it might be some signal debouncing happening in the 6526!

The board I was waiting for finally shipped, I should be able to connect a 1571 to the MiSTer myself in about 2 weeks. I'll dive back into this then.

@thierer

thierer commented Sep 7, 2022

> I read about that a while back! And then promptly forgot about it again 🤣 Can't remember where I read it and can't find it again.

I have no idea where you could have read it, but I discussed my findings in this thread on forum64.de. (At the time I was under the impression that CNT gets "sampled" at the negative phi2 edge, but I now think that was a lack of understanding on my part; it's probably just that CNT is latched by phi2 and then registered on the next positive phi2 edge. Also, in the post I linked, I wrote that the worst case was 3.5 cycles. That was a typo, it should have read 2.5 cycles.)

In post #42 of that thread the OP details what he thinks the CNT input stage looks like. I have no idea if he did any more research (that doesn't happen to be you, does it 😂? I think he's from the Netherlands, too).

Using the FPGA board to debug was a bit tedious, so I recently designed a hat for a STM32 Nucleo-64 board to directly control a 6526. Using the STM32 it's possible to drive the lines with at least 20ns resolution (Unfortunately there aren't enough usable 100 MHz timers for all signals). I ordered prototype boards just yesterday, most of the software still needs to be written, though 😁.

@eriks5
Collaborator

eriks5 commented Sep 7, 2022

That topic looks familiar, so it probably was it 😄 No I'm not the OP.

The serial port implementation of the 6526 in the MiSTer C64 core was not working for me when I initially tried getting fast serial working, so I googled a lot to find hints on how to fix it.

@eriks5
Collaborator

eriks5 commented Sep 25, 2022

Fixed the fast serial IO issue. I was finally able to test with real hardware, and it now works for me.

I have split up the existing 'iec' branch I've been testing with until now and extracted the fast-serial specific fixes into a new branch (https://github.com/eriks5/C128_MiSTer/tree/iec-ext-fast-serial) to keep things clean. This new branch does not contain any of the 1571 changes, just the changes for using fast serial with the internal 1581 (which already worked) and the external IEC.

There was also an issue with IEC not working properly when both internal and external IEC were active. This is fixed now too, so it's possible to e.g. have drive 8 be an internal drive and drive 9 an external one.

rbf can be downloaded here: https://github.com/eriks5/C128_MiSTer/releases/tag/20220925-fclk

@eriks5 eriks5 closed this as completed Sep 25, 2022
@thierer

thierer commented Sep 30, 2022

I tested the new version (sorry it took me so long...) and can confirm that it works for me like a charm, too!

I also tested (only with the first disk, but I think that should still give a good indication) the "Colour Spectrum" demo, which doesn't use the kernal for loading but an IRQ loader that also uses burst mode for transfers. I thought this would be a good test for the CIA implementation as the loader might use it in a slightly different way than the kernal, but it seems to work fine! The title screen has some issues, but that's probably due to the VDC implementation. I couldn't spot any problems with the graphics in the actual slideshow.

Thank you so much, I'm really excited for this C128 core!

@eriks5
Collaborator

eriks5 commented Oct 1, 2022

Awesome that it works for you too! And thanks for the pointer to that demo. It doesn't work with the internal 1571 I'm working on.

It's also useful for testing the VDC! The initial video corruption is because it's using interlaced mode which I haven't implemented yet. And I'm actually surprised the scroller works.
