Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ADF write corruption #32

Closed
jamarju opened this issue Dec 10, 2017 · 54 comments
Closed

ADF write corruption #32

jamarju opened this issue Dec 10, 2017 · 54 comments
Labels

Comments

@jamarju
Copy link

jamarju commented Dec 10, 2017

I'm getting sector checksum errors when writing to an ADF image. Sorry for being so vague, I will try some way to reproduce it consistently. So far it's happened while saving a civilization game but also just copying files over from Workbench. If there is useful data from the firmware out to the serial port I can try and dump that too. Also THANKS for building this, this software is a godsend (therefore you must be a god).

@bstrobel
Copy link

bstrobel commented Dec 12, 2017

I noticed something similar recently. I copied real floppies using the internal floppy disk drive to a Gotek connected as the second/external drive on an A500. I used the Workbench 1.3 diskcopy utility from the command line. Everything seemed to work fine until I tried to use the copies from the disks. I get disk corrupted errors (I do not remember the exact error message at the moment, need to check) for many of them but not all.

I have programming skills in C, an oscilloscope, a logic analyzer and a ST_LINK/V2-1 debugger/programmer from a Nucleo-64 board. The only thing I lack is time to really dive into it. But I hope this is better during the Christmas holidays. Happy to get your advice @keirf where to start and how to approach the issue.

Anyway thanks for the cool project and providing it for free!

@keirf
Copy link
Owner

keirf commented Dec 12, 2017

Always worth trying another USB stick.

bstrobel: First port of call is to make a debug build and then you get serial logging. This will tell you, for example, whether any writes were missed (which, if writing to an ADF, will generally mean your flash drive is very slow on some writes).

@bstrobel
Copy link

Ok, will try this. Will take some time since I have to set up the tool chain first. I'll come back once I have it up and running.

@bstrobel
Copy link

Sorry, that I need to ask, but how to make a debug build? I have set up the dev env and have also executed a "make dist" successfully. But how do I tell make to build a debug version of FlashFloppy?

@keirf
Copy link
Owner

keirf commented Dec 12, 2017

debug=y make -j8 gotek

You can adjust the baud rate at the top of src/console.c (default is 3Mbaud)

@jamarju
Copy link
Author

jamarju commented Dec 23, 2017

Here is the log file of FlashFlopy 0.95a while copying a file that resulted in checksum errors. FlashFloppy compiled with debug=y make gotek and baudrate = 230400 in src/console.c:

ff.log

Aside the normal "Write x/y" lines, I noticed these:

New max: read_us=1809
New max: read_us=3532
New max: read_us=4607
New max: read_us=4906
New max: read_us=8486
Trk 86: late 534us
Trk 102: late 576us
Trk 104: late 302us
Trk 116: late 769us
Trk 128: late 945us
Trk 156: late 19us
Trk 6: late 112us
New max: read_us=30309
Trk 36: late 222us
Trk 42: late 524us
Trk 44: late 774us
Trk 38: late 1032us

An attempt to read the copied file (it's a lha file so I'm just doing a lha t file.lha to test for errors) results in:

"checksum error on disk block 980".

@keirf
Copy link
Owner

keirf commented Dec 24, 2017

Can you attach the damaged ADF? There is nothing obvious in the log. All the Write x/y lines appear to come in groups of 11 (as expected), and there are no warnings about bad headers or missed writes. This might therefore indicate that writes are getting into Gotek RAM okay but writeback to USB is going wrong...

@jamarju
Copy link
Author

jamarju commented Dec 24, 2017

Here it is:

damaged_adf.zip

@jamarju
Copy link
Author

jamarju commented Dec 29, 2017

Okay, so I extracted the test file from the adf ignoring the checksum errors, hexdumped it and took a diff -C 4 GOOD.hex BAD.hex> good-bad.txt, the results of which are:
good-bad.txt

Then I noticed that the errors consist of either:

  • 64 byte block duplications:

captura de pantalla 2017-12-29 a las 22 49 39

  • or right-aligned 56 byte duplications within a 64 bytes block (there are a few examples in that diff, too).

@keirf
Copy link
Owner

keirf commented Dec 30, 2017

If it's not a hardware problem (stick, Gotek) then another plausible explanation is a bug in the USB stack, perhaps sending the same packet twice (USB full-speed packets are 64 bytes). There's a data FIFO in the STM32 chip's USB hardware, and I wonder what happens if that doesn't get filled quickly enough as data is sent. You could try raising the priority of the USB IRQ -- it is USB_IRQ_PRI at the bottom of inc/util.h and you could try changing the value from 14 to 5.

@jamarju
Copy link
Author

jamarju commented Dec 30, 2017

I'll try that and report back, thanks.

@tomse
Copy link

tomse commented Dec 30, 2017

I was installing MSDOS622 using 0.9.5a today, and I had several read errors.
it seemed that the error occured on tracks 37 and 57, could be coincidence I don't know.
I wasn't able to boot on the newly installed DOS - that may or may not be related at all as I installed using a new IDE->SD adapter + SD as a harddrive.

@jamarju
Copy link
Author

jamarju commented Jan 3, 2018

Raised the USB IRQ priority to 5, didn't seem to help. Also I compared two ADFs after doing the same operation on both (copy testfile df0:). Both ADFs were corrupted at random different places, so this is not deterministic at the ADF level.

However, I can see a pattern: duplications always happens at offsets ending in 0xC0.

*** 37625,37636 ****
  00092f80: c6fe 502b 7916 a8c7 32d2 6006 cc70 ac47  ..P+y...2.`..p.G
  00092f90: 88ba 0393 8995 95a1 21a4 a417 67c0 5560  ........!...g.U`
  00092fa0: ab4c 48a8 ff1f 308d 85c3 8016 004d 865d  .LH...0......M.]
  00092fb0: 7586 3a41 f023 0258 eb10 7490 2a80 d113  u.:A.#.X..t.*...
! 00092fc0: c6fe 502b 7916 a8c7 32d2 6006 cc70 ac47  ..P+y...2.`..p.G
! 00092fd0: 88ba 0393 8995 95a1 21a4 a417 67c0 5560  ........!...g.U`
! 00092fe0: ab4c 48a8 ff1f 308d 85c3 8016 004d 865d  .LH...0......M.]
! 00092ff0: 7586 3a41 f023 0258 eb10 7490 2a80 d113  u.:A.#.X..t.*...
  00093000: 0000 0008 0000 0372 0000 0122 0000 01e8  .......r..."....
  00093010: 0000 0499 6d5a 65ff 2db0 dc73 8019 8384  ....mZe.-..s....
  00093020: 22fc 1117 4c43 87d4 f895 c8af 23f8 30cd  "...LC......#.0.
  00093030: d793 e1c8 7495 c50c e16e 51cd ab44 a84d  ....t....nQ..D.M
--- 37625,37636 ----
  00092f80: c6fe 502b 7916 a8c7 32d2 6006 cc70 ac47  ..P+y...2.`..p.G
  00092f90: 88ba 0393 8995 95a1 21a4 a417 67c0 5560  ........!...g.U`
  00092fa0: ab4c 48a8 ff1f 308d 85c3 8016 004d 865d  .LH...0......M.]
  00092fb0: 7586 3a41 f023 0258 eb10 7490 2a80 d113  u.:A.#.X..t.*...
! 00092fc0: 5ccb 04f4 69e9 0aef 9829 fe68 5a08 3473  \...i....).hZ.4s
! 00092fd0: 986b 4017 c51e d401 1d6f 85ef 17ed bd85  .k@......o......
! 00092fe0: 4e3a db79 2bb4 f8a9 0210 e81b af8f f5d8  N:.y+...........
! 00092ff0: 7920 0819 5629 1e24 a8f1 23c6 3857 a5fc  y ..V).$..#.8W..
  00093000: 0000 0008 0000 0372 0000 0122 0000 01e8  .......r..."....
  00093010: 0000 0499 6d5a 65ff 2db0 dc73 8019 8384  ....mZe.-..s....
  00093020: 22fc 1117 4c43 87d4 f895 c8af 23f8 30cd  "...LC......#.0.
  00093030: d793 e1c8 7495 c50c e16e 51cd ab44 a84d  ....t....nQ..D.M

@keirf
Copy link
Owner

keirf commented Jan 3, 2018

Wouldn't particularly expect the corruption to be deterministic, this sort of bug often isn't. You haven't said whether you've tried other USB sticks or Gotek hardware, yet?

@jamarju
Copy link
Author

jamarju commented Jan 5, 2018

I might have solved this problem and it was NOT related to FF.

I reformatted the Amiga hard drive (actually a 2 GB CF) and installed WB 2.1 but this time I used scsi.device 43.45 + PFS3 filesystem instead of the default scsi.device + FFS. I'm using Kickstart 37.300 which is known to cause problems with HD sizes >40 MB so it might make sense that corruption was actually originated at the HD reads.

I'm cautious because bad HD reads don't explain ADF FFS metadata corruption very well (CRC errores reading back from the ADF). But so far I've tested writing to 8 ADF images on both a Sandisk Cruzer Edge 16 GB (4-5 MB/s write speed) and a Kingston Datatraveler G4 16 GB (~16 MB/s write speed) and they all worked fine whereas I would get 100% chance of corruption with my previous config.

@keirf
Copy link
Owner

keirf commented Jan 5, 2018

I wonder if there are any viruses can cause that sort of behaviour?

The definitive test would be to revert your setup, reproduce the errors on FlashFloppy, then test same configuration with a real floppy drive. But that sounds like quite a pain!

Do you want to test some more before closing this ticket down?

@tomse
Copy link

tomse commented Jan 5, 2018

You did of course set maxtransfer to 0x1fe00
otherwise this will lead to read corruption on flash cards

@jamarju
Copy link
Author

jamarju commented Jan 5, 2018

@tomse Yes, I used maxtransfer=0x1fe00 in both scenarios. I think switching from kickstart's scsi.device to 43.45 was the key.

@keirf Well there were two more reports but you can count mine as a no-bug. The FFS metadata corruption is plausible considering that there is no memory protection and a misbehaved scsi.device could affect anywhere in RAM.

@jamarju
Copy link
Author

jamarju commented Jan 5, 2018

Also, I'm attaching the FF binary up to commit 09e5b85 with debug enabled, serial bitrate = 230400 and USB IRQ priority = 5 to help people test without going through installing the ARM toolchain and all.

FF_Gotek-v0.9.5a.upd.zip

@keirf
Copy link
Owner

keirf commented Jan 6, 2018

bstrobel seems to have gone quiet. I will probably close this ticket down next week if there are no further reports.

@bstrobel
Copy link

bstrobel commented Jan 6, 2018

Quiet yes, but I have not lost interest in this topic. Anyway, I had some problems with my Amiga that I needed to solve first. That's why I couldn't investigate further. The connector of the PSU was corroded and causing unstable power supply. It took me some time to find this out. The Amiga seems to be stable now. I will test tomorrow. I think the PSU problem could also be a possible cause of the corrupted files. I'll let you know.

@bstrobel
Copy link

bstrobel commented Jan 7, 2018

No good news. Each file is corrupt it seems.

Test procedure:

  • Created an empty ADF file using WinUAE (well known Amiga emulator for PC).
  • made 5 copies (named test(1).adf, test(2).adf, ...) of the ADF file and copied them to a know good USB stick, not the one that i use to use with my Gotek
  • inserted the USB stick in to the Gotek attached to the Amiga before switching everything on
    • Gotek is attached as the 3rd floppy disk in the system. First the internal drive, then a regulare external 3,5" floppy drive and finally the Gotek as the last drive.
    • Reading from the Gotek always works like a charm (i.e. without any errors or data corruption).
    • When I did my first tests a while ago, I had the Gotek connected as the second drive with the same results.
  • booted the Amiga from the first generation copy of the original Amiga Workbench 1.3 disk (AmigaDOS).
  • "inserted" the first ADF test(1).adf using the rotary encoder and the OLED display on the Gotek
  • opened a shell window from the Workbench and executed the following command "diskcopy from DF0: to DF2: name test"
  • after copying finished I "ejected" the ADF file and "inserted" the next one until I had copied the same data to all 5 ADF files.

Result:

  • All 5 ADF files are different.
  • Used WinMerge to compare.
  • Used HexEditor of my EPROM Programmer Tool (Batronix Prog-Express) to actuall display differences.

The 5 ADF files and screenshots showing the comparison results can be found here:
https://www.dropbox.com/sh/bm18ky3s1q3x59z/AAAAmZNdZPAtr1OX0iadtDuia?dl=0

@bstrobel
Copy link

bstrobel commented Jan 7, 2018

I just noticed that I can easily attach files and pictures to these messages. So for convenience here are the screenshots:

How I copied the files on the Amiga:
amiga screen

WinMerge displaying the differences:
winmerge_test1_test2
winmerge_test1_test3
winmerge_test2_test3

My hex editor showing one occurrence of a difference. Note the strange data pattern there...
hex_test1_test2

Zip containing the 5 ADF files that should be identical:
GotekProb.zip

@bstrobel
Copy link

bstrobel commented Jan 7, 2018

One more thing I forgot to mention. I used AmigaDOS diskcopy in verify mode. So it verifies each track after it copied it to the destination disk. This verification always succeeds. It never shows an error. I wonder how this can be. Does your software cache the data somehow? I would expect that it always reads the data from the file from the stick if it is asked to... If this is the case it would more point to a problem on the Amiga, but how can this be?

@bstrobel
Copy link

bstrobel commented Jan 7, 2018

And finally for today the debug log. I have built the version 0.9.5a with debug enabled and baudrate set to 115200. I have renamed the version to 0.9.5a-d and used the .upd file to update the firmware of the Gotek.

I think your right that the USB stick might be too slow, because I see this in the log:
Write start 50663 us Write 118/0... 3782 us Write 118/1... 3994 us Write 118/2... 3507 us Write 118/3... 3526 us Write 118/4... 3497 us Write 118/5... 4008 us Write 118/6... 3555 us Write 118/7... 3973 us Write 118/8... 3537 us Write 118/9... 3998 us Write 118/10... *** Missed write 322261 us Write start 69602 us Write 120/0... 3927 us Write 120/1... 3603 us Write 120/2... 3631 us Write 120/3... 3614 us Write 120/4... 3566 us Write 120/5... 4028 us Write 120/6... 3619 us Write 120/7... 4048 us Write 120/8... 3568 us Write 120/9... 3453 us Write 120/10... 3498 us

Here is the full log. It covers a whole disk copy:
debug_log.txt

I have only 3 USB sticks (don't need them so often any more these days). They are all of the same type: "Kingston Data Travel 2.0 16GByte" USB 2.0 Sticks. They work well in my oscillograph, for OS installs on PCs etc.

@keirf
Copy link
Owner

keirf commented Jan 8, 2018

Stick was too slow, try another. I will improve performance with slower sticks in a future release. If you can find one these days, I like Lexar. The reason slow sticks are a problem for disk copiers (except ones that do write+verify track by track) is that there is no flow control: disk copier moves immediately to next track and starts writing to the target, while USB stick is still writing previous track, and while that's happening FlashFloppy can't buffer that next track meanwhile (I will fix that in future).

And your Diskcopy verify works because it only verifies that it sees valid amigados track data, it doesn't check that the track data is correct (ie the result of the diskcopy)!

@keirf keirf closed this as completed Jan 8, 2018
@bstrobel
Copy link

bstrobel commented Jan 8, 2018

Yes, I think your right. But... As I wrote above, the disk copy software almost always reads back immediately what it has written to verify the success (not only in the Amiga). I think it is a bug that the disc copy software seems to be happy but the data has actually not been written to the stick.

The only explanation for this to me is that FlashFloppy returns the just written data from its internal memory and does not actually read it from the stick.

To illustrate the severity just look at my use case. These sticks where not the cheapest and are not very old. They are not fast but sufficient for occasional data transfer. (And Kingston is even something like a "brand".) So I as a user would assume they work fine. How can I know that they are too slow for FlashFloppy? How can I be sure that the data has been successfully written even if I use different sticks? There needs to be consistent behavior in this case and definitely a notification to the user.

Not sure how to fix this. Didn't look at your code. But I think if it is not easily achievable to always read the data from the stick I would at least display an error message on the display or blink the LEDs or something. I'd suggest you reopen this case...

@bstrobel
Copy link

bstrobel commented Jan 8, 2018

Oh sorry, I just noticed that you already answered that diskcopy thing. Yes, you're right they most probably dont compare the read back data to the data they wrote. Didnt think of this. Anyway, you need to notify the user.

@keirf
Copy link
Owner

keirf commented Jan 8, 2018

For example try with XCopy with verify enabled. You will find your copies are reliable.

Regarding "notify the user" you mean indicate via display when writes are dropped?

@keirf keirf reopened this Jan 8, 2018
@bstrobel
Copy link

bstrobel commented Jan 8, 2018

Yes, exactly. For Amiga I would even "eject" the floppy image. That would notify the user of the Amiga that the copy has failed.

You know, I copied about 20 floppies to images in a row only to find out afterwards that I cannot trust a single one of the image copies.

@keirf
Copy link
Owner

keirf commented Jan 8, 2018

Well AmigaDOS diskcopy is a bit crap to be fair.

Still... it should at least be a config option to error on missed write. I might add:
missed-write = error | warn | ignore

And make either warn or error the default...

@bstrobel
Copy link

bstrobel commented Jan 8, 2018

Yes, I agree about Amiga diskcopy. :) I guess I had to learn it the hard way since I have the Amiga for a few month only yet.

Aside from that I wonder if other applications are always better than the Amiga. For instance I just can't imagine that a keyboard/synthesizer from the 80's or 90's has a better copy program (my Gotek was originally shipped and packaged for fitting into a synthesizer).

I still think ejecting the image - which looks like ejecting the floppy to the device - is the best way to show something is wrong. The image is unusable anyway. On the OLED you could just display something like "Stick too slow, image ejected". On the 7 seg display "SL" for "too SLow" maybe?

@jamarju
Copy link
Author

jamarju commented Jan 8, 2018

@keirf Are you thinking of double-buffering the track? I can try to implement that and send a PR over for you to review. No guarantees at all since I'm probably overoptimistic about my understanding of the code.

@keirf
Copy link
Owner

keirf commented Jan 8, 2018

@jamarju if you can decipher my code and implement that then you are a coding ninja :)

@jamarju
Copy link
Author

jamarju commented Jan 8, 2018

I'm thinking of setting dma_wr->state = DMA_inactive before calling F_Sync. By the time F_Sync is called the flux ring should have been consumed fully and be prepared to receive a new track while F_Sync happens in parallel. That's assuming F_Sync is the bottleneck and not the various F_writes inside the ADF writing handler.

@bstrobel
Copy link

Hi again! So I ordered new USB sticks from Amazon: SanDisk Ultra Flair 16GB USB-Flash-Laufwerk USB 3.0 mit bis zu 130 MB/Sek.

I also used X-Copy 6.0.1 for copying, but:
x-copy-error

Here is the debug log:
new_stick_debug_log.txt

Anyway, I also did some tests and this is what I noticed:

  • If let X-Copy retry to write the erroneous block it never succeeds. The block stays bad. In the log I see "missed writes" every time I retry. I wouldn't expect this. If I retry after a few seconds to write the same block again, the stick should have recovered from the possible overrun condition and should be able to write at least that 512 byte block successfully.
  • I checked on Wikipedia that an Amiga floppy runs 11 sectors of 512 raw bytes with 300rpm (5 rounds per seconds) which is 28.16 KB/sec. Which USB stick is so slow? I doubt that seriously. Even if FlashFloppy needs ten times the overhead to process "all this data" it should only need 280 KBytes/sec or 0.28 MBytes/second.
  • With the new USB sticks I get "missed writes" for almost every sector although X-Copy doesn't report a very error. Didn't attach this log but I can if you need it. It just contains a lot of "missed writes messages".

To sum up I doubt the "slow USB stick" theory seriously. I think there are some other things wrong. What do you think?

@keirf
Copy link
Owner

keirf commented Jan 15, 2018

@bstrobel looking at the log you are certainly correct there is something else up. Actually X-Copy glitches the WGATE signal at the start of track writes for some reason. I added a workaround for that back in August but I think I broke that workaround not long after!

Please can you try the patch on branch wgate-glitch-filter (i.e., "git checkout wgate-glitch-filter")? Hopefully it will work better. Really you should not see missed writes in X-Copy Verify mode. Let's drill into this further if you still do. You may for example need to put a logic analyser on the floppy interface (especially the WGATE line).

@keirf keirf added the bug label Jan 15, 2018
@bstrobel
Copy link

Thanks @keirf! I tested the wgate-glitch-filter branch. Same result. At least as far as I can tell from the log. Here it is:
wgate-glitch-filter-branch_debug_log.txt

I noticed one strange thing. Where the log stops there is no "missed write" message but actually X-Copy is reporting "Verify Fehler". At lot of "missed write" errors were actually logged before when X-Copy didn't report an error. So it seems "missed write" messages and actual verify errors in X-Copy are not (directly) connected.

I'm happy to invest some time into this. No problem! Starting today I'm 3 days at home with lots of time for this. I will connect the logic analyser to the Gotek and try to make some sense of the signals. Do you have good link where I can find diagrams for the expected data flow?

@keirf
Copy link
Owner

keirf commented Jan 15, 2018

@bstrobel I would expect to see approx 400ms or so spent on each track during write. 200ms with WGATE asserted and activity on WDATA. Then 200ms with RDATA active as the Amiga reads the track back to verify..

The "missed write messages" seem to occur at the end of your track writes. Perhaps on your system X-Copy is glitching the WGATE line at the end of a track write? A logic analyser trace would determine what is going on there.

The log shows track 54 being written twice. That's weird. It either means X-Copy deliberately made two attempts on that track (unlikely you would get a verify fail on two consecutive writes though, would expect fails to be randomly distributed across tracks). Or perhaps a head-step signal was missed. Worth getting a logic analyser trace for the STEP and SIDE signals too.

You don't have a speaker mod on the STEP line I assume?

I am around tomorrow, doing some other jobs including an A2000 repair, but will have time to advise you if you want to communicate on this ticket or via email (my email is in the Git repo commits). I may also have time to try X-Copy myself and see if I can repro the issue.

@bstrobel
Copy link

It appears I will not have as much time as I thought today. Anyway I'm going to connect the LA and set everything up now until I have to leave for some other stuff. I found the Amiga Service Manual provides all information that is needed. I downloaded it from here http://amiga-manuals.xiik.net/amiga.php.

I use the Gotek as an external floppy drive. The external connector and logic board is from Ebay. Not sure I can trust it fully. I have another external "real" 3.5" floppy drive as well.

I'll play around a bit and see if can find something noticeable. By end of tomorrow I should be able to post a few screenshots from the LA. I will also check the signal quality directly at the Gotek connector with my oscilloscope. Maybe there are some unwanted glitches on the signal lines. I'll keep you posted.

@bstrobel
Copy link

I was able to spend some time on this. I connected the LA and used X-Copy 6.01 to do a DOSCOPY+. It failed after the second sector:
xcopy-la

This is what the LA shows:
x-copy_la1

As you can see WGATE is active (low) for 2 times of about 200msec length with a very short break. I assume this is the first sector. That one is verified okby x-copy. For the second sector WGATE is also active for two 200msec periods but with a break of about 200msec in between. May this be the issue?

Nothing happens after what you can see in the screenshot. I triggered on DS0 because this is what the drive sees (although it is DS2 in fact.)

This is the full capture:
x-copy-LA1.zip

You can download the LA software from the Saleae website and drill into it yourself. But I guess you know this already. :)

@keirf
Copy link
Owner

keirf commented Jan 16, 2018

Thanks I use Saleae myself so this is useful. SIDE signal would have been more interesting to sample than, say, TRACK00. Then we can tell when it is trying to read/write same track back-to-back, vs switching side and doing other track on same cylinder (if you see what I mean).

I will also see about trying this myself. I know last time I tested X-Copy it was a different version (X-Copy (1993)(Cachet).adf). Can you provide me a link to exactly which 1992 version you tested (eg. a link on this site: http://jope.fi/xcopy/masterdisk.html). It is possible that different versions behave differently, also I may have tested DOSCOPY rather than DOSCOPY+, but not sure that should make much difference...

@bstrobel
Copy link

May the INDEX signal from the Gotek be the problem? Isn't supposed be indicating the beginning or end of a revolution of the disc and should be very stable with the same period? There is one missing between the first 2 writes. And measuring its period width I noticed it seems to jittering between exactly 200msec and 192msec. This seems to me a considerable difference if - as I assume - this is the reference point for de-serializing the data.
x-copy-index-jitter

@bstrobel
Copy link

I will do another trace with SIDE signal instead of TRACK00. I thought with TRACK00 we could reference with the STEP and DIRECTION signal which track is actually written. But we are stuck on track 0 already, so it doesn't make sense.

@bstrobel
Copy link

Here is the sample:
x-copy-LA2 - SIDE signal.zip

Regarding the X-Copy version, I use a disk that I got with my Amiga. When I boot from it I get this screen:
xcopy-version

Then I press F6 for version 6.01.

@keirf
Copy link
Owner

keirf commented Jan 17, 2018

Discussed this with @bstrobel offline and it seems there are possibly two issues:

  1. X-Copy issues writes back-to-back for the two sides of each cylinder, then does the verifies back-to-back. i.e., side-0 write, side-1 write, side-0 verify, side-1 verify, step to next cylinder and repeat. The time between the back-to-back writes is approx 3ms so unless you have a reliably very fast stick for 512-byte writes you will generally miss the second write for each cylinder. The only sticks I know are fast enough in my own testing are some Lexar models (Lexar S45/S75).
    The solution sadly is to implement multi-buffered writes. It's a chunk of work. I will raise a separate issue to track it.

  2. We observed the INDEX pulse inexplicably occurs some milliseconds early after the first write to each cylinder. That is, the first INDEX pulse that occurs after FF reverts to read mode (because it missed the start of the second write) is earlier than expected, with no serial log message to warn of this. It is probably not related at all to the ADF write corruptions, but I will try to reproduce this and see if there are bugs in my INDEX timing management. I may raise an issue for this also.
    [EDIT: I realise now that this behaviour is expected, as INDEX gets resynced after a write. Furthermore, to fix Slow read/write speeds ST/HFE on Atari ST (WD1772) #44 I am probably going to even more deliberately break INDEX sync across writes.]

@keirf
Copy link
Owner

keirf commented Jan 28, 2018

FYI I will work on pipelined/buffered writes after I release v0.9.7a next week. With luck it will be in v0.9.8a.

@bstrobel
Copy link

bstrobel commented Jan 28, 2018 via email

@keirf
Copy link
Owner

keirf commented Feb 6, 2018

Please can you try the following patched firmware. Ignore the version number (it is v0.9.7a with patches on top). Pipelines writes. Should fix your issues.

Alternatively you can build tip of current master (this contains the same patches).

ff_32_1.zip

@bstrobel
Copy link

bstrobel commented Feb 6, 2018

Thanks Keirf! I tried your ZIP. Unfortunately it doesn't work. It always fails on the track 0, side 1:
pipeline-0 9 7_verify_error_doscopy

I've also run the LA once with DOSCOPY and once with DOSCOPY+. Both recordings are in this zip file:
pipeline-0.9.7_LA1.zip

@keirf
Copy link
Owner

keirf commented Feb 6, 2018

Oh dear. Can you get a serial log output? I may also add more serial logging for you. Perhaps we can continue this by email (my email is in my Github profile page)?

I should add I was unable to reproduce this back-to-back writes behaviour, but perhaps it is something to do with using multiple/external drives (I tested DF0-to-DF0 only).

The LA output is weird in some respects, eg READY does not follow SEL0. Should be permanently asserted (ie LOW) while the Gotek is selected.... Perhaps READY is actually SIDE? That would make more sense...

A further note: RDATA is inactive now for the second back-to-back write, which indicates pipelining is somewhat working for you. It seems to read both sides of DF1, then write both sides of DF2, then verify side 0, then verify side 1 (and fail), then retry write-verify-write-verify of side 1 before failing. Very unhappy on that side 1.... needs serial logging.

@keirf
Copy link
Owner

keirf commented Feb 7, 2018

Closing this as @bstrobel issue is not related to the new write pipelining patches, and we suspect hardware problems. If further analysis tells us different then we will open a new issue ticket specific to that.

@keirf keirf closed this as completed Feb 7, 2018
@bstrobel
Copy link

bstrobel commented Apr 2, 2018

Just to finally confirm the hardware issue. Maybe this can also help others that use external floppy adapters from kmtech.co.uk which are available on ebay.

This adapter uses a 4 NAND 74HC00 instead of the required open collector version 74LS38. This leads to conflicts between the external floppies on the RDY line which obviously caused this problem.

I swapped the 74HC00 with a SN7403N (another open collector version of the 7400). Now I can copy from my 3.5" Floppy (DF1:, 2nd drive on the Amiga) to my Gotek (DF2:, 3rd drive on the Amiga) without problems.

Thanks again, Keir, for the great work and the patience with me. :)

PS:
I can't recommend these adapters from kmtech.co.uk at all. There were 2 more issues with them. The most striking one was that they had the connector on the ribbon cable attached in the wrong way. This completely short cut the floppy bus making the Amiga absolutely silent when the adapter was attached. Fortunately my Amiga wasn't damaged and the connector could easily be detached and attached correctly.

The other issue was, the input lines of the unused gates and flipflops of the 74HCs are not connected at all. This is just plain bad design and can cause random problems.

@keirf
Copy link
Owner

keirf commented Apr 2, 2018

Ah yes I will bear this in mind for future user problems. In fact I have seen these problems in the KMTech design myself, years back, and informed Kevin Mount but never received any response. So I cannot recommend any of his products.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

4 participants