Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

emd/c128-vdc.s is broken #969

Open
mrdudz opened this issue Oct 31, 2019 · 33 comments
Labels
bug

Comments

@mrdudz
Copy link
Contributor

@mrdudz mrdudz commented Oct 31, 2019

relaying from this VICE bugreport: https://sourceforge.net/p/vice-emu/bugs/1157/

apparently the driver does not handle the VDC status flag correctly, causing transfers to fail. (it works in VICE because it doesnt emulate the VDC correctly).

so, someone who has a real C128 might want to fix it :)

@MonteCarlos

This comment has been minimized.

Copy link

@MonteCarlos MonteCarlos commented Nov 10, 2019

I suspect this code:

VDCReadByte:
        ldx     #VDC_DATA
VDCReadReg:
        stx     VDC_ADDR_REG
@L0:    bit     VDC_ADDR_REG
        bpl     @L0
        lda     VDC_DATA_REG
        rts

VDCWriteByte:
        ldx     #VDC_DATA
VDCWriteReg:
        stx     VDC_ADDR_REG
@L0:    bit     VDC_ADDR_REG
        bpl     @L0
        sta     VDC_DATA_REG
        rts

I would change the bpl's by bmi's. If it's not working in this way, it should work in the other way.
The documentation of this reg. is a pity anyway, and it was causing major performance issues with c128 software, because everyone put in wait states on those register writes.
As far as I remember, this is necessary only for memory copying functions, but not for single register reads and writes.

@MonteCarlos

This comment has been minimized.

Copy link

@MonteCarlos MonteCarlos commented Nov 10, 2019

I offer to try this on my c128.

@polluks

This comment has been minimized.

Copy link
Contributor

@polluks polluks commented Nov 10, 2019

@MonteCarlos Please take care of reg 36, the RAM refresh rate

@MonteCarlos

This comment has been minimized.

Copy link

@MonteCarlos MonteCarlos commented Nov 10, 2019

Here is a trial i made using vdc registers without waitstates and it works pretty well on my C128 D:

https://my.pcloud.com/publink/show?code=VZ6J9QkZoLqRBLqOgqkA0pkP1aroSYzn5VDk

@MonteCarlos

This comment has been minimized.

Copy link

@MonteCarlos MonteCarlos commented Nov 10, 2019

Maybe you need to start the file with sys. i don't remember actually.
-> sys 4900

@MonteCarlos

This comment has been minimized.

Copy link

@MonteCarlos MonteCarlos commented Nov 10, 2019

I see that i used "ldx $d600" "bpl ..", too at one place in my code. So probably i have to correct my statement that the "bpl" should be exchanged with "bmi" in the c128 vdc code.

@MonteCarlos

This comment has been minimized.

Copy link

@MonteCarlos MonteCarlos commented Nov 17, 2019

I just ran the tgidemo on my c128D. It loaded and ran perfectly in 80 column mode.
so, what's actually broken?
Can you specify more detailed how to reproduce the problem?

@mrdudz

This comment has been minimized.

Copy link
Contributor Author

@mrdudz mrdudz commented Nov 17, 2019

check the original bug report linked in the first post, please. perhaps the bug is in em-test.c ... or perhaps even in the compiler used? that would be odd though :)

@MonteCarlos

This comment has been minimized.

Copy link

@MonteCarlos MonteCarlos commented Nov 17, 2019

Sorry, i did not get test c128-vdc-emd.prg testprogram running, neither in c64 nor in c128 mode.
In c64 mode i do not see the typical basic trampoline (do i need to enter a sys?) and in c128 mode it loads into the basic memory resulting in a basic mess. I tested with 1541U on my C128D. Maybe that's the reason. Could you provide more info how the testprg should be run and what preconditions needs to be fulfilled?

Is the testprogram the same as em-test.c, which i found at the location
https://github.com/cc65/cc65/tree/master/testcode/lib? I did not found it at https://github.com/cc65/cc65/tree/master/libsrc/c128/emd like written in the given link, btw.

However, i think i found a starting point by accident: Obviously the c128-vdc.s in libsrc/c128/tgi/c128-vdc.s works as i tested the c128 tgi output in 80 column mode. From what was said,
the c128-vdc.s in libsrc/c128/emd/c128-vdc.s is errornous. Therefore i made a diff of both files.

Among several unrelated differences, there are some major differences: All register accesses in the tgi code goes over the subroutines at the end of the source file. This is not the case for the em code. The routine "transferin" at line 190 uses direct register accesses with an suspicious two byte read at once optimization and the routine "vdcsetsrcaddr" at line 328 also uses direct register access omitting waitstate on setting the high byte of the address.
From my experience the waitstates are really needed on buffer operations but not with single register reads writes. Therefore i would guess that the optimization from line 199 to line 203 is incorrect, wheres the vdcsetsrcaddr is likely to work on real hw as it does not use buffer operations.

I suggest to open another issue which asks for removing the different naming conventions used in the two sources and extracts common code. My diff (command line and meld) of the both files is really polluted because of this. F.e., the em code uses "vdcgetreg" whereas the tgi code uses "VDCReadReg".

@mrdudz

This comment has been minimized.

Copy link
Contributor Author

@mrdudz mrdudz commented Nov 17, 2019

https://sourceforge.net/p/vice-emu/code/HEAD/tree/testprogs/memory-expansions/c128-vdc-emd.prg

load and run in 4o columns mode (works for me in VICE ... BUT it should hang!)

i agree with everything else you said... but without a C128, i cant be of much more help :)

@MonteCarlos

This comment has been minimized.

Copy link

@MonteCarlos MonteCarlos commented Nov 17, 2019

Should it pass on real HW or should it fail?

In summary, c128-vdc-emd.prg fails on real hardware because the test program does not fully respect the VDC busy flag. note: The source for the test program is available at https://github.com/cc65/cc65/tree/master/libsrc/c128/emd (link kindly provided by Groepaz)

Now you say, it fails on VICE (hangs) ??? Also, if the test program gets sth wrong, why searching the bug in the library?

Please make the point. I get the impression that the problem is not clearly stated, as obviously the test program is not the source of the evil.
The first thing to find a bug is to clearly state the problem.
Sorry for the nitpicking. I don't want to be mean, i'm just a little bit confused...

@mrdudz

This comment has been minimized.

Copy link
Contributor Author

@mrdudz mrdudz commented Nov 17, 2019

Should it pass on real HW or should it fail?

it should hang - see the original bug report (and the first post :))

Now you say, it fails on VICE (hangs)

no, it works in VICE. but it should hang instead (this is a bug in VICE!) :)

so (if we also look at VICE), there are two problems:

  • the test program (supposedly) hangs on a real C128
  • the test program does NOT hang in VICE. this is a VICE bug (the status bit is not emulated correctly, and writes to registers take no time i think)

ie you should be able to load the above linked .prg on a real C128, and when you run it the C128 should hang. and since this is a cc65 compiled program it should not matter if you run in 80col or 40col mode (however since it uses the vdc memory, 80col will not turn out well =P).

i only tested in 40 col mode in VICE to confirm the .prg is actually working fine as far as it comes to loading and running it - which you apparently had problems with. perhaps if you use 1541u, copy the prg to a d64 and mount it (do not run via "dma load")

@MonteCarlos

This comment has been minimized.

Copy link

@MonteCarlos MonteCarlos commented Nov 18, 2019

i'll try the test prg again using a d64.
let's clarify:

  1. the test is expected to hang on real hw
  2. the test should run under vice but it should hang as on real hw
  3. the test actually does not hang under vice due to incomplete emulation of the real hw

did i now understand correctly?

@greg-king5 greg-king5 changed the title c128-vdc.s is broken emd/c128-vdc.s is broken Nov 18, 2019
@greg-king5 greg-king5 added the bug label Nov 18, 2019
@MonteCarlos

This comment has been minimized.

Copy link

@MonteCarlos MonteCarlos commented Nov 18, 2019

Funny enough the test ends with the ready promt on the real hw. however it outputs "data mismatch in page $0001 at $2edd. Data is $0000 (should be $0001)". The filling counter has the value $3f when the program ends. So according to the total page count of $40, all pages have been tested.

@mrdudz

This comment has been minimized.

Copy link
Contributor Author

@mrdudz mrdudz commented Nov 18, 2019

did i now understand correctly?

yes! :)

as for the result on real hw... i am willing to accept that on real hw the buggy program doesnt actually always produce the same result (ie with another VDC you might get different results). the z64k author for example uses NTSC, which could also change the timing requirements.

@oliverschmidt

This comment has been minimized.

Copy link
Collaborator

@oliverschmidt oliverschmidt commented Nov 18, 2019

Just for the record: Quite some time ago, I wrote in
https://github.com/cc65/wiki/wiki/Reminders#C128_library:

Test EM drivers with the multidemo sample; and, fix the broken ones.

@willymanilly

This comment has been minimized.

Copy link

@willymanilly willymanilly commented Nov 22, 2019

Just to clarify, the c128-vdc-emd.prg test program fails on real NTSC and PAL hardware. see http://c-128.freeforums.net/thread/610/c128-vdc-vice-test-repository with results from multiple machines. I confirm it always fails on my real PAL C128D/64kb VDC with varying results. It returns to command prompt and outputs message similar to "data mismatch in page $0001 at $2edd. Data is $0000 (should be $0001)" as observed MonteCarlos. The page count detected is either $0040 or $0100.

The bug in the test program is most likely caused by a read or a write to a VDC register while the VDC is still busy from a previous write to a VDC register. Any write to a VDC register with trigger the busy flag and won't be released until the VDC register operation is complete. VDC GFX fetch and DMA refresh cycles take precedence and will delay the VDC register updating. Some of the observations of the VDC busy flag behavior are documented here.

In summary the VDC busy flag should be always tested if you are unsure exactly when the busy flag will be released, including for operations that are not buffer related. :)

@MonteCarlos

This comment has been minimized.

Copy link

@MonteCarlos MonteCarlos commented Nov 23, 2019

There is an ambiguity in the test. We don't know if it fails in line 135 or 193.
Would it be possible to split the program into more small tests?
I will also run the test several times to get some statistics.

@MonteCarlos

This comment has been minimized.

Copy link

@MonteCarlos MonteCarlos commented Nov 25, 2019

I called the testprg 15 times, one after each other; and, every 5th time, I reset the machine via the side button.
This is my result:

page at act exp page count
0001 2edd 0 1 256
0001 2edd 257 1 ~
0000 2edd 64 0 ~
0000 2edd ~ ~ ~
0000 2edd ~ ~ ~
;----
0001 2edd 257 1 64
0001 2f0b ~ 1 ~
0001 2edd ~ 1 256
0000 2edd 64 0 ~
0000 2edd $4040 0 ~
;----
0001 2edd 257 1 64
0001 2edd 0 1 256
0001 2edd 257 1 ~
0001 2edd ~ 1 ~
0001 2edd ~ 1 ~

On my opinion, the results appear very systematic; whereas, considering a status flag bug,
one would expect a more random result.
Besides the status flag, there may be another more systematic source of error; I suspect,
F.e., counters.

@mrdudz

This comment has been minimized.

Copy link
Contributor Author

@mrdudz mrdudz commented Nov 25, 2019

but does waiting for the status flag fix it?

@willymanilly

This comment has been minimized.

Copy link

@willymanilly willymanilly commented Nov 25, 2019

When I disable the status flag emulation in Z64K the test passes without error. When the status flag emulation is enabled the test fails exactly like real hardware. I applied a hack to Z64K to see if I can find the register being written to/read from while the VDC status flag is busy and I discovered 2 culprits. Writing to register 18 and reading from register 31. Reading of register 31 before the data is valid would be the cause of the Data mismatch. Waiting for the status flag should fix it.

@greg-king5

This comment has been minimized.

Copy link
Contributor

@greg-king5 greg-king5 commented Nov 26, 2019

Here are some versions of the test program. They were built with different versions of the Extended Memory Driver. Those versions wait for permission to access register 18, register 31, or both of them. ("vdc-em-contents.txt" tells you which one does which waiting.)

Which ones work or don't work on real hardware?
vdc-tests.zip

@willymanilly

This comment has been minimized.

Copy link

@willymanilly willymanilly commented Nov 26, 2019

Cool. I just did a quick test using Z64K. The tests "both-em-test" and "r31-em-test" pass as expected. The rest of the tests have data mismatches as expected. I will test on real hardware tonight.

@willymanilly

This comment has been minimized.

Copy link

@willymanilly willymanilly commented Nov 27, 2019

I just tested on real c128D with 64kb VDC. A fail is when a Data mismatch occurs. I ran the tests 20 times each

  • R31-EM-TEST is the only one that passes but occasionally fails when the page count detected is $0040. It always passes when the correct page count of $0100 is detected.

  • BOTH-EM-TEST and R18-EM-TEST always fails with incorrect page count of $0040.

  • EM-TEST and C128-VDC-EMD always fails. Page count detect is either $0040 or $0100.

I will have a look at the source code and see if I can see anything that could make the program unstable when run on real hardware.

@willymanilly

This comment has been minimized.

Copy link

@willymanilly willymanilly commented Nov 27, 2019

There must be something wrong with the VDC ram size detection code. I modified the tests slightly so it always has a page count of $0100. BOTH-EM-TEST and R31-EM-TEST always passes on my real C128D VDC 64kb with the modification. R18-EM-TEST always fails on the same machine with the modification - Data Mismatch in page $0001 at $2e82 Data is $0000 (should be $0001)

; cmp ptr2+1
beq @have64k
bne @have64k

@MonteCarlos

This comment has been minimized.

Copy link

@MonteCarlos MonteCarlos commented Nov 27, 2019

also tested on realhw :
both test: fails with pc of $40 and dta mism. @2d07,pg 0: ff3f instead of ffff on second test.

r18: fails on first test, also with pc of $40. dta mismatch @ 2e82, pg1, 0 vs 1

r31: fails on second test with pc $40, mismatch @2d02, pg0: ff3f vs ffff

em_Test: fails with pc of $100 on first test. mism @2e83, pg1. 0101 vs 0001

@MonteCarlos

This comment has been minimized.

Copy link

@MonteCarlos MonteCarlos commented Nov 27, 2019

it seems that the additional waitstates resulted in modified mismatch locations and values but did not resolve the issue. i will have a look into the code and test further, tomorrow. it seems odd to me that fails occur seldomly although the suspicious code is executed so many times.

@willymanilly

This comment has been minimized.

Copy link

@willymanilly willymanilly commented Nov 30, 2019

Apparently, the order of writing to the update location VDC registers (18 and 19) matters. I've never seen mention of this in the C128 Programmer's Reference Guide or Mapping the C128. I replaced the vdcsetsrcaddr function with the following code, and tested on real hardware, with positive results. This offers some explanation why adding the wait for register 18 was causing the page detection not to work. :)

vdcsetsrcaddr:
	pha
	tya
        ldx     #VDC_DATA_HI
 	; ldx     #VDC_DATA_LO
        stx     VDC_ADDR_REG
@L0:    bit     VDC_ADDR_REG
        bpl     @L0
        sta     VDC_DATA_REG
	inx
        ; dex
        ; tya
	pla
        stx     VDC_ADDR_REG
@L1:    bit     VDC_ADDR_REG            ; XXX: Test waiting for register 18
        bpl     @L1
        sta     VDC_DATA_REG
        rts

Using the above code modification, BOTH-EM-TEST and R31-EM-TEST detected correct memory size, and the tests pass with no errors. In R18-EM-TEST and EM-TEST, the page detection is correct; but, the test always fails with data mismatch. That is the behavior I expected on real hardware.

On another note, the RAM detection will most likely fail on a 16kb-only VDC because of how its memory layout is. Referring to the table here, locations $0200 and $4200 are different locations for a 16K machine in 64K mode, so will incorrectly detect a 64KB VDC. I would suggest testing locations $0200 and $8200 instead.

@greg-king5

This comment has been minimized.

Copy link
Contributor

@greg-king5 greg-king5 commented Nov 30, 2019

Using the above code modification, BOTH-EM-TEST and R31-EM-TEST detected correct memory size, and the tests pass with no errors.

Did your modified R31-EM-TEST not wait for register 19?

On another note, the RAM detection will most likely fail on a 16KB-only VDC because of how its memory layout is. Referring to the table here, locations $0200 and $4200 are different locations for a 16K machine in 64K mode, so will incorrectly detect a 64KB VDC. I would suggest testing locations $0200 and $8200 instead.

It means also that the RAM-type bit should be restored only if 16 KB is detected.

@greg-king5

This comment has been minimized.

Copy link
Contributor

@greg-king5 greg-king5 commented Nov 30, 2019

vdcsetsrcaddr:
	pha
	tya
        ldx     #VDC_DATA_HI
 	; ldx     #VDC_DATA_LO
        stx     VDC_ADDR_REG
@L0:    bit     VDC_ADDR_REG
        bpl     @L0
        sta     VDC_DATA_REG
	inx
        ; dex
        ; tya
	pla
        stx     VDC_ADDR_REG
@L1:    bit     VDC_ADDR_REG            ; XXX: Test waiting for register 18
        bpl     @L1
        sta     VDC_DATA_REG
        rts

I'm amazed that so many of us have looked at that subroutine, over so many years, and not noticed that it's inefficient! The .Y index register doesn't need to be transferred to the accumulator. It can be stored directly to the VDC.

@willymanilly

This comment has been minimized.

Copy link

@willymanilly willymanilly commented Nov 30, 2019

No, I just cut and paste the modified code to the different tests so neglected to not wait for register 19 in R31-EM-TEST. I just updated that test and ran on real hardware a couple of times. It still passes OK without needing to wait.

@willymanilly

This comment has been minimized.

Copy link

@willymanilly willymanilly commented Dec 1, 2019

Lol. I should have noticed that inefficiency when I modified the code! It seems so obvious now. :)

@MonteCarlos

This comment has been minimized.

Copy link

@MonteCarlos MonteCarlos commented Dec 1, 2019

I can integrate those new insights into my pull request if they are not already integrated.
#987

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
6 participants
You can’t perform that action at this time.