Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

z80: interruptible execution #10047

Merged
merged 41 commits into from Aug 30, 2023
Merged

z80: interruptible execution #10047

merged 41 commits into from Aug 30, 2023

Conversation

holub
Copy link
Contributor

@holub holub commented Jul 7, 2022

Credits:

  • undocumented flags - Manuel Sainz de Baranda y Goñi, Peter Helcmanovsky, Patrik Rak
  • tables removal - smf, happppp, Lord Nightmare

z80 interruptible implementation is ready for review
Main execution don't overflow scheduler frame more than 3 icounts (atomic for IO read/write)
take_interrupt() hasn't been converted yet (possibly TODO)

Drivers which must be updated before all TIME_GUARD sections completle removed from the code:

  • Amstrad
  • MSX
  • System1

@holub holub marked this pull request as ready for review July 8, 2022 11:55
@holub holub marked this pull request as draft July 8, 2022 13:40
@holub holub marked this pull request as ready for review July 11, 2022 23:58
@holub
Copy link
Contributor Author

holub commented Jul 11, 2022

More confident now... Let's try to make it happen!

@holub
Copy link
Contributor Author

holub commented Jul 16, 2022

drives identified as broken with PR:

  • sigmab98.cpp - all machines

@holub
Copy link
Contributor Author

holub commented Jul 16, 2022

All not working drivers identified during testing fixed!

@simzy39
Copy link
Contributor

simzy39 commented Aug 29, 2022

Can you mark this as Ready for Review?

@holub
Copy link
Contributor Author

holub commented Aug 29, 2022

Can you mark this as Ready for Review?

I think it is. Yet actuated with current master.

@ghost
Copy link

ghost commented Aug 30, 2022

There seem to be 2 lines of thought with this, yet surprisingly nobody leaving comments here.

The first is that the Z80 code is already ugly, relies too much on macros, and trying to fit these changes in a core never designed for it is a waste of time and we should just wait for a complete new core.

The other is that this is the best way to get these changes, as it built on top of a mostly trusted Z80 core, so if things break in the process it'll be easier to identify and fix them, whereas a brand new core would almost certainly come with it's own set of problems at first that might make it difficult to identify the source of issues.

I'm in the latter camp, I think this is worth doing, but maybe the lack of other comments / movement suggests that is a minority opinion.

@holub
Copy link
Contributor Author

holub commented Aug 30, 2022

I agree lack of comments on this PR is discouraging.
IMHO we can wait for new z80 core for another 15 years or take advantage from these changes immidiatly. It's already provides big step forward and I think using it I can reimplement Spectrum memory contention using WAIT state, without playing with icount.
Macroses are ugly but without them the changes look even worse with anonymous functions. And frankly a lot of original macroses removed here.

I open to any suggestions if anybody is interested.

@smf-
Copy link
Member

smf- commented Aug 31, 2022 via email

@holub
Copy link
Contributor Author

holub commented Aug 31, 2022

We should expect performance drop because old code was highly optimized with inlining which is not possible in inerruptible version. Based on my naive "max frame skip" performance test, the drop is ~40% from original.
That is still acceptable for z80 as it's not expected to run with clock faster than 20MHz.

@ghost
Copy link

ghost commented Sep 1, 2022

ultimately this is going to be the decision of @cuavas

will you allow this route to be taken? aside from a performance decrease (which will likely happen even if a new cycle-accurate core is written from scratch) this doesn't inherently seem worse than what we have on a technical level.

personally I'd like it if at the point we get python generated cores, that it be possible to generate 2 different cores, one 'performance' core, where we're confident that cycle accuracy never comes into play, and another where it does, in order to reduce any performance loss for the less demanding cases, but if such a core is written we can deal with it at the time.

as it stands there are a number of cases where improving the Z80 is going to be important and could lead to improvements in the emulation of other components.

@ghost
Copy link

ghost commented Nov 30, 2022

@cuavas since no consensus can be reached on this, how about some kind of compromise, like including it as a secondary core, /z80_interruptible or similar?

that way, it can be used where there seem to be benefits, with no risk to anything else, and can prove its worth in those situations?

@rb6502
Copy link
Contributor

rb6502 commented Nov 30, 2022

I think it's fine to actually just be the Z80 core. I'm not really concerned about perf on any Z80 drivers. Would be nice to get a second opinion from @cuavas or @galibert first though.

@holub
Copy link
Contributor Author

holub commented Aug 12, 2023

Custom tables only left in the system1. Need inspiration or help...

@simzy39
Copy link
Contributor

simzy39 commented Aug 12, 2023

Are you a member of the forums? Can post threads there or chat in the chat box. https://forums.bannister.org/
Great work tho, hopefully @cuavas can give feedback

@holub
Copy link
Contributor Author

holub commented Aug 12, 2023

added as a comment - sharing for discussion here:

system1 hardware change clock for m1. There are two possibilities to get rid of custom tables here (suprloco requires the same).
1. Keep overclocked z80 and use multipliter + adjust icount respectively in m1 (opcode_read)
	pros: more precise timings
	cons: z80 requires extra multiplier state which no other drivers use
2. Use proper z80 clock and only adjust icount in certain m1 reads
	pros: z80 can drop multiplier support
	cons: for couple commands timings going to be behind actual

@holub
Copy link
Contributor Author

holub commented Aug 13, 2023

I can declare a victory now!

@holub
Copy link
Contributor Author

holub commented Aug 17, 2023

'Q flag' for ccf/scf fixed:
image

@holub
Copy link
Contributor Author

holub commented Aug 17, 2023

If there are concerns regarding this PR and we still don't see it happen, I can move tables removal and flags update to master branch.

@holub
Copy link
Contributor Author

holub commented Aug 20, 2023

all tests in Patrik Rak's (raxoft) z80test - passed 100%
+ rest interrupted block instructions uncovered by them:
image

@holub
Copy link
Contributor Author

holub commented Aug 20, 2023

I'm currently reimplementing the 68000 for. among other things, interruptibility. Once it fully works I'm going to add all that's needed for wait states & friends (which shouldn't take that long) and test it throughly on the ST and the Lisa. What about we wait on the z80 until then (eg ~ a month) to see how we can make it fit in what's needed for the wait states?

@galibert your turn

@simzy39
Copy link
Contributor

simzy39 commented Aug 20, 2023

What`s the performance drop now?

@holub
Copy link
Contributor Author

holub commented Aug 21, 2023

What`s the performance drop now?

If I did everything right... ~30%:

» ./mame-master spectrum -bench 300
Average speed: 2768.18% (299 seconds)
------------------------------------------------------------------------------------------------
» ./mame-int spectrum -bench 300
Average speed: 1878.82% (299 seconds)

@galibert galibert merged commit a94254a into mamedev:master Aug 30, 2023
5 checks passed
@rb6502
Copy link
Contributor

rb6502 commented Aug 30, 2023

This cannot be compiled on a Mac with 16 GB of RAM and no other applications open. That's a serious problem.

@rb6502
Copy link
Contributor

rb6502 commented Aug 30, 2023

John IV reports "pacman -bench 90 from 18110 to 5749". That's 30% of the former speed, which is also a serious problem.

@john-iv
Copy link

john-iv commented Aug 30, 2023 via email

@holub
Copy link
Contributor Author

holub commented Aug 30, 2023

Let's revert it then. I don't have enough technical knowledge to fix these performance issues quickly.

@holub
Copy link
Contributor Author

holub commented Aug 31, 2023

@rb6502 @john-iv just to be clear... you were talking about compilation with DEBUG=0 ?

@john-iv
Copy link

john-iv commented Aug 31, 2023

Yep.
make -Werror REGENIE=1 -j21 STRIP_SYMBOLS=1 ARCHOPTS=-fuse-ld=lld
The numbers were on my i7-12700k at 5.2Ghz

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

8 participants