Initial implementation of BOSS ZSort ucode (WDC, Stunt Racer) #1685

Closed
Gillou68310 opened this issue Dec 14, 2017 · 76 comments

10 participants
@Gillou68310
Contributor

commented Dec 14, 2017

@gonetz, @olivieryuyu

I rebased my work against master:
https://github.com/Gillou68310/GLideN64/commits/wdc

The game will only run fullspeed if you disable all LOG_WARNING.

The game won't work until mupen64plus/mupen64plus-rsp-hle#62 is merged.

@loganmc10

Contributor

commented Dec 14, 2017

You should check if REG.SP_STATUS is valid before trying to modify it. If the emulator doesn't support that (Project64 or older versions of mupen64plus), it's going to crash.

@loganmc10

Contributor

commented Dec 14, 2017

For instance with Gauntlet Legends, I only enabled the infloop hack if SP_STATUS wasn't nullptr:

https://github.com/Gillou68310/GLideN64/blob/02bf5444df0d37e59cdc12c4d0032444bf79299c/src/RSP.cpp#L337-L338
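A rough sketch of the kind of guard being discussed, for readers who don't want to follow the link. This is not the GLideN64 code: `RegisterBlock`, `setStatusBits` and `kHaltBit` are made-up names for this example, and the bit value is only illustrative. The only thing taken from the thread is the idea of testing the pointer for null before writing through it.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the plugin's register table: the emulator core
// fills in the pointer, or leaves it null if it does not expose the register.
struct RegisterBlock {
    std::uint32_t* SP_STATUS = nullptr;
};

// Illustrative bit value (an assumption made for this sketch).
constexpr std::uint32_t kHaltBit = 0x1;

// Sets bits in SP_STATUS only when the core actually provided the register.
// Returns false, and touches nothing, when the pointer is null.
bool setStatusBits(RegisterBlock& reg, std::uint32_t bits) {
    assert(bits != 0 && "caller should request at least one bit");
    if (reg.SP_STATUS == nullptr)
        return false;
    *reg.SP_STATUS |= bits;
    return true;
}
```

With an emulator that leaves the pointer null, the call simply reports failure instead of dereferencing a null pointer.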

@Gillou68310

Contributor Author

commented Dec 14, 2017

You're right. BTW, why not check directly whether SP_STATUS is null instead of using this hack flag?

@gonetz

Owner

commented Dec 14, 2017

@Gillou68310 Are you going to fully implement this ucode?

@Gillou68310

Contributor Author

commented Dec 14, 2017

I don't have much time to work on it these days, that's why I'm sharing it in its current state ;-)

@Gillou68310

Contributor Author

commented Dec 14, 2017

Last time I worked on it was October 2

@loganmc10

Contributor

commented Dec 14, 2017

> You're right. BTW why not checking direcly if SP_STATUS is null instead of using this hack flag?

@Gillou68310 I did it because I wasn't sure if the check address == (RSP.PC[RSP.PCi] - 8) was true in any other games. Basically I wanted to make sure that it only applied to Gauntlet Legends, since that check will be executed for other games. I didn't do any testing to see if it affected any other games, I assume not.

BTW, your commit (Gillou68310@942756d) doesn't seem to catch every instance: SP_STATUS is also used in RSP_ProcessDList(), ZSortBOSS_EndMainDL(), ZSortBOSS_WaitSignal() and ZSortBOSS_Obj().

@olivieryuyu


commented Dec 14, 2017

I will try to continue on this ucode after the current HLE project...

@Gillou68310

Contributor Author

commented Dec 14, 2017

> @Gillou68310 I did it because I wasn't sure if the check address == (RSP.PC[RSP.PCi] - 8) was true in any other games. Basically I wanted to make sure that it only applied to Gauntlet Legends, since that check will be executed for other games. I didn't do any testing to see if it affected any other games, I assume not.

Ok I will keep it then.

> BTW, in your commit (Gillou68310/GLideN64@942756d) it doesn't seem to catch every instance, there is usage of SP_STATUS in RSP_ProcessDList(), ZSortBOSS_EndMainDL(), ZSortBOSS_WaitSignal() and ZSortBOSS_Obj() as well

Damn I need to stop doing things in a hurry :-)

@Gillou68310

Contributor Author

commented Dec 14, 2017

> @Gillou68310 I did it because I wasn't sure if the check address == (RSP.PC[RSP.PCi] - 8) was true in any other games. Basically I wanted to make sure that it only applied to Gauntlet Legends, since that check will be executed for other games. I didn't do any testing to see if it affected any other games, I assume not.

Thinking about this again: if address == (RSP.PC[RSP.PCi] - 8) is true, the game will be locked up anyway, so it won't hurt if we always do the test.
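For anyone skimming, the test in question can be sketched like this. It is a simplified illustration, not the plugin's code: `DisplayListState` and `branchesToItself` are hypothetical names, the stack depth is arbitrary, and the reading of the "- 8" (the PC has already moved past the 8-byte command when the branch is evaluated) is an assumption drawn from the expression itself.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical, simplified display-list state: a small stack of program
// counters plus the index of the active entry (mirrors RSP.PC / RSP.PCi).
struct DisplayListState {
    std::array<std::uint32_t, 18> PC{};  // depth chosen arbitrarily for the sketch
    std::size_t PCi = 0;
};

// Display-list commands are 8 bytes long. Assuming the PC has already
// advanced past the branch command, a target equal to PC - 8 means the
// command jumps back onto itself, i.e. an endless loop.
bool branchesToItself(const DisplayListState& rsp, std::uint32_t address) {
    assert(rsp.PCi < rsp.PC.size());
    return address == rsp.PC[rsp.PCi] - 8;
}
```

A display list that only ever hits this case cannot make progress on its own, which is why running the test unconditionally should be harmless for games that never trigger it.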

@loganmc10

Contributor

commented Dec 14, 2017

> Thinking about this twice if address == (RSP.PC[RSP.PCi] - 8) is true the game will be locked anyway, so it won't harm if we always do the test.

Yes, you are probably right. At the time I added the hack because I didn't have the time/motivation to test and make sure it was needed. It's probably safe to change it like you've done.

@Gillou68310

Contributor Author

commented Dec 15, 2017

OK, I updated the last commit: if the ucode type is ZSortBOSS and the pointer to the SP_STATUS register is null, an assertion failure is triggered. This avoids checking the pointer every time we access it.
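The "check once, then trust the pointer" idea looks roughly like this. The enum, the function names and the place where the check is called are all assumptions made for the sketch; the actual commit may be structured differently.

```cpp
#include <cassert>
#include <cstdint>

enum class UcodeType { Other, ZSortBOSS };  // hypothetical enum for the sketch

// True when the selected microcode can run with the registers the core
// provided. ZSortBOSS writes SP_STATUS, so it needs a non-null pointer.
bool ucodeRequirementsMet(UcodeType type, const std::uint32_t* spStatus) {
    return type != UcodeType::ZSortBOSS || spStatus != nullptr;
}

// Meant to be called once, when the microcode is selected. After it passes,
// the ZSortBOSS command handlers can use the pointer without re-checking it.
void onUcodeSelected(UcodeType type, const std::uint32_t* spStatus) {
    assert(ucodeRequirementsMet(type, spStatus) &&
           "ZSortBOSS requires a valid SP_STATUS pointer");
}
```

Note that a plain `assert` disappears in release builds compiled with `NDEBUG`, so this trades a per-access check for a debug-time guarantee.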

@loganmc10

Contributor

commented Dec 16, 2017

Just for reference/interest, this is what it looks like in mupen64plus:

[Screenshots: world_driver_champ-000, world_driver_champ-001]

Not bad, seems like a good amount of progress!

@loganmc10

Contributor

commented Dec 16, 2017

@Gillou68310 I'm curious why there is no sound. I was under the impression that these BOSS games used the CPU to work on the audio, not the RSP, and it doesn't seem like an audio task is ever called for the RSP, yet there is no audio.

@LegendOfDragoon


commented Dec 16, 2017

The graphics microcode affects audio

@Gillou68310

Contributor Author

commented Dec 16, 2017

Hum good question. I'm using the dummy audio plugin when working on the emu so I didn't even realize audio was missing. I will investigate!

@Gillou68310

Contributor Author

commented Dec 19, 2017

Wow, this is one hell of an optimized ucode!!! The ucode does both audio and graphics processing, and everything fits in fewer than 1024 instructions, so no overlay is needed.
@loganmc10 this explains why there is no sound: the audio commands are not implemented yet!
@gonetz This is going to be weird, having audio commands inside a video plugin.

@LegendOfDragoon


commented Dec 19, 2017

> The ucode is doing both audio and graphic processing

I figured as much since I remember an RSP regression made the audio sound terrible. Do you know which part of the ucode handles audio?

@gonetz

Owner

commented Dec 20, 2017

> This is going to be weird, having audio commands inside video plugin.

I have no idea how to resolve it. Add a special audio library?

@Gillou68310

Contributor Author

commented Dec 20, 2017

> I figured as much since I remember an RSP regression made the audio sound terrible. Do you know which part of the ucode handles audio?

@LegendOfDragoon here's a log extract of commands execution:

ZSortBOSS_MoveWord (Write 0x00000c20 to DMEM: 0x0010)
ZSortBOSS_ClearBuffer (Write 0x0 to DMEM: 0x0c20 -> 0x0e20)
ZSortBOSS_1CF0 (0x220cdfc8, 0x803f4440)
ZSortBOSS_MoveWord (Write 0x32e22f6e to DMEM: 0x0904)
ZSortBOSS_MoveWord (Write 0x00000000 to DMEM: 0x0000)
ZSortBOSS_1D20 unimplemented (0x2400003f, 0x003e0798)
ZSortBOSS_1BCC unimplemented (0x260000ec, 0x803f4440)
ZSortBOSS_1BF0 unimplemented (0x1c00d000, 0x80000000)
ZSortBOSS_1CF0 (0x220cf100, 0x803f4448)
ZSortBOSS_MoveWord (Write 0x065a05eb to DMEM: 0x0904)
ZSortBOSS_MoveWord (Write 0x00000000 to DMEM: 0x0000)
ZSortBOSS_1D20 unimplemented (0x24000012, 0x003e0f30)
ZSortBOSS_1BF0 unimplemented (0x1c003333, 0x80000000)
ZSortBOSS_MoveMem (R/W: 1, RDRAM: 0x003f129c, DMEM: 0x0c20; len: 512)

The audio buffer in DMEM is cleared by the ZSortBOSS_ClearBuffer command.
The audio buffer is sent from DMEM to RDRAM by the ZSortBOSS_MoveMem command.
So I would say that commands 1CF0, 1D20, 1BCC and 1BF0 are audio commands.
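To make the two bracketing operations concrete, here is a minimal sketch of what the log lines describe: zero-fill a DMEM range, then copy that range out to RDRAM. The function names are hypothetical, memory is modelled as plain byte vectors (ignoring any byte-swapping an emulator does), and treating 0x0e20 as an exclusive end is an assumption based on 0x0e20 - 0x0c20 being exactly the 512 bytes that MoveMem transfers.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::size_t kDmemSize = 0x1000;  // RSP data memory is 4 KiB

// Zero-fills DMEM[begin, end), like the "0x0c20 -> 0x0e20" range in the log.
void clearBuffer(std::vector<std::uint8_t>& dmem, std::size_t begin, std::size_t end) {
    assert(begin <= end && end <= dmem.size());
    std::fill(dmem.begin() + static_cast<std::ptrdiff_t>(begin),
              dmem.begin() + static_cast<std::ptrdiff_t>(end), std::uint8_t{0});
}

// Copies len bytes from DMEM to RDRAM, the direction described for the
// audio buffer in the log ("DMEM: 0x0c20; len: 512").
void dmemToRdram(const std::vector<std::uint8_t>& dmem, std::size_t dmemAddr,
                 std::vector<std::uint8_t>& rdram, std::size_t rdramAddr,
                 std::size_t len) {
    assert(dmemAddr + len <= dmem.size());
    assert(rdramAddr + len <= rdram.size());
    std::copy_n(dmem.begin() + static_cast<std::ptrdiff_t>(dmemAddr), len,
                rdram.begin() + static_cast<std::ptrdiff_t>(rdramAddr));
}
```

Whatever runs between those two steps and writes into the 0x0c20 range is then a good candidate for audio mixing, which is the reasoning used to single out 1CF0, 1D20, 1BCC and 1BF0.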

Here's the disassembly if you want to take a look at the commands. I added a few comments to help with understanding the code:
WDC.txt

@gonetz Don't worry we won't need any special audio library, you can take a look at typical audio commands in mupen64plus-rsp-hle. It's just a bit confusing to have to implement them in the video-plugin.

@LegendOfDragoon


commented Dec 20, 2017

@Gillou68310 Thanks for the info. I really should have written more things down instead of just relying on my own memory.

Iirc, there was some code that never gets executed unless you significantly increase the number of times it yields. I also remember there being a lot of interesting ways to optimize the code for this ucode. Like for instance, $v0 always = 0, and they did interesting stuff with that.

@gonetz The ucode is relatively lightweight and can be done on CPU without any need for any libraries.

@gonetz

Owner

commented Dec 21, 2017

> Don't worry we won't need any special audio library, you can take a look at typical audio commands in mupen64plus-rsp-hle. It's just a bit confusing to have to implement them in the video-plugin.

If this code does not leave ZSortBOSS.cpp, it's OK.

@Gillou68310

Contributor Author

commented Dec 22, 2017

More progress on this:
The ZSortBOSS_TransposeMTX and ZSortBOSS_TransformLights commands are now implemented. The ZSortBOSS_1444 (lighting) command is still WIP, but reflection seems to work.
I thought that the cars' wrong color was caused by lighting, but the number of lights is equal to zero, so I still need to investigate.

@Gillou68310

Contributor Author

commented Jan 4, 2018

Progress report:
-Fixed the cars' wrong colors (the ZSortBOSS_SetOtherMode_L and ZSortBOSS_SetOtherMode_H commands were not correctly implemented)
-Fixed the cars' reflection (the reciprocal square root needs to be saturated when normalizing light vectors)
-Added a ZSortBOSS_Obj command that reflects what the assembly code actually does (I'm not relying on the original ZSort_Obj command anymore)

Remaining issues:
-No audio (audio commands are not implemented yet)
-Clipping issue (probably related to the ZSortBOSS_MultMPMTX command, I need to investigate)

Gillou68310@59bf5be
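As an illustration of the reflection fix, here is what "saturating the reciprocal square root" means when normalizing a vector. This is a floating-point sketch only: the real ucode works in RSP fixed-point arithmetic, and the function names and the clamp limit passed in are assumptions, not values from the commit.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cmath>

// Returns 1/sqrt(x), clamped to maxValue. For x <= 0, where the exact result
// would be infinite or undefined, the result saturates to maxValue.
float saturatedRsqrt(float x, float maxValue) {
    assert(maxValue > 0.0f);
    if (x <= 0.0f)
        return maxValue;
    return std::min(1.0f / std::sqrt(x), maxValue);
}

// Normalizes v using the saturated reciprocal square root, so very short
// vectors are scaled by at most maxScale instead of blowing up.
std::array<float, 3> normalizeSaturated(const std::array<float, 3>& v, float maxScale) {
    const float lenSq = v[0] * v[0] + v[1] * v[1] + v[2] * v[2];
    const float scale = saturatedRsqrt(lenSq, maxScale);
    return {v[0] * scale, v[1] * scale, v[2] * scale};
}
```

Without the clamp, a near-zero light vector produces a huge scale factor, which is the kind of input that can wreck a reflection lookup.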

@gonetz

Owner

commented Jan 5, 2018

A lot of work has been done, great!

However, how do I run it? I guess it will not work with Zilmar-spec emus.
I tried to run it with M64PY, but it seems to be too old and does not have the necessary changes in the core and RSP.
I tried to run it with the latest m64p, but it always runs WDC in LLE mode, despite RSP-HLE being set as the RSP plugin. The same problem occurs with Indi: m64p runs it in LLE mode only.

@theboy181


commented Jan 5, 2018

I have a zilmar-spec RSP plugin that may work. Can you send a build so I can test it out?

@gonetz

Owner

commented Jan 5, 2018

Sure. Here is a WDC WIP build for Windows, zilmar and mupen64plus:
https://drive.google.com/file/d/19veV2yqkOppCSH0N2gyhxGsqy34pX6oz/view?usp=sharing

@theboy181


commented Jan 5, 2018

Aio has built a zilmar RSP with HLE support for Gauntlet and I figured it might work. :( Nope.

@AmbientMalice

Contributor

commented Jan 5, 2018

I seemingly got it working on m64p by replacing the mupen64plus-rsp-cxd4-sse2.dll file with a duplicate of mupen64plus-rsp-hle.dll which overrides the override. Graphical distortions. No audio.

@gonetz

Owner

commented Jan 5, 2018

> I seemingly got it working on m64p by replacing the mupen64plus-rsp-cxd4-sse2.dll file with a duplicate of mupen64plus-rsp-hle.dll which overrides the override.

Yes, it works, thanks. I did not hit upon this myself.

> Graphical distortions. No audio.

As promised.

The overall result is very good, especially for such a complex microcode. I remember how I implemented the zsort ucode: I had documentation and two test ROMs, but I spent a lot of time getting it working.

@Gillou68310

Contributor Author

commented Feb 5, 2018

Nice thanks for testing :-)
It works well on Windows for me too. However, I tested on my Galaxy Note 3 and the performance was not good there, plus the textures were all messed up. Not sure what the issue is.

@fzurita

Contributor

commented Feb 5, 2018

My best guess on the bad textures is probably driver issues.

@Gillou68310

Contributor Author

commented Feb 6, 2018

I tested on my odroid c2 and I have the same texture problem, maybe it's a gles2 specific issue. Anyway the game was running fullspeed on my odroid c2 so I don't think performance is an issue here.

@olivieryuyu


commented Feb 8, 2018

Seriously, why was the ZSort ucode not used in more games? It obviously mitigates the fillrate issue of the N64 and allows far better graphics as a consequence.

Just compare Mia Soccer with FIFA98 on a real N64: the difference is so huge, it is shocking!

@AmbientMalice

Contributor

commented Feb 8, 2018

Was released too late in the system's life, I think. And was also a bit obtuse.

Feb 1999.

Q4 What is the current status of the Z Sort microcode?

A4 Its release has been announced. It is now a matter of distributing the Z Sort microcode, but the Z Sort microcode libraries have not yet been completed. Consequently, the microcode still hasn't been officially released.

There is no official release schedule for the Z Sort microcode. There are various reasons for this, including the following.

  • Since it is not compatible with F3D GBI, it is not easy to incorporate into games
  • Preprocessing by the CPU is necessary to run the microcode
  • A minimum RSP and CPU scheduling sample (including sound processing) using a game has not been created
  • There are no tools for output with Z Sort structures

The current latest version 0.34 is available as a beta evaluation version at NTSC-ONLINE.

Note: While you can expect that using this microcode and library will be effective for games which currently cause bottlenecks in the RDP, please note that it will have no effect on those with bottlenecks in the CPU and/or RSP. In addition, even if the RDP processing is affected, this cannot be guaranteed to improve the overall performance of the game application. Please, keep these things in mind when using and evaluating this microcode.

Since the current version 0.34 is a beta version, please be forewarned that the specifications may vastly differ in the future. Now, while the majority of this is outside the scope of support, it would be extremely helpful if you would contact NTSC regarding bugs, etc.

http://n64devkit.square7.ch/qa/graphics/ucode.htm

@Gillou68310

Contributor Author

commented Feb 9, 2018

I finally implemented the audio commands!
Since I'm not familiar with audio processing I can't really put a name on those commands.
Anyway at least we have audio now!

Gillou68310@50f74e2

@fzurita

Contributor

commented Feb 9, 2018

That's pretty amazing work.

@loganmc10

Contributor

commented Feb 9, 2018

@Gillou68310 looks and sounds great! Definitely ready to be merged as far as I can tell (well you have a tiny merge conflict in the wdc branch). Great work! The CPU usage is relatively low on my PC, so I assume that a device with GL support like the Shield would be able to run the game really well. It sounds like there might be some GLES issues but that is unrelated to the ucode.

@Gillou68310

Contributor Author

commented Feb 9, 2018

@fzurita, @loganmc10 thanks for the feedback!

Game is running fullspeed on my odroid c2, it should definitely run fullspeed on a Shield too.

@fzurita

Contributor

commented Feb 9, 2018

I have yet to try it, but I'll give this a shot to see how it goes. I can probably debug the GLES issues if I get some glErrors.

@olivieryuyu


commented Feb 9, 2018

amazing! great work! :)

@fzurita

Contributor

commented Feb 9, 2018

Well, I can say it works on Android with GLES, but there are a lot of texture glitches and the game runs sluggishly. Also, the sound seems to skip a lot, and there may be some near-plane issues.

These are probably all GLES issues.

@fzurita

Contributor

commented Feb 9, 2018

Also, loading save states seems to lock up the game.

@fzurita

Contributor

commented Feb 10, 2018

Full GL mode doesn't look much better on the Shield TV.

Edit: I forgot to bring this in. Maybe it will help with a few issues I saw:
mupen64plus/mupen64plus-core#523

Edit 2: The core pull request didn't help it seems.

@loganmc10

Contributor

commented Feb 10, 2018

Could the arm dynarec be screwing something up? Did you guys try the interpreter? I'm surprised it doesn't work on the Shield

@Gillou68310

Contributor Author

commented Feb 10, 2018

> Also, loading save states seem to lock up the game.

Savestates should not be taken while an RSP task is locked.

Please apply this patch first: Gillou68310/mupen64plus-core@d9129e1. Otherwise you could end up with an unusable savestate.
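One way to enforce that rule is to defer a save request until the task is unlocked. The sketch below shows that idea only; it is not a description of what the linked patch does, and every name in it (`SaveStateGate`, `lockRspTask`, and so on) is invented for illustration.

```cpp
#include <cassert>

// Hypothetical gate that postpones savestates while an RSP task is locked.
class SaveStateGate {
public:
    void lockRspTask() { rspTaskLocked_ = true; }

    // Unlocking performs a save that was requested while the task was locked.
    void unlockRspTask() {
        assert(rspTaskLocked_ && "unlock without a matching lock");
        rspTaskLocked_ = false;
        if (savePending_) {
            savePending_ = false;
            ++savesTaken_;
        }
    }

    // Saves immediately when it is safe, otherwise remembers the request.
    void requestSave() {
        if (rspTaskLocked_)
            savePending_ = true;
        else
            ++savesTaken_;
    }

    int savesTaken() const { return savesTaken_; }
    bool savePending() const { return savePending_; }

private:
    bool rspTaskLocked_ = false;
    bool savePending_ = false;
    int savesTaken_ = 0;  // counter stands in for actually writing a state file
};
```

The point of the deferral is that the saved state never captures a half-finished, locked task that could not be resumed after loading.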

> Could the arm dynarec be screwing something up?

Looks like it is, interpreter has no texture glitches!

@gonetz

Owner

commented Feb 10, 2018

Amazing work, Gilles!

Yes, it can be merged with master. Things to do:

  • Put the ZSortBOSS sources in the proper place in the Visual Studio project: Header files/uCodes for ZSortBOSS.h,
    Source files/uCodes for ZSortBOSS.cpp
  • Rebase to current master, fix conflicts
  • Squash all commits into one commit
  • Make a PR :)

@gonetz

Owner

commented Feb 10, 2018

Note: I see several "ZSortBOSS_Audio4: Index out of bound" messages in gliden64.log

@fzurita

Contributor

commented Feb 11, 2018

So this turns out to run a lot faster if you disable the LLE checks in rsp-hle. Whoops, I wasn't even testing the HLE portion. After fixing that, speed is excellent. Texture issues remain though. Unfortunately, speed using interpreter with HLE is as slow as dynarec with LLE.

@Gillou68310

Contributor Author

commented Feb 11, 2018

> Yes, it can be merged with master. Things to do:
>
> Put ZSortBOSS sources to proper place in Visual Studio project: Header files/uCodes for ZSortBOSS.h
> Source files/uCodes for ZSortBOSS.cpp
> rebase to current master, fix conflicts
> Squash all commits to one commit
> Make PR :)

I will take care of this tomorrow ;-)

> Note: I see several "ZSortBOSS_Audio4: Index out of bound" messages in gliden64.log

Yes this is happening on LLE too. It's either a core issue or a bug that is also present on real hardware.

> So this turns out to run a lot faster if you disable the LLE checks in rsp-hle. Whoops, I wasn't even testing the HLE portion. After fixing that, speed is excellent. Texture issues remain though. Unfortunately, speed using interpreter with HLE is as slow as dynarec with LLE.

Lol, I just realized that I made the same mistake; that's why performance was not good on Android :-D
I'll see what I can do about the new dynarec issue. At least it's not happening with the x86 new dynarec, which should help with debugging.

@Gillou68310

Contributor Author

commented Feb 12, 2018

Done #1717

@fzurita

Contributor

commented Feb 12, 2018

I wonder if the new dynarec issues in this game are related to the new dynarec issues in Multi Racing Championship. That game also has texture issues when the new dynarec is enabled:

mupen64plus/mupen64plus-core#433

@Gillou68310

Contributor Author

commented Feb 14, 2018

@fzurita Looks like the DMULT/DMULTU recompiled code is broken on arm. Texture issues are fixed by switching to interpreted opcode Gillou68310/mupen64plus-core@d019477
I still need to understand what is wrong with the recompiled code.
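For reference while debugging this kind of thing, the expected results of the two instructions can be written down portably. On MIPS, DMULTU multiplies two unsigned 64-bit values and DMULT two signed ones, with the low 64 bits of the 128-bit product going to LO and the high 64 bits to HI. The sketch below is not mupen64plus code; `HiLo`, `dmultu`, `dmult` and `selfCheck` are names invented here.

```cpp
#include <cassert>
#include <cstdint>

struct HiLo {
    std::uint64_t hi;  // upper 64 bits of the 128-bit product
    std::uint64_t lo;  // lower 64 bits of the 128-bit product
};

// Unsigned 64x64 -> 128 multiply built from four 32x32 -> 64 partial products.
HiLo dmultu(std::uint64_t a, std::uint64_t b) {
    const std::uint64_t aLo = a & 0xFFFFFFFFu, aHi = a >> 32;
    const std::uint64_t bLo = b & 0xFFFFFFFFu, bHi = b >> 32;
    const std::uint64_t p0 = aLo * bLo;
    const std::uint64_t p1 = aLo * bHi;
    const std::uint64_t p2 = aHi * bLo;
    const std::uint64_t p3 = aHi * bHi;
    // Middle column: three values below 2^32 each, so the sum cannot overflow.
    const std::uint64_t mid = (p0 >> 32) + (p1 & 0xFFFFFFFFu) + (p2 & 0xFFFFFFFFu);
    return HiLo{p3 + (p1 >> 32) + (p2 >> 32) + (mid >> 32),
                (mid << 32) | (p0 & 0xFFFFFFFFu)};
}

// Signed variant: take the unsigned product, then correct the high half for
// each negative operand (two's complement identity, modulo 2^64).
HiLo dmult(std::int64_t a, std::int64_t b) {
    HiLo r = dmultu(static_cast<std::uint64_t>(a), static_cast<std::uint64_t>(b));
    if (a < 0) r.hi -= static_cast<std::uint64_t>(b);
    if (b < 0) r.hi -= static_cast<std::uint64_t>(a);
    return r;
}

// Small self-check with values worked out by hand.
void selfCheck() {
    const HiLo sq = dmultu(1ull << 32, 1ull << 32);  // 2^64
    assert(sq.hi == 1 && sq.lo == 0);
    const HiLo neg = dmult(-2, 3);                   // -6, sign-extended to 128 bits
    assert(neg.hi == 0xFFFFFFFFFFFFFFFFull && neg.lo == 0xFFFFFFFFFFFFFFFAull);
}
```

Feeding the same operands to the recompiled path and comparing HI/LO against a reference like this is one way to narrow down which half of the result the ARM code gets wrong.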

@fzurita

Contributor

commented Feb 14, 2018

Ok, that's good to know. You are really good at finding dynarec issues.

@Gillou68310

Contributor Author

commented Feb 14, 2018

Trust me it takes a lot of time and patience ;-)

@gonetz referenced this issue on Mar 13, 2018: WIP Builds 3 #1574 (Closed)

@gonetz

Owner

commented Mar 13, 2018

> olivieryuyu: The only bug is the missing coronas in game in both LLE and HLE.

Hmm, that is interesting. Usually a corona is a depth-buffer-based effect: the CPU analyzes values in the depth buffer and decides whether or not to include the corona rendering in the display list. The ZSort microcode does not use a depth buffer at all. How should it work? Maybe the coronas are rendered by the CPU? Could you give me a savestate from a place where a corona should be visible?
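To show what a depth-buffer-based corona test usually amounts to, here is a deliberately simplified sketch: the CPU reads the depth value at the corona's screen position and only emits the corona if nothing closer was drawn there. The function name, the raw 16-bit comparison and the "smaller means closer" convention are all assumptions for the example; the real N64 depth format is more involved, and nothing here describes what WDC actually does.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical CPU-side visibility test against a width-by-height depth
// buffer stored row by row. Returns true if the corona at (x, y) with the
// given depth is not hidden by what was already rendered at that pixel.
bool coronaVisible(const std::vector<std::uint16_t>& depth, std::size_t width,
                   std::size_t x, std::size_t y, std::uint16_t coronaDepth) {
    assert(width > 0 && x < width && y * width + x < depth.size());
    // Smaller value = closer to the camera in this sketch.
    return coronaDepth <= depth[y * width + x];
}
```

Since ZSort sorts primitives instead of writing depth values, there would be nothing meaningful for a test like this to read, which is what makes the missing coronas puzzling.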

@olivieryuyu


commented Mar 13, 2018

You don't need a savestate: just let the intro run and, with the Angrylion plugin, you will see the coronas appear within a few minutes.

@Tasosgemah


commented Mar 13, 2018

There's also a lens flare effect in the last part of first track that is missing in both LLE and HLE. You can see it in Angrylion, when you exit the final tunnel.
