Test Drive Unlimited, really slow (30%~40%) #7347

Closed
issuer opened this issue Jan 18, 2015 · 31 comments
Labels
GE emulation Backend-independent GPU issues

@issuer

issuer commented Jan 18, 2015

Tested this game today and seems alright, doesn't crash anymore on cutscenes, it's playable too. AdHoc also works fine.

However, the game is locked at 30% speed in the menus and 40% speed in-game, which also affects the game sound; cutscene audio is fine. There are also a couple of minor graphical glitches on some buildings on the island, and the post-processing shaders don't work for some reason, even with the OpenGL backend.

ules00637_00000

This is on v0.9.9.1-1505-gcf577e9.

@daniel229
Collaborator

That game only renders at 1x resolution; it's the block transfer thing. In D3D the game would run much faster.

The glitch being drawn, as seen in the GE debugger:
[screenshot: 02]

@unknownbrackets
Collaborator

Hmm, a bezier and then immediately two prims. I wonder if we end up messing anything up after drawing a bezier?

-[Unknown]

@hrydgard
Owner

Does turning off "Simulate block transfer" improve speed in-game?

@daniel229
Collaborator

With "Simulate block transfer" turned off, the game is just black.

@issuer
Author

issuer commented Feb 2, 2015

No changes in 1.0. Direct3D9 and Auto Frameskip seem to help the game run faster, but full speed (100%) is not possible yet.

@unknownbrackets
Collaborator

Does it help the speed to set the bezier quality to "Low"?

Are we increasing / should we be increasing the vaddr after drawing beziers?

And by "are we?" I mean "we are not."

-[Unknown]

@daniel229
Collaborator

Low bezier did not help.
Another bug: it is just a black screen at 1x resolution.

@daniel229
Collaborator

Since #8259

@unknownbrackets
Collaborator

Only at 1x? Can you log like before with that code? This is even with the fix for outside range, right?

-[Unknown]

@daniel229
Collaborator

Yes, only at 1x. It's the latest build, and the log only happens at 1x.

32:24:417 idle0        I[SCEGE]: GLES\Framebuffer.cpp:1982 Destroying FBO for 00044000 : 480 x 272 x 1
32:24:417 idle0        I[SCEGE]: GLES\Framebuffer.cpp:1982 Destroying FBO for 00000000 : 480 x 272 x 1
32:24:423 user_main    I[SCEGE]: Common\FramebufferCommon.cpp:420 Creating FBO for 00044000 : 480 x 272 x 1
32:24:425 idle0        N[HLE]: GLES\Framebuffer.cpp:1395 src 0,0 -> dst 0,0  480x272
32:24:425 idle0        N[HLE]: GLES\Framebuffer.cpp:1400 glCopyImageSubData: 00000502
32:24:427 idle0        N[HLE]: GLES\Framebuffer.cpp:1395 src 0,0 -> dst 0,0  480x272
32:24:427 idle0        N[HLE]: GLES\Framebuffer.cpp:1400 glCopyImageSubData: 00000502
32:24:430 idle0        N[HLE]: GLES\Framebuffer.cpp:1395 src 0,0 -> dst 0,0  480x272
32:24:430 idle0        N[HLE]: GLES\Framebuffer.cpp:1400 glCopyImageSubData: 00000502

@unknownbrackets
Collaborator

GL_INVALID_OPERATION is generated if the texel size of the uncompressed image is not equal to the block size of the compressed image.

No compression here, so this shouldn't be a concern.

GL_INVALID_OPERATION is generated if either object is a texture and the texture is not complete.

Both textures ought to be complete, I think... and no reason they wouldn't be at 1x but would at 2x+.

GL_INVALID_OPERATION is generated if the source and destination internal formats are not compatible, or if the number of samples do not match.

These should both match as well.

So apparently, it's a reason outside the documentation? Or I'm wrong in the above. Hmm. I've tested and it works to do a copy from/to the same buffer (not overlapping.) Maybe these overlap...

If you log src->fb_address and dst->fb_address (both %08x), are they the same or different? I also see above it destroyed 00000000 but did not recreate. Seems a little suspicious.

-[Unknown]
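
The `00000502` in the log above is the raw value returned by `glGetError()`, which corresponds to `GL_INVALID_OPERATION`. A minimal sketch for turning such codes into names when logging, using the standard OpenGL enum values; `GLErrorName` is a hypothetical helper for illustration, not a function from the PPSSPP source:

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper (not part of PPSSPP): maps a raw glGetError() value
// to its symbolic name. The numeric values are the standard OpenGL enums.
inline std::string GLErrorName(uint32_t err) {
    switch (err) {
    case 0x0000: return "GL_NO_ERROR";
    case 0x0500: return "GL_INVALID_ENUM";
    case 0x0501: return "GL_INVALID_VALUE";
    case 0x0502: return "GL_INVALID_OPERATION";
    case 0x0505: return "GL_OUT_OF_MEMORY";
    case 0x0506: return "GL_INVALID_FRAMEBUFFER_OPERATION";
    default:     return "unknown GL error";
    }
}
```

Logging `GLErrorName(err)` next to the `%08x` value would make lines like `glCopyImageSubData: 00000502` easier to read at a glance.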

@daniel229
Collaborator

src->fb_address and dst->fb_address are the same.
It destroyed 00000000 because I was changing the resolution in-game.

This log is without changing the resolution, and with the two variables added to the log.

56:28:823 user_main    I[ME]: HLE\sceMpeg.cpp:423 sceMpegInit()
56:28:826 user_main    I[UTIL]: Dialog\PSPSaveDialog.cpp:85 sceUtilitySavedataInitStart(091ae5dc) - SIZES (8)
56:28:826 user_main    I[UTIL]: Dialog\PSPSaveDialog.cpp:86 sceUtilitySavedataInitStart(091ae5dc) : Game key (hex): AABA4F00024E00542E2412A521F05ACC
56:28:837 user_main    W[DISP]: HLE\sceDisplay.cpp:829 sceDisplaySetFramebuf: PSP_DISPLAY_SETBUF_IMMEDIATE without topaddr?
56:28:839 user_main    I[SCEGE]: Common\FramebufferCommon.cpp:420 Creating FBO for 00044000 : 480 x 272 x 1
56:28:839 user_main    W[G3D]: Common\FramebufferCommon.cpp:619 Memcpy fbo upload 04444000 -> 04044000
56:28:840 user_main    I[G3D]: GLES\ShaderManager.cpp:160 Linked shader: vs 13 fs 14
56:28:841 user_main    I[G3D]: GLES\ShaderManager.cpp:160 Linked shader: vs 16 fs 17
56:28:841 idle0        W[G3D]: Common\FramebufferCommon.cpp:784 Block transfer download 04044000 -> 0417b000
56:28:843 idle0        N[HLE]: GLES\Framebuffer.cpp:1395 src->fb_address: 00044000 , dst->fb_address: 00044000
56:28:843 idle0        N[HLE]: GLES\Framebuffer.cpp:1396 src 0,0 -> dst 0,0  32x128
56:28:843 idle0        N[HLE]: GLES\Framebuffer.cpp:1401 glCopyImageSubData: 00000502
56:28:851 idle0        I[G3D]: GLES\ShaderManager.cpp:160 Linked shader: vs 19 fs 20
56:28:852 idle0        N[HLE]: GLES\Framebuffer.cpp:1395 src->fb_address: 00044000 , dst->fb_address: 00044000
56:28:852 idle0        N[HLE]: GLES\Framebuffer.cpp:1396 src 32,0 -> dst 32,0  32x128
56:28:852 idle0        N[HLE]: GLES\Framebuffer.cpp:1401 glCopyImageSubData: 00000502
56:28:854 idle0        N[HLE]: GLES\Framebuffer.cpp:1395 src->fb_address: 00044000 , dst->fb_address: 00044000
56:28:854 idle0        N[HLE]: GLES\Framebuffer.cpp:1396 src 64,0 -> dst 64,0  32x128
56:28:854 idle0        N[HLE]: GLES\Framebuffer.cpp:1401 glCopyImageSubData: 00000502
56:28:859 idle0        N[HLE]: GLES\Framebuffer.cpp:1395 src->fb_address: 00044000 , dst->fb_address: 00044000

@daniel229
Collaborator

Boku no Natsuyasumi 4 also shows a black screen at 1x under water. So is it block transfer related?
[screenshot: 02]

@daniel229
Collaborator

Block transfer is also broken in Sora no Kiseki HD and Ikuze Gensan at 1x.

@unknownbrackets
Collaborator

Yes, this function is used to handle block transfers.

So it's blitting from itself to itself? That should simply do nothing. How does it even work at 2x?

If we skip src == dst, we lose a nice optimization in for example Tales of Phantasia. So I guess we need to check for overlap manually.

Do all of these blit with src and dst that have the same fb_address?

-[Unknown]

@daniel229
Collaborator

Yes, they are the same.

@unknownbrackets
Collaborator

Okay, try this:

    if (gstate_c.Supports(GPU_SUPPORTS_ANY_COPY_IMAGE)) {
        // glBlitFramebuffer can clip, but glCopyImageSubData is more restricted.
        // In case the src goes outside, we just skip the optimization in that case.
        const bool sameSize = dstX2 - dstX1 == srcX2 - srcX1 && dstY2 - dstY1 == srcY2 - srcY1;
        const bool srcInsideBounds = srcX2 <= src->renderWidth && srcY2 <= src->renderHeight;
        const bool dstInsideBounds = dstX2 <= dst->renderWidth && dstY2 <= dst->renderHeight;
        const bool xOverlap = src->fb_address == dst->fb_address && srcX2 >= dstX1 && srcX1 <= dstX2;
        const bool yOverlap = src->fb_address == dst->fb_address && srcY2 >= dstY1 && srcY1 <= dstY2;
        if (sameSize && srcInsideBounds && dstInsideBounds && !(xOverlap && yOverlap)) {

-[Unknown]

@daniel229
Collaborator

Yes, it works, and the Sora no Kiseki problem is a different bug.

@unknownbrackets
Collaborator

Just to confirm, what about this?

    if (src == dst && srcX == dstX && srcY == dstY) {
        // Let's just skip a copy where the destination is equal to the source.
        WARN_LOG_REPORT_ONCE(blitSame, G3D, "Skipped blit with equal dst and src");
        return;
    }

    if (gstate_c.Supports(GPU_SUPPORTS_ANY_COPY_IMAGE)) {
        // glBlitFramebuffer can clip, but glCopyImageSubData is more restricted.
        // In case the src goes outside, we just skip the optimization in that case.
        const bool sameSize = dstX2 - dstX1 == srcX2 - srcX1 && dstY2 - dstY1 == srcY2 - srcY1;
        const bool srcInsideBounds = srcX2 <= src->renderWidth && srcY2 <= src->renderHeight;
        const bool dstInsideBounds = dstX2 <= dst->renderWidth && dstY2 <= dst->renderHeight;
        const bool xOverlap = src == dst && srcX2 > dstX1 && srcX1 < dstX2;
        const bool yOverlap = src == dst && srcY2 > dstY1 && srcY1 < dstY2;
        if (sameSize && srcInsideBounds && dstInsideBounds && !(xOverlap && yOverlap)) {

I realized my previous change was not checking X2/Y2 properly, and was also breaking resize blits.

-[Unknown]
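
The difference between the two overlap checks above can be shown in isolation. This is a minimal sketch, assuming `X2`/`Y2` are exclusive end coordinates (`x + width`, `y + height`), which the thread does not state explicitly; `Rect`, `OverlapInclusive`, and `OverlapStrict` are hypothetical names, not PPSSPP code. In the log, adjacent 32x128 slices start at x = 0, 32, 64, so the first rectangle ends exactly where the next one begins:

```cpp
// Hypothetical rectangle type: covers x1 <= x < x2 and y1 <= y < y2
// (exclusive right/bottom edges, an assumption for this sketch).
struct Rect {
    int x1, y1, x2, y2;
};

// Comparison style of the first snippet (>= and <=):
// rectangles that merely touch at an edge count as overlapping.
constexpr bool OverlapInclusive(const Rect &a, const Rect &b) {
    return a.x2 >= b.x1 && a.x1 <= b.x2 && a.y2 >= b.y1 && a.y1 <= b.y2;
}

// Comparison style of the second snippet (> and <):
// only rectangles that share at least one pixel count as overlapping.
constexpr bool OverlapStrict(const Rect &a, const Rect &b) {
    return a.x2 > b.x1 && a.x1 < b.x2 && a.y2 > b.y1 && a.y1 < b.y2;
}

// Two neighbouring 32x128 slices, as in the logged block transfers.
constexpr Rect kSlice0{0, 0, 32, 128};
constexpr Rect kSlice1{32, 0, 64, 128};

static_assert(OverlapInclusive(kSlice0, kSlice1), "touching edges count as overlap");
static_assert(!OverlapStrict(kSlice0, kSlice1), "touching edges do not overlap");
static_assert(OverlapStrict(kSlice0, kSlice0), "a rect overlaps itself");
```

With exclusive ends, the strict form is the usual choice: it still rejects a copy of a region onto itself, but lets a copy between neighbouring slices of the same buffer take the fast path.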

@daniel229
Collaborator

Didn't work.

@unknownbrackets
Collaborator

So, to get back to the speed issues, I got some information on this game. I may have to refer to it as "the spawn of Sonic Rivals 2".

Apparently the game does something that involves downloading slices of the active framebuffer, doing something to them, and uploading those slices back... one slice at a time. I'm not clear on how many slices, but it's at least a few, and so this is painful for modern GPUs.

So I guess something similar to Sonic Rivals 2, but instead of depal, block transfer.

-[Unknown]

@daniel229
Collaborator

Graphical glitches fixed in #8689

@LunaMoo changed the title from "Test Drive Unlimited, really slow (30%~40%) and graphical glitches" to "Test Drive Unlimited, really slow (30%~40%)" Nov 15, 2017
@LunaMoo
Collaborator

LunaMoo commented May 27, 2018

Just a reminder for users being linked to this issue without looking into the game's thread: even if the performance of the effect used in this game were solved, it would still render at 1x, so completely removing the effect via a hack like this might give a better experience overall.

@Panderner
Contributor

Panderner commented Mar 17, 2020

It seems this game requires simulated block transfer effects to be turned on; if you turn them off, the whole game is a black screen.
Please add a warning that this game requires simulated block transfer effects to be turned on.

@unknownbrackets
Collaborator

@LunaMoo what does that patch out?

Although I'm generally against game specific hacks, I do support them for surgical ways to enable enhancements. I just don't like them for fixes. So in this case, maybe we could make > 1x work by default without requiring any cheats.

-[Unknown]

@Panderner
Contributor

This game is unplayable when simulated block transfer effects are not turned on.

@LunaMoo
Collaborator

LunaMoo commented Mar 17, 2020

@unknownbrackets the effect uses DRAW PRIM RECTANGLES: count= 2, and as far as I know nothing else in this game does, so the cheat just zeroes that out on boot when it's set.
The original code is
0x3C060406
0x34C60002
so basically a2 is set to 04060002 ~ DRAW PRIM RECTANGLES: count= 2; the cheat just sets a2 to 0 there, which is then saved for use.
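
The two instruction words quoted above can be decoded by hand. A minimal sketch, assuming the standard MIPS I-type layout (6-bit opcode, 5-bit rs, 5-bit rt, 16-bit immediate) and assuming the GE command word keeps the command in its top 8 bits, the primitive type in bits 16-18, and the vertex count in the low 16 bits, which matches the `04060002 ~ DRAW PRIM RECTANGLES: count= 2` reading in the comment. All names here are hypothetical and not taken from PPSSPP:

```cpp
#include <cstdint>

// Fields of a MIPS I-type instruction (lui, ori, ...).
struct ITypeInsn {
    uint32_t opcode;  // bits 31..26
    uint32_t rs;      // bits 25..21
    uint32_t rt;      // bits 20..16
    uint32_t imm;     // bits 15..0
};

constexpr ITypeInsn DecodeIType(uint32_t word) {
    return ITypeInsn{word >> 26, (word >> 21) & 31, (word >> 16) & 31, word & 0xFFFF};
}

// Value left in the target register by "lui rt, hi" followed by "ori rt, rt, lo".
constexpr uint32_t LuiOriValue(uint32_t luiWord, uint32_t oriWord) {
    return (DecodeIType(luiWord).imm << 16) | DecodeIType(oriWord).imm;
}

// Assumed GE command word layout (see lead-in).
constexpr uint32_t GeCommand(uint32_t v)   { return v >> 24; }
constexpr uint32_t GePrimType(uint32_t v)  { return (v >> 16) & 7; }
constexpr uint32_t GePrimCount(uint32_t v) { return v & 0xFFFF; }

constexpr uint32_t kValue = LuiOriValue(0x3C060406, 0x34C60002);
static_assert(DecodeIType(0x3C060406).opcode == 0x0F, "lui");
static_assert(DecodeIType(0x34C60002).opcode == 0x0D, "ori");
static_assert(DecodeIType(0x3C060406).rt == 6, "register 6 is a2");
static_assert(kValue == 0x04060002, "a2 after lui + ori");
static_assert(GePrimType(kValue) == 6 && GePrimCount(kValue) == 2, "type 6, count 2");
```

Read this way, `0x3C060406` is `lui a2, 0x0406` and `0x34C60002` is `ori a2, a2, 0x0002`, so replacing the result with 0 removes the primitive command the game would otherwise emit.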

@hrydgard
Owner

hrydgard commented Mar 17, 2020

I had a look at this a while ago and I believe it may be possible to make this work at higher resolutions with reasonable performance, using tricks similar to when we automatically created a copy target framebuffer in Digimon Adventures. That potential method will require some possibly game-specific heuristics to size the new targets, though.

@hrydgard hrydgard added this to the Future milestone Mar 17, 2020
@hrydgard hrydgard added the GE emulation Backend-independent GPU issues label Mar 17, 2020
@hrydgard
Owner

It is similar to Sonic (with the RGB depal lookups) except that instead of doing it from one framebuffer to another, it copies each tile to a small temporary space and textures from there, to save space. Fortunately at least the tiles are a lot bigger so there's not that many of them.

I've got this fully working at high resolution and fast in a branch (on desktop, at least), except for a stripe at the bottom of the screen, where it switches from 32x128 to 256x16 tiles but textures from the wrong one, since they are at the same address in VRAM. Unfortunately the code to choose a framebuffer is a bit of a mess, so I'm going to clean that up before I make a PR.
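
To get a feel for how many copy-and-texture round trips a frame might involve, here is a rough sketch. The thread only gives the tile sizes (32x128 and 256x16) and the 480 x 272 framebuffer size from the logs; the layout used below, with 32x128 tiles covering the top 256 rows and 256x16 tiles covering the remaining 16-row stripe, is an assumption made for illustration, and `CeilDiv` and `TileCount` are hypothetical helpers:

```cpp
// Integer division rounding up, for positive values.
constexpr int CeilDiv(int a, int b) {
    return (a + b - 1) / b;
}

// Number of tileW x tileH tiles needed to cover a regionW x regionH area,
// counting partially covered tiles at the right/bottom edge.
constexpr int TileCount(int regionW, int regionH, int tileW, int tileH) {
    return CeilDiv(regionW, tileW) * CeilDiv(regionH, tileH);
}

// Assumed layout: 480x256 main area in 32x128 tiles, 480x16 stripe in 256x16 tiles.
constexpr int kMainTiles = TileCount(480, 256, 32, 128);
constexpr int kStripeTiles = TileCount(480, 16, 256, 16);

static_assert(kMainTiles == 30, "15 columns x 2 rows");
static_assert(kStripeTiles == 2, "2 columns x 1 row");
```

Under that assumed layout, each pass over the screen is a few dozen small transfers, each followed by a draw that depends on it, which is consistent with the description of tiles being "a lot bigger" than in Sonic but still costly when every one forces a readback.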

@Panderner
Contributor

Panderner commented Aug 31, 2020

Screenshot_2020-08-31-17-51-50-73_2f85358b2198d26f8aca533d68bee793
After #13355 the game runs great on a Realme C2 in OpenGL without the cheat workaround, but with some random frame dips.

@hrydgard
Owner

hrydgard commented Aug 31, 2020

Yeah it's still a heavy game, but the big performance killing problem is gone.

Closing. Fixed by #13355.

@unknownbrackets unknownbrackets modified the milestones: Future, v1.11.0 Sep 7, 2020