OpcodeDecoding: Cache vertex sizes #11067

Merged
merged 3 commits into from Sep 19, 2022

@K0bin K0bin commented Sep 15, 2022

I struggled a bit to find a good place for the cache, but I'm pretty happy with having it in the callback.
The hit rate came out at 97% in Mario Galaxy. (I hooked it up to the stats HUD for testing.)

Together with #11066, Mario Galaxy goes from 85 fps to 140 fps in the hub world. (5900X, downclocked to 2.2 GHz)
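
For readers unfamiliar with the idea, here is a minimal sketch of what caching vertex sizes per vertex format, with hit/miss counters for a stats display, could look like. All names here (`VertexSizeCache`, `Get`, `Invalidate`, `HitRate`) and the slot count of 8 are assumptions made for illustration; this is not the code from the PR.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <optional>
#include <utility>

// Hypothetical sketch; names and layout are illustrative, not Dolphin's code.
class VertexSizeCache
{
public:
  static constexpr std::size_t kNumFormats = 8;  // assumed number of format slots

  // `compute` is the slow path that derives the vertex size for a format.
  explicit VertexSizeCache(std::function<std::uint32_t(std::uint8_t)> compute)
      : m_compute(std::move(compute))
  {
  }

  // Returns the cached size if present, otherwise computes and stores it.
  std::uint32_t Get(std::uint8_t format)
  {
    const std::size_t slot = format % kNumFormats;
    std::optional<std::uint32_t>& entry = m_sizes[slot];
    if (entry.has_value())
    {
      ++m_hits;
      return *entry;
    }
    ++m_misses;
    entry = m_compute(static_cast<std::uint8_t>(slot));
    return *entry;
  }

  // Must be called whenever the format description for a slot changes.
  void Invalidate(std::uint8_t format) { m_sizes[format % kNumFormats].reset(); }

  // Fraction of lookups served from the cache (0.0 if there were none).
  double HitRate() const
  {
    const std::uint64_t total = m_hits + m_misses;
    return total == 0 ? 0.0 : static_cast<double>(m_hits) / static_cast<double>(total);
  }

private:
  std::function<std::uint32_t(std::uint8_t)> m_compute;
  std::array<std::optional<std::uint32_t>, kNumFormats> m_sizes{};
  std::uint64_t m_hits = 0;
  std::uint64_t m_misses = 0;
};
```

The important design point in a cache like this is invalidation: a stale entry would make the decoder skip the wrong number of bytes, so every change to a format slot has to clear it.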

@K0bin K0bin mentioned this pull request Sep 15, 2022
@K0bin K0bin force-pushed the cache-vertex-size branch 3 times, most recently from eb39450 to dd3375e Compare September 15, 2022 21:52

K0bin commented Sep 15, 2022

The new version just uses the vertex size that's stored in the VertexLoader if that's available.
Works for 99% of all calls in Mario Galaxy.
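
As a rough illustration of the approach described above, the sketch below returns the size stored in an existing loader when there is one and only falls back to a slower computation otherwise. `LoaderSketch`, `LoaderSlots`, `VertexSizeFor` and the slot count of 8 are hypothetical names and values chosen for this example, not the actual Dolphin types.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a vertex loader that already knows its vertex size.
struct LoaderSketch
{
  std::uint32_t vertex_size;
};

constexpr std::size_t kFormatSlots = 8;  // assumed number of format slots

// One optional loader per format slot; nullptr means no loader is available.
using LoaderSlots = std::array<const LoaderSketch*, kFormatSlots>;

// Returns the size stored in the loader when one exists (the common case),
// otherwise calls the slower computation supplied by the caller.
template <typename SlowPath>
std::uint32_t VertexSizeFor(const LoaderSlots& slots, std::uint8_t format, SlowPath&& slow_path)
{
  const LoaderSketch* loader = slots[format % kFormatSlots];
  if (loader != nullptr)
    return loader->vertex_size;
  return slow_path(format);
}
```

Compared with a separate cache, this keeps a single source of truth: the size lives wherever the loader lives, so there is no extra state to invalidate.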

@Nerboruto

Could it bring some improvement in other games like F-Zero or Metroid Prime?

K0bin commented Sep 17, 2022

I think @JMC47 tried it with those and it was between 2 and 5% faster.

K0bin commented Sep 18, 2022

Ready for another round of reviews.

@iwubcode iwubcode left a comment

Untested, but code-wise LGTM. Great work @K0bin!!

@K0bin K0bin force-pushed the cache-vertex-size branch 2 times, most recently from 907b5d7 to b603c91 Compare September 18, 2022 20:10
@K0bin K0bin force-pushed the cache-vertex-size branch 2 times, most recently from a80f5a0 to eb55bd2 Compare September 18, 2022 23:06

K0bin commented Sep 18, 2022

sigh

Either my local clang-format doesn't pick up the Dolphin config file or the CI bot has a different set of rules.

@Pokechu22 Pokechu22 left a comment

Looks good to me. Most of my testing was on single core, which didn't show much of an improvement (possibly because of other accuracy settings I have enabled), but I saw a much larger improvement on dual-core. Thanks!

Regarding clang-format: I use MSVC's built-in formatting functionality, which picks up the config file properly. I think the behavior also depends on the version of clang-format: I believe clang-format-9 is in use, and different versions behave differently with the same config.

@JMC47 JMC47 merged commit 6f4f5b0 into dolphin-emu:master Sep 19, 2022
@K0bin K0bin deleted the cache-vertex-size branch September 19, 2022 11:11
@theofficialgman

I've tested this and the preceding PR.
Unfortunately, no FPS improvement seen here (Nintendo Switch, Tegra X1, quad-core Cortex-A57 @ 2 GHz): 25 FPS in that scene before and after the PRs.

@badkarma12

Small boost to Pokémon Colosseum/XD in scenes with Shadow Pokémon auras and moves like Sunny Day.
