Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(game): show correct softcore users count in Achievement Distribution graph #2466

Merged

Conversation

TwosomesUP
Copy link
Contributor

@TwosomesUP TwosomesUP commented May 24, 2024

Initial commit and composer fix
Update achievement distribution selections and migration to Eloquent
Update calculateBuckets and handleAllAchievementsCase to accommodate selection update
Update V1Test to accommodate selection update

Resolves: #2399

@wescopeland
Copy link
Member

There are a lot of changes in this branch that seem unrelated to the achievement distribution issue. I'm not sure what happened - perhaps composer fix was overzealous.

We should thin out this branch so only your specific changes are included, though.

@TwosomesUP
Copy link
Contributor Author

Yep, composer fix definitely seemed to have went above and beyond. I can see if I can pull out those changes.

Initial commit and composer fix
Update achievement distribution selections and migration to Eloquent
Update calculateBuckets and handleAllAchievementsCase to accomodate selection update
Update V1Test to accomodate selection update

Composer fix
@TwosomesUP
Copy link
Contributor Author

@wescopeland additional changes added by composer fix have been removed.

@wescopeland
Copy link
Member

Doing some local testing, the new implementation appears to be an order of magnitude slower than what's currently in master. I haven't dived into any possible reason why this may be occurring - I'm just looking purely at query execution time.

These queries are taken from a game that has 175,000 player_achievements records associated to it.

master
Screenshot 2024-05-24 at 2 14 58 PM

This branch
Screenshot 2024-05-24 at 2 14 44 PM

@TwosomesUP
Copy link
Contributor Author

Got it. I can investigate the SQL as well to see if there's any way to optimize it. What's curious is that this query pulls from player_games, which seems to have 1 row per player per game, instead of player_achievements which is 1 row per achievement per player.

There wouldn't be a SQL statement I could run locally that would insert 200k rows into player_achievements would there? :)

@wescopeland
Copy link
Member

Ah, truthfully I didn't even notice that these queries aren't reading from player_achievements, I just made a (poor) assumption 😅

@TwosomesUP
Copy link
Contributor Author

I made some minor optimizations to the queries, and they seem to be executing faster, but I'll need to set up some volume testing on my end to be sure.

image

@wescopeland
Copy link
Member

Thanks for taking a look. Just checked on my end with latest from the branch, seems about the same 😟

Screenshot 2024-05-24 at 4 53 35 PM

@wescopeland
Copy link
Member

Sorry, same problem on latest 😔

Branch
Screenshot 2024-05-24 at 8 24 22 PM

master
Screenshot 2024-05-24 at 8 24 45 PM

@TwosomesUP
Copy link
Contributor Author

No worries! I'm going to break this off into another branch to play around with. Hopefully I can figure something out that's more performant. 😁

@TwosomesUP
Copy link
Contributor Author

Ok, I was able to insert about 25k rows into player_games for game_id 1 with a variety of unlock combinations, and I'm definitely seeing the performance hit now!

image

image

This should allow me to run better unit tests with this selection. 😄

@TwosomesUP
Copy link
Contributor Author

Hey @wescopeland !

I did some extensive testing/tinkering with the Achievement Distribution query and as the data is currently structured, I'm not sure we'd be able to get the selection much faster than it currently is. The fastest I could get it was 50ms for 25k rows on the player_games table.

That said, there is a different approach we could take to speed this up, but it would require altering the player_games table and adding a new stored Generated Column for softcore unlocks, and an index. This has a much bigger impact, so I'll leave it up to RA if you all would like to consider this route, but here are my results from local testing:

Legacy
Legacy

PR Aggregate (Eloquent)
Aggregate

Alter + Index (Eloquent)
WithAlterAndIndex_Eloquent

Alter + Index (DB::select)
WithAlterAndIndex_DBSelect

I'm not sure if there is a configuration difference with my instance of MySQL but the Alter + Index seems to have equivalent performance compared to the legacy query. For your reference, here is the SQL I used for the alter and the index build:

alter table player_games
add column achievements_unlocked_softcore int unsigned generated always as (greatest(achievements_unlocked - achievements_unlocked_hardcore, 0)) stored;

alter table player_games add index player_games_game_id_achievements_unlocked_softcore_index (game_id, achievements_unlocked_softcore)

Again, this can have a big impact since it alters a table with potentially a million+ rows in production, and it would calculate the softcore unlocks for each one of those (one time). Once altered, the column would be calculated for a row upon insert or recalculated upon update.

With that in mind, it's completely understandable if this is something RA would not want to pursue at the moment.

@wescopeland
Copy link
Member

wescopeland commented May 26, 2024

I had a feeling needing a new denormalized column would be how things might shake out 😆

That said, I would be fine with pursuing this. Before moving forward, I think another maintainer should also sign off on the idea though (cc @RetroAchievements/web-maintainers). I can think back to other times where I've needed to perform a calculation to peek into a user's mixed progress. Just having the raw value would've been more helpful and performant.

Going down this road is a more significant lift and is much more fraught with peril. It would need to occur in two phases (releases): a write phase and a read phase.

For the write phase, we would need a PR/changeset which includes:

  • The migration itself, in database/migrations/platform.
  • Changes made to the UpdatePlayerGameMetrics action so these new values get populated.
  • Updated tests, etc.

I personally haven't triggered a full resync of player game metrics on the server, but I know we've done it once or twice in the past when we first began experimenting with the performance benefits of denormalized data. Once the code is deployed, we'd trigger a full resync, which would queue up probably about 10,000,000 jobs. This would take quite some time to finish.

For the read phase, it'd essentially be a PR containing what you have now, but instead reading from these new denormalized values.

@wescopeland wescopeland requested a review from a team May 26, 2024 14:04
@TwosomesUP
Copy link
Contributor Author

Great! I can start researching the write phase steps this week. I may reach out for guidance on a few things if I hit a wall, but I'll be sure to let you know.

@TwosomesUP
Copy link
Contributor Author

@wescopeland a PR for the Write Phase (#2477) has been submitted. If/when the new PR is approved and merged, I can update this PR to accommodate the changes. Just let me know if there's anything I may have missed, or if you have any questions/concerns.

Thanks!

@TwosomesUP
Copy link
Contributor Author

@wescopeland circling back to this and after some testing, it seems like the automatic "Soft Delete" check when selecting from the player_games table using Eloquent and the PlayerGame model is impacting performance. Specifically when the deleted_at column is included in the selection. Here are the timings for selections with and without the soft delete check:

Eloquent with Soft Delete check:

$countQuery = PlayerGame::selectRaw("$countColumn as AwardedCount, count(*) as NumUniquePlayers")
            ->where("$countColumn", ">", 0)
            ->where('player_games.game_id', $gameID)
            ->groupBy('AwardedCount')
            ->orderByDesc('AwardedCount');

image

Eloquent without Soft Delete check (code same as above):

image
*This one removes the use SoftDeletes from the PlayerGame model. Probably don't want to do this. :)

DB::table() select without Soft Delete check:

$countQuery = DB::table('player_games')
            ->selectRaw("$countColumn as AwardedCount, count(*) as NumUniquePlayers")
            ->where("$countColumn", ">", 0)
            ->where('player_games.game_id', $gameID)
            ->groupBy('AwardedCount')
            ->orderByDesc('AwardedCount');

image
*This one bypasses the PlayerGame model and does a direct DB select

@wescopeland
Copy link
Member

@TwosomesUP Sorry for my late reply here. I'm in the process of setting up some production data to test with, which should be done later today. As soon as that is set up, getting you a review will be my highest priority 👍

@TwosomesUP
Copy link
Contributor Author

I appreciate it Wes, but no rush! I still need to update this pull request with the new selects that account for the new achievements_unlocked_softcore column.

I've been on vacation, so I haven't had a chance to get to it yet, but I'll try to get that pushed either today or tomorrow.

app/Helpers/database/player-achievement.php Show resolved Hide resolved
app/Helpers/database/player-achievement.php Outdated Show resolved Hide resolved
app/Helpers/database/player-achievement.php Outdated Show resolved Hide resolved
resources/views/pages-legacy/gameInfo.blade.php Outdated Show resolved Hide resolved
@wescopeland
Copy link
Member

Hi @TwosomesUP! Looks like we may have a small merge conflict. Then, assuming @Jamiras approves, we can merge to get this in the next release.

@wescopeland wescopeland changed the title Fixes for Achievement Distribution selections fix(game): show correct softcore users count in Achievement Distribution graph Sep 23, 2024
@wescopeland wescopeland merged commit 79e40a1 into RetroAchievements:master Sep 23, 2024
6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Wrong softcore users count in Achievement Distribution graph
3 participants