Use batching in BackgroundDataStoreProcessor processes #36128

cyperdark wants to merge 4 commits into ppy:master

Conversation
Did you try performing any operations when testing this? It should deadlock. Realm doesn't support multiple concurrent writes, so holding writes open for long periods as you are doing here would, under my presumptions, cause something like running gameplay (and triggering an update-thread-bound write) to freeze the game.
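A minimal sketch of the failure mode described here, assuming the standard Realm .NET API (`BeginWrite`/`Commit`); the model class and `Recalculate` are illustrative stand-ins, not osu!'s actual code:

```csharp
// Sketch only: why a long-held write transaction is dangerous.
// Realm serialises writes per file, so while this transaction is open,
// any other thread calling realm.Write(...) blocks until Commit/Dispose.
using System.Collections.Generic;
using Realms;

public static class LongWriteSketch
{
    public static void ReprocessAll(Realm realm, IEnumerable<BeatmapInfoSketch> beatmaps)
    {
        using (var transaction = realm.BeginWrite())
        {
            foreach (var beatmap in beatmaps)
            {
                // Expensive work inside the transaction keeps it open for
                // seconds or minutes; a gameplay-triggered write elsewhere
                // would stall for that entire duration.
                beatmap.StarRating = Recalculate(beatmap);
            }

            transaction.Commit();
        }
    }

    private static double Recalculate(BeatmapInfoSketch b) => 0; // placeholder for real work
}

// Placeholder model standing in for osu!'s realm-backed BeatmapInfo.
public partial class BeatmapInfoSketch : RealmObject
{
    public double StarRating { get; set; }
}
```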
when it was recalculating beatmaps i was scrolling through song select and it was working fine. let me test gameplay and song select once again; if it won't work i'll try another approach i have in mind
Please air the other solution before you work on it, because if it's what I think you're planning, it likely has other implications.
Doing intermittent writes like that will delay the updates from being visible. It also increases the size of individual transactions. Maybe okay, but it will need due consideration. And it likely will not fix the "abysmal performance" group of users, who actually see things taking seconds per realm write. What we really need there is to find someone who gets very low performance on realm transactions and figure out what is causing that, rather than working around the issue.
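The batching being weighed up here can be sketched roughly as follows; `realm.Write` is the real Realm .NET API, while `process` and the batch loop are illustrative stand-ins for the PR's per-beatmap work:

```csharp
// Sketch: one Realm write transaction per batch instead of one per item.
// batchSize trades visibility latency (updates only appear when a batch
// commits) against transaction count and per-commit fsync overhead.
using System;
using System.Collections.Generic;
using Realms;

public static class BatchedWrites
{
    public static void Process(Realm realm, IReadOnlyList<Guid> ids,
                               Action<Realm, Guid> process, int batchSize = 100)
    {
        for (int start = 0; start < ids.Count; start += batchSize)
        {
            int end = Math.Min(start + batchSize, ids.Count);

            // Each batch is a single transaction; a larger batch means
            // fewer commits but a longer window where nothing is visible.
            realm.Write(() =>
            {
                for (int i = start; i < end; i++)
                    process(realm, ids[i]);
            });
        }
    }
}
```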
hmm, the best candidates for testing are people who have their lazer files/ and client.realm on HDD. So using bulk update by 100 beatmaps in
Does it add overhead while updating? If so, that's my concern. Put another way: we want to be able to run background operations like this without the user being concerned that they are running. Making it run faster is fine and all, but it's not fixing the actual issue we are dealing with. So this is super low priority as long as all that's happening is "batching more".
idk how to measure overhead, but here's a video with a performance graph while calculating star ratings and scrolling in song select. Closer to the end i reran calculations with a small chunk of maps and then started updating tags (both used bulk: star ratings 100, user tags 500): https://youtu.be/m_7wohmkI5U
with only user tags updating: https://youtu.be/6X7xrV1VsyA
recalculating 5k maps: user tags (doesn't lag in song select for me):
Star Ratings recalculation and User Tags population processes
round 2: i've revived my old pc, copied all beatmaps to HDD and tested once again.
It feels less laggy with bulk changes, and i have a suspicion that the lagging comes from cases where beatmap recalculation or map parsing fails. i didn't test the other functions because for some reason some of them got stuck at some point and i decided to skip them for now. also here's the runtime.log files
I happened to test this as well on my old HDD, and got similar results to what the author posted in #36128 (comment). I should also point out that the whole UI (main menu / song select / etc.) feels very laggy when populating user tags without this change, as if it runs at 5 fps. After this change the UI feels much smoother. The PR does have bugs (as already discussed above), but I think it should not be "super low priority", as it does solve the problem and improve the experience for HDD users. Indeed I would say that spending several minutes waiting for this process to complete is way better than leaving it in the background essentially forever. Another approach would be to first copy the whole client.realm (or other database files, idk) into RAM, and write it back to HDD after everything is finished. Honestly this is the first idea I came up with, though I understand it also has its own issues.
@cihe13375 i have a few local changes that do bulk writes, can you try them as well to see if it still lags, since it writes every 500 processed beatmaps
You're missing the point though. The whole reason for using realm is that this is not supposed to happen in the first place. I want to understand why. Not hack around it. |
I tried, and I feel that the largest lag happens at disk writes (e.g. when the SR calculation reaches ~500, 1000, etc.), where the whole UI is frozen for 0.5-1s. At other times the song select does lag a bit, but it is kinda usable (way better than batch size 1).
I don't really know how realm works internally, but the realm .NET documentation (https://github.com/realm/realm-dotnet/blob/community/Guides/crud/write-transactions.md?plain=1) says the following:
As each processed map calls
I tried a silly local change that removed the other operations in the user tag population:
diff --git a/osu.Game/Database/BackgroundDataStoreProcessor.cs b/osu.Game/Database/BackgroundDataStoreProcessor.cs
index f8d1b9ae51..8665ce3dab 100644
--- a/osu.Game/Database/BackgroundDataStoreProcessor.cs
+++ b/osu.Game/Database/BackgroundDataStoreProcessor.cs
@@ -655,8 +655,6 @@ private void backpopulateUserTags()
if (metadataSourceFetchDate <= lastPopulation)
{
- Logger.Log($@"Skipping user tag population because the local metadata source hasn't been updated since the last time user tags were checked ({lastPopulation.Value:d})");
- return;
}
Logger.Log(@"Querying for beatmaps that do not have user tags");
@@ -695,27 +693,6 @@ private void backpopulateUserTags()
// ReSharper disable once MethodHasAsyncOverload
realmAccess.Write(r =>
{
- BeatmapInfo beatmap = r.Find<BeatmapInfo>(id)!;
-
- bool lookupSucceeded = localMetadataSource.TryLookup(beatmap, out var result);
-
- if (lookupSucceeded)
- {
- Debug.Assert(result != null);
-
- var userTags = result.UserTags.ToHashSet();
-
- if (!userTags.SetEquals(beatmap.Metadata.UserTags))
- {
- beatmap.Metadata.UserTags.Clear();
- beatmap.Metadata.UserTags.AddRange(userTags);
- return true;
- }
-
- return false;
- }
-
- Logger.Log(@$"Could not find {beatmap.GetDisplayString()} in local cache while backpopulating missing user tags");
return false;
});
The performance is just as bad, and I can clearly hear the HDD working during the stubbed "user tag population" process. If I move client.realm to SSD and create a symlink at the osu data folder on HDD, all problems are gone.
because of high disk usage, here's a video with disk usage graphs: https://youtu.be/j5MrpL_q0c0
as you can see in the video, a regular lazer version does too many disk writes, especially in the "User Tags Population" process. I also think that for the user tags process i can increase the bulk size to 2000-3000, since this process is way faster and it'll use less disk, which is good for both lazer and the disk itself
made final changes ^ and ran tests again on HDD, this time fully waiting for them to finish and rebooting the pc after each one.
for some reason i have a score in the database with null in beatmapInfo, and because of that
Another question i have about
force-pushed from aed1051 to 97ab0ef
Please read the code to get an answer to this.
hey @peppy, i asked people on reddit for testing and some of them said that around
I started investigating why and where this thing causes the issue. And after 4-5 hours i found that if i disable
After another 30 minutes i found the class
here's a video with and without this code: 1768772278_LGGcvLl2jy.mp4
locally i reverted the PR and added notifications to
Thanks, but unfortunately I think you're missing the point of testing. Nothing is gained from the testing you're having people do. Also, sharing builds like this on reddit is a HUGE security no-no; I'd suggest removing your post IMMEDIATELY.
This is a known issue. See realm/realm-dotnet#2917 for one place I've reported this. But without notifications, realm is useless to us.
If you spend another 30 minutes blaming this class, you will find that it was made to improve performance by reducing subscriptions to a bare minimum. I will reiterate: this is not about realm operations being 50% slower "without your changes" for some users. The issue is that for some users, realm operations are 100-1000x slower than for the average osu! user. For no good reason.
deleted
not blaming this class specifically; in general it's about finding the place which causes things to lag spike
oh i see, but in this PR i am trying to speed up the processes by using bulk writes, because one day i was curious what exactly
p.s. i have uncommitted changes which properly add a BulkWrite
So can this PR still be merged with the bulk writes idea, or is it not needed?
It needs validation testing by myself or someone else on the team with a sound approach to testing. This may take weeks to months, so please chill. If your
force-pushed from 97ab0ef to f941673
changed the title from Star Ratings recalculation and User Tags population processes to Use batching in BackgroundDataStoreProcessor processes
peppy left a comment
not even reviewing most of the code, just a few things
public bool TryLookup(SqliteConnection db, int version, BeatmapInfo beatmapInfo, [NotNullWhen(true)] out OnlineBeatmapMetadata? onlineMetadata)
What on earth? Is this a copy paste of the method above? Why?
- private SqliteConnection getConnection() =>
+ public SqliteConnection GetConnection() =>
Please write a benchmark in the benchmarks project or otherwise showing why this is required. I'd want to know the overheads of this rather than blindly complicating code because we can.
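For context, the change in question swaps an open-connection-per-call pattern for a shared connection. A rough sketch of the two patterns using Microsoft.Data.Sqlite; the class, method names, and table name are illustrative stand-ins, not osu!'s actual `LocalCachedBeatmapMetadataSource` code:

```csharp
// Sketch of the two SQLite access patterns under discussion.
// Opening a SqliteConnection per lookup pays the file-open cost on every
// call; a shared connection pays it once but must manage lifetime/threading.
using System;
using System.Data;
using Microsoft.Data.Sqlite;

public sealed class MetadataLookupSketch : IDisposable
{
    private readonly string connectionString;
    private SqliteConnection? sharedConnection;

    public MetadataLookupSketch(string dbPath)
    {
        connectionString = $"Data Source={dbPath};Mode=ReadOnly";
    }

    // Pattern A: connection per lookup (the original private getConnection()).
    public long CountPerCall()
    {
        using var db = new SqliteConnection(connectionString);
        db.Open();
        using var cmd = db.CreateCommand();
        cmd.CommandText = "SELECT COUNT(*) FROM beatmaps"; // illustrative table name
        return (long)cmd.ExecuteScalar()!;
    }

    // Pattern B: shared, lazily opened connection (the PR's public GetConnection()).
    public long CountShared()
    {
        sharedConnection ??= new SqliteConnection(connectionString);
        if (sharedConnection.State != ConnectionState.Open)
            sharedConnection.Open();

        using var cmd = sharedConnection.CreateCommand();
        cmd.CommandText = "SELECT COUNT(*) FROM beatmaps";
        return (long)cmd.ExecuteScalar()!;
    }

    public void Dispose() => sharedConnection?.Dispose();
}
```

On an HDD the per-call pattern multiplies seek-heavy file opens across thousands of lookups, which is presumably what the benchmark request above is meant to quantify.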
HDD WD20EZBX - 275Mb/s(?)
| amount | 9057 | 43437 |
|---|---|---|
| before | 20364ms | 38700ms |
| after | 17269ms | 22975ms |
| speed up | 1.179x | 1.684x |
SSD MZ-V8P1T0BW - 7000Mb/5000Mb
| amount | 9057 | 43498 |
|---|---|---|
| before | 4528ms | 9748ms |
| after | 3153ms | 8790ms |
| speed up | 1.436x | 1.108x |
i will edit main post soon
please show the code used to benchmark
i manually did testing:
- shutdown pc => start pc
- run lazer before sqlite changes
- shutdown pc => start pc
- run lazer with sqlite changes
and i applied those changes on top of that:
diff --git a/osu.Game/Database/BackgroundDataStoreProcessor.cs b/osu.Game/Database/BackgroundDataStoreProcessor.cs
index b2a535cc33..eeb0c4a855 100644
--- a/osu.Game/Database/BackgroundDataStoreProcessor.cs
+++ b/osu.Game/Database/BackgroundDataStoreProcessor.cs
@@ -76,6 +76,8 @@ public partial class BackgroundDataStoreProcessor : Component
private LocalCachedBeatmapMetadataSource localMetadataSource = null!;
+ private Dictionary<Guid, (int EndTimeObjectCount, int TotalObjectCount)> beatmapsObjectCountsCache = [];
+
protected virtual int TimeToSleepDuringGameplay => 30000;
protected override void LoadComplete()
@@ -87,19 +89,22 @@ protected override void LoadComplete()
ProcessingTask = Task.Factory.StartNew(() =>
{
Logger.Log("Beginning background data store processing..");
+ Thread.Sleep(10000);
- clearOutdatedStarRatings();
- populateMissingStarRatings();
- processOnlineBeatmapSetsWithNoUpdate();
+ // clearOutdatedStarRatings();
+ // populateMissingStarRatings();
+ // processOnlineBeatmapSetsWithNoUpdate();
// Note that the previous method will also update these on a fresh run.
- processBeatmapsWithMissingObjectCounts();
- processScoresWithMissingStatistics();
- convertLegacyTotalScoreToStandardised();
- upgradeScoreRanks();
+ // processBeatmapsWithMissingObjectCounts();
+ // processScoresWithMissingStatistics();
+ // convertLegacyTotalScoreToStandardised();
+ // upgradeScoreRanks();
backpopulateMissingSubmissionAndRankDates();
backpopulateUserTags();
}, TaskCreationOptions.LongRunning).ContinueWith(t =>
{
+ beatmapsObjectCountsCache.Clear();
+
if (t.Exception?.InnerException is ObjectDisposedException)
{
Logger.Log("Finished background aborted during shutdown");
@@ -159,7 +164,8 @@ private void populateMissingStarRatings()
realmAccess.Run(r =>
{
- foreach (var b in r.All<BeatmapInfo>().Where(b => b.StarRating < 0 && b.BeatmapSet != null))
+ foreach (var b in r.All<BeatmapInfo>())
+ // foreach (var b in r.All<BeatmapInfo>().Where(b => b.StarRating < 0 && b.BeatmapSet != null))
beatmapIds.Add(b.ID);
});
@@ -215,6 +221,11 @@ Ruleset getRuleset(RulesetInfo rulesetInfo)
var working = beatmapManager.GetWorkingBeatmap(beatmap);
var ruleset = getRuleset(working.BeatmapInfo.Ruleset);
+ beatmapsObjectCountsCache.Add(id, (
+ working.Beatmap.HitObjects.Count(h => h is IHasDuration),
+ working.Beatmap.HitObjects.Count
+ ));
+
Debug.Assert(ruleset != null);
var calculator = ruleset.CreateDifficultyCalculator(working);
@@ -320,7 +331,8 @@ private void processBeatmapsWithMissingObjectCounts()
realmAccess.Run(r =>
{
- foreach (var b in r.All<BeatmapInfo>().Where(b => b.TotalObjectCount < 0 || b.EndTimeObjectCount < 0))
+ foreach (var b in r.All<BeatmapInfo>())
+ // foreach (var b in r.All<BeatmapInfo>().Where(b => b.TotalObjectCount < 0 || b.EndTimeObjectCount < 0))
beatmapIds.Add(b.ID);
});
@@ -363,6 +375,22 @@ private void processBeatmapsWithMissingObjectCounts()
try
{
+ if (beatmapsObjectCountsCache.TryGetValue(id, out var result))
+ {
+ actions.Add(realm =>
+ {
+ var beatmap = realm.Find<BeatmapInfo>(id)!;
+
+ beatmap.TotalObjectCount = result.TotalObjectCount;
+ beatmap.EndTimeObjectCount = result.EndTimeObjectCount;
+
+ ((IWorkingBeatmapCache)beatmapManager).Invalidate(beatmap);
+ });
+
+ ++processedCount;
+ continue;
+ }
+
// Moved from BeatmapUpdater.ProcessObjectCounts to reduce writes to disk
var workingBeatmapCache = (IWorkingBeatmapCache)beatmapManager;
workingBeatmapCache.Invalidate(beatmap);
@@ -407,15 +435,17 @@ private void processScoresWithMissingStatistics()
realmAccess.Run(r =>
{
- foreach (var score in r.All<ScoreInfo>().Where(s => !s.BackgroundReprocessingFailed))
- {
- if (score.BeatmapInfo != null
- && score.Statistics.Sum(kvp => kvp.Value) > 0
- && score.MaximumStatistics.Sum(kvp => kvp.Value) == 0)
- {
- scoreIds.Add(score.ID);
- }
- }
+ foreach (var score in r.All<ScoreInfo>())
+ scoreIds.Add(score.ID);
+ // foreach (var score in r.All<ScoreInfo>().Where(s => !s.BackgroundReprocessingFailed))
+ // {
+ // if (score.BeatmapInfo != null
+ // && score.Statistics.Sum(kvp => kvp.Value) > 0
+ // && score.MaximumStatistics.Sum(kvp => kvp.Value) == 0)
+ // {
+ // scoreIds.Add(score.ID);
+ // }
+ // }
});
if (scoreIds.Count == 0)
@@ -498,10 +528,10 @@ private void convertLegacyTotalScoreToStandardised()
HashSet<Guid> scoreIds = realmAccess.Run(r => new HashSet<Guid>(
r.All<ScoreInfo>()
- .Where(s => !s.BackgroundReprocessingFailed
- && s.BeatmapInfo != null
- && s.IsLegacyScore
- && s.TotalScoreVersion < LegacyScoreEncoder.LATEST_VERSION)
+ // .Where(s => !s.BackgroundReprocessingFailed
+ // && s.BeatmapInfo != null
+ // && s.IsLegacyScore
+ // && s.TotalScoreVersion < LegacyScoreEncoder.LATEST_VERSION)
.AsEnumerable()
// must be done after materialisation, as realm doesn't want to support
// nested property predicates
@@ -592,7 +622,7 @@ private void upgradeScoreRanks()
HashSet<Guid> scoreIds = realmAccess.Run(r => new HashSet<Guid>(
r.All<ScoreInfo>()
- .Where(s => s.TotalScoreVersion < 30000013 && !s.BackgroundReprocessingFailed) // last total score version with a significant change to ranks
+ // .Where(s => s.TotalScoreVersion < 30000013 && !s.BackgroundReprocessingFailed) // last total score version with a significant change to ranks
.AsEnumerable()
// must be done after materialisation, as realm doesn't support
// filtering on nested property predicates or projection via `.Select()`
@@ -705,8 +735,9 @@ private void backpopulateMissingSubmissionAndRankDates()
// that said, one difficulty in ranked state is enough for the backpopulation to work.
HashSet<Guid> beatmapSetIds = realmAccess.Run(r => new HashSet<Guid>(
r.All<BeatmapSetInfo>()
- .Filter($@"{nameof(BeatmapSetInfo.StatusInt)} > 0 && ({nameof(BeatmapSetInfo.DateRanked)} == null || {nameof(BeatmapSetInfo.DateSubmitted)} == null) "
- + $@"&& ANY {nameof(BeatmapSetInfo.Beatmaps)}.{nameof(BeatmapInfo.StatusInt)} > 0")
+ .Filter($@"{nameof(BeatmapSetInfo.StatusInt)} > 0 && ANY {nameof(BeatmapSetInfo.Beatmaps)}.{nameof(BeatmapInfo.StatusInt)} > 0")
+ // .Filter($@"{nameof(BeatmapSetInfo.StatusInt)} > 0 && ({nameof(BeatmapSetInfo.DateRanked)} == null || {nameof(BeatmapSetInfo.DateSubmitted)} == null) "
+ // + $@"&& ANY {nameof(BeatmapSetInfo.Beatmaps)}.{nameof(BeatmapInfo.StatusInt)} > 0")
.AsEnumerable()
.Select(b => b.ID)));
@@ -830,11 +861,11 @@ private void backpopulateUserTags()
// is less accurate than the cache file's fetch date which is stored with higher precision in the filesystem metadata.
var metadataSourceFetchDate = localMetadataSource.GetCacheFetchDate()?.Date;
- if (metadataSourceFetchDate <= lastPopulation)
- {
- Logger.Log($@"Skipping user tag population because the local metadata source hasn't been updated since the last time user tags were checked ({lastPopulation.Value:d})");
- return;
- }
+ // if (metadataSourceFetchDate <= lastPopulation)
+ // {
+ // Logger.Log($@"Skipping user tag population because the local metadata source hasn't been updated since the last time user tags were checked ({lastPopulation.Value:d})");
+ // return;
+ // }
Logger.Log(@"Querying for beatmaps that do not have user tags");
@@ -843,7 +874,8 @@ private void backpopulateUserTags()
// if that turns out to be the case we may need a better way to debounce this (or just delete the backpopulation logic after some time has passed?)
HashSet<Guid> beatmapIds = realmAccess.Run(r => new HashSet<Guid>(
r.All<BeatmapInfo>()
- .Filter($"{nameof(BeatmapInfo.Metadata)}.{nameof(BeatmapMetadata.UserTags)}.@count == 0 AND {nameof(BeatmapInfo.StatusInt)} IN {{ 1,2,4 }}")
+ .Filter($"{nameof(BeatmapInfo.StatusInt)} IN {{ 1,2,4 }}")
+ // .Filter($"{nameof(BeatmapInfo.Metadata)}.{nameof(BeatmapMetadata.UserTags)}.@count == 0 AND {nameof(BeatmapInfo.StatusInt)} IN {{ 1,2,4 }}")
.AsEnumerable()
.Select(b => b.ID)));
timings copied from runtime.log which i added in this commit
This is not valid unfortunately...
Please take a look at the benchmark project and consider isolating in a reproducible way if you want to help. Else just wait for one of us to do it properly I guess.
oh shit, i think i used the wrong lazer versions, because i did more tests and the difference on HDD is significant compared to the one above. this time i shut the pc down for 5 sec and then ran the other version:
SSD MZ-V8P1T0BW - 7000Mb/5000Mb
| amount | 9057 | 43498 |
|---|---|---|
| before | 3475ms | 8371ms |
| after | 2915ms | 6475ms |
| speed up | 1.192x | 1.292x |
HDD WD20EZBX - 275Mb/s(?)
| amount | 9057 | 43437 |
|---|---|---|
| before | 40388ms | 49045ms |
| after | 15256ms | 23023ms |
| speed up | 2.647x | 2.130x |
log files:
> Please take a look at the benchmark project and consider isolating in a reproducible way if you want to help. Else just wait for one of us to do it properly I guess.
oooh, i was confused about what the benchmark project is. now i see, will try it soon
sorry, i've tried and didn't understand how to make them for BackgroundDataStoreProcessor
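A benchmark in the osu! benchmarks project is a BenchmarkDotNet class. A rough, hypothetical skeleton for isolating the write pattern in question; only the BenchmarkDotNet attributes and `realm.Write` are real APIs, while the setup, path, and workload are illustrative:

```csharp
// Hypothetical BenchmarkDotNet skeleton comparing batch sizes for realm
// writes. Run via BenchmarkRunner.Run<RealmWriteBenchmarks>(). The realm
// file path and the (empty) per-item mutation are placeholders.
using System;
using BenchmarkDotNet.Attributes;
using Realms;

[MemoryDiagnoser]
public class RealmWriteBenchmarks
{
    private Realm realm = null!;

    // Batch size is the variable under test: 1 approximates current
    // behaviour, 100/500 approximate this PR's bulk writes.
    [Params(1, 100, 500)]
    public int BatchSize;

    [GlobalSetup]
    public void Setup() => realm = Realm.GetInstance("benchmark.realm");

    [Benchmark]
    public void BatchedWrites()
    {
        const int total = 1000;

        for (int start = 0; start < total; start += BatchSize)
        {
            int end = Math.Min(start + BatchSize, total);

            realm.Write(() =>
            {
                for (int i = start; i < end; i++)
                {
                    // mutate a realm object here; omitted for brevity.
                }
            });
        }
    }

    [GlobalCleanup]
    public void Cleanup() => realm.Dispose();
}
```

The point of shaping it this way is that the transaction-per-batch cost is measured in isolation, reproducibly, rather than inferred from whole-game runs with reboots in between.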


- `BulkWrite` method to realm
- `TryLookup` method to reusable sqlite connection in `LocalCachedBeatmapMetadataSource`
- benchmark (updated 2025-01-20)
Tests done offline on HDD WD20EZBX - 275Mb/s(?)
Notes
- `backpopulateMissingSubmissionAndRankDates` method sending 9k sets to `RealmDetachedBeatmapStore` subscription
- `processOnlineBeatmapSetsWithNoUpdate` because it's making web requests to the server(?) and i'm not sure if i should touch this function.