
Unexpected frame stutters at regular intervals (GC related) #11800

Closed
pinembour opened this issue Feb 17, 2021 · 81 comments

@pinembour

Describe the bug:
A few times per map I encounter a stutter that makes the frame time spike for a single frame. The screenshot shows the issue in mania, but I encounter it in every game mode.
From what I saw, nothing in the logs seems to happen at the same time as the stutter.

I can reproduce this issue with stock or custom kernel, 5.10 or 5.11, amd gpu or nvidia, built from source or using appimage.

Screenshots or videos showing encountered issue:
Screenshot from 2021-02-17 09-38-32

osu!lazer version:
2021.212.0.r106.86faa7f465-1; reproducible with 2021.109.0-1 and everything in between; I did not try anything earlier.

Logs:

logs.zip

@peppy
Sponsor Member

peppy commented Feb 17, 2021

There's nothing to work with in this report, nor is this reproducible for us. We cannot solve per-user performance issues like this, sorry.

If you happen to come up with some kind of profiling or further information, please feel free to attach it to this issue thread and we will revisit it.

@peppy peppy closed this as completed Feb 17, 2021
@Micha-ohne-el
Contributor

Micha-ohne-el commented Feb 22, 2021

I am experiencing the exact same issue, and I have a few things to add to hopefully help solve this:

  • This is not only happening during gameplay, but also in menus and basically everywhere else. They do seem more common during gameplay (every 5–10 seconds), but I've had them happen in menus too (maybe every 30 seconds or so).
  • It happens no matter what I set the frame limiter to. I tried every one, including VSync; same result of same severity every time.
  • The stutters happen every couple of seconds, which leads me to believe that they are caused by some sort of cleanup task that runs on a schedule. The stutters don't seem to happen on a perfect schedule, but they tend to be evenly spaced out, if that makes sense.
    Also, the memory footprint seems to stay unaffected during these stutters, so maybe I'm wrong with the cleanup theory.
  • As the screenshot in the original post shows, the stutters are classified as “Work” by the game, in both the update and the draw routines, and they always spike both graphs at once, never just draw or just update.
  • The stutters also happen in both single- and multi-threaded mode, no difference there.

@pinembour, what DE are you using? I'm using KDE Plasma, maybe that could be a factor (though I doubt it)?
One more thing to note is that we are both using high-refresh-rate monitors. Mine runs at 120 Hz, yours at 144 Hz. I tried lowering my refresh rate to 60 Hz, but that didn't seem to improve things.

@peppy
Sponsor Member

peppy commented Feb 22, 2021

If you want to rule out garbage collection, you can try clicking "clear all caches" at the bottom of settings. This will force the harshest garbage collection level (which will almost never run). You can also see garbage collections as they show up as dots on the frame graph, but keep in mind they may not necessarily be the cause but the result (the runtime sees that nothing is happening so it runs a GC).

The spike will show on both graphs because draw is dependent on update. If there's no update run, there is nothing to draw on the screen so it will wait on the update to arrive. If it shows as "work" on both then it's guaranteed to be a draw-side issue (update blocking draw will show as sleep time on the draw thread). This 99.9999999999% rules out garbage collection, and points to a GPU, driver, or GL code issue.

We can't really work further without you performing profiling on your system to find what is causing the gpu pipeline blocking. What I can say is that we don't experience this on Windows or macOS, so it is likely driver or hardware specific.
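
For reference, the harshest collection level mentioned above maps to a forced, blocking, compacting full GC in .NET. A minimal sketch of that API, purely illustrative and not necessarily what the "clear all caches" button actually runs:

```csharp
using System;
using System.Runtime;

static class GcFlush
{
    // Illustrative only: request the most aggressive collection .NET exposes.
    public static void CollectEverything()
    {
        // Ask the next full collection to also compact the large object heap.
        GCSettings.LargeObjectHeapCompactionMode = GCLargeObjectHeapCompactionMode.CompactOnce;

        // Forced, blocking, compacting collection of every generation.
        GC.Collect(GC.MaxGeneration, GCCollectionMode.Forced, blocking: true, compacting: true);

        // Run pending finalizers, then collect whatever they released.
        GC.WaitForPendingFinalizers();
        GC.Collect(GC.MaxGeneration, GCCollectionMode.Forced, blocking: true, compacting: true);
    }
}
```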

@Micha-ohne-el
Contributor

Thank you for that crazy fast reply!

If you want to rule out garbage collection, you can try clicking "clear all caches" at the bottom of settings. This will force the harshest garbage collection level (which will almost never run). You can also see garbage collections as they show up as dots on the frame graph, but keep in mind they may not necessarily be the cause but the result (the runtime sees that nothing is happening so it runs a GC).

I did that, and just like you said, there are dots appearing on each of those spikes (red, yellow, and green ones if that matters).

The spike will show on both graphs because draw is dependent on update. If there's no update run, there is nothing to draw on the screen so it will wait on the update to arrive. If it shows as "work" on both then it's guaranteed to be a draw-side issue (update blocking draw will show as sleep time on the draw thread). This 99.9999999999% rules out garbage collection, and points to a GPU, driver, or GL code issue.

We can't really work further without you performing profiling on your system to find what is causing the gpu pipeline blocking. What I can say is that we don't experience this on Windows or macOS, so it is likely driver or hardware specific.

I used trace-cmd to do some profiling, but I don't know how to interpret the results properly. I used gpuvis to look at the resulting traces visually, and they do show a pattern, but I don't know what it means.
This is one stutter, for example:
2021-02-22_14-57-42
And here it is zoomed further in:
2021-02-22_14-58-41
I uploaded all three traces that I created, so hopefully you or anyone else can interpret them better: Profiling Traces.zip

Feel completely free to request any more info; I really want this to get figured out, as it's actually really annoying when playing.
And please forgive me if I did something wrong; it's my first time doing something like this.

@peppy
Sponsor Member

peppy commented Feb 22, 2021

Are those after manually pressing the button in settings? I may not have been clear: that button will cause a stutter (it does for me as well, for around 10 ms) because it's doing a lot of clean-up that usually would not occur. The trace output you provided may be useful if you can confirm it was captured during the actual stutters you experience during gameplay or otherwise (or provide a second set of trace output).

@Micha-ohne-el
Contributor

Ah, sorry. These are actual stutters recorded during normal gameplay; I didn't press that button at all during those traces.
Pressing the button causes a much larger stutter. Should I record a trace for that as well, so you can compare the two?

@pinembour
Author

pinembour commented Feb 22, 2021

I'm glad I'm not the only one facing this issue; we might be able to figure something out here. Every point you added, @Micha-ohne-el, applies to me as well. I can also add that the time between each stutter seems to depend on the difficulty. For example, on one song, an easy 4k map might show the stutters every 11 s or so, while a harder 7k map shows them every 8 s. (I can also see a smaller difference between an easy 4k and a harder 4k.)

@pinembour, what DE are you using? I'm using KDE Plasma, maybe that could be a factor (though I doubt it)?
One more thing to note is that we are both using high-refresh-rate monitors. Mine runs at 120 Hz, yours at 144 Hz. I tried lowering my refresh rate to 60 Hz, but that didn't seem to improve things.

I am using GNOME; I also tried with sway so I could rule out Mutter, with no improvement.

The spike will show on both graphs [...]

I'm not sure it matters, but the spike is actually visible on all 4 graphs

@Micha-ohne-el
Contributor

I can also add that the time between each stutter seems to depend on the difficulty. For example, on one song, an easy 4k map might show the stutters every 11 s or so, while a harder 7k map shows them every 8 s. (I can also see a smaller difference between an easy 4k and a harder 4k.)

I was thinking the same thing, but I thought I was imagining it (since the stutters don't always take the same amount of time, I thought it was just a coincidence that harder maps tended to have more severe stutters).
But yes, I can also confirm this: the more objects there are, the more frequent the stutters are (and possibly the longer they take).

I am using GNOME; I also tried with sway so I could rule out Mutter, with no improvement.

Yeah, seems as though the issue is within osu!.
As an aside, I also tried running the latest LTS kernel (5.4), but the issue persisted, leading me to believe that the kernel is not the issue.

I'm not sure it matters, but the spike is actually visible on all 4 graphs

Yes, for me the Audio and Input graphs show it in blue (meaning Sleep), which makes sense if they're also dependent on Update like Draw is.

@peppy
Sponsor Member

peppy commented Feb 22, 2021

Please do not test using mania. This is the only ruleset which has not been optimised for memory usage. Test using the main menu (since you say you can reproduce there) or in another ruleset.

@pinembour
Author

Okay, I'll keep that in mind, but I can still see that the difficulty correlates with the time between stutters in osu! or taiko.

@pinembour
Author

pinembour commented Feb 22, 2021

Yeah, seems as though the issue is within osu!.
As an aside, I also tried running the latest LTS kernel (5.4), but the issue persisted, leading me to believe that the kernel is not the issue.

I'm not sure the blame falls fully on osu!. I remember the game working fine not that long ago, but even a build from November has the same issue at the moment. Are you using Arch or a derivative? I'm using Arch, but when I tried a live USB of Fedora I couldn't reproduce the issue.

@Micha-ohne-el
Contributor

So I recorded two more stutters with very controlled conditions so that they can be reproduced:
I started the game, pressed the osu! cookie once, went to the top-bar-thingy to pause the music. Then I just sat there on the main menu, without doing anything, not even moving the mouse cursor. This resulted in this stutter:
2021-02-22_16-18-27
As you can see it is around 100 ms long. I had to wait quite a bit for it to happen (let's say 30 seconds).
One thing to note is that the Update graph showed a blue spike this time! This is also reflected in the trace by a continuous purple line. No clue why it did that this time, it never did this any other time.
After this, I went into the settings and clicked the Clear All Caches button to get this stutter:
2021-02-22_16-17-28
As you can see, it looks very different; for one it is much longer at nearly 650 ms, but the events happening are also very different from the one above, further suggesting that this has nothing to do with GC.
Both traces can be found here: More Stutters.zip

@Micha-ohne-el
Contributor

I'm not sure the blame falls fully on osu!. I remember the game working fine not that long ago, but even a build from November has the same issue at the moment.

Yeah, I remember that too, and I can also confirm the exact same thing: even older builds produce this behavior for me, despite me being reasonably sure that those builds worked fine for me before.

Are you using Arch or a derivative? I'm using Arch, but when I tried a live USB of Fedora I couldn't reproduce the issue.

Yes, I'm using Manjaro, which is Arch-based.
So it seems like some package update that Arch rolled out a week or two ago is causing this issue.
Do you have any idea as to what package this could be?

@pinembour
Author

It could be, but I have no idea what the specific package would be.

@peppy
Sponsor Member

peppy commented Feb 23, 2021

I'm willing to look into this and potentially point you in the right direction if you can record a trace in the standard dotnet format using dotnet-trace collect -p <pid> (you will need to install the dotnet sdk and use dotnet tools to install the trace tool).

@pinembour
Author

Here is the trace from my system, with the exact same procedure @Micha-ohne-el described earlier.
The stutter happened 45 seconds after launching the game and starting the trace.
trace.zip

@Micha-ohne-el
Contributor

I'm willing to look into this and potentially point you in the right direction if you can record a trace in the standard dotnet format using dotnet-trace collect -p <pid> (you will need to install the dotnet sdk and use dotnet tools to install the trace tool).

@peppy, thank you for considering this!
I recorded a trace with dotnet-trace.
I was just sitting in the main menu after first launching the game (and entering and exiting the settings), so it took a short while for a stutter to occur. I tried to stop the trace as soon as that happened.
I don't know how to visualize this sort of trace, but since you asked for it, I'm sure you'll know what to do with it: trace.nettrace.zip

If anything interesting happens related to this, or if I manage to fix it, I'll post my findings in here.

@peppy peppy self-assigned this Feb 24, 2021
@peppy
Sponsor Member

peppy commented Feb 24, 2021

Second trace (from #11800 (comment)):

20210224 152037 (Parallels Desktop app)

This is definitely looking to be a very deep GC execution. Not something we generally see happen, so it's curious as to why it's running. If possible, could you provide a longer trace where you stay at the main menu the whole time, but encounter two of these stutters? I'd like to cross-compare them to make sure it's the same thing each time.

Also, could you please specify your general memory stats (used / free / swap available)?

GC trigger reason:

Event Name Time MSec Process Name Rest
Microsoft-Windows-DotNETRuntime/GC/Start 24,480.796 Process(22337) (22337) ThreadID="22,388" ProcessorNumber="0" Count="15" Reason="AllocSmall" Depth="1" Type="NonConcurrentGC" ClrInstanceID="0" ClientSequenceNumber="0"

GC stats:

Event Name Time MSec Process Name Rest
Microsoft-Windows-DotNETRuntime/GC/Stop 24,539.920 Process(22337) (22337) ThreadID="22,388" ProcessorNumber="0" DURATION_MSEC="59.124" Count="15" Depth="1" ClrInstanceID="0"
Microsoft-Windows-DotNETRuntime/GC/HeapStats 24,539.921 Process(22337) (22337) ThreadID="22,388" ProcessorNumber="0" TotalHeapSize="247,511,584" TotalPromoted="3,958,320" Depth="1" GenerationSize0="21,699,360" TotalPromotedSize0="1,538,960" GenerationSize1="1,491,296" TotalPromotedSize1="2,419,360" GenerationSize2="150,392,352" TotalPromotedSize2="0" GenerationSize3="73,928,552" TotalPromotedSize3="0" FinalizationPromotedSize="237,335" FinalizationPromotedCount="5" PinnedObjectCount="3" SinkBlockCount="151" GCHandleCount="46,199" ClrInstanceID="0" GenerationSize4="24" TotalPromotedSize4="0"

Weirdly it looks to start as a gen1, which shouldn't take this long in the first place, but the full log shows it going deeper? Also note that the GC isn't even under much load...

Event Name Time MSec Process Name Rest
Microsoft-Windows-DotNETRuntime/GC/Start 24,480.796 Process(22337) (22337) ThreadID="22,388" ProcessorNumber="0" Count="15" Reason="AllocSmall" Depth="1" Type="NonConcurrentGC" ClrInstanceID="0" ClientSequenceNumber="0"
Microsoft-Windows-DotNETRuntime/GC/GenerationRange 24,480.799 Process(22337) (22337) ThreadID="22,388" ProcessorNumber="0" Generation="4" RangeStart="0x7f5597fff000" RangeUsedLength="0x00000018" RangeReservedLength="0x07fff000" ClrInstanceID="0"
Microsoft-Windows-DotNETRuntime/GC/GenerationRange 24,480.801 Process(22337) (22337) ThreadID="22,388" ProcessorNumber="0" Generation="3" RangeStart="0x7f558ffff000" RangeUsedLength="0x04680f68" RangeReservedLength="0x07fff000" ClrInstanceID="0"
Microsoft-Windows-DotNETRuntime/GC/GenerationRange 24,480.802 Process(22337) (22337) ThreadID="22,388" ProcessorNumber="0" Generation="2" RangeStart="0x7f557ffff000" RangeUsedLength="0x08f6ce20" RangeReservedLength="0x08f6ce20" ClrInstanceID="0"
Microsoft-Windows-DotNETRuntime/GC/GenerationRange 24,480.803 Process(22337) (22337) ThreadID="22,388" ProcessorNumber="0" Generation="1" RangeStart="0x7f547ffff000" RangeUsedLength="0x0137c2e8" RangeReservedLength="0x0137c2e8" ClrInstanceID="0"
Microsoft-Windows-DotNETRuntime/GC/GenerationRange 24,480.805 Process(22337) (22337) ThreadID="22,388" ProcessorNumber="0" Generation="0" RangeStart="0x7f548137b2e8" RangeUsedLength="0x0517f950" RangeReservedLength="0x0ec82d18" ClrInstanceID="0"
Microsoft-Windows-DotNETRuntime/GC/MarkWithType 24,481.802 Process(22337) (22337) ThreadID="22,388" ProcessorNumber="0" HeapNum="0" ClrInstanceID="0" Type="0" Promoted="280"
Microsoft-Windows-DotNETRuntime/GC/MarkWithType 24,481.804 Process(22337) (22337) ThreadID="22,388" ProcessorNumber="0" HeapNum="0" ClrInstanceID="0" Type="1" Promoted="0"
Microsoft-Windows-DotNETRuntime/GC/MarkWithType 24,481.846 Process(22337) (22337) ThreadID="22,388" ProcessorNumber="0" HeapNum="0" ClrInstanceID="0" Type="2" Promoted="60,096"
Microsoft-Windows-DotNETRuntime/GC/MarkWithType 24,486.546 Process(22337) (22337) ThreadID="22,388" ProcessorNumber="0" HeapNum="0" ClrInstanceID="0" Type="3" Promoted="3,656,106"
Microsoft-Windows-DotNETRuntime/GC/BulkMovedObjectRanges 24,535.396 Process(22337) (22337) ThreadID="22,388" ProcessorNumber="0" Index="0" Count="2,444" ClrInstanceID="0"
Microsoft-Windows-DotNETRuntime/GC/GlobalHeapHistory 24,539.910 Process(22337) (22337) HasStack="True" ThreadID="22,388" ProcessorNumber="0" FinalYoungestDesired="83,886,080" NumHeaps="1" CondemnedGeneration="1" Gen0ReductionCount="0" Reason="AllocSmall" GlobalMechanisms="Compaction, Promotion, Demotion, CardBundles" ClrInstanceID="0" PauseMode="SustainedLowLatency" MemoryPressure="13" CondemnReasons0="0" CondemnReasons1="0"
Microsoft-Windows-DotNETRuntime/GC/PerHeapHistory 24,539.913 Process(22337) (22337) ThreadID="22,388" ProcessorNumber="0" ClrInstanceID="0" FreeListAllocated="2,419,360" FreeListRejected="5,616" EndOfSegAllocated="0" CondemnedAllocated="0" PinnedAllocated="0" PinnedAllocatedAdvance="0" RunningFreeListEfficiency="94" CondemnReasons0="4" CondemnReasons1="48" CompactMechanisms="compact_low_ephemeral" ExpandMechanisms="not_specified" ConcurrentCompactMechanisms="" HeapIndex="0" ExtraGen0Commit="84,497,280" Count="5" MemoryPressure=""
Microsoft-Windows-DotNETRuntime/GC/GenerationRange 24,539.918 Process(22337) (22337) ThreadID="22,388" ProcessorNumber="0" Generation="4" RangeStart="0x7f5597fff000" RangeUsedLength="0x00000018" RangeReservedLength="0x07fff000" ClrInstanceID="0"
Microsoft-Windows-DotNETRuntime/GC/GenerationRange 24,539.918 Process(22337) (22337) ThreadID="22,388" ProcessorNumber="0" Generation="3" RangeStart="0x7f558ffff000" RangeUsedLength="0x04680f68" RangeReservedLength="0x07fff000" ClrInstanceID="0"
Microsoft-Windows-DotNETRuntime/GC/GenerationRange 24,539.919 Process(22337) (22337) ThreadID="22,388" ProcessorNumber="0" Generation="2" RangeStart="0x7f557ffff000" RangeUsedLength="0x08f6ce20" RangeReservedLength="0x08f6ce20" ClrInstanceID="0"
Microsoft-Windows-DotNETRuntime/GC/GenerationRange 24,539.919 Process(22337) (22337) ThreadID="22,388" ProcessorNumber="0" Generation="1" RangeStart="0x7f547ffff000" RangeUsedLength="0x0016c160" RangeReservedLength="0x0016c160" ClrInstanceID="0"
Microsoft-Windows-DotNETRuntime/GC/GenerationRange 24,539.920 Process(22337) (22337) ThreadID="22,388" ProcessorNumber="0" Generation="0" RangeStart="0x7f548016b160" RangeUsedLength="0x014b1b20" RangeReservedLength="0x0fe92ea0" ClrInstanceID="0"
Microsoft-Windows-DotNETRuntime/GC/Stop 24,539.920 Process(22337) (22337) ThreadID="22,388" ProcessorNumber="0" DURATION_MSEC="59.124" Count="15" Depth="1" ClrInstanceID="0"
Microsoft-Windows-DotNETRuntime/GC/HeapStats 24,539.921 Process(22337) (22337) ThreadID="22,388" ProcessorNumber="0" TotalHeapSize="247,511,584" TotalPromoted="3,958,320" Depth="1" GenerationSize0="21,699,360" TotalPromotedSize0="1,538,960" GenerationSize1="1,491,296" TotalPromotedSize1="2,419,360" GenerationSize2="150,392,352" TotalPromotedSize2="0" GenerationSize3="73,928,552" TotalPromotedSize3="0" FinalizationPromotedSize="237,335" FinalizationPromotedCount="5" PinnedObjectCount="3" SinkBlockCount="151" GCHandleCount="46,199" ClrInstanceID="0" GenerationSize4="24" TotalPromotedSize4="0"
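
For reference, the PauseMode="SustainedLowLatency" field in the trace above corresponds to the GC latency mode the process is running under. A minimal sketch of how a .NET application reads and selects that mode, shown only for illustration and not taken from osu!'s code:

```csharp
using System;
using System.Runtime;

class LatencyModeDemo
{
    static void Main()
    {
        // A typical app defaults to Interactive; the trace above reports
        // SustainedLowLatency, which trades throughput for shorter pauses.
        Console.WriteLine($"Current latency mode: {GCSettings.LatencyMode}");

        // Favour short pauses: blocking gen-2 collections are avoided where
        // possible while this mode is active.
        GCSettings.LatencyMode = GCLatencyMode.SustainedLowLatency;
        Console.WriteLine($"New latency mode: {GCSettings.LatencyMode}");
    }
}
```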

@peppy
Sponsor Member

peppy commented Feb 24, 2021

First trace (from #11800 (comment))

20210224 154643 (Parallels Desktop app)

GC trigger reason:

Event Name Time MSec Process Name Rest
Microsoft-Windows-DotNETRuntime/GC/Start 45,088.094 Process(281057) (281057) ThreadID="281,139" ProcessorNumber="0" Count="14" Reason="AllocSmall" Depth="1" Type="NonConcurrentGC" ClrInstanceID="0" ClientSequenceNumber="0"

GC stats:

Event Name Time MSec Process Name Rest
Microsoft-Windows-DotNETRuntime/GC/Stop 45,129.702 Process(281057) (281057) ThreadID="281,139" ProcessorNumber="0" DURATION_MSEC="41.608" Count="14" Depth="1" ClrInstanceID="0"
Microsoft-Windows-DotNETRuntime/GC/HeapStats 45,129.703 Process(281057) (281057) ThreadID="281,139" ProcessorNumber="0" TotalHeapSize="192,293,344" TotalPromoted="9,875,232" Depth="1" GenerationSize0="24" TotalPromotedSize0="440,016" GenerationSize1="442,440" TotalPromotedSize1="9,435,216" GenerationSize2="117,489,816" TotalPromotedSize2="0" GenerationSize3="74,361,040" TotalPromotedSize3="0" FinalizationPromotedSize="243,375" FinalizationPromotedCount="11" PinnedObjectCount="0" SinkBlockCount="121" GCHandleCount="44,053" ClrInstanceID="0" GenerationSize4="24" TotalPromotedSize4="0"

This one is shorter, at 42ms, but that's still longer than we can live with. Very weird indeed.

Event Name Time MSec Process Name Rest
Microsoft-Windows-DotNETRuntime/GC/RestartEEStart 45,086.027 Process(281057) (281057) ThreadID="281,095" ProcessorNumber="0" ClrInstanceID="0"
Microsoft-Windows-DotNETRuntime/GC/RestartEEStop 45,086.032 Process(281057) (281057) ThreadID="281,095" ProcessorNumber="0" ClrInstanceID="0"
Microsoft-Windows-DotNETRuntime/GC/SuspendEEStart 45,087.091 Process(281057) (281057) ThreadID="281,095" ProcessorNumber="0" Reason="SuspendOther" Count="-1" ClrInstanceID="0"
Microsoft-Windows-DotNETRuntime/GC/SuspendEEStop 45,087.099 Process(281057) (281057) ThreadID="281,095" ProcessorNumber="0" DURATION_MSEC="0.008" ClrInstanceID="0"
Microsoft-Windows-DotNETRuntime/GC/RestartEEStart 45,087.229 Process(281057) (281057) ThreadID="281,095" ProcessorNumber="0" ClrInstanceID="0"
Microsoft-Windows-DotNETRuntime/GC/RestartEEStop 45,087.235 Process(281057) (281057) ThreadID="281,095" ProcessorNumber="0" DURATION_MSEC="0.144" ClrInstanceID="0"
Microsoft-Windows-DotNETRuntime/GC/SuspendEEStart 45,088.023 Process(281057) (281057) ThreadID="281,139" ProcessorNumber="0" Reason="SuspendForGC" Count="13" ClrInstanceID="0"
Microsoft-Windows-DotNETRuntime/GC/SuspendEEStop 45,088.029 Process(281057) (281057) ThreadID="281,139" ProcessorNumber="0" DURATION_MSEC="0.006" ClrInstanceID="0"
Microsoft-Windows-DotNETRuntime/GC/Triggered 45,088.048 Process(281057) (281057) HasStack="True" ThreadID="281,139" ProcessorNumber="0" Reason="AllocSmall" ClrInstanceID="0"
Microsoft-Windows-DotNETRuntime/GC/Start 45,088.094 Process(281057) (281057) ThreadID="281,139" ProcessorNumber="0" Count="14" Reason="AllocSmall" Depth="1" Type="NonConcurrentGC" ClrInstanceID="0" ClientSequenceNumber="0"
Microsoft-Windows-DotNETRuntime/GC/GenerationRange 45,088.095 Process(281057) (281057) ThreadID="281,139" ProcessorNumber="0" Generation="4" RangeStart="0x7f318ffff000" RangeUsedLength="0x00000018" RangeReservedLength="0x07fff000" ClrInstanceID="0"
Microsoft-Windows-DotNETRuntime/GC/GenerationRange 45,088.095 Process(281057) (281057) ThreadID="281,139" ProcessorNumber="0" Generation="3" RangeStart="0x7f3187fff000" RangeUsedLength="0x046ea8d0" RangeReservedLength="0x07fff000" ClrInstanceID="0"
Microsoft-Windows-DotNETRuntime/GC/GenerationRange 45,088.095 Process(281057) (281057) ThreadID="281,139" ProcessorNumber="0" Generation="2" RangeStart="0x7f3177fff000" RangeUsedLength="0x0700c098" RangeReservedLength="0x0700c098" ClrInstanceID="0"
Microsoft-Windows-DotNETRuntime/GC/GenerationRange 45,088.096 Process(281057) (281057) ThreadID="281,139" ProcessorNumber="0" Generation="1" RangeStart="0x7f3023fff000" RangeUsedLength="0x01a32748" RangeReservedLength="0x01a32748" ClrInstanceID="0"
Microsoft-Windows-DotNETRuntime/GC/GenerationRange 45,088.096 Process(281057) (281057) ThreadID="281,139" ProcessorNumber="0" Generation="0" RangeStart="0x7f3025a31748" RangeUsedLength="0x050000b0" RangeReservedLength="0x0e5cc8b8" ClrInstanceID="0"
Microsoft-Windows-DotNETRuntime/GC/MarkWithType 45,088.303 Process(281057) (281057) ThreadID="281,139" ProcessorNumber="0" HeapNum="0" ClrInstanceID="0" Type="0" Promoted="0"
Microsoft-Windows-DotNETRuntime/GC/MarkWithType 45,088.303 Process(281057) (281057) ThreadID="281,139" ProcessorNumber="0" HeapNum="0" ClrInstanceID="0" Type="1" Promoted="0"
Microsoft-Windows-DotNETRuntime/GC/MarkWithType 45,088.400 Process(281057) (281057) ThreadID="281,139" ProcessorNumber="0" HeapNum="0" ClrInstanceID="0" Type="2" Promoted="122,160"
Microsoft-Windows-DotNETRuntime/GC/MarkWithType 45,095.341 Process(281057) (281057) ThreadID="281,139" ProcessorNumber="0" HeapNum="0" ClrInstanceID="0" Type="3" Promoted="9,482,610"
Microsoft-Windows-DotNETRuntime/GC/BulkMovedObjectRanges 45,120.223 Process(281057) (281057) ThreadID="281,139" ProcessorNumber="0" Index="0" Count="2,677" ClrInstanceID="0"
Microsoft-Windows-DotNETRuntime/GC/BulkMovedObjectRanges 45,120.610 Process(281057) (281057) ThreadID="281,139" ProcessorNumber="0" Index="1" Count="2,677" ClrInstanceID="0"
Microsoft-Windows-DotNETRuntime/GC/BulkMovedObjectRanges 45,120.869 Process(281057) (281057) ThreadID="281,139" ProcessorNumber="0" Index="2" Count="2,677" ClrInstanceID="0"
Microsoft-Windows-DotNETRuntime/GC/BulkMovedObjectRanges 45,120.984 Process(281057) (281057) ThreadID="281,139" ProcessorNumber="0" Index="3" Count="976" ClrInstanceID="0"
Microsoft-Windows-DotNETRuntime/GC/GlobalHeapHistory 45,129.693 Process(281057) (281057) HasStack="True" ThreadID="281,139" ProcessorNumber="0" FinalYoungestDesired="83,886,080" NumHeaps="1" CondemnedGeneration="1" Gen0ReductionCount="0" Reason="AllocSmall" GlobalMechanisms="Compaction, Promotion, CardBundles" ClrInstanceID="0" PauseMode="SustainedLowLatency" MemoryPressure="25" CondemnReasons0="0" CondemnReasons1="0"
Microsoft-Windows-DotNETRuntime/GC/PerHeapHistory 45,129.696 Process(281057) (281057) ThreadID="281,139" ProcessorNumber="0" ClrInstanceID="0" FreeListAllocated="9,435,216" FreeListRejected="0" EndOfSegAllocated="0" CondemnedAllocated="0" PinnedAllocated="0" PinnedAllocatedAdvance="0" RunningFreeListEfficiency="99" CondemnReasons0="4" CondemnReasons1="16" CompactMechanisms="compact_low_ephemeral" ExpandMechanisms="not_specified" ConcurrentCompactMechanisms="" HeapIndex="0" ExtraGen0Commit="108,285,856" Count="5" MemoryPressure=""
Microsoft-Windows-DotNETRuntime/GC/GenerationRange 45,129.700 Process(281057) (281057) ThreadID="281,139" ProcessorNumber="0" Generation="4" RangeStart="0x7f318ffff000" RangeUsedLength="0x00000018" RangeReservedLength="0x07fff000" ClrInstanceID="0"
Microsoft-Windows-DotNETRuntime/GC/GenerationRange 45,129.700 Process(281057) (281057) ThreadID="281,139" ProcessorNumber="0" Generation="3" RangeStart="0x7f3187fff000" RangeUsedLength="0x046ea8d0" RangeReservedLength="0x07fff000" ClrInstanceID="0"
Microsoft-Windows-DotNETRuntime/GC/GenerationRange 45,129.701 Process(281057) (281057) ThreadID="281,139" ProcessorNumber="0" Generation="2" RangeStart="0x7f3177fff000" RangeUsedLength="0x0700c098" RangeReservedLength="0x0700c098" ClrInstanceID="0"
Microsoft-Windows-DotNETRuntime/GC/GenerationRange 45,129.701 Process(281057) (281057) ThreadID="281,139" ProcessorNumber="0" Generation="1" RangeStart="0x7f3023fff000" RangeUsedLength="0x0006c048" RangeReservedLength="0x0006c048" ClrInstanceID="0"
Microsoft-Windows-DotNETRuntime/GC/GenerationRange 45,129.702 Process(281057) (281057) ThreadID="281,139" ProcessorNumber="0" Generation="0" RangeStart="0x7f302406b048" RangeUsedLength="0x00000018" RangeReservedLength="0x0ff92fb8" ClrInstanceID="0"
Microsoft-Windows-DotNETRuntime/GC/Stop 45,129.702 Process(281057) (281057) ThreadID="281,139" ProcessorNumber="0" DURATION_MSEC="41.608" Count="14" Depth="1" ClrInstanceID="0"
Microsoft-Windows-DotNETRuntime/GC/HeapStats 45,129.703 Process(281057) (281057) ThreadID="281,139" ProcessorNumber="0" TotalHeapSize="192,293,344" TotalPromoted="9,875,232" Depth="1" GenerationSize0="24" TotalPromotedSize0="440,016" GenerationSize1="442,440" TotalPromotedSize1="9,435,216" GenerationSize2="117,489,816" TotalPromotedSize2="0" GenerationSize3="74,361,040" TotalPromotedSize3="0" FinalizationPromotedSize="243,375" FinalizationPromotedCount="11" PinnedObjectCount="0" SinkBlockCount="121" GCHandleCount="44,053" ClrInstanceID="0" GenerationSize4="24" TotalPromotedSize4="0"
Microsoft-Windows-DotNETRuntime/GC/RestartEEStart 45,129.705 Process(281057) (281057) ThreadID="281,139" ProcessorNumber="0" ClrInstanceID="0"
Microsoft-Windows-DotNETRuntime/GC/RestartEEStop 45,129.729 Process(281057) (281057) ThreadID="281,139" ProcessorNumber="0" DURATION_MSEC="41.707" ClrInstanceID="0"
Microsoft-Windows-DotNETRuntime/GC/FinalizersStart 45,129.880 Process(281057) (281057) ThreadID="281,096" ProcessorNumber="0" ClrInstanceID="0"
Microsoft-Windows-DotNETRuntime/GC/FinalizersStop 45,129.918 Process(281057) (281057) ThreadID="281,096" ProcessorNumber="0" Count="11" ClrInstanceID="0"
Microsoft-Windows-DotNETRuntime/GC/FinalizersStart 45,129.920 Process(281057) (281057) ThreadID="281,096" ProcessorNumber="0" ClrInstanceID="0"
Microsoft-Windows-DotNETRuntime/GC/FinalizersStop 45,129.921 Process(281057) (281057) ThreadID="281,096" ProcessorNumber="0" Count="0" ClrInstanceID="0"

@peppy
Sponsor Member

peppy commented Feb 24, 2021

Going to reopen to track this as it looks like an actual issue.

The real question here is that you both say it worked fine at one point in time, which would point to a change in the way we are allocating memory or in the underlying runtime. We recently switched to .NET 5, which could definitely have introduced some issues like this, but since you have already tried older builds (which should have the .NET runtime bundled), I'm not so sure...

Can you confirm that when you are checking old builds, you are using the AppImage, and not self-compiling or otherwise?

@peppy peppy reopened this Feb 24, 2021
@pinembour
Author

Thank you for reopening and taking a closer look at it.
I can confirm that I tried both compiling and the AppImage, on a recent and an old build.

I'll send a trace with multiple stutters in a few minutes.

@peppy
Sponsor Member

peppy commented Feb 24, 2021

The important part is that when you are testing on the old build, you are using the AppImage, or explicitly compiling with the older .NET runtime (3.x should do fine). Conversely, ensure you are building using .NET 5 when building the newer release (although I believe this is a requirement these days, so should be enforced at compile time).

@Micha-ohne-el
Contributor

Can you confirm that when you are checking old builds, you are using the AppImage, and not self-compiling or otherwise?

Yes, I was using the AppImage, both with the most recent build and with the one from November last year. This is why I still think the issue is with Arch somehow. I actually reinstalled my entire system yesterday (for a different reason) and the issue still persists. Maybe I should install a non-Arch-based distro and see if the issue persists? Definitely very weird.

I'm currently on mobile, but I'll provide more traces and memory stats when I get on my PC.

@pinembour
Author

Here are two new traces: trace-new with a current version of the AppImage, trace-old with a version from November (2020.1114.0), still using an AppImage. I definitely did not encounter the issue last November, but I do now.
trace-new.zip Stutters at 40s, 1m09, and 1m40
trace-old.zip Stutters at 38s, 1m06, and 1m40

Both traces were recorded with dotnet 5.0.3.sdk103

@peppy
Sponsor Member

peppy commented Feb 24, 2021

Just for confirmation, can you show how you are running the AppImage? The .NET runtime should be bundled inside, so I want to confirm you aren't somehow forcing the old version to run on .NET 5 (which is your system-installed runtime/SDK version), as this is feasible and could mean a lot in figuring out where this regressed.

@peppy
Sponsor Member

peppy commented Mar 30, 2021

We've made some pretty large improvements on this front. Things are still not perfect, but I'm going to remove this from the milestone for the time being. We still have some ongoing tasks which will further improve the situation.

Please feel free to report back with further profiling in subsequent builds.

@peppy peppy removed this from the March 2021 milestone Mar 30, 2021
@peppy peppy changed the title Unexpected frame stutters at regular intervals Unexpected frame stutters at regular intervals (GC related) Mar 30, 2021
@smoogipoo
Contributor

Just posting back for future reference...

https://osu.ppy.sh/beatmapsets/519505#osu/1906276 played with osu!mania is a good way to see the stutters in full effect:
image

@peppy peppy removed their assignment Apr 12, 2021
@Morodar

Morodar commented Jul 12, 2021

I experience some lag spikes while playing beatmaps. I hadn't noticed these lag spikes for quite a while.

  • Lag spikes appear on any beatmap
  • Lag spikes appear on any frame limiter (VSync, 2x refresh rate & Unlimited)

It also appears to happen more often when I limit the game to 160 FPS via the NVIDIA Control Panel.

Screenshots or videos showing encountered issue:

YouTube
Lags occur at

  • 00:09 (Logs ~ 15:40:02)
  • 00:25 (Logs ~ 15:40:17)

osu!lazer version:
2021.707.0-lazer

Logs:
Beatmap starts at 15:39:53 and ends at 15:40:31
performance-update.log
runtime.log
updater.log
database.log
network.log
performance.log
performance-draw.log
performance-input.log
performance-audio.log

@smoogipoo
Contributor

Hard to say - nothing's changed in the last little while that would make them more prominent, but these spikes have been occurring for a long time now. It would be better to show the frame time by pressing Ctrl+F11 once more to get the graphs to show.

You could also try running osu! with COMPlus_GCGen0MaxBudget=600000 set as an environment variable.
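
GC configuration variables like this are read once when the runtime starts, so they have to be present in osu!'s environment before launch. A hypothetical launcher sketch showing that (the AppImage path is made up for illustration):

```csharp
using System.Diagnostics;

class OsuLauncher
{
    static void Main()
    {
        // Hypothetical AppImage path; adjust for your system.
        var psi = new ProcessStartInfo("/home/user/Apps/osu.AppImage")
        {
            UseShellExecute = false,
        };

        // Cap the gen0 allocation budget (in bytes). The intent is more
        // frequent but shorter gen0 collections.
        psi.Environment["COMPlus_GCGen0MaxBudget"] = "600000";

        Process.Start(psi)?.WaitForExit();
    }
}
```

Setting the same variable in the shell before launching the game achieves the same thing.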

@Morodar

Morodar commented Jul 18, 2021


Over the following days I was not able to reproduce it. It's so weird.

Probably some driver issues?
I'll try to report it whenever it occurs again (providing logs and Ctrl+F11).

For the sake of completeness, here are two recordings using the same PC, same osu! version, etc.
Those recordings do not show the lag spikes.
I'm pretty sure they don't help you. :/

At least I can tell you that osu! runs at 600-700 FPS on my machine (in multi-threaded mode and when I'm not recording) 😄

Single threaded video

YouTube
performance-input.log
performance-update.log
runtime.log
network.log
performance.log
performance-audio.log
performance-draw.log

Multi threaded video

YouTube
network.log
performance.log
performance-audio.log
performance-draw.log
performance-input.log
performance-update.log
runtime.log

@smoogipoo
Contributor

You don't need to keep reporting back. We know with absolutely 100% confidence that there is an issue, even if you're not experiencing it.

@hez2010

hez2010 commented Oct 6, 2021

Does this help? This PR fixes an issue which was causing periodic long gen 1 GCs; to verify whether it helps, we need to use either a nightly build or the 6.0 RTM for testing.

@smoogipoo
Contributor

Improvement looks to be marginal at best. Left is net6-rc1, right is compiled from master:
image

As I mentioned in the thread, there seems to be a difference in how the GC tunes its gen1 heap size on Linux vs Windows, resulting in it being too large on Linux and collections being triggered less frequently than we'd like. That's the primary issue here.

The .NET team have yet to respond, unfortunately...
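
One low-effort way to compare the two platforms is to log collection counts and heap size from inside a .NET process over time; a rough sketch under that assumption, not tied to osu!'s codebase:

```csharp
using System;
using System.Threading;

class GcMonitor
{
    static void Main()
    {
        // Sample once per second so the gen0/gen1 collection cadence can be
        // compared between a Linux and a Windows run of the same workload.
        while (true)
        {
            var info = GC.GetGCMemoryInfo();
            Console.WriteLine(
                $"gen0={GC.CollectionCount(0)} gen1={GC.CollectionCount(1)} " +
                $"gen2={GC.CollectionCount(2)} heap={info.HeapSizeBytes / (1024 * 1024)} MB " +
                $"fragmented={info.FragmentedBytes / (1024 * 1024)} MB");
            Thread.Sleep(1000);
        }
    }
}
```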

@hez2010

hez2010 commented Oct 8, 2021

How about the new standalone region-based GC? (It can be built from source and consumed by setting DOTNET_GCName to the path of the built clrgc.dll.) It's not finished yet, but worth a try in my opinion.

@smoogipoo
Contributor

Regions currently segfaults during gameplay. I also posted a repro for that in the comment I linked.

@peppy peppy pinned this issue Jun 13, 2022
@smoogipoo
Contributor

An update on regions: they've since fixed the segfaults, but it doesn't look like regions are helping, unfortunately. I still have to do my own investigation, because it seems like Linux and Windows allocate different gen0 budgets, which is probably a bug that's somehow carried over to regions as well.

@DanielPower
Contributor

DanielPower commented Aug 7, 2022

As a workaround until this is resolved, is there some way to prevent garbage collection during gameplay (provided the user has enough memory)? It would be preferable to have a massive garbage collection spike when the song is over, rather than periodic spikes throughout the song. I'm experiencing this issue on both Linux and Windows, which makes playing on Lazer infeasible currently.

Also, I noticed that on June 13th peppy added the "platform:linux" tag to this issue, though multiple users have reported this occurring on Windows, including #13969, which was closed as a duplicate of this one. So it seems this is not platform-specific.
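
For reference, .NET does expose a mechanism along the lines of the workaround asked about above: a "no GC region" that suspends collections as long as allocations stay within a pre-committed budget. A sketch of that API, purely illustrative and not something osu! currently does:

```csharp
using System;
using System.Runtime;

class NoGcDuringPlay
{
    // Illustrative only: suppress collections for the duration of a map,
    // provided allocations stay within the reserved budget.
    static void PlayMap(Action runGameplay)
    {
        // Reserve a 200 MB allocation budget up front; returns false if the
        // runtime cannot commit that much memory.
        bool noGc = GC.TryStartNoGCRegion(200L * 1024 * 1024);
        try
        {
            runGameplay();
        }
        finally
        {
            // Only end the region if it is still in effect; the runtime exits
            // it on its own if the budget was exceeded mid-way.
            if (noGc && GCSettings.LatencyMode == GCLatencyMode.NoGCRegion)
                GC.EndNoGCRegion();
        }
    }
}
```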

@peppy
Sponsor Member

peppy commented Aug 7, 2022

If you read the issue you linked, you will notice that I say it is in no way related to this one. It is also related to VRAM being exceeded, as the user points out.

There are no required workarounds for this issue on windows because it doesn't occur. If you are having performance issues, please open a new discussion thread. Make sure to test using the default skin and settings before doing so, as we have found some skins can cause issues due to excessively large resources.

@DanielPower
Contributor

I'm not sure exactly which release resolved the issue, but I'm happy to report that at some point between 2022.1101 and 2022.1117, all stuttering issues were resolved for me on Linux.

@peppy
Sponsor Member

peppy commented Nov 23, 2022

Great to hear! Most likely the latest release.

@peppy
Sponsor Member

peppy commented Nov 23, 2022

Going to tentatively close this for the time being.

@peppy peppy closed this as completed Nov 23, 2022
@NimaQu

NimaQu commented Jan 3, 2023

Seems I have the same problem on Windows. Screenshot:
image
logs:
performance.log

version: 2022.1228.0

@peppy
Sponsor Member

peppy commented Jan 3, 2023

That's not GC. A swap buffer spike can be caused by anything. Definitely not this issue. Please wait for future performance updates.

@ppy ppy locked and limited conversation to collaborators Jan 3, 2023
@frenzibyte frenzibyte unpinned this issue Mar 8, 2023