Game unpredictably hangs (macrostutters) on complex maps #13969

Closed
SonoSooS opened this issue Jul 21, 2021 · 12 comments

@SonoSooS

Describe the bug:
A macrostutter is like a microstutter, except that it lasts anywhere from around 100ms up to two seconds, with the average stutter being around 500-800ms long.
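
To make the definition above concrete, here is a minimal sketch of how macrostutters could be picked out of a list of frame timestamps. The 100ms threshold comes from the description above; the function name and the idea of feeding it recorded frame times are assumptions for illustration, not something the game provides.

```python
from typing import List, Tuple

# Lower bound for a macrostutter in milliseconds, taken from the description above.
MACRO_THRESHOLD_MS = 100.0


def find_macrostutters(
    frame_times_ms: List[float],
    threshold_ms: float = MACRO_THRESHOLD_MS,
) -> List[Tuple[float, float]]:
    """Return (time_of_last_good_frame, gap_length) for every gap between
    consecutive frames that is at least threshold_ms long."""
    stalls = []
    for prev, cur in zip(frame_times_ms, frame_times_ms[1:]):
        gap = cur - prev
        if gap >= threshold_ms:
            stalls.append((prev, gap))
    return stalls
```

Called with timestamps in milliseconds, it returns one entry per freeze, which makes it easier to count stutters per map and compare their lengths between runs.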

This only happens during play: the more hit objects arrive per second, the more likely the game is to macrostutter.
On simple maps with only a few hit objects per second, this issue still happens, but it is very difficult to reproduce. On some maps it's impossible to play without at least 3 macrostutters over the course of the map, sometimes predictable down to a few seconds.

There are three types of this macrostutter I observed:

  • the most common is where the music keeps playing, and the game hangs for anywhere from a tenth of a second up to a whole second, and I usually lose because the map keeps playing in the background while the screen is frozen
  • sometimes the game freezes with part of the audio missing, but most often (rare as this type of macrostutter is) I hear a sparkly sound effect looping (most likely the slider slide sound effect) with no music, and after unfreezing the beatmap has only skipped forward a little bit, no matter how long the freeze lasted
  • I only managed to get this 2-3 times out of the many, many macrostutters, but sometimes the audio cuts off completely, and when the freeze ends, the beatmap continues exactly where it froze

This seems to be moderately to strongly related to how many hit objects the map spawns on average each second.
On simple maps, or on maps with no streams and no spam sliders (small sliders repeating very fast at a high frequency), it's almost impossible to reproduce this issue; it usually happens only once every 10-20 minutes when replaying the same map over and over again.

On complex maps however, and/or if a lot of hit objects start coming (circle streams, especially with sliders between), it's impossible to play the given maps without at least one or two macrostutters. I get a macrostutter 100% of the time, no matter what I try: opening or closing programs, changing settings, or playing different maps with similar hit object spawning characteristics.

I also tried reproducing this issue on a few versions of the proprietary client, and besides the usual microstutters, which rarely cause me to choke, I could reproduce zero macrostutters during weeks of testing on many different maps.

There is however one thing I noticed: I get slightly fewer macrostutters on the dGPU, but I still get some. On the iGPU it's unbearable.

I also found a possible correlation with the log while testing on the dGPU, and that's this line:

2021-07-21 22:50:44 [verbose]: TextureAtlas size exceeded 5 time(s); generating new texture (1024x1024)

I'm not sure whether there are other causes for the macrostutters (considering the different types of results I get), but this seems to be at least one of the causes.
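
To check that correlation more systematically, the log could be scanned for these lines and their timestamps compared against the moments the game froze. The following is only a sketch: the regular expression is modelled on the single log line quoted above, and the function names are made up for illustration.

```python
import re
from datetime import datetime
from typing import Iterable, List, Optional, Tuple

# Pattern modelled on the one log line quoted in this report.
ATLAS_LINE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) \[verbose\]: "
    r"TextureAtlas size exceeded (?P<count>\d+) time\(s\); "
    r"generating new texture \((?P<w>\d+)x(?P<h>\d+)\)"
)

AtlasEvent = Tuple[datetime, int, Tuple[int, int]]


def parse_atlas_event(line: str) -> Optional[AtlasEvent]:
    """Return (timestamp, exceed_count, (width, height)) or None if the line does not match."""
    m = ATLAS_LINE.match(line.strip())
    if m is None:
        return None
    ts = datetime.strptime(m.group("ts"), "%Y-%m-%d %H:%M:%S")
    return ts, int(m.group("count")), (int(m.group("w")), int(m.group("h")))


def find_atlas_events(lines: Iterable[str]) -> List[AtlasEvent]:
    """Collect every texture atlas regeneration event from an iterable of log lines."""
    events = []
    for line in lines:
        event = parse_atlas_event(line)
        if event is not None:
            events.append(event)
    return events
```

Since the log timestamps only have one-second resolution, this can show that an atlas regeneration happened around a freeze, but not that it caused it.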

I tried different Windows 10 versions, and all of them have this issue. I also tried different driver versions for both the iGPU and dGPU, with the same results.
I didn't try Lazer on Windows 8.0 with the dGPU, because it feels like this is an issue in Lazer itself, or some software clashing.

I don't know how this affects Lazer's timers, but my laptop has a bugged HPET which I can't disable. I mention it just in case timers going backwards could interfere with Lazer's code.

Screenshots or videos showing encountered issue:

iGPU (screenshots): zESehZVN92, Eou8M8cIYM, HYX7vrgV6I
dGPU (screenshots): EYGkzpEmDA, klk3cSJVbb

osu!lazer version:
2021.720.0-lazer

System specs:

  • Windows 10 1703 (15063.138)
  • Intel i7-8750H (UHD 630)
  • Nvidia GTX 1060M

Logs:
iGPU: logs.zip
dGPU: logs_d.zip

@peppy
Sponsor Member

peppy commented Jul 22, 2021

Duplicate of #11800.

peppy closed this as completed Jul 22, 2021

@SonoSooS
Author

I read the issue, and this doesn't seem to be a duplicate, as the behavior I get is different.

I don't get these at regular intervals, and I don't experience them at all in the main menu. And I only experience macrostutters, not microstutters (except when selecting a different song in the song browser, but that's sort of expected, since it has to load from disk and calculate difficulty).
One day I was using the media player feature on the main menu, and never had a single stutter (micro or macro) during the few hours I let it play.

I only experience these during play, and even then it's extremely unpredictable, except for a few really intensive maps, where a stutter can usually be predicted to within a few (~7-12) seconds.
I was closely watching the Gen values as per #11800, and GC seems to happen independently of these macrostutters, as the Gen values don't change during the macrostutter.

However, still using the same hardware, I was able to reproduce #11800 on macOS 10.13, and it's pretty regular. It even has the weird Gen0 behavior, which I don't experience at all on Windows, where Gen0 barely reaches 50k and most often stays rock-solid at 24.
On macOS it has to be the GC causing the stutters, as the GC values are significantly lower after the stutter ends. Also, the macOS stutters are much shorter than the macrostutters I'm experiencing.

On Linux this issue doesn't happen at all, and I can't reproduce either this or #11800. I rarely get sound crackles, but that's down to the low-latency PulseAudio config. Linux behaves pretty much the same as Windows, except that Gen2 stays 4x-8x as big on average compared to Windows, while Gen0 still stays rock-solid at 24.

@peppy
Sponsor Member

peppy commented Jul 22, 2021

The other issue is also generally only during gameplay. Trust me, it's the same thing. You can see the GC markers on your frame graph.

@SonoSooS
Author

I just noticed the small green dot (it's quite hard to see).

This sparked a memory.
We once had a problem with the standard GC (even at the "sustained low latency" setting): it kept hogging the program because the GC was too slow and was already using all the free memory the system had. Switching to server GC greatly improved things until we refactored the code to reduce the strain on the GC.

While server GC doesn't solve the underlying problem, it should slightly improve the user experience until the underlying problem is figured out.

I did some tests, and it is indeed undoubtedly the GC.
However, during my tests I also managed to reduce the amount and the frequency of the lag, and managed to make play mode actually usable.

Just enabling server GC alone has reduced the macrostutters to microstutters less than 200ms long (kind of hard to measure).
Enabling concurrent GC (sadly deprecated) along with server GC reduces the microstutters even further, at the cost of slightly higher average RAM usage.
I got the best results by also enabling memory retaining (at the cost of noticeably higher RAM usage), but the difference compared to having only concurrent GC and server GC enabled is barely noticeable, although it still seems to be there.

You can test these GC settings on the "retail" version just by setting some environment variables prior to launching osu!.exe:

set COMPLUS_gcServer=1
set COMPLUS_gcConcurrent=1
set COMPLUS_GCRetainVM=1
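
The same setup can also be sketched as a small launcher script, which keeps the variables scoped to the game process instead of the whole shell session. This is only an illustration: the variable names are the ones listed above, while the helper names and the path to osu!.exe are hypothetical.

```python
import os
import subprocess
from typing import Dict, Mapping

# GC settings exactly as listed above.
GC_SETTINGS = {
    "COMPLUS_gcServer": "1",
    "COMPLUS_gcConcurrent": "1",
    "COMPLUS_GCRetainVM": "1",
}


def build_env(base: Mapping[str, str], overrides: Mapping[str, str]) -> Dict[str, str]:
    """Return a copy of the base environment with the overrides applied."""
    env = dict(base)
    env.update(overrides)
    return env


def launch(exe_path: str) -> subprocess.Popen:
    """Start the game with the GC settings applied to its process only."""
    return subprocess.Popen([exe_path], env=build_env(os.environ, GC_SETTINGS))


# Example call (hypothetical path, adjust to the local install):
# launch(r"C:\path\to\osu!.exe")
```

Removing an entry from GC_SETTINGS and relaunching makes it easy to compare the effect of each setting in isolation.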

@peppy
Sponsor Member

peppy commented Jul 22, 2021

Please read through the linked issue, and especially the one over at the dotnet repo (dotnet/runtime#48937). We are well versed in how the GC works and have already tested all available options.

I understand you're trying to be helpful, but please read every post in both threads before providing further suggestions/commentary.

@smoogipoo
Contributor

Please follow #11800 for any updates instead. I'm very interested in you testing with COMPlus_GCGen0MaxBudget=600000 instead.

Using server GC is only delegating the problem.
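
For anyone trying the COMPlus_GCGen0MaxBudget value mentioned above, it may help to know what size it stands for. The sketch below assumes the runtime reads the value as hexadecimal, as the .NET documentation describes for GC settings supplied through environment variables; if that assumption does not hold for this particular setting, the numbers are different. The function name is made up for illustration.

```python
def parse_hex_setting(value: str) -> int:
    """Interpret an environment variable value as a hexadecimal number (assumption, see above)."""
    return int(value, 16)


# The value suggested in the comment above.
budget_bytes = parse_hex_setting("600000")
budget_mib = budget_bytes / (1024 * 1024)
```

Under that assumption, 600000 corresponds to 0x600000 bytes, which is 6 MiB rather than 600,000 bytes.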

@SonoSooS
Author

I am following #11800, because I would also root for this issue being figured out and solved.

I did try with that setting, and sadly it had a negative effect. I'm still getting ~400-600ms macrostutters (slightly less than the ~700ms average), but now its frequency has significantly increased during gameplay (on some maps I'm getting it almost every ~7s!).
However, in the map browser I can press F2 a few times (~1-4) before it macrostutters, as opposed to without the setting, where it macrostutters on every F2 press.
Main menu seems to be still unaffected when idling.

I have not tested that setting on macOS or Linux, but I can if you want me to.

And yeah, I know that server GC is not a solution, but a temporary workaround, and I did word my reply that way for that reason.

@smoogipoo
Contributor

smoogipoo commented Jul 23, 2021

I'm going to tentatively reopen this one, because the issue sounds much more severe and different than #11800 in that case.

Can you clarify - are you running this on a Mac machine, running Windows 10 via bootcamp? Or is it the other way around - a PC running a hackintosh (when you tested the macOS scenario)?
Because this could be related to #11691 and #12944.

smoogipoo reopened this Jul 23, 2021

@SonoSooS
Author

SonoSooS commented Jul 25, 2021

I'm running a Hackintosh because my MacBook Pros are really old (2010 - 2011) and too expensive to service.

I've always had issues on macOS with the proprietary client as well (both Hackintosh and on real MacBook Pro), so I assume that it's an issue with some macOS implementation, and not a hardware issue.

I haven't tested Lazer on a real MacBook Pro yet, because I'm using their RAM sticks and storage in a different non-Apple machine, and switching them out takes some time. But if it's requested, I can switch the components back and test Lazer on it, just to rule out any Hackintosh weirdness.

On macOS, though, my problems more closely resemble #11800 than this one. I can only reproduce this issue on Windows.

Edit: I sadly can't test Windows 7 in Bootcamp anymore because of a hardware fault, but I can test on Windows 10 Bootcamp on weekdays.

@SonoSooS
Author

SonoSooS commented Nov 10, 2021

Tested on the 2010 13" MacBook Pro (Core 2 Duo, Nvidia GeForce 320M, 8 GB of RAM) with the latest build (2021.1108.0-lazer), and the results are extremely surprising.

On macOS 10.13, the graphics are very garbled, but graphical distortion and mouse acceleration issues aside, the game is perfectly playable. For a Core 2 Duo, the game is extremely snappy, and navigating the song select is basically lagless (even faster than on my laptop where this issue happens; the MacBook Pro wins even against the gcServer hack), and I had zero macrostutters. During my testing, I only had a single microstutter, and that's it.

On Windows 20H2 (BIOS mode, 64-bit), the text and icons don't get garbled up as if the VRAM were being overwritten once it's full; instead, the VRAM allocation cycles each frame, creating a constant flickering like the "your computer is being hacked" effect used in movies.
Other than the graphical glitches, which are not related to this issue at all, it's the same as on macOS, and the game runs completely fine with no other issues at all.

The issue still happens on my main laptop though (only tested on Windows this time), and it's definitely the GC.

I did notice different GC behavior across all three tests, and I have no explanation for why GC behavior is so inconsistent across software and hardware. This seems to be territory that belongs to #11800 and requires more research.

tl;dr: on the real MacBook Pro, there are absolutely no issues, but it still happens on my laptop, even with the latest build (2021.1108.0-lazer)
So yeah, this is basically #11800, except it seems to be affected by random things for no observable reason or external force.

@peppy
Sponsor Member

peppy commented Nov 10, 2021

#11800 is not related to windows and has no effect on any windows system

@peppy
Sponsor Member

peppy commented Nov 10, 2021

let's close this issue for now, anyway. it seems out of scope
