Latest Sniper release freezes The Elder Scrolls Online and then the whole OS #610

Closed
oliverklee opened this issue Aug 13, 2023 · 6 comments

Comments

@oliverklee

oliverklee commented Aug 13, 2023

(edit: Today, I couldn't reproduce this problem anymore. If I don't encounter the problem for the next few days, I'll close this ticket.)

Your system information

  • Steam Runtime Version: sniper depot 0.20230808.56699 (Steampipe build ID 11903591)
  • Distribution (e.g. Ubuntu 18.04): Kubuntu 23.04
  • Link to your full system information (Help -> System Information) in a Gist:
    https://gist.github.com/oliverklee/c59623fb60fe56fb0a4febade5c1bbf0
  • Have you checked for system updates?: Yes
  • What compatibility tool are you using?: Steam Linux Runtime and Proton 8.0-3 11723527
  • If you are using Steam Linux Runtime, or Proton 5.13 or newer: What versions are listed in SteamLinuxRuntime_soldier/VERSIONS.txt?

In SteamLinuxRuntime_sniper/VERSIONS.txt:

depot   0.20230808.56699                        # Overall version number
pressure-vessel 0.20230804.0    scout           # pressure-vessel-bin.tar.gz
scripts 0.20230804.0                    # from steam-runtime-tools
sniper  0.20230808.56699        sniper  0.20230808.56699        # sniper_platform_0.20230808.56699/

Please describe your issue in as much detail as possible:

After the latest Sniper beta update (which probably was 0.20230808.56699 (Steampipe build ID 11903591), but I'm not 100% sure), The Elder Scrolls Online (with Steam Beta on Kubuntu 23.04) freezes within the first 5 minutes of playing (on the loading screen going from the character selection screen to in-game if I remember correctly), and a few seconds later, my complete OS becomes unresponsive so that I need to reboot. (I can't even switch from the UI to a virtual console.)
Going back to Sniper 11406188 fixes this problem for me. So my best guess is that it's a Sniper regression.
(I haven't played any other games with that particular Sniper version yet, so I don't know if this is specific to ESO.)

(I'll update the log once I've created it.)

Steps for reproducing this issue:

  1. Switch to Steam beta.
  2. Install Sniper beta.
  3. Install Proton 8.0-3 (not the beta).
  4. Install The Elder Scrolls Online.
  5. Play ESO and switch a few times between your characters.

Note: I tried to reproduce it today in order to create the logs, but was not able to reproduce the problem (yet). Strange. When the problem first occurred, it did so extremely reliably for me.

Tickets/ticket comments in other projects for the same issue:

@smcv
Contributor

smcv commented Aug 14, 2023

> Today, I couldn't reproduce this problem anymore. If I don't encounter the problem for the next few days, I'll close this ticket.

I would guess that other things changed around the same time in your OS, which might have obscured whether the regression and fix were caused by an OS update or a runtime update. On Kubuntu, you should be able to find a history of recent updates in /var/log/apt/history.log.
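As a rough illustration of checking that history, here is a minimal Python sketch that pulls kernel and graphics-related upgrades out of text in the layout apt uses for history.log. The sample entry, the `example-tool` package, and the prefix list are assumptions made for the example; only the two kernel meta-package version numbers come from this thread.

```python
import re

# Sample text in the layout apt uses for /var/log/apt/history.log.
# The package names and the example-tool versions are illustrative assumptions.
SAMPLE = """\
Start-Date: 2023-08-12  09:14:03
Commandline: apt-get dist-upgrade
Upgrade: linux-generic:amd64 (6.2.0.26.26, 6.2.0.27.27), example-tool:amd64 (1.0, 1.1)
End-Date: 2023-08-12  09:16:40
"""

# One "name:arch (old, new)" entry inside an "Upgrade:" line.
ENTRY = re.compile(
    r"(?P<name>[^\s:,]+):(?P<arch>\S+) \((?P<old>[^,()]+), (?P<new>[^,()]+)\)"
)


def upgrades(text, prefixes=("linux-", "mesa", "libdrm")):
    """Yield (date, package, old, new) for upgraded packages matching prefixes."""
    date = None
    for line in text.splitlines():
        if line.startswith("Start-Date:"):
            date = line.split(":", 1)[1].strip()
        elif line.startswith("Upgrade:"):
            for m in ENTRY.finditer(line.split(":", 1)[1]):
                if m["name"].startswith(prefixes):
                    yield date, m["name"], m["old"], m["new"]


for date, name, old, new in upgrades(SAMPLE):
    print(f"{date}: {name} {old} -> {new}")
```

To run it against a real system, replace `SAMPLE` with the contents of /var/log/apt/history.log; older entries may live in rotated, compressed copies of that file.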

I don't see anything in recent runtime changelogs which would have been an obvious trigger for a regression like this.

> Going back to Sniper 11406188 fixes this problem for me

I believe that version is sniper depot 0.20230605.51441, currently in the previous_release branch. It's usually easier to track this stuff with depot versions from VERSIONS.txt rather than Steampipe build IDs, since the build ID is not available anywhere obvious on-disk, but VERSIONS.txt is (which is a large part of why we provide it).
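For anyone who wants to read the depot version programmatically, a small Python sketch follows. It assumes the file is laid out as in the excerpt quoted in the original report (whitespace-separated columns, with `#` starting a comment); the function name `parse_versions` is made up for this example and is not part of any Steam Runtime tool.

```python
def parse_versions(text):
    """Map each component name to its version, ignoring '#' comments."""
    versions = {}
    for line in text.splitlines():
        # Drop the trailing comment, then split the remaining columns.
        fields = line.split("#", 1)[0].split()
        if len(fields) >= 2:
            versions[fields[0]] = fields[1]
    return versions


# Contents copied from the VERSIONS.txt excerpt in the original report.
SAMPLE = """\
depot   0.20230808.56699                        # Overall version number
pressure-vessel 0.20230804.0    scout           # pressure-vessel-bin.tar.gz
scripts 0.20230804.0                    # from steam-runtime-tools
sniper  0.20230808.56699        sniper  0.20230808.56699        # sniper_platform_0.20230808.56699/
"""

print(parse_versions(SAMPLE)["depot"])  # prints 0.20230808.56699
```

In practice the text would be read from SteamLinuxRuntime_sniper/VERSIONS.txt inside the Steam library folder; the exact path depends on where Steam is installed.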

There are usually three versions of sniper available to the public, with a gap of a few weeks between each one, shown here newest-first:

  • client_beta: currently sniper depot 0.20230808.56699 (Steampipe build ID 11903591)
  • default branch: currently sniper depot 0.20230718.55074 (Steampipe build ID 11828592)
  • previous_release: currently sniper depot 0.20230605.51441 (Steampipe build ID 11406188)

If you can reproduce this at some point, it would be useful to know whether sniper depot 0.20230718.55074 (Steampipe build ID 11828592) reproduces this or not.

> a few seconds later, my complete OS becomes unresponsive so that I need to reboot

It should not have been possible for the Steam Runtime to make this happen even if we wanted to, so that part must be an issue with some OS component (most likely the kernel or graphics drivers).

@oliverklee
Author

Thanks!

Then my money (for fixing the crash or avoiding triggering it) would be on the latest kernel update from 6.2.0.26.26 to 6.2.0.27.27.

@oliverklee
Author

And thanks for the thoughtful, detailed reply, @smcv! I really appreciate it. ❤️

@oliverklee
Author

oliverklee commented Aug 14, 2023

I have been able to reproduce it by going back to kernel version 6.2.0.26.26 (with the Sniper version listed in the initial comment of this issue).

I have created a gist with the Steam logs for this steam session:
https://gist.github.com/oliverklee/8e4594bf10ee6355429bab921bc4775f

The crash happened around 17:22:35, and the latest entry from the Steam log was from more than 2 minutes before that (assuming that my computer clock is in sync with my external clock - at the moment, the computer is less than 10 seconds early).

The latest entries from /var/log/syslog also stop about 2 minutes before the crash: https://gist.github.com/oliverklee/e10d60e6ae90ac7088805cb7a5ec979f

@smcv
Contributor

smcv commented Aug 14, 2023

Since @oliverklee says the newer kernel 6.2.0.27.27 (presumably that should be 6.2.0-27.27) avoids this, I think we can consider this to be resolved on Ubuntu's side, and therefore "not our bug".

It seems unlikely that there would be anything that the Steam Runtime could do to avoid this, so there will probably not be any suitable workaround for us to add. @kisak-valve, please could you close this as "not planned"?

I think 23.04 is Ubuntu lunar? @oliverklee is using an AMD GPU, and the changelog for 6.2.0-26.26 to 6.2.0-27.28 (or more specifically, their update from upstream Linux 6.2.13 to 6.2.15) has direct rendering manager fixes for AMD GPUs, such as these:

    - drm/amd/display: Remove stutter only configurations
    - drm/amd/display: limit timing for single dimm memory
    - drm/amd/display: fix PSR-SU/DSC interoperability support
    - drm/amd/display: fix a divided-by-zero error

(Confusingly, games on Linux often involve more than one thing named DRM: the Linux direct rendering manager is nothing to do with digital rights management!)

Without any particular kernel knowledge, that last point I quoted seems like it might have been the solution for this: if a divide-by-zero error was game-triggerable, then causing a kernel panic that has the symptoms described here seems plausible.

> The latest entries from /var/log/syslog also stop about 2 minutes before the crash

That certainly sounds like a problem with the kernel: nothing we can do in a game as an unprivileged user should be allowed to have that effect, but if something triggered a kernel panic, then the kernel would stop without finishing any pending writes to disk. After a sufficiently serious bug is detected, the kernel can no longer trust its internal state to be non-destructive, so to avoid possible data loss it stops doing anything at all.

@kisak-valve closed this as not planned on Aug 14, 2023
@oliverklee
Author

Yes, Ubuntu 23.04 is Lunar Lobster: https://wiki.ubuntu.com/Releases
