steamcmd: Terminating m_ThreadClient, likely to crash down the line #8083

Open
lloesche opened this issue Sep 19, 2021 · 24 comments

@lloesche

lloesche commented Sep 19, 2021

Your system information

  • Steam client version (build number or date): steamcmd latest
  • Distribution (e.g. Ubuntu): Debian Stable (11) Docker Container running on Fedora 30 Host (5.6.11-100.fc30.x86_64)
  • Opted into Steam client beta?: No
  • Have you checked for system updates?: Yes

Please describe your issue in as much detail as possible:

Whenever steamcmd.sh quit is executed the following error is thrown:

Work thread 'CJobMgr::m_WorkThreadPool:1' is marked exited, but we could not immediately join prior to deleting -- proceeding without join
Terminating m_ThreadClient, likely to crash down the line... but avoiding hang on exit
./steamcmd.sh: line 38:  1712 Killed                  $DEBUGGER "$STEAMEXE" "$@"

This only happens on the Fedora 30 system. I tried on a Fedora 33 system with Kernel (5.11.7-200.fc33.x86_64) and there the issue doesn't present itself. Previous versions of steamcmd worked just fine on Fedora 30.

Steps for reproducing this issue:

root@18229210b028:/opt/steamcmd# ./steamcmd.sh
Redirecting stderr to '/root/Steam/logs/stderr.txt'
Looks like steam didn't shutdown cleanly, scheduling immediate update check
[  0%] Checking for available updates...
[----] Verifying installation...
Steam Console Client (c) Valve Corporation
-- type 'quit' to exit --
Loading Steam API...OK

Steam>quit
Terminating m_ThreadClient, likely to crash down the line... but avoiding hang on exit
./steamcmd.sh: line 38:  1918 Killed                  $DEBUGGER "$STEAMEXE" "$@"
root@18229210b028:/opt/steamcmd#
@lloesche
Author

lloesche commented Sep 20, 2021

A user of my Valheim Docker container (which uses steamcmd to download the Valheim dedicated server from Steam) reported that the container now freezes his entire Synology NAS, which runs Linux kernel 3.10.105.

He is getting the following message in his steamcmd stderr log:

root@d716c7c91549:/# cat /home/valheim/Steam/logs/stderr.txt
src/tier0/threadtools.cpp (4071) : Assertion Failed: Probably deadlock or failure waiting for thread to initialize.
crash_20210919074534_5.dmp[78]: Uploading dump (out-of-process)
/tmp/dumps/crash_20210919074534_5.dmp
CWorkThreadPool::StartWorkThread: Thread creation failed.

crash_20210919074534_5.dmp[78]: Finished uploading minidump (out-of-process): success = yes
crash_20210919074534_5.dmp[78]: response: Discarded=1
crash_20210919074534_5.dmp[78]: file ''/tmp/dumps/crash_20210919074534_5.dmp'', upload yes: ''Discarded=1''

His syslog stdout captures a very similar output:

2021-09-19 06:24:56,stdout, Redirecting stderr to '/home/valheim/Steam/logs/stderr.txt'
2021-09-19 06:24:56,stdout, [  0%] Checking for available updates...
2021-09-19 06:24:56,stdout,crash_20210919062456_5.dmp[69]: Uploading dump (out-of-process) /tmp/dumps/crash_20210919062456_5.dmp
2021-09-19 06:24:56,stdout, src/tier0/threadtools.cpp (4071) : Assertion Failed: Probably deadlock or failure waiting for thread to initialize.
2021-09-19 06:24:56,stdout, Thread failed to initialize

The container is using Debian stable as the base Linux system with the following steamcmd and Valheim specific packages installed in it:

        libc6-dev
        lib32gcc-s1
        libsdl2-2.0-0
        libsdl2-2.0-0:i386

@lloesche
Author

So, it seems this is Debian 11 related. When I downgraded the Docker container base from debian:stable to debian:buster (Debian 10) steamcmd behaves again and no longer throws any of the errors above. So maybe steamcmd doesn't play nice with the Debian 11 libraries or it is a combination of Debian 11 libs with older host OS kernels?

@TTimo
Collaborator

TTimo commented Sep 20, 2021

Can you post / send me the minidump files from those logs (or new ones for the problem, doesn't matter)?

@lloesche
Author

I can ask the reporter of the issue to download the dmp for me. Might be challenging as this happened on their Synology NAS and I don't know how comfortable they are with the CLI.

@lloesche
Author

lloesche commented Sep 20, 2021

@TTimo in lloesche/valheim-server-docker#401 (comment), at the end of their comment, the user did copy-paste the dmp file (they ran cat on it and pasted the output, including all the binary data). I don't know whether the paste contains all the binary data or whether something went missing in the copy-paste process, but at the very least it contains some funky symbols, which leads me to believe it's the complete file.

@TTimo
Collaborator

TTimo commented Sep 20, 2021

Yeah I'd need the file or a crash id of a successful upload to our servers - I can't retrieve the binary bits from a copy paste to ascii :)

@tomekduda

Apologies for the delay, I'm a bit out of my element and Synology via SSH is pretty barebones. I couldn't run groupadd docker or usermod, so I had to work as root.

It doesn't seem like the dmp file suffered from Linux -> Windows extra newline characters but please don't be surprised if it did.

valheim.log.txt

steam.logs.stderr.txt

crash_20210920204020_5.zip (GitHub didn't let me upload as *.dmp)

Rough steps I've followed (if we need to run it again in future or you think I mangled the files)

sudo su -
docker pull ghcr.io/lloesche/valheim-server@sha256:2cc61cb267192d34c73526e7377a6cc7c3eefd843916af81047305ada14721d2
rm -r valheim-server
mkdir -p $HOME/valheim-server/config/worlds $HOME/valheim-server/data

docker image ls

docker run -d \
    --name valheim-server \
    --cap-add=sys_nice \
    --stop-timeout 120 \
    -p 2456-2457:2456-2457/udp \
    -v $HOME/valheim-server/config:/config \
    -v $HOME/valheim-server/data:/opt/valheim \
    -e SERVER_NAME="eyescream" \
    -e WORLD_NAME="HiGithub" \
    -e SERVER_PASS="HiGithub!" \
    a78bfac8f1bb

# wait 1 min for the Valheim dedicated server download to fail
# stop the container and snapshot it, otherwise it'll hog the CPU and lead to hard reboot

docker ps
docker commit 1d4acb565af7 mysnapshot
docker container stop 1d4acb565af7
docker run -t -i mysnapshot /bin/bash
# have a look around, then exit

# copy files
docker logs 1d4acb565af7 > valheim.log.txt
docker cp 1d4acb565af7:home/valheim/Steam/logs/stderr.txt steam.logs.stderr.txt
docker cp 1d4acb565af7:tmp/dumps/crash_20210920204020_5.dmp crash_20210920204020_5.dmp

@kisak-valve
Member

For reference, the attached minidump is a DUMP_REQUESTED in crashhandler.so.

@TTimo
Collaborator

TTimo commented Sep 20, 2021

Yeah, DUMP_REQUESTED .. that's an assertion failure (the crash_*.dmp name is misleading, known Steam bug but unrelated here).

According to the log, the failure looks immediate after startup, and the .dmp indicates it is an init failure from a worker thread of the HTTP client. I think after that steamcmd just decides to cleanly exit.

I don't think Fedora and the Docker version have anything to do with this, do they? It should be reproducible simply with Debian 10 vs Debian 11 as the base container image.

@lloesche
Author

@TTimo the reason I thought the host system kernel plays a role here is that not every user of my container has this issue. The container has 16 million downloads on Docker Hub, so it is in use by quite a few people. However, only a small number of reports came in after the container auto-upgraded from Debian 10 to 11 via the debian:stable-slim FROM tag.

Also, the failure behavior was different across users. Some, like the Synology users, had their entire Synology NAS freeze when running the updated container. Others, like @tomekduda, would see steamcmd crash and create a dmp file.

On my own Fedora 30 host system with Linux 5.6.11, I could execute steamcmd and download the Valheim server, but whenever I executed the quit command I got the Terminating m_ThreadClient, likely to crash down the line... but avoiding hang on exit error I reported above.

On a Fedora 33 host system with Linux 5.11.7 I did not see ANY issues whatsoever. steamcmd worked just fine, downloaded Valheim server and was able to run quit without the error I was seeing on the 5.6/FC30 Kernel.

This was all with the same Debian 11 based container just on different host systems with different host kernels. I agree the host system distribution shouldn't make any difference which is why I suspect it's a combination of Kernel version with Debian 11 libc or maybe SDL, that's causing the issues here.

After downgrading to Debian 10 all users reported that the issues went away. So that's a good workaround for us right now.

@TTimo
Collaborator

TTimo commented Sep 21, 2021

Yeah - Debian 11's libc/threading having a backwards compatibility bug with 5.6 kernels is a strong possibility. If that's the case it's probably affecting software other than steamcmd too so maybe the debian bug tracker will have some things.

@smcv
Contributor

smcv commented Sep 21, 2021

What kernel is the Synology using?

In principle Debian 11 user-space is meant to work on at least kernels >= Debian 10, meaning 4.19.x (otherwise we wouldn't be able to upgrade Debian 10 systems to Debian 11 in-place), but there might be something more subtle happening here.

@TTimo, if you're able to get a backtrace with a failing or hanging glibc call, that would give me better search terms?

@tomekduda

Does this post help? https://old.reddit.com/r/synology/comments/cn9qnd/what_distribution_of_linux_is_synology_using/

They call it DSM (Disk Station Manager). My hardware is from 2016 so I'm sitting on v 6.2. Newer devices sit on 7.0.

From SSH (I'm an "admin" according to Synology web admin UI but I'm not sitting on root account) I don't see mention of Debian anywhere. "toster" is the NAS's name so "Linux toster" won't mean anything to anybody, don't bother Googling it.

eyescream@toster:~$ uname -a
Linux toster 3.10.105 #25556 SMP Sat Aug 28 02:13:34 CST 2021 x86_64 GNU/Linux synology_braswell_916+

eyescream@toster:~$ cat /etc/VERSION
majorversion="6"
minorversion="2"
major="6"
minor="2"
micro="4"
productversion="6.2.4"
buildphase="GM"
buildnumber="25556"
smallfixnumber="2"
nano="0"
base="25556"
builddate="2021/08/28"
buildtime="14:40:29"

@smcv
Contributor

smcv commented Sep 21, 2021

Thanks, 3.10.105 is the version I was looking for.

@smcv
Contributor

smcv commented Sep 21, 2021

glibc on Debian appears to be set up to have a minimum kernel version of 3.2, so Debian user-space should work on older kernels all the way down to 3.2, and if it doesn't then that's likely to be considered to be a bug.

However, a reproducer consisting of "run Steam" is not exactly minimal, so even if it's a Debian 11 bug, reporting that bug is probably not helpful yet.
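
To make the "minimum kernel version" comparison concrete, here is a minimal sketch that parses a kernel release string and compares it against a required major.minor. The function names are made up for illustration; the 3.2 and 4.19 thresholds are simply the ones mentioned in this thread. A check like this only says how old the kernel claims to be, not whether a vendor has modified it.

```c
#include <stdio.h>
#include <sys/utsname.h>

/* Returns 1 if a release string such as "3.10.105" or
 * "5.6.11-100.fc30.x86_64" is at least want_major.want_minor,
 * 0 if it is older, and -1 if the string cannot be parsed.
 * (Hypothetical helper, not part of any library.) */
int release_at_least(const char *release, int want_major, int want_minor)
{
    int major = 0, minor = 0;

    if (sscanf(release, "%d.%d", &major, &minor) != 2)
        return -1;
    if (major != want_major)
        return major > want_major;
    return minor >= want_minor;
}

/* Same check against the running kernel, as reported by uname(2). */
int running_kernel_at_least(int want_major, int want_minor)
{
    struct utsname u;

    if (uname(&u) != 0)
        return -1;
    return release_at_least(u.release, want_major, want_minor);
}
```

With the release strings posted above, release_at_least("3.10.105", 3, 2) is true while release_at_least("3.10.105", 4, 19) is not.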

@smcv
Contributor

smcv commented Sep 21, 2021

src/tier0/threadtools.cpp (4071) : Assertion Failed: Probably deadlock or failure waiting for thread to initialize.

@TTimo, is it possible to find out from the crash-dump what this thread is for and how it was created?

One possibility is that something involved in thread-creation might have become more strict in the version of glibc used in Debian 11, compared with the version used in Debian 10.

Another possibility is that "the other thread has initialized" is being communicated to the main thread in a non-thread-safe way?

the container now freezes his entire Synology NAS

In principle this shouldn't be possible for an unprivileged process to achieve, and it is likely to be considered to be a kernel bug (maybe even a security vulnerability) if it can carry out that denial-of-service - but kernel 3.10 is pretty old and is no longer maintained upstream, not even as a LTS kernel.

Work thread 'CJobMgr::m_WorkThreadPool:1' is marked exited, but we could not immediately join prior to deleting -- proceeding without join

Am I correct to think that the proprietary code in steamcmd is using pthread_timedjoin_np() or pthread_tryjoin_np() to try to join a thread that it thinks has already finished? If it is, it would be helpful if it could print the strerror() resulting from that failed library call. I think it works like this, rather than the more usual errno setup:

void *retval;
int join_result = pthread_tryjoin_np(my_thread, &retval);
if (join_result != 0)
  printf ("pthread_tryjoin_np: %s\n", strerror (join_result));
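
A self-contained version of the fragment above might look like the following sketch. It does not reflect how steamcmd is written (that code is proprietary); the worker, the gate mutex and the function names are invented for the demo. It only shows that pthread_tryjoin_np() returns the error number directly, and that a thread which has not finished yet yields EBUSY.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <pthread.h>
#include <stdio.h>
#include <string.h>

/* Gate that keeps the demo worker alive until the caller releases it. */
static pthread_mutex_t gate = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical worker: blocks on the gate, then exits. */
static void *worker(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&gate);
    pthread_mutex_unlock(&gate);
    return NULL;
}

/* Non-blocking join that logs why it failed, if it did.
 * pthread_* functions return the error number; errno is not used. */
int try_join_logged(pthread_t thread)
{
    void *retval = NULL;
    int rc = pthread_tryjoin_np(thread, &retval);

    if (rc != 0)
        fprintf(stderr, "pthread_tryjoin_np: %s\n", strerror(rc));
    return rc;
}

/* Returns 0 if a still-running thread reports EBUSY and is then
 * reaped by an ordinary blocking join, -1 otherwise. */
int demo_tryjoin(void)
{
    pthread_t t;

    pthread_mutex_lock(&gate);              /* keep the worker from exiting */
    if (pthread_create(&t, NULL, worker, NULL) != 0) {
        pthread_mutex_unlock(&gate);
        return -1;
    }
    int busy = try_join_logged(t);          /* worker cannot have exited yet */
    pthread_mutex_unlock(&gate);            /* let the worker finish */
    int joined = pthread_join(t, NULL);     /* blocking join reaps it */

    return (busy == EBUSY && joined == 0) ? 0 : -1;
}
```

On older glibc versions this needs to be linked with -pthread.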

Do we have evidence that the original issue you reported here, which seems to be about terminating a thread, has the same root cause as the issue @tomekduda reported, which seems to be about starting a thread?

Terminating m_ThreadClient, likely to crash down the line... but avoiding hang on exit
./steamcmd.sh: line 38:  1918 Killed                  $DEBUGGER "$STEAMEXE" "$@"

Does this mean it's using pthread_kill() to force the other thread to terminate? Is it possible that that's going wrong and it is terminating the main thread by mistake?

@TTimo
Collaborator

TTimo commented Sep 21, 2021

@TTimo, is it possible to find out from the crash-dump what this thread is for and how it was created?

It's an HTTP worker thread, nothing special about its creation. Steam uses the worker thread pattern a lot.

There is not much evidence that the two problems have the same root cause (thread failing to start and locking up DSM installs, vs thread failing to terminate at exit), except that they both appeared when the container base image changed from 10 to 11.

The thread startup failure seems like the more tractable problem, and it is less likely to be caused by a problem in steamcmd.

@Dids

Dids commented Oct 28, 2021

This might be a long shot, but curious to know whether you've tried installing libcurl4?

This missing dependency has usually been the culprit with newer Linux distros, and considering the failing thread is an HTTP worker, it would make sense.

@peterf81

peterf81 commented May 25, 2022

Steamcmd in Docker on Synology crashes Docker and hard-freezes the Synology NAS, which then has to be rebooted with the power button. Even "sudo killall -KILL dockerd" sometimes does not help. I tried multiple Docker images; the result is the same.


steam@cm2network-steamcmd1:~/Steam/logs$ cat stderr.txt
src/tier0/threadtools.cpp (4122) : Probably deadlock or failure waiting for thread to initialize.
assert_20220525034314_5.dmp[41]: Uploading dump (out-of-process)
/tmp/dumps/assert_20220525034314_5.dmp
assert_20220525034314_5.dmp[41]: Finished uploading minidump (out-of-process): success = no
assert_20220525034314_5.dmp[41]: error: libcurl.so: cannot open shared object file: No such file or directory
assert_20220525034314_5.dmp[41]: file ''/tmp/dumps/assert_20220525034314_5.dmp'', upload no: ''libcurl.so: cannot open shared object file: No such file or directory''
src/tier0/threadtools.cpp (4122) : Probably deadlock or failure waiting for thread to initialize.
CWorkThreadPool::StartWorkThread: Thread creation failed.

@an-englishman

Any news on this for Synology DSM 7? I also cannot download via steamcmd.

2022-08-03T16:51:01.643241454Z stdout Aug  3 16:51:01 cron[25]: (root) RELOAD (crontabs/root)
2022-08-03T16:50:41.987819658Z stdout 2022-08-03 16:50:41,987 INFO reaped unknown pid 87 (exit status 0)
2022-08-03T16:50:40.986562290Z stdout Aug  3 16:50:40 assert_20220803165030_6.dmp[87]: file ''/tmp/dumps/assert_20220803165030_6.dmp'', upload no: ''Couldn't resolve host name''
2022-08-03T16:50:40.986526071Z stdout Aug  3 16:50:40 assert_20220803165030_6.dmp[87]: error: Couldn't resolve host name
2022-08-03T16:50:40.986250467Z stdout Aug  3 16:50:40 assert_20220803165030_6.dmp[87]: Finished uploading minidump (out-of-process): success = no
2022-08-03T16:50:30.522698989Z stdout Aug  3 16:50:30 /supervisord: valheim-updater ERROR - Failed to download Valheim server from Steam - retrying later - check your networking and volume access permissions
2022-08-03T16:50:30.521522440Z stdout Aug  3 16:50:30 /supervisord: valheim-updater src/tier0/threadtools.cpp (3628) : Assertion Failed: Illegal termination of worker thread 'Thread(0x0x57f9ef90/0x0xf6f10b'
2022-08-03T16:50:30.518541134Z stdout Aug  3 16:50:30 assert_20220803165030_6.dmp[87]: Uploading dump (out-of-process) /tmp/dumps/assert_20220803165030_6.dmp
2022-08-03T16:50:30.376773192Z stdout Aug  3 16:50:30 /supervisord: valheim-updater [----] !!! Fatal Error: Steamcmd needs to be online to update. Please confirm your network connection and try again.

@opello

opello commented Dec 12, 2022

It seems that steamcmd fails to create threads during startup because clock_gettime64 fails with EINVAL instead of returning a time:

clock_gettime64(CLOCK_MONOTONIC, 0xffc97d9c) = -1 EINVAL (Invalid argument)

And when the threads try to run and switch, the process just thrashes on clock_gettime64 calls, as seen by attaching strace to the PID of a running steamcmd at 99% CPU:

[pid    79] sched_yield()               = 0
[pid    79] clock_gettime64(CLOCK_REALTIME_COARSE, 0xffdd944c) = -1 EINVAL (Invalid argument)
[pid    79] clock_gettime64(CLOCK_MONOTONIC, 0xffdd944c) = -1 EINVAL (Invalid argument)
[pid    79] clock_gettime64(CLOCK_MONOTONIC, 0xffdd946c) = -1 EINVAL (Invalid argument)
[pid    79] clock_gettime64(CLOCK_MONOTONIC, 0xffdd93cc) = -1 EINVAL (Invalid argument)
[pid    79] clock_gettime64(CLOCK_MONOTONIC, 0xffdd93cc) = -1 EINVAL (Invalid argument)
[pid    79] clock_gettime64(CLOCK_MONOTONIC, 0xffdd93cc) = -1 EINVAL (Invalid argument)
[pid    79] clock_gettime64(CLOCK_MONOTONIC, 0xffdd93cc) = -1 EINVAL (Invalid argument)
[pid    79] clock_gettime64(CLOCK_MONOTONIC, 0xffdd93cc) = -1 EINVAL (Invalid argument)
[pid    79] clock_gettime64(CLOCK_MONOTONIC, 0xffdd93cc) = -1 EINVAL (Invalid argument)
[pid    79] clock_gettime64(CLOCK_MONOTONIC, 0xffdd93cc) = -1 EINVAL (Invalid argument)
[pid    79] clock_gettime64(CLOCK_MONOTONIC, 0xffdd93cc) = -1 EINVAL (Invalid argument)
[pid    79] clock_gettime64(CLOCK_MONOTONIC, 0xffdd946c) = -1 EINVAL (Invalid argument)
[pid    79] clock_gettime64(CLOCK_MONOTONIC, 0xffdd946c) = -1 EINVAL (Invalid argument)
[pid    79] clock_gettime64(CLOCK_MONOTONIC, 0xffdd944c) = -1 EINVAL (Invalid argument)

I can trivially reproduce this EINVAL failure with:

#include <stdio.h>
#include <time.h>

int main(void)
{
    struct timespec ts = {0};
    /* Report the failure explicitly instead of only printing zeros. */
    if (clock_gettime(CLOCK_MONOTONIC, &ts) != 0)
        perror("clock_gettime");
    printf("result: % 10li % 10li\n", ts.tv_sec, ts.tv_nsec);
    return 0;
}

when building with -m32 and running it in a VM with the Synology kernel, either directly or from within the Docker image from @lloesche. It seems like there's some problem with 32-bit syscalls in the Synology kernel. Rebuilding for 64-bit does not have the problem and returns the correct result; that build also uses clock_gettime without the 64 suffix. I'm not sure what is missing or what else to test. Maybe using the __vdso version directly to check the behavior...

@opello

opello commented Dec 16, 2022

After having done a fair bit more investigating and playing with this issue, it sure seems like the problem is that Synology added system calls that were then later used for other things (403, for example, is clock_gettime64 in Linux mainline while it is SYNOArchiveBit in Synology's 3.10.77 kernel).
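
One way to see which of the two meanings syscall 403 has on a given kernel is to invoke it by number and look at the result. The sketch below is only meaningful when built with -m32, since 403 is the 32-bit number; a 64-bit x86 build uses a different syscall table. The enum, struct and function names are invented for illustration. An "unexpected" result is a hint rather than proof, because a container's seccomp filter could also produce an error other than ENOSYS.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stdio.h>
#include <time.h>
#include <unistd.h>

/* Assumption taken from the comment above: 403 is clock_gettime64
 * in mainline Linux for 32-bit x86. */
#define NR_CLOCK_GETTIME64_I386 403L

/* Buffer the mainline call writes into: two 64-bit fields. */
struct ktimespec64 {
    long long tv_sec;
    long long tv_nsec;
};

enum probe_result {
    PROBE_IMPLEMENTED,      /* call succeeded */
    PROBE_NOT_IMPLEMENTED,  /* ENOSYS: kernel predates the syscall */
    PROBE_UNEXPECTED        /* any other error, e.g. the EINVAL seen here */
};

/* Pure classification of a raw syscall result. */
enum probe_result classify_probe(long ret, int err)
{
    if (ret == 0)
        return PROBE_IMPLEMENTED;
    if (err == ENOSYS)
        return PROBE_NOT_IMPLEMENTED;
    return PROBE_UNEXPECTED;
}

/* Invoke syscall 403 the way a 32-bit libc would call clock_gettime64. */
enum probe_result probe_clock_gettime64(void)
{
    struct ktimespec64 ts = {0, 0};

    errno = 0;
    long ret = syscall(NR_CLOCK_GETTIME64_I386, CLOCK_MONOTONIC, &ts);
    int err = errno;

    printf("syscall(403) = %ld, errno = %d\n", ret, err);
    return classify_probe(ret, err);
}
```

Calling probe_clock_gettime64() from a small main() in a 32-bit build prints the raw result, which can be compared against what strace shows.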

It seems that the presence of the similarly numbered syscall causes the issue in glibc >= 2.31 and the random musl I tried, even when the entry for the syscall isn't in the vDSO. I don't expect it should have to be in the vDSO, but this glibc commit made me wonder if it should test both for presence in the vDSO and availability in the running kernel ...

I imagine a workaround would require LD_PRELOADing a definition of clock_gettime that forces the plain syscall path instead of trying the vDSO/clock_gettime64/clock_gettime flow. Moving to an older glibc would also fix the steamcmd side of things (doing so for only 32-bit applications feels a bit ugly but would probably serve the need here).
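
A rough sketch of what such a preloaded definition could look like is below. This is an untested idea, not a confirmed fix: it only intercepts callers that go through the exported clock_gettime symbol, so calls made inside glibc itself or through a 64-bit-time entry point would not be affected. It also assumes the caller's struct timespec matches the layout of the legacy syscall, which is the default for 32-bit glibc builds without _TIME_BITS=64.

```c
#define _GNU_SOURCE
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

/* Replacement for libc's clock_gettime: always issue the legacy
 * clock_gettime syscall directly, skipping the vDSO and the
 * clock_gettime64 attempt. On failure syscall() returns -1 and
 * sets errno, which matches the clock_gettime contract. */
int clock_gettime(clockid_t clock_id, struct timespec *ts)
{
    return (int)syscall(SYS_clock_gettime, clock_id, ts);
}
```

It could be built as a 32-bit shared object with something like gcc -m32 -shared -fPIC -o shim.so shim.c and loaded by setting LD_PRELOAD to its path; the file names here are placeholders.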

@smcv
Contributor

smcv commented Jan 3, 2023

I think we have two separate things going on in this issue report:

  1. The original issue report referred to a Debian 11 container on a Fedora host, on which steamcmd crashes on exit, but is functional before that point.
  2. Many subsequent comments are about a Debian 11 container on a Synology host, with multiple worse symptoms (full system freeze or 100% CPU use). This might in fact be several issues, or it might be a single root cause that can show various different symptoms depending on what happens.

The rest of this comment refers only to the problems seen on Synology systems. The short version is that I think these will have to be "won't fix" from Steam's point of view.

it sure seems like the problem is that Synology added system calls that were then later used for other things (403, for example, is clock_gettime64 in Linux mainline while it is SYNOArchiveBit in Synology's 3.10.77 kernel).

Sorry, what you have there is not Linux, but instead Synology's incompatible fork of Linux. System call numbers are part of the Linux ABI, and derivatives that use a previously-unused syscall number for their own purposes are no longer suitable for running arbitrary Linux programs.

The (only) feature-discovery mechanism for the Linux system call interface is that user-space invokes the system call that it would prefer to use, to see whether it works. If it fails with ENOSYS (meaning "not implemented yet") then user-space can either fail (if the syscall is a hard requirement), or fall back to something older like the old 32-bit clock_gettime. If it either succeeds, or fails with an error other than ENOSYS, then that's an error. There is no way for user-space to query what syscall 403 is: after a system call number has been defined, it is meant to mean the same thing on every x86 Linux system, either "not implemented yet" or its mainline Linux meaning. (For historical reasons, syscalls use different numbers on different CPU architectures, but Steam only supports x86 anyway.)
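
The try-then-fall-back pattern described above can be sketched as follows. This is not glibc's actual code; the callback type and all function names are hypothetical, and the two kernel interfaces are passed in as function pointers so the behavior can be exercised with fakes. The point is that only ENOSYS triggers the fallback, so a kernel that answers the preferred call with EINVAL (as in the strace output earlier in this thread) makes the whole operation fail.

```c
#include <errno.h>

/* Hypothetical signature for "one way of asking the kernel for the time":
 * returns 0 on success, or -1 with errno set, like a syscall wrapper. */
typedef int (*time_call_fn)(long long *sec, long long *nsec);

/* Try the preferred call; fall back to the legacy one only on ENOSYS. */
int get_time_with_fallback(time_call_fn preferred, time_call_fn legacy,
                           long long *sec, long long *nsec)
{
    if (preferred(sec, nsec) == 0)
        return 0;
    if (errno != ENOSYS)
        return -1;              /* treated as a real error: no fallback */
    return legacy(sec, nsec);
}

/* Fake: a kernel that does not implement the preferred call. */
int fake_not_implemented(long long *sec, long long *nsec)
{
    (void)sec; (void)nsec;
    errno = ENOSYS;
    return -1;
}

/* Fake: a kernel where the number is taken by an unrelated call. */
int fake_collision(long long *sec, long long *nsec)
{
    (void)sec; (void)nsec;
    errno = EINVAL;
    return -1;
}

/* Fake: a working legacy call that returns a fixed time. */
int fake_legacy(long long *sec, long long *nsec)
{
    *sec = 42;
    *nsec = 7;
    return 0;
}
```

With fake_not_implemented as the preferred call the function succeeds via fake_legacy; with fake_collision it returns -1 and never reaches the legacy call.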

It seems that the presence of the similarly numbered syscall causes the issue in glibc >= 2.31 and the random musl I tried, even when the entry for the syscall isn't in the vDSO

There is no requirement for a syscall to be in the vDSO, and the majority of syscalls are not (in practice the vDSO only contains a few of the most time-sensitive syscalls). If a syscall exists in the vDSO, then user-space can choose whether to invoke it via the vDSO or the ordinary syscall mechanism. If a syscall does not exist in the vDSO, then user-space is expected to invoke it via the ordinary syscall mechanism (potentially with a fallback on ENOSYS).

Moving to an older glibc would also fix the steamcmd side of things

steamcmd does not ship with its own glibc (it can't, even if we wanted it to, because private symbols in glibc are tightly coupled to the corresponding runtime linker /lib*/ld-*.so) so the choice of glibc is determined by the OS or container that you run it on. In the case of this issue report, the newer glibc is provided by a Debian-11-based container, and the older glibc is provided by a Debian-10-based container.

This doesn't seem like it is really a Steam issue at all: the issue is that Synology's kernel is not suitable for running containers based on a newer version of glibc (for example Debian >= 11), or a newer version of musl. If Synology advertises the ability to run arbitrary Linux containers as a selling point for their systems, then I would suggest that Synology users should report their kernel's incompatibility with modern glibc versions to whatever technical support contact they provide, because this is going to affect any modern container that you want to run (not just Steam).

The oldest Linux kernel with security support on kernel.org is currently 4.9, so a kernel based on 3.10.105 is likely to have multiple unfixed security vulnerabilities. The 3.10.x branch started in 2013, was an LTS branch maintained for several years, but reached end-of-life in 2017. Again, this is something for Synology to address.

@lloesche, if you're the maintainer of this Valheim-dedicated-server container, you will have to decide which is more important to you: being able to run on Synology's fork of Linux, or using a modern version of Debian. If the ability to run your container on Synology is important to you, you will have to either stick to Debian 10, or have a Debian-11-based container for standard Linux systems and a separate Debian-10-based container for Synology systems. Conversely, if you want to require Debian 11, then you might want to document your container as being unsuitable for use with Synology's fork of Linux.

@opello

opello commented Jan 3, 2023

The rest of this comment refers only to the problems seen on Synology systems. The short version is that I think these will have to be "won't fix" from Steam's point of view.

Ack. Makes complete sense.

The non-Synology problem was encountered again in lloesche/valheim-server-docker #531, and hopefully more information and a simple steamcmd invocation are forthcoming.

As for the rest of the manifesto ...

Sorry, what you have there is not Linux, but instead Synology's incompatible fork of Linux.

You are preaching to the choir. I was trying to find a way forward given a sea of constraints and understand the complex interaction of the various pieces for which I am not the originator.

There is no requirement for a syscall to be in the vDSO

I do understand this. The commit I referenced, as I read it, seems to suggest the only time clock_gettime64 would be available would be in vDSO-supporting kernels. Furthermore, if those kernels support the vDSO, they also include clock_gettime64 in that vDSO (starting in v3.15). Such a test would have also avoided the Synology problem, but may not have been appropriate. I don't have the context to say for sure, and didn't really get much of an answer on the libc-help mailing list.

Moving to an older glibc would also fix the steamcmd side of things

steamcmd does not ship with its own glibc

Of course. This sentence was from a paragraph dedicated to workarounds in the Docker image environment used to reproduce the Synology problem. However, musl supports static linking, and certainly presents an opportunity to ship your own libc.

This doesn't seem like it is really a Steam issue at all

💯

If the ability to run your container on Synology is important to you, you will have to either stick to Debian 10, or have a Debian-11-based container for standard Linux systems and a separate Debian-10-based container for Synology systems.

And you can see the solution here if you like. But it's pretty ugly. I certainly don't feel "good" about it insofar as execution environment cleanliness is concerned. But, given the constraints, I didn't see a much better way. Proof of the pudding being in the eating and all.
