os/signal: notify should filter out runtime-generated SIGURG #37942
Comments
Right. It was an explicit decision to pass these through to the program, in case the program is using SIGURG. Unfortunately, the runtime can't tell if a signal is "runtime generated" and thus can't filter it out. Because the kernel combines signals, it's not even well-specified what it would mean for a received signal to be runtime-generated because the kernel could have combined a runtime-generated signal with a signal from another source. @changkun, I don't really understand your program. First, I don't understand the need for |
@minux I don't see how it is possible even in principle for os/signal to filter out signals generated from the runtime package. Do you have any suggestions as to how that could be done? |
|
That would not work if the kernel combines signals with different |
As far as I can tell, the Linux 5.3.12 kernel does combine signals with a different if (legacy_queue(pending, sig))
goto ret; and static inline bool legacy_queue(struct sigpending *signals, int sig)
{
return (sig < SIGRTMIN) && sigismember(&signals->signal, sig);
} where |
Yeah, it's very difficult to filter out runtime-generated SIGURG without
also occasionally remove a real SIGURG.
I guess, a slightly better solutions is to make sure the runtime only uses
SIGURG when it's necessary.
My program can tolerate occasional spurious SIGURG, but in the current
runtime implementation, it's occurring too frequently.
(Just like the example program I posted, it contains enough preempt points
that forced preemption shouldn't be necessary.)
Perhaps we can repurpose this issue to "runtime: only use forced preemption
when necessary"? Thoughts?
|
Probably a Go1.14 issue; see golang/go#37942
I found a lot of SIGURGs will timeout the runtime/regex testes of mips64le. The origin test of runtime only takes 43s to finish, now it's 600s.
|
I've considered delaying the first attempt to use signals until some time after we've tried cooperative preemption and it's failed. Most of the time cooperative preemption responds very quickly. I'm not sure what to set that threshold to, though, since every millisecond of delay there is a millisecond added to STW if there's anything not cooperating. |
Could you also provide a build option to enable (or never enable) preemptive? Runtime env flags would presumably override that... |
That option already exists ( |
Um, I meant a compiler flag to set the default scheduler, which the env var could override. |
…14 in git env As seen in trouble shooting go-gitea#11032 the new feature of Go 1.14 is causing several second delays in startup in certain situations. Debugging shows it spending several seconds handling SIGURG commands during init: ``` 6922:04:51.984234 trace init() ./modules/queue/unique_queue_wrapped.go remote: ) = 69 <0.000012> remote: [pid 15984] 22:04:51 write(1, "\ttime taken: 236.761\302\265s\n\n", 25 time taken: 236.761µs remote: remote: ) = 25 <0.000011> remote: [pid 15984] 22:04:51 --- SIGURG {si_signo=SIGURG, si_code=SI_TKILL, si_pid=15984, si_uid=0} --- remote: [pid 15984] 22:04:52 --- SIGURG {si_signo=SIGURG, si_code=SI_TKILL, si_pid=15984, si_uid=0} --- remote: [pid 15984] 22:04:52 --- SIGURG {si_signo=SIGURG, si_code=SI_TKILL, si_pid=15984, si_uid=0} --- ``` This causes up to 20 seconds added to a push in some cases as it happens for each call of the gitea hook command. This is likely the cause of go-gitea#10661 as well and would start to effect users once we release 1.12 which would be the first release compiled with Go 1.14. I suspect this is just a slight issue with the upstream implementatation as there have been a few very similar bugs fixed and reported: golang/go#37741 golang/go#37942 We should revisit this in the future and see if a newer version of Go has solved it, but for now disable this option in the environment that gitea hook runs in to avoid it.
…14 in git env (#11237) As seen in trouble shooting #11032 the new feature of Go 1.14 is causing several second delays in startup in certain situations. Debugging shows it spending several seconds handling SIGURG commands during init: ``` 6922:04:51.984234 trace init() ./modules/queue/unique_queue_wrapped.go remote: ) = 69 <0.000012> remote: [pid 15984] 22:04:51 write(1, "\ttime taken: 236.761\302\265s\n\n", 25 time taken: 236.761µs remote: remote: ) = 25 <0.000011> remote: [pid 15984] 22:04:51 --- SIGURG {si_signo=SIGURG, si_code=SI_TKILL, si_pid=15984, si_uid=0} --- remote: [pid 15984] 22:04:52 --- SIGURG {si_signo=SIGURG, si_code=SI_TKILL, si_pid=15984, si_uid=0} --- remote: [pid 15984] 22:04:52 --- SIGURG {si_signo=SIGURG, si_code=SI_TKILL, si_pid=15984, si_uid=0} --- ``` This causes up to 20 seconds added to a push in some cases as it happens for each call of the gitea hook command. This is likely the cause of #10661 as well and would start to effect users once we release 1.12 which would be the first release compiled with Go 1.14. I suspect this is just a slight issue with the upstream implementatation as there have been a few very similar bugs fixed and reported: golang/go#37741 golang/go#37942 We should revisit this in the future and see if a newer version of Go has solved it, but for now disable this option in the environment that gitea hook runs in to avoid it.
…14 in git env (go-gitea#11237) As seen in trouble shooting go-gitea#11032 the new feature of Go 1.14 is causing several second delays in startup in certain situations. Debugging shows it spending several seconds handling SIGURG commands during init: ``` 6922:04:51.984234 trace init() ./modules/queue/unique_queue_wrapped.go remote: ) = 69 <0.000012> remote: [pid 15984] 22:04:51 write(1, "\ttime taken: 236.761\302\265s\n\n", 25 time taken: 236.761µs remote: remote: ) = 25 <0.000011> remote: [pid 15984] 22:04:51 --- SIGURG {si_signo=SIGURG, si_code=SI_TKILL, si_pid=15984, si_uid=0} --- remote: [pid 15984] 22:04:52 --- SIGURG {si_signo=SIGURG, si_code=SI_TKILL, si_pid=15984, si_uid=0} --- remote: [pid 15984] 22:04:52 --- SIGURG {si_signo=SIGURG, si_code=SI_TKILL, si_pid=15984, si_uid=0} --- ``` This causes up to 20 seconds added to a push in some cases as it happens for each call of the gitea hook command. This is likely the cause of go-gitea#10661 as well and would start to effect users once we release 1.12 which would be the first release compiled with Go 1.14. I suspect this is just a slight issue with the upstream implementatation as there have been a few very similar bugs fixed and reported: golang/go#37741 golang/go#37942 We should revisit this in the future and see if a newer version of Go has solved it, but for now disable this option in the environment that gitea hook runs in to avoid it.
When Docker/containerd binaries are compiled with Go 1.15 the containers generate many signal 23 (SIGURG) events which flood monitoring systems: kubernetes/kops#10388 The SIGURG signal does not kill the process but is generated by Go runtime scheduling: https://go.googlesource.com/proposal/+/master/design/24543-non-cooperative-preemption.md) Because the Go runtime does not know if the process expects external SIGURG signals, the signal is not filtered out but reported to the process: golang/go#37942 The process has to filter this signal out itself before forwarding it to, e.g,. children processes or logs. This change was introduced with the Go 1.15 update (actually Go 1.14 but Flatcar skipped that for Stable), however, while containerd has some workarounds in place, e.g., in containerd/containerd#4532 but there are still areas where the signal is not handled correctly. Until this is the case, downgrade to use the Go 1.13 compiler for Docker/containerd binaries. See kinvolk/Flatcar#315
When Docker/containerd binaries are compiled with Go 1.15 the containers generate many signal 23 (SIGURG) events which flood monitoring systems: kubernetes/kops#10388 The SIGURG signal does not kill the process but is generated by Go runtime scheduling: https://go.googlesource.com/proposal/+/master/design/24543-non-cooperative-preemption.md) Because the Go runtime does not know if the process expects external SIGURG signals, the signal is not filtered out but reported to the process: golang/go#37942 The process has to filter this signal out itself before forwarding it to, e.g,. children processes or logs. This change was introduced with the Go 1.15 update (actually Go 1.14 but Flatcar skipped that for Stable), however, while containerd has some workarounds in place, e.g., in containerd/containerd#4532 but there are still areas where the signal is not handled correctly. Until this is the case, downgrade to use the Go 1.13 compiler for Docker/containerd binaries. See kinvolk/Flatcar#315
When Docker/containerd binaries are compiled with Go 1.15 the containers generate many signal 23 (SIGURG) events which flood monitoring systems: kubernetes/kops#10388 The SIGURG signal does not kill the process but is generated by Go runtime scheduling: https://go.googlesource.com/proposal/+/master/design/24543-non-cooperative-preemption.md) Because the Go runtime does not know if the process expects external SIGURG signals, the signal is not filtered out but reported to the process: golang/go#37942 The process has to filter this signal out itself before forwarding it to, e.g,. children processes or logs. This change was introduced with the Go 1.15 update (actually Go 1.14 but Flatcar skipped that for Stable), however, while containerd has some workarounds in place, e.g., in containerd/containerd#4532 but there are still areas where the signal is not handled correctly. Until this is the case, downgrade to use the Go 1.13 compiler for Docker/containerd binaries. See kinvolk/Flatcar#315
In go1.14+, SIGURG is used by the runtime to handle preemtable system calls. In practice this signal caught *frequently*. For reference: https://go.googlesource.com/proposal/+/master/design/24543-non-cooperative-preemption.md golang/go#37942 Signed-off-by: Brian Goff <cpuguy83@gmail.com>
In go1.14+, SIGURG is used by the runtime to handle preemtable system calls. In practice this signal caught *frequently*. For reference: https://go.googlesource.com/proposal/+/master/design/24543-non-cooperative-preemption.md golang/go#37942 Signed-off-by: Brian Goff <cpuguy83@gmail.com>
In go1.14+, SIGURG is used by the runtime to handle preemtable system calls. In practice this signal caught *frequently*. For reference: https://go.googlesource.com/proposal/+/master/design/24543-non-cooperative-preemption.md golang/go#37942 Signed-off-by: Brian Goff <cpuguy83@gmail.com>
In go1.14+, SIGURG is used by the runtime to handle preemtable system calls. In practice this signal caught *frequently*. For reference: https://go.googlesource.com/proposal/+/master/design/24543-non-cooperative-preemption.md golang/go#37942 Signed-off-by: Brian Goff <cpuguy83@gmail.com>
In go1.14+, SIGURG is used by the runtime to handle preemtable system calls. In practice this signal caught *frequently*. For reference: https://go.googlesource.com/proposal/+/master/design/24543-non-cooperative-preemption.md golang/go#37942 Signed-off-by: Brian Goff <cpuguy83@gmail.com>
In go1.14+, SIGURG is used by the runtime to handle preemtable system calls. In practice this signal caught *frequently*. For reference: https://go.googlesource.com/proposal/+/master/design/24543-non-cooperative-preemption.md golang/go#37942 Signed-off-by: Brian Goff <cpuguy83@gmail.com> Upstream-commit: fff164c22e8dc904291fecb62307312fd4ca153e Component: cli
When Docker/containerd binaries are compiled with Go 1.15 the containers generate many signal 23 (SIGURG) events which flood monitoring systems: kubernetes/kops#10388 The SIGURG signal does not kill the process but is generated by Go runtime scheduling: https://go.googlesource.com/proposal/+/master/design/24543-non-cooperative-preemption.md) Because the Go runtime does not know if the process expects external SIGURG signals, the signal is not filtered out but reported to the process: golang/go#37942 The process has to filter this signal out itself before forwarding it to, e.g,. children processes or logs. This change was introduced with the Go 1.15 update (actually Go 1.14 but Flatcar skipped that for Stable), however, while containerd has some workarounds in place, e.g., in containerd/containerd#4532 but there are still areas where the signal is not handled correctly. Until this is the case, downgrade to use the Go 1.13 compiler for Docker/containerd binaries. See kinvolk/Flatcar#315
When Docker/containerd binaries are compiled with Go 1.15 the containers generate many signal 23 (SIGURG) events which flood monitoring systems: kubernetes/kops#10388 The SIGURG signal does not kill the process but is generated by Go runtime scheduling: https://go.googlesource.com/proposal/+/master/design/24543-non-cooperative-preemption.md) Because the Go runtime does not know if the process expects external SIGURG signals, the signal is not filtered out but reported to the process: golang/go#37942 The process has to filter this signal out itself before forwarding it to, e.g,. children processes or logs. This change was introduced with the Go 1.15 update (actually Go 1.14 but Flatcar skipped that for Stable), however, while containerd has some workarounds in place, e.g., in containerd/containerd#4532 but there are still areas where the signal is not handled correctly. Until this is the case, downgrade to use the Go 1.13 compiler for Docker/containerd binaries. See kinvolk/Flatcar#315
When Docker/containerd binaries are compiled with Go 1.15 the containers generate many signal 23 (SIGURG) events which flood monitoring systems: kubernetes/kops#10388 The SIGURG signal does not kill the process but is generated by Go runtime scheduling: https://go.googlesource.com/proposal/+/master/design/24543-non-cooperative-preemption.md) Because the Go runtime does not know if the process expects external SIGURG signals, the signal is not filtered out but reported to the process: golang/go#37942 The process has to filter this signal out itself before forwarding it to, e.g,. children processes or logs. This change was introduced with the Go 1.15 update (actually Go 1.14 but Flatcar skipped that for Stable), however, while containerd has some workarounds in place, e.g., in containerd/containerd#4532 but there are still areas where the signal is not handled correctly. Until this is the case, downgrade to use the Go 1.13 compiler for Docker/containerd binaries. See kinvolk/Flatcar#315
This program, when running under go 1.14, will show a huge number of
SIGURG
are received even though none should be (no network IO, no explicit kills).https://play.golang.org/p/x7hFjPZnrg5
https://go.googlesource.com/proposal/+/master/design/24543-non-cooperative-preemption.md changes to use SIGURG to preempt goroutines, which is fine, but I think
os/signal.Notify
should filter out those runtime generated SIGURGs.The reason I found this is that I have a program that happens to use
SIGURG
as a custom signal likeSIGUSR1
, which works fine with previous Go releases. I understand that it should probably useSIGUSR1
, but still I thinkos/signal
should hide any signals generated by the runtime as it's irrelevant for the user and an implementation detail./cc @aclements thoughts?
The text was updated successfully, but these errors were encountered: