Killed by SIGPIPE #2350

Closed
lfam opened this Issue Oct 2, 2015 · 11 comments

@lfam

lfam commented Oct 2, 2015

Sometimes I find that Syncthing has died on one of my machines due to SIGPIPE. I notice this every few days. I haven't figured out how to reproduce the problem.

I am running Syncthing as a systemd user service on Debian Jessie. The architecture is armv7. The systemd journal is on-disk (/var/log/journal).

I'm wondering if the problem is related to this bug report on systemd from ~1 year ago. Go is called out specifically in relation to journald.
https://bugs.freedesktop.org/show_bug.cgi?id=84923

In the meantime, I will have to add this as a restart condition to the systemd service files but I think that shouldn't be necessary in the long run.
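
For reference, a hedged sketch of what such a restart condition might look like. systemd's documentation for `Restart=` lists SIGPIPE among the signals it treats as a clean exit, which would explain why `Restart=on-failure` alone does not bring the service back. The fragment assumes that `RestartForceExitStatus=` accepts signal names in the same way the documentation describes for `RestartPreventExitStatus=`. It has not been tested against the systemd version in use here.

```ini
[Service]
# Existing forced-restart exit codes, plus death by SIGPIPE
# (assumption: signal names are accepted here, see above).
RestartForceExitStatus=3 4 SIGPIPE
```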

Here is how I confirm the problem after noticing my data wasn't synced:

$ sysu status syncthing
● syncthing.service - Syncthing service for 
   Loaded: loaded (/etc/systemd/user/syncthing.service; enabled)
   Active: inactive (dead) since Thu 2015-10-01 19:33:06 EDT; 8h ago
     Docs: http://docs.syncthing.net/
  Process: 5814 ExecStart=/usr/bin/syncthing -no-browser -logflags=0 (code=killed, signal=PIPE)
 Main PID: 5814 (code=killed, signal=PIPE)

Oct 01 19:31:57 hostname systemd[710]: Started Syncthing service for .
Oct 01 19:32:07 hostname syncthing[5814]: [GRH4V] INFO: syncthing v0.11.25 "Aluminium Ant" (go1.4.2 linux-arm default) unknown-user@syncthing-builder 2015-09-13 09:46:17 UTC
Oct 01 19:32:07 hostname syncthing[5814]: [GRH4V] INFO: My ID: GRH4VX4-XTR55BZ-OTX5NT3-T52VFHL-AIGB2NJ-KOAHAG4-44EURGI-MDGNCQR
Oct 01 19:32:07 hostname syncthing[5814]: [GRH4V] INFO: Database block cache capacity 8192 KiB

Here is some info about my system. Yes, that is a weird device-specific kernel (linux-sun7i on the Cubieboard A20).

$ uname -a
Linux hostname 3.4.106-cubieboard2 #6 SMP PREEMPT Fri Apr 17 19:44:23 CEST 2015 armv7l GNU/Linux
$ lscpu
Architecture:          armv7l
Byte Order:            Little Endian
CPU(s):                2
On-line CPU(s) list:   0,1
Thread(s) per core:    1
Core(s) per socket:    2
Socket(s):             1
CPU max MHz:           1008.0000
CPU min MHz:           60.0000

Here are the systemd user service files of Syncthing and syncthing-inotify. I'm presenting them here as patches on the upstream service files. The big change is that the relationship between Syncthing and syncthing-inotify is completely contained in syncthing-inotify.service. Also, I see that my syncthing-inotify.service is out of date and needs to take exit status 3 into account (a good reason to put syncthing-inotify into apt.syncthing.net. What new systems with apt / dpkg don't use systemd?).

--- /home/user/work/syncthing/etc/linux-systemd/user/syncthing.service  2015-10-02 14:37:30.655277574 -0400
+++ /etc/systemd/user/syncthing.service 2015-07-08 14:59:01.327643525 -0400
@@ -1,12 +1,16 @@
 [Unit]
-Description=Syncthing - Open Source Continuous File Synchronization
+Description=Syncthing service for %i
 Documentation=http://docs.syncthing.net/
 After=network.target
-Wants=syncthing-inotify.service

 [Service]
+User=%i
 Environment=STNORESTART=yes
+EnvironmentFile=-%h/.config/syncthing/environment
 ExecStart=/usr/bin/syncthing -no-browser -logflags=0
+Nice=19
+IOSchedulingClass=2
+IOSchedulingPriority=7
 Restart=on-failure
 SuccessExitStatus=2 3 4
 RestartForceExitStatus=3 4
--- /home/user/work/syncthing-inotify/etc/linux-systemd/user/syncthing-inotify.service  2015-10-02 14:40:39.091405241 -0400
+++ /home/user/tmp/syncthing-inotify.service    2015-10-02 14:41:59.511459727 -0400
@@ -1,16 +1,14 @@
 [Unit]
 Description=Syncthing Inotify File Watcher
 Documentation=https://github.com/syncthing/syncthing-inotify/blob/master/README.md
-After=network.target syncthing.service
-Requires=syncthing.service
+PartOf=syncthing.service

 [Service]
-ExecStart=/usr/bin/syncthing-inotify -logflags=0
+ExecStart=/usr/local/bin/syncthing-inotify
 SuccessExitStatus=2
-RestartForceExitStatus=3
 Restart=on-failure
-ProtectSystem=full
-ProtectHome=read-only
+; These don't work. Bugs in systemd?
+;ProtectSystem=full
+;ProtectHome=read-only

 [Install]
-WantedBy=default.target
+WantedBy=syncthing.service
@AudriusButkevicius

Member

AudriusButkevicius commented Oct 2, 2015

So given it's a systemd bug, what do you expect us to do?

@lfam

Author

lfam commented Oct 2, 2015

Oh, so it's definitely a systemd bug? In that case, the systemd service files could be updated to restart on this failure condition until systemd is fixed. Do you think that would be acceptable?

Or are you just asking? I gave you the information I had because I'm sure that many people are using systemd to run Syncthing. And if they are having the same problem, that sucks.

I'll try to find out if I am running systemd with the patches mentioned by Poettering.

@AudriusButkevicius

Member

AudriusButkevicius commented Oct 2, 2015

Well, we definitely don't install a SIGPIPE handler, so it's either systemd or Go missing something.

@calmh

Member

calmh commented Oct 2, 2015

Well, I've never seen a SIGPIPE death personally. The only pipe we could possibly write to is stdout, and since that's managed by systemd, there is some suspicion in that direction. There could be a weird interaction between Go and systemd, I guess (I haven't read the links yet). In any case, if this is a restart condition we can add, we should do so. Pull requests are welcome from whoever understands the issue well enough (I'm happily systemd-agnostic so far).
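
For context, Go's current `os/signal` documentation says that, unless the program asks to be notified of SIGPIPE, a write to a broken pipe on file descriptor 1 or 2 makes the program exit with SIGPIPE, while on other descriptors the write simply fails with EPIPE. The underlying POSIX mechanism can be sketched in a few lines of Python (chosen only for brevity; this is not Syncthing's code, and `write_to_broken_pipe` is a made-up name). With SIGPIPE ignored, the write fails with EPIPE instead of killing the process:

```python
import errno
import os
import signal


def write_to_broken_pipe():
    """Write to a pipe whose read end is closed, with SIGPIPE ignored.

    Returns the errno of the failed write (hypothetical helper, for illustration).
    """
    read_fd, write_fd = os.pipe()
    os.close(read_fd)  # no reader left: the pipe is now "broken"

    # Ignore SIGPIPE so the kernel reports the problem as an error instead of
    # terminating the process. CPython normally does this at startup already;
    # it is repeated here to make the assumption explicit.
    previous = signal.signal(signal.SIGPIPE, signal.SIG_IGN)
    try:
        os.write(write_fd, b"x")
    except OSError as exc:
        return exc.errno
    finally:
        os.close(write_fd)
        if previous is not None:
            signal.signal(signal.SIGPIPE, previous)
    return None


if __name__ == "__main__":
    # Compare against EPIPE rather than printing a platform-specific number.
    print(write_to_broken_pipe() == errno.EPIPE)
```

A program that leaves SIGPIPE at its default action never sees this error, because the signal terminates it first.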

@rumpelsepp

Member

rumpelsepp commented Oct 2, 2015

Since you have

+; These don't work. Bugs in systemd?
+;ProtectSystem=full
+;ProtectHome=read-only

in your service files, I guess you are running a really old systemd version. So I would try updating first, since these issues seem to be fixed nowadays.

@rumpelsepp

Member

rumpelsepp commented Oct 2, 2015

offtopic:

The big change is that the relationship between Syncthing and syncthing-inotify is completely contained in syncthing-inotify.service.

That is not really a nice idea. Since these are two separate applications, they should not be managed by one unit. This is solved nowadays by making syncthing-inotify.service an optional dependency of syncthing.service. That means both applications start when you start syncthing, but nothing breaks when you shut down syncthing-inotify. The other way round, syncthing-inotify requires syncthing and is shut down when syncthing is.

@lfam

Author

lfam commented Oct 2, 2015

Okay, regarding systemd bug 84923, the relevant fixes came out in systemd 219 and I am using 215.

I put syncthing in a pipeline by hand, and tried breaking the pipe by hand, and sometimes Syncthing dies and sometimes it keeps running. It seems to have more luck on amd64 than armv7.

@rumpelsepp
I'm not sure what you mean that this systemd arrangement is not a nice idea. If I start syncthing, systemd will try to start syncthing-inotify but will keep going if syncthing-inotify fails. If syncthing-inotify came up with syncthing, it will go down with it. But I can start and stop syncthing-inotify without starting or stopping syncthing.

The semantics make sense to me. Syncthing-inotify is useless on its own so it should be PartOf=syncthing.service but syncthing does not rely on it.

@lfam

Author

lfam commented Oct 2, 2015

And I actually had the filesystem protection directives (ProtectHome and ProtectSystem) fail on more recent systemd versions, too.

@rumpelsepp

Member

rumpelsepp commented Oct 2, 2015

Let's move the discussion about that service file stuff to the forum. :)

If the protection stuff causes problems, feel free to open another issue.

@kozec

Contributor

kozec commented Oct 3, 2015

I put syncthing in a pipeline by hand, and tried breaking the pipe by hand, and sometimes Syncthing dies and sometimes it keeps running. It seems to have more luck on amd64 than armv7.

SIGPIPE is sent only after a process tries to write to a pipe that has: 1. no reader left and 2. a full write buffer. So, as long as Syncthing doesn't output anything, it will keep running even with a broken pipe.
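
The last point (a silent process survives a broken pipe) can be sketched as follows. This is an illustrative Python script, not Syncthing's code. `CHILD` and `run_with_broken_stdout` are hypothetical names, and the child restores the default SIGPIPE action itself because CPython ignores the signal at startup. In this sketch, the first write after the reader has gone is what ends the process.

```python
import signal
import subprocess
import sys
import time

# Child program: stay silent for a while, then write to stdout once.
CHILD = """
import os, signal, time
signal.signal(signal.SIGPIPE, signal.SIG_DFL)  # default action: terminate
time.sleep(1.0)           # silent phase: the broken pipe goes unnoticed
os.write(1, b'hello\\n')  # writing to the broken pipe raises SIGPIPE
time.sleep(5.0)           # not reached if the signal terminates the child
"""


def run_with_broken_stdout():
    """Start the child, close the read end of its stdout, and observe it."""
    proc = subprocess.Popen([sys.executable, "-c", CHILD], stdout=subprocess.PIPE)
    proc.stdout.close()  # the reader goes away; the child's stdout is now broken
    time.sleep(0.3)
    alive_while_silent = proc.poll() is None  # child has not written yet
    returncode = proc.wait()
    return alive_while_silent, returncode


if __name__ == "__main__":
    alive, code = run_with_broken_stdout()
    print(alive, code == -signal.SIGPIPE)
```

A negative return code from `subprocess` means the child was terminated by the signal with that number, which corresponds to the `code=killed, signal=PIPE` line in the systemd status output.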

@lfam

Author

lfam commented Oct 4, 2015

Ah, the write buffer. Thanks for the information, @kozec.

@lfam lfam closed this Oct 4, 2015

@syncthing syncthing locked and limited conversation to collaborators Jun 16, 2017
