profile-sync-daemon fails to work properly after battery dies on laptop #178

Closed
JanLuca opened this issue Aug 5, 2016 · 21 comments

Comments

@JanLuca
Contributor

JanLuca commented Aug 5, 2016

Forwarded bug report from the Debian bug tracker: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=833135

Package: profile-sync-daemon
Version: 6.25-1

After a system crash (or blackout), profile-sync-daemon no longer works properly when using the overlay filesystem.
It does not sync any changes back anymore, mainly because of the following messages:

Aug  1 11:20:57 RMMbook profile-sync-daemon[5565]: ln: failed to create symbolic link '/home/rmm/.mozilla/firefox/rk7hh7a0.default/null': File exists
Aug  1 11:20:57 RMMbook profile-sync-daemon[5565]: mv: cannot overwrite directory '/home/rmm/.mozilla/firefox/rk7hh7a0.default-backup' with non-directory
Aug  1 11:21:04 RMMbook profile-sync-daemon[5565]: #033[01mfirefox resync successful#033[00m
Aug  1 11:21:04 RMMbook profile-sync-daemon[5565]: #033[01mfirefox unsync successful#033[00m

It says it is successful, but in reality it didn't sync the changes back. The mounts also still exist even after ending the daemon.
After deleting that null file, things seem to work again. I'm not sure what that "non-directory" message has to do with everything.

-- System Information:
Debian Release: stretch/sid
APT prefers testing
APT policy: (900, 'testing'), (800, 'unstable')
Architecture: amd64 (x86_64)
Foreign Architectures: i386

Kernel: Linux 4.6.0-1-amd64 (SMP w/4 CPU cores)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash
Init: systemd (via /run/systemd/system)

Versions of packages profile-sync-daemon depends on:
pn rsync

profile-sync-daemon recommends no packages.

Versions of packages profile-sync-daemon suggests:
ii libpam-systemd 230-7
ii systemd-sysv 230-7

-- no debconf information

@graysky2
Owner

graysky2 commented Aug 5, 2016

To the bug reporter - Please post the output of psd p, then ensure that psd.service is in a stopped state and verify that it has left no overlayfs mounts behind (mount | grep overlay). If there are none, edit ~/.config/psd/psd.conf and temporarily disable overlayfs. Does psd start then?
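
The mount check requested above can be scripted. The following is a minimal sketch and not part of psd: list_overlay_mounts is a hypothetical helper, and it assumes the usual "SOURCE on MOUNTPOINT type FSTYPE (OPTIONS)" layout of mount output with no whitespace in the mount point.

```shell
#!/bin/sh
# Hypothetical helper (not part of psd): print the mount point of every
# overlay mount found in `mount`-style output read from stdin.
# Assumes lines look like "SRC on MOUNTPOINT type FSTYPE (OPTIONS)"
# and that mount points contain no whitespace.
list_overlay_mounts() {
    awk '$2 == "on" && $4 == "type" && $5 == "overlay" { print $3 }'
}

# Typical use once psd.service has been stopped:
#   systemctl --user is-active psd.service   # expected to print "inactive"
#   mount | list_overlay_mounts              # expected to print nothing
```

Any path printed by the second command is an overlay mount that is still present after the service stopped.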

@seankhl

seankhl commented Aug 5, 2016

Same issue, also Debian (testing/unstable).

profile-sync-daemon[26601]: mv: cannot move '/home/sean/.config/google-chrome-unstable' to '/home/sean/.config/google-chrome-unstable-backup': Directory not empty
profile-sync-daemon[26601]: ln: failed to create symbolic link '/home/sean/.config/google-chrome-unstable/null': File exists

psd p reported as active, identical to working output. No overlayfs mounts present on crashed browsers (other browsers worked fine). Disabling overlayfs did not help.

Removing the backup dir fixes the problem. (I removed the ovfs dir too.)

@graysky2
Owner

graysky2 commented Aug 5, 2016

Can you outline the steps for me to reproduce on my system? I can spin up a debian VM if needed.

@seankhl

seankhl commented Aug 5, 2016

I'm not 100% sure what triggered it. My system went into hibernate because I forgot to plug my charger in, haha. So my whole system hibernated, then I brought it back and the browser seemed fine.

Sometime later the browser itself crashed too. I think I restarted the browser and hibernate had done something dumb, so chrome claimed it had crashed when it opened back up. I noticed my hard disk wasn't going into standby (the whole reason I use psd and asd -- my hdd sucks and gets hot when it's constantly spinning, so I want it to spin down the vast majority of the time). So I checked the /run/user/1000/google* directory, and it was empty. I then performed the above checks and got the above results.

Sorry I can't be of more help.

@graysky2
Owner

graysky2 commented Aug 6, 2016

Doesn't sound like a psd issue... I put sid on a VM and simulated power failures by closing the VM without shutting it down. I just got the expected recovery snapshots and a functional browser profile. I have never used hibernate, though (not only in this VM test, but ever).

@yennor

yennor commented Aug 8, 2016

I'm the original bug reporter.
psd always starts up correctly and also works correctly. The only problem is that when stopping the service - manually or by shutting down the system - the newly written data doesn't get synced back.
The null files usually (not always) appear after a system crash. I also managed to get them by manually stopping and restarting psd several times. I can look further into this later, but as it seems now, the "profile-sync-daemon[5657]: mv: cannot overwrite directory '/home/rmm/.mozilla/firefox/a38n37oj.default-backup' with non-directory" error also stops psd from syncing the data back. (I originally looked into the problem a few months ago, and then there were only the null files. But I never had time to report the problem...).
Anyway, the way it looks now, in a normal run, without any system crash or anything:

psd p:

Profile-sync-daemon v6.25 on Debian GNU/Linux stretch/sid

Systemd service is currently active.
Systemd resync-timer is currently active.
Overlayfs v23 is currently active.

Psd will manage the following per /home/rmm/.config/psd/.psd.conf:

browser/psname: chromium/chromium
owner/group id: rmm/1000
sync target: /home/rmm/.config/chromium
tmpfs dir: /run/user/1000/rmm-chromium
profile size: 33M
overlayfs size:
recovery dirs: none

browser/psname: firefox/firefox
owner/group id: rmm/1000
sync target: /home/rmm/.mozilla/firefox/a38n37oj.default
tmpfs dir: /run/user/1000/rmm-firefox-a38n37oj.default
profile size: 29M
overlayfs size:
recovery dirs: none

Then when I stop the psd service.

Aug 8 13:16:07 RMMbook systemd[4300]: Stopped Timer for profile-sync-daemon - 1Hour.
Aug 8 13:16:07 RMMbook systemd[4300]: Stopping Profile-sync-daemon...
Aug 8 13:16:07 RMMbook profile-sync-daemon[5657]: mv: cannot overwrite directory '/home/rmm/.config/chromium-backup' with non-directory
Aug 8 13:16:07 RMMbook profile-sync-daemon[5657]: mv: cannot overwrite directory '/home/rmm/.mozilla/firefox/a38n37oj.default-backup' with non-directory
Aug 8 13:16:07 RMMbook profile-sync-daemon[5657]: #33[01mchromium resync successful#033[00m
Aug 8 13:16:08 RMMbook profile-sync-daemon[5657]: #33[01mfirefox resync successful#033[00m
Aug 8 13:16:08 RMMbook profile-sync-daemon[5657]: #33[01mchromium unsync successful#033[00m
Aug 8 13:16:08 RMMbook profile-sync-daemon[5657]: #33[01mfirefox unsync successful#033[00m
Aug 8 13:16:08 RMMbook systemd[4300]: Stopped Profile-sync-daemon.

The psd service is stopped, but the mounts still exist, and the data was not synced back:

overlaid on /dev/shm/rmm-chromium type overlay (rw,nosuid,nodev,relatime,lowerdir=/home/rmm/.config/chromium-backup,upperdir=/dev/shm/rmm-chromium-rw,workdir=/dev/shm/.rmm-chromium)
overlaid on /dev/shm/rmm-firefox-a38n37oj.default type overlay (rw,nosuid,nodev,relatime,lowerdir=/home/rmm/.mozilla/firefox/a38n37oj.default-backup,upperdir=/dev/shm/rmm-firefox-a38n37oj.default-rw,workdir=/dev/shm/.rmm-firefox-a38n37oj.default)

The directory "lowerdir=/home/rmm/.mozilla/firefox/a38n37oj.default-backup" exists while psd is running, but no longer exists after stopping psd, even though the mounts still exist.

Then I unmount those mounts by hand.
Then when I look into /var/run/shm I get the directories:
rmm-chromium
rmm-chromium-rw
rmm-firefox-a38n37oj.default
rmm-firefox-a38n37oj.default-rw

The *-rw directories contain the changes made by the browsers, and also a file called "null", which is a symlink to /dev/null (that seems to be one way to get those "null" files; I'm not sure yet if there are others). The other directories are empty.
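
A quick way to look for such leftover "null" links is sketched below. This is not part of psd: find_null_links is a hypothetical helper, and it only looks for a symlink named "null" directly inside each directory it is given.

```shell
#!/bin/sh
# Hypothetical helper (not part of psd): print every symlink named "null"
# directly inside the given directories whose target is /dev/null.
find_null_links() {
    for dir in "$@"; do
        link="$dir/null"
        # -L tests the link itself, so it is true even though the target
        # is a device node rather than a regular file.
        if [ -L "$link" ] && [ "$(readlink "$link")" = "/dev/null" ]; then
            printf '%s\n' "$link"
        fi
    done
}

# Example (paths taken from the report above):
#   find_null_links /dev/shm/rmm-chromium-rw /dev/shm/rmm-firefox-a38n37oj.default-rw
```

The helper only reports the links; whether removing them is safe depends on whether psd is mid-sync at that moment.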

@yennor

yennor commented Aug 8, 2016

I disabled overlayfs. The result is as follows:

After stopping the psd service:
Aug 8 13:43:44 RMMbook systemd[4297]: Stopping Profile-sync-daemon...
Aug 8 13:43:44 RMMbook systemd[4297]: Stopped Timer for profile-sync-daemon - 1Hour.
Aug 8 13:43:44 RMMbook profile-sync-daemon[5447]: mv: cannot overwrite directory '/home/rmm/.config/chromium-backup' with non-directory
Aug 8 13:43:44 RMMbook profile-sync-daemon[5447]: mv: cannot overwrite directory '/home/rmm/.mozilla/firefox/a38n37oj.default-backup' with non-directory
Aug 8 13:43:44 RMMbook profile-sync-daemon[5447]: #33[01mchromium resync successful#033[00m
Aug 8 13:43:44 RMMbook profile-sync-daemon[5447]: #33[01mfirefox resync successful#033[00m
Aug 8 13:43:45 RMMbook profile-sync-daemon[5447]: #33[01mchromium unsync successful#033[00m
Aug 8 13:43:45 RMMbook profile-sync-daemon[5447]: #33[01mfirefox unsync successful#033[00m
Aug 8 13:43:45 RMMbook systemd[4297]: Stopped Profile-sync-daemon.

So I get the same messages, BUT the data is synced back. There is now also a symlink "null" pointing to /dev/null in /home/rmm/.mozilla/firefox/a38n37oj.default and in the corresponding chromium directory.
The directories /var/run/shm/rmm-chromium and /var/run/shm/rmm-firefox-a38n37oj.default
also contain the "null" file and still seem to contain the whole copy of the browsers' data.

@graysky2
Owner

graysky2 commented Aug 8, 2016

This is very unusual... do you have any type of non-standard setup?

Psd is designed to prevent the browser itself from starting while the sync is in progress, so it creates that link to /dev/null (https://github.com/graysky2/profile-sync-daemon/blob/master/common/profile-sync-daemon.in#L432)

Once the sync finishes, it unlinks the /dev/null and links it to the tmpfs target (https://github.com/graysky2/profile-sync-daemon/blob/master/common/profile-sync-daemon.in#L458)

For some reason, your setup seems to be breaking down between there... I need to be able to reproduce it to figure out what's going on and thus far, I cannot.
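
One way a stray "null" entry can arise from that locking step is plain ln behaviour: ln -s /dev/null PATH creates the link at PATH only when PATH does not exist yet. If PATH is still an existing directory, ln places the link inside it under the target's basename, i.e. PATH/null, and a second attempt then fails with "File exists". The sketch below reproduces only this ln behaviour in a scratch directory. Whether psd reaches that state in the same way is an assumption, and all paths and variable names are made up.

```shell
#!/bin/sh
# Scratch directory standing in for a browser profile (hypothetical paths).
scratch=$(mktemp -d)
profile="$scratch/profile.default"
mkdir "$profile"

# The profile directory still exists, so the link is created INSIDE it
# as "$profile/null" instead of replacing the directory.
ln -s /dev/null "$profile"
first_target=$(readlink "$profile/null")

# A second attempt collides with the link left behind by the first one.
if ln -s /dev/null "$profile" 2>/dev/null; then
    second_attempt=ok
else
    second_attempt=failed
fi

echo "link inside profile -> $first_target"
echo "second ln attempt: $second_attempt"
rm -rf "$scratch"
```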

@yennor

yennor commented Aug 9, 2016

I've got a completely standard setup. Debian Testing. I deleted the
complete firefox profile directories, just to be sure there is nothing
wrong in there. But the same result.

I guess the main problem is the message
"mv: cannot overwrite directory
'/home/rmm/.mozilla/firefox/a38n37oj.default-backup' with non-directory"
But I've really got no idea what it means. Probably some error when
using the mv command? Like trying to move a file or a symlink, some
problems with globbing? But that should behave the same on most systems?
Tell me if there is anything I could try out. Unfortunately right now, I
don't have time to debug the script myself.
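
Regarding the mv message quoted above: GNU mv prints "cannot overwrite directory ... with non-directory" when it is told to treat the destination as the final name (-T, --no-target-directory), the source is not a directory (a symlink, for example) and the destination is an existing directory. The sketch below reproduces that in a scratch directory. It assumes GNU coreutils, and it is only a guess that psd's backup step runs into this case; the paths are made up.

```shell
#!/bin/sh
scratch=$(mktemp -d)
profile="$scratch/profile.default"
backup="$profile-backup"

# State assumed for illustration: the profile path is a symlink (a
# non-directory) and a backup directory is left over from an earlier run.
ln -s /dev/null "$profile"
mkdir "$backup"

# -T (GNU mv) makes mv rename onto $backup itself instead of moving the
# source into it, so the rename onto an existing directory is refused.
if mv -T "$profile" "$backup" 2>"$scratch/err"; then
    mv_result=moved
else
    mv_result=refused
fi

echo "mv result: $mv_result"
cat "$scratch/err"
rm -rf "$scratch"
```

If this is what happens, the symlink stays where it was and the old backup directory is left untouched, which would fit the mounts and data described earlier in the thread.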

@graysky2
Owner

Again, just stopping the service gives you this error? I will again try to repeat in a VM running stretch. Please provide me with the output of psd p and your psd.conf

@yennor

yennor commented Aug 15, 2016

Yes, just stopping the service gives the message.

psd p:

Profile-sync-daemon v6.25 on Debian GNU/Linux stretch/sid

Systemd service is currently active.
Systemd resync-timer is currently active.
Overlayfs v23 is currently active.

Psd will manage the following per /home/rmm/.config/psd/.psd.conf:

browser/psname: chromium/chromium
owner/group id: rmm/1000
sync target: /home/rmm/.config/chromium
tmpfs dir: /run/user/1000/rmm-chromium
profile size: 33M
overlayfs size:
recovery dirs: 1 <- delete with the c option
dir path/size: /home/rmm/.config/chromium-backup-crashrecovery-20160810_094003 (32M)

browser/psname: firefox/firefox
owner/group id: rmm/1000
sync target: /home/rmm/.mozilla/firefox/a38n37oj.default
tmpfs dir: /run/user/1000/rmm-firefox-a38n37oj.default
profile size: 33M
overlayfs size:
recovery dirs: 1 <- delete with the c option
dir path/size: /home/rmm/.mozilla/firefox/a38n37oj.default-backup-crashrecovery-20160810_094005 (33M)

@yennor

yennor commented Aug 15, 2016

that freaking idiot tries to format the comment signs in my psd.conf and it doesn't allow me to upload it...
Anyway, the only lines in it that are NOT commented out are:

USE_OVERLAYFS="yes"
VOLATILE="/dev/shm"

@graysky2
Owner

I still cannot reproduce in my stretch VM.

Incidentally (and not related), you're defining VOLATILE in the config file; this has been deprecated since v6.16.

@graysky2
Owner

I am not sure what to do here since I am unable to reproduce this behavior in a sid VM. @yennor - does the current version still do this for you or have you stopped using it altogether?

@seankhl

seankhl commented Sep 23, 2016

I can try to help you reproduce this but we'll have to work together a bit.

Every time this has happened to me it's been because my battery died while I had chrome (beta or unstable) open. My computer is set to try to hibernate when it dies, but it only sometimes works. Other times it just shuts down. I also use laptop-mode-tools, so it goes through this process when my battery is at 2 or 3 percent or something.

I think the problem comes when my laptop starts trying to elegantly hibernate, but oh no, the power is just too low and cuts out, terminating my laptop's execution completely. In the middle of trying to elegantly hibernate, it runs the systemd job for profile-sync-daemon, which attempts to sync things in memory back to disc. At some point during this process, it performs actual copies of data from memory to disc. I think the issue above happens when my laptop happens to die in the middle of psd's systemd job. Enough of the data has been copied back but not all, such that psd doesn't think that the dirs are synced. So, it tries to create symbolic links to files that exist, causing bugs, causing the above problem I posted.

The quick fix is to check for this kind of situation and delete the backup dir in the script if so. Then the rest works fine.
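
A check along those lines could look like the following minimal sketch. It is not taken from psd: profile_state is a hypothetical helper, the "-backup" suffix is inferred from the paths in the logs in this thread, and the state names are made up. It only reports the state rather than deleting anything, since after a crash the backup directory may be the only complete copy.

```shell
#!/bin/sh
# Hypothetical helper (not part of psd): classify a profile path.
#   synced    profile is a symlink and a backup directory exists
#   leftover  profile is a real directory AND a backup directory exists
#   clean     profile is a real directory and no backup path exists
#   unknown   anything else
profile_state() {
    profile=$1
    backup="$profile-backup"
    if [ -L "$profile" ]; then
        # Test the symlink first: -d alone would follow the link.
        if [ -d "$backup" ]; then echo synced; else echo unknown; fi
    elif [ -d "$profile" ]; then
        if [ -d "$backup" ]; then
            echo leftover
        elif [ ! -e "$backup" ]; then
            echo clean
        else
            echo unknown
        fi
    else
        echo unknown
    fi
}
```

A "leftover" result corresponds to the situation described above, where both the profile directory and its backup exist at the same time and the move into the backup path fails.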

Funny thing, I use asd too for some /var dirs to avoid my HDD spinning up as described above. Sometimes when my laptop dies, I get a problem where these symbolic links are also not created, or where data has not been returned to the original location on disc if asd is marked as off. Turning asd off and back on (or on -> off -> on if indicated as off) from a tty fixes the problem, but it can actually keep my computer from booting! I think the ultimate problem is that some of those directories in /var are created by systemd units themselves, and so there should be some dependencies for my asd unit... but I've been too lazy to figure it out, and just have asd disabled and enable it once my computer boots.

@graysky2
Owner

I never use hibernate but perhaps the problem is an incomplete write out to the disk? Seems like quite a few variables at play in what you're describing.

@graysky2 graysky2 changed the title from "profile-sync-daemon fails to work properly after system crash" to "profile-sync-daemon fails to work properly after battery dies on laptop" on Sep 24, 2016
@yennor

yennor commented Oct 10, 2016

Sorry I didn't reply earlier; I didn't have an Internet connection for the last few weeks. I stopped using psd because it was too much work fixing the profiles every 2-3 days.
Right now I'm under quite a heavy workload, so I don't have time to check whether it's still the same with the newest version (and kernel). But as soon as I have more time I'll look into it...
But just as a comment on the above: my crashes were usually also related to power management. After a certain update, my computer regularly (not always) crashed a few minutes after waking up from sleep mode. I have since stopped using sleep mode, since I didn't feel like downgrading until I found a working kernel/power-management/whatever-library combination. Or it ran out of power while being asleep... But as I mentioned above, I also managed to trigger the behaviour manually, without power management or a crash being involved...

@graysky2
Owner

graysky2 commented Oct 10, 2016

I am not an expert with systemd units but I am wondering if adding the following psd-resync.service would solve your problem.... would you be willing to test?

/usr/lib/systemd/user/psd-resync.service 
[Unit]
Description=Timed resync
After=psd.service
Wants=psd-resync.timer
PartOf=psd.service
Before=sleep.target


[Service]
Type=oneshot
ExecStart=/usr/bin/profile-sync-daemon resync

[Install]
WantedBy=default.target sleep.target

@graysky2
Owner

@JanLuca - Any update?

@JanLuca
Contributor Author

JanLuca commented Nov 30, 2016

@graysky2: I have no further information. As described, I could not reproduce the bug either.

@graysky2
Owner

OK, reopen if needed.
