seaf-daemon crashing because of missing password #2662

Closed
nicolamori opened this issue Mar 29, 2023 · 16 comments

@nicolamori

Since this morning seaf-daemon segfaults on my system (Arch Linux). In my setup I have several libraries, some of which are encrypted. The libraries were created on the server a couple of years ago, and the client was set up two months ago with seafile-client 8.0.10 (which I'm still using at the moment). Everything worked flawlessly until this morning; no modification of the Seafile installation has been made in the meantime.

Investigating the crash with gdb I got the following:

Core was generated by `/usr/bin/seaf-daemon -c /home/mori/.ccnet -d /home/mori/Seafile/.seafile-data -'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  __strlen_avx2 () at ../sysdeps/x86_64/multiarch/strlen-avx2.S:76
76              VPCMPEQ (%rdi), %ymm0, %ymm1                                                                                                                                                                                                                                  
[Current thread is 1 (Thread 0x7fe46d5276c0 (LWP 96048))]
(gdb) bt
#0  __strlen_avx2 () at ../sysdeps/x86_64/multiarch/strlen-avx2.S:76
#1  0x0000562b36935c4b in seafile_decrypt_repo_enc_key
    (enc_version=2, passwd=passwd@entry=0x0, random_key=0x7fe4581310c0 "<value>", repo_salt=0x0, key_out=key_out@entry=0x7fe46d525c20 "<value>", iv_out=iv_out@entry=0x7fe46d525bf0 "la.mori@fi.infn.it\", \"mtime\": 1376768269, \"name\"787ddc6a83ef11edacfcd83c1a4dd5d2a821c825") at ../common/seafile-crypt.c:239
#2  0x0000562b3694a571 in seaf_repo_fetch_and_checkout (http_task=http_task@entry=0x562b36fdc0b0, remote_head_id=remote_head_id@entry=0x562b36fdc11c "add1215edd063a1caeae54b028566e44b3b8145f") at repo-mgr.c:5853
#3  0x0000562b3692beeb in http_download_thread (vdata=0x562b36fdc0b0) at http-tx-mgr.c:4883
#4  0x0000562b3692467b in job_thread_wrapper (vdata=0x562b36fdc350, unused=<optimized out>) at job-mgr.c:66
#5  0x00007fe4707919a3 in g_thread_pool_thread_proxy (data=<optimized out>) at ../glib/glib/gthreadpool.c:350
#6  0x00007fe47078c315 in g_thread_proxy (data=0x7fe468000d50) at ../glib/glib/gthread.c:831
#7  0x00007fe47045ebb5 in start_thread (arg=<optimized out>) at pthread_create.c:444
#8  0x00007fe4704e0d90 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81

Inspecting line 239 of common/seafile-crypt.c, I see a call to strlen(passwd), which is probably the cause of the segfault: as shown above, passwd is NULL and the actual fault occurs in __strlen_avx2 inside glibc. I also checked ~/Seafile/.seafile-data/repo.db and see no entries in the RepoPasswd table, but I don't know whether that is related.
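
To illustrate the failure mode, here is a minimal, hypothetical sketch of a guard that would turn the crash into a clean error. This is not the actual Seafile code: only the parameter names come from the backtrace, while the types, the return convention, and the function name are my assumptions.

#include <stddef.h>
#include <string.h>

/* Hedged sketch only (not Seafile's actual implementation): a NULL-password
   guard placed before the strlen() call that crashes in the backtrace. */
static int
decrypt_repo_enc_key_sketch (int enc_version, const char *passwd,
                             const char *random_key, const char *repo_salt,
                             unsigned char *key_out, unsigned char *iv_out)
{
    if (passwd == NULL) {
        /* No password available (e.g. missing from CloneTasks):
           fail the decryption instead of segfaulting in strlen(). */
        return -1;
    }

    size_t passwd_len = strlen (passwd);   /* the crash site: strlen on passwd */

    /* ... derive key_out/iv_out from passwd, random_key and repo_salt ... */
    (void) enc_version; (void) random_key; (void) repo_salt;
    (void) key_out; (void) iv_out; (void) passwd_len;
    return 0;
}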

I don't know which other info could be useful, but I can provide it if needed.

@nicolamori
Author

I have been able to make it work again by removing ~/Seafile/ and re-configuring from scratch (I had to re-sync all the libraries, though, so this is more an emergency workaround than a fix). But when I restore the old ~/Seafile/ the problem comes back, so I'd say it's something related to a corrupted configuration.

@feiniks
Contributor

feiniks commented Mar 30, 2023

Hello @nicolamori, can you show the seafile.log output from when the crash occurs?

@nicolamori
Author

nicolamori commented Mar 30, 2023

Where can I find it? I can't find it in ~/Seafile/. If you are referring to a server log, then I have no access to it; I'm just running a client.

@bionade24

bionade24 commented Mar 30, 2023

@nicolamori Under ~/.ccnet/logs

@nicolamori
Author

This is an excerpt from the log file covering several crashes and restarts:

[03/30/23 08:01:11] seaf-daemon.c(525): starting seafile client 8.0.10
[03/30/23 08:01:11] seafile-session.c(388): client id = 59dc467f86cd583f2a6ec5feb394fdac39446d7f, client_name = stryke
[03/30/23 08:01:11] socket file exists, delete it anyway
[03/30/23 08:01:11] seaf-daemon.c(553): rpc server started.
[03/30/23 08:01:11] clone-mgr.c(678): Transition clone state for 532dcdd6 from [init] to [check server].
[03/30/23 08:01:11] clone-mgr.c(678): Transition clone state for 532dcdd6 from [check server] to [fetch].
[03/30/23 08:01:11] http-tx-mgr.c(1156): Transfer repo '532dcdd6': ('normal', 'init') --> ('normal', 'check')
[03/30/23 08:01:11] http-tx-mgr.c(1156): Transfer repo '532dcdd6': ('normal', 'check') --> ('normal', 'commit')
[03/30/23 08:01:11] http-tx-mgr.c(1156): Transfer repo '532dcdd6': ('normal', 'commit') --> ('normal', 'fs')
[03/30/23 08:01:12] start to serve on pipe client
[03/30/23 08:01:12] start to serve on pipe client
[03/30/23 08:01:12] start to serve on pipe client
[03/30/23 08:01:12] start to serve on pipe client
[03/30/23 08:01:12] sync-mgr.c(1648): File syncing protocol version on server https://basket.fi.infn.it is 1. Client file syncing protocol version is 2. Use version 1.
[03/30/23 08:01:12] http-tx-mgr.c(1156): Transfer repo '532dcdd6': ('normal', 'fs') --> ('normal', 'data')
[03/30/23 08:01:12] start to serve on pipe client
[03/30/23 08:01:12] start to serve on pipe client
[03/30/23 08:01:15] seaf-daemon.c(525): starting seafile client 8.0.10
[03/30/23 08:01:15] seafile-session.c(388): client id = 59dc467f86cd583f2a6ec5feb394fdac39446d7f, client_name = stryke
[03/30/23 08:01:15] socket file exists, delete it anyway
[03/30/23 08:01:15] seaf-daemon.c(553): rpc server started.
[03/30/23 08:01:15] clone-mgr.c(678): Transition clone state for 532dcdd6 from [init] to [check server].
[03/30/23 08:01:15] clone-mgr.c(678): Transition clone state for 532dcdd6 from [check server] to [fetch].
[03/30/23 08:01:15] http-tx-mgr.c(1156): Transfer repo '532dcdd6': ('normal', 'init') --> ('normal', 'check')
[03/30/23 08:01:15] http-tx-mgr.c(1156): Transfer repo '532dcdd6': ('normal', 'check') --> ('normal', 'commit')
[03/30/23 08:01:15] http-tx-mgr.c(1156): Transfer repo '532dcdd6': ('normal', 'commit') --> ('normal', 'fs')
[03/30/23 08:01:16] start to serve on pipe client
[03/30/23 08:01:16] start to serve on pipe client
[03/30/23 08:01:16] start to serve on pipe client
[03/30/23 08:01:16] start to serve on pipe client
[03/30/23 08:01:16] start to serve on pipe client
[03/30/23 08:01:16] start to serve on pipe client
[03/30/23 08:01:16] http-tx-mgr.c(1156): Transfer repo '532dcdd6': ('normal', 'fs') --> ('normal', 'data')
[03/30/23 08:01:19] seaf-daemon.c(525): starting seafile client 8.0.10
[03/30/23 08:01:19] seafile-session.c(388): client id = 59dc467f86cd583f2a6ec5feb394fdac39446d7f, client_name = stryke
[03/30/23 08:01:19] socket file exists, delete it anyway
[03/30/23 08:01:19] seaf-daemon.c(553): rpc server started.
[03/30/23 08:01:19] clone-mgr.c(678): Transition clone state for 532dcdd6 from [init] to [check server].
[03/30/23 08:01:19] clone-mgr.c(678): Transition clone state for 532dcdd6 from [check server] to [fetch].
[03/30/23 08:01:19] http-tx-mgr.c(1156): Transfer repo '532dcdd6': ('normal', 'init') --> ('normal', 'check')
[03/30/23 08:01:19] http-tx-mgr.c(1156): Transfer repo '532dcdd6': ('normal', 'check') --> ('normal', 'commit')
[03/30/23 08:01:19] http-tx-mgr.c(1156): Transfer repo '532dcdd6': ('normal', 'commit') --> ('normal', 'fs')
[03/30/23 08:01:20] start to serve on pipe client
[03/30/23 08:01:20] start to serve on pipe client
[03/30/23 08:01:20] start to serve on pipe client
[03/30/23 08:01:20] start to serve on pipe client
[03/30/23 08:01:20] start to serve on pipe client
[03/30/23 08:01:20] start to serve on pipe client
[03/30/23 08:01:20] sync-mgr.c(1648): File syncing protocol version on server https://basket.fi.infn.it is 1. Client file syncing protocol version is 2. Use version 1.
[03/30/23 08:01:20] http-tx-mgr.c(1156): Transfer repo '532dcdd6': ('normal', 'fs') --> ('normal', 'data')

@feiniks
Contributor

feiniks commented Mar 30, 2023

Hi @nicolamori, the crash occurred while the library was being downloaded for the first time. Can you take a look at the records in the CloneTasks table in the clone.db database? You can find it in your old ~/Seafile/.seafile-data/clone.db. I suspect there is no passwd recorded in that table.

@nicolamori
Author

Here it is:

$ sqlite3 clone.db 
SQLite version 3.41.2 2023-03-22 11:56:21
Enter ".help" for usage hints.
sqlite> select * from CloneTasks;
532dcdd6-6fe9-4a24-8a34-345030375288|.config|9449e0e613d7c606b5641104be17c4d430e41353||/home/mori/.config||||nicola.mori@fi.infn.it

@feiniks
Contributor

feiniks commented Apr 4, 2023


Hello @nicolamori, there is no password in this table, which is what causes this crash. We will add a check for the password in the next release.

@nicolamori
Author

@feiniks Ok, thank you. Have you got any idea why this problem suddenly happened? As I wrote, everything worked up to the previous day, and nothing changed in my system (i.e. no upgrades) before the issue came up.

@nicolamori
Author

nicolamori commented Apr 4, 2023

@feiniks It just happened again. The password seems to be missing again from the clone.db:

[09:54 mori@stryke ~]$ sqlite3 Seafile/.seafile-data/clone.db 
SQLite version 3.41.2 2023-03-22 11:56:21
Enter ".help" for usage hints.
sqlite> select * from CloneTasks;
532dcdd6-6fe9-4a24-8a34-345030375288|.config|1fa4030dbcce5a33a5be2409ba4be01da154e736||/home/mori/.config||||nicola.mori@fi.infn.it

Is it possible to repair the entry? If so, can you provide some info about how to do that? Thank you.

Edit: I removed the entry and restarted the client. This fixes the daemon segfault but leaves the library unsynced, so I had to re-sync from scratch, deal with conflicts, etc. From inspecting clone.db during the sync I see that the password is stored in clear text in the field after the local path (/home/mori/.config in the above case). This makes me think that the problem stems from the library somehow becoming unsynced and being caught in the middle of a sync. I really don't understand what's happening, but I hope this information can be useful for the devs.

@feiniks
Contributor

feiniks commented Apr 4, 2023


Hello @nicolamori, deleting the row in the database and then resyncing the library should fix the issue. As for why the password is empty here, I checked the parameters passed in during GUI synchronization but could not find the cause of this problem. Did you enter the encrypted library's password through the GUI?
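
In case it helps others, the manual clean-up might look roughly like the session below. This is a sketch only: the repo_id column name is an assumption, so check the actual schema with .schema first, stop the client before touching the database, and expect to resync the library afterwards.

$ sqlite3 ~/Seafile/.seafile-data/clone.db
sqlite> .schema CloneTasks
sqlite> delete from CloneTasks where repo_id = '532dcdd6-6fe9-4a24-8a34-345030375288';
sqlite> .quit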

@nicolamori
Author

@feiniks Yes, I do everything through the GUI. Why did I have to re-sync from scratch after deleting the row? Yesterday the library was synced and working beautifully, so whatever happens leaves me with an unsynced library.

In case it's useful: I disable automatic sync for this library and sync it manually once per day. I didn't notice whether the problem arises because of the manual sync, but almost always the manual sync simply worked.

@nicolamori
Author

@feiniks The problem happened again, but this time I paid attention to what was happening. After booting my PC I manually launched a sync for the library. I got a dialog asking whether I wanted to delete an SFConflict file plus many others (~15k files). I refused, and the library entry in the GUI showed something like "Waiting for file deletion confirmation". I checked the local folder and no SFConflict file was present, so I tried to sync manually again. At this point the GUI displayed something like "Downloading files list", the same message shown when syncing a library from scratch. At the end of the download the daemon started to crash repeatedly, and the entry in CloneTasks with the missing password appeared again.

This time I repaired the entry in CloneTasks, and on client restart it began downloading files from the remote. I then deleted the resulting SFConflict files, turned off automatic sync, and synced manually. Everything went fine. I restarted the client to simulate a fresh instance like the one I got after reboot, but this time I didn't get any trouble on manual sync. Even adding files to the local folder between client restarts did not trigger the error, so I cannot reproduce it consistently.

Hope this helps; I can do other tests if needed. By the way, I have another library that I sync manually, but it has never shown this problem. Since I always sync it after the troublesome one, it could just be that the first manually synced library triggers the problem, but it could also point towards a problem with the library itself.

@feiniks
Contributor

feiniks commented Apr 6, 2023

@nicolamori Thank you for your feedback. I think I know where the problem is. It should be caused by not passing in a password when re-syncing the encrypted library during the deletion confirmation process. I will fix this problem in the next release.

@nicolamori
Author

@feiniks It happened again with 9.0.1. I didn't expect this, since you wrote that you would fix the problem in the "next release", and at that time I was on 8.0.10. Maybe the fix wasn't included in 9.0.1? If so, in which version do you plan to include it?

@feiniks
Contributor

feiniks commented Apr 28, 2023

The next release is 9.0.2

@killing closed this as completed Jun 19, 2023