panic: device present in global list but missing as device/fileinfo entry when scanning folder #6855
Comments
The actual panic information is in the mentioned file. It's apparently been auto-reported, so in theory we're already aware, but to get an answer to what's going on please attach or paste at least the first 100 lines or so from that panic log here.
panic-20200726-180245.reported.log
Shared the wrong log, should be the right one now :-D
Related to #6501, maybe even a duplicate. They happen under different circumstances, but the source of the problem might be the same. Something related I noticed: this panic can occur when a file info is present but its block list is missing. The db check/repair mechanism, triggered on upgrade and periodically, does not catch that, as it only loads truncated file infos. I think it would only be consistent to go all the way and load full file infos there (it's already a heavy and ugly process, so this likely wouldn't make it much heavier).
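To illustrate the point about truncated file infos, here is a minimal sketch. It is not Syncthing's actual code: the `record` and `db` types, the `BlocksHash` field and the `check` function are all hypothetical, and it assumes the block list is stored separately from the file info and looked up by key. A check that only reads the metadata never resolves that key, so a dangling reference goes unnoticed.

```go
package main

import (
	"fmt"
	"sort"
)

// record is a hypothetical, simplified file info. The block list is not
// stored inline; BlocksHash points into a separate table (an assumption
// made for this sketch).
type record struct {
	Name       string
	Size       int64
	BlocksHash string
}

// db is a hypothetical in-memory stand-in for the on-disk database.
type db struct {
	files      map[string]record
	blockLists map[string][]string
}

// check returns the names of broken entries, sorted. With full == false it
// mimics a check on truncated file infos: only metadata is validated and
// BlocksHash is never resolved. With full == true the block list is looked
// up as well, so a dangling BlocksHash is reported.
func (d *db) check(full bool) []string {
	var broken []string
	for key, r := range d.files {
		bad := r.Name != key || r.Size < 0
		if full && r.BlocksHash != "" {
			if _, ok := d.blockLists[r.BlocksHash]; !ok {
				bad = true
			}
		}
		if bad {
			broken = append(broken, key)
		}
	}
	sort.Strings(broken)
	return broken
}

func main() {
	d := &db{
		files: map[string]record{
			"a.txt": {Name: "a.txt", Size: 10, BlocksHash: "h1"},
			"b.txt": {Name: "b.txt", Size: 20, BlocksHash: "h2"}, // block list missing
		},
		blockLists: map[string][]string{"h1": {"block-0"}},
	}
	fmt.Println("truncated check:", d.check(false))
	fmt.Println("full check:", d.check(true))
}
```

In this toy setup only the full check flags `b.txt`; the truncated check sees nothing wrong with it.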
A panic loop for a broken fileinfo isn't great regardless of the root cause, really. I'm thinking we could legitimately just return these things as "nope, nothing in the database" and let the scanner pick it up as a new file. Worst case we might miss a delete and incorrectly resurrect a file, or otherwise cause a conflict copy. Still, a spurious conflict copy seems better than getting stuck like this.
I am not 100% following everything, but I have noticed a large number of conflict copies in my database, which might be related to this "bug" as well then. Is there anything I should do now to fix the issue on my machine, or should I wait for an update which might resolve the panic loop (and the sync conflicts)?
Happened to me just now when I removed some devices from a receive-only folder.
Me too, report is at https://forum.syncthing.net/t/v1-8-0-rc-3-panic-during-upgrade-restart/15394
Ok, my second node in the cluster totally stopped after upgrading to v1.8.0-rc.3; it worked fine before on rc2. I don't know why it worked fine, because only the versioning had been fixed between the releases... 😮 There are tons of panic logs containing the "same" content. Maybe the problem is that the node was heavily scanning one folder at ~226 MByte/s?!
Reading the comments above, very interesting. To sum up my guess what's causing the trouble:
I'll back up the database of the affected node, which now crashes constantly, and try deleting the index-db.folder. It looks very weird to me, because I just started off fresh from a working rc2 instance.
Getting this on 1.7.1 on a new install. Extremely frustrating.
@rcarmo Just reset your db and start afresh. Don't reconfigure or restart Syncthing during heavy load, e.g. initial scanning. This worked for me a few days ago. The first time, it scanned while I configured and restarted multiple times: panic. The second time, I just let Syncthing do its thing: up2date and fine after the initial scan.
Please do not advise people to just (or otherwise) reset the db. That might be a solution of last resort, but the key is "last". For one, we want to discover what's actually going on, so any info on setups, what you did or anything else that might be relevant would be great. For another, there are other steps to take to try and remedy the problem which are less invasive than deleting the entire db (and might even provide some insight).
Well, I have to admit that I have been eyeing that reset db button (both on the desktop and on the phones), but I have been wanting to see where this leads, so I am 100% with you that it should be a last resort. The setup I have is:
I have not removed any devices, which is what the error "device present in global list but missing as device/fileinfo entry" sounds like to me...
When I run syncthing from the command line now, it does sometimes completely scan the local receive-only folders, except one, and some new items get synced over, but then it panics again. Not sure if something happens locally on the receiving machine or if it happens when it tries to sync from either one of the send-only devices...
It looks like the message changes a bit, thought I'd share it:
I looked up a couple of the items for which Syncthing says "no connected device has the required version of this file" just to actually see if they exist on the send-only device, and they are there. I don't know what the "required version" is, but the originals exist.
For reference, I had the exact same panic with 1.7.1 and a huge receive-only folder (~3.5TB, 12+ million files). Upgrading to a recent git version (after the commit referenced above: 1b9e5c0) made it possible to progress. The configuration is quite simple: 2 Syncthing servers with a single folder shared as send-only on one end and receive-only on the other. The two sides were previously synced by other means, so each Syncthing had a full and mostly synced folder to begin with (the receiving side lacked recent modifications). Their history is a little more complicated and I didn't take notes. The panic with 1.7.1 on the receiving side occurred in a loop (approximately every 7 minutes) just after the initial scan. I found this thread and built Syncthing from the git repository. The panic disappeared and the receiving side seems to be catching up now. The only odd thing is that the numbers of out-of-sync items reported in the interface don't match at all between the folder and the remote device.
Yeah, I'm probably gonna wait until it gets incorporated into the stable channel. Should be an update coming soon, right?
I found a/the problem (I hope the latter), and who would have thought: Truncation comes back to haunt us again (more precisely: it was never gone). Calling PR incoming.
My syncthing got updated to syncthing v1.8.0 "Fermium Flea", but nothing changed for me there. Let me know if I indeed should build the git version to test if this is the problem.
@HackaN The relevant new changes are at #6888. If, and only if, you have backups and feel comfortable testing code that isn't even fully reviewed yet (i.e. don't mind much if you need to roll back to your backup), you are very welcome to try it out and let me know if it helped. A somewhat lower level of risk would be to wait until that is merged, though backups are obviously still necessary. This should also make it into next week's RC.
Alright, thank you for the informative reply. I'll wait until it's merged into the next update :-)
* main: (368 commits)
  build: We now target Go 1.14
  lib/fs: Disable ioctl on ppc (fixes syncthing#6898) (syncthing#6901)
  gui, man, authors: Update docs, translations, and contributors
  lib/dialer: Try dialing without reuse in parallel (fixes syncthing#6892) (syncthing#6893)
  cmd/stcrashreceiver: Don't crash on nil err
  all: Remove need to restart syncthing (syncthing#6883)
  lib/db: Don't put truncated files (ref syncthing#6855, ref syncthing#6501) (syncthing#6888)
  lib/osutil: Check returned error instead of info (ref syncthing#6885) (syncthing#6887)
  gui, man, authors: Update docs, translations, and contributors
  lib/osutil: Preserve perms in AtomicWriter (fixes #tbd) (syncthing#6885)
  lib/fs: Fix WatchRename test for FreeBSD (fixes syncthing#6613)
  lib/fs: Unwrap mtimeFile, get fd the "correct" way (ref syncthing#6875) (syncthing#6877)
  lib/model: Don't close file early (fixes syncthing#6875) (syncthing#6876)
  lib/fs: Unwrap mtimeFile, get fd the "correct" way (ref syncthing#6875) (syncthing#6877)
  gui, man, authors: Update docs, translations, and contributors
  lib/fs: Fix WatchRename test for FreeBSD (fixes syncthing#6613)
  lib/model: Don't close file early (fixes syncthing#6875) (syncthing#6876)
  lib/db: Log context on panic (syncthing#6872)
  gui: Don't show pull order on SO folders (ref syncthing#6807) (syncthing#6871)
  lib/model: Check folder error before sync-waiting (fixes syncthing#6793) (syncthing#6847)
  ...
Good news - I can now confirm that with version 1.9.0 all files, as far as I can tell, are being correctly synchronized :-)
I should begin by saying that this might be a duplicate of #6457, but that report didn't educate me much.
I am running Syncthing on Ubuntu 20.04, and in parallel with a system update (before actually performing the update) I got an error from Syncthing GTK saying that the "daemon exited too fast" and asking me to provide a path to the daemon binary.
I ran syncthing from a terminal and got several panic messages. First I got a panic about the config file version (31) being newer than the supported version (28). I followed an existing error report from the Syncthing forum and switched from the Ubuntu repo version to Syncthing's own repo, uninstalled/installed and rebooted (just to be sure the changes would take effect), and am now faced with a new set of errors:
[start] 18:02:42 INFO: syncthing v1.7.1 "Fermium Flea" (go1.14.4 linux-amd64) deb@build.syncthing.net 2020-07-11 18:17:41 UTC
[CLMWS] 18:02:45 INFO: Completed initial scan of receiveonly folder "hackan Pixel Kamera" (pixel_3xdn foton)
panic: device present in global list but missing as device/fileinfo entry
[monitor] 18:02:45 WARNING: Panic detected, writing to "/home/hackan/.config/syncthing/panic-20200726-180245.log"
[monitor] 18:02:45 WARNING: Please check for existing issues with similar panic message at https://github.com/syncthing/syncthing/issues/
[monitor] 18:02:45 WARNING: If no issue with similar panic message exists, please create a new issue with the panic log attached
[monitor] 18:02:45 INFO: Reporting crash found in panic-20200726-180245.log (report ID bfe3c0d1) ...
[monitor] 18:02:45 INFO: Syncthing exited: exit status 2
[monitor] 18:02:46 WARNING: 4 restarts in 16.671957674s; not retrying further
I have seen suggestions about removing the index-folder and sort of starting over from scratch, but I have a large library and want to try to figure this out before attempting that as a solution... if that even would be a solution.
Let me know what other information I can provide.