Scanning loop #1164
Comments
Very weird! I have never seen this, and cannot think of a way for it to be triggered in the current scanner code. Unfortunately, without a way to reproduce it, it will be impossible to figure out. Is there any other message between these "Skipping DirEntry" messages? |
Hey, no, all the log messages were identical, with what looks like 4 messages every 0.001 seconds. I'm not an ELK expert so I don't know how to enumerate through them all for differences, but I looked at 5 in a row, and then 5 random ones, and they were all the same. My volume source data is mounted from a SAMBA share. Could this bug be triggered if the file share hiccuped during a scan? |
Maybe.... But I checked the code and it does not make sense.... Well, as we can't reproduce it, I'll close this for now, but please let me know if you have any other related issue. Thanks! |
Can confirm this is occurring in 0.44.1 at random. As above, stopping and restarting Navidrome lets the scan continue until the error happens again. It happens on different directories each time. When this happens, the kernel reports: "CIFS VFS: Send error in read = -4". The drive was mounted using SMB protocol 1. I tried SMB2 and SMB3 with the same result. Looks related to golang/go#38836. If the same share is mounted with NFS, scanning completes successfully. |
I noticed that when Navidrome was stuck in the scanning loop, every other docker container also reported an error accessing files, pointing to an issue with my SMB host (though as the SMB share seemed fine on observation, restarting all docker containers fixed this). Though while the other containers threw 1 error, Navidrome loops and throws many errors. |
@nellistc Yeah, it seems to be related to that issue, but it is not the same. That issue used to break Navidrome scanner in older versions, but it was fixed in Go 1.15 (we are now on Go 1.16). And the solution/workaround proposed in that issue is actually still in place in Navidrome's Dockerfile:
@talkingseedling, do you see this error when Navidrome is not scanning? |
@deluan Elastic reports the term "async" was never present in any log message. If it helps, what triggers this bug is the docker container expecting to find data in a volume mount when the volume mount has failed in some way (as opposed to the volume mount working and a file just not being there). What I haven't tested yet is whether, after the error happens, the volume re-mounts and is viewable from "docker exec -it navidrome ls /music". My SMB server (where the docker volume mounts point) also auto-updates, and I have not had this error since 3rd June. |
Ok, I found and fixed the infinite loop. If the SMB fs fails when reading a directory, it will just skip that dir. To test it, you can use the Please let me know if this is working as expected. If not, feel free to reopen this issue. |
Apologies for the churn - I made a comment in the commit, but that doesn't show up at all on the mobile client nor did it tag this, so I'm replicating here for visibility @deluan The fix in eb8ffc6 will break the "find all subdirs when some are missing rx permissions" behavior fixed by #1054
#1164 reports a scenario where mounts appear to be broken; the golang behavior here doesn't seem right (it should "succeed" and not remain stuck on the same directoryentry), but the fix is not to bail on the first error - there may be more entries available with a properly functioning filesystem. The guts of this bug appear in this case limited to the unix os ( |
Are you sure? I just tested with this scenario:
After a clean scan, the Arctic Monkeys album does not appear in my albums, but the other two are there, together with all other albums in my library. Logs:
The fact that we are calling |
Yes, I am sure
Restoring
The "Skipping unreadable directory" entries are simply chowned 0 for navidrome's running group |
Hummm.... How to simulate this error then? What kind of permissions do you have in place to cause this? Anyways, we cannot have the EDIT:
I can implement one of these solutions, but I need to be able to reproduce it here. Or else, feel free to submit a PR. |
How does this look? lmk if you'd prefer a PR:
I have the opposite problem of replicating the broken filesystem...
|
Yeah, the code above may work. But not sure if you can compare if prevErr.Error() == err.Error() {
... By the way, why are you blocking ND's access to folders this way (like in the "Synth" folder)? Why not just remove read access? Aren't we trying to put in place a workaround for an issue that may not exist, given that you could just |
I believe equality should work; it's normally not recommended precisely because it includes "too specific" information. If you would prefer string comparisons, which I agree are perhaps less ambiguous, perhaps this is clearer:
|
As to the "why" question... |
Big thanks! |
Hi, is this fix already in the latest release or should I stay on |
It was not released yet, so if you need this, please keep using the |
👌 |
This issue has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs. |
Description
Error while scanning folders for new music. Navidrome looped indefinitely over the same folder, causing high CPU usage and continuous stderr log output until Navidrome was restarted manually.
Expected Behaviour
Successful scan of a music folder, or error then move on.
Steps to reproduce
Problem started with the automated scheduled scan.
A restart corrected the problem, not reproducible.
Platform information
Additional information
Discovered with the large amount of logs which Navidrome was pushing to the ELK logging server, at a rate of 18,000 log entries every 5 seconds, totaling 80 million logs before manual Navidrome shutdown.
The log file:
The message in all 80 million log entries is exactly the same:
time="2021-06-03T02:54:33Z" level=warning msg="Skipping DirEntry" error="readdirent /music/Music/Ed Kuepper/Reflections of Ol' Golden Eye: no such file or directory"
After restarting Navidrome, all folders were successfully scanned.