Library scan: ignore hidden files (and thus OS X resource forks) #2074

Closed
lorenrogers opened this issue Oct 27, 2016 · 27 comments
@lorenrogers

I've got a large library I keep on my NAS, and I access it over a share like AFP, SMB, or SSHFS. (At the moment, I'm using SSHFS, but the issue applies to all.) When I open QL, it starts an auto scan of my library, which is great. The problem is, when it gets to the end, it hangs for about 2 minutes. Interacting with it does nothing, and it looks like it crashed. I imagine this has something to do with the database updating the lists in the UI.

If I pause the scan, start playing something, then unpause it, everything is fine. I assume the UI still locks up, but it continues playing whatever I'd started.

I'm running Debian 8, with QL version 3.2.2.

@lazka
Member

lazka commented Oct 27, 2016

Please run the following command: python2 -m cProfile -o prof.out $(which quodlibet). Wait until QL settles after the scan has finished, then quit the program. This should generate a file named "prof.out", which you can send me (reiter.christoph@gmail.com).
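For anyone who wants to look at such a profile themselves before sending it, here is a minimal sketch using the standard library's pstats module. The function name `summarize_profile` is made up for illustration, and the sketch is written for Python 3. Profile files are not guaranteed to be compatible across Python versions, so a prof.out written by python2 is best read with python2 as well.

```python
import io
import pstats


def summarize_profile(profile_path, limit=10):
    """Return a text report of the `limit` entries with the highest cumulative time."""
    stream = io.StringIO()
    stats = pstats.Stats(profile_path, stream=stream)
    # strip_dirs() shortens file paths; "cumulative" sorts by time spent in a
    # function including everything it calls.
    stats.strip_dirs().sort_stats("cumulative").print_stats(limit)
    return stream.getvalue()
```

Calling `print(summarize_profile("prof.out"))` prints the report; entries such as os.listdir or os.stat near the top would point at filesystem access as the bottleneck.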

You could also try a newer version (currently 3.7.1) through our Debian repo: https://quodlibet.readthedocs.io/en/latest/downloads.html#debian

@lazka lazka added the needinfo label Oct 27, 2016
@lorenrogers
Author

Profile sent. I'll try the updated version and send a profile of that as well. I believe I was running the QL version from the Debian repo.

@lazka
Member

lazka commented Oct 27, 2016

Thanks. 26 seconds are spent in os.listdir and 14 seconds in os.stat. There's not much we can do about that (besides you disabling auto scan in the settings).

https://pypi.python.org/pypi/scandir would be a solution, but that's not in Debian.
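To illustrate why scandir could help: it returns directory entries that can usually answer "is this a file or a directory?" from the listing itself, without a separate os.stat call per file. The sketch below uses os.scandir from the Python 3 standard library (the PyPI package linked above offers the same interface for older interpreters). The function name `iter_files` is hypothetical, this is not QL's scanner, and whether it actually saves time over SSHFS is an untested assumption.

```python
import os


def iter_files(root):
    """Yield the paths of regular files below `root`, walking with os.scandir."""
    stack = [root]
    while stack:
        current = stack.pop()
        with os.scandir(current) as entries:
            for entry in entries:
                # is_dir()/is_file() can often be answered from the directory
                # listing itself, which avoids one stat call per entry.
                if entry.is_dir(follow_symlinks=False):
                    stack.append(entry.path)
                elif entry.is_file(follow_symlinks=False):
                    yield entry.path
```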

@lorenrogers
Author

Just confirmed that the issue still exists in 3.7.1. I'll send a profile for that too.

I'm also getting a lot of errors in the terminal that look like this:

E: 24.576: formats._misc.MusicFile: flac.py:904:__check_header: AudioFileError: '/home/lorentrogers/mount/music/Jazz/Nat King Cole/Giants Of Jazz/._23. Exactly Like You.flac' is not a valid FLAC file

Unlikely to be related, right?


Audio device: GStreamer
Python: 2.7.9
Mutagen: 1.34.1
GTK+: 3.14.5 (X11)
PyGObject: 3.14.0
GStreamer: 1.4.4.0

@lorenrogers
Author

Actually, this issue seems to freeze my entire window manager. (I'm using i3)

@lazka
Member

lazka commented Oct 27, 2016

> Just confirmed that the issue still exists in 3.7.1. I'll send a profile for that too.

Thanks

> I'm also getting a lot of errors in the terminal that look like this:
>
> E: 24.576: formats._misc.MusicFile: flac.py:904:__check_header: AudioFileError: '/home/lorentrogers/mount/music/Jazz/Nat King Cole/Giants Of Jazz/._23. Exactly Like You.flac' is not a valid FLAC file
>
> Unlikely to be related, right?

Hm, I see it tries to load 901 songs for whatever reason, but fails rather quickly on each. Do you know where these hidden files are coming from? This might be similar to #914.

The second profile, for some reason, gets stuck in some GTK code, which suggests that QL is somehow overwhelming the system and that this is why your WM freezes.

@lorenrogers
Author

Hmm. Watching the output more carefully, I see that the errors are all dumped at the end, right when it freezes.

Looking at the files referenced in that last error, it seems like they're some sort of Mac-generated file. They're mostly binary, but I'm able to see "Mac OS X", "ATTR", "DEPRIMARY", and "This resource fork intentionally left blank" mixed in with a bunch of binary blocks. No idea where these files came from; they may have been in there when I got them.

I'm also getting this error throughout my logs:

E: 16.645: formats._misc.MusicFile: __init__.py:387:__init__: AudioFileError: can't sync to MPEG frame

I'm not sure how many error lines are being printed, but it seemed like it could be about 901.

@lazka
Member

lazka commented Oct 27, 2016

If you are confident that you don't need these files, please try deleting them. If that helps, we should look into fixing #914.

@lorenrogers
Author

I deleted all the offending .flac files:

find . -type f -name '._*.flac' -delete

Not sure if it helped, but it may have paused a little less. I'm not getting the "not a FLAC file" error now, but I'm still getting a bunch of these:

E: 23.080: formats._misc.MusicFile: __init__.py:387:__init__: AudioFileError: can't sync to MPEG frame

@lazka
Member

lazka commented Oct 27, 2016

There should also be around 400 mp3 files with similar problems (maybe do the same with *.mp3?)
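Since find with -delete removes files without asking, a dry run that only lists the candidates may be useful first. The following is a rough sketch of that idea, not part of QL. The function name `find_dot_underscore_files` and the default extension list are assumptions chosen to match the files discussed in this thread.

```python
import os


def find_dot_underscore_files(root, extensions=(".flac", ".mp3")):
    """Return sorted paths below `root` whose names start with '._' and end
    with one of `extensions` (compared case-insensitively)."""
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.startswith("._") and name.lower().endswith(extensions):
                matches.append(os.path.join(dirpath, name))
    return sorted(matches)
```

After reviewing the returned list, each path could be removed with os.remove, which has the same effect as the find command above but for several extensions at once.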

@kaffeeundsalz

These ._ files are actually the resource fork parts of their originals. macOS writes them whenever it wants to store file metadata on non-HFS+ filesystems, as happens with network shares over SMB. On HFS+ volumes, this data gets stuffed into the extended file attributes. Further reading: https://en.wikipedia.org/wiki/Resource_fork

Quod Libet presumably doesn't recognize these as resource forks and treats them like actual audio files, which produces the errors in the log.
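As a rough illustration of how such files could be told apart from real audio: to my understanding these ._ companions are stored in the AppleDouble format, which starts with the four magic bytes 00 05 16 07. The sketch below checks both the name prefix and that magic number. The function name `looks_like_appledouble` is hypothetical and this is not how QL handles them.

```python
import os

# Assumption: AppleDouble files begin with the magic number 0x00051607.
APPLEDOUBLE_MAGIC = b"\x00\x05\x16\x07"


def looks_like_appledouble(path):
    """Return True if `path` is named '._*' and starts with the AppleDouble magic."""
    if not os.path.basename(path).startswith("._"):
        return False
    try:
        with open(path, "rb") as handle:
            return handle.read(4) == APPLEDOUBLE_MAGIC
    except OSError:
        # Unreadable files are treated as "not AppleDouble" in this sketch.
        return False
```

Checking the content as well as the name would leave alone a real audio file that someone deliberately named with a ._ prefix.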

@lazka lazka added bug and removed needinfo labels Oct 27, 2016
@lazka lazka changed the title from "UI blocked while scanning" to "Library scan: ignore hidden files (and thus OS X resource forks)" Oct 27, 2016
@lorenrogers
Author

This is definitely the issue. After deleting those fake .mp3 files, the hang at the end of the scan is way shorter. It looks like I've got a bunch of audio files that are actually problematic, but I'll look into that separately.

Also, is it worth opening a new issue requesting better error logs for non-flac files? I'm still getting a bunch of the MPEG errors, along with a few of these:

E: 9.338: formats._misc.MusicFile: monkeysaudio.py:50:__init__: AudioFileError: not a Monkey's Audio file

It would be super helpful if it told me what file was the issue. At this point, I have no idea.

@lazka
Member

lazka commented Oct 28, 2016

Yeah, that could be improved. You might get more info if you start with "--debug". If you find any files which work in other players but fail to load in QL feel free to send them to me.

@lazka
Member

lazka commented Oct 28, 2016

> Also, is it worth opening a new issue requesting better error logs for non-flac files?

I think that's covered by #2077

@lorenrogers
Author

Any suggestions on how ignored files should be specified? I was thinking of something like .gitignore, but with a default file referenced from the system installation. If there's a .quodlibetignore file in ~/, it would override the system one. (Which would include things like resource forks.)

@kaffeeundsalz

Are you talking about how QL should technically handle ignored files – or how users should specify them? If the latter is the case, I'd opt for a comma-separated list in the preferences rather than having users create config files manually.

@lorenrogers
Author

Hmm. I guess I'm not totally sure which one. There are two things happening here, as I see it. One is that users should be able to specify ignored files explicitly, and I agree that having a list in the settings would be a logical way to do that.

The other thing happening is that there are going to be files that QL should ignore by default, but may need to be overridden (like the resource forks). Although unlikely, some folks may want to name their files in a way that matches the files we think QL should ignore by default. In this case, they should be free to disable that section of the ignore rules.

As I see it, this will be a rarely used feature. Most users would just keep the defaults, so for them it wouldn't matter how it's configured. But for those who are interested in adding or removing specific lines, a default dotfile config override pattern may make sense. At least, it's the first thing that came to my mind.

For example, we may have a default ignore file on the system that looks like this:

._*.flac
._*.FLAC
._*.mp3
._*.MP3
(etc...)

Then, when the program loads, if there's no ignore file in ~/, it uses that default one. If someone wants custom ignore rules (or if the system offers to ignore files that turn out to be corrupted), they could copy that file to ~/.quodlibetignore (or whatever it's called) and modify it as needed:

._*.flac
._*.FLAC
._*.MP3
/home/user/music/corrupt_files/*
/home/user/music/even_more_corrupt_files/*
(etc...)

(Note the ._*.mp3 was removed.)
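The override scheme described above could be sketched roughly as follows. None of this exists in QL; the function names are made up for illustration. It also assumes that patterns containing a "/" are matched against the full path and all other patterns against the file name only, which is one possible reading of the example files.

```python
import fnmatch
import os


def parse_patterns(text):
    """Return the non-empty lines of an ignore file as a list of glob patterns."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def load_patterns(user_file, default_file):
    """Use the user's ignore file if it exists, otherwise the system default."""
    chosen = user_file if os.path.isfile(user_file) else default_file
    with open(chosen, encoding="utf-8") as handle:
        return parse_patterns(handle.read())


def is_ignored(path, patterns):
    """Return True if `path` matches any of the glob `patterns`."""
    name = os.path.basename(path)
    for pattern in patterns:
        # Assumption: patterns with a "/" apply to the whole path, others to the name.
        target = path if "/" in pattern else name
        # fnmatchcase is case-sensitive, hence separate .flac and .FLAC lines.
        if fnmatch.fnmatchcase(target, pattern):
            return True
    return False
```

With the default list, a file like ._01.mp3 would be skipped; with the customized list, where the ._*.mp3 line was removed, the same file would be scanned again.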

Anyway, sorry for the long comment, that was just my thought.

@declension
Member

Perhaps let's start with the simple / most consistent with the rest of QL:

  • A config item entry consisting of a newline-separated glob (as above) or regex (always confusing with files, but more powerful)
  • ...that is editable in the prefs directly at first
  • ...and once stable has a prefs widget.

Ideally it would use the existing library.exclude settings, it's just that migrating that could be tricky. On examination, it's currently a list of strings delimited on : that form string prefixes with which to exclude full paths. @lazka we have config migration methods now though I guess... right?

I suppose it could just rewrite each entry ex as r'^%s' % re_escape(ex) and do a re.search to keep existing functionality.
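A minimal sketch of that rewrite, assuming the setting is a single string of path prefixes delimited by ":" as described above. It uses re.escape from the standard library in place of the re_escape helper mentioned in the comment, and the function names are hypothetical, not QL's config migration code.

```python
import re


def prefixes_to_regexes(exclude_value):
    """Turn a ':'-delimited string of path prefixes into anchored regex strings."""
    prefixes = [prefix for prefix in exclude_value.split(":") if prefix]
    # Escaping keeps characters such as "." literal; "^" anchors at the start
    # of the path so the regex behaves like the old string-prefix test.
    return ["^" + re.escape(prefix) for prefix in prefixes]


def is_excluded(path, regexes):
    """Return True if any of the regexes matches `path`."""
    return any(re.search(regex, path) for regex in regexes)
```

Because each migrated entry is anchored and escaped, re.search should give the same answers as the prefix comparison, while newly written entries would be free to use full regex syntax.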

@lazka
Member

lazka commented Nov 2, 2016

I don't see the use case for ignoring anything besides hidden files and I don't see the use case for having hidden files you would want to import. So I'd just hardcode ignoring hidden files.

If you have other "corrupt" files this is either a bug in QL and should be fixed so they can be loaded or they are really corrupt and should be deleted.

@declension
Member

That's definitely a lot simpler yes.

And to confirm, by hidden we're talking about files prefixed with ".", right? (No Windows / xattr FS magic.)

@lazka
Member

lazka commented Nov 2, 2016

> And to confirm, by hidden we're talking about files prefixed with ".", right? (No Windows / xattr FS magic.)

Yeah. But I guess there is no drawback in supporting native variants of the "hidden file" concept as well (GetFileAttributesW() on Windows, resource forks on OS X). The "." prefix should probably be supported everywhere, since, being part of the name, it tends to spread to all platforms.
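A rough sketch of what "ignore hidden files" could look like, covering the "." prefix on every platform plus the Windows hidden attribute. It is written for Python 3 and is not the change that was eventually committed to QL. The function names are hypothetical, and instead of calling GetFileAttributesW() directly it reads st_file_attributes from os.stat, which is only present on Windows.

```python
import os
import stat


def is_hidden(path):
    """Return True for dot-prefixed names and for files flagged hidden on Windows."""
    if os.path.basename(path).startswith("."):
        return True
    try:
        # st_file_attributes only exists on Windows; default to 0 elsewhere.
        attributes = getattr(os.stat(path), "st_file_attributes", 0)
    except OSError:
        return False
    return bool(attributes & stat.FILE_ATTRIBUTE_HIDDEN)


def iter_visible_files(root):
    """Yield files below `root`, skipping hidden files and hidden directories."""
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune in place so os.walk does not descend into hidden directories.
        dirnames[:] = [d for d in dirnames if not is_hidden(os.path.join(dirpath, d))]
        for name in filenames:
            full_path = os.path.join(dirpath, name)
            if not is_hidden(full_path):
                yield full_path
```

Since the ._ resource fork companions start with a dot, the prefix check alone already filters them out, which is why the title of this issue treats the two problems as one.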

@kaffeeundsalz

I can think of many use cases where custom ignore patterns would be useful, but I'll just stick with my personal situation: I professionally work with audio files, thus have a bunch of .wav files in my user directory. I never use .wav for my music library, but I keep feeding QL my home directory because the rest of my music is somewhat scattered across many different folders. A simple ignore pattern for *.wav would help me here to avoid having my work files pop up in QL.

There's more to it. What if a user wants to manage only lossless files with QL? What if, for whatever reason, someone likes to keep their music files prefixed with a dot (which is something @lorentrogers already pointed out)?

Sure, there are still other ways to achieve this. I can of course use Search Library to hide .wav files, but this permanently clutters up my search term. And I can of course point QL only to specific folders, but this would mean a lot of work instead of just telling it to ignore certain files. Talk about making things simpler.

One of the things I like about QL is that it is much more customizable than other music players/managers. As it says on the website, "it's designed around the idea that you know how to organize your music better than we do." Please keep it that way.

@declension
Member

@kaffeeundsalz fair point.
As an aside though, I do the same thing for WAVs with a global filter (Prefs -> Browser -> Search): &(otherstuff, ~format=!wav)

@lazka
Member

lazka commented Nov 2, 2016

Ok, fair enough. -> #2078

@lorenrogers
Author

@kaffeeundsalz Agreed - there are lots of situations where I'd want custom rules.

I'll close out this issue for now. The original issue has been solved (I've got corrupt files), and #2078 covers the request for custom ignore filters. We should continue the discussion over there.

@lazka
Member

lazka commented Nov 2, 2016

I'd like to track this separately.

@lazka lazka reopened this Nov 2, 2016
@lazka lazka closed this as completed in ad07873 Nov 17, 2016
@urielz
Contributor

urielz commented Dec 15, 2017

This is a closed thread but I thought of adding a bit of relevant info in case someone runs into this issue in the future:

Like @saucemcboss, I also have a part of my music library on a NAS and I use SSHFS to mount the directory. A way to prevent the creation of those ._* files is to mount with SSHFS using the following options:

sshfs usr@server:/path mnt_path -o noapplexattr,noappledouble
