Library scan: ignore hidden files (and thus osx resource forks) #2074

Closed
saucemcboss opened this Issue Oct 27, 2016 · 27 comments

saucemcboss commented Oct 27, 2016

I've got a large library I keep on my NAS, and I access it over a share like AFP, SMB, or SSHFS. (At the moment, I'm using SSHFS, but the issue applies to all.) When I open QL, it starts an auto scan of my library, which is great. The problem is, when it gets to the end, it hangs for about 2 minutes. Interacting with it does nothing, and it looks like it crashed. I imagine this has something to do with the database updating the lists in the UI.

If I pause the scan, start playing something, then unpause it, everything is fine. I assume the UI still locks up, but it continues playing whatever I'd started.

I'm running Debian 8, with QL version 3.2.2.

Member

lazka commented Oct 27, 2016

Please run the following command:

python2 -m cProfile -o prof.out $(which quodlibet)

Wait until QL settles after scanning has finished, then quit the program. This should produce a file "prof.out", which you can send me (reiter.christoph@gmail.com).

You could also try a newer version (3.7.1 currently) through our debian repo: https://quodlibet.readthedocs.io/en/latest/downloads.html#debian
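For anyone who wants to look at such a profile locally before sending it, here is a minimal sketch using the standard library's pstats module. The function name slowest_calls is made up for illustration, and it assumes the profile is read by the same Python version that wrote it, since profile dumps are not guaranteed to load across versions.

```python
import pstats


def slowest_calls(profile_path, limit=10):
    """Return the `limit` entries with the highest cumulative time.

    Each row is (cumulative_seconds, own_seconds, call_count, location).
    """
    stats = pstats.Stats(profile_path)
    rows = []
    # stats.stats maps (file, line, function) to
    # (primitive calls, total calls, own time, cumulative time, callers).
    for (filename, lineno, funcname), entry in stats.stats.items():
        _primitive, calls, own_time, cumulative, _callers = entry
        location = "%s:%d(%s)" % (filename, lineno, funcname)
        rows.append((cumulative, own_time, calls, location))
    rows.sort(reverse=True)
    return rows[:limit]
```

Calling slowest_calls("prof.out") and printing the rows should show directory listing and stat calls near the top if the scan is I/O-bound.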

@lazka lazka added the needinfo label Oct 27, 2016

saucemcboss commented Oct 27, 2016

Profile sent - I'll try the updated version and send a profile of that as well. I believe I was running the QL in the debian repo.

Member

lazka commented Oct 27, 2016

Thanks. 26 sec is spent in os.listdir and 14 sec in os.stat. Not much we can do (besides you disabling auto scan in the settings).

https://pypi.python.org/pypi/scandir would be a solution, but that's not in debian
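As a rough illustration of the scandir idea, here is a sketch of a directory walk built on os.scandir, which is in the standard library from Python 3.5 on (the PyPI package linked above is its backport for older versions). The helper name iter_files is hypothetical and this is not how QL's scanner is written. The directory entries carry the file type, so the walk avoids a separate stat call just to tell files from directories; how much that saves on a network mount such as SSHFS is not measured in this thread.

```python
import os


def iter_files(root):
    """Yield (path, stat_result) for every regular file below `root`."""
    stack = [root]
    while stack:
        current = stack.pop()
        try:
            with os.scandir(current) as entries:
                for entry in entries:
                    # is_dir()/is_file() can usually be answered from the
                    # directory listing itself, without an extra stat call.
                    if entry.is_dir(follow_symlinks=False):
                        stack.append(entry.path)
                    elif entry.is_file(follow_symlinks=False):
                        yield entry.path, entry.stat(follow_symlinks=False)
        except OSError:
            # Unreadable or vanished directory: skip it and carry on.
            continue
```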

saucemcboss commented Oct 27, 2016

Just confirmed that the issue still exists in 3.7.1. I'll send a profile for that too.

I'm also getting a lot of errors in the terminal that look like this:

E: 24.576: formats._misc.MusicFile: flac.py:904:__check_header: AudioFileError: '/home/lorentrogers/mount/music/Jazz/Nat King Cole/Giants Of Jazz/._23. Exactly Like You.flac' is not a valid FLAC file

Unlikely to be related right?


Audio device: GStreamer
Python: 2.7.9
Mutagen: 1.34.1
GTK+: 3.14.5 (X11)
PyGObject: 3.14.0
GStreamer: 1.4.4.0

saucemcboss commented Oct 27, 2016

Actually, this issue seems to freeze my entire window manager. (I'm using i3)

Member

lazka commented Oct 27, 2016

> Just confirmed that the issue still exists in 3.7.1. I'll send a profile for that too.

Thanks

> I'm also getting a lot of errors in the terminal that look like this:
>
> E: 24.576: formats._misc.MusicFile: flac.py:904:__check_header: AudioFileError: '/home/lorentrogers/mount/music/Jazz/Nat King Cole/Giants Of Jazz/._23. Exactly Like You.flac' is not a valid FLAC file
>
> Unlikely to be related right?

Hm, I see it tries to load 901 songs for whatever reason, but they fail rather quickly. Do you know where these hidden files are coming from? This might be similar to #914.

The second profile for some reason locks in some gtk stuff, which suggests that QL is overwhelming the system somehow and your WM freezes.

saucemcboss commented Oct 27, 2016

Hmm. Watching the output more carefully, I see that the errors are all dumped at the end, right when it freezes.

Looking at the files referenced in that last error, it seems like they're some sort of Mac-generated file. They're mostly binary, but I'm able to see "Mac OS X", "ATTR", "DEPRIMARY", and "This resource fork intentionally left blank" mixed in with a bunch of binary blocks. No idea where these files came from -- they may have been in there when I got them.

I'm also getting this error throughout my logs:

E: 16.645: formats._misc.MusicFile: __init__.py:387:__init__: AudioFileError: can't sync to MPEG frame

I'm not sure how many error lines are being printed, but it seemed like it could be about 901.

Member

lazka commented Oct 27, 2016

If you are confident that you don't need these files please try to delete them. If that helps we should look into fixing #914 ..

saucemcboss commented Oct 27, 2016

I deleted all the offending .flac files:

find . -type f -name '._*.flac' -delete

Not sure if it helped, but it may have paused a little less. I'm not getting the "not a FLAC file" error now, but I'm still getting a bunch of these:

E: 23.080: formats._misc.MusicFile: __init__.py:387:__init__: AudioFileError: can't sync to MPEG frame

Member

lazka commented Oct 27, 2016

There should also be around 400 mp3 files with similar problems (maybe do the same with *.mp3?)
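Before deleting anything, it may help to list every ._* file regardless of extension. The following is a sketch of such a dry run; the function name find_dot_underscore_files is invented for this example, and nothing is removed by it.

```python
import fnmatch
import os


def find_dot_underscore_files(root, pattern="._*"):
    """Return a sorted list of files below `root` whose name matches `pattern`."""
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # fnmatchcase keeps the match case-sensitive on every platform.
            if fnmatch.fnmatchcase(name, pattern):
                matches.append(os.path.join(dirpath, name))
    return sorted(matches)
```

Reviewing the returned list first makes it easier to confirm that only the unwanted companion files would be affected by a later delete.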

kaffeeundsalz commented Oct 27, 2016

These ._ files are actually the resource fork parts of their originals. macOS writes them whenever it wants to store file metadata on non-HFS+ filesystems, as happens with network shares via SMB. On HFS+ volumes, this data gets stuffed into the extended file attributes. Further reading: https://en.wikipedia.org/wiki/Resource_fork

QuodLibet presumably doesn't recognize these as resource forks and treats them like actual data files, which produces the errors in the log.
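One way a scanner could recognize such files by content rather than by name: the ._ companions are stored in the AppleDouble format, whose header starts with the big-endian magic number 0x00051607. The sketch below checks only that magic number; the helper name is_appledouble is hypothetical and this is not something QL is known to do.

```python
import struct

# First four bytes of an AppleDouble file, as a big-endian unsigned int.
APPLEDOUBLE_MAGIC = 0x00051607


def is_appledouble(path):
    """Return True if the file at `path` starts with the AppleDouble magic."""
    with open(path, "rb") as handle:
        header = handle.read(4)
    if len(header) < 4:
        return False
    (magic,) = struct.unpack(">I", header)
    return magic == APPLEDOUBLE_MAGIC
```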

@lazka lazka added bug and removed needinfo labels Oct 27, 2016

@lazka lazka changed the title from UI blocked while scanning to Library scan: ignore hidden files (and thus osx resource forks) Oct 27, 2016

saucemcboss commented Oct 28, 2016

This is definitely the issue. After deleting those fake .mp3 files, the hang at the end of the scan is way shorter. It looks like I've got a bunch of audio files that are actually problematic, but I'll look into that separately.

Also, is it worth opening a new issue requesting better error logs for non-flac files? I'm still getting a bunch of the MPEG errors, along with a few of these:

E: 9.338: formats._misc.MusicFile: monkeysaudio.py:50:__init__: AudioFileError: not a Monkey's Audio file

It would be super helpful if it told me what file was the issue. At this point, I have no idea.

Member

lazka commented Oct 28, 2016

Yeah, that could be improved. You might get more info if you start with "--debug". If you find any files which work in other players but fail to load in QL feel free to send them to me.

Member

lazka commented Oct 28, 2016

> Also, is it worth opening a new issue requesting better error logs for non-flac files?

I think that's covered by #2077

saucemcboss commented Nov 1, 2016

Any suggestions on how ignored files should be specified? I was thinking of something like .gitignore, but with a default file referenced from the system installation. If there's a .quodlibetignore file in ~/, it would override the system one. (Which would include things like resource forks.)

kaffeeundsalz commented Nov 1, 2016

Are you talking about how QL should technically handle ignored files – or how users should specify them? If the latter is the case, I'd opt for a comma-separated list in the preferences rather than having users create config files manually.

saucemcboss commented Nov 2, 2016

Hmm. I guess I'm not totally sure which one. There are two things happening here, as I see it. One is that users should be able to specify ignored files explicitly, and I agree that having a list in the settings would be a logical way to do that.

The other thing happening is that there are going to be files that QL should ignore by default, but may need to be overridden (like the resource forks). Although unlikely, some folks may want to name their files in a way that matches the files we think QL should ignore by default. In this case, they should be free to disable that section of the ignore rules.

As I see it, this will be a rarely used thing. For the most part, users would just keep everything as the defaults, so for them it wouldn't matter how it's configured. But for those who are interested in adding / removing specific lines, a default dotfile config override pattern may make sense. At least, it's the first thing that came to my mind.

For example, we may have a default ignore file on the system that looks like this:

._*.flac
._*.FLAC
._*.mp3
._*.MP3
(etc...)

Then, when the program loads, if there's no ignore file in ~/, it uses that default one. If someone wants to have custom file ignore rules (or if the system offers to ignore files that are corrupted), they could copy that file to ~/.quodlibetignore (or whatever it's called) and modify it as needed:

._*.flac
._*.FLAC
._*.MP3
/home/user/music/corrupt_files/*
/home/user/music/even_more_corrupt_files/*
(etc...)

(Note the ._*.mp3 was removed.)
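The override idea above could be sketched roughly as follows, using fnmatch for the glob matching. Everything here is an assumption for illustration: the function names, the rule that a user file replaces the system file entirely, the treatment of lines starting with # as comments, and the choice to match patterns containing a slash against the full path and all others against the file name only.

```python
import fnmatch
import os


def load_ignore_patterns(user_file, system_file):
    """Read patterns from `user_file` if it exists, else from `system_file`."""
    chosen = user_file if os.path.isfile(user_file) else system_file
    patterns = []
    try:
        with open(chosen, encoding="utf-8") as handle:
            for line in handle:
                line = line.strip()
                # Blank lines and '#' comments are skipped (an assumption).
                if line and not line.startswith("#"):
                    patterns.append(line)
    except OSError:
        pass  # No readable ignore file: nothing is ignored.
    return patterns


def is_ignored(path, patterns):
    """Return True if `path` matches any of the glob `patterns`."""
    name = os.path.basename(path)
    for pattern in patterns:
        # Patterns with a slash apply to the whole path, others to the name.
        target = path if "/" in pattern else name
        if fnmatch.fnmatchcase(target, pattern):
            return True
    return False
```

Note that fnmatch lets * match across directory separators, so a pattern like /home/user/music/corrupt_files/* also covers files in nested folders.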

Anyway, sorry for the long comment, that was just my thought.

Member

declension commented Nov 2, 2016

Perhaps let's start with the simple / most consistent with the rest of QL:

  • A config item entry consisting of a newline-separated glob (as above) or regex (always confusing with files, but more powerful)
  • ...that is editable in the prefs directly at first
  • ...and once stable has a prefs widget.

Ideally it would use the existing library.exclude settings, it's just that migrating that could be tricky. On examination, it's currently a list of strings delimited on : that form string prefixes with which to exclude full paths. @lazka we have config migration methods now though I guess... right?

I suppose it could just rewrite each entry ex as r'^%s' % re_escape(ex) and do a re.search to keep existing functionality.
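A sketch of that rewrite, under some assumptions: the existing value is taken to be a single string of path prefixes joined by ":", and the standard library's re.escape stands in for the re_escape helper named above. The function names compile_excludes and is_excluded are made up for this example.

```python
import re


def compile_excludes(config_value):
    """Turn a ':'-delimited list of path prefixes into anchored regexes."""
    prefixes = [prefix for prefix in config_value.split(":") if prefix]
    # Escaping keeps characters like '.' literal, so each regex behaves
    # exactly like the old string-prefix test.
    return [re.compile("^" + re.escape(prefix)) for prefix in prefixes]


def is_excluded(path, compiled):
    """Return True if `path` matches any of the compiled exclude regexes."""
    return any(regex.search(path) for regex in compiled)
```

With entries migrated this way, new user-supplied regexes could sit in the same list without changing how old prefix entries behave.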

Member

lazka commented Nov 2, 2016

I don't see the use case for ignoring anything besides hidden files and I don't see the use case for having hidden files you would want to import. So I'd just hardcode ignoring hidden files.

If you have other "corrupt" files this is either a bug in QL and should be fixed so they can be loaded or they are really corrupt and should be deleted.

Member

declension commented Nov 2, 2016

That's definitely a lot simpler yes.

And to confirm, by hidden we're talking files prefixed with ., right (no Windows / xattr FS magic)

Member

lazka commented Nov 2, 2016

> And to confirm, by hidden we're talking files prefixed with ., right (no Windows / xattr FS magic)

Yeah. But I guess there is no drawback in supporting native variants of the "hidden file" concept as well (GetFileAttributesW() on Windows, resource forks on OSX). The "." prefix should probably be supported everywhere, since being part of the name means it tends to spread to all platforms.
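A minimal sketch of such a check, assuming Python 3.5 or later: the dot prefix is tested on every platform, and on Windows the hidden attribute is read through os.stat, whose st_file_attributes field carries the same attribute bits that GetFileAttributesW() reports. The function name is_hidden is hypothetical, this is not the code from the eventual fix, and macOS-specific flags are not covered.

```python
import os
import stat


def is_hidden(path):
    """Return True for dot-prefixed names and Windows hidden-attribute files."""
    name = os.path.basename(os.path.normpath(path))
    if name in (".", ".."):
        return False
    if name.startswith("."):
        return True
    try:
        # st_file_attributes only exists on Windows; elsewhere the
        # attribute lookup raises AttributeError.
        attributes = os.stat(path).st_file_attributes
    except (AttributeError, OSError):
        return False
    return bool(attributes & stat.FILE_ATTRIBUTE_HIDDEN)
```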

kaffeeundsalz commented Nov 2, 2016

I can think of many use cases where custom ignore patterns would be useful, but I'll just stick with my personal situation: I professionally work with audio files, thus have a bunch of .wav files in my user directory. I never use .wav for my music library, but I keep feeding QL my home directory because the rest of my music is somewhat scattered across many different folders. A simple ignore pattern for *.wav would help me here to avoid having my work files pop up in QL.

There's more to it. What if a user wants to manage only lossless files with QL? What if, for whatever reason, someone likes to keep their music files prefixed with a dot (which is something @lorentrogers already pointed out)?

Sure, there are still other ways to achieve this. I can of course use Search Library to hide .wav files, but this permanently clutters up my search term. And I can of course point QL only to specific folders, but this would mean a lot of work instead of just telling it to ignore certain files. Talk about making things simpler.

One of the things I like about QL is that it is much more customizable than other music players/managers. As it says on the website, "it's designed around the idea that you know how to organize your music better than we do." Please keep it that way.

Member

declension commented Nov 2, 2016

@kaffeeundsalz fair point.
As an aside though, I do the same thing for WAVs with a global filter (Prefs -> Browser -> Search): &(otherstuff, ~format=!wav)

Member

lazka commented Nov 2, 2016

Ok, fair enough. -> #2078

saucemcboss commented Nov 2, 2016

@kaffeeundsalz Agreed - there are lots of situations where I'd want custom rules.

I'll close out this issue for now. The original issue has been solved (I've got corrupt files), and #2078 covers the request for custom ignore filters. We should continue the discussion over there.

@saucemcboss saucemcboss closed this Nov 2, 2016

Member

lazka commented Nov 2, 2016

I'd like to track this separately.

@lazka lazka reopened this Nov 2, 2016

@lazka lazka closed this in ad07873 Nov 17, 2016

Contributor

urielz commented Dec 15, 2017

This is a closed thread but I thought of adding a bit of relevant info in case someone runs into this issue in the future:

Like @saucemcboss, I also have a part of my music library on a NAS and I use SSHFS to mount the directory. A way to prevent the creation of those ._* files is to mount with SSHFS using the following options:

sshfs usr@server:/path mnt_path -o noapplexattr,noappledouble
