A problem I'm running into is files that are part of a series but the series number is in the middle.
Title Name part 1 - Author Album etc.ext
Title Name part 2 - Author Album etc.ext
Title Name part 3 - Author Album etc.ext
These clearly aren't duplicates but are hard to filter out without a near 100% hardness level. There are a few of these scattered throughout the list of matches that are hard to find manually. What would really help is one of the following sort options to isolate these:
Column to sort any files with numbers in their name together
This is the simplest method as it merely separates any file with numerical content so they can be easily combed through manually.
Column to sort any files where dupes differ only in number
A slightly more targeted approach: instead of sorting by any file with numbers, sort files where the number is the only differing word within the duplicate group. This would quickly separate file series from regular dupes.
You could also include logic to capture other number formats (1 2 3, one two three, I II III, etc).
Numbered file series are pretty common, and people may place the number anywhere. Being able to sort these together using either of these two methods would save a ton of manual searching.
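To make the second proposal concrete, here is a rough sketch of the "differ only in number" idea in Python. It is not how the application works; the function names series_key and find_series are made up for illustration, and it only handles plain digits (not "one two three" or roman numerals).

```python
import re
from collections import defaultdict

NUMBER = re.compile(r"\d+")
PLACEHOLDER = "\0"  # a character that cannot appear in a file name


def series_key(filename):
    """Replace every run of digits with a placeholder so numbered siblings share a key."""
    return NUMBER.sub(PLACEHOLDER, filename)


def find_series(filenames):
    """Return groups of distinct names that are identical once their numbers are masked."""
    groups = defaultdict(set)
    for name in filenames:
        if NUMBER.search(name):  # names without digits can never form a series
            groups[series_key(name)].add(name)
    return [sorted(names) for names in groups.values() if len(names) > 1]


names = [
    "Title Name part 1 - Author Album.ext",
    "Title Name part 2 - Author Album.ext",
    "Title Name part 3 - Author Album.ext",
    "Other Song.ext",
    "Other Song (copy).ext",
]
series = find_series(names)
```

With the sample names, the three "part N" files end up in one group and the two "Other Song" files are left out, which is the separation the proposal asks for.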
I was about to suggest enabling "Use regular expressions when filtering", which would have allowed you to use the filter box with something like part \d+. That would have been a solution similar to your first proposition, working right out of the box.
However, when I tried it myself, I noticed that the regex feature seems broken.
I'll check this out soon and see if there's a bug that needs fixing. But normally, that should help you solve your problem.
Ah, sweet, thanks. Let me know when it's updated.
The Regexp filtering option is fine, it's just that its activation is reversed under Windows and Linux. So until the next bugfix release, you have to uncheck the "Use regular expressions when filtering" checkbox to enable the feature.
So my first recommendation still stands: If you enable regex filtering, you could search for something like parts \d+ to quickly locate the dupes you're looking for.
Ah, OK. I'm on Mac, so it should be working as normal. It does work with parts \d+, but for some reason the regex doesn't work on its own as \d+. I want to use the regex alone because some multi-part files simply have a number and don't actually say "part" or "parts". But when I type \d+ by itself, it keeps showing all results, both with and without digits, as if the filter box were empty. Is this a bug or am I doing something wrong?
Also, is there any way to do boolean searches with the filter? I tried NOT and - before a term but it didn't exclude the term.
The search is done on the whole path, so if there's any number anywhere in the path, \d+ would always match. Your best resource here would be a tutorial on regexes, like http://www.regular-expressions.info/tutorial.html
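A small Python sketch of why this happens, assuming the filter behaves like re.search applied to the full path string (the paths below are invented for illustration):

```python
import re

# Hypothetical paths; only the second one has a digit in the file name itself.
paths = [
    "/Users/me/Music/2010/Some Song.flac",
    "/Users/me/Music/Live/Title Name part 2 - Author.flac",
    "/Users/me/Music/Live/Another Song.flac",
]

digit = re.compile(r"\d+")

# search() scans the whole string, so a digit in any folder name is enough to match.
path_hits = [p for p in paths if digit.search(p)]
```

Here the first path matches only because of the "2010" folder, even though its file name contains no number. If every result sits under a folder with a digit in its name, \d+ filters nothing out.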
Oh. In your documentation it says filename, not file path, you might want to update that if it is searching path as that could lead to accidents.
Is there a way to apply the filter to just the file names? Or should I open a new request? It seems like a genuinely useful feature if it's not already there.
Clarify documentation about results filtering
It wasn't clear that filtering was applied to whole paths.
You're right, it wasn't clear in the documentation that filters are applied to whole paths. I fixed it in the commit referenced above.
As for filtering just the filename, again, it's possible with some regex-fu, but I'll admit that it's not the easiest thing to do. For example, with the filter d+[^/]*$, we get all dupes with a digit in their filenames. We do this by saying "give me all paths with a digit not followed by a slash until the end". This is equivalent to a filename constraint.
Cool, thanks. Perhaps adding a page in the documentation for this and a few other common regexes would be useful if you don't want to add the specific feature. The first regex I could probably have done, but the second one would probably take me a while to figure out, and I'd still be hesitant about its reliability with a big set of files I'm about to delete.
A link in the preferences next to the regex option would help people to find this list, too.
Yes, I agree the documentation can be improved further to give examples. I'm changing the scope of this ticket to a Documentation ticket with the goal of letting the next user with the same needs as you have a smoother ride :)
Hey, I finally got around to trying out the d+[^/]*$ you suggested and I'm not sure if I'm using it wrong but it doesn't seem to be working. You can see here it returns some items that don't have any numbers in either the file name or the path:
Oh! Sorry, I mistyped my example and left out the leading backslash. The correct expression is: \d+[^/]*$
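For anyone who wants to check the difference before trusting it on a large result set, here is a quick Python sketch comparing the mistyped and corrected expressions. It assumes the filter works like re.search on the whole path and that paths use "/" as the separator; the paths are made up.

```python
import re

mistyped = re.compile(r"d+[^/]*$")    # literal letter "d", then no slash until the end
corrected = re.compile(r"\d+[^/]*$")  # a digit, then no slash until the end

# Hypothetical paths for illustration.
paths = [
    "/music/2010/Second Wind.flac",                   # digit only in a folder name
    "/music/live/Title Name part 2 - Author.flac",    # digit in the file name
    "/music/live/Another Song.flac",                  # no digit anywhere
]

mistyped_hits = [p for p in paths if mistyped.search(p)]
corrected_hits = [p for p in paths if corrected.search(p)]
```

Without the backslash, the expression matches any file name containing the letter "d", which is why results without numbers showed up. With the backslash, only the path with a digit in its file name is kept. One caveat: the extension is part of the file name, so files ending in something like .mp3 or .mp4 will match the corrected expression regardless of the rest of the name.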