-
Notifications
You must be signed in to change notification settings - Fork 1.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Decompose unicode paths and unidecode each component. Fixes #1087 #1159 #1541 #2286 #2303
Conversation
titles that decode to path separator. Added an empty .mp3 file with problematic characters for asciify_paths in id3v2 tags for future tests if needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks fantastic; thank you! ✨
I noticed that the PR adds a new test file, unicode_path_sep.mp3
, but it's not used anywhere that I can see. Any chance that was a mistake?
self.lib.replacements = [(re.compile(u'"'), u'q')] | ||
self.lib.replacements = [ | ||
(re.compile(u'"'), u'q'), | ||
(re.compile(os.path.sep), config['path_sep_replace'].get()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this addition to the replacements
necessary for this test? I don't quite see how it works…
It might also be nice to have a separate test just focused on the new functionality so we can see exactly what's being tested.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was needed since the test just operates on a string, not a path. As you point out and I notice in hindsight, this test doesn't actually cover the whole code path that I modified, so it's back to the drawing board.
The unicode_path_sep.mp3 was going to be used to make an end-to-end test of asciify_paths, but as a test newbie it was hard for me to figure out how to write the test, so I went with modifying the existing asciify test. If you can enlighten me as to what test frameworks are being used so I can look for documentation about them, I can try writing a more fully-featured test. |
Aha; thanks for explaining. How about this test? Nice and simple, and I think it tests the change you've made. def test_asciify_character_expanding_to_slash(self):
config['asciify_paths'] = True
self.lib.directory = b'lib'
self.lib.path_formats = [(u'default', u'$title')]
self.i.title = u'ab\xa2d'
self.assertEqual(self.i.destination(), np('lib/abC_d')) |
I'll throw it into my branch this week and finally get tests working
locally on my workstation, then update the PR. If this test is good, do
you still want the test .mp3 checked in?
|
Cool; thanks for looking into it! No—if this unit test suffices, there's no need for the new test MP3. |
I added the test you came up with off the top of your head because you're an old wizard from a kung fu movie and it passed all my tests; something was up with one of the travis tests, though. |
It me. Anyway, something weird is up with Travis… I'm not sure what to make of it, but I don't think it's related. But actually, the Windows build on AppVeyor turned up something interesting: https://ci.appveyor.com/project/beetbox/beets/build/1141/job/mncrvbw3y1425e4e In particular, the path still has backslashes where there were slashes introduced by "asciification." I think this means that the code should probably check for both Lines 74 to 77 in ae5e55c
Perhaps that should become a utility function? |
Yeah, the paths should all go through the same asciification process. The thing I'm not sure about (because I don't have quite an encyclopedic knowledge of the code), is at what stage the dbcore snippet comes into play - before or after the library asciify_paths bit? Is the output of dbcore used for subpath value, vice versa, or neither? |
Good question. It happens before. Here's the deal: dbcore is responsible for filling out templates and constructing paths in a general, abstract way; then the |
It sounds to me like the question is: do we expect that paths are always going to be calculated based on asciify_paths as they move around inside beets, or have that applied only at the last minute as the library is about to write to the file system? Right now it looks like beets considers the output of dbcore as the path as objects are slung around by the longshoremen, and then at the last minute it gets asciified as part of library. If my understanding is right, we've got three options: should asciify_paths be done in dbcore so everything is consistent all the time, which might open up more bugs later? *I can see a plugin that exports to a fat32 usb stick or something using asciify_paths, but leaving the main library on a unicode-aware volume. The other question is one of etiquette: barring option 3, do you want this pull request closed and a new one opened for the refactor, or keep working in this PR on my branch so context is kept? |
I'd agree with the longshoreman-slinging summary except for an important difference: I don't think anyone sees the direct output of dbcore's path formatting. For beets, all generated paths come from that So yes, let's make it a utility function! In fact, we probably want to use it for the plain %asciify path function too, also in About PR etiquette, I don't have a strong opinion either way, but I generally find it easiest to keep on the same PR to continue the thread instead of switching to a new one. |
* make asciifying handle both os.sep and os.altsep (testing needed as I don't have a windows box handy) * make %asciify{} use the same code path as the asciify_paths goop. * added a discrete test to %asciify{} so my life acts as a warning to others * changelog note now with 80% less antihistamine-induced runon sentences
Aside from the one travis thing which I believe is fixed in a @jrobeson PR (per IRC), this looks good to go. Third time's a charm? |
Yep, that'd be #2310. This looks great! Thank you for sticking with this. I'll merge this now. ✨ 🚀 All the folks watching the related tickets should get to thanking you profusely. |
if asciify_paths on: breakdown incoming paths, run unidecode on each one, substitute os.path.sep into beets.config['path_sep_replace'], recompose paths with os.path.sep.