Join GitHub today
GitHub is home to over 28 million developers working together to host and review code, manage projects, and build software together.
Sign upQuality-based Trumping during import #116
Comments
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
hchapman
Apr 3, 2013
This would be superb. In my spare time I was going to try to write something like this.
hchapman
commented
Apr 3, 2013
|
This would be superb. In my spare time I was going to try to write something like this. |
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
sampsyo
Apr 3, 2013
Member
Awesome! Now that we have a central place in the importer workflow where replacement decisions are made, this should be relatively straightforward to implement -- it's just a matter of coming up with the right design for how it should work.
On Wed, Apr 3, 2013 at 9:47 AM, Harrison Chapman notifications@github.com
wrote:
This would be superb. In my spare time I was going to try to write something like this.
Reply to this email directly or view it on GitHub:
#116 (comment)
|
Awesome! Now that we have a central place in the importer workflow where replacement decisions are made, this should be relatively straightforward to implement -- it's just a matter of coming up with the right design for how it should work. On Wed, Apr 3, 2013 at 9:47 AM, Harrison Chapman notifications@github.com
|
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
hchapman
Apr 7, 2013
This is my first working with beets, so mind how naive I am.
I've written a really silly (and still crufty) plugin which tries to use eyed3 to read the Xing header of the mp3 to determine the encoding preset (e.g. v0, cbr 320) and adds a $quality name template variable. It'd be nice to know the presets because if I were quality trumping, I'd personally take v0 over a 320, but a 320 over a v2, and I do not believe the ABR is enough information to do this reliably always.
I'd like to make this into a plugin, since right now it depends on eyed3, which I do not believe beets presently does. Any chance that the plugin docs are out of date, and that the database is now extensible by plugins? It would be nice to be able to beets ls quality:v0 or similar to help sort through the library without having to inspect every single mp3, and I don't if there is another way to implement this type of behavior.
I haven't given any look into modifying the replacement decisions via a plugin yet: my brief scan of the plugin docs didn't seem to include much information on this. I'll have to scan the source.
hchapman
commented
Apr 7, 2013
|
This is my first working with beets, so mind how naive I am. I've written a really silly (and still crufty) plugin which tries to use eyed3 to read the Xing header of the mp3 to determine the encoding preset (e.g. v0, cbr 320) and adds a $quality name template variable. It'd be nice to know the presets because if I were quality trumping, I'd personally take v0 over a 320, but a 320 over a v2, and I do not believe the ABR is enough information to do this reliably always. I'd like to make this into a plugin, since right now it depends on eyed3, which I do not believe beets presently does. Any chance that the plugin docs are out of date, and that the database is now extensible by plugins? It would be nice to be able to I haven't given any look into modifying the replacement decisions via a plugin yet: my brief scan of the plugin docs didn't seem to include much information on this. I'll have to scan the source. |
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
mister-walter
Apr 7, 2013
Check out this thread on the Google Groups forum for information on storing arbitrary data in the beets library. Sampsyo indicated that such a feature (flexattrs) might be coming in beets version 1.2.
mister-walter
commented
Apr 7, 2013
|
Check out this thread on the Google Groups forum for information on storing arbitrary data in the beets library. Sampsyo indicated that such a feature (flexattrs) might be coming in beets version 1.2. |
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
sampsyo
Apr 8, 2013
Member
Very cool! Nice work, and not silly at all.
Regarding the dependency on eyed3: yes, we'll need to replace that with an equivalent parser for Mutagen/MediaFile. Do you know how eyed3 determines the LAME preset? Is it by parsing the "encoder" string? If so, then we can re-implement that in a fairly straightforward way.
As @mister-walter mentioned, yes, we're currently in the middle of building support for arbitrary metadata fields in the database. See also #101 for the current status of this feature. I'm planning to make it the first big feature in 1.2, which will start development in earnest soon when 1.1 is out the door (very nearly ready). This is yet another good use case for that feature.
|
Very cool! Nice work, and not silly at all. Regarding the dependency on eyed3: yes, we'll need to replace that with an equivalent parser for Mutagen/MediaFile. Do you know how eyed3 determines the LAME preset? Is it by parsing the "encoder" string? If so, then we can re-implement that in a fairly straightforward way. As @mister-walter mentioned, yes, we're currently in the middle of building support for arbitrary metadata fields in the database. See also #101 for the current status of this feature. I'm planning to make it the first big feature in 1.2, which will start development in earnest soon when 1.1 is out the door (very nearly ready). This is yet another good use case for that feature. |
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
hchapman
Apr 8, 2013
I think eyed3 uses the xing header which is in the blank first frame of the mp3. Encoders like LAME use it, and possibly some others as well.
I don't know how eyed3 uses this data to determine/guess the encoding preset! It's open source though...
hchapman
commented
Apr 8, 2013
|
I think eyed3 uses the xing header which is in the blank first frame of the mp3. Encoders like LAME use it, and possibly some others as well. I don't know how eyed3 uses this data to determine/guess the encoding preset! It's open source though... |
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
sampsyo
Apr 8, 2013
Member
Cool. That's worth investigating to see how complex the logic involved is.
|
Cool. That's worth investigating to see how complex the logic involved is. |
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
hchapman
Apr 8, 2013
By lifting the LameHeader class and three utility functions from eyed3, I've added super rudimentary LAME header support to mutagen in this branch (super rudimentary means that it just tries to find the LAME header, and spits it out while it opens an mp3 file, with no actual functionality). It's understandable that this code isn't in mutagen, which doesn't seem to be concerned with this extra data (especially since it may or not have an equivalent in any other type of media file), but it was interesting to see how it's certainly not impossible to add it.
To test:
import mutagen.mp3
mutagen.mp3.Open(MP3_FILE_NAME)
hchapman
commented
Apr 8, 2013
|
By lifting the LameHeader class and three utility functions from eyed3, I've added super rudimentary LAME header support to mutagen in this branch (super rudimentary means that it just tries to find the LAME header, and spits it out while it opens an mp3 file, with no actual functionality). It's understandable that this code isn't in mutagen, which doesn't seem to be concerned with this extra data (especially since it may or not have an equivalent in any other type of media file), but it was interesting to see how it's certainly not impossible to add it. To test: import mutagen.mp3
mutagen.mp3.Open(MP3_FILE_NAME) |
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
sampsyo
Apr 8, 2013
Member
Cool; thanks. Looks like this header is actually substantially more complex than I originally imagined—the Xing header is yet another arcane binary format. FWIW, there's discussion (years old now) on the Mutagen bug tracker for including similar functionality:
http://code.google.com/p/mutagen/issues/detail?id=66
There's a patch but it needs some cleaning up. Maybe the best course of action here would be to pick up that patch, which seems to have been abandoned by the original author, clean it up, add tests, and land it in Mutagen for the next release. (The Mutagen maintainers have been pretty receptive lately.)
Who knew this little bit of data would be so onerous to recover?
|
Cool; thanks. Looks like this header is actually substantially more complex than I originally imagined—the Xing header is yet another arcane binary format. FWIW, there's discussion (years old now) on the Mutagen bug tracker for including similar functionality: There's a patch but it needs some cleaning up. Maybe the best course of action here would be to pick up that patch, which seems to have been abandoned by the original author, clean it up, add tests, and land it in Mutagen for the next release. (The Mutagen maintainers have been pretty receptive lately.) Who knew this little bit of data would be so onerous to recover? |
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
hchapman
Apr 8, 2013
That explains a lot; I didn't even consider mutagen wouldn't be hosted on github, and I didn't think to look at googlecode/bitbucket. I'll give that a look over.
hchapman
commented
Apr 8, 2013
|
That explains a lot; I didn't even consider mutagen wouldn't be hosted on github, and I didn't think to look at googlecode/bitbucket. I'll give that a look over. |
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
hchapman
Apr 11, 2013
The patch applies to present master branch! It works for one of my songs, but none of the test files included with mutagen at present have any preset data :(.
hchapman
commented
Apr 11, 2013
|
The patch applies to present master branch! It works for one of my songs, but none of the test files included with mutagen at present have any preset data :(. |
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
hchapman
Apr 15, 2013
I've put the functionality merged into today's master/default (whatever it's called with hg!) with testing on my Bitbucket fork of mutagen. I've also told the mutagen issues tracker. I haven't heard anything on the issue yet, though.
hchapman
commented
Apr 15, 2013
|
I've put the functionality merged into today's master/default (whatever it's called with hg!) with testing on my Bitbucket fork of mutagen. I've also told the mutagen issues tracker. I haven't heard anything on the issue yet, though. |
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
sampsyo
Apr 15, 2013
Member
Awesome, thanks for taking action! Let's see what the authors think; if necessary, we can email the Quod Libet mailing list in a few days.
|
Awesome, thanks for taking action! Let's see what the authors think; if necessary, we can email the Quod Libet mailing list in a few days. |
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
hchapman
Apr 17, 2013
Well- it does not seem like there is any "way" for a plugin to, at present, influence the importer in checking for duplicates, or even to modify the manual duplicate resolver in the UI (e.g. make it say "we have this album as MP3 in the database, but you're importing a FLAC"). Correct me if I'm wrong! I'm going to think about how to add this broad functionality in.
hchapman
commented
Apr 17, 2013
|
Well- it does not seem like there is any "way" for a plugin to, at present, influence the importer in checking for duplicates, or even to modify the manual duplicate resolver in the UI (e.g. make it say "we have this album as MP3 in the database, but you're importing a FLAC"). Correct me if I'm wrong! I'm going to think about how to add this broad functionality in. |
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
sampsyo
Apr 17, 2013
Member
That's correct; the hook doesn't exist yet. We'll need to add it in the resolve_duplicate method of ImportSession in importer.py.
|
That's correct; the hook doesn't exist yet. We'll need to add it in the |
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
hchapman
Apr 17, 2013
Ideally the plugin (at least one which knows the user's preference to replace MP3 with FLAC) would be UI-agnostic, and as of now resolve_duplicate just throws a NotImplementedError (and the implementation in ui doesn't call the super method as of yet). My initial thought was to add the hook right before resolve_duplicate is called.
I'm not used to thinking this much about software infrastructure. It's fun! Please let me know if the issues thread is not an appropriate place for my musing...
hchapman
commented
Apr 17, 2013
|
Ideally the plugin (at least one which knows the user's preference to replace MP3 with FLAC) would be UI-agnostic, and as of now resolve_duplicate just throws a I'm not used to thinking this much about software infrastructure. It's fun! Please let me know if the issues thread is not an appropriate place for my musing... |
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
sampsyo
Apr 17, 2013
Member
Yes, absolutely—you're right that invoking the plugin before the call to resolve_duplicate makes more sense. That way, subclasses can override that method without replacing this new functionality. Good call.
And yes, absolutely, this issue tracker is the perfect place for this kind of discussion! To me, this is more organized and therefore useful than the mailing list or ad-hoc emails... there's some hope of coming back to the discussion later when we need to remember the decisions we made.
|
Yes, absolutely—you're right that invoking the plugin before the call to And yes, absolutely, this issue tracker is the perfect place for this kind of discussion! To me, this is more organized and therefore useful than the mailing list or ad-hoc emails... there's some hope of coming back to the discussion later when we need to remember the decisions we made. |
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
hchapman
Apr 23, 2013
Moral conundrum: I'm not used to parallel programming so... It'd be nice to add task duplicate data into Importer task, so that plugins could add or remove duplicates automatically by some score calculation... It's easy to me to just carry along duplicate data as, say, a list, rather than calling _check_duplicates twice. But this crumbles under the completely possible scenario where a user has e.g. FLAC and MP3 in the import query for the same album, and the user would like the FLAC to auto trump the MP3. But it's not clear to me that we can just reliably hold on to any ad-hoc album data added to recents in user_query and expect it to still be correct by the time the better quality comes through...
I originally hoped this wouldn't need any sweeping infrastructure changes. Just saying where I'm stumped, and how the coroutine solution is pretty brilliant and still magic to me..
hchapman
commented
Apr 23, 2013
|
Moral conundrum: I'm not used to parallel programming so... It'd be nice to add task duplicate data into Importer task, so that plugins could add or remove duplicates automatically by some score calculation... It's easy to me to just carry along duplicate data as, say, a list, rather than calling _check_duplicates twice. But this crumbles under the completely possible scenario where a user has e.g. FLAC and MP3 in the import query for the same album, and the user would like the FLAC to auto trump the MP3. But it's not clear to me that we can just reliably hold on to any ad-hoc album data added to recents in user_query and expect it to still be correct by the time the better quality comes through... I originally hoped this wouldn't need any sweeping infrastructure changes. Just saying where I'm stumped, and how the coroutine solution is pretty brilliant and still magic to me.. |
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
sampsyo
Apr 23, 2013
Member
Yes, the recent check is a complicated wrinkle in all of this.
One simplification that may help: all tasks hit the database in order. That means that, in the apply_choices coroutine, duplicate albums are already guaranteed to be in the database, even if they were just imported. That means that if the plugin can make all of its decisions there, without user input, then its job is simplified.
So a plugin could prompt the user (or not) based on the limited data available in recent during user_query and then do the heavyweight comparisons later, in apply_choices. Would that make sense?
|
Yes, the One simplification that may help: all tasks hit the database in order. That means that, in the So a plugin could prompt the user (or not) based on the limited data available in |
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
hchapman
Apr 24, 2013
Here's a branch with a preliminary solution. I've moved everything dealing with duplicates (including the user query...) to the apply_changes. It's pretty much entirely untested, and I'm presently not really happy with the implementation.
This beetsplug uses it and it also uses my branch of mutagen (about which I've still heard nothing, although I've still only just posted on the googlecode issue) to get preset data. It's a rudimentary quality trumper which prefers FLAC > V0 > 320 > [etc...]
hchapman
commented
Apr 24, 2013
|
Here's a branch with a preliminary solution. I've moved everything dealing with duplicates (including the user query...) to the This beetsplug uses it and it also uses my branch of mutagen (about which I've still heard nothing, although I've still only just posted on the googlecode issue) to get preset data. It's a rudimentary quality trumper which prefers FLAC > V0 > 320 > [etc...] |
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
hchapman
Apr 24, 2013
I've updated my branch so that all tests pass, but I haven't written any new tests. Curious question: there was an issue in _duplicate_check where cur_artist threw an AttributeError... But the same never happened in the item counterpart...
hchapman
commented
Apr 24, 2013
|
I've updated my branch so that all tests pass, but I haven't written any new tests. Curious question: there was an issue in |
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
hchapman
Apr 24, 2013
It just hit me why moving a user query for the duplicate check is not a good solution. I'm also trying to work in a score system for duplicate replacement calculation instead of a binary take-it-or-leave-it
hchapman
commented
Apr 24, 2013
|
It just hit me why moving a user query for the duplicate check is not a good solution. I'm also trying to work in a score system for duplicate replacement calculation instead of a binary take-it-or-leave-it |
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
sampsyo
Apr 24, 2013
Member
Great; I'm glad that makes sense. Yes, all communication with the user has to be done in the same pipeline stage.
|
Great; I'm glad that makes sense. Yes, all communication with the user has to be done in the same pipeline stage. |
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
hchapman
Apr 24, 2013
I've changed my strategy to a "trump score" system paralleling the distance that's in now for MB matching. Progress on this branch is preliminary, but works for a really rough manual test I've run myself. Automated tests don't fail, but there are no new ones to test my code.
hchapman
commented
Apr 24, 2013
|
I've changed my strategy to a "trump score" system paralleling the distance that's in now for MB matching. Progress on this branch is preliminary, but works for a really rough manual test I've run myself. Automated tests don't fail, but there are no new ones to test my code. |
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
sampsyo
Apr 24, 2013
Member
Interesting approach! I like the change of recent from a set to a dict so we keep around the album itself rather than just its identifier. We should definitely use this (but ensure that it doesn't leak memory...).
And the scoring system is interesting too, but I can also see a more specific interface (which might ask "given these albums, which is the best?" instead of "what's the score for this album?") being somewhat simpler. It would push some of the intelligence into the plugins rather than burdening the import flow more, which is already frighteningly complex. Does that make sense?
On Apr 24, 2013, at 1:33 PM, Harrison Chapman notifications@github.com wrote:
I've changed my strategy to a "trump score" system paralleling the distance that's in now for MB matching. Progress on this branch is preliminary, but works for a really rough manual test I've run myself. Automated tests don't fail, but there are no new ones to test my code.
—
Reply to this email directly or view it on GitHub.
|
Interesting approach! I like the change of And the scoring system is interesting too, but I can also see a more specific interface (which might ask "given these albums, which is the best?" instead of "what's the score for this album?") being somewhat simpler. It would push some of the intelligence into the plugins rather than burdening the import flow more, which is already frighteningly complex. Does that make sense? On Apr 24, 2013, at 1:33 PM, Harrison Chapman notifications@github.com wrote:
|
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
hchapman
Apr 24, 2013
The primary reason behind the score would be an (immutable, i.e. not changing after the tracks are imported into the library) number which wouldn't require holding on to any further data of items being imported. Perhaps plugins would be able to declare which properties about recent imports to hold on to? This doesn't really parallel anything in beets that I can think of at present however... Comparing relatively would involve holding on to some of this recent import information specifically, and since the changes haven't been applied yet, I'm not sure how messy this could get.
An example reason for this would be quality trumping-- right now quality trumping involves having the mutagen file info somewhere and I don't know for sure if the actual import/moving into library could cause some sort of rare race conditions when trying to grab/parse this information from tasks long finished.
Certainly with a score system the idea would be to present the user with a choice if the new import score doesn't exceed some threshold (which would ideally only happen if the new import completely eclipses an old; better quality, all the same tracks, same length, same release year...), perhaps with some delta information to explain to the user why we're scoring duplicate choices as such.
hchapman
commented
Apr 24, 2013
|
The primary reason behind the score would be an (immutable, i.e. not changing after the tracks are imported into the library) number which wouldn't require holding on to any further data of items being imported. Perhaps plugins would be able to declare which properties about recent imports to hold on to? This doesn't really parallel anything in beets that I can think of at present however... Comparing relatively would involve holding on to some of this recent import information specifically, and since the changes haven't been applied yet, I'm not sure how messy this could get. An example reason for this would be quality trumping-- right now quality trumping involves having the mutagen file info somewhere and I don't know for sure if the actual import/moving into library could cause some sort of rare race conditions when trying to grab/parse this information from tasks long finished. Certainly with a score system the idea would be to present the user with a choice if the new import score doesn't exceed some threshold (which would ideally only happen if the new import completely eclipses an old; better quality, all the same tracks, same length, same release year...), perhaps with some delta information to explain to the user why we're scoring duplicate choices as such. |
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
sampsyo
Apr 24, 2013
Member
Scoring seems just fine as a general approach to this problem; I'm just questioning whether the logic that handles scores should be done in importer.py or in qualitytrump.py. If at all possible, it would be great to move as much as possible (e.g., finding the max, totaling up track_scores and album_scores, etc.) to plugins to simplify the importer itself as much as possible. Of course, it's possible there's no clean way to do that.
To avoid races, one approach you could take in the plugin is to calculate scores when import tasks are first created (handling the import_task_start event) so you have all scores available for later comparisons. You can keep an internal map from tasks to scores or just assign your scores right on the task objects (thanks, Python!).
|
Scoring seems just fine as a general approach to this problem; I'm just questioning whether the logic that handles scores should be done in importer.py or in qualitytrump.py. If at all possible, it would be great to move as much as possible (e.g., finding the max, totaling up track_scores and album_scores, etc.) to plugins to simplify the importer itself as much as possible. Of course, it's possible there's no clean way to do that. To avoid races, one approach you could take in the plugin is to calculate scores when import tasks are first created (handling the import_task_start event) so you have all scores available for later comparisons. You can keep an internal map from tasks to scores or just assign your scores right on the task objects (thanks, Python!). |
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
telejester
May 10, 2013
Trump attachment processing is a bit less open-ended by adding the semantic framework of #111. In my original ticket for Trumping on gcode, I mentioned image handling and cue and log files.
In that vein, I just added some more specific ideas to the attachment ticket (gcode 109). In the name of redundancy prevention, please see https://code.google.com/p/beets/issues/detail?id=109#c21
Trump processing needs to consider the attachments of each candidate album and item as they are moved. Images ought to be trumpable as well, with higher-res images replacing lower-res versions of the same image. I pointed at a resolution and scale-independent image comparison library which ought to help a lot in determining "sameness" although I can't vouch for how well it works in real life.
Cue and log files are interesting in that they are only valid if an album consists of the complete set of items they refer to. Any item or album trumping means the cue and log have to go, even if the new version doesn't have replacements.
telejester
commented
May 10, 2013
|
Trump attachment processing is a bit less open-ended by adding the semantic framework of #111. In my original ticket for Trumping on gcode, I mentioned image handling and cue and log files. In that vein, I just added some more specific ideas to the attachment ticket (gcode 109). In the name of redundancy prevention, please see https://code.google.com/p/beets/issues/detail?id=109#c21 Trump processing needs to consider the attachments of each candidate album and item as they are moved. Images ought to be trumpable as well, with higher-res images replacing lower-res versions of the same image. I pointed at a resolution and scale-independent image comparison library which ought to help a lot in determining "sameness" although I can't vouch for how well it works in real life. Cue and log files are interesting in that they are only valid if an album consists of the complete set of items they refer to. Any item or album trumping means the cue and log have to go, even if the new version doesn't have replacements. |
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
telejester
May 10, 2013
I also have no idea what to do with assets which have been trumped. I suppose the options are "delete" (which I find terrifying) and "mark and move", but to where? A "discard" path?
telejester
commented
May 10, 2013
|
I also have no idea what to do with assets which have been trumped. I suppose the options are "delete" (which I find terrifying) and "mark and move", but to where? A "discard" path? |
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
hchapman
May 14, 2013
beets already seems to be content skipping/removing albums at user query, so perhaps the trumping could (possibly unless the user informs otherwise) provide a summary of/query for the destructive actions it will take at the end of an import? I am not entirely clear if this fits into the structure/morals of the project however.
hchapman
commented
May 14, 2013
|
beets already seems to be content skipping/removing albums at user query, so perhaps the trumping could (possibly unless the user informs otherwise) provide a summary of/query for the destructive actions it will take at the end of an import? I am not entirely clear if this fits into the structure/morals of the project however. |
sampsyo
referenced this issue
Jun 5, 2013
Closed
Show technical details when encountering duplicates #312
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
squisher
Nov 16, 2013
Ping, how come this patch is collecting dust? :-) I need something like this before I can attempt to actually put beets to use...
squisher
commented
Nov 16, 2013
|
Ping, how come this patch is collecting dust? :-) I need something like this before I can attempt to actually put beets to use... |
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
sampsyo
Nov 18, 2013
Member
There are a few outstanding issues awaiting a champion:
- Add LAME preset detection. This is actually orthogonal and should maybe have its own ticket.
- Add logic for short-circuiting the duplicate detection. This should be in core. In its most basic form, a config option makes the decision.
- Either in core or in a plugin, develop a flexible/complex set of rules for making decisions automatically. This can come after the basic work is done.
Please feel free to chip in on any of the steps.
|
There are a few outstanding issues awaiting a champion:
Please feel free to chip in on any of the steps. |
sampsyo
referenced this issue
Dec 8, 2013
Closed
duplicate album: return information about file format and encoding #467
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
daks
Dec 8, 2013
This issue (the original one on googlecode) resume exactly what I want. I'll track it.
daks
commented
Dec 8, 2013
|
This issue (the original one on googlecode) resume exactly what I want. I'll track it. |
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
|
This would be a very useful feature to have! |
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
Alderian
Mar 10, 2014
Hi,
This would be very nice, I'm doing this manually with 'duplicates' plugin using "beet dup -F -f '$path $bitrate'"... and then a manual rm... very simple, but painful
It would be useful to have the list of duplicates any time an import finds them. I mean, program will find a file already in library and have the newcomer. What if the program just detail the tags, that are different, of the two of them, this way I could choose wisely. This could be done the same way the program tells whenever its correcting tags. Ie:
/foo/bar/Megadeth/Rust in Peace/06 Lucretia.mp3
Tagging track: Megadeth - Lucretia
(Similarity: 100.0%)
This item is already in the library!
Candidates:
- Old (mp3, 128kbps)(encoder, bitrate)
- New (mp3, 196Kbps)(encoder, bitrate)
1, 2 to keep only that file, **[S]**kip new, Keep all, Remove old?
I'm considering that I might want to have lots of versions of any file.
With that information, one can easily decide what to do... also, I guess that adding this would be easier, just to wait for the other options. I don't know how to program in python to add this functionality myself :( I might learn, fork and contribute :)
Just my 2 cents
Thanks for this great program...
Alderian
commented
Mar 10, 2014
|
Hi, This would be very nice, I'm doing this manually with 'duplicates' plugin using "beet dup -F -f '$path $bitrate'"... and then a manual rm... very simple, but painful It would be useful to have the list of duplicates any time an import finds them. I mean, program will find a file already in library and have the newcomer. What if the program just detail the tags, that are different, of the two of them, this way I could choose wisely. This could be done the same way the program tells whenever its correcting tags. Ie: /foo/bar/Megadeth/Rust in Peace/06 Lucretia.mp3
I'm considering that I might want to have lots of versions of any file. With that information, one can easily decide what to do... also, I guess that adding this would be easier, just to wait for the other options. I don't know how to program in python to add this functionality myself :( I might learn, fork and contribute :) Just my 2 cents Thanks for this great program... |
sampsyo
referenced this issue
May 25, 2014
Open
Importer: Resolve duplicate tracks in a single directory with different format/quality #791
sampsyo
referenced this issue
Jun 13, 2014
Closed
Show quality when prompting to import duplicate album #821
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
franciscolourenco
commented
Nov 16, 2014
|
I'm also interested in having V0 V2 320kb available to use in paths |
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
franciscolourenco
Nov 16, 2014
Would it be possible to specify which identifiers are used to detect duplicates?
Just like %aunique identifiers, but for duplicate management, so that beet importdoesn't ask about duplicates if, for example, the formats are different.
franciscolourenco
commented
Nov 16, 2014
|
Would it be possible to specify which identifiers are used to detect duplicates? |
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
|
@aristidesfl Sounds like a reasonable extension, yes. |
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
|
push :) |
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
lazka
Aug 13, 2015
Contributor
mutagen trunk (next version) will now expose lame version and bitrate mode. Feedback welcome.
|
mutagen trunk (next version) will now expose lame version and bitrate mode. Feedback welcome. |
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
|
Fantastic! Awesome work. I'll look into exposing this in beets. |
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
ekjaker
Oct 9, 2015
Fantastic idea this addition to have this at import!
"Would it be possible to specify which identifiers are used to detect duplicates?
Just like %aunique identifiers, but for duplicate management, so that beet importdoesn't ask about duplicates if, for example, the formats are different."
+1 on that!!
And also, at import the option to set default actions would be very helpful, like always take new. Ofcourse, it this gets implemented to its full extent with a scoring system based on bitrate, availability of log/cue,... would be even better than a basic new/old option.
ekjaker
commented
Oct 9, 2015
|
Fantastic idea this addition to have this at import! "Would it be possible to specify which identifiers are used to detect duplicates? +1 on that!! And also, at import the option to set default actions would be very helpful, like always take new. Ofcourse, it this gets implemented to its full extent with a scoring system based on bitrate, availability of log/cue,... would be even better than a basic new/old option. |
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
Joshfindit
Nov 30, 2015
Sorry to complicate this, but I have something to add:
Joint stereo vs Separate stereo.
Personally, I'd much rather have a 192kbps separate stereo copy than a 320kbps joint stereo copy.
Joshfindit
commented
Nov 30, 2015
|
Sorry to complicate this, but I have something to add: Joint stereo vs Separate stereo. |
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
lazka
Nov 30, 2015
Contributor
Personally, I'd much rather have a 192kbps separate stereo copy than a 320kbps joint stereo copy.
Why?
Why? |
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
smlx
Nov 30, 2015
Contributor
Personally, I'd much rather have a 192kbps separate stereo copy than a 320kbps joint stereo copy.
The mid-side stereo transform is lossless, and reduces redundant encoding of common information between channels.
The mid-side stereo transform is lossless, and reduces redundant encoding of common information between channels. |
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
Joshfindit
Nov 30, 2015
I notice a difference.
On Nov 30, 2015, at 8:15 AM, smlx notifications@github.com wrote:
Personally, I'd much rather have a 192kbps separate stereo copy than a 320kbps joint stereo copy.
The mid-side stereo transform is lossless, and reduces redundant encoding of common information between channels.
―
Reply to this email directly or view it on GitHub.
Joshfindit
commented
Nov 30, 2015
|
I notice a difference.
|
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
franciscolourenco
commented
Nov 30, 2015
|
@Joshfindit did you double blind test / abx? |
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
Joshfindit
Nov 30, 2015
Yes, many times over the years.
For those that are skeptical, I'll put these on the table:
- at higher bitrates, I hear it much less. At 320, it might be one part of one song on an album, for example.
- I always use headphones that go above 20k, and seek out clarity of sound (yes, I don't always use devices that output that high, but I do when I really want to enjoy the music)
- I have hearing damage in the mid-vocal range. This may cause my brain to overcompensate with the higher frequencies. (Have not tested it, but it's a point for the skeptics, so there it is)
- there is some social aspect to this as well; many tracks that I have had in the past have just been poorly ripped; the joint stereo is more of a canary in these cases
On Nov 30, 2015, at 1:31 PM, Aristides notifications@github.com wrote:
@Joshfindit did you double blind test / abx?
―
Reply to this email directly or view it on GitHub.
Joshfindit
commented
Nov 30, 2015
|
Yes, many times over the years. For those that are skeptical, I'll put these on the table:
|
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
m8ram
Feb 6, 2016
I'm experimenting with beets for the first time and noticed that when I import a directory containing both ogg vorbis and MP3 files it imports only the MP3 files (exactly the opposite of what I want).
Is there a configuration option to change this or do I need to wait for this feature to be implemented?
m8ram
commented
Feb 6, 2016
|
I'm experimenting with beets for the first time and noticed that when I import a directory containing both ogg vorbis and MP3 files it imports only the MP3 files (exactly the opposite of what I want). |
ghost commentedFeb 28, 2013
This issue was automatically migrated from Google Code.
Original author: telejes...@gmail.com (February 16, 2012 01:16:19)
Original issue: google-code-export/beets#341