Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add customisable disambiguation strings for importer #4547

Merged
merged 1 commit into from Oct 14, 2023

Conversation

Serene-Arc
Copy link
Contributor

Description

This adds configurable disambiguation strings when importing. The defaults I've provided are what is currently in place but there are a number of limitations.

  • When used with the beets-audible plugin, sometimes there are 'None' values that are printed, I haven't managed to narrow down the cause of that yet, but it may be limited to something that that plugin does.
  • Values that may be printed can contain comma-separated values, which is how the disambiguation string is printed. This makes it appear as though there are more fields than there are, because it is CSVs nested in CSVs.
  • This modification does not allow for any computation or formatting of values. Note that I've had to leave the medium and disc-number parsing in place, because my solution does not allow for formatting the variables in this manner.

If you have any solutions for these, I'd love to hear them! I'll leave it as a draft PR for now since this is very much a work in progress.

To Do

  • Documentation. (If you've add a new command-line flag, for example, find the appropriate page under docs/ to describe it.)
  • Changelog. (Add an entry to docs/changelog.rst near the top of the document.)
  • Tests. (Encouraged but not strictly required.)

@stale
Copy link

stale bot commented Mar 18, 2023

Is this still relevant? If so, what is blocking it? Is there anything you can do to help move it forward?

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label Mar 18, 2023
@sampsyo
Copy link
Member

sampsyo commented Mar 25, 2023

Let's keep this alive, please!

@stale stale bot removed the stale label Mar 25, 2023
@Serene-Arc
Copy link
Contributor Author

Sorry I've been really busy as my semester gets into gear. I need to sit down and go through a bunch of these PRs.

@JOJ0
Copy link
Member

JOJ0 commented Mar 26, 2023

Oh yes, thanks for stopping the close @sampsyo, I also came across this feature draft and thought it's very useful and I wished for something like that. Currently I have a distantly related thing going on. Some tiny improvements to the disambiguation string when importing singletons: https://github.com/beetbox/beets/pull/4717/files#diff-e324b20657a7d6b43b8b7aeb5754b96774f5062294b5ba7f1e3062845e9e7044R215-R220

and probably that made me find this PR. Awesome idea @Serene-Arc !

The questions you raise are good one's. My two cents:

  1. Sorry dont use that plug. Quick look at it seemed to much hassle for quickly trying it out. Sorry!!!

  2. My quick & dirty approcah would be to just sanity check for such delimiter and get rid of them by replacing with something else. Not sure what but since beets paths get replaced with _, I would just choose that one, even if some might find that ugly. But everything is better then unexpectely getting fields broken up into multiple fields.

  3. That was also one thing I thought is tricky. My personal needs, especially after I tested out the new UI idea in Importer UI overhaul rebase/update #3721 made me wish for primarily just reordering the things in the disambig. string. So you could try "catching" some names of the default/user configuration, that require further handling, like eg. mediums and disc-number and pipe that through a function before adding it to the string.

@JOJ0
Copy link
Member

JOJ0 commented Mar 26, 2023

Some context. This is how the string is presented in the new UI idea and I thought that for example the catalognumber which for me is one of the most important one's I look at when decidingwhich album to choose, I wished to be the first or second field from the left:

➜ [A]pply, More candidates, Skip, Use as-is, as Tracks, Group albums,
Enter search, enter Id, aBort, eDit, edit Candidates? m

Finding tags for album "Fracture - 0860".
  Candidates:
  1. (100.0%) Fracture - 0860
             Spotify | 2022 | Astrophonica
  2. (84.6%) Fracture - 0860
             ≠ id
             Discogs | File | 2022 | UK | Astrophonica | APHA0860
  3. (84.1%) Fracture - 0860
             ≠ id
             Deezer | 2022 | Astrophonica
  4. (81.5%) Fracture - 0860
             ≠ id, mediums
             Discogs | 2xVinyl | 2022 | UK | Astrophonica | APHA0860
  5. (65.9%) Fracture - 0860
             ≠ tracks, id, catalognum
             Discogs | Cassette | 2022 | UK | Astrophonica | APHA0860C

With the current UI it looks lik this and I could also imagine that having catalog right after the album name could help:

Finding tags for album "Fracture - 0860".
Candidates:
1. Fracture - 0860 (100.0%) (Spotify, 2022, Astrophonica)
2. Fracture - 0860 (84.6%) (id) (Discogs, File, 2022, UK, Astrophonica, APHA0860)
3. Fracture - 0860 (84.1%) (id) (Deezer, 2022, Astrophonica)
4. Fracture - 0860 (81.5%) (id, mediums) (Discogs, 2xVinyl, 2022, UK, Astrophonica, APHA0860)
5. Fracture - 0860 (65.9%) (tracks, id, catalognum) (Discogs, Cassette, 2022, UK, Astrophonica, APHA0860C)

But actually I have to admit: I changed everything back after my experiments and am happily using the default ordering, since I'm so used to it actually ;-)

This all is just my personal ideas and shouldn't stop you from realising a slightly less advanced idea. We could always move on from your change if we want it / if someone would refactor this again. So, just a little brainstorming here and HTH.

@JOJ0 JOJ0 self-assigned this Jul 25, 2023
@JOJ0
Copy link
Member

JOJ0 commented Jul 25, 2023

Hi @Serene-Arc, since I finally merged a PR that changes the disambiguation string for singletons I kind of feel motivated to review and brainstorm around this feature idea :-)

This is how the thing looks like atm:

beets/beets/ui/commands.py

Lines 187 to 228 in c03bd26

def disambig_string(info):
"""Generate a string for an AlbumInfo or TrackInfo object that
provides context that helps disambiguate similar-looking albums and
tracks.
"""
disambig = []
if info.data_source and info.data_source != 'MusicBrainz':
disambig.append(info.data_source)
if isinstance(info, hooks.AlbumInfo):
if info.media:
if info.mediums and info.mediums > 1:
disambig.append('{}x{}'.format(
info.mediums, info.media
))
else:
disambig.append(info.media)
if info.year:
disambig.append(str(info.year))
if info.country:
disambig.append(info.country)
if info.label:
disambig.append(info.label)
if info.catalognum:
disambig.append(info.catalognum)
if info.albumdisambig:
disambig.append(info.albumdisambig)
# Let the user differentiate between pseudo and actual releases.
if info.albumstatus == 'Pseudo-Release':
disambig.append(info.albumstatus)
if isinstance(info, hooks.TrackInfo):
if info.index:
disambig.append("Index {}".format(str(info.index)))
if info.track_alt:
disambig.append("Track {}".format(info.track_alt))
if (config['import']['singleton_album_disambig'].get()
and info.get('album')):
disambig.append("[{}]".format(info.album))
if disambig:
return ', '.join(disambig)

In short we could say that I added prefixes to the actual data being added to the disambiguation string for singletons.

What do you think of addressing your bullet point here:

  • This modification does not allow for any computation or formatting of values. Note that I've had to leave the medium and disc-number parsing in place, because my solution does not allow for formatting the variables in this manner.

by adding another config option that allows to set a prefix for each value in the list of disambiguation strings. If you think that sounds like a clumsy approach, we could brainstorm about a better form of describing a "pair" of prefix+value in yaml.

@Serene-Arc
Copy link
Contributor Author

Serene-Arc commented Jul 26, 2023

Actually, a solution that I found for a similar problem in the beets-audible project might work! We can link all of the keys to callback functions. So if someone uses the key 'medium', we can have a specific function that calls up the processing needed. Keys that don't require any pre-processing can be transparently passed through.

Of course, this will require a lot of work if there are many keys that need custom functions but these can be added piecemeal if needed. Since there isn't any functionality so far for any other keys in the disambiguation, users shouldn't notice the lack if we want to take our time.

Is this similar to what you had in mind with the prefixes (what I presume i'm calling keys)?

@JOJ0
Copy link
Member

JOJ0 commented Jul 31, 2023

Hi @Serene-Arc thanks for getting back on this. With "prefix" I rather mean a user-customizable string written before the actual value of the field. I think your idea is to automatically pollute these prefixes with the keys of the actual beets fields. Did I understand correctly?

Let's take an example then it is easier to talk about it. have alook on the new features I merged for singleton imports: #4854 (comment)

So I was thinking that the Index and Track prefixes are rather longish and some people might want this shorter, for example something like Idx and Trk or even I: and T: .

Now with your proposed solution we would get this:

Candidates:
1. Portishead - Numb (100.0%) (Spotify, index 7)
2. Portishead - Numb (100.0%) (Spotify, index 1)
3. Portishead - Numb (100.0%) (Spotify, index 27)
4. Portishead - Numb (100.0%) (Discogs, index 1, track_alt A1)
5. Portishead - Numb (100.0%) (Discogs, index 1, track_alt A1)
6. Portishead - Numb (100.0%) (Discogs, index 1, track_alt 1)
7. Portishead - Numb (100.0%) (Discogs, index 1, track_alt 1)
8. Portishead - Numb (100.0%) (Discogs, index 1, track_alt 1)

Did I understand correctly that this would be the outcome of your idea?

I do understand that for disc-number and medium parsing there is now way around functions that "do the magic". maybe as a first step we keep those in place and focus on the other disambig fields only? Most of them do not have prefixes and with albums I never had a problem with that (album and catalog number being the most important for my personal taste) but now with this advanced singleton+album features I had the need to add "prefixes" to make it better readable.

@Serene-Arc
Copy link
Contributor Author

I think we're talking about two different parts of the program. Mine is for the disambiguation field which is the part when you're choosing a candidate with the information such as the year, the publisher, and so on. It wouldn't impact the part you posted.

/smaug_data/torrents/audio/music/Gregory Alan Isakov Discography/This Empty Northern Hemisphere (13 items)                                                                                                                                    
Tagging:                                                                                                               
    Gregory Alan Isakov - This Empty Northern Hemisphere                                                                                                                                                                                      
URL:                                                                                                                   
    https://musicbrainz.org/release/70c54784-0537-4c30-8639-fed4e01c67a0                                                                                                                                                                      
(Similarity: 100.0%) (Digital Media, 2009, AL)                                                                                                                                                                                                
 * One Of Us Cannot Be Wrong -> One of Us Cannot Be Wrong                                                                                                                                                                                     

In the above example, the disambiguation string is the part (Digital Media, 2009, AL).

@JOJ0
Copy link
Member

JOJ0 commented Aug 4, 2023

Thanks for clarifying. Have a look again at the code I posted above: #4547 (comment)

The cli output example I posted is coming from exactly the same function. You are talking about the disambiguation string for albums (starting at line 187) and I am talking about the disambiguation string for singletons (starting at line 218).

The information that is useful in either of those strings is somewhat different. So do you think this should be split out into two different functions - one for albums and one for singletons?

@JOJ0
Copy link
Member

JOJ0 commented Aug 4, 2023

What do you think of that suggestion: You go ahead with your idea proposed before, focusing on the album-disambiguation string. We leave out changing the part for singletons in this PR entirely. Maybe we find out that some parts of your code is useful for the singleton disambiguation-string as well and we then work from there in a new PR.

@Serene-Arc
Copy link
Contributor Author

@JOJ0 I did end up changing the singleton processing in this section. Now they both follow the same logic, but if the default configurations aren't changed, there should be no noticeable effect on the end user.

I just need to update the documentation but take a look and see what you think. This should make it pretty easy to add further computed fields if required as well.

@JOJ0
Copy link
Member

JOJ0 commented Sep 17, 2023

That is an awesome solution. I like it! Thanks

@Serene-Arc Serene-Arc marked this pull request as ready for review September 17, 2023 10:39
@Serene-Arc
Copy link
Contributor Author

@JOJ0 great! If you approve, I'll write the documentation and then this can be merged.

@Serene-Arc
Copy link
Contributor Author

@JOJ0 I believe you wrote the singleton_album_disambig option. Do you think this should be kept? It's rather superfluous with the changes I've made to the disambiguation string system.

@JOJ0 JOJ0 changed the title Add customisable disamiguation strings for importer Add customisable disambiguation strings for importer Sep 18, 2023
@JOJ0
Copy link
Member

JOJ0 commented Sep 18, 2023

Get rid of it please! Dont forget to also remove the docs for it. I'm sure I left some somewhere. Thanks 😊

@Serene-Arc
Copy link
Contributor Author

Great, will do.

@Serene-Arc
Copy link
Contributor Author

@JOJ0 Do you agree that this is ready for merging?

@Serene-Arc Serene-Arc merged commit ff36c7a into beetbox:master Oct 14, 2023
14 checks passed
@Serene-Arc Serene-Arc deleted the configurable_disambig branch October 14, 2023 06:00
@JOJ0 JOJ0 mentioned this pull request Oct 14, 2023
8 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants