No time limit on candidate lookup #2446

RollingStar · 2017-02-19T22:45:49Z

Problem

If a candidate is a huge box set, beets slows to an absolute crawl trying to match tracks to it. There should be a cap on maximum lookup time, and beets should cancel a lookup thread that goes passed that time limit. Then it can provide the user with its candidates, including the interrupted lookups, and give its usual prompts, like (A)pply, but add a new one: something like (C)ontinue interrupted lookups. The time limit could default to something reasonable (10-60 seconds) and perhaps be user-specified in the config.

The import hangs (gets really slow but eventually finishes) on two attempts for the same files. Similar lookups took at least 25 minutes, but I haven't timed it exactly. edit: I started this command at least before posting this issue 1 hour and 45 minutes ago. Beets is still considering the entire boxset as a candidate and Python still shows CPU usage.

Running this command in verbose (-vv) mode:

beet -vv import e:\music\mozart
user configuration: C:\Users\RollingStar\AppData\Roaming\beets\config.yaml
data directory: C:\Users\RollingStar\AppData\Roaming\beets
plugin paths:
Sending event: pluginload
inline: adding item field disc_layer
inline: adding album field album_bitrate_kbps
library database: e:\songs\beetslibrary.bib
library directory: e:\songs
Sending event: library_opened
Sending event: import_begin
Import of the directory:
e:\music\mozart
was interrupted. Resume (Y/n)?
Resuming interrupted import of e:\use
Skipping previously-imported path: [omitted]
Sending event: import_task_created
Sending event: import_task_start
Looking up: e:\music\mozart\Volume 6 - Dances & Marches
Tagging Mozart, Wolfgang Amadeus - Tänze · Märsche (CD 1) (Complete Mozart Edition Vol.6)
No album ID found.
Search terms: Mozart, Wolfgang Amadeus - Tänze · Märsche (CD 1) (Complete Mozart Edition Vol.6)
Album might be VA: False
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: albuminfo_received
Candidate: Wolfgang Amadeus Mozart - Complete Mozart Edition, Volume 6: Dances / Marches
Success. Distance: 0.25
Sending event: albuminfo_received
Candidate: Judas Priest - The Complete Albums Collection
Success. Distance: 0.79
Sending event: albuminfo_received
Candidate: Wolfgang Amadeus Mozart - Complete Mozart Edition, Volume 1: Early Symphonies
Success. Distance: 0.73
Sending event: albuminfo_received
Candidate: Wolfgang Amadeus Mozart - Complete Mozart Edition
[here is where it's stuck for many minutes, although candidate 1 (correct) and 3 are also slow]

Led to this problem:

beet -vv import e:\music\mozart

Here's a link to the music files that trigger the bug (if relevant): n/a, but this dummy file that matches the box set probably simulates the same effect. dummy - 01 - Mozart, Wolfgang Amadeus - 7 Menuette, KV 61b (65a) - 1. Menuett in G-dur.zip

Setup

OS: Win10 x64
Python version: 3.6.0
beets version: 1.4.3
Turning off plugins made problem go away (yes/no): no (can disable my last two plugins, but the log shows that they should be unrelated). I have already disabled most of my plugins and trimmed my config down to the bare essentials.

My configuration (output of beet config) is:

directory: e:\songs\
library: e:\songs\beetslibrary.bib
import:
    copy: no
    move: yes
    write: yes
    timid: no
per_disc_numbering: yes

plugins: inline copyartifactspy3

paths:
    default: $albumartist - $original_year - $album [$album_bitrate_kbps]/%if{$disc_layer,$disc_layer}$track. $title 
album_fields:
    album_bitrate_kbps: |
        #dummied out my custom album bitrate function
        #this dummy function should have almost 0 performance penalty
        return(str(abcdefghi))
item_fields:
    disc_layer: |
        if disctotal > 1:
           return str(disc) + '-'
        else:
           return ''

The text was updated successfully, but these errors were encountered:

sampsyo · 2017-02-20T00:43:57Z

Hi! Thanks for the details on this.

Can you tell whether beets is stuck communicating with MusicBrainz to get matches or doing the matching? I've found in the past that the matching algorithm itself can be the culprit in these situations with very large albums. Specifically, it seems from the log like it's getting stuck when it tries to match with a very large candidate album.

This might be testable, for example, by giving beets an MBID to search for and comparing the time to match a few different options in isolation.

RollingStar · 2017-02-20T01:15:04Z

I gave beets a MBID for the correct match (volume 6 of the set rather than the entire set) and the whole operation was quite manageable. Probably under 5 minutes, compared to the 1h45+ which I eventually cancelled.

How specifically do you propose I test the performance? I think this is the MBID that beets was hanging on, but I'm not sure. 03ce3f61-9294-4f5d-aace-a4687aad4229

edit: and keep in mind, my point for this issue is that no matter how optimized beets gets, there should be a time limit, because it is a more graceful handling of unpredictable edge cases.

sampsyo · 2017-02-20T03:54:47Z

Thanks for looking into it. And also for including the "dummy" file above, which does match the same albums.

Oddly, on my machine (an aging MacBook Pro), this only takes a few seconds per match. It's not exactly snappy, but it's also not in the tens of minutes. It seems like about half the time is being spent in track matching (i.e., on the CPU) and the other half is just downloading data from MusicBrainz.

So, let's try to narrow down what's taking so long on your machine. I just pushed a bunch of extra logging that should delimit when beets is trying to match metadata and when it's communicating with MB. It will be helpful to sort that out regardless of whether we add a timeout or just optimize that chunk of code so it can't take so long. Can you give the most recent source a try in verbose mode and report back about which phase takes the longest?

This could help debug #2446. It will help, at least, narrow down when the Munkres algorithm is taking too long.

Even more performance-isolating logging to help debug #2446.

dosoe · 2017-03-22T23:42:47Z

Hi! I'm having a similar problem with this release:
https://musicbrainz.org/release/9bcd75dd-995e-482b-8ba7-1ef074d253de
However, with very big releases (like a complete edition of bach) I don't have problems since I have only a fraction of the files. Complete releases with ~15 CDs like https://musicbrainz.org/release/b0326242-344d-4a07-a175-c36a1418b644 are handled without problems, even if all files are there.
Also, when some files are double and some are lacking, beets is trying to put them in place of the missing files. Maybe there should be a check if a file is actually a duplicate.
I would think that beets is trying all different configurations of files to check which one has the best fit. If you have 700 or 1000 files that you can combine, it makes a lot of combinations to try out. If it's not done already, it would make sense to check for a tag like 'tracknumber' to reduce the number of possible combinations.

The timeout is not implemented so far, I got beets searching for a few hours to match an album.

dosoe · 2017-03-28T21:28:04Z

If the problem is the handling and comparison of too many candidates, could there be a way to give the ID on MusicBrainz as an input to get rid of the problem? Or set a timeout and after it has expired, prompt for an ID.

sampsyo · 2017-03-28T22:08:48Z

It's actually the time taken to evaluate a single candidate album, when that candidate itself is very large.

dosoe · 2017-03-28T23:10:51Z

Still, if we already tell him: "this is the right release, now you have only to get the tracks at the right place", could that speed it up or is it exactly that step that is taking time?

sampsyo · 2017-03-28T23:53:01Z

Yes, disabling the track alignment procedure could work.

RollingStar · 2017-03-29T04:21:49Z

I still think a time limit (perhaps with a user prompt to keep looking) would help retain user-friendliness for unpredictable edge cases. This could be done as the only fix, or in addition to any other fixes.

dosoe · 2017-04-30T16:52:12Z

I have the same problem with a multi-disc release. I ended up importing all the CDs individually, but now he considers that he has the same release several times, one for every import. Is there a possibility to merge them, so that he only has one release that contains all the CDs?

sampsyo · 2017-05-01T19:19:50Z

The most relevant issue about merging albums is #112.

boardfish · 2018-04-16T15:52:10Z

Currently dealing with this issue for d51f64e5-1d52-4d40-8308-5c9f22c4af03.

8h2a · 2019-01-24T19:33:22Z

I'm having the same(?) issue with these releases:
https://musicbrainz.org/series/aa610319-3b4a-4a26-a127-b5dc78c3b732
Beets just seems to be doing something forever.
Any bigger release (e.g. dozens of discs, or hundreds of tracks) seems to be impossible to import using beets.
Another such example would be this: https://musicbrainz.org/release-group/9d0636f7-e959-4a00-9787-9a23c07f9655

KW4NP · 2019-12-28T16:24:45Z

We're still dealing with this issue, an album of 260 tracks took over an hour before beets finally realized that there was no match. Would be very nice to have at least a timeout to keep this from happening needlessly.

stale · 2020-07-11T22:10:44Z

Is this still relevant? If so, what is blocking it? Is there anything you can do to help move it forward?
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

RollingStar · 2020-07-12T17:07:02Z

I still think a time limit (perhaps with a user prompt to keep looking) would help retain user-friendliness for unpredictable edge cases. This could be done as the only fix, or in addition to any other fixes.

The "what is blocking it" line works better on a huge project than this one. There aren't many devs, and even fewer devs know beets in detail.

jtpavlock · 2020-07-12T17:35:41Z

The purpose of that line, in the case of these issues, is to hopefully realize what is preventing the issue from being labeled as a feature or bug and provide the necessary info to "upgrade" it from needinfo status. So in the case of this issue, perhaps one blocking element is the request from @sampsyo:

So, let's try to narrow down what's taking so long on your machine. I just pushed a bunch of extra logging that should delimit when beets is trying to match metadata and when it's communicating with MB. It will be helpful to sort that out regardless of whether we add a timeout or just optimize that chunk of code so it can't take so long. Can you give the most recent source a try in verbose mode and report back about which phase takes the longest?

@RollingStar were you ever able to get around to this?

stale · 2020-09-10T17:51:39Z

Is this still relevant? If so, what is blocking it? Is there anything you can do to help move it forward?

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

KW4NP · 2020-09-10T20:09:47Z

I'm guessing this is still relevant as it wasn't ever fixed as far as I know.

…

On Thu, Sep 10, 2020, 1:51 PM stale[bot] ***@***.***> wrote: Is this still relevant? If so, what is blocking it? Is there anything you can do to help move it forward? This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. — You are receiving this because you commented. Reply to this email directly, view it on GitHub <#2446 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AIQYQZKGXILQ6CYN5K7GKXDSFEG33ANCNFSM4DAWP4LA> .

RollingStar · 2020-09-10T20:58:52Z

Part of why I haven't gone back and try to reproduce this specific bug with the massive Mozart box set is it would be plugging one potential bug. Enforcing a time-out would stop a whole class of odd behavior. Another reason is I eventually caved and imported the box set as sub-sets.

sampsyo · 2020-09-10T21:52:10Z

Would anybody who's affected by this issue be interested in doing the aforementioned digging to figure out the root cause (cf #2446 (comment), for example)?

dosoe · 2020-09-11T08:59:37Z

Hi! I have a few releases that fall in the category 'too big to handle', but I'm a bit occupied right now. Le jeu. 10 sept. 2020 à 23:52, Adrian Sampson <notifications@github.com> a écrit :

…

Would anybody who's affected by this issue be interested in doing the aforementioned digging to figure out the root cause (cf #2446 (comment) <#2446 (comment)>, for example)? — You are receiving this because you commented. Reply to this email directly, view it on GitHub <#2446 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AGGIAODCD4DLAIL5VFO7FU3SFFDBXANCNFSM4DAWP4LA> .

stale · 2020-11-10T09:45:08Z

Is this still relevant? If so, what is blocking it? Is there anything you can do to help move it forward?

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

RollingStar mentioned this issue Feb 20, 2017

Performance: Common operations are slow. #2388

Open

sampsyo added the needinfo We need more details or follow-up from the filer before this can be tagged "bug" or "feature." label Feb 20, 2017

sampsyo added a commit that referenced this issue Feb 20, 2017

Add a little more logging about matching process

26408dd

This could help debug #2446. It will help, at least, narrow down when the Munkres algorithm is taking too long.

sampsyo added a commit that referenced this issue Feb 20, 2017

More logging for MusicBrainz requests

8ccdb1c

Even more performance-isolating logging to help debug #2446.

sampsyo mentioned this issue Mar 22, 2017

Beets freezes after resuming import #2487

Closed

stale bot added the stale label Jul 11, 2020

stale bot removed the stale label Jul 12, 2020

stale bot added the stale label Sep 10, 2020

stale bot removed the stale label Sep 10, 2020

stale bot added the stale label Nov 10, 2020

stale bot closed this as completed Nov 17, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

No time limit on candidate lookup #2446

No time limit on candidate lookup #2446

RollingStar commented Feb 19, 2017 •

edited

sampsyo commented Feb 20, 2017 •

edited

RollingStar commented Feb 20, 2017 •

edited

sampsyo commented Feb 20, 2017

dosoe commented Mar 22, 2017

dosoe commented Mar 28, 2017

sampsyo commented Mar 28, 2017 •

edited

dosoe commented Mar 28, 2017

sampsyo commented Mar 28, 2017

RollingStar commented Mar 29, 2017

dosoe commented Apr 30, 2017

sampsyo commented May 1, 2017

boardfish commented Apr 16, 2018

8h2a commented Jan 24, 2019 •

edited

KW4NP commented Dec 28, 2019

stale bot commented Jul 11, 2020

RollingStar commented Jul 12, 2020

jtpavlock commented Jul 12, 2020 •

edited

stale bot commented Sep 10, 2020

KW4NP commented Sep 10, 2020 via email

RollingStar commented Sep 10, 2020

sampsyo commented Sep 10, 2020

dosoe commented Sep 11, 2020 via email

stale bot commented Nov 10, 2020

No time limit on candidate lookup #2446

No time limit on candidate lookup #2446

Comments

RollingStar commented Feb 19, 2017 • edited

Problem

Setup

sampsyo commented Feb 20, 2017 • edited

RollingStar commented Feb 20, 2017 • edited

sampsyo commented Feb 20, 2017

dosoe commented Mar 22, 2017

dosoe commented Mar 28, 2017

sampsyo commented Mar 28, 2017 • edited

dosoe commented Mar 28, 2017

sampsyo commented Mar 28, 2017

RollingStar commented Mar 29, 2017

dosoe commented Apr 30, 2017

sampsyo commented May 1, 2017

boardfish commented Apr 16, 2018

8h2a commented Jan 24, 2019 • edited

KW4NP commented Dec 28, 2019

stale bot commented Jul 11, 2020

RollingStar commented Jul 12, 2020

jtpavlock commented Jul 12, 2020 • edited

stale bot commented Sep 10, 2020

KW4NP commented Sep 10, 2020 via email

RollingStar commented Sep 10, 2020

sampsyo commented Sep 10, 2020

dosoe commented Sep 11, 2020 via email

stale bot commented Nov 10, 2020

RollingStar commented Feb 19, 2017 •

edited

sampsyo commented Feb 20, 2017 •

edited

RollingStar commented Feb 20, 2017 •

edited

sampsyo commented Mar 28, 2017 •

edited

8h2a commented Jan 24, 2019 •

edited

jtpavlock commented Jul 12, 2020 •

edited