Beets Work Queue Discussion #1375

ab623 · 2015-03-24T15:51:51Z

Work Queue Discussion

I wanted to start a discussion on work queuing.
With current beets functionality, you run the command beet import ~/music and that folder is imported, but interactively.
Beets can be run in a pseudo non-interactive state, but at the risk of reducing the confidence of the autotagger.
What I would like is the ability to keep the full functionality of beets but exercise it at a later date, which is why I would like to see task-based functionality implemented.

Proposed Scenario

Here is my proposed scenario:

Run a beet import with a switch to run in a non-interactive mode
Beets starts an import, and either stays open or daemonizes itself for the duration of the import process.
I can then run beet tasks, or something to that effect
Based on a configuration, this would contain all the tasks that require human intervention
From here I could then go through each task, which is either an album or a track, see the options available to me, and then apply, skip, keep, etc., as I would when running interactively.

Maybe this tasks functionality could also be used to hold other tasks or actions such as

Verify correct cover image before applying
Verify chroma matches
Update library
…and anything else you can think of

Applications

The applications for this are the following:

Import a HUGE library and not have to worry about sitting down for ages
Import and worry about the tasks at a later date
Work on 4 or 5 tasks at a time, and leave the rest for when you are free
Because tasks are processed in the background, when an album's confidence is low, beets can request similar albums as well. When the task is revisited, the user can pick from multiple candidates, and the choice is applied immediately.
A GUI web application could be created to pick up the tasks and display their content. This would help people who aren't comfortable with the command line, and would also make it possible to show album art.

Thoughts? Discussions? Implementation ideas?

guibog · 2015-03-25T01:18:48Z

Hi, you could try the Unison sync tool; it has something similar to what you propose.

ab623 · 2015-03-25T08:51:41Z

@guibog I'm not really sure how the Unison sync tool fulfills the use case I outlined in my original post. Can you explain further?

sampsyo · 2015-03-27T22:15:50Z

Thanks for bringing this up! This is a great idea to begin discussing.

There are some notes on the same idea on the wiki: https://github.com/sampsyo/beets/wiki/Refactoring
See the bullet for "Asynchronous import decisions".

This would be a huge amount of work, but I'm especially excited for the alternate UI possibilities this might open up.

ab623 · 2015-03-30T14:16:58Z

I had a look at the Asynchronous import decisions and it is pretty much what I described. So I'm glad other people have been thinking about this too.

I think a simple method will be by far the best: using a SQLite table to insert each task that requires completion, along with a status and the outcome of its actions.

Then a separate instance of beets could be run in a daemon mode, which picks up any pending tasks, works on them, and updates the SQLite table as required. This daemon can handle all the prioritisation etc.

Then, whenever you run an instance of beet, you can specify whether the task is done immediately or put into a queue, e.g. beet import -q /path/to/file. This could be set by default in the config file.

Then beet tasks can simply format and display the details from the table.

What we need to figure out first is the best solution we can implement that is expandable not only to the autotagger but also to other plugins that may want to hook into it. I know there are other engines such as Celery which we could use, but is that the best idea? It could be, as we don't really want to reinvent the wheel.

sampsyo · 2015-03-31T03:41:34Z

What about re-using the existing database structure? That is, we'd just have a "pending" flag on items and albums indicating that they've been tracked but not fully imported yet. Or, possibly, a more general "status" field indicating which tasks have been run on them so far.
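
As a rough illustration of the "pending" flag idea, here is a minimal sketch. It uses a toy items table rather than the real beets schema, and the pending column and both helper functions are hypothetical names chosen for this example.

```python
import sqlite3

# Toy stand-in for the library database; the real beets schema has many more columns.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE items ("
    " id INTEGER PRIMARY KEY,"
    " title TEXT NOT NULL,"
    " pending INTEGER NOT NULL DEFAULT 0)"  # hypothetical flag: 1 = tracked but not fully imported
)
conn.executemany(
    "INSERT INTO items (title, pending) VALUES (?, ?)",
    [("Farmer Boy", 0), ("Crying", 1), ("Hora", 0)],
)


def visible_titles(conn):
    """Titles a listing command would show: pending rows are filtered out."""
    rows = conn.execute("SELECT title FROM items WHERE pending = 0 ORDER BY id")
    return [title for (title,) in rows]


def pending_titles(conn):
    """Titles that are tracked but still waiting on a decision."""
    rows = conn.execute("SELECT title FROM items WHERE pending = 1 ORDER BY id")
    return [title for (title,) in rows]
```

Under this assumption, user-facing queries only need one extra WHERE clause to hide pending rows, while a task-review command would select the complementary set.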

guibog · 2015-03-31T04:06:41Z

A general status makes sense, but sometimes we need more than one. For instance, I have my big library imported, but because I am obsessive about it, I haven't yet dared to write tags to files and move them. If I ever do that, ideally I would like to do it step by step, and be able to know which file was written or moved, and when. Another example: MusicBrainz data is updated from time to time, so I would like a "last_checked_on_mb" status. For this kind of thing, I have a taste for event tables: "id | entity_id | event_name | timestamp". This is a bit like a log, and allows for much flexibility, at the cost of a bit more complexity in SQL queries.

Examples of events that could apply:

played
detected_missing (the file is not there)
moved
metadata_written_to_file
metadata_checked_on_mb
cover_added
...
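
The event-table layout above could look roughly like the following sketch. The column names come from the comment; the function names and the use of a floating-point Unix timestamp are assumptions made for illustration.

```python
import sqlite3
import time

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events ("
    " id INTEGER PRIMARY KEY,"
    " entity_id INTEGER NOT NULL,"   # the item or album the event refers to
    " event_name TEXT NOT NULL,"     # e.g. 'moved', 'metadata_checked_on_mb'
    " timestamp REAL NOT NULL)"      # assumption: seconds since the Unix epoch
)


def record_event(conn, entity_id, event_name, timestamp=None):
    """Append one event row; existing rows are never updated in place."""
    if timestamp is None:
        timestamp = time.time()
    conn.execute(
        "INSERT INTO events (entity_id, event_name, timestamp) VALUES (?, ?, ?)",
        (entity_id, event_name, timestamp),
    )


def last_event_time(conn, entity_id, event_name):
    """Most recent timestamp of an event for an entity, or None if it never happened."""
    row = conn.execute(
        "SELECT MAX(timestamp) FROM events WHERE entity_id = ? AND event_name = ?",
        (entity_id, event_name),
    ).fetchone()
    return row[0]
```

A "last_checked_on_mb" status then becomes a query over the log (the latest metadata_checked_on_mb event) rather than a dedicated column, which is where the extra SQL complexity mentioned above comes from.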

ab623 · 2015-03-31T12:29:07Z

@sampsyo - I wouldn't want to use the existing database structure, as it wasn't designed for this type of feature, so it can't easily be expanded to allow additional functionality. If the autotagger runs in the background, then it has more time to request more information and store it. So if its confidence was low, it could automatically request and store additional candidates, or plugins such as lyrics could hook into it and automatically download and store lyrics. I wouldn't want to muddy the existing table with this information.

@guibog - I think a log file is a good addition to beets, but that should be a beets core function. It should be logging what it does. This way we can see what changes were applied at what time. This can then be used to roll back changes. But this should be raised as a separate feature.

What we need is a table which stores:

Beet command to be run - beet import /path/, beet ftintitle
Date/time of addition
Status - Pending, Cancelled, Superseded, Completed
Output

This is the basic information that I think could be supplied to a beets daemon. So when I run a beets import, I can specify that it run in the background. This adds it to the task list, and the daemon then begins processing it. Once processed, its status changes to completed and an output is attached; this may be a decision that the user must make at a later point in time.

I can then fire up the beets frontend and run a command which lists all tasks and any actions I need to take against them. This way, in your scenario, @guibog, I can create a cron job which queries my library each week for updates to MusicBrainz data, and it runs in the background, keeping my data in sync. A status of superseded could also be used, so that a new import can invalidate a previous one.
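
To make the proposed table concrete, here is a minimal sketch with the four fields listed above (command, date/time of addition, status, output). The table name, column names and helper functions are all hypothetical; nothing here is an existing beets API.

```python
import sqlite3
from datetime import datetime, timezone

# Status values taken from the list above.
STATUSES = ("pending", "cancelled", "superseded", "completed")


def open_queue(path=":memory:"):
    """Open (or create) the task database; kept separate from the library db."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS tasks ("
        " id INTEGER PRIMARY KEY,"
        " command TEXT NOT NULL,"                  # e.g. 'beet import /path/'
        " added TEXT NOT NULL,"                    # ISO 8601 timestamp, UTC
        " status TEXT NOT NULL DEFAULT 'pending',"
        " output TEXT)"                            # result or decision needed
    )
    return conn


def enqueue(conn, command):
    """Add a command to the queue and return its task id."""
    added = datetime.now(timezone.utc).isoformat()
    cur = conn.execute(
        "INSERT INTO tasks (command, added) VALUES (?, ?)", (command, added)
    )
    conn.commit()
    return cur.lastrowid


def finish(conn, task_id, output, status="completed"):
    """Record the outcome of a task."""
    if status not in STATUSES:
        raise ValueError(f"unknown status: {status!r}")
    conn.execute(
        "UPDATE tasks SET status = ?, output = ? WHERE id = ?",
        (status, output, task_id),
    )
    conn.commit()


def list_tasks(conn, status=None):
    """Return (id, command, status) rows, optionally filtered by status."""
    query = "SELECT id, command, status FROM tasks"
    params = ()
    if status is not None:
        query += " WHERE status = ?"
        params = (status,)
    return conn.execute(query + " ORDER BY id", params).fetchall()
```

A listing command in the frontend would then be little more than a formatted view over list_tasks.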

sampsyo · 2015-03-31T12:57:00Z

Thanks for the comments. I'm still not quite sure why this functionality would "muddy the existing table": we can of course hide the new functionality from other user-facing interactions (i.e., pending stuff would not show up in beet list). I'm willing to buy this with a little more explanation, of course. Maybe doing a more detailed design of both alternatives (on a wiki page, for example) would help clarify?

ab623 · 2015-03-31T13:27:10Z

@sampsyo - Sounds like a good idea. Are you able to provide an export of the main tables beets uses along with columns, and a typical subset of data, maybe 100 rows? Maybe paste in a gist (comma/tab separated). That would help with the design and visualisation.

sampsyo · 2015-03-31T23:32:25Z

You can get this from your own library using the SQLite command-line program:

$ sqlite3 ~/.config/beets/library.db
SQLite version 3.8.5 2014-08-15 22:37:57
Enter ".help" for usage hints.
sqlite> .schema
[...]
sqlite> select * from items limit 100;
[...]

ab623 · 2015-04-01T07:38:44Z

@sampsyo - Thanks. My library isn't very fully featured at the moment in terms of plugins etc., but I will dump mine anyway. I can already envision some issues with using the existing table, but I will get them noted down.

hrehfeld · 2015-09-04T15:48:25Z

from #1538 :

It would be awesome if we could run

 $ beet import --quiet

which then writes a file similar to a temporary commit message file. It would contain the interactive prompt:

# lines starting with # will be ignored
# (32 items)
# Tagging:
#     Dead Brothers - Dead Music for Dead People
# URL:
#     http://musicbrainz.org/release/c775bc5b-26d9-4da6-8b3b-b464040d3147
# (Similarity: 83.0%) (unmatched tracks) (CD, 2000, CH, Voodoo Rhythm Records)
# Unmatched tracks (16):
#  ! Dead Brothers Stomp (#1) (3:02)
#  ! I've Always Known (#2) (2:04)
#  ! Farmer Boy (#3) (2:42)
#  ! Besame Mucho (#4) (2:14)
#  ! Roger (#5) (0:54)
#  ! She Collects Postcards (#6) (5:03)
#  ! Banjo Villa Against Tarass Boulba (#7) (0:51)
#  ! Crying (#8) (4:38)
#  ! Hora (#9) (2:46)
#  ! Allons Aux Paquis! (#10) (2:11)
#  ! Somewhere Between Dog & Wolf (#11) (2:58)
#  ! Buy It! (#12) (1:11)
#  ! Good Time Religion (#13) (5:04)
#  ! Orally (#14) (1:01)
#  ! Ramblin' Man (#15) (3:57)
#  ! [untitled] (#16) (1:08)
# (A)pply, [S]kip, (U)se as-is, as (T)racks, (G)roup albums, Requery (i)nteractively
# Enter search, enter Id, aBort? 
/music/beets_/ Dead Music for Dead People - Voodoo Rhythm Records - c775bc5b
<optional temporary uuid here>

I would then put a character with whatever choice I decide on:

# Enter search, enter Id, aBort? 
/music/beets_/ Dead Music for Dead People - Voodoo Rhythm Records - c775bc5b
<optional temporary uuid here>
a

and the next item would follow.

Then I could call

$ beet import --file

and beets would apply my choices.
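
Reading such a file back could work along the lines of the sketch below. The proposal leaves the exact format open, so this assumes that every entry is introduced by a block of # comment lines, that the first non-comment line is the path, and that a decision is a single character on the entry's last line. The function name parse_decisions is hypothetical.

```python
def parse_decisions(text):
    """Return a list of (path, choice) pairs; choice is None when undecided."""
    entries = []
    group = []  # non-comment lines of the entry currently being read

    def flush():
        if not group:
            return
        path = group[0]
        last = group[-1]
        # Assumption: a decision is exactly one character on the final line,
        # so an optional uuid line in between is never mistaken for a choice.
        choice = last.lower() if len(group) > 1 and len(last) == 1 else None
        entries.append((path, choice))
        group.clear()

    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("#"):
            flush()  # a comment line ends the previous entry, if any
        elif line:
            group.append(line)
    flush()
    return entries
```

Entries left without a choice come back as None, so a later run could either prompt for them or write them out to the file again.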

This would help with the dreadful importing stage. (I have been importing my library for a few days now, and it's still going.)

ab623 · 2016-04-01T10:45:10Z

Based on previous discussions, should we look into a simple Python package which we can use as a task system? What will underpin the entire concept is the ability to add tasks to a queue, process them, and get back results.

We ideally need the following requirements in my opinion

Ability to store items in a queue
Ability to persist queue if server goes down, so it can resume processing
Ability to store job responses and exceptions

Nice to haves would be:

Simple workflow system
Multiple queues for high / low priority jobs

Many task-based queues use a back-end broker; a popular one is Redis. I'm not sure we want to add another dependency to beets, but the task-based system could be a config option, allowing people to use beets without a back-end broker.

Any suggestions?

sampsyo · 2016-04-01T17:31:47Z

This does sound like approximately the right list of requirements. For beets, though, I'd argue that something much simpler than Redis would be the right way to go—in particular, just storing "tasks" as records in a SQLite database should work great. In particular, I'm nervous about any solution that involves a separate process just to store and distribute tasks. That's good for a server setting, where lots of daemons run constantly anyway, but less good for an interactive, user-facing application.

What do you think of a simple database-backed queue?

ab623 · 2016-04-01T20:06:18Z

I understand your hesitance about an external system to manage queues, and I agree.

I've been looking into a database-backed system, but I haven't had much luck. It may be that we need to implement our own.

Maybe a new table in the db... or a new database entirely (I don't like the library db being filled with temp data, with constant reads and writes). It could store the beets command intended to run, priority, output, success, etc. Then beets, running in the background in server mode, could poll it every x seconds for the next task.

For the small number of tasks that we will run and the features required, it's perfectly plausible to roll our own.
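
A home-grown polling queue along these lines might look like the sketch below. It is only an illustration under stated assumptions: the tasks table, the running and failed status values, and the function names are all hypothetical, and the handler is whatever callable the caller supplies to carry out a command.

```python
import sqlite3
import time


def open_queue(path=":memory:"):
    """Open (or create) a task database separate from the library db."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS tasks ("
        " id INTEGER PRIMARY KEY,"
        " command TEXT NOT NULL,"
        " priority INTEGER NOT NULL DEFAULT 0,"   # higher runs first
        " status TEXT NOT NULL DEFAULT 'pending',"
        " output TEXT)"
    )
    return conn


def claim_next(conn):
    """Mark the highest-priority pending task as running and return (id, command)."""
    with conn:  # commits on success, rolls back on error
        row = conn.execute(
            "SELECT id, command FROM tasks WHERE status = 'pending'"
            " ORDER BY priority DESC, id LIMIT 1"
        ).fetchone()
        if row is None:
            return None
        # The status guard keeps two workers from claiming the same row.
        cur = conn.execute(
            "UPDATE tasks SET status = 'running'"
            " WHERE id = ? AND status = 'pending'",
            (row[0],),
        )
        if cur.rowcount == 0:
            return None
    return row


def run_worker(conn, handler, poll_seconds=5.0, max_idle_polls=None):
    """Poll for tasks; stop after max_idle_polls empty polls in a row (None = never)."""
    idle = 0
    while max_idle_polls is None or idle < max_idle_polls:
        task = claim_next(conn)
        if task is None:
            idle += 1
            time.sleep(poll_seconds)
            continue
        idle = 0
        task_id, command = task
        try:
            status, output = "completed", handler(command)
        except Exception as exc:  # store the exception instead of crashing the worker
            status, output = "failed", repr(exc)
        with conn:
            conn.execute(
                "UPDATE tasks SET status = ?, output = ? WHERE id = ?",
                (status, output, task_id),
            )
```

Because the queue lives in an SQLite file, pending tasks survive a restart of the worker; rows left in the running state after a crash would need to be reset to pending, which this sketch does not handle.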

gryphonmyers · 2021-03-30T12:50:46Z

No activity on this for some time, but this feature would unlock just about the most ideal import process I can imagine for music library software. Being able to navigate tagging decisions as interactions with a Telegram bot on my own time, as new music comes in via automated processes... 😍

sampsyo added the discussion label Mar 27, 2015

sampsyo mentioned this issue Jul 7, 2015

import --quiet should write a file that can be user edited and reimported #1538

Closed

sampsyo mentioned this issue Mar 31, 2016

telegram bot #1922

Closed

sampsyo mentioned this issue May 3, 2016

quiet incremental import #1988

Open
