Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

edit: allow interactive editing during the importer #1846

Merged
merged 13 commits into from
Feb 7, 2016
Merged
25 changes: 15 additions & 10 deletions beets/importer.py
Original file line number Diff line number Diff line change
Expand Up @@ -45,7 +45,7 @@

action = Enum('action',
['SKIP', 'ASIS', 'TRACKS', 'MANUAL', 'APPLY', 'MANUAL_ID',
'ALBUMS'])
'ALBUMS', 'RETAG'])

QUEUE_SIZE = 128
SINGLE_ARTIST_THRESH = 0.25
Expand Down Expand Up @@ -443,7 +443,8 @@ def set_choice(self, choice):
# Not part of the task structure:
assert choice not in (action.MANUAL, action.MANUAL_ID)
assert choice != action.APPLY # Only used internally.
if choice in (action.SKIP, action.ASIS, action.TRACKS, action.ALBUMS):
if choice in (action.SKIP, action.ASIS, action.TRACKS, action.ALBUMS,
action.RETAG):
self.choice_flag = choice
self.match = None
else:
Expand Down Expand Up @@ -482,7 +483,7 @@ def chosen_ident(self):
(in which case the data comes from the files' current metadata)
or APPLY (data comes from the choice).
"""
if self.choice_flag is action.ASIS:
if self.choice_flag in (action.ASIS, action.RETAG):
return (self.cur_artist, self.cur_album)
elif self.choice_flag is action.APPLY:
return (self.match.info.artist, self.match.info.album)
Expand All @@ -493,7 +494,7 @@ def imported_items(self):
If the tasks applies an album match the method only returns the
matched items.
"""
if self.choice_flag == action.ASIS:
if self.choice_flag in (action.ASIS, action.RETAG):
return list(self.items)
elif self.choice_flag == action.APPLY:
return self.match.mapping.keys()
Expand Down Expand Up @@ -620,8 +621,12 @@ def align_album_level_fields(self):
"""Make some album fields equal across `self.items`.
"""
changes = {}
# Determine where to gather the info from for the RETAG action.
retag_asis = (self.choice_flag == action.RETAG and
not self.items[0].artist and
not self.items[0].mb_artistid)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't quite follow why these fields are the right ones to check for this... is there a particular reason why you chose artist and mb_artistid?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry: artist should have been albumartist, in a lazy attempt to just use the same fields that the elif is using.

The ultimate reason is trying to find out if we get here by doing a RETAG-based-on-asis or
RETAG-based-on-candidate, as there is no self.match in either case. I'm not sure if this can be achieved without further "polluting" the class (by adding a flag variable of sorts) or similar (the match doesn't get returned from commands.choose_candidate() either, just action.RETAG). Even if it is not bulletproof, it seemed like a good compromise to check the mb_artistid existence in particular as a decent indicator that we are coming from RETAG-based-on-candidate.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, thanks for the explanation---I understand now.

Here's a crazy proposal: what if we never apply the "as-is fixup" logic for the RETAG case? The philosophy would be: if you're willing to edit the data manually, then you can take matters into your own hands.


if self.choice_flag == action.ASIS:
if self.choice_flag == action.ASIS or retag_asis:
# Taking metadata "as-is". Guess whether this album is VA.
plur_albumartist, freq = util.plurality(
[i.albumartist or i.artist for i in self.items]
Expand All @@ -637,7 +642,7 @@ def align_album_level_fields(self):
changes['albumartist'] = config['va_name'].get(unicode)
changes['comp'] = True

elif self.choice_flag == action.APPLY:
elif self.choice_flag in (action.APPLY, action.RETAG):
# Applying autotagged metadata. Just get AA from the first
# item.
if not self.items[0].albumartist:
Expand Down Expand Up @@ -672,7 +677,7 @@ def manipulate_files(self, move=False, copy=False, write=False,
# old paths.
item.move(copy, link)

if write and self.apply:
if write and (self.apply or self.choice_flag == action.RETAG):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or perhaps self.apply should be set in the RETAG case? (I'm not certain; it's just worth considering.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a bit uncertain about what you mean: modifying the task.apply() method so it returns True for both self.choice_flag in (action.APPLY, action.RETAG)? I wouldn't oppose to that or have a strong opinion (didn't go through the implications yet), as I merely opted for the most explicit and simple way possible of adding the new action.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, that's exactly what I meant. I also don't have a strong opinion; I was just noticing that we were checking flag in (APPLY, RETAG) quite often, so it might make sense to use a shortcut.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the clarification, and yes, indeed makes sense. However, it seems it would require a small modification to importer.apply_choice() in order to avoid incorrectly applying the metadata changes if it is a RETAG (something similar to):

@@ -1353,7 +1353,7 @@ def apply_choice(session, task):
         return

     # Change metadata.
-    if task.apply:
+    if task.apply and task.choice_flag == action.APPLY:
         task.apply_metadata()
         plugins.send('import_task_apply', session=session, task=task)

and probably a similar tweak in regards to the align_album_level_fields issue at #1846 (diff).

Both the existing approach and your suggested one sound like a reasonable trade-off to me!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After checking this issue again, I have decided to keep the explicit comparison with RETAG at the moment, mainly because the behaviour of RETAG does not always fall under the APPLY choice:

function RETAG follows
chosen_ident() ASIS
imported_items() ASIS
align_album_level_fields() APPLY
manipulate_files() ~APPLY

It seemed to me that merging both on task.apply() would end up resulting in more confusion and a harder to follow flow! Again, it's rather subjective, so please let me know if you prefer the other approach.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great; thanks for looking into it. This is a very reasonable conclusion.

item.try_write()

with session.lib.transaction():
Expand Down Expand Up @@ -807,8 +812,8 @@ def __init__(self, toppath, item):
self.paths = [item.path]

def chosen_ident(self):
assert self.choice_flag in (action.ASIS, action.APPLY)
if self.choice_flag is action.ASIS:
assert self.choice_flag in (action.ASIS, action.APPLY, action.RETAG)
if self.choice_flag in (action.ASIS, action.RETAG):
return (self.item.artist, self.item.title)
elif self.choice_flag is action.APPLY:
return (self.match.info.artist, self.match.info.title)
Expand Down Expand Up @@ -1315,7 +1320,7 @@ def resolve_duplicates(session, task):
"""Check if a task conflicts with items or albums already imported
and ask the session to resolve this.
"""
if task.choice_flag in (action.ASIS, action.APPLY):
if task.choice_flag in (action.ASIS, action.APPLY, action.RETAG):
found_duplicates = task.find_duplicates(session.lib)
if found_duplicates:
log.debug('found duplicates: {}'.format(
Expand Down
4 changes: 3 additions & 1 deletion beets/ui/commands.py
Original file line number Diff line number Diff line change
Expand Up @@ -870,7 +870,9 @@ def _get_plugin_choices(self, task):
checking the right conditions and returning a list of `PromptChoice`s,
which is flattened and checked for conflicts.

Raises `ValueError` if two of the choices have the same short letter.
If two or more choices have the same short letter, a warning is
emitted and all but one choices are discarded, giving preference
to the default importer choices.

Returns a list of `PromptChoice`s.
"""
Expand Down
152 changes: 141 additions & 11 deletions beetsplug/edit.py
Original file line number Diff line number Diff line change
Expand Up @@ -21,7 +21,10 @@
from beets import util
from beets import ui
from beets.dbcore import types
from beets.ui.commands import _do_query
from beets.importer import action, SingletonImportTask
from beets.library import Item, Album
from beets.ui.commands import _do_query, PromptChoice
from copy import deepcopy
import subprocess
import yaml
from tempfile import NamedTemporaryFile
Expand Down Expand Up @@ -119,7 +122,7 @@ def flatten(obj, fields):
return d


def apply(obj, data):
def apply_(obj, data):
"""Set the fields of a `dbcore.Model` object according to a
dictionary.

Expand Down Expand Up @@ -151,6 +154,23 @@ def __init__(self):
'ignore_fields': 'id path',
})

self.register_listener('before_choose_candidate',
self.before_choose_candidate_listener)
self.register_listener('import_begin', self.import_begin_listener)

def _set_reference_field(self, field):
"""Set the "unequivocal, non-editable field" that will be used for
reconciling back the user changes.
"""
if field == 'id':
self.reference_field = 'id'
self.ref_field_value = lambda o: int(o.id)
self.obj_from_ref = lambda d: int(d['id'])
elif field == 'path':
self.reference_field = 'path'
self.ref_field_value = lambda o: util.displayable_path(o.path)
self.obj_from_ref = lambda d: util.displayable_path(d['path'])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm... It's unfortunate that this needs to update fields on the plugin object. Is there any way we can control this instead by just passing a parameter to the relevant functions? That should make the logical flow easier to read (and, hopefully, it should make for easier testing).


def commands(self):
edit_command = ui.Subcommand(
'edit',
Expand All @@ -174,6 +194,9 @@ def commands(self):
def _edit_command(self, lib, opts, args):
"""The CLI command function for the `beet edit` command.
"""
# Set the reference field to "id", as all Models have valid ids.
self._set_reference_field('id')

# Get the objects to edit.
query = ui.decargs(args)
items, albums = _do_query(lib, query, opts.album, False)
Expand Down Expand Up @@ -202,8 +225,8 @@ def _get_fields(self, album, extra):
if extra:
fields += extra

# Ensure we always have the `id` field for identification.
fields.append('id')
# Ensure we always have the reference field for identification.
fields.append(self.reference_field)

return set(fields)

Expand All @@ -216,20 +239,26 @@ def edit(self, album, objs, fields):
everything).
"""
# Present the YAML to the user and let her change it.
success = self.edit_objects(objs, fields)
if album:
success = self.edit_objects(objs, None, fields)
else:
success = self.edit_objects(objs, fields, None)

# Save the new data.
if success:
self.save_changes(objs)

def edit_objects(self, objs, fields):
def edit_objects(self, objs, item_fields, album_fields):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While I'm not thrilled about the need to introduce two field sets here, I can see that it might be necessary. (Previously, it was nice that this method just worked on Model objects abstractly; it didn't know or care about Item and Album specifically.)

"""Dump a set of Model objects to a file as text, ask the user
to edit it, and apply any changes to the objects.

Return a boolean indicating whether the edit succeeded.
"""
# Get the content to edit as raw data structures.
old_data = [flatten(o, fields) for o in objs]
old_data = [flatten(o,
item_fields if isinstance(o, Item)
else album_fields)
for o in objs]

# Set up a temporary file with the initial data for editing.
new = NamedTemporaryFile(suffix='.yaml', delete=False)
Expand Down Expand Up @@ -262,10 +291,20 @@ def edit_objects(self, objs, fields):
return False

# Show the changes.
# If the objects are not on the DB yet, we need a copy of their
# original state for show_model_changes.
if all(not obj.id for obj in objs):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we need to do this for the not-in-DB case, maybe we should just do it unconditionally? Having this if here doesn't seem to buy us much...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On a closer look, seems like a deepcopy of an object that is on the library might not be a good idea:
TypeError: object.__new__(thread.lock) is not safe, use thread.lock.__new__()

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah indeed, that does seem problematic. Maybe we should just be keeping a dictionary of data (i.e, dict(obj) instead of a full copy of the object?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the end, using a dict(obj) didn't work either - the main issue is that we need to use something that is understood by ui.show_model_changes, and I'm not really sure we can use anything other than a DBModel without modifying that function (or providing a similar alternative that works on plain dicts, etc).

In any case, I have tidied up a bit the code by always generating objs_old (with a deepcopy if the object is not in DB, None otherwise) and removing the ifs by iterating both lists at the same time - hope it makes sense.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK! Once again, thank you for investigating and discarding my half-baked ideas. 😃 This solution looks straightforward enough after the cleanup.

objs_old = {self.ref_field_value(obj): deepcopy(obj)
for obj in objs}
self.apply_data(objs, old_data, new_data)
changed = False
for obj in objs:
changed |= ui.show_model_changes(obj)
if not obj.id:
# TODO: remove uglyness
obj_old = objs_old[self.ref_field_value(obj)]
else:
obj_old = None
changed |= ui.show_model_changes(obj, obj_old)
if not changed:
ui.print_('No changes to apply.')
return False
Expand Down Expand Up @@ -299,7 +338,7 @@ def apply_data(self, objs, old_data, new_data):
self._log.warn('number of objects changed from {} to {}',
len(old_data), len(new_data))

obj_by_id = {o.id: o for o in objs}
obj_by_ref = {self.ref_field_value(o): o for o in objs}
ignore_fields = self.config['ignore_fields'].as_str_seq()
for old_dict, new_dict in zip(old_data, new_data):
# Prohibit any changes to forbidden fields to avoid
Expand All @@ -313,8 +352,9 @@ def apply_data(self, objs, old_data, new_data):
if forbidden:
continue

id = int(old_dict['id'])
apply(obj_by_id[id], new_dict)
# Reconcile back the user edits, using the reference_field.
val = self.obj_from_ref(old_dict)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Continuing the theme, I'm also in general somewhat saddened by the need for all this extra complexity to identify objects by an appropriate field.

Here's one other crazy idea: give the objects a temporary id for the purpose of this workflow. Then, after the process is done, remove it before continuing with the import process.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's one other crazy idea: give the objects a temporary id for the purpose of this workflow. Then, after the process is done, remove it before continuing with the import process.

That would certainly help lower the amount of complexity introduced on this pull request (taking care of #1846 (diff) as well), and something that crossed my mind but felt like dangerous territory.

Are there any special considerations needed to be taken into account when assigning the temporary ids (ie. total_item_count + i)?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm. As long as the IDs never actually get added to the database, I think they can be anything unique---so even just "i" should be a fine way to assign them. Then the IDs can be removed again before they objects get added to the database (at which point they'll be assigned real IDs).

apply_(obj_by_ref[val], new_dict)

def save_changes(self, objs):
"""Save a list of updated Model objects to the database.
Expand All @@ -324,3 +364,93 @@ def save_changes(self, objs):
if ob._dirty:
self._log.debug('saving changes to {}', ob)
ob.try_sync(ui.should_write(), ui.should_move())

# Methods for interactive importer execution.

def before_choose_candidate_listener(self, session, task):
"""Append an "Edit" choice to the interactive importer prompt.
"""
choices = [PromptChoice('d', 'eDit', self.importer_edit)]
if task.candidates:
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one actually revealed a problem if the user uses Enter search or enter Id to find a candidate not found during the initial lookup: as things stand, I believe this listener has no way of retrieving that candidate (as the new candidate it is handled within TerminalImportSession.choose_match(), and only saved when exiting the prompt by applying). I have added a note to the docs, but I'm wondering if something else should be done about it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, good point. I also don't see an immediate solution, so let's revisit the problem later on.

choices.append(PromptChoice('c', 'edit Candidates',
self.importer_edit_candidate))

return choices

def import_begin_listener(self, session):
"""Initialize the reference field to 'path', as during an interactive
import session Models do not have valid 'id's yet.
"""
self._set_reference_field('path')

def importer_edit(self, session, task):
"""Callback for invoking the functionality during an interactive
import session on the *original* item tags.
"""
singleton = isinstance(object, SingletonImportTask)
item_fields = self._get_fields(False, [])
items = list(task.items) # Shallow copy, not modifying task.items.
if not singleton:
# Prepend a FakeAlbum for allowing the user to edit album fields.
album = FakeAlbum(task.items, task.toppath)
items.insert(0, album)
album_fields = self._get_fields(True, [])
else:
album_fields = None

# Present the YAML to the user and let her change it.
success = self.edit_objects(items, item_fields, album_fields)

# Save the new data.
if success:
if not singleton:
# Propagate the album changes to the items.
album._apply_changes()
# Return action.RETAG, which makes the importer write the tags
# to the files if needed.
return action.RETAG
else:
# Edit cancelled / no edits made. Revert changes.
for obj in task.items:
obj.read()

def importer_edit_candidate(self, session, task):
"""Callback for invoking the functionality during an interactive
import session on a *candidate* applied to the original items.
"""
# Prompt the user for a candidate, and simulate matching.
sel = ui.input_options([], numrange=(1, len(task.candidates)))
# Force applying the candidate on the items.
task.match = task.candidates[sel - 1]
task.apply_metadata()

return self.importer_edit(session, task)


class FakeAlbum(Album):
"""Helper for presenting the user with an Album to be edited when there
is no real Album present. The album fields are set from the first item,
and after editing propagated to the items on `_apply_changes`.
"""
def __init__(self, items, path):
self._src_items = items

# Create the album structure using metadata from the first item.
values = dict((key, items[0][key]) for key in Album.item_keys)
# Manually set the path as a single value field.
values[u'path'] = util.displayable_path(path)
super(FakeAlbum, self).__init__(**values)

def _getters(self):
"""Remove 'path' from Album._getters(), treating it as a regular field
in order to be able to use it directly."""
getters = Album._getters()
getters.pop('path')
return getters

def _apply_changes(self):
"""Propagate changes to the album fields onto the Items.
"""
values = dict((key, self[key]) for key in Album.item_keys)
for i in self._src_items:
i.update(values)
9 changes: 5 additions & 4 deletions docs/dev/plugins.rst
Original file line number Diff line number Diff line change
Expand Up @@ -585,10 +585,11 @@ album has no candidates, the relevant checks against ``task.candidates`` should
be performed inside the plugin's ``before_choose_candidate_event`` accordingly.

Please make sure that the short letter for each of the choices provided by the
plugin is not already in use: the importer will raise an exception if two or
more choices try to use the same short letter. As a reference, the following
characters are used by the choices on the core importer prompt, and hence
should not be used: ``a``, ``s``, ``u``, ``t``, ``g``, ``e``, ``i``, ``b``.
plugin is not already in use: the importer will emit a warning and discard
all but one of the choices using the same letter, giving priority to the
core importer prompt choices. As a reference, the following characters are used
by the choices on the core importer prompt, and hence should not be used:
``a``, ``s``, ``u``, ``t``, ``g``, ``e``, ``i``, ``b``.

Additionally, the callback function can optionally specify the next action to
be performed by returning one of the values from ``importer.action``, which
Expand Down