Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

info plugin: Allow custom formatting and human-readable lengths #1737

Merged
merged 8 commits into from Nov 26, 2015

Conversation

diego-plan9
Copy link
Member

Work in progress for implementing the feature discussed on #1689, as the workflow proposed by @awesomer would be pretty much aligned with one of my long-term wishlist for my library (ie. a neat way to assist in adding obscure/missing albums to MusicBrainz without the need for extra tools), and hopefully encourage others contributing back to MusicBrainz.

These commits intend to implement the suggested initial changes on the info plugin:

A good first step, should anyone choose to accept it, would be to extend the info plugin so it can be configured to output using formatting, like the list command.

The command now accepts the standard formatting argument (via cmd.parser.add_format_option), relying on the Item.__format__ machinery for formatting. Modifications have been made so an Item (either directly retrieved from the library if using library_data_emitter, or a "fake" temporary one built from the media file is using tag_data_emitter) is available to print_data().

At the same time, it would be a good idea to format length as a time by default—nobody probably wants to see a raw number of seconds.

Makes much sense to me! Just in case the raw number of seconds is useful to someone, an argument (--raw-length) has been added to switch between raw length and human-readable length (human-readable being the default).

I'd love some feedback on the current work, and also some thoughts on the preferred way for implementing the next steps. My thoughts go along the line of extending the info plugin with a listener for the "MusicBrainz match not found" event, but I'm not sure if this functionality would be better fitted into a less generic plugin?

* Add custom output formatting via a format string to InfoPlugin. The command
accepts a formatting string via the "-f" parameter, which is handled by its
CommonOptionsParser and applied during print_data().
* Modify the emitters in order to include an Item into the list of fields, that
is formatted according to the format string if specified.
* Add an argument to allow the user to choose if track lengths are displayed as
raw floats or using a human-readable form (mm:ss), defaulting to human-readable
form.
* Rename filter() function to avoid warning of reserved built-in symbol.
* Remove mediafile fixture on two tests.
* Add tests for length (human/raw, library/path) and custom format.
@@ -52,7 +53,11 @@ def emitter():
tags[field] = getattr(mf, field)
tags['art'] = mf.art is not None
tags['path'] = displayable_path(path)
# create a temporary Item to take advantage of __format__
tags['item'] = Item(db=None, **tags)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might be a bit fragile and up for debate. The rationale behind it is to have an Item be available for making use of Item.__format__ during print_data().

It might be cleaner to modify the emitters so they return an (Item, data) tuple instead of packing it inside data['item'] as well.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, good point. Maybe the thing to do would be to have both emitters just produce Items instead of dictionaries. Then, the later machinery can be responsible for flattening the Item for display. That would certainly be simpler—but perhaps I'm missing something that would prevent it from working.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like that idea, although after playing a bit with potential solutions returning Items directly might have subtle implications on the output. Currently:

  • if the info command is passed a file, it only prints the fields present on the MediaFile (plus a boolean art field, and a genres field).
  • if the info command is passed a library item, it prints all the fields, including the empty ones.
$ ./beet info test/rsrc/image.mp3 
/home/x/venvs/beets/git/beets/test/rsrc/image.mp3
       art: True
  bitdepth: 0
   bitrate: 68536
  channels: 1
    format: MP3
    genres: 
    length: 1.0
samplerate: 44100

$ ./beet info -l foo
/tmp/music/artist/album/track.mp3
acoustid_fingerprint: 
         acoustid_id: 
               added: 2015-10-15 19:04:53
               album: Album
            album_id: 5
...
               track: 1
          tracktotal: 12
                year: 1988

This seems to be mainly due to the dict being built from the MediaFile directly, and as a result, having None as a value for the tags not present on the file (print_data has a if value is not None: line which explicitly checks for None).

My main concern is that it might be hard to find out what fields are really to be printed due to being on the MediaFile if the emitters return just Items directly (for example, fields on the mediafile stored as a zero or empty string, or fields initialized by the Item constructor with a default empty value, etc), where the difference between a non-present field and an "empty" one is not that clear.

Perhaps it would make sense to have tag_data_emitter produce MediaFiles, and library_data_emitter produce Items, having the flattening handle the differences?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm; that's a very good point. It makes sense to keep the output for files as faithful as possible to the files themselves.

On further reflection, I think I like your current solution the best. Both emitters can produce the same kind of data—a pair of a dictionary (for the default mode) and an Item (for the --format mode). That way, the data output piece can stay agnostic with respect to the data source. Sorry for getting in your way when you had it right all along! 😃

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No problem, as a matter of fact thank you for making discussing these tiny-ish details and contributing to beets in general so enjoyable! I have just pushed a commit that makes the emitters return a (dict, Item) pair, adjusting print_data() (so it accepts an extra parameter, the Item) and the main loop accordingly.

I also took the liberty of revising how the path was handled: it is only used for printing it at the top of print_data() when not using a format string, just before the fields. Since we now have Item.path available at print_data, I thought it would make sense to remove some code that was mainly used for juggling with the path dict key (appending it to the dict on the emitters, preserving it from being deleted during key_filter, and popping it during print_data), hopefully resulting in a slightly cleaner code.

@sampsyo
Copy link
Member

sampsyo commented Nov 25, 2015

Great! The format flag is an excellent first step, and something that could be useful in other contexts too—not just for uploading to MusicBrainz.

If it's OK with you, let's merge the format flag now. But I have a suggestion for the time formatting: let's make this the default for all of beets, not just the info plugin. Specifically, in library.py, let's change the $length field from types.FLOAT to a new type (TimeType?) that has a smarter format method. That way, even beet ls -f '$length' will get the new behavior.

Of course, that makes it somewhat more complicated to revert to the old, raw number if you want that for some reason. This, however, is a broader issue also shared by other fields like $bitrate and even $track, which people occasionally wish they could print as raw values. The current workaround is to define an inline field that just echoes the underlying field. But maybe we can think of a better solution for that also.

* Remove human length changes from the plugin and the tests, as they will
eventually be handled at a higher level.
* Remove human_length parameter from print_data()
@diego-plan9
Copy link
Member Author

Sure, I'd be happy to leave out the time formatting changes out in favor of implementing them later at a wider scope - as a matter of fact, just pushed some commits that remove that functionality. At a first glance, I'd prefer to leave that change to someone more experienced with the types system instead of working on it myself, but if no one steps up I'd be more than willing to give it a try.

* Make emitters produce a pair (dict, Item), in order to preserve the output
at print_data (dict is used if no custom format is specified, Item otherwise).
* Simplify the handling of the paths, printed at the top of print_data. The
path key is removed from the dict entirely and fetched from the Item.
@sampsyo
Copy link
Member

sampsyo commented Nov 26, 2015

Wonderful! This looks great, and I appreciate the simplification of the path handling too. Let's call this done!

sampsyo added a commit that referenced this pull request Nov 26, 2015
info plugin: Allow custom formatting and human-readable lengths
@sampsyo sampsyo merged commit a333777 into beetbox:master Nov 26, 2015
@diego-plan9 diego-plan9 deleted the mbtracks branch October 17, 2016 15:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants