[Ogg] Support for METADATA_BLOCK_PICTURE #70

alixinne · 2017-09-13T15:36:13Z

According to the xiph.org wiki, the currently recommended way to store pictures as part of an Ogg Vorbis file is through the METADATA_BLOCK_PICTURE field. Values of this field are base64 encoded values following the FLAC specification for embedded pictures.

The wiki also mentions the existence of the deprecated and unofficial COVERART field, which is supported by the current code in taglib-sharp. I can see several approaches to support the newer field, for taglib-sharp to be compatible with other projects.

All following approaches refer to changing the Pictures property

Drop support for COVERART: replace COVERART with METADATA_BLOCK_PICTURE, and TagLib.Picture with TagLib.Flac.Picture. Most direct way, although breaks compatibility with older versions.
Support for both COVERART and METADATA_BLOCK_PICTURE: in the getter, read cover arts from both COVERART and METADATA_BLOCK_PICTURE and return corresponding picture objects. In the setter, two approaches are possible:
1. Store pictures depending on the object type: TagLib.Picture objects are stored in the COVERART field, and TagLib.FLAC.Picture objects are stored in the METADATA_BLOCK_PICTURE. Obviously the ordering of a file with mixed COVERART and METADATA_BLOCK_PICTURE pictures would be undefined.
2. Convert old TagLib.Picture objects from the COVERART field to TagLib.FLAC.Picture , and store everything in METADATA_BLOCK_PICTURE. This would however enforce migrating to the new format.

I think 2.i. is the most versatile way, however I'm open to suggestions before sending a PR.

The text was updated successfully, but these errors were encountered:

Starwer · 2017-09-13T17:37:59Z

Good initiative !
I would have a preference for 2.ii. although for 2.ii it is probably simpler to store everything (and convert) in the same TagLib.FLAC.Picture or TagLib.Picture array from the File reading. Then you have only one object class/collection to handle. This should make it also easier for the end-user.

alixinne · 2017-09-13T21:45:37Z

Thanks for the input!
Currently the XiphComment class only stores a list of string[] which represent the raw field values stored in the file. Although this means there is virtually no need for tracking state (especially if the user uses the GetField and SetField methods directly), property getters and setters can be quite expensive (in the case of the Pictures property, this involves decoding the base64 contents, then allocating data structures, and then parsing the picture data).

I'm also wondering about the classes representing pictures. Aside from a few extra properties in TagLib.FLAC.Picture (width, height, color depth, and indexed color count), TagLib.Picture and TagLib.FLAC.Picture have a lot in common. But Tag.Pictures being of type IPicture[] this calls for some weird "if (picture is TagLib.FLAC.Picture)" code.

Implementations for 2.i and 2.ii, to see what it looks like:

1. master...vtavernier:metadata-block-picture-2-i
2. master...vtavernier:metadata-block-picture-2-ii

Starwer · 2017-09-14T07:45:44Z

It seems like 2.ii. still stores both COVERART and METADATA_BLOCK_PICTURE...
I think you're right, storing the raw values poses some problems...
In my opinion, the most efficient architecture would be to store only Picture[] (as a field) in the Tag.
Only at the end of the File.CreateFile() call you convert all the base64 COVERART and METADATA_BLOCK_PICTURE to populate Picture[] field (and you dispose the former base64 fields). Only in the Tag.Render() you convert the Picture[] to the METADATA_BLOCK_PICTURE.
This way this is faster, the returned list doesn't change every time you call the Picture Getter, and you gain also in memory usage (as a Base64 is an inefficient way to store data).
That would mean a bit of code-motion but in the end, it gets simpler.

What I mean, is that with current approach, I always find this a bit confusing:
WriteLine("{0}", file.Tag.Count); // --> 1
file.Tag.Picture.Add(myPicOne);
file.Tag.Picture.Add(myPicTwo);
WriteLine("{0}", file.Tag.Count); // --> still 1 !!!

alixinne · 2017-09-14T13:08:06Z

Sorry, my mistake, the branches were swapped, this is fixed now.

👍 for the initial parsing of the COVERART / METADATA_BLOCK_PICTURE and only writing it during the Render method call. The FieldCount property would need changing too, as it is based on the total number of elements in the field_list dictionary values.

Detecting user changes to the target fields is easy as all field access methods are already implemented (if SetField/RemoveField are called for COVERART or METADATA_BLOCK_PICTURE, we would then invalidate the current Picture[] state and parse it again).

The Picture[] array could be loaded lazily though, to prevent a performance hit on scenarios where lot of files are involved, but only simple text-based tags are being read.

alixinne · 2017-09-14T14:44:22Z

Implemented lazy approach in vtavernier:metadata-block-picture-lazy

Starwer · 2017-09-14T14:44:51Z

The Picture[] array could be loaded lazily though, to prevent a performance hit on scenarios where lot of files are involved, but only simple text-based tags are being read.

Hey ! That would be really cool ! Great idea !
If you encapsulate this lazy load into the Picture class, for example by adding a new constructor public Picture (File, long position, long length), a lot of other format may make use of it. I'd happily do the Matroska adaption to make use of it. We just must be careful to make sure to read the Pictures before we write these (when calling File.Save()), otherwise the physical location of the picture we want to write may be already overwritten or shifted in the file under write.

alixinne · 2017-09-19T12:30:48Z

If you encapsulate this lazy load into the Picture class, for example by adding a new constructor public Picture (File, long position, long length), a lot of other format may make use of it.

This would be very interesting in order to reduce the memory stress while scanning a lot of files. Running a memory profiler on taglib-sharp reveals a lot of allocations of byte[] objects from ByteVectors as well as strings. If a solution is implemented for Pictures, other fields should be able to benefit from it. I think this is out of the scope of this particular issue though.

alixinne mentioned this issue Sep 19, 2017

[Ogg] Support for METADATA_BLOCK_PICTURE #73

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Ogg] Support for METADATA_BLOCK_PICTURE #70

[Ogg] Support for METADATA_BLOCK_PICTURE #70

alixinne commented Sep 13, 2017

Starwer commented Sep 13, 2017 •

edited

Loading

alixinne commented Sep 13, 2017

Starwer commented Sep 14, 2017 •

edited

Loading

alixinne commented Sep 14, 2017

alixinne commented Sep 14, 2017 •

edited

Loading

Starwer commented Sep 14, 2017

alixinne commented Sep 19, 2017

[Ogg] Support for METADATA_BLOCK_PICTURE #70

[Ogg] Support for METADATA_BLOCK_PICTURE #70

Comments

alixinne commented Sep 13, 2017

Starwer commented Sep 13, 2017 • edited Loading

alixinne commented Sep 13, 2017

Starwer commented Sep 14, 2017 • edited Loading

alixinne commented Sep 14, 2017

alixinne commented Sep 14, 2017 • edited Loading

Starwer commented Sep 14, 2017

alixinne commented Sep 19, 2017

Starwer commented Sep 13, 2017 •

edited

Loading

Starwer commented Sep 14, 2017 •

edited

Loading

alixinne commented Sep 14, 2017 •

edited

Loading