Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Ogg] Support for METADATA_BLOCK_PICTURE #70

Open
alixinne opened this issue Sep 13, 2017 · 7 comments
Open

[Ogg] Support for METADATA_BLOCK_PICTURE #70

alixinne opened this issue Sep 13, 2017 · 7 comments

Comments

@alixinne
Copy link
Contributor

According to the xiph.org wiki, the currently recommended way to store pictures as part of an Ogg Vorbis file is through the METADATA_BLOCK_PICTURE field. Values of this field are base64 encoded values following the FLAC specification for embedded pictures.

The wiki also mentions the existence of the deprecated and unofficial COVERART field, which is supported by the current code in taglib-sharp. I can see several approaches to support the newer field, for taglib-sharp to be compatible with other projects.

All following approaches refer to changing the Pictures property

  1. Drop support for COVERART: replace COVERART with METADATA_BLOCK_PICTURE, and TagLib.Picture with TagLib.Flac.Picture. Most direct way, although breaks compatibility with older versions.
  2. Support for both COVERART and METADATA_BLOCK_PICTURE: in the getter, read cover arts from both COVERART and METADATA_BLOCK_PICTURE and return corresponding picture objects. In the setter, two approaches are possible:
    1. Store pictures depending on the object type: TagLib.Picture objects are stored in the COVERART field, and TagLib.FLAC.Picture objects are stored in the METADATA_BLOCK_PICTURE. Obviously the ordering of a file with mixed COVERART and METADATA_BLOCK_PICTURE pictures would be undefined.
    2. Convert old TagLib.Picture objects from the COVERART field to TagLib.FLAC.Picture , and store everything in METADATA_BLOCK_PICTURE. This would however enforce migrating to the new format.

I think 2.i. is the most versatile way, however I'm open to suggestions before sending a PR.

@Starwer
Copy link
Contributor

Starwer commented Sep 13, 2017

Good initiative !
I would have a preference for 2.ii. although for 2.ii it is probably simpler to store everything (and convert) in the same TagLib.FLAC.Picture or TagLib.Picture array from the File reading. Then you have only one object class/collection to handle. This should make it also easier for the end-user.

@alixinne
Copy link
Contributor Author

Thanks for the input!
Currently the XiphComment class only stores a list of string[] which represent the raw field values stored in the file. Although this means there is virtually no need for tracking state (especially if the user uses the GetField and SetField methods directly), property getters and setters can be quite expensive (in the case of the Pictures property, this involves decoding the base64 contents, then allocating data structures, and then parsing the picture data).

I'm also wondering about the classes representing pictures. Aside from a few extra properties in TagLib.FLAC.Picture (width, height, color depth, and indexed color count), TagLib.Picture and TagLib.FLAC.Picture have a lot in common. But Tag.Pictures being of type IPicture[] this calls for some weird "if (picture is TagLib.FLAC.Picture)" code.

Implementations for 2.i and 2.ii, to see what it looks like:

    1. master...vtavernier:metadata-block-picture-2-i
    2. master...vtavernier:metadata-block-picture-2-ii

@Starwer
Copy link
Contributor

Starwer commented Sep 14, 2017

It seems like 2.ii. still stores both COVERART and METADATA_BLOCK_PICTURE...
I think you're right, storing the raw values poses some problems...
In my opinion, the most efficient architecture would be to store only Picture[] (as a field) in the Tag.
Only at the end of the File.CreateFile() call you convert all the base64 COVERART and METADATA_BLOCK_PICTURE to populate Picture[] field (and you dispose the former base64 fields). Only in the Tag.Render() you convert the Picture[] to the METADATA_BLOCK_PICTURE.
This way this is faster, the returned list doesn't change every time you call the Picture Getter, and you gain also in memory usage (as a Base64 is an inefficient way to store data).
That would mean a bit of code-motion but in the end, it gets simpler.

What I mean, is that with current approach, I always find this a bit confusing:
WriteLine("{0}", file.Tag.Count); // --> 1
file.Tag.Picture.Add(myPicOne);
file.Tag.Picture.Add(myPicTwo);
WriteLine("{0}", file.Tag.Count); // --> still 1 !!!

@alixinne
Copy link
Contributor Author

Sorry, my mistake, the branches were swapped, this is fixed now.

👍 for the initial parsing of the COVERART / METADATA_BLOCK_PICTURE and only writing it during the Render method call. The FieldCount property would need changing too, as it is based on the total number of elements in the field_list dictionary values.

Detecting user changes to the target fields is easy as all field access methods are already implemented (if SetField/RemoveField are called for COVERART or METADATA_BLOCK_PICTURE, we would then invalidate the current Picture[] state and parse it again).

The Picture[] array could be loaded lazily though, to prevent a performance hit on scenarios where lot of files are involved, but only simple text-based tags are being read.

@alixinne
Copy link
Contributor Author

alixinne commented Sep 14, 2017

Implemented lazy approach in vtavernier:metadata-block-picture-lazy

@Starwer
Copy link
Contributor

Starwer commented Sep 14, 2017

The Picture[] array could be loaded lazily though, to prevent a performance hit on scenarios where lot of files are involved, but only simple text-based tags are being read.

Hey ! That would be really cool ! Great idea !
If you encapsulate this lazy load into the Picture class, for example by adding a new constructor public Picture (File, long position, long length), a lot of other format may make use of it. I'd happily do the Matroska adaption to make use of it. We just must be careful to make sure to read the Pictures before we write these (when calling File.Save()), otherwise the physical location of the picture we want to write may be already overwritten or shifted in the file under write.

@alixinne
Copy link
Contributor Author

If you encapsulate this lazy load into the Picture class, for example by adding a new constructor public Picture (File, long position, long length), a lot of other format may make use of it.

This would be very interesting in order to reduce the memory stress while scanning a lot of files. Running a memory profiler on taglib-sharp reveals a lot of allocations of byte[] objects from ByteVectors as well as strings. If a solution is implemented for Pictures, other fields should be able to benefit from it. I think this is out of the scope of this particular issue though.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants