Skip to content

HTTPS clone URL

Subversion checkout URL

You can clone with
or
.
Download ZIP

Loading…

Unified dictionary-based tag interface #23

Merged
merged 44 commits into from

3 participants

@supermihi
Collaborator

Hi there,

as discussed on the taglib-devel mailing list some months ago, I have started to implement a dictionary-based abstract interface for reading and writing tags regardless of the specific type. It is meant as an extension of the rather few static functions in TagLib::Tag (like title(), album(), artist()), capable of handling (almost) free-form text tag names, as well as multiple values for the same tag.

Of course this is not possible for all file types. The easiest are vorbis comments which already are of the desired form. ID3v2 is rather complicated due to the huge amount of predefined frames, but due to TXXX frames it is possible to store arbitrary key=[list of values] pairs.

I am aware that this project is far from complete (e.g. there is no abstract way to read and write binary picture data yet), but I would be happy about comments regarding my approach, the implementation, and chances of inclusion in the main taglib trunk, before I put too much into following the wrong direction.

Thanks in advance,
Michael

Michael Helm... added some commits
Michael Helmling Some preliminary work for unified dictionary tag interface support.
- toDict() and fromDict() for XiphComments
- toDict() for ID3v2 Tags
b262180
Michael Helmling More support for the unified dictionary interface.
Addded fromDict() function to ID3v2Tag. Added fromDict() and
toDict() functions to the TagUnion class (uses the first non-empty tag).
Added fromDict() and toDict() functions for the generic Tag class, only
handling common tags without duplicates. Addded preliminary mp3 test
case. Python3 bindings now available on my github site.
58db919
Michael Helmling Added toDict and fromDict methods for APE tags. fa8159a
Michael Helmling Made im/export functions nonvirtual. Added similar functions to File and
its subclasses. TagLib::File contains a bunch of dynamic_casts to call
the correct specializations.
5647b2e
Michael Helmling Merge remote-tracking branch 'official/master' 0356249
Michael Helmling Splitted ID3v2Tag::toDict() into several functions.
This should simplify future transition to virtual functions.
2d31075
Michael Helmling Restructured and simplified ID3v2Tag::fromDict(). 0c2ca20
Michael Helmling Further cleanup and simplification in id3v2dicttools 772bc9f
Michael Helmling Merge remote-tracking branch 'official/master' 292a377
Michael Helmling Implemented dict interface for more formats.
Now supported: MOD files (IT, MOD, S3M, XM), RIFF files
(AIFF, WAV), TrueAudio, WavPack.
0eaf3a3
Michael Helmling Added tests and information about ignored id3 frames.
The ID3v2::toDict() function now has an optional
StringList* argument which will contain information
about frames that could not be converted to the dict
interface.
There are some dict tests for APE and FLAC now, and the
ID3v2 test was enlarged.
c4cef55
@scotchi
Owner

So just throwing in some general comments, mostly on the API:

  • Dict is a bad name for usage in the API. It's the data-type, not descriptive of the content. Property / properties would be better.

  • I don't think having toDict and similar methods in the File class is the right place. Properties should belong to the tags and then, in the case of there being a file with multiple tags, the file can handle presenting a unified interface (as it currently does with some types). Again, API-wise, I believe Tag::properties and Tag::setProperties would be better naming. Since it's also unclear (both in the current version and in my suggestion) if that would overwrite and / or clear existing properties, it'd probably be good to have e.g. setProperties(SomeObviousEnumExplainingWhatHappens).

  • Semantically there needs to be a difference between is not set and cannot be set. I'm pretty sure there will need to be a special class that deals with those semantics rather than just a typedef to the Map class. Otherwise the API will just silently throw away data.

  • There should probably also be a variant (similar to QVariant) class that works with the mapped data.

  • It might be a good idea for the key of the map to be an enum. I'm not sure on this one. Enums can be extended without breaking binary compatibility (at the end) and are less susceptible to bugs introduced by typos.

  • The API presented in id3v2dicttools.h is bad. Having function signatures like TagLib::ID3v2::ignored(const ByteVector &) is totally ambiguous. Those functions belong in ID3v2::Frame and need better names.

  • There are a couple of coding style issues. The two that I noticed were: (a) there's nowhere else in TagLib (as far as I'm aware) that prefixes functions with underscores and (b) TagLib doesn't use spaces between conditionals (if, while) and the opening parenthesis -- so it should be if(foo) not if (foo).

I know Lukas has thought a lot about this sort of interface, so he can probably say if this is close to the direction that he's had in mind.

@scotchi
Owner

One other minor comment -- I noticed sometimes that there was !(someString == String::null) a few places. First that specific instance should use isNull(), but I also noticed that somehow we never had a operator!= defined for String and added that.

@lalinsky
Owner

I'll have more comments later, but regarding the enum for the keys, I'd like to have the ability to store arbitrary string keys in the map for tag formats that support such keys, but have constants with well-known keys that TagLib officially supports.

@supermihi
Collaborator
taglib/ape/apefile.cpp
((5 lines not shown))
+{
+ if (d->hasAPE)
+ return d->tag.access<APE::Tag>(APEIndex, false)->toDict();
+ if (d->hasID3v1)
+ return d->tag.access<ID3v1::Tag>(ID3v1Index, false)->toDict();
+ return TagLib::TagDict();
+}
+
+void APE::File::fromDict(const TagDict &dict)
+{
+ if (d->hasAPE)
+ d->tag.access<APE::Tag>(APEIndex, false)->fromDict(dict);
+ else if (d->hasID3v1)
+ d->tag.access<ID3v1::Tag>(ID3v1Index, false)->fromDict(dict);
+ else
+ d->tag.access<APE::Tag>(APE, true)->fromDict(dict);
@lalinsky Owner
lalinsky added a note

I think that fromDict() should not modify the file in any way.

@supermihi Collaborator

I don't see how this modifies the file … doesn't access(APE, true) just create an APE tag in memory and the file is not touched until save() is called on the File object?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
taglib/ape/apetag.cpp
((8 lines not shown))
+ for (; it != itemListMap().end(); ++it) {
+ String tagName = it->first.upper();
+ // These two tags need to be handled specially; in APE tags the track number is usually
+ // named TRACK instead of TRACKNUMBER, the date tag is YEAR instead of DATE
+ //
+ if (tagName == "TRACK")
+ tagName = "TRACKNUMBER";
+ else if (tagName == "YEAR")
+ tagName = "DATE";
+ if (it->second.type() == Item::Text)
+ dict[tagName].append(it->second.toStringList());
+ }
+ return dict;
+}
+
+void APE::Tag::fromDict(const TagDict &orig_dict)
@lalinsky Owner
lalinsky added a note

Should be called origDict.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
taglib/ape/apetag.cpp
((29 lines not shown))
+ dict.insert("TRACK", dict["TRACKNUMBER"]);
+ dict.erase("TRACKNUMBER");
+ }
+ if (dict.contains("DATE")) {
+ dict.insert("YEAR", dict["DATE"]);
+ dict.erase("DATE");
+ }
+
+ // first check if tags need to be removed completely
+ StringList toRemove;
+ ItemListMap::ConstIterator remIt = itemListMap().begin();
+ for (; remIt != itemListMap().end(); ++remIt) {
+ if (remIt->second.type() != APE::Item::Text)
+ // ignore binary and locator APE items
+ continue;
+ if (!dict.contains(remIt->first.upper()))
@lalinsky Owner
lalinsky added a note

Hm, it probably makes sense for TagDict to be case-insensitive.

@supermihi Collaborator

Yes, by various other comments here I am almost convinced that it makes sense to define a dedicated class for TagDict instead of using a simple typedef ... that class should be case-insensitive on the keys.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
taglib/it/itfile.h
@@ -61,6 +61,18 @@
Mod::Tag *tag() const;
/*!
+ * Implements the unified tag dictionary interface -- export function.
+ * Forwards to Mod::Tag::toDict().
+ */
+ TagDict toDict() const;
+
+ /*!
+ * Implements the unified tag dictionary interface -- import function.
+ * Forwards to Mod::Tag::fromDict().
+ */
+ void fromDict(const TagDict &);
@lalinsky Owner
lalinsky added a note

There should be a BIC: xxx comment here saying that these functions should be removed when they are virtual, as they can be inherited from TagLib::File.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
taglib/mpeg/id3v2/id3v2dicttools.cpp
((55 lines not shown))
+ }
+
+ /*!
+ * A map of translations frameID <-> tag used by the unified dictionary interface.
+ */
+ static const uint numid3frames = 54;
+ static const char *id3frames[][2] = {
+ // Text information frames
+ { "TALB", "ALBUM"},
+ { "TBPM", "BPM" },
+ { "TCOM", "COMPOSER" },
+ { "TCON", "GENRE" },
+ { "TCOP", "COPYRIGHT" },
+ { "TDEN", "ENCODINGTIME" },
+ { "TDLY", "PLAYLISTDELAY" },
+ { "TDOR", "ORIGINALRELEASETIME" },
@lalinsky Owner
lalinsky added a note

I'd prefer to call this ORIGINALDATE.

@supermihi Collaborator

I agree because ORIGINALDATE is quite popular and ORIGINALRELEASETIME ugly and lengthy, though it's a bit of an inconsistency, since the id3 specs speak of "release time" for both TDRL and TDOR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
taglib/mpeg/id3v2/id3v2dicttools.cpp
((62 lines not shown))
+ // Text information frames
+ { "TALB", "ALBUM"},
+ { "TBPM", "BPM" },
+ { "TCOM", "COMPOSER" },
+ { "TCON", "GENRE" },
+ { "TCOP", "COPYRIGHT" },
+ { "TDEN", "ENCODINGTIME" },
+ { "TDLY", "PLAYLISTDELAY" },
+ { "TDOR", "ORIGINALRELEASETIME" },
+ { "TDRC", "DATE" },
+ // { "TRDA", "DATE" }, // id3 v2.3, replaced by TDRC in v2.4
+ // { "TDAT", "DATE" }, // id3 v2.3, replaced by TDRC in v2.4
+ // { "TYER", "DATE" }, // id3 v2.3, replaced by TDRC in v2.4
+ // { "TIME", "DATE" }, // id3 v2.3, replaced by TDRC in v2.4
+ { "TDRL", "RELEASETIME" },
+ { "TDTG", "TAGGINGTIME" },
@lalinsky Owner
lalinsky added a note

I know that the ID3 standard calls these frames 'time', but I think it would be better to call them 'date'.

@supermihi Collaborator

Agreed

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
taglib/mpeg/id3v2/id3v2dicttools.cpp
((66 lines not shown))
+ { "TCON", "GENRE" },
+ { "TCOP", "COPYRIGHT" },
+ { "TDEN", "ENCODINGTIME" },
+ { "TDLY", "PLAYLISTDELAY" },
+ { "TDOR", "ORIGINALRELEASETIME" },
+ { "TDRC", "DATE" },
+ // { "TRDA", "DATE" }, // id3 v2.3, replaced by TDRC in v2.4
+ // { "TDAT", "DATE" }, // id3 v2.3, replaced by TDRC in v2.4
+ // { "TYER", "DATE" }, // id3 v2.3, replaced by TDRC in v2.4
+ // { "TIME", "DATE" }, // id3 v2.3, replaced by TDRC in v2.4
+ { "TDRL", "RELEASETIME" },
+ { "TDTG", "TAGGINGTIME" },
+ { "TENC", "ENCODEDBY" },
+ { "TEXT", "LYRICIST" },
+ { "TFLT", "FILETYPE" },
+ { "TIPL", "INVOLVEDPEOPLE" },
@lalinsky Owner
lalinsky added a note

TIPL can't be mapped like this, it contains an array of key/value pairs. Same for TMCL.

@supermihi Collaborator

Okay, that's interesting. Any suggestion on how to handle this? I think I'd prefer

instrument=[persons]

though this is losing the information which type of frame the values come from.

@lalinsky Owner

As I said, I'm biased because I wrote something like this before, but I'd use a mapping like described here - http://wiki.musicbrainz.org/Picard_Tag_Mapping

You should be able to consistently reverse the mapping, so I'd use a prefix like performer: for TMCL and only white-listed items from TIPL. That way you know that everything prefixed performer: goes to TMCL and the selected fields to to TIPL.

@supermihi Collaborator

Sounds good, though it's odd that vorbis has another convention here. :-/

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
taglib/mpeg/id3v2/id3v2dicttools.cpp
((82 lines not shown))
+ { "TIT1", "CONTENTGROUP" },
+ { "TIT2", "TITLE"},
+ { "TIT3", "SUBTITLE" },
+ { "TKEY", "INITIALKEY" },
+ { "TLAN", "LANGUAGE" },
+ { "TLEN", "LENGTH" },
+ { "TMCL", "MUSICIANCREDITS" },
+ { "TMED", "MEDIATYPE" },
+ { "TMOO", "MOOD" },
+ { "TOAL", "ORIGINALALBUM" },
+ { "TOFN", "ORIGINALFILENAME" },
+ { "TOLY", "ORIGINALLYRICIST" },
+ { "TOPE", "ORIGINALARTIST" },
+ { "TOWN", "OWNER" },
+ { "TPE1", "ARTIST"},
+ { "TPE2", "PERFORMER" },
@lalinsky Owner
lalinsky added a note

I'd say that this is now de-facto the ALBUMARTIST field. The convention was started by WMP, followed by iTunes, but now almost everybody uses it.

@supermihi Collaborator

That's odd, I am using "performer" frequently to specify the orchestra for on classical works ... anyway, I agree that it's pointless to boycott a de-facto standard set by Microsoft/Apple. ;-)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
taglib/mpeg/id3v2/id3v2dicttools.cpp
((84 lines not shown))
+ { "TIT3", "SUBTITLE" },
+ { "TKEY", "INITIALKEY" },
+ { "TLAN", "LANGUAGE" },
+ { "TLEN", "LENGTH" },
+ { "TMCL", "MUSICIANCREDITS" },
+ { "TMED", "MEDIATYPE" },
+ { "TMOO", "MOOD" },
+ { "TOAL", "ORIGINALALBUM" },
+ { "TOFN", "ORIGINALFILENAME" },
+ { "TOLY", "ORIGINALLYRICIST" },
+ { "TOPE", "ORIGINALARTIST" },
+ { "TOWN", "OWNER" },
+ { "TPE1", "ARTIST"},
+ { "TPE2", "PERFORMER" },
+ { "TPE3", "CONDUCTOR" },
+ { "TPE4", "ARRANGER" },
@lalinsky Owner
lalinsky added a note

I'd like to call this REMIXER, but don't have a strong opinion.

@supermihi Collaborator

Neither do I … I have choosen ARRANGER here because I'm more into the classical stuff where arrangers are more common, but I understand that's not the mainstream. :-) It seems that you have done more investigation on this, so I will stick to REMIXER.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
taglib/mpeg/id3v2/id3v2dicttools.cpp
((91 lines not shown))
+ { "TOAL", "ORIGINALALBUM" },
+ { "TOFN", "ORIGINALFILENAME" },
+ { "TOLY", "ORIGINALLYRICIST" },
+ { "TOPE", "ORIGINALARTIST" },
+ { "TOWN", "OWNER" },
+ { "TPE1", "ARTIST"},
+ { "TPE2", "PERFORMER" },
+ { "TPE3", "CONDUCTOR" },
+ { "TPE4", "ARRANGER" },
+ { "TPOS", "DISCNUMBER" },
+ { "TPRO", "PRODUCEDNOTICE" },
+ { "TPUB", "PUBLISHER" },
+ { "TRCK", "TRACKNUMBER" },
+ { "TRSN", "RADIOSTATION" },
+ { "TRSO", "RADIOSTATIONOWNER" },
+ { "TSOA", "ALBUMSORT" },
@lalinsky Owner
lalinsky added a note

Can you please map TSO2 to ALBUMARTISTSORT (non-standard frame, but commonly used -- iTunes' invention)

@supermihi Collaborator

Ok, Thanks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
taglib/ape/apetag.cpp
@@ -174,6 +174,71 @@ void APE::Tag::setTrack(uint i)
addValue("TRACK", String::number(i), true);
}
+TagDict APE::Tag::toDict() const
+{
+ TagDict dict;
+ ItemListMap::ConstIterator it = itemListMap().begin();
+ for (; it != itemListMap().end(); ++it) {
+ String tagName = it->first.upper();
+ // These two tags need to be handled specially; in APE tags the track number is usually
+ // named TRACK instead of TRACKNUMBER, the date tag is YEAR instead of DATE
@lalinsky Owner
lalinsky added a note

I'd add one more conversion here, ALBUM ARTIST to ALBUMARTIST (used by foobar2000).

@supermihi Collaborator

Should I also do the other direction on exporting, or do other programs understand ALBUMARTIST here?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
@lalinsky
Owner

Some comments:

  • In general, the patch looks good to me (I haven't completely reviewed the id3v2 code yet). I also don't particularly like the Dict naming, but don't have a better suggestion.

  • I'd like the mapping to be based on this document, but I'm a little biased here. I'm especially missing MusicBrainz IDs. :)

  • WMA and MP4 should be implemented as well, see the spreadsheet. I can do this.

  • I'm not sure about automatically importing all TXXX frames. I'd rather have a white-list of known good values that TXXX can contain.

  • I'd definitely like to see a number of constants defined and to use these where possible. Literal strings should be used only for dynamic fields.

@supermihi
Collaborator
  • do you agree with Scott on properties / setProperties ?
  • very helpful document … I was looking for such a complete list all the time! I don't mind adding the musicbrainz stuff, though a large part of those frames should already be translated correctly because of the general TXXX handling
  • that would be great … I'll be busy enough with ID3 for a while :). Is it ok to ignore ID3v1?
  • I disagree in this point. My main goal in implementing this dictionary-based interface is to free users from any restrictions on what software developers think are "good" tag keys. If someone wants a tag foo=[bar1,bar2], he should be able to do that. And since I expect that music software will prefer the dict-based taglib interface once it's complete and stable, features not supported by that interface will become more and more unlikely to be used at all. On the other hand I know that some programs create a lot of proprietary TXXX spam, so I understand your point.
  • I'll try to revise my code. Do you also refer to e.g. the conversions in APE::Tag?
@supermihi
Collaborator

Regarding TXXX frames, what do you think about the following:

  • map underscores in property dict to spaces in TXXX description and vice versa
  • uppercase first letter of each word in TXXX description (dict keys are case insensitive)

This would include a lot of the musicbrainz frames in your table. By the way, which of the columns would you suggest for the translation?

Michael Helm... added some commits
Michael Helmling Basic implementation of a PropertyMap.
Implemented key/valuelist property map with
case-insensitive ASCII keys and StringList values.

Todo:
- subclass StringList to add flags indicating whether a value could
be written to the specific file format
- add member attribute indicating list of frames that could not be
parsed into the PropertyMap representation.
d11189b
Michael Helmling Add unsupportedData() to PropertyMap, simplified [] behavior. 18ae797
Michael Helmling Migration to new PropertyMap ... done ape to mod. e4d955d
Michael Helmling Started to work on ID3v2. a5e45f1
Michael Helmling Implemented asProperties() in all relevant textual frames. 0c8e5bb
Michael Helmling More progress in ID3 ... setProperties() will get messy :( a8632f7
@lalinsky
Owner

Sorry about not answering the questions. I think I've changed my mind about the TXXX frames, I guess it doesn't do harm to load all of them.

@supermihi
Collaborator

Glad to hear that! :-) As you can see, I have reworked large parts of the patch, especially the ID3 part is now much deeper integrated into the existing classes.
I still need to do a bit more testing and refinement, I'll give notice as soon as I think it's time for another pull request.

@supermihi
Collaborator

There we go ... I hereby re-request to pull in the updated version of my patch, and/or comment on what further modifications are necessary.

CMakeLists.txt
@@ -16,8 +16,8 @@ if(VISIBILITY_HIDDEN)
add_definitions (-fvisibility=hidden)
endif()
-option(BUILD_TESTS "Build the test suite" OFF)
@lalinsky Owner

This change shouldn't be included in the patch.

@supermihi Collaborator

Corrected.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
taglib/ape/apefile.cpp
((13 lines not shown))
+void APE::File::removeUnsupportedProperties(const StringList &properties)
+{
+ if(d->hasAPE)
+ d->tag.access<APE::Tag>(APEIndex, false)->removeUnsupportedProperties(properties);
+ if(d->hasID3v1)
+ d->tag.access<ID3v1::Tag>(ID3v1Index, false)->removeUnsupportedProperties(properties);
+}
+
+PropertyMap APE::File::setProperties(const PropertyMap &properties)
+{
+ if(d->hasAPE)
+ return d->tag.access<APE::Tag>(APEIndex, false)->setProperties(properties);
+ else if(d->hasID3v1)
+ return d->tag.access<ID3v1::Tag>(ID3v1Index, false)->setProperties(properties);
+ else
+ return d->tag.access<APE::Tag>(APE, true)->setProperties(properties);
@lalinsky Owner

APEIndex instead of APE?

@supermihi Collaborator

Thanks, corrected.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
taglib/mpeg/id3v2/frames/textidentificationframe.cpp
@@ -238,6 +377,25 @@ void UserTextIdentificationFrame::setDescription(const String &s)
TextIdentificationFrame::setText(l);
}
+PropertyMap UserTextIdentificationFrame::asProperties() const
+{
+ String tagName = description();
+ // Quodlibet/Exfalso use QuodLibet::<tagname> if you set an arbitrary ID3 tag.
@lalinsky Owner

I'm not sure if it's a good idea to remove the prefixes, because load/save will drop them and QL will not see the tags anymore.

@supermihi Collaborator

I'm not sure either. The main reason why I've done it this way is that I currently use ex falso as my main tagging tool. :-) Is the "::" style used by other programs, too?
I would also be ok with including the prefixes, and let the application using Taglib decide how to handle them.

@lalinsky Owner

I don't know about any other program. But specifically because you are using Ex Falso, this makes TagLib basically destroy the tags for you if you save the file, no?

@supermihi Collaborator

True–however once taglib supports free-form keys I'll stop using Ex Falso. :-) Anyway, I agree that the issue is not worth breaking read/write symmetry, so I'll remove the prefix detection.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
@lalinsky lalinsky commented on the diff
taglib/mpeg/id3v2/id3v2frame.cpp
@@ -95,10 +100,54 @@ ByteVector Frame::textDelimiter(String::Type t)
return d;
}
+const String Frame::instrumentPrefix("PERFORMER:");
@lalinsky Owner

Normally I'd say to use regular const char* here, it would break parallel calls to TagLib, but now with atomic refcounts I guess it should be ok.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
@lalinsky lalinsky commented on the diff
taglib/mpeg/id3v2/id3v2frame.cpp
((26 lines not shown))
+ }
+ // now we check if it's one of the "special" cases:
+ // -LYRICS: depending on the number of values, use USLT or TXXX (with description=LYRICS)
+ if(key == "LYRICS" && values.size() == 1){
+ UnsynchronizedLyricsFrame *frame = new UnsynchronizedLyricsFrame();
+ frame->setText(values.front());
+ return frame;
+ }
+ // -URL: depending on the number of values, use WXXX or TXXX (with description=URL)
+ if((key == "URL" || key.startsWith(urlPrefix)) && values.size() == 1){
+ UserUrlLinkFrame *frame = new UserUrlLinkFrame(String::UTF8);
+ frame->setDescription(key == "URL" ? key : key.substr(urlPrefix.size()));
+ frame->setUrl(values.front());
+ return frame;
+ }
+ // -COMMENT: depending on the number of values, use COMM or TXXX (with description=COMMENT)
@lalinsky Owner

This seems misleading, you use COMM in both cases.

@supermihi Collaborator

I don't understand your comment. If the condition "values.size() == 1" fails, then the last line of the method is executed, creating a TXXX frame.

@lalinsky Owner

Ah, thanks, I didn't notice that.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
@lalinsky lalinsky commented on the diff
taglib/mpeg/id3v2/id3v2frame.cpp
((29 lines not shown))
+ if(key == "LYRICS" && values.size() == 1){
+ UnsynchronizedLyricsFrame *frame = new UnsynchronizedLyricsFrame();
+ frame->setText(values.front());
+ return frame;
+ }
+ // -URL: depending on the number of values, use WXXX or TXXX (with description=URL)
+ if((key == "URL" || key.startsWith(urlPrefix)) && values.size() == 1){
+ UserUrlLinkFrame *frame = new UserUrlLinkFrame(String::UTF8);
+ frame->setDescription(key == "URL" ? key : key.substr(urlPrefix.size()));
+ frame->setUrl(values.front());
+ return frame;
+ }
+ // -COMMENT: depending on the number of values, use COMM or TXXX (with description=COMMENT)
+ if((key == "COMMENT" || key.startsWith(commentPrefix)) && values.size() == 1){
+ CommentsFrame *frame = new CommentsFrame(String::UTF8);
+ frame->setDescription(key == "COMMENT" ? key : key.substr(commentPrefix.size()));
@lalinsky Owner

Aren't only comments with empty description loaded as "COMMENT"? If so, it seems that adding comment will convert it to "COMMENT:COMMENT" in the next read.

@supermihi Collaborator

No, in the current implementation either description="COMMENT" or description="" are parsed as comment. Similar applies for LYRICS and URL (see the documentation of properties() in id3v2tag.h).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
taglib/mpeg/id3v2/id3v2frame.cpp
((58 lines not shown))
+ { "WOAF", "FILEWEBPAGE" },
+ { "WOAR", "ARTISTWEBPAGE" },
+ { "WOAS", "AUDIOSOURCEWEBPAGE" },
+ { "WORS", "RADIOSTATIONWEBPAGE" },
+ { "WPAY", "PAYMENTWEBPAGE" },
+ { "WPUB", "PUBLISHERWEBPAGE" },
+ //{ "WXXX", "URL"}, handled specially
+ // Other frames
+ { "COMM", "COMMENT" },
+ //{ "USLT", "LYRICS" }, handled specially
+};
+
+Map<ByteVector, String> &idMap()
+{
+ static Map<ByteVector, String> m;
+ if(m.isEmpty())
@lalinsky Owner

Indentation.

@supermihi Collaborator

Fixed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
taglib/mpeg/id3v2/id3v2frame.cpp
((108 lines not shown))
+{
+ static Map<String, ByteVector> m;
+ if(m.isEmpty())
+ for(size_t i = 0; i < frameTranslationSize; ++i)
+ m[frameTranslation[i][1]] = frameTranslation[i][0];
+ if(m.contains(s.upper()))
+ return m[s];
+ return ByteVector::null;
+}
+
+PropertyMap Frame::asProperties() const
+{
+ const ByteVector &id = frameID();
+ // workaround until this function is virtual
+ if(id == "TXXX")
+ return dynamic_cast< const UserTextIdentificationFrame* >(this)->asProperties();
@lalinsky Owner

Unfortunately you can't assume that TXXX is always UserTextIdentificationFrame. It can be UnknownFrame if the frame was compressed and TagLib was not compiled with zlib. This is a common source of bugs. You need to check if the result is not NULL.

@supermihi Collaborator

Thanks for that hint. I've now added such a test which adds "UNKNOWN" to unsupportedData if that happens. Then, removeUnsupportedProperties removes all frames that successfully cast to UnknownFrame, if the given list includes the keyword "UNKNOWN". This doesn't seem super elegant to me, though ... perhaps there is a better way to deal with the issue.

@lalinsky Owner

Well, the same problem can happen for the other frame types.

@supermihi Collaborator

Oh, I misunderstood ... okay, I'll try to find a better solution.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
@lalinsky lalinsky commented on the diff
taglib/toolkit/tpropertymap.cpp
((183 lines not shown))
+
+StringList &PropertyMap::unsupportedData()
+{
+ return unsupported;
+}
+
+const StringList &PropertyMap::unsupportedData() const
+{
+ return unsupported;
+}
+
+String PropertyMap::prepareKey(const String &proposed) {
+ if(proposed.isEmpty())
+ return String::null;
+ for (String::ConstIterator it = proposed.begin(); it != proposed.end(); it++)
+ // forbid non-printable, non-ascii, '=' (#61) and '~' (#126)
@lalinsky Owner

Is there any special reason for excluding non-ASCII characters and =/~?

@lalinsky Owner

Ok, I see that the requirement for ASCII with = excluded comes from Vorbis Comments, but what about ~?

@supermihi Collaborator

I got this from wikipedia: "The field name can be composed of printable ASCII characters, 0x20 (space) through 0x7D ('}'), with 0x3D ('=') and 0x7E ('~') excluded."
I am uncertain about this point. Perhaps it'd be better to impose no restrictions on the PropertyMap level, and check the keys in the vorbis implementation. Appearantly APE has different restrictions (2-16 ASCII chars, not "ID3", "TAG", "OggS", "MP+"), so there's no real reason for using the vorbis convention (except that I have mostly Ogg/FLAC stuff).
I don't have a strong opinion in either direction ... I guess none of those restrictions will ever be relevant in practice. ;-)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
@lalinsky
Owner

Generally, I like this very much. I still want to review some of the ID3 code, but I'll merge it soon.

@supermihi
Collaborator

I'm happy to hear that you like my patch. :-) Some general thougts of mine:

  • I am not completely satisfied with having a "unsupported" list allocated in every PropertyMap object, since it's gonna be rarely used in practice. But I really like the idea of being able to remove any "non-standard" stuff from the tag (which I found is often information written by proprietary applications).
  • in the long term I'd like to have picture support, but I'm not sure how to add this nicely. I was thinking of something like a list of tuples (description, binary data) as another field of PropertyMap.
  • I have partially adopted the ID3v2 ID/key mapping from the MusicBrainz document you referenced above, but you should probably have a look at it, you seem more experienced in this area (in particular, I didn't yet bother with musicbrainz-specific tags).
@supermihi supermihi commented on the diff
taglib/mpeg/id3v2/frames/textidentificationframe.cpp
((4 lines not shown))
+PropertyMap UserTextIdentificationFrame::asProperties() const
+{
+ String tagName = description();
+
+ PropertyMap map;
+ String key = map.prepareKey(tagName);
+ if(key.isNull()) // this frame's description is not a valid PropertyMap key -> add to unsupported list
+ map.unsupportedData().append(L"TXXX/" + description());
+ else {
+ StringList v = fieldList();
+ for(StringList::ConstIterator it = v.begin(); it != v.end(); ++it)
+ if(*it != description())
+ map.insert(key, *it);
+ }
+ return map;
+}
@supermihi Collaborator

Do you know why taglib stores a copy the description() in the actual text() part of TXXX frames? This is what makes this check necessary, and is a bit ugly since it forbids writing a tag where the value equals the key ...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Michael Helmling Fixed handling of UnknownFrames in ID3v2.
- If an unknown frame with id "XXXX" occurs, an entry
"UNKNOWN/XXXX" is added to unsupportedData().
The removeUnsupportedProperties() method in turn
removes all unknown frames with id "XXXX" if it
encounters a string "UNKNOWN/XXXX" in the given list.

- Implemented findByDescription() to UnsynchronizedLyricsFrame
in order to support removal of lyrics frames with unsupported
keys.

- Adapted id3v2 test case to new QuodLibet policy.
f5a2518
@lalinsky lalinsky merged commit f5a2518 into taglib:master
@lalinsky
Owner

I've merged the changes, we can deal with other related issues separately. Thanks a lot for working on this!

@supermihi
Collaborator

Well, thank you for supporting my work, and for the helpful criticism, without which my original "taglib wrapper library" would have forever remained a buggy experimental piece of code. :-)
I'll be happy to stay in touch regarding further development of the new interface (especially support for cover art is what I'm interested in).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Commits on Aug 26, 2011
  1. Some preliminary work for unified dictionary tag interface support.

    Michael Helmling authored
    - toDict() and fromDict() for XiphComments
    - toDict() for ID3v2 Tags
  2. More support for the unified dictionary interface.

    Michael Helmling authored
    Addded fromDict() function to ID3v2Tag. Added fromDict() and
    toDict() functions to the TagUnion class (uses the first non-empty tag).
    Added fromDict() and toDict() functions for the generic Tag class, only
    handling common tags without duplicates. Addded preliminary mp3 test
    case. Python3 bindings now available on my github site.
Commits on Aug 27, 2011
  1. Added toDict and fromDict methods for APE tags.

    Michael Helmling authored
Commits on Aug 28, 2011
  1. Made im/export functions nonvirtual. Added similar functions to File and

    Michael Helmling authored
    its subclasses. TagLib::File contains a bunch of dynamic_casts to call
    the correct specializations.
Commits on Sep 1, 2011
  1. Merge remote-tracking branch 'official/master'

    Michael Helmling authored
Commits on Sep 11, 2011
  1. Splitted ID3v2Tag::toDict() into several functions.

    Michael Helmling authored
    This should simplify future transition to virtual functions.
  2. Restructured and simplified ID3v2Tag::fromDict().

    Michael Helmling authored
Commits on Sep 12, 2011
  1. Further cleanup and simplification in id3v2dicttools

    Michael Helmling authored
Commits on Oct 30, 2011
  1. Merge remote-tracking branch 'official/master'

    Michael Helmling authored
Commits on Nov 2, 2011
  1. Implemented dict interface for more formats.

    Michael Helmling authored
    Now supported: MOD files (IT, MOD, S3M, XM), RIFF files
    (AIFF, WAV), TrueAudio, WavPack.
Commits on Jan 1, 2012
  1. Added tests and information about ignored id3 frames.

    Michael Helmling authored
    The ID3v2::toDict() function now has an optional
    StringList* argument which will contain information
    about frames that could not be converted to the dict
    interface.
    There are some dict tests for APE and FLAC now, and the
    ID3v2 test was enlarged.
Commits on Jan 4, 2012
  1. Merge remote-tracking branch 'official/master'

    Michael Helmling authored
Commits on Jan 14, 2012
  1. Merge remote-tracking branch 'official/master'

    Michael Helmling authored
  2. Implemented the most easy comments on the pull request.

    Michael Helmling authored
Commits on Jan 16, 2012
  1. Basic implementation of a PropertyMap.

    Michael Helmling authored
    Implemented key/valuelist property map with
    case-insensitive ASCII keys and StringList values.
    
    Todo:
    - subclass StringList to add flags indicating whether a value could
    be written to the specific file format
    - add member attribute indicating list of frames that could not be
    parsed into the PropertyMap representation.
Commits on Jan 17, 2012
  1. Add unsupportedData() to PropertyMap, simplified [] behavior.

    Michael Helmling authored
Commits on Jan 21, 2012
  1. Migration to new PropertyMap ... done ape to mod.

    Michael Helmling authored
  2. Started to work on ID3v2.

    Michael Helmling authored
Commits on Jan 22, 2012
  1. Implemented asProperties() in all relevant textual frames.

    Michael Helmling authored
  2. More progress in ID3 ... setProperties() will get messy :(

    Michael Helmling authored
Commits on Feb 14, 2012
  1. ID3 interface complete; vorbis done; wav done

    Michael Helmling authored
  2. Ported s3m; removed old id3v2dicttools.

    Michael Helmling authored
  3. Ported trueaudio.

    Michael Helmling authored
  4. Ported wavpack.

    Michael Helmling authored
  5. Ported xm.

    Michael Helmling authored
  6. fixed lots of bugs found by 'make'

    Michael Helmling authored
Commits on Feb 15, 2012
  1. Added lots of missing includes

    Michael Helmling authored
  2. Fixed id3v2 test

    Michael Helmling authored
  3. fixed tests

    Michael Helmling authored
Commits on Feb 19, 2012
  1. Added some functions, started to fix bugs.

    Michael Helmling authored
  2. fixed bugs preventing tests from running

    Michael Helmling authored
  3. removed debug messages

    Michael Helmling authored
Commits on Feb 25, 2012
  1. added APE tag PropertyMap test

    Michael Helmling authored
  2. Moved APE test to correct place; added MOD tag test.

    Michael Helmling authored
  3. Merge remote-tracking branch 'official/master'

    Michael Helmling authored
  4. some cosmetic changes

    Michael Helmling authored
  5. Added ID3v2 PropertyMap interface documentation.

    Michael Helmling authored
Commits on Feb 26, 2012
  1. remove Tests/Examples build from CMakeLists

    Michael Helmling authored
  2. Fix USLT frame creation in Frame::createTextualFrame()

    Michael Helmling authored
  3. Fixed identation

    Michael Helmling authored
  4. Add support for Unknown TXXX frames.

    Michael Helmling authored
  5. Removed quodlibet special case handling

    Michael Helmling authored
  6. Fixed handling of UnknownFrames in ID3v2.

    Michael Helmling authored
    - If an unknown frame with id "XXXX" occurs, an entry
    "UNKNOWN/XXXX" is added to unsupportedData().
    The removeUnsupportedProperties() method in turn
    removes all unknown frames with id "XXXX" if it
    encounters a string "UNKNOWN/XXXX" in the given list.
    
    - Implemented findByDescription() to UnsynchronizedLyricsFrame
    in order to support removal of lyrics frames with unsupported
    keys.
    
    - Adapted id3v2 test case to new QuodLibet policy.
Something went wrong with that request. Please try again.