Mp3 tags and info #114

Closed
PAEz opened this issue Dec 15, 2014 · 18 comments

@PAEz

PAEz commented Dec 15, 2014

First up, that code I put up for reading header info failed on a couple of files, but I'm pretty sure I know why and it will be fixed in the next couple of days. Plus I'll be extending it and making it do more, like error checking (sooooooo boring). I wrote that just by looking at the specs and not other people's code (except the lookup tables, I'm lazy) so as to avoid having to use their licences... plus it was a real fun challenge. But now I'll go and look around and see what else needs adding to make it more complete.

At some point you're going to need this stuff, and I was wondering if you still think...
https://github.com/aadsm/JavaScript-ID3-Reader
...is what you want to use?
I've been looking at it and it's real good, but I can't test on other devices as I don't own any, so I can't comment on that.
The only real problem I saw for your needs was that it uses a string to hold its binary data, which means you'd have to first convert your mp3 to a string, which is dumb considering you already have it as an ArrayBuffer. So I've patched it to be able to use an ArrayBuffer (I'll send them a pull request later), so now it doesn't have to copy the file once again, plus it'll be quicker. On speed, I've noticed a couple of things in that code that could be sped up and I'll prolly do them at some time. And if it's going to use an ArrayBuffer I could add even more optimisations if they want.
If this is the one you want to use then I'll get my info stuff more robust and then I'll see if/how they want to add it in.
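
To illustrate the point about working on the ArrayBuffer directly instead of converting the file to a string, here is a minimal sketch that reads the 10-byte ID3v2 tag header through a typed-array view. `readId3v2Header` is a hypothetical helper written for this example; it is not the patch mentioned above and not part of JavaScript-ID3-Reader. The header layout ("ID3", two version bytes, a flags byte, a 4-byte synchsafe size) follows the ID3v2.3/2.4 specs.

```javascript
// Hypothetical helper: parse the ID3v2 tag header without copying the file.
function readId3v2Header(arrayBuffer) {
  if (arrayBuffer.byteLength < 10) return null;
  // A typed-array view shares memory with the buffer, so nothing is copied.
  const bytes = new Uint8Array(arrayBuffer, 0, 10);

  // The tag starts with the ASCII identifier "ID3".
  if (bytes[0] !== 0x49 || bytes[1] !== 0x44 || bytes[2] !== 0x33) return null;

  // The size is a "synchsafe" integer: four bytes carrying 7 bits each.
  const size =
    ((bytes[6] & 0x7f) << 21) |
    ((bytes[7] & 0x7f) << 14) |
    ((bytes[8] & 0x7f) << 7) |
    (bytes[9] & 0x7f);

  return {
    version: "2." + bytes[3] + "." + bytes[4],
    flags: bytes[5],
    // Bit 6 of the flags byte marks an extended header in v2.3 and v2.4.
    hasExtendedHeader: (bytes[5] & 0x40) !== 0,
    size: size, // size of the tag body, excluding this 10-byte header
  };
}
```

With the tag size known, the first audio frame can be looked for at offset `10 + size` without ever building a string copy of the file.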

@Akamaozu

PAEz is a godsend on this project. Somebody give this (wo?)man a cookie!

@PAEz
Author

PAEz commented Dec 15, 2014

hahaha...me like cookies!

Just be glad it's a great project with an awesome lead or I would have left ages ago. :P
And I have no doubt that captbaritone would have overcome any issue himself given time. I often wonder, and hope I'm not going too far or spoiling his fun/learning... if I ever do, please tell me and I'll back off, it's his baby.

@Akamaozu

I think he's an awesome lead too. Three cheers for great leadership. Humble, focused and driven.

Cookies for everyone!
(except me, who is yet to make a single commit)

@captbaritone
Owner

@PAEz You are not spoiling anything. I'm actually learning more, because your research and experience are bringing things to my attention that I hadn't known about before. Thank you.

@Akamaozu Get to work on that first PR ;) (and thanks for the 🍪)

@PAEz I didn't get a chance to dig into your header reading code. I noticed that it was giving different numbers than OS X in a few cases, and was waiting until I could get a third opinion (hopefully Winamp) to be the tie breaker. Let me know when you have an update and I'll look again.

Picking an id3 reader

This is definitely a task for a library; the question is which one. Browsers can play audio from many different types of media. I think of Winamp as an mp3 player, but it really plays all kinds of media. Ideally we would use a library that supports reading metadata from different types of media files and gives us a common interface.

So far https://github.com/leetreveil/musicmetadata looks like the library we want. It's under active development and seems to offer the functionality we need. Any objections?

...as I write this, maybe I have one :) How did Winamp handle metadata from other types of media? I know it had mp3 metadata editing built in, but what did it do with other kinds of media? Was that left up to a media-specific plugin? Maybe somebody wants to look into that before we choose.

Keep in mind, we will probably need metadata to always be an optional feature, since we won't be able to support reading metadata for all possible types of media.

captbaritone added this to the v2.0 milestone on Dec 16, 2014
@PAEz
Author

PAEz commented Dec 16, 2014

That thing looks really good, but it's huge! It's 500k+ unminified.
Uglified it was something like 300k.
It's meant for node and then has replacements for node stuff for the browser, with an emphasis on stream style coz that makes great sense for node.
Then there's the string encoding stuff, it's huge! It has very large lookup tables. From a quick look, anything other than utf8 and utf16 only comes up in id3v2. Both utfs can be implemented with a simple function each. Then there's iso-something-other. I don't know jack about that stuff except that not a great deal of people use it. If you dumped that support or just added simple support like that other library then the size could be reduced a great deal.
I'm still thinking of improving that other library coz it's small, and I might add ogg and whatnot support to it. The only one that looked like real work was asf, and that's prolly the one I care about the least.
Don't get me wrong, that library is complete... damn nice work... but it's huge!
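
As a rough illustration of the "simple function for each" idea, the sketch below decodes an ID3v2 text frame body without any lookup tables. The function names are hypothetical and this is not code from either library discussed here. It assumes the encoding byte defined by the ID3v2 specs (0 = ISO-8859-1, 1 = UTF-16 with BOM, 2 = UTF-16BE, 3 = UTF-8) and does no validation of malformed input.

```javascript
// ISO-8859-1: every byte maps to the code point with the same value.
function decodeLatin1(bytes) {
  let out = "";
  for (let i = 0; i < bytes.length; i++) out += String.fromCharCode(bytes[i]);
  return out;
}

// UTF-16: pair bytes into 16-bit code units. JavaScript strings are already
// UTF-16, so surrogate pairs need no special handling.
function decodeUtf16(bytes, littleEndian) {
  let out = "";
  for (let i = 0; i + 1 < bytes.length; i += 2) {
    const unit = littleEndian
      ? bytes[i] | (bytes[i + 1] << 8)
      : (bytes[i] << 8) | bytes[i + 1];
    out += String.fromCharCode(unit);
  }
  return out;
}

// UTF-8: assemble 1 to 4 byte sequences. Lenient sketch, no validation.
function decodeUtf8(bytes) {
  let out = "";
  let i = 0;
  while (i < bytes.length) {
    const b = bytes[i++];
    let cp;
    if (b < 0x80) cp = b;
    else if (b < 0xe0) cp = ((b & 0x1f) << 6) | (bytes[i++] & 0x3f);
    else if (b < 0xf0)
      cp = ((b & 0x0f) << 12) | ((bytes[i++] & 0x3f) << 6) | (bytes[i++] & 0x3f);
    else
      cp = ((b & 0x07) << 18) | ((bytes[i++] & 0x3f) << 12) |
           ((bytes[i++] & 0x3f) << 6) | (bytes[i++] & 0x3f);
    // Out-of-range values become the Unicode replacement character.
    out += String.fromCodePoint(cp > 0x10ffff ? 0xfffd : cp);
  }
  return out;
}

// Decode a text frame body (a Uint8Array): one encoding byte, then the text.
function decodeId3Text(bytes) {
  const encoding = bytes[0];
  const body = bytes.subarray(1);
  let text;
  if (encoding === 0) {
    text = decodeLatin1(body);
  } else if (encoding === 1) {
    // The byte order mark says which way round the code units are stored.
    const le = body[0] === 0xff && body[1] === 0xfe;
    const be = body[0] === 0xfe && body[1] === 0xff;
    text = decodeUtf16(le || be ? body.subarray(2) : body, le);
  } else if (encoding === 2) {
    text = decodeUtf16(body, false);
  } else if (encoding === 3) {
    text = decodeUtf8(body);
  } else {
    throw new Error("Unknown ID3v2 text encoding: " + encoding);
  }
  return text.replace(/\0+$/, ""); // drop trailing null terminators
}
```

Note that ISO-8859-1 needs no table at all, since its byte values coincide with the first 256 Unicode code points. Character sets outside these four would still need real mapping tables.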

I just tried ogg and flac as 2 mates care about those, I only care about mp3.
Winamp recognised and played the ogg and showed tags in the info box; flac wasn't recognised at all.
Never heard of monkeysaudio, but tried it and Winamp doesn't do it.
Chrome only played mp3 and ogg.

And my header info reader would fail by getting a non-frame on some files, but otherwise the numbers should be right unless it's a vbr file, in which case the bitrate will prolly be wrong as it's the bitrate for that frame only. But I'll be getting that all right later. Going to be busy this week with it being near chrissy, but I'll still be playing when I can.
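
For readers unfamiliar with the frame header being discussed, here is a minimal sketch of reading one MPEG-1 Layer III frame header. It is not the header reader from this thread; `parseMpeg1Layer3Header` and the table names are made up for the example, and MPEG-2/2.5 and Layers I/II are deliberately left out. It shows both issues mentioned above: bytes that are not a frame are rejected, and the bitrate returned belongs to this one frame only, which is why a single header is misleading for a VBR file.

```javascript
// Bitrates in kbps for MPEG-1 Layer III, indexed by the 4-bit bitrate field.
// Index 0 means "free format" and index 15 is invalid; both are rejected below.
const MPEG1_LAYER3_KBPS = [0, 32, 40, 48, 56, 64, 80, 96, 112, 128, 160, 192, 224, 256, 320];
// Sample rates in Hz for MPEG-1, indexed by the 2-bit field (3 is reserved).
const MPEG1_SAMPLE_RATES = [44100, 48000, 32000];

function parseMpeg1Layer3Header(bytes, offset) {
  const b0 = bytes[offset];
  const b1 = bytes[offset + 1];
  const b2 = bytes[offset + 2];

  // Frame sync: the first 11 bits must all be set.
  if (b0 !== 0xff || (b1 & 0xe0) !== 0xe0) return null;

  const version = (b1 >> 3) & 0x03; // 3 = MPEG-1
  const layer = (b1 >> 1) & 0x03;   // 1 = Layer III
  if (version !== 3 || layer !== 1) return null; // outside this sketch's scope

  const bitrateIndex = b2 >> 4;
  const sampleRateIndex = (b2 >> 2) & 0x03;
  if (bitrateIndex === 0 || bitrateIndex === 15 || sampleRateIndex === 3) return null;

  const bitrate = MPEG1_LAYER3_KBPS[bitrateIndex] * 1000; // bits per second
  const sampleRate = MPEG1_SAMPLE_RATES[sampleRateIndex];
  const padding = (b2 >> 1) & 0x01;

  return {
    bitrate: bitrate, // the bitrate of THIS frame only
    sampleRate: sampleRate,
    padding: padding,
    // Length in bytes of the whole frame, header included.
    frameLength: Math.floor((144 * bitrate) / sampleRate) + padding,
  };
}
```

A more robust reader would also confirm that another valid header sits at `offset + frameLength` before trusting a match, since the sync pattern can occur by chance inside tag data.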

@captbaritone
Owner

Thanks for checking, that's pretty huge. I realized yesterday our entire payload including demo mp3 and skin is just 190KB after gzip and minification. That makes this library look pretty big :)

I'd rather not build this ourselves, since it seems like a complicated problem with lots of edge cases. I think it would be best to start with the library which just does id3 and then try to get ogg support merged into that project. I think if we can support mp3 that will cover 90% of use cases, and give us pretty good parity with Winamp. Ogg would be nice to get full parity, but it's not worth including that huge library.

This conversation led me to research and file #117.

@PAEz
Author

PAEz commented Dec 21, 2014

I played with that musicmetadata today (first day I've had a chance in a while) and was able to reduce it a bit...
Removing iconv and replacing it with the one used in the other library...
Original - 541,204
Replace iconv - 250,250
...this means it can't handle all the exotic character sets, but utf-woteva is handled fine. Handling ALL the exotic sets is just going to take a butt load of tables, there's not much you can do about that. iconv has more than we need I'm sure, but knowing which ones and how to rip the rest out might be a real pain.
Then I replaced all its file and stream handling stuff with something real simple (just fine for our case)...
Replace iconv - 250,250
Replace file/stream/stuffs - 156,070 - minified - 65,638 - gzipped - 22.22k (according to some site, I don't do server stuff... yet ;))
...so I got it down quite a lot and there could be more spots to do safe reductions, these were just the two most obvious spots.
The cool thing is I can do these changes without changing the original's source, just using browserify stuff. I changed a couple of things by hand, but I know browserify can do it, I just haven't got around to learning that yet (just saw it in the docs). So with this, even if he updates it, it shouldn't be any problem.
And it's all still working just fine.

I'd like to support all those exotic charsets though, maybe load them in after everything else has loaded and replace the initial decoder, that'd be doable.

@captbaritone
Owner

Currently we can only display characters that are in the TEXT.BMP file, so it might not be worth going out of our way to support non-ascii characters.

That being said, I think the text rendering in the playlist (which I'm currently looking into), may be different. PLEDIT.TXT contains color and font information.

@PAEz
Author

PAEz commented Dec 21, 2014

Yeah, the playlist uses Windows fonts if I remember right.

EDIT: Oooops. It's getting the duration just fine :\

@PAEz
Author

PAEz commented Dec 21, 2014

Oh, and it does do a little of that odder stuff now.
I added in this...
podviaznikov/JavaScript-ID3-Reader@ac78fc5
...for reading some iso encodings, and it reads the id3v2-duration-allframes.mp3 file in the samples directory of musicmetadata now.
Now I've got to learn how to make my changes cleanly using browserify and a grunt script or something. All I'm doing is replacing requires with simple versions.

EDIT:
Oh again.
That iso set contains English characters and not just foreign symbols.
So even if you're just showing the name in the marquee, you're still going to want that pull request's stuff or you just get odd characters, not even the English ones.

@PAEz
Author

PAEz commented Jan 16, 2015

Back again ;)
Got distracted and wanted a break... at least one distraction turned out to be cool, Nuclides/github-highlight-selected#4, nice when you get something you wanted at the end of it :P

Found another bunch of node-based libraries...
https://github.com/jquense/node-audio-info
...also look at...
https://github.com/KenanY/id3v2-in-browser
...and he has repos for the individual formats as well.

This one is a lot smaller than the other one and seems to handle mp3s fine (including extended headers, which a couple failed on, like the one you originally mentioned). This one is about 95k minified for just the id3v2 stuff. I could get that a little smaller by removing underscore, which he used twice for two things that don't need underscore, which would get about 20k off... but it's always going to be too big for my liking as it's all buffer, stream, node stuff that the browser doesn't need.
So I know you didn't like the idea of me reinventing the wheel again, but damn, I know I could do it with less spokes ;P...
http://plnkr.co/edit/9ahBhEpArMqrlWsMJ3hq?p=preview
...I rewrote it to be more browser-like. Minified it's 8.2k!!!! ...that's more like it :P
So I'm going to convert id3v1 and ogg tomorrow; now I've done this one it shouldn't be too bad. It also needs some stuff to handle tags and some extra header stuff (it's simple), but even with that and the other formats I don't see this thing getting up to even 15k. Bytes (my buffer alternative) is one of the bigger things and that's not getting much bigger now, it's got lots of nice stuff.
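
To give a sense of why the id3v1 part "shouldn't be too bad", here is a minimal sketch of an ID3v1 reader. It is not the code from the plnkr above and does not use the Bytes helper; `readId3v1` is a hypothetical name. It assumes the standard fixed layout: the last 128 bytes of the file, starting with "TAG", with fields read as ISO-8859-1.

```javascript
// Hypothetical helper: read an ID3v1 tag from the end of an ArrayBuffer.
function readId3v1(arrayBuffer) {
  if (arrayBuffer.byteLength < 128) return null;
  // The tag, when present, is always the final 128 bytes of the file.
  const tag = new Uint8Array(arrayBuffer, arrayBuffer.byteLength - 128, 128);

  // It must start with the ASCII marker "TAG".
  if (tag[0] !== 0x54 || tag[1] !== 0x41 || tag[2] !== 0x47) return null;

  // Fields are fixed width and padded with nulls or spaces.
  const field = (start, length) => {
    let out = "";
    for (let i = start; i < start + length && tag[i] !== 0; i++) {
      out += String.fromCharCode(tag[i]);
    }
    return out.trim();
  };

  const result = {
    title: field(3, 30),
    artist: field(33, 30),
    album: field(63, 30),
    year: field(93, 4),
    comment: field(97, 30),
    genre: tag[127], // index into the ID3v1 genre list
  };

  // ID3v1.1: a zero byte at 125 followed by a non-zero byte at 126 means
  // the last comment byte is actually the track number.
  if (tag[125] === 0 && tag[126] !== 0) result.track = tag[126];

  return result;
}
```

Because the layout is fixed, there are no frames or encodings to negotiate, which is why this format is so much smaller to support than id3v2.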

@captbaritone
Owner

Welcome back. 8K is pretty convincing, and at first glance the code looks like it makes sense.

Whether we use your code or an existing library, I think we should keep it as a separate project. Parsing id3 tags in the browser is a useful problem to solve. If we believe that the current solutions are not good enough (aka, too bloated), then we should make our (your) alternative available as an alternative for other projects.

@PAEz, would you be interested in developing this fully and maintaining it as a repository under your GitHub account? If not, how would you feel about me putting your code in a repository under my account?

I really like the idea of a much leaner library, but I have a few concerns:

  1. Including your code means I have to either depend on you to maintain it as we find bugs, or be willing to maintain it myself. I'm not 100% sure I'm qualified or interested enough to maintain such a library.
  2. Mp3s have probably been generated with all kinds of tools that generate poorly formed id3 tags. Adopting a new library (like yours) as opposed to a more "battle tested" one means we are going to notice those mp3s and either accommodate them or decide to ignore them. I'm not sure we are really in a position where we have a broad user base that will help by reporting issues when their mp3s behave poorly.
  3. Using an existing library might give us the advantage of future extensibility. What if we want to support some other format(s) in the future? (That being said, I think id3 versions 1 and 2 plus ogg/flac should be enough for the foreseeable future).
  4. As much as I've been interested in controlling the file size, because it presents a fun and interesting challenge, I have to remind myself that we are really prematurely optimizing here. We are currently at 180K, which is TINY compared to most sites today, AND given what we are doing, I think people would be just as pleased if it were 1 or 2 MB.

So, keeping it lean and without dependencies fits nicely with the aesthetic of this project, but it comes at a cost. Do you think it's worth it?

@PAEz
Author

PAEz commented Jan 21, 2015

I'd be cool with sticking up a repo for it, but I'm not totally cool on maintaining it as I'm too fickle for that. But then it wouldn't be tHaT much hassle to maintain, as it's just a matter of keeping this up to date with musicmetadata.
I just converted musicmetadata over to using Bytes (I love Bytes ;))...
http://plnkr.co/edit/PPOvkq?p=preview
...well, I've done most of the main files. I can't do one bit yet as I can't find a file to test it on; I need an mp3 encoded with vbr but not containing an extended/info/lame header... bummer. I converted musicmetadata coz it does more than the other one: it can get the duration and can scan all frames (which I might need soon). It was a little harder coz of its streaming tokenizer, but I faked that with something I'm calling Chunks. If you have a look at the original and this one you'll see that their code looks extremely similar, which is why I think mainlining wouldn't be that bad. This way you're basically getting a battle-hardened, tested library, but with the bloat gone... this was about 17.5k minified and I could get that down a bit more, but I'd need to change the code and wanted you to see how close it is to the original first.
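
The "scan all frames" case matters for exactly the kind of file described above: a vbr mp3 with no summary header has no stored frame count, so the duration has to be found by walking every frame. The sketch below shows the idea under simplifying assumptions and is not musicmetadata's implementation or the Chunks code. `readFrame` and `scanDuration` are hypothetical names, only MPEG-1 Layer III is handled, and every Layer III frame in MPEG-1 is taken to hold 1152 samples.

```javascript
// MPEG-1 Layer III lookup tables (kbps by bitrate index, Hz by rate index).
const L3_KBPS = [0, 32, 40, 48, 56, 64, 80, 96, 112, 128, 160, 192, 224, 256, 320];
const L3_RATES = [44100, 48000, 32000];

// Returns { bitrate, sampleRate, length } for a frame at pos, or null.
function readFrame(bytes, pos) {
  const b1 = bytes[pos + 1];
  const b2 = bytes[pos + 2];
  // 11 sync bits, version bits 11 (MPEG-1), layer bits 01 (Layer III).
  if (bytes[pos] !== 0xff || (b1 & 0xfe) !== 0xfa) return null;
  const kbps = L3_KBPS[b2 >> 4];
  const sampleRate = L3_RATES[(b2 >> 2) & 0x03];
  if (!kbps || !sampleRate) return null; // free format, invalid or reserved
  const padding = (b2 >> 1) & 0x01;
  const bitrate = kbps * 1000;
  return {
    bitrate: bitrate,
    sampleRate: sampleRate,
    length: Math.floor((144 * bitrate) / sampleRate) + padding,
  };
}

// Walk every frame from `start` and derive duration and average bitrate.
function scanDuration(bytes, start = 0) {
  let pos = start;
  let frames = 0;
  let bitrateSum = 0;
  let sampleRate = 0;
  while (pos + 4 <= bytes.length) {
    const frame = readFrame(bytes, pos);
    if (!frame) {
      pos++; // not a frame here: resync one byte at a time
      continue;
    }
    frames++;
    bitrateSum += frame.bitrate;
    sampleRate = frame.sampleRate;
    pos += frame.length; // jump straight to where the next header should be
  }
  if (frames === 0) return null;
  return {
    frames: frames,
    // Each MPEG-1 Layer III frame holds 1152 samples.
    seconds: (frames * 1152) / sampleRate,
    // Frames all last equally long, so a plain mean is the true average.
    averageBitrate: bitrateSum / frames,
  };
}
```

Passing the end of the ID3v2 tag as `start` avoids false sync matches inside tag data such as embedded cover art.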

Point 1
It's going to be so close to musicmetadata I don't think that'll be much of a hassle.
Point 2
Screw 'em. Anyone not sticking to the standards can just go away and die. This came up with the text encodings stuff. The specs say to support utf8, utf16 and iso-somethingorother only, but some stuff (Japanese) used other encodings... tough, should have used utf16. Dealing with that would have just been nuts, and why bother if it's not even in the specs? I'm not supporting that; use software that sticks to the specs.
Point 3
I'll prolly end up doing all of musicmetadata.
Point 4
Size counts and I'll be damned if I'll ever be one of those people that think it doesn't coz of broadband and whatnot... it sickens me how many simple pages weigh in at more than a meg, bleh. And one of Winamp 2.95's joys was its complete lack of bloat.

@Akamaozu

PAEz, you really are a godsend!

I've been looking for a way to read and edit id3 tags for a music project I'm working on. Searching on npm keeps yielding projects that rely on editing the id3 of a file already on disk using wrappers around C++ packages like ffmpeg or libav.

I already have the mp3's binary passing through my server and I would rather not have to save it to update it. You're working at a low-enough level that reading (and hopefully editing) them at that level shouldn't be a challenge, it seems.

I can't wait to check out what you've done here and see if it's possible to make a package on npm to do the same thing and save others from having to work with old, clunky tools.

Case in point: http://blog.kaiserapps.com/2014/01/nodejs-id3-tag-libraries-which-is-best.html (note the date)

@captbaritone
Owner

Alright! Sounds like we have a plan. Let's use your code.

Thanks @Akamaozu. Looks like we already have another customer for this stand-alone id3 library. :)

I think the only question left unresolved is where this library is going to live.

  1. You host it on your GitHub account. I send pull requests for things I need/want. If you lose interest, I fork it.
  2. I host it on my GitHub account. I can manage all the Git stuff. (I've noticed you prefer comments to pull requests). I would also do my best to manage any potential community contributions.

Let me know which way you want to go.

@captbaritone
Owner

It's also worth mentioning that musicmetadata is MIT licensed, so we can freely look at/steal any of their code which we find useful.

@abritinthebay

Whatever happened to this?

@captbaritone
Owner

Not sure. Any interest in looking into this? We would need to re-evaluate our options, since I'm sure things have evolved since 2015.

@parrotgeek1 proposed looking into https://github.com/audiocogs/mp3.js/. Obviously that's a bit heavy for us (we just need bitrate and id3, not actual decoding), but it might be a starting place. Maybe some of that stuff is, or could be, modularized.
