Add bitrate information to format output #59

tifroz · 2015-01-30T01:03:33Z

The PR adds a bitrate field to the format output when the data is available. An example output would look like this as a result:

{ itag: 34, filetype: 'flv', resolution: '360x640' }
{ itag: 18, filetype: 'mp4', resolution: '360x640', bitrate: '1127k' }
{ itag: 43, filetype: 'webm', resolution: '360x640' }
{ itag: 5, filetype: 'flv', resolution: '240x400' }
{ itag: 17, filetype: 'mp4', resolution: '144x176', bitrate: '113k' }

fent · 2015-01-30T17:36:48Z

lib/youtube-dl.js

@@ -234,7 +234,9 @@ ytdl.getInfo = function(url, args, options, callback) {


 var formatsRegex =
-  /^(\d+)\s+([a-z0-9]+)\s+(\d+x\d+|\d+p|audio only|\?x\d+|unknown)/;
+  /^(\d+)\s+([a-z0-9]+)\s+(\d+x\d+|\d+p|audio only|\?x\d+|unknown)(.*)$/;
+var extendedRegex =


this can be put into the first regex

or maybe it would be better to split the rest of the string by commas, and parse all of those fields not just the bitrate

tifroz · 2015-01-30T18:18:10Z

So, yes and yes on both comments, however here are the reasons I did it the way I did it:

Need to split because as you parse additional fields, a single regex would become difficult to read & debug.
I have not parsed any additional fields for now because I don't fully understand the columns structure from the raw output (in particular entry 278 seems to have extra columns with the data webm container, VP9). I could not find any doc that would explain to my satisfaction so I elected to make my regex conservative and only capture the part I understand & need.

...this said, if you are more knowledgeable about the ytdl output and can point me to the doc that describes the columns structure, I would be happy to expand on my initial effort and parse additional fields. The ideal way to proceed would be to split on , and then parse each column individually I think

Example raw output:

[ '[youtube] 0k2Zzkw_-0I: Downloading webpage',
  '[youtube] 0k2Zzkw_-0I: Extracting video information',
  '[youtube] 0k2Zzkw_-0I: Downloading DASH manifest',
  '[info] Available formats for 0k2Zzkw_-0I:',
  'format code extension resolution  note ',
  '140         m4a       audio only  DASH audio  128k , aac  @128k (44100Hz), 4.44M... (length: 90)',
  '171         webm      audio only  DASH audio  185k , audio@128k (44100Hz), 5.36M... (length: 82)',
  '141         m4a       audio only  DASH audio  255k , aac  @256k (44100Hz), 8.90M... (length: 82)',
  '278         webm      192x144     DASH video  117k , webm container, VP9, 15fps,... (length: 100)',
  '160         mp4       192x144     DASH video  120k , 15fps, video only, 3.90MiB',
  '242         webm      320x240     DASH video  207k , 30fps, video only, 5.96MiB',
  '133         mp4       320x240     DASH video  248k , 30fps, video only, 8.53MiB',
  '243         webm      480x360     DASH video  358k , 30fps, video only, 10.21MiB',
  '134         mp4       480x360     DASH video  605k , 30fps, video only, 13.12MiB',
  '244         webm      640x480     DASH video  792k , 30fps, video only, 20.51MiB',
  '135         mp4       640x480     DASH video 1105k , 30fps, video only, 26.55MiB',
  '17          3gp       176x144     ',
  '36          3gp       320x240     ',
  '5           flv       400x240     ',
  '43          webm      640x360     ',
  '18          mp4       640x360     (best)' ]

fent · 2015-01-31T01:47:18Z

Can't find documentation on the output for this. From the output above, it looks like sometimes it has extra info, sometimes it doesn't, and when it does the number of fields is not constant. But it looks like we would be able to get the bitrate, fps, and size of the format.

tifroz · 2015-01-31T03:36:21Z

Happy to add the format (e.g. DASH video, DASH audio), fps (videos only) in addition to the bitrate. Not sure what you are referring to with the size of the format though?

fent · 2015-01-31T04:15:37Z

Isn't the rightmost part the size? ie 26.55MiB for 135

tifroz · 2015-02-01T08:00:33Z

I coded the parsing of the extra fields & I ran into a bug when testing. Happy to fix it in this PR however it would change the existing data structure returned by getFormat() because the itag would need to be a string (currently an integer)
I believe this is the field used by youtube-dl to id a particular version of a video (value used with the --format option), so it's likely people are using this value.

To avoid any regression, I think the best would be to keep the itag unchanged for now (but deprecate it), and add a new field of type string (call it tag?). The field tag would always be present but itag would only be present when the value can be parsed into an integer (youtube).

I will let you weigh-in before I make the fix

To reproduce, just try getting the format from http://vimeo.com/91089952 for example. I am creating a new issue for tracking purposes

fent · 2015-02-01T08:15:44Z

How about naming the field code since the output from youtube-dl refers to it as format code?

tifroz · 2015-02-03T20:59:30Z

Updated to fix #60 as well.

Note that the regex is still split into multiple patterns, primarily because some attributes don't always show up, and the code is much more readable using if (exists) {//...} v. a long regex pattern

fent · 2015-02-04T08:40:56Z

lib/youtube-dl.js

          filetype   : result[2],
          resolution : result[3],
-        });
+        }
+        if (/^\d+$/.test(compiledResult.code)) {


this would only be set for youtube videos right?

The itag attributes will be set for any video with an integer code value. So the itag would be set for youtube and any of the other ~500+ web sites supported by youtube dl in theory that use integer code values (if any).

fent · 2015-02-11T03:06:24Z

tests still fail :(

fent · 2015-02-11T03:23:25Z

There's a new PR #62

It uses the --dump-json object which prints the data that both getInfo() and getFormats() would parse. Which would deprecate this PR :(. Sorry for not realizing to use that earlier.

tifroz · 2015-02-11T18:01:48Z

--dump-json is a much, much better option if it can be made to work.

I played with --dump-json, but couldn't get it to work initially - but I do remember thinking ah! this must be why @fent wrote the getFormats() API. I am glad I was wrong...much happier with the output & stability of the --dump-json option.

Let's close this PR after you accept #62

Not related but I am planning to work on a PR to allow submitting multiple urls at a time. Let me know if you have any thoughts on the topic?

fent · 2015-02-12T04:01:02Z

when I first made this, that option didn't exist. when it was first added, IIRC it didn't include a downloadable url from the formats.

fent reviewed Jan 30, 2015
View reviewed changes

tifroz mentioned this pull request Feb 1, 2015

getFormat() doesn't work with most (all?) non-youtube videos #60

Closed

fent reviewed Feb 4, 2015
View reviewed changes

Added bitrate information to format output

0e8786e

fent mentioned this pull request Feb 11, 2015

Use the '--dump-json' option #62

Merged

fent closed this Feb 12, 2015

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add bitrate information to format output #59

Add bitrate information to format output #59

tifroz commented Jan 30, 2015

fent Jan 30, 2015

fent Jan 30, 2015

tifroz commented Jan 30, 2015

fent commented Jan 31, 2015

tifroz commented Jan 31, 2015

fent commented Jan 31, 2015

tifroz commented Feb 1, 2015

fent commented Feb 1, 2015

tifroz commented Feb 3, 2015

fent Feb 4, 2015

tifroz Feb 4, 2015

fent commented Feb 11, 2015

fent commented Feb 11, 2015

tifroz commented Feb 11, 2015

fent commented Feb 12, 2015

Add bitrate information to format output #59

Add bitrate information to format output #59

Conversation

tifroz commented Jan 30, 2015

fent Jan 30, 2015

Choose a reason for hiding this comment

fent Jan 30, 2015

Choose a reason for hiding this comment

tifroz commented Jan 30, 2015

fent commented Jan 31, 2015

tifroz commented Jan 31, 2015

fent commented Jan 31, 2015

tifroz commented Feb 1, 2015

fent commented Feb 1, 2015

tifroz commented Feb 3, 2015

fent Feb 4, 2015

Choose a reason for hiding this comment

tifroz Feb 4, 2015

Choose a reason for hiding this comment

fent commented Feb 11, 2015

fent commented Feb 11, 2015

tifroz commented Feb 11, 2015

fent commented Feb 12, 2015