-F does not give FILESIZE of ID 22 #1400

Closed
5 of 6 tasks
barkoder opened this issue Oct 24, 2021 · 15 comments

@barkoder

Checklist

Region

No response

Example URLs

https://www.youtube.com/watch?v=BaW_jenozKc

Description

Requesting yt-dlp to also be able to extract file size of -f 22


[debug] Command-line config: ['-v', 'BaW_jenozKc', '-F']
[debug] Encodings: locale cp1252, fs utf-8, out cp1252 (No ANSI), err cp1252 (No ANSI), pref cp1252
[debug] yt-dlp version 2021.10.22 (win_exe)
[debug] Optional libraries: Cryptodome, mutagen, sqlite, websockets
[debug] Proxy map: {}
[debug] [youtube] Extracting URL: BaW_jenozKc
[youtube] BaW_jenozKc: Downloading webpage
[youtube] BaW_jenozKc: Downloading android player API JSON
[debug] Sort order given by extractor: quality, res, fps, hdr:12, source, codec:vp9.2, lang
[debug] Formats sorted by: hasvid, ie_pref, quality, res, fps, hdr:12(7), source, vcodec:vp9.2(10), acodec, lang, filesize, fs_approx, tbr, vbr, abr, asr, proto, vext, aext, hasaud, id
[info] Available formats for BaW_jenozKc:
ID  EXT  RESOLUTION FPS |  FILESIZE    TBR PROTO | VCODEC        VBR ACODEC     ABR  ASR    MORE INFO
--- ---- ---------- --- - ---------- ----- ----- - ----------- ----- --------- ---- ------- -----------------
139 m4a  audio only     |  58.59KiB    48k https |                   mp4a.40.5  48k 22050Hz low, m4a_dash
249 webm audio only     |  58.17KiB    48k https |                   opus       48k 48000Hz low, webm_dash
250 webm audio only     |  76.07KiB    63k https |                   opus       63k 48000Hz low, webm_dash
140 m4a  audio only     |  154.06KiB  127k https |                   mp4a.40.2 127k 44100Hz medium, m4a_dash
251 webm audio only     |  138.96KiB  115k https |                   opus      115k 48000Hz medium, webm_dash
17  3gp  176x144    12  |  55.79KiB    44k https | mp4v.20.3     44k mp4a.40.2   0k 22050Hz 144p
160 mp4  256x144    15  |  135.08KiB  112k https | avc1.4d400c  112k                        144p, mp4_dash
278 webm 256x144    30  |  52.22KiB    43k https | vp9           43k                        144p, webm_dash
133 mp4  426x240    30  |  294.27KiB  245k https | avc1.4d4015  245k                        240p, mp4_dash
242 webm 426x240    30  |  33.27KiB    27k https | vp9           27k                        240p, webm_dash
134 mp4  640x360    30  |  349.59KiB  292k https | avc1.4d401e  292k                        360p, mp4_dash
18  mp4  640x360    30  |  354.29KiB  293k https | avc1.42001E  293k mp4a.40.2   0k 44100Hz 360p
243 webm 640x360    30  |  75.55KiB    62k https | vp9           62k                        360p, webm_dash
135 mp4  854x480    30  |  849.41KiB  710k https | avc1.4d401f  710k                        480p, mp4_dash
244 webm 854x480    30  |  165.49KiB  137k https | vp9          137k                        480p, webm_dash
136 mp4  1280x720   30  |  1.60MiB   1365k https | avc1.4d401f 1365k                        720p, mp4_dash
22  mp4  1280x720   30  |            1493k https | avc1.64001F 1493k mp4a.40.2   0k 44100Hz 720p
247 webm 1280x720   30  |  504.68KiB  420k https | vp9          420k                        720p, webm_dash
137 mp4  1920x1080  30  |  2.11MiB   1803k https | avc1.640028 1803k                        1080p, mp4_dash
248 webm 1920x1080  30  |  965.31KiB  804k https | vp9          804k                        1080p, webm_dash
barkoder added the site-enhancement (Feature request for some website) and triage (Untriaged issue) labels Oct 24, 2021
@pukkandan
Member

YouTube does not provide the filesize for these formats. The only way to get it is to try fetching the file and read the size from either the HTTP headers or ffprobe, which #613 already requests.
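
For illustration only, here is a minimal sketch of the "HTTP headers" route using the Python standard library. The function names `content_length` and `probe_size` are hypothetical and are not part of yt-dlp; the sketch assumes the server answers a HEAD request with a `Content-Length` header, which is not guaranteed.

```python
from typing import Mapping, Optional
import urllib.request


def content_length(headers: Mapping[str, str]) -> Optional[int]:
    """Return the Content-Length header as an int, or None if absent or malformed."""
    for name, value in headers.items():
        # Header names are case-insensitive.
        if name.lower() == "content-length":
            try:
                return int(value.strip())
            except ValueError:
                return None
    return None


def probe_size(media_url: str, timeout: float = 10.0) -> Optional[int]:
    """Send a HEAD request and read the size from the response headers.

    Hypothetical helper: it needs network access and a direct media URL,
    so it is defined here but not called.
    """
    request = urllib.request.Request(media_url, method="HEAD")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return content_length(dict(response.headers.items()))


# Offline usage example with a hand-written header mapping.
print(content_length({"Content-Length": "2363820486"}))
```

The cost of this approach is one extra request per format, which is presumably why it is not done by default when listing formats.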

@barkoder
Author

I see what you mean @pukkandan.

From the view-source:, I can see that contentLength(the filesize) is missing for "itag":22

However "bitrate":1493329 is still available and "approxDurationMs":"9891" is also available.

We can derive a very close approximation of the contentLength(the filesize) for itag=22 from this.

Bitrate = Total Size of -f 22 / Duration of -f 22

Or,

Total Size = Bitrate * Duration

So plugging in some values,

Total Size = ("bitrate":1493329 in bits per second) * ("approxDurationMs":"9891" in milliseconds)

Total Size = (1493329 bits per second) * (9891 milliseconds)

Total Size = (182.2911376953125 KiB per second) * (9.891 seconds)

Therefore,

Total Size = ~1803.041642944336 KiB or ~1.76 MiB, which is very close to the actual size of 1.74MB.
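
The arithmetic above can be sketched in a few lines of Python. The function name `approx_filesize_bytes` is made up for this example; the inputs are the `bitrate` and `approxDurationMs` values quoted in the comment, and the result is only an estimate because the bitrate is an average.

```python
def approx_filesize_bytes(bitrate_bps: int, duration_ms: int) -> float:
    """Estimate size in bytes from an average bitrate (bits/s) and a duration (ms)."""
    # bits/s -> bytes/s, then multiply by the duration in seconds.
    return bitrate_bps / 8 * duration_ms / 1000


size = approx_filesize_bytes(1493329, 9891)
print(f"{size:.0f} bytes")          # 1846315 bytes
print(f"{size / 1024:.2f} KiB")     # 1803.04 KiB
print(f"{size / 1024**2:.2f} MiB")  # 1.76 MiB
```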

Please don't close this?
I'd rather have an approximate value of -f 22 than nothing.
You could even add ~ to show that the -f 22 size value is an approximation.

-f 22 is the only format other than 360p (-f 18) that has both video and audio muxed in.
It's the one I most frequently use.

Thanks.

pukkandan added the enhancement (New feature or request) label and removed the site-enhancement (Feature request for some website) and triage (Untriaged issue) labels Oct 24, 2021
pukkandan reopened this Oct 24, 2021
pukkandan self-assigned this Oct 24, 2021
@pukkandan
Member

We can derive a very close approximation of the contentLength(the filesize) for itag=22 from this.

This is a good idea. I should be able to do this in general, and not just for YouTube.

gaming-hacker added a commit to gaming-hacker/yt-dlp that referenced this issue Oct 28, 2021
* master: (43 commits)
  [MLSScoccer] Add extractor (yt-dlp#1452)
  [itv] Add support for ITV News (yt-dlp#1456)
  [viewlift] Fix typo in 5be76d1
  [utils] Add `jwt_decode_hs256` Code from yt-dlp#1340 Authored by: Ashish0804
  [viewlift] Add cookie-based login and series support Closes yt-dlp#1340, yt-dlp#1316 Authored by: Ashish0804, pukkandan
  [sky] Add `SkyNewsStoryIE` (yt-dlp#1443)
  [wakanim] Detect geo-restriction (yt-dlp#1429)
  [wakanim] Add support for MPD manifests (yt-dlp#1428)
  [compat] Don't create console in `windows_enable_vt_mode` Closes yt-dlp#1420
  [3speak] Add extractors (yt-dlp#1430)
  [twitter] Do not sort by codec Closes yt-dlp#1431
  [extractor] Fix some errors being converted to `ExtractorError`
  [utils] Create `DownloadCancelled` exception as super-class of ExistingVideoReached, RejectedVideoReached, MaxDownloadsReached
  [downloader/ffmpeg] Fix vtt download with ffmpeg
  [outtmpl] Add type `link` for internet shortcut files and refactor related code Closes yt-dlp#1405
  [utils] Sanitize URL when determining protocol Closes yt-dlp#1406
  [DiscoveryPlus] Allow language codes in URL Closes yt-dlp#1425
  [Bilibili:comments] Fix infinite loop (yt-dlp#1423)
  [instagram] Fix bug in ab2ffab (yt-dlp#1403)
  Approximate filesize from bitrate Closes yt-dlp#1400
  ...

# Conflicts:
#	Makefile
#	supportedsites.md
@barkoder
Author

@pukkandan is it possible to make this more precise? I feel like there's too much rounding off happening in the code.

yt-dlp's f22 estimated file size is particularly inaccurate for larger files.

I do the f22 size math for this video on paper,

"bitrate":1080610, "approxDurationMs":"17499974",

[135076.25 bytes/second] * [17499.974 seconds] = 2,363,830,863.0175 bytes OR 2.201489045300987 GiB

and I get Paper Math size = 2.201489045300987 GiB

Here's the actual size

curl -s -I "$(yt-dlp -q https://www.youtube.com/watch?v=B9bn1ooLZJk -f 22 -g)" | grep 'Content-Length'

Content-Length: 2363820486

which is 2254.31488609314 MiB OR 2.201479380950332 GiB, which matches the Paper Math size calculated above to 4 decimal places.

But,

$ yt-dlp -F --youtube-skip-hls-manifest --compat-options list-formats --compat-options format-sort -- B9bn1ooLZJk | grep -E '^22'
22           mp4        1280x720    720p, 1080k, avc1.64001F, 30fps, mp4a.40.2 (44100Hz), ~2.25GiB

which is a full 50 MiB away from Math size. yt-dlp should say ~2.20GiB
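
As a sanity check on these numbers, the sketch below recomputes the paper estimate and compares it with the `Content-Length`. It also tries one guess at where ~2.25GiB could come from: treating the "k" in the bitrate as 1024 instead of 1000. That guess is an assumption made for this example, not something confirmed at this point in the thread; the helper name `estimate_bytes` is hypothetical.

```python
GIB = 1024**3

bitrate_bps = 1080610      # "bitrate" quoted above
duration_s = 17499.974     # "approxDurationMs" / 1000
actual_bytes = 2363820486  # Content-Length reported by curl


def estimate_bytes(kbit_per_s: float, seconds: float, kilo: int) -> float:
    """Estimate size in bytes, interpreting the bitrate's 'k' as `kilo` bits."""
    return kbit_per_s * kilo / 8 * seconds


tbr = bitrate_bps / 1000  # bitrate in units of 1000 bit/s
base10 = estimate_bytes(tbr, duration_s, kilo=1000)
base2 = estimate_bytes(tbr, duration_s, kilo=1024)  # assumed source of the error

print(f"actual          {actual_bytes / GIB:.2f} GiB")  # 2.20 GiB
print(f"k = 1000        {base10 / GIB:.2f} GiB")        # 2.20 GiB
print(f"k = 1024 guess  {base2 / GIB:.2f} GiB")         # 2.25 GiB
```

The 1024 interpretation inflates the estimate by 2.4%, which is small for short clips but amounts to roughly 50 MiB on a file of this size.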

I like to embed the filesize (up to 2 decimal places) in the filename, but the filesize_approx for -f 22 is far too inaccurate at the moment, and it is possible to make it significantly more accurate.

Please make this more accurate, if possible.

Thanks @pukkandan

@pukkandan

This comment was marked as outdated.

@barkoder
Author

barkoder commented Mar 15, 2024

@pukkandan I just Updated yt-dlp to nightly@2024.03.14.232657 from yt-dlp/yt-dlp-nightly-builds

I'm still getting the wrong size for -f 22.

Was 2024.03.14.232657 built with the fix for 140a13f ?

@bashonly
Member

@barkoder 140a13f is the commit where the bug was introduced, not fixed. The above patch to fix the bug has not been merged yet

@barkoder
Author

Thanks @bashonly . Comment edited.

@barkoder

This comment was marked as spam.

@Rikk

Rikk commented Mar 18, 2024

Regarding 22e4dfa
When we talk about bit rate, metric (base-10) notation has historically been used instead of binary (base-2). This is different from file sizes, where binary is the most traditional. More info: https://en.wikipedia.org/wiki/Bit_rate

While this commit fixes the conversion of file sizes (in MiB [base-2]), the bit rates (now in base-2) don't match the values displayed in players or MediaInfo. It also increases the discrepancy in ABR values that were already not matching before; for example, YT's audio-only format 140 is 128 kb/s, and YT-DL has traditionally displayed 129k for it, but with this commit it became 126k.

Imo, fixing the calculation of file sizes in base-2 should not involve changing the display of all bit rates to base-2.
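
A quick check shows that the 129k and 126k figures are at least consistent with a base-10 to base-2 rescaling. Whether this is what the commit actually does is an assumption here; `rescale_kilo` is a hypothetical helper written for this example.

```python
def rescale_kilo(value_k: float, from_kilo: int, to_kilo: int) -> float:
    """Re-express a 'k' value when the meaning of k changes (e.g. 1000 -> 1024)."""
    return value_k * from_kilo / to_kilo


# 129 x 1000 bit/s expressed in units of 1024 bit/s
print(round(rescale_kilo(129, 1000, 1024)))  # 126
# The nominal 128 kb/s would be shown as 125 under the same rescaling
print(round(rescale_kilo(128, 1000, 1024)))  # 125
```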

@pukkandan
Member

pukkandan commented Mar 19, 2024

All our bitrates have always been displayed in base 2 afaik for all sites, (except for this bug). Do you have examples where it is not? Wrong

@Rikk

Rikk commented Mar 19, 2024

@pukkandan
When I posted, I compared the bit rates displayed by both 2024.03.10 and 2024.03.18.170739 for the same video (--list-formats), and all of them are different.
Tbh, I don't know if they were base-2 or base-10 before, but they are different now (and don't match MediaInfo). Per the commit, I suppose it went base-10 -> base-2.

@barkoder
Author

Thanks for the update.

However, there are still inaccuracies in the size estimates of some videos.

VIDEOID      -f 22 Actual Size (curl -I)  -f 22 Paper Math Size  -f 22 yt-dlp size
-----------  ---------------------------  ---------------------  -----------------
EpLxHWg_-hk  164.8265342712402 MiB        164.8297197656631 MiB  164.85MiB
LMDO3tt8enM  50.71321105957031 MiB        50.71664184522629 MiB  50.69MiB
oonE_eLmKZ8  48.34338283538818 MiB        48.34841357386112 MiB  48.36MiB
BaW_jenozKc  214.5966796875 KiB           216.462216796875 KiB   218.85KiB

Admittedly in the last one even the Paper Math Size estimate doesn't quite match the actual size, but the other three could be more accurate.
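
To put the gaps in perspective, the following sketch computes the relative error of each estimate against the actual size, using only the figures quoted above. The variable and function names are chosen for this example.

```python
# (actual, paper math, yt-dlp) sizes as quoted above.
# The first three rows are in MiB and the last in KiB; units cancel in the ratio.
samples = {
    "EpLxHWg_-hk": (164.8265342712402, 164.8297197656631, 164.85),
    "LMDO3tt8enM": (50.71321105957031, 50.71664184522629, 50.69),
    "oonE_eLmKZ8": (48.34338283538818, 48.34841357386112, 48.36),
    "BaW_jenozKc": (214.5966796875, 216.462216796875, 218.85),
}


def relative_error(estimate: float, actual: float) -> float:
    """Absolute difference as a fraction of the actual size."""
    return abs(estimate - actual) / actual


errors = {
    video_id: (relative_error(paper, actual), relative_error(shown, actual))
    for video_id, (actual, paper, shown) in samples.items()
}

for video_id, (paper_err, shown_err) in errors.items():
    print(f"{video_id}: paper {paper_err:.4%}, yt-dlp {shown_err:.4%}")
```

Note that yt-dlp's displayed value is rounded to two decimals, so part of its error for the larger files is just display rounding.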

Thanks.

@Rikk

Rikk commented Mar 22, 2024

@pukkandan Here is a comparison to illustrate what I compared. In this example, I'm comparing yt-dlp-nightly build 2024.03.17.232657 (before the commit) with 2024.03.18.232707 (after the commit). I then downloaded formats 137+140 so we can compare the bit rates with the values MediaInfo provides; the versions before the commit are much closer.

[Image: yt-dlp-nightly-2024.03.18.232707__bitrates]

MediaInfo:

Video
ID                                       : 1
Format                                   : AVC
Format/Info                              : Advanced Video Codec
Format profile                           : High@L4
Format settings                          : CABAC / 3 Ref Frames
Format settings, CABAC                   : Yes
Format settings, Reference frames        : 3 frames
Codec ID                                 : avc1
Codec ID/Info                            : Advanced Video Coding
Duration                                 : 3 min 56 s
* Bit rate                                 : 2 347 kb/s
Width                                    : 1 920 pixels
Height                                   : 1 080 pixels
Display aspect ratio                     : 16:9
Frame rate mode                          : Constant
Frame rate                               : 30.000 FPS
Color space                              : YUV
Chroma subsampling                       : 4:2:0
Bit depth                                : 8 bits
Scan type                                : Progressive
Bits/(Pixel*Frame)                       : 0.038
Stream size                              : 66.0 MiB (95%)
Title                                    : ISO Media file produced by Google Inc.
Color range                              : Limited
Color primaries                          : BT.709
Transfer characteristics                 : BT.709
Matrix coefficients                      : BT.709
Menus                                    : 3
Codec configuration box                  : avcC

Audio
ID                                       : 2
Format                                   : AAC LC
Format/Info                              : Advanced Audio Codec Low Complexity
Codec ID                                 : mp4a-40-2
Duration                                 : 3 min 56 s
Bit rate mode                            : Constant
* Bit rate                                 : 128 kb/s
Channel(s)                               : 2 channels
Channel layout                           : L R
Sampling rate                            : 44.1 kHz
Frame rate                               : 43.066 FPS (1024 SPF)
Compression mode                         : Lossy
Stream size                              : 3.60 MiB (5%)
Title                                    : ISO Media file produced by Google Inc.
Language                                 : English
Default                                  : Yes
Alternate group                          : 1
Menus                                    : 3

pukkandan added a commit to pukkandan/yt-dlp-dev that referenced this issue Mar 31, 2024
YouTube provides slightly different duration for each format.
Calculating file-size based on this duration instead of the
global video duration is shown to give more accurate results.

Ref: yt-dlp#1400 (comment)
pukkandan added a commit that referenced this issue Mar 31, 2024
YouTube provides slightly different duration for each format.
Calculating file-size based on this duration instead of the
video duration gives more accurate results.

Ref: #1400 (comment)
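
The idea in this commit message can be sketched as follows. This is not yt-dlp's actual code: the function `approx_filesize` and its parameters are hypothetical, and the 10-second video duration in the usage example is an assumed rounded value used only to show the effect. The bitrate and per-format duration are the `bitrate` and `approxDurationMs` figures quoted earlier in the thread.

```python
from typing import Optional


def approx_filesize(
    tbr_kbps: float,
    format_duration_ms: Optional[int],
    video_duration_s: Optional[float],
) -> Optional[int]:
    """Estimate size in bytes, preferring the per-format duration when known.

    tbr_kbps is the total bitrate in units of 1000 bit/s.
    """
    if format_duration_ms is not None:
        seconds = format_duration_ms / 1000
    elif video_duration_s is not None:
        seconds = video_duration_s  # fall back to the video-level duration
    else:
        return None  # nothing to estimate from
    return int(tbr_kbps * 1000 / 8 * seconds)


# Per-format duration (9891 ms) versus an assumed rounded video duration (10 s).
print(approx_filesize(1493.329, 9891, 10))  # 1846314
print(approx_filesize(1493.329, None, 10))  # 1866661
```

For a clip this short, the rounded duration alone shifts the estimate by about 1%.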
@pukkandan
Member

Both issues fixed in a25a424 86e3b82

aalsuwaidi pushed a commit to aalsuwaidi/yt-dlp that referenced this issue Apr 21, 2024
YouTube provides slightly different duration for each format.
Calculating file-size based on this duration instead of the
video duration gives more accurate results.

Ref: yt-dlp#1400 (comment)