Some formats not filterable because of characters not specified in str_operator_rex regex #6858

Closed
aktau opened this issue Sep 14, 2015 · 2 comments
Labels
bug

@aktau aktau commented Sep 14, 2015

Example format:

youtube-dl --list-formats "https://www.youtube.com/watch?v=bxnd_gWhWB0"                       
[youtube] bxnd_gWhWB0: Downloading webpage
[youtube] bxnd_gWhWB0: Downloading video info webpage
[youtube] bxnd_gWhWB0: Extracting video information
[youtube] bxnd_gWhWB0: Downloading DASH manifest
[youtube] bxnd_gWhWB0: Downloading DASH manifest
[info] Available formats for bxnd_gWhWB0:
format code  extension  resolution note
249          webm       audio only DASH audio   52k , opus @ 50k, 8.58MiB
250          webm       audio only DASH audio   63k , opus @ 70k, 10.10MiB
171          webm       audio only DASH audio   84k , vorbis@128k (44100Hz), 13.92MiB
140          m4a        audio only DASH audio  130k , m4a_dash container, aac  @128k (44100Hz), 22.42MiB
251          webm       audio only DASH audio  133k , opus @160k, 21.92MiB
141          m4a        audio only DASH audio  256k , m4a_dash container, aac  @256k (44100Hz), 44.53MiB
278          webm       256x144    DASH video   89k , webm container, vp9, 15fps, video only, 12.55MiB
160          mp4        256x144    DASH video  115k , avc1.4d400c, 15fps, video only, 19.05MiB
242          webm       426x240    DASH video  158k , vp9, 30fps, video only, 16.48MiB
133          mp4        426x240    DASH video  250k , avc1.4d4015, 30fps, video only, 42.59MiB
243          webm       640x360    DASH video  265k , vp9, 30fps, video only, 31.04MiB
134          mp4        640x360    DASH video  365k , avc1.4d401e, 30fps, video only, 45.30MiB
244          webm       854x480    DASH video  579k , vp9, 30fps, video only, 60.18MiB
135          mp4        854x480    DASH video  896k , avc1.4d401f, 30fps, video only, 99.03MiB
13           3gp        unknown    small 
17           3gp        176x144    small ,  mp4a.40.2, mp4v.20.3
36           3gp        320x240    small ,  mp4a.40.2, mp4v.20.3
5            flv        400x240    small 
43           webm       640x360    medium ,  vorbis, vp8.0
18           mp4        640x360    medium ,  mp4a.40.2, avc1.42001E (best)

Say I want to prefer the best possible avc format. I'm not sure how I could do that. I tried:

$ youtube-dl -f "bestvideo[vcodec=avc1.4d401f]+bestaudio" "https://www.youtube.com/watch?v=bxnd_gWhWB0"

But I couldn't, because the dot (.) is not part of the character class the regex accepts for the value:

            str_operator_rex = re.compile(r'''(?x)
                \s*(?P<key>ext|acodec|vcodec|container|protocol)
                \s*(?P<op>%s)(?P<none_inclusive>\s*\?)?
                \s*(?P<value>[a-zA-Z0-9_-]+)
                \s*$

For some reason, this only seems to allow alphanumerics, dashes and underscores.
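
To make the rejection concrete, here is a minimal sketch that rebuilds the pattern quoted above and tries it on the filter from the failing command. The operator alternation substituted for `%s` is an assumption (the excerpt does not show it), and `build_filter_rex`, `strict_rex` and `relaxed_rex` are hypothetical names used only for this illustration, not youtube-dl code.

```python
import re

# Assumption: the excerpt elides what fills the %s placeholder; a plain
# alternation of "!=" and "=" is used here for illustration only.
OPERATORS = r'!=|='


def build_filter_rex(value_chars):
    """Compile the filter regex with the given character class for the value."""
    return re.compile(r'''(?x)
        \s*(?P<key>ext|acodec|vcodec|container|protocol)
        \s*(?P<op>%s)(?P<none_inclusive>\s*\?)?
        \s*(?P<value>[%s]+)
        \s*$
        ''' % (OPERATORS, value_chars))


strict_rex = build_filter_rex(r'a-zA-Z0-9_-')    # character class quoted in the issue
relaxed_rex = build_filter_rex(r'a-zA-Z0-9._-')  # hypothetical variant with "." added

spec = 'vcodec=avc1.4d401f'
print(strict_rex.match(spec))                   # None: "." is not accepted
print(relaxed_rex.match(spec).group('value'))   # avc1.4d401f
```

Adding the dot is only the smallest change that accepts this particular value; it says nothing about which characters other codec strings might need.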

What I'm actually looking for is some sort of wildcard:

$ youtube-dl -f "bestvideo[vcodec=avc1.*]+bestaudio" "https://www.youtube.com/watch?v=bxnd_gWhWB0"

Or, better yet, some sort of normalization, since as far as I understand it, avc1 is h264. Can anybody tell me why normalization would be undesirable?

Contributor

@dyn888 dyn888 commented Jan 28, 2016

@aktau I updated the regex pattern to match more codecs. (#8346)
About normalization: I recently added vcodec, acodec, and even bitrate info, where applicable, for all itags (more info: #8130). With this patch, 'avc1.xxxxxx' would become 'h264'. However, the new default behavior is to use the itag data only when YouTube does not provide that information itself; the old behavior was to overwrite it. So despite this patch, the vcodec for the mp4 container is still avc1.xxxxxx. In your case:

249   webm   audio only DASH audio   52k , opus @ 50k, 8.58MiB
250   webm   audio only DASH audio   63k , opus @ 70k, 10.10MiB
171   webm   audio only DASH audio   84k , vorbis@128k (44100Hz), 13.92MiB
140   m4a    audio only DASH audio  130k , m4a_dash container, mp4a.40.2@128k (44100Hz), 22.42MiB
251   webm   audio only DASH audio  133k , opus @160k, 21.92MiB
141   m4a    audio only DASH audio  256k , m4a_dash container, mp4a.40.2@256k (44100Hz), 44.53MiB
278   webm   256x144    DASH video   89k , webm container, vp9, 15fps, video only, 12.55MiB
160   mp4    256x144    DASH video  115k , avc1.4d400c, 15fps, video only, 19.05MiB
242   webm   426x240    DASH video  158k , vp9, 30fps, video only, 16.48MiB
133   mp4    426x240    DASH video  250k , avc1.4d4015, 30fps, video only, 42.59MiB
243   webm   640x360    DASH video  265k , vp9, 30fps, video only, 31.04MiB
134   mp4    640x360    DASH video  365k , avc1.4d401e, 30fps, video only, 45.30MiB
244   webm   854x480    DASH video  579k , vp9, 30fps, video only, 60.18MiB
135   mp4    854x480    DASH video  896k , avc1.4d401f, 30fps, video only, 99.03MiB
17    3gp    176x144    small , mp4v.20.3,  mp4a.40.2@ 24k
36    3gp    320x240    small , mp4v.20.3,  mp4a.40.2@ 32k
5     flv    400x240    small , h263, mp3  @ 64k
43    webm   640x360    medium , vp8.0,  vorbis@128k
18    mp4    640x360    medium , avc1.42001E,  mp4a.40.2@ 96k (best)
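
The difference between the two behaviors can be sketched as two ways of merging dictionaries. This is not the actual youtube-dl code: `ITAG_DEFAULTS`, `merge_fallback` and `merge_overwrite` are hypothetical names, and the table entries are illustrative values modeled on the listing above.

```python
# Hypothetical static table keyed by itag; entries are illustrative only.
ITAG_DEFAULTS = {
    '135': {'ext': 'mp4', 'vcodec': 'h264'},
    '5': {'ext': 'flv', 'vcodec': 'h263', 'acodec': 'mp3', 'abr': 64},
}


def merge_fallback(itag, extracted):
    """New behavior: static itag data only fills fields the site left out."""
    merged = dict(ITAG_DEFAULTS.get(itag, {}))
    merged.update({k: v for k, v in extracted.items() if v is not None})
    return merged


def merge_overwrite(itag, extracted):
    """Old behavior: static itag data replaces whatever was extracted."""
    merged = dict(extracted)
    merged.update(ITAG_DEFAULTS.get(itag, {}))
    return merged


# Format 135 arrives with a codec string; format 5 arrives without one.
print(merge_fallback('135', {'vcodec': 'avc1.4d401f', 'fps': 30})['vcodec'])   # avc1.4d401f
print(merge_overwrite('135', {'vcodec': 'avc1.4d401f', 'fps': 30})['vcodec'])  # h264
print(merge_fallback('5', {'vcodec': None})['vcodec'])                         # h263
```

Under the fallback merge, the DASH mp4 formats keep the avc1.xxxxxx string the site reports, while formats that arrive without codec info pick it up from the table.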

Luckily there was another patch (#8218) which added 3 CSS-like selectors:
^= -- starts with string: [vcodec^=avc1]
*= -- contains string: [vcodec*=c1]
$= -- ends with string: [vcodec$=01f]

In your case, this would get the best video with the avc1/h264 codec:
youtube-dl -f "bestvideo[vcodec^=avc1]+bestaudio" bxnd_gWhWB0

Overwriting avc1 with h264 would make things simpler, but it would require an update for every new format. Keeping the defaults has its advantages as well: for example, a narrower wildcard can match just a few codec sub-variants. vcodec*=4d401 would match 4d4015, 4d401e, and 4d401f, but not 4d400c. Perhaps useful in some obscure case.
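
As a rough illustration of the matching semantics, the three operators can be modeled with ordinary Python string tests and run against the vcodec values from the listing. `STRING_OPERATORS`, `VCODECS` and `select` are hypothetical names for this sketch, not youtube-dl's implementation.

```python
# Plain-Python stand-ins for the three operators; not youtube-dl's own code.
STRING_OPERATORS = {
    '^=': lambda value, needle: value.startswith(needle),
    '*=': lambda value, needle: needle in value,
    '$=': lambda value, needle: value.endswith(needle),
}

# vcodec values copied from the format listing in this thread.
VCODECS = ['vp9', 'avc1.4d400c', 'avc1.4d4015', 'avc1.4d401e',
           'avc1.4d401f', 'avc1.42001E', 'vp8.0']


def select(op, needle):
    """Return the vcodec values that satisfy vcodec<op><needle>."""
    return [v for v in VCODECS if STRING_OPERATORS[op](v, needle)]


print(select('^=', 'avc1'))   # all five avc1.* values
print(select('*=', '4d401'))  # ['avc1.4d4015', 'avc1.4d401e', 'avc1.4d401f']
print(select('$=', '01f'))    # ['avc1.4d401f']
```

The matching here is case-sensitive, so a needle such as 42001e would not match avc1.42001E in this sketch.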

Author

@aktau aktau commented Jan 28, 2016

Awesome, thanks a lot and keep up the great work!

@aktau aktau closed this Jan 28, 2016
yan12125 added a commit that referenced this issue Jan 28, 2016
Regex pattern update to match more codecs (fixes #6858)