[ie/Twitch] Fix video formats extraction #8960

Merged
merged 2 commits into from Jan 9, 2024

@DmitryScaletta DmitryScaletta commented Jan 9, 2024

IMPORTANT: PRs without the template will be CLOSED

Description of your pull request and other information

For some users, VOD downloads don't work right now (including for me).
I think it's because the m3u8 request still uses the outdated query parameters nauth and nauthsig.
They probably changed something when they added AV1 support yesterday:
https://blog.twitch.tv/en/2024/01/08/introducing-the-enhanced-broadcasting-beta/
https://blogs.nvidia.com/blog/twitch-multiencode-av1-livestreaming/

There are two separate places where m3u8 formats are extracted:

For streams

query = {
    'allow_source': 'true',
    'allow_audio_only': 'true',
    'allow_spectre': 'true',
    'p': random.randint(1000000, 10000000),
    'player': 'twitchweb',
    'playlist_include_framerate': 'true',
    'segment_preference': '4',
    'sig': access_token['signature'].encode('utf-8'),
    'token': token.encode('utf-8'),
}
formats = self._extract_m3u8_formats(
    '%s/api/channel/hls/%s.m3u8' % (self._USHER_BASE, channel_name),
    stream_id, 'mp4', query=query)
self._prefer_source(formats)

For videos (outdated now)

formats = self._extract_m3u8_formats(
    '%s/vod/%s.m3u8?%s' % (
        self._USHER_BASE, vod_id,
        compat_urllib_parse_urlencode({
            'allow_source': 'true',
            'allow_audio_only': 'true',
            'allow_spectre': 'true',
            'player': 'twitchweb',
            'playlist_include_framerate': 'true',
            'nauth': access_token['value'],
            'nauthsig': access_token['signature'],
        })),
    vod_id, 'mp4', entry_protocol='m3u8_native')
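To make the outdated VOD request concrete, here is a minimal standard-library sketch that rebuilds the URL the snippet above produces. `USHER_BASE` is an assumed stand-in for `self._USHER_BASE`, `build_old_vod_url` is a hypothetical helper name, and the token is a placeholder rather than a real credential.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Assumption: stand-in for self._USHER_BASE in the extractor.
USHER_BASE = 'https://usher.ttvnw.net'


def build_old_vod_url(vod_id, access_token):
    """Rebuild the VOD playlist URL the way the outdated code path does."""
    query = urlencode({
        'allow_source': 'true',
        'allow_audio_only': 'true',
        'allow_spectre': 'true',
        'player': 'twitchweb',
        'playlist_include_framerate': 'true',
        # The two parameters this PR suspects are outdated:
        'nauth': access_token['value'],
        'nauthsig': access_token['signature'],
    })
    return '%s/vod/%s.m3u8?%s' % (USHER_BASE, vod_id, query)


# Placeholder token, for illustration only.
fake_token = {'value': '{"vod_id":"2026571206"}', 'signature': 'abc123'}
url = build_old_vod_url('2026571206', fake_token)
params = parse_qs(urlsplit(url).query)
print(sorted(params))
```

The printed parameter names include `nauth` and `nauthsig` but neither `sig` nor `token`, which is the mismatch with the stream request shown earlier.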

Real query params for these requests currently look like this:

For streams

const params = {
  acmb: 'e30=',
  allow_source: 'true',
  browser_family: 'chrome',
  browser_version: '120.0',
  cdm: 'wv',
  fast_bread: 'true',
  os_name: 'Windows',
  os_version: 'NT 10.0',
  p: '5814863',
  platform: 'web',
  play_session_id: 'b4aeb3d2ceb3a2927434d6ae296085c6',
  player_backend: 'mediaplayer',
  player_version: '1.24.0-rc.1.3',
  playlist_include_framerate: 'true',
  reassignments_supported: 'true',
  sig: accessToken.signature,
  supported_codecs: 'av1,h265,h264',
  token: accessToken.value,
  transcode_mode: 'cbr_v1',
}

For videos

const params = {
  acmb: 'e30=',
  allow_source: 'true',
  browser_family: 'chrome',
  browser_version: '120.0',
  cdm: 'wv',
  os_name: 'Windows',
  os_version: 'NT 10.0',
  p: '7864577',
  platform: 'web',
  play_session_id: 'afa47edbc62f8c35ac2d91818bb36e62',
  player_backend: 'mediaplayer',
  player_version: '1.24.0-rc.1.3',
  playlist_include_framerate: 'true',
  reassignments_supported: 'true',
  sig: accessToken.signature,
  supported_codecs: 'av1,h265,h264',
  token: accessToken.value,
  transcode_mode: 'cbr_v1',
}
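The difference between what yt-dlp sent and what the browser sends can be checked mechanically. The sketch below copies only the parameter names from the snippets in this description into sets and reports the names that yt-dlp used but the browser does not. The constant and function names are hypothetical.

```python
# Parameter names yt-dlp sent before this change (copied from the snippets above).
OLD_STREAM_KEYS = {
    'allow_source', 'allow_audio_only', 'allow_spectre', 'p', 'player',
    'playlist_include_framerate', 'segment_preference', 'sig', 'token',
}
OLD_VOD_KEYS = {
    'allow_source', 'allow_audio_only', 'allow_spectre', 'player',
    'playlist_include_framerate', 'nauth', 'nauthsig',
}

# Parameter names observed in the browser's VOD request.
BROWSER_VOD_KEYS = {
    'acmb', 'allow_source', 'browser_family', 'browser_version', 'cdm',
    'os_name', 'os_version', 'p', 'platform', 'play_session_id',
    'player_backend', 'player_version', 'playlist_include_framerate',
    'reassignments_supported', 'sig', 'supported_codecs', 'token',
    'transcode_mode',
}
# The browser's stream request carries one additional parameter.
BROWSER_STREAM_KEYS = BROWSER_VOD_KEYS | {'fast_bread'}


def only_in_old(old_keys, browser_keys):
    """Return the parameter names yt-dlp sent that the browser does not."""
    return sorted(old_keys - browser_keys)


print('VOD:', only_in_old(OLD_VOD_KEYS, BROWSER_VOD_KEYS))
print('stream:', only_in_old(OLD_STREAM_KEYS, BROWSER_STREAM_KEYS))
```

For VODs the leftover names include `nauth` and `nauthsig`, while the browser authenticates both request types with `sig` and `token`.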

About 'segment_preference': '4':
The only thing this parameter changes is the base URL of the playlist URLs:
https://gist.github.com/DmitryScaletta/eab5d3e93d678c6f4cd89e678779f006/revisions

https://video-weaver.fra05.hls.ttvnw.net // with
https://video-weaver.fra06.hls.ttvnw.net // without

This parameter was added 9 years ago, and it's missing from the real requests the browser makes. I don't think Twitch uses it anymore.
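As a quick check of that observation, the following sketch compares the two base URLs quoted above label by label, using only the standard library. The helper names are hypothetical, and it looks only at the hostnames, not at the full playlists in the gist.

```python
from urllib.parse import urlsplit

# Base URLs quoted above.
WITH_PREFERENCE = 'https://video-weaver.fra05.hls.ttvnw.net'
WITHOUT_PREFERENCE = 'https://video-weaver.fra06.hls.ttvnw.net'


def host_labels(url):
    """Split a URL's hostname into its dot-separated labels."""
    return urlsplit(url).hostname.split('.')


def differing_labels(url_a, url_b):
    """Return the (a, b) label pairs that differ between two hostnames."""
    return [
        (a, b)
        for a, b in zip(host_labels(url_a), host_labels(url_b))
        if a != b
    ]


print(differing_labels(WITH_PREFERENCE, WITHOUT_PREFERENCE))
```

Only the node label differs (`fra05` versus `fra06`); the scheme and the rest of the hostname are identical.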

About entry_protocol='m3u8_native':
I checked both stream and video formats, and their protocol is m3u8_native by default, so the argument can be removed.

After this fix:

Formats for VODs

$ ./yt-dlp.cmd https://www.twitch.tv/videos/2026571206 -F
[twitch:vod] Extracting URL: https://www.twitch.tv/videos/2026571206
[twitch:vod] 2026571206: Downloading stream metadata GraphQL
[twitch:vod] 2026571206: Downloading video access token GraphQL
[twitch:vod] 2026571206: Downloading m3u8 information
[twitch:vod] 2026571206: Downloading storyboard metadata JSON
WARNING: [twitch:vod] Unable to download JSON metadata: HTTP Error 403: Forbidden 
[info] Available formats for v2026571206:
ID         EXT RESOLUTION FPS │   FILESIZE   TBR PROTO │ VCODEC      ACODEC     ABR MORE INFO
─────────────────────────────────────────────────────────────────────────────────────────────
Audio_Only mp4 audio only     │ ~621.49MiB  208k m3u8  │ audio only  mp4a.40.2 208k
160p       mp4 284x160     30 │ ~687.22MiB  230k m3u8  │ avc1.4D400C mp4a.40.2
360p       mp4 640x360     30 │ ~  1.84GiB  630k m3u8  │ avc1.4D401E mp4a.40.2
480p       mp4 852x480     30 │ ~  3.50GiB 1200k m3u8  │ avc1.4D401E mp4a.40.2
720p       mp4 1280x720    30 │ ~  6.13GiB 2100k m3u8  │ avc1.4D401F mp4a.40.2
720p60     mp4 1280x720    60 │ ~  9.05GiB 3100k m3u8  │ avc1.4D401F mp4a.40.2
1080p60    mp4 1920x1080   60 │ ~ 24.58GiB 8424k m3u8  │ avc1.42C02A mp4a.40.2      Source

Formats for streams

$ ./yt-dlp.cmd https://www.twitch.tv/lirik -F
[twitch:stream] Extracting URL: https://www.twitch.tv/lirik
[twitch:stream] lirik: Downloading stream GraphQL
[twitch:stream] lirik: Downloading stream access token GraphQL
[twitch:stream] lirik: Downloading m3u8 information
[info] Available formats for 43367080459:
ID               EXT RESOLUTION FPS │   TBR PROTO │ VCODEC      ACODEC     ABR 
──────────────────────────────────────────────────────────────────────────────
audio_only       mp4 audio only     │  160k m3u8  │ audio only  mp4a.40.2 160k
160p             mp4 284x160     30 │  230k m3u8  │ avc1.4D401F mp4a.40.2
360p             mp4 640x360     30 │  630k m3u8  │ avc1.4D401F mp4a.40.2
480p             mp4 852x480     30 │ 1428k m3u8  │ avc1.4D401F mp4a.40.2
720p             mp4 1280x720    30 │ 2373k m3u8  │ avc1.4D401F mp4a.40.2
720p60           mp4 1280x720    60 │ 3423k m3u8  │ avc1.4D401F mp4a.40.2
1080p60__source_ mp4 1920x1080   60 │ 9053k m3u8  │ avc1.42C02A mp4a.40.2

Fixes #8958

Template

Before submitting a pull request make sure you have:

In order to be accepted and merged into yt-dlp each piece of code must be in public domain or released under Unlicense. Check all of the following options that apply:

  • I am the original author of this code and I am willing to release it under Unlicense
  • I am not the original author of this code but it is in public domain or released under Unlicense (provide reliable evidence)

What is the purpose of your pull request?

Comment on lines 193 to 209
def _extract_twitch_m3u8_formats(self, type, id, token, signature):
    URLS_MAP = {
        'video': '%s/vod/%s.m3u8',
        'stream': '%s/api/channel/hls/%s.m3u8'
    }
    query = {
        'allow_source': 'true',
        'allow_audio_only': 'true',
        'allow_spectre': 'true',
        'p': random.randint(1000000, 10000000),
        'player': 'twitchweb',
        'playlist_include_framerate': 'true',
        'sig': signature,
        'token': token,
    }
    return self._extract_m3u8_formats(
        URLS_MAP[type] % (self._USHER_BASE, id), id, 'mp4', query=query)

@bashonly bashonly Jan 9, 2024

Maybe it would be simpler to do it like this:

Suggested change
- def _extract_twitch_m3u8_formats(self, type, id, token, signature):
-     URLS_MAP = {
-         'video': '%s/vod/%s.m3u8',
-         'stream': '%s/api/channel/hls/%s.m3u8'
-     }
-     query = {
-         'allow_source': 'true',
-         'allow_audio_only': 'true',
-         'allow_spectre': 'true',
-         'p': random.randint(1000000, 10000000),
-         'player': 'twitchweb',
-         'playlist_include_framerate': 'true',
-         'sig': signature,
-         'token': token,
-     }
-     return self._extract_m3u8_formats(
-         URLS_MAP[type] % (self._USHER_BASE, id), id, 'mp4', query=query)
+ def _extract_twitch_m3u8_formats(self, video_id, token, signature):
+     """Subclasses must define _M3U8_PATH"""
+     return self._extract_m3u8_formats(
+         f'{self._USHER_BASE}/{self._M3U8_PATH}/{video_id}.m3u8', video_id, 'mp4', query={
+             'allow_source': 'true',
+             'allow_audio_only': 'true',
+             'allow_spectre': 'true',
+             'p': random.randint(1000000, 10000000),
+             'player': 'twitchweb',
+             'playlist_include_framerate': 'true',
+             'sig': signature,
+             'token': token,
+         })

and then in the respective concrete IE classes:

    _M3U8_PATH = 'vod'
    _M3U8_PATH = 'api/channel/hls'

(and remove the 'video' and 'stream' args from the method calls)
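A runnable sketch of this class-attribute pattern, outside yt-dlp, might look like the following. The `Fake*` classes are hypothetical stand-ins for the real extractors, `_extract_m3u8_formats` is stubbed to return the request it would make instead of downloading anything, and the `_USHER_BASE` value is an assumption.

```python
import random


class FakeTwitchBaseIE:
    """Hypothetical stand-in for the shared Twitch base extractor."""
    _USHER_BASE = 'https://usher.ttvnw.net'  # assumed value
    _M3U8_PATH = None  # concrete subclasses override this

    def _extract_m3u8_formats(self, url, video_id, ext, query=None):
        # Stub: the real method downloads and parses the playlist.
        # Here we only echo back the request that would be made.
        return {'url': url, 'video_id': video_id, 'ext': ext, 'query': query}

    def _extract_twitch_m3u8_formats(self, video_id, token, signature):
        """Subclasses must define _M3U8_PATH"""
        return self._extract_m3u8_formats(
            f'{self._USHER_BASE}/{self._M3U8_PATH}/{video_id}.m3u8',
            video_id, 'mp4', query={
                'allow_source': 'true',
                'allow_audio_only': 'true',
                'allow_spectre': 'true',
                'p': random.randint(1000000, 10000000),
                'player': 'twitchweb',
                'playlist_include_framerate': 'true',
                'sig': signature,
                'token': token,
            })


class FakeVodIE(FakeTwitchBaseIE):
    _M3U8_PATH = 'vod'


class FakeStreamIE(FakeTwitchBaseIE):
    _M3U8_PATH = 'api/channel/hls'


vod_request = FakeVodIE()._extract_twitch_m3u8_formats('2026571206', 'tok', 'sig')
stream_request = FakeStreamIE()._extract_twitch_m3u8_formats('lirik', 'tok', 'sig')
print(vod_request['url'])
print(stream_request['url'])
```

With this layout the call sites no longer pass a 'video' or 'stream' selector; the subclass chooses the endpoint through its class attribute.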

Why not just pass the m3u8 path itself as an argument? Each case is only used once.
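For comparison, a minimal sketch of this alternative passes the path directly, so no class attribute or lookup table is needed. `build_playlist_request` is a hypothetical free function that only assembles the URL and query, and `USHER_BASE` is an assumed stand-in for `self._USHER_BASE`.

```python
import random

# Assumption: stand-in for self._USHER_BASE in the extractor.
USHER_BASE = 'https://usher.ttvnw.net'


def build_playlist_request(path, video_id, token, signature):
    """Assemble the playlist URL and query for a caller-supplied m3u8 path."""
    url = f'{USHER_BASE}/{path}/{video_id}.m3u8'
    query = {
        'allow_source': 'true',
        'allow_audio_only': 'true',
        'allow_spectre': 'true',
        'p': random.randint(1000000, 10000000),
        'player': 'twitchweb',
        'playlist_include_framerate': 'true',
        'sig': signature,
        'token': token,
    }
    return url, query


# Each call site names its own endpoint path.
vod_url, vod_query = build_playlist_request('vod', '2026571206', 'tok', 'sig')
live_url, live_query = build_playlist_request('api/channel/hls', 'lirik', 'tok', 'sig')
print(vod_url)
print(live_url)
```

The trade-off is small: the path appears at the single call site that needs it, at the cost of one extra positional argument.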

@bashonly bashonly Jan 10, 2024
@bashonly bashonly added the site-bug Issue with a specific website label Jan 9, 2024
@bashonly bashonly added the pending-review PR needs a review label Jan 9, 2024
@bashonly bashonly merged commit 5b8c69a into yt-dlp:master Jan 9, 2024
6 checks passed
@bashonly bashonly removed the pending-review PR needs a review label Jan 9, 2024
aalsuwaidi pushed a commit to aalsuwaidi/yt-dlp that referenced this pull request Apr 21, 2024
Successfully merging this pull request may close these issues.

Twitch VOD downloads are broken
4 participants