[c13cl] Adding support for www.13.cl and rudo.video #8664

Merged
16 commits merged into yt-dlp:master on Dec 22, 2023

Conversation

nicodato
Contributor

IMPORTANT: PRs without the template will be CLOSED

Description of your pull request and other information

Adds extractor for www.13.cl and rudo.video.
13.cl is a TV channel from Chile, and it uses rudo.video.

With this PR, yt-dlp now supports (georestricted to Chile):
https://www.13.cl/en-vivo
https://www.13.cl/en-vivo-2
https://rudo.video/live/c13
https://rudo.video/live/t13-13cl
https://rudo.video/live/bbtv

Template

Before submitting a pull request make sure you have:

In order to be accepted and merged into yt-dlp each piece of code must be in public domain or released under Unlicense. Check all of the following options that apply:

  • I am the original author of this code and I am willing to release it under Unlicense
  • I am not the original author of this code but it is in public domain or released under Unlicense (provide reliable evidence)

What is the purpose of your pull request?

@seproDev seproDev added the site-request Request to support a new website label Nov 27, 2023
@garret1317 garret1317 added the geo-blocked Content is geo-blocked label Nov 27, 2023
Collaborator

@seproDev seproDev left a comment

Please also revert the changes to supportedsites.md. The file will get updated with the next release.

@seproDev seproDev added the pending-fixes PR has had changes requested label Nov 30, 2023
* implementing PR suggestions
* removing c13cl extractor in favor of RudoVideo with _EMBED_REGEX
* supporting VODs and Podcasts from rudo.video
* supporting embedded youtube
* adding tests
@nicodato
Contributor Author

nicodato commented Dec 2, 2023

Hello @seproDev, I have just implemented your suggestions.
I added support for VODs and podcasts. Some interesting findings:
Both https://rudo.video/podcast/cz2wrUy8l0o and https://rudo.video/vod/cz2wrUy8l0o give you the same page (a podcast). The same thing happens with a real podcast. This does not seem to work with /live/ URLs.
I also found https://rudo.video/vod/czfvKUuULV8, which is in reality an embedded YouTube video.
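
A rough sketch of how a single pattern could accept all three path types mentioned above. The regex and the helper name `parse_rudo_url` are illustrative assumptions, not the pattern used in the merged extractor.

```python
import re

# Hypothetical pattern for illustration; not the extractor's actual _VALID_URL.
RUDO_URL_RE = re.compile(
    r'https?://rudo\.video/(?P<type>live|vod|podcast)/(?P<id>[^/?&#]+)')


def parse_rudo_url(url):
    """Return (type, id) for a rudo.video URL, or None if it does not match."""
    match = RUDO_URL_RE.match(url)
    if not match:
        return None
    return match.group('type'), match.group('id')
```

With this pattern the /podcast/ and /vod/ forms of the same ID parse to the same ID, which mirrors the observation that both URLs lead to the same page.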

@bashonly bashonly self-requested a review December 6, 2023 18:42
@nicodato
Contributor Author

nicodato commented Dec 9, 2023

No problem :) Thanks for your advice @seproDev
I have just split the class into two classes, RudoVideoIE and RudoVideoLiveIE. Also, I used those _og_search_* methods.

@nicodato
Contributor Author

@seproDev I have used @pukkandan's commit and modified it just a little. The line that obtains the m3u8_url must first check for the streamURL variable and, only if that fails, search for the source tag.
This is because on https://rudo.video/live/bbtv (which is not geo-restricted) the source tag has a generic m3u8 that doesn't work. That page also has the streamURL variable, and that is the URL that works. Using one regex to search for either streamURL or the source tag therefore failed: it returned the generic m3u8 from the source tag instead of the streamURL variable.
IIRC podcasts have the m3u8 in the source tag.

Everything else should be just like pukkandan's suggestion
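
A minimal stand-alone sketch of the lookup order described above, using only the standard library. The page shapes (`streamURL = "..."` in a script and `<source src="...">`) and the function name `find_m3u8_url` are assumptions for illustration; the merged extractor may use different regexes.

```python
import re


def find_m3u8_url(webpage):
    """Prefer the streamURL variable; fall back to the <source> tag."""
    # Assumed shape: streamURL = "https://..." somewhere in an inline script.
    stream = re.search(r'streamURL\s*=\s*["\'](?P<url>[^"\']+)', webpage)
    if stream:
        return stream.group('url')
    # Assumed shape: <source src="https://..."> inside the player markup.
    source = re.search(r'<source[^>]+src=["\'](?P<url>[^"\']+)', webpage)
    return source.group('url') if source else None
```

Running the two searches in sequence makes the priority explicit. A single alternation returns whichever candidate appears first in the page, which is how the generic source URL ended up being picked.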

@pukkandan pukkandan removed the pending-fixes PR has had changes requested label Dec 11, 2023
Comment on lines 92 to 94
m3u8_url = update_url_query(m3u8_url, {
    'auth-token': traverse_obj(access_token, ('data', 'authToken'))
})
Member

should this be fatal?

Suggested change
- m3u8_url = update_url_query(m3u8_url, {
-     'auth-token': traverse_obj(access_token, ('data', 'authToken'))
- })
+ m3u8_url = update_url_query(m3u8_url, {
+     'auth-token': access_token['data']['authToken'],
+ })

Member

@bashonly bashonly Dec 12, 2023

I would do this instead

        if token_array:
            token_url = traverse_obj(token_array, (..., {url_or_none}), get_all=False)
            if not token_url:
                raise ExtractorError('Invalid access token array')
            access_token = self._download_json(
                token_url, video_id, note='Downloading access token')['data']['authToken']
            m3u8_url = update_url_query(m3u8_url, {'auth-token': access_token})
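
For readers without the yt-dlp helpers at hand, the same flow can be sketched with the standard library alone. Everything here is an assumption for illustration: `first_url` stands in for the `traverse_obj(..., {url_or_none})` lookup, `fetch_json` is a hypothetical callable that takes a URL and returns the parsed JSON, and `ValueError` replaces `ExtractorError`.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def first_url(items):
    """Return the first http(s) URL string in items, or None."""
    for item in items:
        if isinstance(item, str) and item.startswith(('http://', 'https://')):
            return item
    return None


def add_auth_token(m3u8_url, token_array, fetch_json):
    """Append auth-token to m3u8_url when a token array is present."""
    if not token_array:
        return m3u8_url
    token_url = first_url(token_array)
    if not token_url:
        raise ValueError('Invalid access token array')
    # Plain indexing raises KeyError on a malformed response, so the
    # failure is fatal instead of silently producing a URL without a token.
    token = fetch_json(token_url)['data']['authToken']
    parts = urlsplit(m3u8_url)
    query = dict(parse_qsl(parts.query))
    query['auth-token'] = token
    return urlunsplit(parts._replace(query=urlencode(query)))
```

Passing the fetcher in as an argument keeps the sketch testable without network access.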

Edit:

My first PR instead of using authToken and auth-token, it used the token_array elements.

are you saying the request to download token is unnecessary?

Contributor Author

thanks @bashonly , I have just pushed your suggestion
(somehow I removed my previous comment by mistake)

Contributor Author

@nicodato nicodato Dec 12, 2023

are you saying the request to download token is unnecessary?

No. We need to download the token.
I meant that the array has something like this:

["https://example.com/tokenapi", ..., "authToken", "auth-token", ...]

So previously, I was downloading the token and then using the access_token and the token_array elements to construct the query string like this:

access_token_webpage = self._download_webpage(token_array[0], video_id)
access_token = self._parse_json(access_token_webpage, video_id)
if "data" not in access_token or token_array[3] not in access_token.get("data"):
    raise ExtractorError('Couldnt get access token', video_id=video_id)
query_string = token_array[5] + traverse_obj(access_token, ("data", token_array[3]))

Member

ah alright. LGTM then

access_token = self._download_json(
    token_url, video_id, note='Downloading access token')['data']['authToken']
m3u8_url = update_url_query(m3u8_url, {'auth-token': access_token})

Collaborator

@seproDev seproDev Dec 17, 2023

Podcasts such as https://rudo.video/podcast/b42ZUznHX0 are sometimes served as direct mp3 files, which currently break the extractor. A simple solution would be to check by extension:

Suggested change
if determine_ext(media_url) == 'm3u8':
    formats = self._extract_m3u8_formats(media_url, video_id, live=is_live)
else:
    formats = [{'url': media_url}]

I'd also rename the variable to media_url.

Member

if the extension is mp3, we may also want to add 'vcodec': 'none' to the format dict
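
Putting the two suggestions together, a stand-alone sketch could look like the following. It uses only the standard library: `url_ext` is a simplified stand-in for yt-dlp's `determine_ext`, and the helper names `needs_m3u8_parsing` and `direct_format` are hypothetical, not part of the extractor.

```python
import os
from urllib.parse import urlsplit


def url_ext(url):
    """Lower-cased extension of the URL path, without the dot."""
    return os.path.splitext(urlsplit(url).path)[1].lstrip('.').lower()


def needs_m3u8_parsing(media_url):
    """True when the URL points at an HLS manifest rather than a media file."""
    return url_ext(media_url) == 'm3u8'


def direct_format(media_url):
    """Format dict for a direct media URL; mp3 is flagged as audio-only."""
    fmt = {'url': media_url}
    if url_ext(media_url) == 'mp3':
        # 'vcodec': 'none' marks the format as having no video stream.
        fmt['vcodec'] = 'none'
    return fmt
```

Taking the extension from the URL path rather than the full URL keeps query strings such as an auth token from hiding it.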

@nicodato
Contributor Author

@seproDev I pushed your suggestions, adding support for the MP3 podcast, including a test

@seproDev seproDev merged commit 0d531c3 into yt-dlp:master Dec 22, 2023
15 checks passed
@nicodato nicodato deleted the c13cl branch December 26, 2023 17:37
aalsuwaidi pushed a commit to aalsuwaidi/yt-dlp that referenced this pull request Apr 21, 2024
Labels
geo-blocked Content is geo-blocked site-request Request to support a new website

6 participants