[extractor/sbs] Overhaul extractor for new APIs #6839
* Use media['name'] for title; * Support urls including 'tv-program', and add a test url; * Remove deprecated sort_formats() call; * Support for capture and return of subtitles, with UTF-16 coding issue work-around when converting dfxp subs to srt.
…rought over from upstream PR: * Inline single-use helper functions and remove unneeded temporary objects * Leverage more powerful traverse_obj() features and eliminate extra helper functions * Stream-line and improve geo-blocking handling * Correctly set 'episode' for named episodes, instead of 'Episode 1', where available. * Re-lint and clean up unneeded imports.
* More inlining of single items, and extra blank line removed * Simplify episode setting code * Simplify season number setting; removed extraneous partOfSeries path
Co-authored-by: pukkandan <pukkandan.ytdlp@gmail.com>
Updates from bashonly: * sort class above function * for livestreams, `livestream` is `True` in catalogue Co-authored-by: bashonly <88596187+bashonly@users.noreply.github.com>
I've tested my fixes to the geo-bypass/geo-restricted code and everything's working now
Thank you very much for this, things are working very well again.
@vidiot720 or @bashonly
The program pages say AD, and another show that is supposed to support it is Home Is Where The Art Is Series 1 (Ep.11), but `--list-formats` doesn't show any other audio. `[SBS] Extracting URL: https://www.sbs.com.au/ondemand/movie/the-lost-city-of-melbourne/2264088643618`
The on-demand program pages are a bit misleading to include the AD symbol; per https://www.sbs.com.au/aboutus/audio-description-services/#faqs-about-audio-description,
I'd keep an eye on the FAQ to see if AD becomes available on On Demand; it's probably non-trivial for SBS to add support for multiple audio streams since an interface for stream selection would need to be added. There may be some hope given the use of "Currently, ...".
Closes yt-dlp#6543 Authored by: vidiot720, dirkf, bashonly
Description of your pull request and other information
From dirkf's PR: Australian provider SBS has changed its hosting arrangements and APIs, breaking the existing extractor.
The PR is intended to deal with the new APIs.
Some specialisations of extractor methods are included:
* `_download_webpage_handle()` detects geo-restriction
* `_extract_m3u8_formats()` defaults to the native downloader.
That PR was developed for upstream, so it has been adjusted for deprecations and other legacy compatibility inclusions. It also addresses an issue with extracting and converting dfxp subtitles to other formats, since SBS deliver their UTF-8 encoded files with `encoding='UTF-16'`, in error. The workaround is included in utils.py. Note that it will not be triggered where a file actually encoded with UTF-16 is downloaded.
Fixes #6543. Adapt dirkf's PR ytdl-org/youtube-dl#31880, with adjustments:
See also ytdl-org/youtube-dl#31841.
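For readers unfamiliar with the subtitle encoding problem described above, here is a minimal sketch of the general idea. It is not the code added to utils.py in this PR: the helper name `fix_mislabelled_utf16` and the detection rule (no byte-order mark plus a declaration readable as single-byte text) are assumptions made for illustration only.

```python
import re

# Hypothetical helper for illustration; the actual workaround in this PR may differ.
# Matches an XML declaration that claims UTF-16. In genuinely UTF-16 data the ASCII
# characters are interleaved with NUL bytes, so this byte pattern cannot match there.
_UTF16_DECL = re.compile(rb'''(<\?xml[^>]*?encoding=)(["'])utf-16\2''', re.IGNORECASE)


def fix_mislabelled_utf16(data: bytes) -> bytes:
    """Relabel an XML document as UTF-8 when it claims UTF-16 but is not."""
    # A UTF-16 byte-order mark means the declaration is probably truthful: leave it alone.
    if data.startswith((b'\xff\xfe', b'\xfe\xff')):
        return data
    # Otherwise rewrite only the first declaration, preserving the quote style.
    return _UTF16_DECL.sub(rb'\1\2UTF-8\2', data, count=1)


mislabelled = b'<?xml version="1.0" encoding="UTF-16"?><tt>caf\xc3\xa9</tt>'
genuine = '<?xml version="1.0" encoding="UTF-16"?><tt>caf\u00e9</tt>'.encode('utf-16')

fixed = fix_mislabelled_utf16(mislabelled)
untouched = fix_mislabelled_utf16(genuine)
```

In this sketch a file that really is UTF-16 passes through unchanged, which mirrors the note above that the workaround is not triggered for genuinely UTF-16 downloads.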
There are a few TODOs, such as handling metadata for episode titles more nicely, and the downloaded subtitles (if not converted) will still have the wrong encoding. However, these can be worked around and don't cause exceptions during yt-dlp processing, so I have raised this PR in order to get core extractor functions back ASAP.
Boilerplate: bug fix, derived code, own tweaks.