CBCPlayer #7484
…at finding the direct mp4 file, but as far as I can tell, there is no direct VTT file accessible on theplatform. As an alternative, this uses the subtitles from the m3u8 file while still keeping the direct mp4 file instead of stitching together the bits declared in the m3u8 file for the video.
… but also further formats available via m3u8. Also added a few comments to clarify things.
…s and duration info
…umentation and don't work anymore. Someone who knows their way around these sites that use theplatform should fix this, so that proper regression testing can be done for theplatform.
…ound it works much better, and as far as I can tell there are no regressions anymore.
… it. It will only download the file if the link actually links to an m3u8 file.
Updated some things.
I've also done some regression tests - at least as far as I could without messing with other IEs. As tested, the following tests worked previously:
They all still work with the changes in this PR. Mind you, many of these tests are skipped or not very in-depth. Also, this is 6 tests out of a total of 36, as far as I can tell, so not great coverage. I've noticed a number of the failed tests are just because some fields are missing in the test data. So... should I fix these other tests before it gets merged? |
It'd be great to fix them, but you don't have to. |
OK, I'll try to see what I can do. |
Should I maybe make a separate PR for fixing up the tests on other sites? They are really unrelated to this PR, other than that I'm trying to get them fixed to improve regression testing for this PR. |
Stupid question, I'm trying to get the generic extractor tests up and running. It gives me the error |
It's a core bug. Apply this patch to run the tests, but don't commit it:
diff --git a/yt_dlp/YoutubeDL.py b/yt_dlp/YoutubeDL.py
index 1a2f42fe9..f4a80ea11 100644
--- a/yt_dlp/YoutubeDL.py
+++ b/yt_dlp/YoutubeDL.py
@@ -572,7 +572,7 @@ class YoutubeDL:
'width', 'height', 'aspect_ratio', 'resolution', 'dynamic_range', 'tbr', 'abr', 'acodec', 'asr', 'audio_channels',
'vbr', 'fps', 'vcodec', 'container', 'filesize', 'filesize_approx', 'rows', 'columns',
'player_url', 'protocol', 'fragment_base_url', 'fragments', 'is_from_start',
- 'preference', 'language', 'language_preference', 'quality', 'source_preference',
+ 'preference', 'language', 'language_preference', 'quality', 'source_preference', 'cookies',
'http_headers', 'stretched_ratio', 'no_resume', 'has_drm', 'extra_param_to_segment_url', 'hls_aes', 'downloader_options',
'page_url', 'app', 'play_path', 'tc_url', 'flash_version', 'rtmp_live', 'rtmp_conn', 'rtmp_protocol', 'rtmp_real_time'
} |
Let it be in this. I can split it later if needed. |
Aha OK, thanks! |
I replied under the assumption it is easier for you to have everything in one PR. But if the other way is easier for you, I don't mind that either - you can open another PR with all the test fixes if you want to. That aside, is there a reason you haven't addressed the requested changes for this PR yet? As I've said before, you don't necessarily need to fix the other tests to get this merged. |
I thought I caught all the points in https://github.com/yt-dlp/yt-dlp/pull/7484#issuecomment-1634724162? Or did I miss something? |
…tion, since a 404 otherwise would cause a crash. If no errors ensue checking the extension, it proceeds processing the data." This reverts commit 36253f2.
Co-authored-by: bashonly <88596187+bashonly@users.noreply.github.com>
… the test fixing PR.
you can ignore the failing linter, doesn't have anything to do with this PR |
OK. Doing a final regression test. On CPython 3.8 everything worked, currently testing CPython 3.7 then PyPy and a final flake8 check. |
OK, all tests are through, including with f82aa89. No regressions anymore with CPython 3.8, CPython 3.7 and PyPy 3.7. Flake8 also no longer flags anything. |
imo it's a given that HLS bitrate info is unreliable, and the comment could be shortened to what was suggested
the rest LGTM
OK. Let me know if you need anything else. |
Here's a stupid question: in commit 9e34e40 you replaced the truncated query with a more exact query. How did you figure out that |
@trainman261 It's how the NBC sites get direct m3u8 responses instead of SMIL/etc. But it doesn't work for all sites, hence the error/warning with the ScrippsNetworks test and why we needed to make sure it is completely non-fatal (and not requested unnecessarily)
I don't think ThePlatform provides a direct subtitles link. It's either via m3u8, mpd or SMIL |
BTW check |
@dirkf Thanks for answering a question I was too embarrassed to ask 😄 Isn't necessary for this PR, but for other things I was trying to figure out how to do just that. |
I noticed the pending-review tag. Is there something that still needs to be done? |
Authored by: trainman261
IMPORTANT: PRs without the template will be CLOSED
Description of your pull request and other information
Main Goal
Add subtitle and additional format support.
Details of the changes
Currently, the ThePlatformIE just tries to extract the video directly, as well as the JSON information. This sometimes misses certain formats and subtitles: since often the direct video is the maximum quality, it doesn't currently allow downloading lower quality formats in some cases. The information about subtitles also sometimes doesn't exist here.
As a solution to this, the m3u8 playlist file is also analyzed. This can help find additional formats as well as subtitles.
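To illustrate what analyzing the playlist can yield, here is a minimal standalone sketch in plain Python. It is not the code from this PR and does not use yt-dlp's own HLS helpers; the function names (`parse_attributes`, `parse_master_playlist`) and the output dictionaries are hypothetical. It assumes a well-formed HLS master playlist, reads the variant streams from `#EXT-X-STREAM-INF` tags and the subtitle renditions from `#EXT-X-MEDIA:TYPE=SUBTITLES` tags.

```python
import re
from urllib.parse import urljoin

# Matches KEY=value pairs where value is either quoted or runs up to the next comma.
ATTR_RE = re.compile(r'([A-Z0-9-]+)=("[^"]*"|[^,]*)')


def parse_attributes(attr_text):
    """Turn an HLS attribute list into a dict, stripping surrounding quotes."""
    return {key: value.strip('"') for key, value in ATTR_RE.findall(attr_text)}


def parse_master_playlist(text, base_url):
    """Hypothetical helper: collect variant formats and subtitle tracks."""
    formats, subtitles = [], {}
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    for index, line in enumerate(lines):
        if line.startswith('#EXT-X-STREAM-INF:'):
            attrs = parse_attributes(line.split(':', 1)[1])
            # The variant URI is the next line that is not a tag or comment.
            uri = next((l for l in lines[index + 1:] if not l.startswith('#')), None)
            if uri is None:
                continue
            bandwidth = attrs.get('BANDWIDTH', '')
            formats.append({
                'url': urljoin(base_url, uri),
                'bandwidth': int(bandwidth) if bandwidth.isdigit() else None,
                'resolution': attrs.get('RESOLUTION'),
            })
        elif line.startswith('#EXT-X-MEDIA:'):
            attrs = parse_attributes(line.split(':', 1)[1])
            if attrs.get('TYPE') == 'SUBTITLES' and attrs.get('URI'):
                lang = attrs.get('LANGUAGE', 'und')
                subtitles.setdefault(lang, []).append(
                    {'url': urljoin(base_url, attrs['URI'])})
    return formats, subtitles
```

The lower-bitrate variants found this way are the formats that a direct mp4 link alone would miss, and the subtitle URIs point at playlists of subtitle segments rather than at a single VTT file.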
Related issues that came out of this improvement
For one, there's the issue of changing ThePlatformIE, as it's an IE that other IEs rely on. There is no easy way to systematically ensure that these changes don't break other IEs. I've done my best to make sure that the changes don't affect the functionality, but it's hard to be sure. See #7483 .
Then there's the issue of checking m3u8 files. In this PR, it is handled by ThePlatformIE, but it seems to me like this should be in the parent InfoExtractor class. Mind you, here, I'm doing it by checking the file extension, but for the parent class, it would make sense to check the first line of the m3u8 file. See #7482 .
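The two checks mentioned above could look roughly like the following sketch. Both function names are hypothetical and neither is taken from yt-dlp; the content check assumes the response body has already been fetched and decoded as text, and relies on the HLS rule that a playlist starts with the `#EXTM3U` tag.

```python
from urllib.parse import urlparse


def has_m3u8_extension(url):
    """Extension check: look only at the URL path, ignoring any query string."""
    return urlparse(url).path.lower().endswith('.m3u8')


def has_m3u8_header(text):
    """Content check: the first non-blank line of a playlist must be #EXTM3U."""
    for line in text.lstrip('\ufeff').splitlines():
        if line.strip():
            return line.strip() == '#EXTM3U'
    return False
```

The extension check needs no extra request but can be fooled by URLs that serve a playlist without the `.m3u8` suffix (or the reverse); the content check is more reliable at the cost of downloading the response first.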
Fixes #
Template
Before submitting a pull request make sure you have:
In order to be accepted and merged into yt-dlp each piece of code must be in public domain or released under Unlicense. Check all of the following options that apply:
What is the purpose of your pull request?
Copilot Summary
🤖 Generated by Copilot at 9f5d60f
Summary
🧪🎬🚫
Improved extraction of CBC videos and added subtitles and chapters support. Added m3u8 and subtitles support for ThePlatform videos. Updated some test cases in cbc.py and theplatform.py.
Walkthrough
… thumbnail, chapters, and duration fields to a test case and override format sorting for CBCPlayerIE (link, link)
… CBCPlayerIE (link)
… skip messages for outdated or unavailable videos in CBCPlayerIE and ThePlatformIE tests (link, link, link, link)
… ThePlatformIE._real_extract (link)