[extractor/abc] Fix #6433 by add 2 re patterns #7434

Merged
bashonly merged 3 commits into yt-dlp:master on Jun 27, 2023

@meliber (Contributor) commented Jun 26, 2023

IMPORTANT: PRs without the template will be CLOSED

Description of your pull request and other information

ADD DESCRIPTION HERE

Fixes #6433

Template

Before submitting a pull request make sure you have:

In order to be accepted and merged into yt-dlp each piece of code must be in public domain or released under Unlicense. Check all of the following options that apply:

  • I am the original author of this code and I am willing to release it under Unlicense
  • I am not the original author of this code but it is in public domain or released under Unlicense (provide reliable evidence)

What is the purpose of your pull request?

Copilot Summary

馃 Generated by Copilot at 1e57797

Summary

🎥🆕🧪

Enhance ABCIE extractor to handle YouTube-embedded videos. Add a new test case for yt_dlp/extractor/abc.py.

Walkthrough

  • Add logic to handle ABC-embedded videos in ABCIE extractor ([link](https://github.com/yt-dlp/yt-dlp/pull/7434/files?diff=unified&w=0#diff-65fb45ff5464c45c0963dd9a56a6ea7b3f85fab95e15cbd3fe360391c073051bR108-R116))

@bashonly (Member) commented

Instead of adding more if/else statements, I think this could be done like this:

diff --git a/yt_dlp/extractor/abc.py b/yt_dlp/extractor/abc.py
index 0ca76b85a..b45e4d12c 100644
--- a/yt_dlp/extractor/abc.py
+++ b/yt_dlp/extractor/abc.py
@@ -12,6 +12,7 @@
     int_or_none,
     parse_iso8601,
     str_or_none,
+    traverse_obj,
     try_get,
     unescapeHTML,
     update_url_query,
@@ -107,7 +108,7 @@ def _real_extract(self, url):
                 video = True
 
         if mobj is None:
-            mobj = re.search(r'(?P<type>)"sources": (?P<json_data>\[[^\]]+\]),', webpage)
+            mobj = re.search(r'(?P<type>)"(?:sources|files|renditions)":\s*(?P<json_data>\[[^\]]+\])', webpage)
             if mobj is None:
                 mobj = re.search(
                     r'inline(?P<type>Video|Audio|YouTube)Data\.push\((?P<json_data>[^)]+)\);',
@@ -121,7 +122,7 @@ def _real_extract(self, url):
             urls_info = self._parse_json(
                 mobj.group('json_data'), video_id, transform_source=js_to_json)
             youtube = mobj.group('type') == 'YouTube'
-            video = mobj.group('type') == 'Video' or urls_info[0]['contentType'] == 'video/mp4'
+            video = mobj.group('type') == 'Video' or traverse_obj(urls_info, (0, 'contentType')) == 'video/mp4'
 
         if not isinstance(urls_info, list):
             urls_info = [urls_info]
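
To illustrate the two changes in the diff above, here is a minimal standalone sketch. The two patterns are copied from the diff; the page fragments in `SAMPLES` are invented for illustration and are not taken from abc.net.au. `first_content_type` is a hypothetical stand-in that mimics the safe lookup `traverse_obj(urls_info, (0, 'contentType'))`; it is not yt-dlp's implementation. Plain `json.loads` is used in place of `self._parse_json(..., transform_source=js_to_json)` to keep the example free of dependencies.

```python
import json
import re

# Patterns copied from the diff above.
OLD_PATTERN = r'(?P<type>)"sources": (?P<json_data>\[[^\]]+\]),'
NEW_PATTERN = r'(?P<type>)"(?:sources|files|renditions)":\s*(?P<json_data>\[[^\]]+\])'

# Hypothetical page fragments, invented for illustration only.
SAMPLES = {
    'sources': '{"sources": [{"contentType": "video/mp4", "url": "https://example.com/a.mp4"}], "x": 1}',
    'files': '{"files":[{"url": "https://example.com/b.mp4"}]}',
    'renditions': '{"renditions":\n  [{"contentType": "video/mp4"}]}',
}


def extract(pattern, page):
    """Return the JSON array captured by `pattern`, or None if nothing matches."""
    mobj = re.search(pattern, page)
    if mobj is None:
        return None
    return json.loads(mobj.group('json_data'))


def first_content_type(items):
    """Hypothetical stand-in for the safe lookup in the diff: return
    items[0]['contentType'], or None instead of raising on missing data."""
    try:
        return items[0]['contentType']
    except (IndexError, KeyError, TypeError):
        return None


if __name__ == '__main__':
    for name, page in SAMPLES.items():
        old = extract(OLD_PATTERN, page)
        new = extract(NEW_PATTERN, page)
        print(name, 'old matched:', old is not None, 'new matched:', new is not None,
              'contentType:', first_content_type(new))
```

In this sketch the old pattern only matches the `sources` fragment, because it requires that exact key, a single space after the colon, and a trailing comma. The new pattern matches all three fragments. The `files` fragment has no `contentType` key, which is the case where direct indexing with `urls_info[0]['contentType']` would raise a `KeyError`, while the safe lookup yields `None`.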

@bashonly added the site-bug (Issue with a specific website) and pending-fixes (PR has had changes requested) labels Jun 26, 2023
@meliber (Contributor, Author) commented Jun 26, 2023

That's much better. Thank you.

yt_dlp/extractor/abc.py (outdated review thread, resolved)
@bashonly added the pending-review (PR needs a review) label and removed the pending-fixes (PR has had changes requested) label Jun 26, 2023
Co-authored-by: bashonly <88596187+bashonly@users.noreply.github.com>
@bashonly merged commit 8f05fba into yt-dlp:master Jun 27, 2023
11 checks passed
@bashonly removed the pending-review (PR needs a review) label Jun 27, 2023
@meliber deleted the fix_#6433 branch June 27, 2023 21:24
aalsuwaidi pushed a commit to aalsuwaidi/yt-dlp that referenced this pull request Apr 21, 2024
Labels: site-bug (Issue with a specific website)
Linked issue: [abc.net.au] Unable to extract video urls
2 participants