[ie] Make _search_nuxt_data more lenient
Authored by: std-move

Co-authored-by: std-move <26625259+std-move@users.noreply.github.com>
bashonly and std-move committed Sep 21, 2023
1 parent 52414d6 commit 904a19e
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion yt_dlp/extractor/common.py
@@ -1687,7 +1687,7 @@ def _search_nextjs_data(self, webpage, video_id, *, transform_source=None, fatal
     def _search_nuxt_data(self, webpage, video_id, context_name='__NUXT__', *, fatal=True, traverse=('data', 0)):
         """Parses Nuxt.js metadata. This works as long as the function __NUXT__ invokes is a pure function"""
         rectx = re.escape(context_name)
-        FUNCTION_RE = r'\(function\((?P<arg_keys>.*?)\){return\s+(?P<js>{.*?})\s*;?\s*}\((?P<arg_vals>.*?)\)'
+        FUNCTION_RE = r'\(function\((?P<arg_keys>.*?)\){(?:.*?)return\s+(?P<js>{.*?})\s*;?\s*}\((?P<arg_vals>.*?)\)'
         js, arg_keys, arg_vals = self._search_regex(
             (rf'<script>\s*window\.{rectx}={FUNCTION_RE}\s*\)\s*;?\s*</script>', rf'{rectx}\(.*?{FUNCTION_RE}'),
             webpage, context_name, group=('js', 'arg_keys', 'arg_vals'),
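
To illustrate what the one-line change allows, the sketch below runs the old and new FUNCTION_RE against two made-up payloads. The payload strings, the OLD_/NEW_ names and the extract() helper are assumptions for illustration only. They are not taken from a real site or from yt-dlp.

```python
import re

# The two patterns from the diff, copied verbatim.
OLD_FUNCTION_RE = r'\(function\((?P<arg_keys>.*?)\){return\s+(?P<js>{.*?})\s*;?\s*}\((?P<arg_vals>.*?)\)'
NEW_FUNCTION_RE = r'\(function\((?P<arg_keys>.*?)\){(?:.*?)return\s+(?P<js>{.*?})\s*;?\s*}\((?P<arg_vals>.*?)\)'

# Hypothetical payloads (invented for this sketch).
# 1) The function body is a bare `return {...}`.
plain = 'window.__NUXT__=(function(a,b){return {data:[{id:a,title:b}]}}(1,"x"));'
# 2) The function body runs a statement before the `return`.
with_prelude = 'window.__NUXT__=(function(a,b){var c=a;return {data:[{id:c,title:b}]}}(1,"x"));'


def extract(pattern, payload):
    """Return (js, arg_keys, arg_vals), or None if the pattern does not match."""
    m = re.search(pattern, payload)
    return m.group('js', 'arg_keys', 'arg_vals') if m else None


for label, payload in (('plain', plain), ('with_prelude', with_prelude)):
    print(label, 'old:', extract(OLD_FUNCTION_RE, payload))
    print(label, 'new:', extract(NEW_FUNCTION_RE, payload))
```

Under these assumptions the old pattern only matches the first payload, because it requires `return` directly after the opening brace. The new pattern matches both.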

2 comments on commit 904a19e

dirkf (Contributor) commented on 904a19e, Oct 6, 2023

Maybe:

-        FUNCTION_RE = r'\(function\((?P<arg_keys>.*?)\){return\s+(?P<js>{.*?})\s*;?\s*}\((?P<arg_vals>.*?)\)'
+        FUNCTION_RE = r'\(function\((?P<arg_keys>.*?)\){(?:.*?)\breturn\s+(?P<js>{.*?})\s*;?\s*}\((?P<arg_vals>.*?)\)'

Also, should re.DOTALL be in effect here (or . --> [\s\S])?
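
A small sketch of both points, using contrived inputs. The payloads and the js_of() helper are hypothetical and chosen only to trigger each case. The first payload contains an identifier that ends in "return"; the second contains a newline before the return statement.

```python
import re

# Pattern as committed, and the variant with \b suggested in this comment.
COMMITTED_RE = r'\(function\((?P<arg_keys>.*?)\){(?:.*?)return\s+(?P<js>{.*?})\s*;?\s*}\((?P<arg_vals>.*?)\)'
SUGGESTED_RE = r'\(function\((?P<arg_keys>.*?)\){(?:.*?)\breturn\s+(?P<js>{.*?})\s*;?\s*}\((?P<arg_vals>.*?)\)'

# Hypothetical payload: the identifier `noreturn` is followed by whitespace and `{`.
tricky = '(function(a){class noreturn {};return {data:[a]}}(1))'
# Hypothetical payload: a newline sits between the prelude and the `return`.
multiline = '(function(a){var b=a;\nreturn {data:[b]}}(1))'


def js_of(pattern, payload, flags=0):
    """Return the captured `js` group, or None if there is no match."""
    m = re.search(pattern, payload, flags)
    return m.group('js') if m else None


# Without \b, the match can latch onto the tail of `noreturn`.
print(repr(js_of(COMMITTED_RE, tricky)))
print(repr(js_of(SUGGESTED_RE, tricky)))

# `.` does not match a newline unless re.DOTALL is set.
print(repr(js_of(SUGGESTED_RE, multiline)))
print(repr(js_of(SUGGESTED_RE, multiline, re.DOTALL)))
```

Replacing each `.` with `[\s\S]` would have the same effect as the flag, since that character class matches newlines regardless of flags.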

dirkf (Contributor) commented on 904a19e, Oct 6, 2023

Also part 2, the non-capturing parens are unnecessary, though that could be a style issue.
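
As a quick check of this point, the sketch below compares the committed pattern with a variant that drops the non-capturing group. The sample strings and the WITH_GROUP/WITHOUT_GROUP names are made up for illustration.

```python
import re

# Committed pattern, with the non-capturing group around the lazy prelude.
WITH_GROUP = r'\(function\((?P<arg_keys>.*?)\){(?:.*?)return\s+(?P<js>{.*?})\s*;?\s*}\((?P<arg_vals>.*?)\)'
# Same pattern with the bare lazy quantifier instead.
WITHOUT_GROUP = r'\(function\((?P<arg_keys>.*?)\){.*?return\s+(?P<js>{.*?})\s*;?\s*}\((?P<arg_vals>.*?)\)'

# Hypothetical sample payloads.
samples = [
    '(function(a,b){return {data:[{id:a,title:b}]}}(1,"x"))',
    '(function(a,b){var c=a;return {data:[{id:c,title:b}]}}(1,"x"))',
]

for sample in samples:
    with_group = re.search(WITH_GROUP, sample)
    without_group = re.search(WITHOUT_GROUP, sample)
    # Both variants should capture exactly the same named groups.
    print(with_group.groupdict() == without_group.groupdict())
```

The group wraps a single quantified atom and captures nothing, so removing it does not change what the pattern matches. Keeping it is a readability choice.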
