Support catchall for sub language without country #2167

Open
gradha opened this issue Jan 17, 2014 · 1 comment

@gradha gradha commented Jan 17, 2014

When the --sub-lang switch is given the value en, youtube-dl doesn't download subtitles labelled with a country code, such as en-US. See this session log:

  $ youtube-dl  -U
  Updating to version 2014.01.17.2...
  Updated youtube-dl. Restart youtube-dl to use the new version.
  $ youtube-dl  --skip-download --write-sub --sub-lang en WR9-GRQfEkU
  [youtube] Setting language
  [youtube] WR9-GRQfEkU: Downloading webpage
  [youtube] WR9-GRQfEkU: Downloading video info webpage
  [youtube] WR9-GRQfEkU: Extracting video information
  WARNING: no closed captions found in the specified language "en"
  [youtube] WR9-GRQfEkU: Encrypted signatures detected.
  $ youtube-dl  --skip-download --list-subs WR9-GRQfEkU
  [youtube] Setting language
  [youtube] WR9-GRQfEkU: Downloading webpage
  [youtube] WR9-GRQfEkU: Downloading video info webpage
  [youtube] WR9-GRQfEkU: Extracting video information
  [youtube] WR9-GRQfEkU: Looking for automatic captions
  [youtube] WR9-GRQfEkU: Downloading XML
  WARNING: Video doesn't have automatic captions
  [youtube] WR9-GRQfEkU: Available subtitles for video: en-US
  [youtube] WR9-GRQfEkU: Available automatic captions for video: 
  $

It would be great if, failing an exact match, the program downloaded the first entry that partially matches the user-specified value, and changed the warning to:

WARNING: no closed captions found in the specified language "en", found partial match for "en-US"

Or maybe add the partial match behaviour as an additional switch, in case it breaks behaviour people depend on (i.e. specifying a value and failing even if partial matches are found).
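
The fallback described above could be sketched as a standalone function. This is only an illustration, not youtube-dl code: the name pick_sub_lang and the allow_partial parameter are hypothetical, with allow_partial standing in for the optional switch. It also assumes a partial match means "requested value followed by a hyphen", so that en matches en-US but not an unrelated code that merely starts with the same letters.

```python
def pick_sub_lang(requested, available, allow_partial=True):
    """Return (language, warning) for a requested subtitle language.

    Hypothetical helper for illustration; not part of youtube-dl.
    `available` is any iterable of language codes, in priority order.
    """
    if requested in available:
        return requested, None  # exact match, nothing to warn about

    warning = 'no closed captions found in the specified language "%s"' % requested
    if allow_partial:
        for lang in available:
            # Assumption: only "<requested>-<something>" counts as partial.
            if lang.startswith(requested + '-'):
                return lang, warning + ', found partial match for "%s"' % lang

    # No usable entry: the caller should skip this language.
    return None, warning


print(pick_sub_lang('en', ['en-US']))
print(pick_sub_lang('en', ['en-US'], allow_partial=False))
```

With allow_partial=False the function keeps the current strict behaviour, which is what people relying on a hard failure would want.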

@gradha gradha commented Jan 17, 2014

Untested prototype algorithm follows:

diff --git a/youtube-dl b/youtube-dl
index e3eb877..1e06aaa 100755
Binary files a/youtube-dl and b/youtube-dl differ
diff --git a/youtube_dl/extractor/subtitles.py b/youtube_dl/extractor/subtitles.py
index 4b4c523..7ab94e6 100644
--- a/youtube_dl/extractor/subtitles.py
+++ b/youtube_dl/extractor/subtitles.py
@@ -51,8 +51,13 @@ class SubtitlesInfoExtractor(InfoExtractor):
             sub_lang_list = {}
             for sub_lang in requested_langs:
                 if not sub_lang in available_subs_list:
-                    self._downloader.report_warning(u'no closed captions found in the specified language "%s"' % sub_lang)
-                    continue
+                    partial_dict = dict([(x[:len(sub_lang)], x) for x in sorted(available_subs_list, reverse=True)])
+                    if sub_lang in partial_dict:
+                        self._downloader.report_warning(u'no closed captions found in the specified language "%s", found partial match for "%s"' % (sub_lang, partial_dict[sub_lang]))
+                        sub_lang = partial_dict[sub_lang]
+                    else:
+                        self._downloader.report_warning(u'no closed captions found in the specified language "%s"' % sub_lang)
+                        continue
                 sub_lang_list[sub_lang] = available_subs_list[sub_lang]

         subtitles = {}
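
The prefix lookup in the patch can be tried outside youtube-dl. A rough sketch, assuming the available languages arrive as a plain list (the sample codes below are made up): each entry is keyed by its first len(sub_lang) characters, and because later writes to a dict overwrite earlier ones, walking the list in reverse makes the earliest entry win.

```python
available = ['en-US', 'en-GB', 'es-ES']  # made-up sample data
sub_lang = 'en'

# Key each entry by its leading characters. Iterating in reverse means
# the first entry in `available` is written last and therefore kept.
partial_dict = dict((x[:len(sub_lang)], x) for x in reversed(available))

print(partial_dict)
print(partial_dict.get(sub_lang))
```

Note that slicing by length is a pure prefix test, so a two-letter request would also match any longer code that happens to start with the same two letters.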