--match-title or --reject-title doesn't work with cyrillic (non ASCII) coded argument #25507

Closed
greyboxgt opened this issue Jun 2, 2020 · 2 comments

@greyboxgt greyboxgt commented Jun 2, 2020

Checklist

  • I'm asking a question
  • I've looked through the README and FAQ for similar questions
  • I've searched the bugtracker for similar questions including closed ones

Question

I am trying to filter videos whose titles contain non-Latin (Cyrillic) characters. The keyword is also in Cyrillic. Here is my example command that doesn't work properly:

youtube-dl -v --playlist-items 3 --match-title "Смешарики" "https://www.youtube.com/user/TVSmeshariki/videos"

Verbose log

c:\bat>youtube-dl -v --playlist-items 3 --match-title "Смешарики" "https://www.youtube.com/user/TVSmeshariki/videos"
[debug] System config: []
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: ['-v', '--playlist-items', '3', '--match-title', 'Смешарики', 'https://www.youtube.com/user/TVSmeshariki/videos']
[debug] Encodings: locale cp1252, fs mbcs, out cp437, pref cp1252
[debug] youtube-dl version 2020.05.08
[debug] Python version 3.4.4 (CPython) - Windows-7-6.1.7601-SP1
[debug] exe versions: ffmpeg git-2020-05-15-b18fd2b
[debug] Proxy map: {}
[youtube:user] TVSmeshariki: Downloading channel page
[youtube:playlist] UU5A-Wp9ujcr5g9sYagAafEA: Downloading webpage
[download] Downloading playlist: Uploads from TVSmeshariki
[youtube:playlist] playlist Uploads from TVSmeshariki: Downloading 1 videos
[download] Downloading video 1 of 1
[download] "Свободный обмен - Смешарики Пинкод. Азбука финансовой грамотности |
ПРЕМЬЕРА 2020!" title did not match pattern "╨í╨╝╨╡╤ê╨░╤Ç╨╕╨║╨╕"
[download] Finished downloading playlist: Uploads from TVSmeshariki

What should I do to make --match-title or --reject-title work with Cyrillic characters?

@greyboxgt greyboxgt added the question label Jun 2, 2020
Collaborator

@dstftw dstftw commented Jun 2, 2020

Again: it does work perfectly with Cyrillic characters if you pass them properly.
You pass: --match-title "Смешарики", but it arrives as the garbled pattern "╨í╨╝╨╡╤ê╨░╤Ç╨╕╨║╨╕", as clearly seen in the last lines of the log. Fix that.
Also, if you're using it in a batch file (though you did not mention anything like that), the batch file's encoding must match your cmd's active code page, or you must set the code page with chcp.
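
For readers who want to see the mismatch concretely, here is a minimal Python sketch. It assumes (this is not confirmed anywhere in the thread) that the UTF-8 bytes of the pattern were decoded with cp437, the "out" encoding shown in the verbose log. The variable names are illustrative only.

```python
# Illustrative sketch: reproduce the garbled pattern from the verbose log.
intended = "Смешарики"

# Assumption: the pattern's UTF-8 bytes were decoded with the console's
# OEM code page (cp437, listed as the "out" encoding in the log).
garbled = intended.encode("utf-8").decode("cp437")

# The result is the same string the log reports as the pattern.
print(garbled == "╨í╨╝╨╡╤ê╨░╤Ç╨╕╨║╨╕")

# Reversing the wrong decoding step recovers the original text.
recovered = garbled.encode("cp437").decode("utf-8")
print(recovered == intended)
```

Under that assumption, the title never matches because the pattern youtube-dl receives is no longer Cyrillic text at all.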

@dstftw dstftw closed this Jun 2, 2020
@dstftw dstftw added the invalid label Jun 2, 2020
Author

@greyboxgt greyboxgt commented Jun 7, 2020

@dstftw: Thanks for the tip!

In case somebody else runs into a similar problem, here is my batch file for Windows:
TVSmeshariki.bat

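rem Switch the console to UTF-8 (code page 65001); save this file as UTF-8 so the Cyrillic pattern below is read correctly.
chcp 65001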
youtube-dl ^
--cookies cookies.txt ^
--download-archive archive-TVSmeshariki.txt ^
--playlist-items 1-5 ^
--reject-title "трейлер|сборник|серии" ^
-f bestvideo+bestaudio ^
--merge-output-format mkv ^
-o %%(title)s.%%(ext)s ^
-i ^
"https://www.youtube.com/user/TVSmeshariki/videos"
pause

Also, to avoid the "OSError: Failed to write string" error, configure your cmd (Command Prompt) properties to use a font that supports Cyrillic (Consolas TTF or Lucida Console TTF).
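
As a rough check of what the reject pattern in the batch file filters out, the sketch below applies it as a case-insensitive regular expression search. That youtube-dl matches titles this way is an assumption on my part, `is_rejected` is a hypothetical helper, and the sample titles are made up for illustration.

```python
import re

# The reject pattern from the batch file above: three alternatives.
reject_pattern = "трейлер|сборник|серии"


def is_rejected(title, pattern=reject_pattern):
    """Hypothetical helper: True if the pattern is found anywhere in the title.

    Assumes a case-insensitive regex search over the title.
    """
    return re.search(pattern, title, re.IGNORECASE) is not None


# Made-up titles, for illustration only.
titles = [
    "Смешарики - Трейлер нового сезона",
    "Смешарики - Свободный обмен",
    "Сборник лучших серий",
]

# One boolean per title, in the same order as the list above.
results = [is_rejected(title) for title in titles]
print(results)
```

Note that the pattern is a set of alternatives, so a title is rejected as soon as any one of the three words appears in it.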
