[extractor/biliIntl] Fix subtitle extraction and add more subtitles #7077
This code is adapted from @dirkf's code in yt-dlp#6664 (comment)
`[debug] Command-line config: ['--downloader', 'aria2c', '--add-header', 'Referer [download] Finished downloading playlist: Keijo!!!!!!!!` |
Merged with @itachi-19's PR, but my changes are untested. Please test.
Seems fine with this link.
Log: Successful subtitle extraction and merge process
> python -m yt_dlp -v --embed-subs --sub-langs all "https://www.bilibili.tv/en/play/34580/340314?bstar_from=bstar-web.pgc-video-detail.episode.all"
[debug] Command-line config: ['-v', '--embed-subs', '--sub-langs', 'all', 'https://www.bilibili.tv/en/play/34580/340314?bstar_from=bstar-web.pgc-video-detail.episode.all']
[debug] Encodings: locale cp1252, fs utf-8, pref cp1252, out utf-8, error utf-8, screen utf-8
[debug] yt-dlp version stable@2023.06.21 [42f2d40b4] (source)
[debug] Lazy loading extractors is disabled
[debug] Git HEAD: 274b282d0
[debug] Python 3.11.1 (CPython AMD64 64bit) - Windows-10-10.0.22621-SP0 (OpenSSL 1.1.1q 5 Jul 2022)
[debug] exe versions: ffmpeg 5.1.2-full_build-www.gyan.dev (setts), ffprobe 5.1.2-full_build-www.gyan.dev
[debug] Optional libraries: Cryptodome-3.16.0, brotli-1.0.9, certifi-2022.12.07, mutagen-1.46.0, sqlite3-2.6.0, websockets-10.4
[debug] Proxy map: {}
[debug] Loaded 1849 extractors
[BiliIntl] Extracting URL: https://www.bilibili.tv/en/play/34580/340314?bstar_from=bstar-web.pgc-video-detail.episode.all
[BiliIntl] 340314: Downloading JSON metadata
[BiliIntl] 340314: Downloading webpage
[BiliIntl] 340314: Downloading JSON metadata
[BiliIntl] 340314: Downloading video formats
[BiliIntl] 340314: Downloading subtitles list
[BiliIntl] 340314: Downloading subtitlesfor English (en)
[BiliIntl] 340314: Downloading subtitlesfor ภาษาไทย (th)
[BiliIntl] 340314: Downloading subtitlesfor Tiếng Việt (vi)
[BiliIntl] 340314: Downloading subtitlesfor Bahasa Indonesia (id)
[BiliIntl] 340314: Downloading subtitlesfor Bahasa Melayu (ms)
[BiliIntl] 340314: Downloading subtitlesfor 中文(简体) (zh-Hans)
[BiliIntl] 340314: Downloading subtitlesfor 中文(繁体) (zh-Hant)
[BiliIntl] 340314: Downloading subtitlesfor English (en)
[BiliIntl] 340314: Downloading subtitlesfor ภาษาไทย (th)
[BiliIntl] 340314: Downloading subtitlesfor Tiếng Việt (vi)
[BiliIntl] 340314: Downloading subtitlesfor Bahasa Indonesia (id)
[BiliIntl] 340314: Downloading subtitlesfor Bahasa Melayu (ms)
[BiliIntl] 340314: Downloading subtitlesfor 中文(简体) (zh-Hans)
[BiliIntl] 340314: Downloading subtitlesfor 中文(繁体) (zh-Hant)
[info] 340314: Downloading subtitles: en, th, vi, id, ms, zh-Hans, zh-Hant
[debug] Formats sorted by: hasvid, ie_pref, lang, quality, res, fps, hdr:12(7), vcodec:vp9.2(10), channels, acodec, size, br, asr, proto, vext, aext, hasaud, source, id
[debug] Default format spec: bestvideo*+bestaudio/best
[info] 340314: Downloading 1 format(s): 10+2
[info] Writing video subtitles to: E2 - Trainer Sakonji Urokodaki [340314].en.srt
[info] Writing video subtitles to: E2 - Trainer Sakonji Urokodaki [340314].th.srt
[info] Writing video subtitles to: E2 - Trainer Sakonji Urokodaki [340314].vi.srt
[info] Writing video subtitles to: E2 - Trainer Sakonji Urokodaki [340314].id.srt
[info] Writing video subtitles to: E2 - Trainer Sakonji Urokodaki [340314].ms.srt
[info] Writing video subtitles to: E2 - Trainer Sakonji Urokodaki [340314].zh-Hans.srt
[info] Writing video subtitles to: E2 - Trainer Sakonji Urokodaki [340314].zh-Hant.srt
[debug] Invoking http downloader on "https://upos-bstar1-mirrorakam.akamaized.net/iupxcodeboss/q1/zp/n230227qn3lkhfd7mx2w4v243y7xzpq1-1-231220110000.m4s?e=ig8euxZM2rNcNbdlhoNvNC8BqJIzNbfqXBvEqxTEto8BTrNvN0GvT90W5JZMkX_YN0MvXg8gNEV4NC8xNEV4N03eN0B5tZlqNxTEto8BTrNvNeZVuJ10Kj_g2UB02J0mN0B5tZlqNCNEto8BTrNvNC7MTX502C8f2jmMQJ6mqF2fka1mqx6gqj0eN0B599M=&uipk=5&nbs=1&deadline=1687407609&gen=playurlv2&os=akam&oi=2107831674&trid=ecfddd48a36f4ad1ab5c87c497535cb1i&mid=0&platform=pc&upsig=2274ada6e4a0c3fa7e5ff281980230da&uparams=e,uipk,nbs,deadline,gen,os,oi,trid,mid,platform&hdnts=exp=1687407609~hmac=1b46a61bcf939b73ec66fb5c196d8e619604b991d27559a8f2026a9127f5e3c1&bvc=vod&nettype=0&orderid=0,2&logo=00000000"
[debug] File locking is not supported. Proceeding without locking
[download] Destination: E2 - Trainer Sakonji Urokodaki [340314].f10.mp4
[download] 100% of 24.13MiB in 00:00:09 at 2.64MiB/s
[debug] Invoking http downloader on "https://upos-bstar1-mirrorakam.akamaized.net/iupxcodeboss/q1/zp/n230227qn3lkhfd7mx2w4v243y7xzpq1-1-2d1301000023.m4s?e=ig8euxZM2rNcNbdlhoNvNC8BqJIzNbfqXBvEqxTEto8BTrNvN0GvT90W5JZMkX_YN0MvXg8gNEV4NC8xNEV4N03eN0B5tZlqNxTEto8BTrNvNeZVuJ10Kj_g2UB02J0mN0B5tZlqNCNEto8BTrNvNC7MTX502C8f2jmMQJ6mqF2fka1mqx6gqj0eN0B599M=&uipk=5&nbs=1&deadline=1687407609&gen=playurlv2&os=akam&oi=2107831674&trid=0dc6754df6d347f8b08d544c9f36a09fi&mid=0&platform=pc&upsig=c0a6c9f52e7276a74f11bfc22450858b&uparams=e,uipk,nbs,deadline,gen,os,oi,trid,mid,platform&hdnts=exp=1687407609~hmac=c1635f1348f053e1cd113db09b5af9cba190b1573d74ab074ede0de9d33149c0&bvc=vod&nettype=0&orderid=0,2&logo=00000000"
[download] Destination: E2 - Trainer Sakonji Urokodaki [340314].f2.mp4
[download] 100% of 29.12MiB in 00:00:09 at 2.92MiB/s
[Merger] Merging formats into "E2 - Trainer Sakonji Urokodaki [340314].mp4"
[debug] ffmpeg command line: ffmpeg -y -loglevel "repeat+info" -i "file:E2 - Trainer Sakonji Urokodaki [340314].f10.mp4" -i "file:E2 - Trainer Sakonji Urokodaki [340314].f2.mp4" -c copy -map "0:v:0" -map "1:a:0" -movflags "+faststart" "file:E2 - Trainer Sakonji Urokodaki [340314].temp.mp4"
Deleting original file E2 - Trainer Sakonji Urokodaki [340314].f2.mp4 (pass -k to keep)
Deleting original file E2 - Trainer Sakonji Urokodaki [340314].f10.mp4 (pass -k to keep)
[EmbedSubtitle] Embedding subtitles in "E2 - Trainer Sakonji Urokodaki [340314].mp4"
[debug] ffmpeg command line: ffmpeg -y -loglevel "repeat+info" -i "file:E2 - Trainer Sakonji Urokodaki [340314].mp4" -i "file:E2 - Trainer Sakonji Urokodaki [340314].en.srt" -i "file:E2 - Trainer Sakonji Urokodaki [340314].th.srt" -i "file:E2 - Trainer Sakonji Urokodaki [340314].vi.srt" -i "file:E2 - Trainer Sakonji Urokodaki [340314].id.srt" -i "file:E2 - Trainer Sakonji Urokodaki [340314].ms.srt" -i "file:E2 - Trainer Sakonji Urokodaki [340314].zh-Hans.srt" -i "file:E2 - Trainer Sakonji Urokodaki [340314].zh-Hant.srt" -map 0 -dn -ignore_unknown -c copy "-c:s" mov_text -map "-0:s" -map "1:0" "-metadata:s:s:0" "language=eng" -map "2:0" "-metadata:s:s:1" "language=tha" -map "3:0" "-metadata:s:s:2" "language=vie" -map "4:0" "-metadata:s:s:3" "language=ind" -map "5:0" "-metadata:s:s:4" "language=msa" -map "6:0" "-metadata:s:s:5" "language=zho" -map "7:0" "-metadata:s:s:6" "language=zho" -movflags "+faststart" "file:E2 - Trainer Sakonji Urokodaki [340314].temp.mp4"
Deleting original file E2 - Trainer Sakonji Urokodaki [340314].zh-Hant.srt (pass -k to keep)
Deleting original file E2 - Trainer Sakonji Urokodaki [340314].ms.srt (pass -k to keep)
Deleting original file E2 - Trainer Sakonji Urokodaki [340314].id.srt (pass -k to keep)
Deleting original file E2 - Trainer Sakonji Urokodaki [340314].vi.srt (pass -k to keep)
Deleting original file E2 - Trainer Sakonji Urokodaki [340314].en.srt (pass -k to keep)
Deleting original file E2 - Trainer Sakonji Urokodaki [340314].zh-Hans.srt (pass -k to keep)
Deleting original file E2 - Trainer Sakonji Urokodaki [340314].th.srt (pass -k to keep)
However, the subtitle failed to merge for this URL.
> curl "https://s.bstarstatic.com/ogv/subtitle/565ec14f73d7a761df88333ae7a9e3165831fb4d.json?auth_key=1687572088-0-0-21e803cee53bbce11612344f34c90043"
{"font_size":0.4,"font_color":"#FFFFFF","background_alpha":0.5,"background_color":"#9C27B0","Stroke":"none","body":[{"from":0,"to":5,"location":2,"content":"."}]}
This subtitle fails to merge in ffmpeg.
Log: Failed subtitle merge process
> python -m yt_dlp -v --embed-subs --sub-langs all "https://www.bilibili.tv/en/play/2079519/12766944"
[debug] Command-line config: ['-v', '--embed-subs', '--sub-langs', 'all', 'https://www.bilibili.tv/en/play/2079519/12766944']
[debug] Encodings: locale cp1252, fs utf-8, pref cp1252, out utf-8, error utf-8, screen utf-8
[debug] yt-dlp version stable@2023.06.21 [42f2d40b4] (source)
[debug] Lazy loading extractors is disabled
[debug] Git HEAD: 274b282d0
[debug] Python 3.11.1 (CPython AMD64 64bit) - Windows-10-10.0.22621-SP0 (OpenSSL 1.1.1q 5 Jul 2022)
[debug] exe versions: ffmpeg 5.1.2-full_build-www.gyan.dev (setts), ffprobe 5.1.2-full_build-www.gyan.dev
[debug] Optional libraries: Cryptodome-3.16.0, brotli-1.0.9, certifi-2022.12.07, mutagen-1.46.0, sqlite3-2.6.0, websockets-10.4
[debug] Proxy map: {}
[debug] Loaded 1849 extractors
[BiliIntl] Extracting URL: https://www.bilibili.tv/en/play/2079519/12766944
[BiliIntl] 12766944: Downloading JSON metadata
[BiliIntl] 12766944: Downloading webpage
[BiliIntl] 12766944: Downloading JSON metadata
[BiliIntl] 12766944: Downloading video formats
[BiliIntl] 12766944: Downloading subtitles list
[BiliIntl] 12766944: Downloading subtitlesfor English (en)
[BiliIntl] 12766944: Downloading subtitlesfor ภาษาไทย (th)
[BiliIntl] 12766944: Downloading subtitlesfor Tiếng Việt (vi)
[BiliIntl] 12766944: Downloading subtitlesfor Bahasa Indonesia (id)
[BiliIntl] 12766944: Downloading subtitlesfor 中文(简体) (zh-Hans)
[BiliIntl] 12766944: Downloading subtitlesfor English (en)
[BiliIntl] 12766944: Downloading subtitlesfor ภาษาไทย (th)
[BiliIntl] 12766944: Downloading subtitlesfor Tiếng Việt (vi)
[BiliIntl] 12766944: Downloading subtitlesfor Bahasa Indonesia (id)
[BiliIntl] 12766944: Downloading subtitlesfor 中文(简体) (zh-Hans)
[info] 12766944: Downloading subtitles: en, th, id, vi, zh-Hans
[debug] Formats sorted by: hasvid, ie_pref, lang, quality, res, fps, hdr:12(7), vcodec:vp9.2(10), channels, acodec, size, br, asr, proto, vext, aext, hasaud, source, id
[debug] Default format spec: bestvideo*+bestaudio/best
[info] 12766944: Downloading 1 format(s): 10+2
Deleting existing file E3 - Fluffy and Sparkly [12766944].en.srt
[info] Writing video subtitles to: E3 - Fluffy and Sparkly [12766944].en.srt
Deleting existing file E3 - Fluffy and Sparkly [12766944].th.srt
[info] Writing video subtitles to: E3 - Fluffy and Sparkly [12766944].th.srt
Deleting existing file E3 - Fluffy and Sparkly [12766944].id.srt
[info] Writing video subtitles to: E3 - Fluffy and Sparkly [12766944].id.srt
Deleting existing file E3 - Fluffy and Sparkly [12766944].vi.srt
[info] Writing video subtitles to: E3 - Fluffy and Sparkly [12766944].vi.srt
Deleting existing file E3 - Fluffy and Sparkly [12766944].zh-Hans.srt
[info] Writing video subtitles to: E3 - Fluffy and Sparkly [12766944].zh-Hans.srt
[download] E3 - Fluffy and Sparkly [12766944].mp4 has already been downloaded
[EmbedSubtitle] Embedding subtitles in "E3 - Fluffy and Sparkly [12766944].mp4"
[debug] ffmpeg command line: ffmpeg -y -loglevel "repeat+info" -i "file:E3 - Fluffy and Sparkly [12766944].mp4" -i "file:E3 - Fluffy and Sparkly [12766944].en.srt" -i "file:E3 - Fluffy and Sparkly [12766944].th.srt" -i "file:E3 - Fluffy and Sparkly [12766944].id.srt" -i "file:E3 - Fluffy and Sparkly [12766944].vi.srt" -i "file:E3 - Fluffy and Sparkly [12766944].zh-Hans.srt" -map 0 -dn -ignore_unknown -c copy "-c:s" mov_text -map "-0:s" -map "1:0" "-metadata:s:s:0" "language=eng" -map "2:0" "-metadata:s:s:1" "language=tha" -map "3:0" "-metadata:s:s:2" "language=ind" -map "4:0" "-metadata:s:s:3" "language=vie" -map "5:0" "-metadata:s:s:4" "language=zho" -movflags "+faststart" "file:E3 - Fluffy and Sparkly [12766944].temp.mp4"
[debug] ffmpeg version 5.1.2-full_build-www.gyan.dev Copyright (c) 2000-2022 the FFmpeg developers
built with gcc 12.1.0 (Rev2, Built by MSYS2 project)
configuration: --enable-gpl --enable-version3 --enable-static --disable-w32threads --disable-autodetect --enable-fontconfig --enable-iconv --enable-gnutls --enable-libxml2 --enable-gmp --enable-bzlib --enable-lzma --enable-libsnappy --enable-zlib --enable-librist --enable-libsrt --enable-libssh --enable-libzmq --enable-avisynth --enable-libbluray --enable-libcaca --enable-sdl2 --enable-libaribb24 --enable-libdav1d --enable-libdavs2 --enable-libuavs3d --enable-libzvbi --enable-librav1e --enable-libsvtav1 --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxavs2 --enable-libxvid --enable-libaom --enable-libjxl --enable-libopenjpeg --enable-libvpx --enable-mediafoundation --enable-libass --enable-frei0r --enable-libfreetype --enable-libfribidi --enable-liblensfun --enable-libvidstab --enable-libvmaf --enable-libzimg --enable-amf --enable-cuda-llvm --enable-cuvid --enable-ffnvcodec --enable-nvdec --enable-nvenc --enable-d3d11va --enable-dxva2 --enable-libmfx --enable-libshaderc --enable-vulkan --enable-libplacebo --enable-opencl --enable-libcdio --enable-libgme --enable-libmodplug --enable-libopenmpt --enable-libopencore-amrwb --enable-libmp3lame --enable-libshine --enable-libtheora --enable-libtwolame --enable-libvo-amrwbenc --enable-libilbc --enable-libgsm --enable-libopencore-amrnb --enable-libopus --enable-libspeex --enable-libvorbis --enable-ladspa --enable-libbs2b --enable-libflite --enable-libmysofa --enable-librubberband --enable-libsoxr --enable-chromaprint
libavutil 57. 28.100 / 57. 28.100
libavcodec 59. 37.100 / 59. 37.100
libavformat 59. 27.100 / 59. 27.100
libavdevice 59. 7.100 / 59. 7.100
libavfilter 8. 44.100 / 8. 44.100
libswscale 6. 7.100 / 6. 7.100
libswresample 4. 7.100 / 4. 7.100
libpostproc 56. 6.100 / 56. 6.100
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'file:E3 - Fluffy and Sparkly [12766944].mp4':
Metadata:
major_brand : isom
minor_version : 512
compatible_brands: isomiso2avc1mp41
encoder : Lavf59.27.100
description : Packed by Bilibili XCoder v2.0.2
Duration: 00:23:40.18, start: 0.000000, bitrate: 364 kb/s
Stream #0:0[0x1](und): Video: h264 (High) (avc1 / 0x31637661), yuv420p(tv, bt709, progressive), 852x480 [SAR 640:639 DAR 16:9], 157 kb/s, 23.98 fps, 23.98 tbr, 16k tbn (default)
Metadata:
handler_name : VideoHandler
vendor_id : [0][0][0][0]
Stream #0:1[0x2](und): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, stereo, fltp, 199 kb/s (default)
Metadata:
handler_name : SoundHandler
vendor_id : [0][0][0][0]
Input #1, srt, from 'file:E3 - Fluffy and Sparkly [12766944].en.srt':
Duration: N/A, bitrate: N/A
Stream #1:0: Subtitle: subrip
Input #2, srt, from 'file:E3 - Fluffy and Sparkly [12766944].th.srt':
Duration: N/A, bitrate: N/A
Stream #2:0: Subtitle: subrip
Input #3, srt, from 'file:E3 - Fluffy and Sparkly [12766944].id.srt':
Duration: N/A, bitrate: N/A
Stream #3:0: Subtitle: subrip
file:E3 - Fluffy and Sparkly [12766944].vi.srt: Invalid data found when processing input
ERROR: Postprocessing: file:E3 - Fluffy and Sparkly [12766944].vi.srt: Invalid data found when processing input
Traceback (most recent call last):
File "C:\Users\HobbyistDev\Documents\Program\python_proj\yt-dlp\yt_dlp\YoutubeDL.py", line 3361, in process_info
replace_info_dict(self.post_process(dl_filename, info_dict, files_to_move))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\HobbyistDev\Documents\Program\python_proj\yt-dlp\yt_dlp\YoutubeDL.py", line 3541, in post_process
info = self.run_all_pps('post_process', info, additional_pps=info.get('__postprocessors'))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\HobbyistDev\Documents\Program\python_proj\yt-dlp\yt_dlp\YoutubeDL.py", line 3523, in run_all_pps
info = self.run_pp(pp, info)
^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\HobbyistDev\Documents\Program\python_proj\yt-dlp\yt_dlp\YoutubeDL.py", line 3501, in run_pp
files_to_delete, infodict = pp.run(infodict)
^^^^^^^^^^^^^^^^
File "C:\Users\HobbyistDev\Documents\Program\python_proj\yt-dlp\yt_dlp\postprocessor\common.py", line 24, in run
ret = func(self, info, *args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\HobbyistDev\Documents\Program\python_proj\yt-dlp\yt_dlp\postprocessor\common.py", line 129, in wrapper
return func(self, info)
^^^^^^^^^^^^^^^^
File "C:\Users\HobbyistDev\Documents\Program\python_proj\yt-dlp\yt_dlp\postprocessor\ffmpeg.py", line 662, in run
self.run_ffmpeg_multiple_files(input_files, temp_filename, opts)
File "C:\Users\HobbyistDev\Documents\Program\python_proj\yt-dlp\yt_dlp\postprocessor\ffmpeg.py", line 329, in run_ffmpeg_multiple_files
return self.real_run_ffmpeg(
^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\HobbyistDev\Documents\Program\python_proj\yt-dlp\yt_dlp\postprocessor\ffmpeg.py", line 367, in real_run_ffmpeg
raise FFmpegPostProcessorError(stderr.strip().splitlines()[-1])
yt_dlp.postprocessor.ffmpeg.FFmpegPostProcessorError: file:E3 - Fluffy and Sparkly [12766944].vi.srt: Invalid data found when processing input
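For reference, the subtitle JSON returned by the `curl` call above carries a `body` list of cues, each with `from` and `to` (apparently in seconds) and `content`. Below is a minimal sketch of turning such a payload into SRT text, assuming that layout. `srt_timestamp` and `json_to_srt` are hypothetical helper names for illustration, not the extractor's actual code.

```python
import json


def srt_timestamp(seconds):
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f'{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}'


def json_to_srt(payload):
    """Build SRT text from a decoded subtitle JSON object (hypothetical helper)."""
    blocks = []
    # Assumption: every cue has 'from', 'to' and 'content' keys, as in the sample.
    for index, cue in enumerate(payload.get('body') or [], start=1):
        blocks.append(
            f'{index}\n'
            f'{srt_timestamp(cue["from"])} --> {srt_timestamp(cue["to"])}\n'
            f'{cue["content"]}\n')
    # A blank line separates consecutive cues.
    return '\n'.join(blocks)


# Trimmed copy of the payload shown in the curl output above.
sample = json.loads('{"body":[{"from":0,"to":5,"location":2,"content":"."}]}')
print(json_to_srt(sample))
```

The thread does not establish why ffmpeg rejects the `vi` file. The sketch only shows what a one-cue payload like the one above looks like once written as SRT.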
Is the failure due to my changes, or unrelated?
I don't believe it is caused by your change; the same error appears even on an older commit.
I've tested @HobbyistDev's yt-dlp bilibili-new-subtitle branch, and here's the output.
It looks good to me...
There is also a slight merge conflict due to the imports.
Closes yt-dlp#7075, Closes yt-dlp#6664
Authored by: HobbyistDev, itachi-19, dirkf, seproDev
Co-authored-by: itachi-19 <16500619+itachi-19@users.noreply.github.com>
Co-authored-by: dirkf <fieldhouse@gmx.net>
Co-authored-by: sepro <4618135+seproDev@users.noreply.github.com>
Description of your pull request and other information
This PR fixes the BOM issue for subtitles by adapting the code posted by @dirkf at #6664 (comment) (thanks a lot), and fixes the 403 error when downloading by adding a Referer header. It also adds more subtitles in `video_subtitle`, as mentioned in #6664 (comment) and #7075. At the moment, I haven't written any new tests because I don't know how to write a test that checks the available subtitle formats.
Fixes #7075 #6664 #6640
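To illustrate the BOM part of the description, here is a minimal sketch, assuming the subtitle response arrives as UTF-8 bytes that may start with a byte order mark. `parse_subtitle_json` is a hypothetical helper, not the code from this PR. Decoding with `utf-8-sig` drops a leading BOM when present and otherwise behaves like plain UTF-8.

```python
import codecs
import json


def parse_subtitle_json(raw):
    """Decode subtitle bytes, dropping a leading UTF-8 BOM if present, then parse JSON."""
    # 'utf-8-sig' strips the BOM when it exists and acts like 'utf-8' otherwise.
    return json.loads(raw.decode('utf-8-sig'))


# Illustrative inputs only: the same payload with and without a BOM.
with_bom = codecs.BOM_UTF8 + b'{"body": []}'
without_bom = b'{"body": []}'
print(parse_subtitle_json(with_bom) == parse_subtitle_json(without_bom))
```

The Referer part of the fix is not sketched here, because the header value used is not shown in full in this thread.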