Proposal: Embed JSON subtitles as file attachments #9957

Open
8 of 9 tasks
Riteo opened this issue May 18, 2024 · 3 comments · May be fixed by #9991
Labels
enhancement New feature or request

Comments

@Riteo
Contributor

Riteo commented May 18, 2024

DO NOT REMOVE OR SKIP THE ISSUE TEMPLATE

  • I understand that I will be blocked if I intentionally remove or skip any mandatory* field

Checklist

Provide a description that is worded well enough to be understood

Right now, at least with YouTube, live chats are exported as a JSON "subtitle" file. While IMO calling it a subtitle is not wrong per se, it obviously can't be embedded into the subtitle track of a video, as the format is YouTube-specific.

That said, embedding it as an attachment would still be pretty useful IMO for purposes such as archival (which is my use case). AFAICT it's the only auxiliary file keeping us from having full-fledged single-file downloads, which I believe would be excellent for archival.

Right now, the file is kept alongside the video, announced by a warning:

if sub_ext == 'json':
    self.report_warning('JSON subtitles cannot be embedded')

Instead, I propose special-casing this file type so that it is embedded as a regular file attachment, provided the container in use supports attachments (Matroska does, AFAIK).
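To make the proposed behaviour concrete, here is a minimal standalone sketch of the decision. It is not yt-dlp code: the function name `plan_subtitle_embedding`, the `ATTACHMENT_CAPABLE_EXTS` set, and the returned dictionary layout are all hypothetical, and the only attachment-capable container assumed is mkv.

```python
from pathlib import Path

# Assumption: only Matroska (.mkv) is treated as able to carry attachments.
ATTACHMENT_CAPABLE_EXTS = {'mkv'}


def plan_subtitle_embedding(sub_files, container_ext):
    """Sort subtitle files into tracks to embed, files to attach, and files to keep.

    Hypothetical helper for illustration; yt-dlp does not expose this function.
    """
    plan = {'embed': [], 'attach': [], 'keep': []}
    for sub_file in sub_files:
        sub_ext = Path(sub_file).suffix.lstrip('.').lower()
        if sub_ext != 'json':
            # Regular subtitle formats go into a subtitle track as before.
            plan['embed'].append(sub_file)
        elif container_ext.lower() in ATTACHMENT_CAPABLE_EXTS:
            # Proposed special case: attach the JSON file to the container.
            plan['attach'].append(sub_file)
        else:
            # Current behaviour: the file stays alongside the video.
            plan['keep'].append(sub_file)
    return plan
```

With the two subtitle files from the log below, a webm output would leave the live chat file in `keep`, while an mkv output would move it to `attach`.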

I have no idea how the codebase is actually structured, but hopefully it wouldn't be too messy an affair. I'm willing to try this myself if there's consensus, as I need it and would otherwise have to write a post-processor anyway.

Provide verbose output that clearly demonstrates the problem

  • Run your yt-dlp command with -vU flag added (yt-dlp -vU <your command line>)
  • If using API, add 'verbose': True to YoutubeDL params instead
  • Copy the WHOLE output (starting with [debug] Command-line config) and insert it below
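For API users, the command line in the log below could be mirrored roughly as follows. This is a sketch under assumptions: the option names `writesubtitles`, `subtitleslangs`, and the `FFmpegEmbedSubtitle` post-processor key are my best understanding of how `--embed-subs --sub-lang all` maps onto `YoutubeDL` params, the `run` wrapper is hypothetical, and `-U` (the update check) has no equivalent here.

```python
# Options intended to approximate: yt-dlp -v --embed-subs --sub-lang all <URL>
ydl_opts = {
    'verbose': True,                  # the API counterpart of -v
    'writesubtitles': True,           # subtitles must be downloaded before embedding
    'subtitleslangs': ['all'],        # assumed counterpart of --sub-lang all
    'postprocessors': [{'key': 'FFmpegEmbedSubtitle'}],
}


def run(url):
    """Hypothetical wrapper; imports yt-dlp lazily so the options can be inspected without it."""
    import yt_dlp

    with yt_dlp.YoutubeDL(ydl_opts) as ydl:
        return ydl.download([url])
```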

Complete Verbose Output

[debug] Command-line config: ['-vU', '--no-config', '--embed-subs', '--sub-lang', 'all', 'https://www.youtube.com/watch?v=EmqFoojMeD8']
[debug] Encodings: locale UTF-8, fs utf-8, pref UTF-8, out utf-8, error utf-8, screen utf-8
[debug] yt-dlp version stable@2024.04.09 from yt-dlp/yt-dlp [ff0779267] (zip)
[debug] Python 3.12.3 (CPython x86_64 64bit) - Linux-6.9.0-x86_64-with-libc (OpenSSL 3.3.0 9 Apr 2024, libc)
[debug] exe versions: ffmpeg 6.1.1 (setts), ffprobe 6.1.1
[debug] Optional libraries: certifi-2024.02.02, curl_cffi-0.6.4 (unsupported), mutagen-1.47.0, requests-2.28.2, sqlite3-3.45.3, urllib3-1.26.15
[debug] Proxy map: {}
[debug] Request Handlers: urllib
[debug] Loaded 1803 extractors
[debug] Fetching release info: https://api.github.com/repos/yt-dlp/yt-dlp/releases/latest
Latest version: stable@2024.04.09 from yt-dlp/yt-dlp
yt-dlp is up to date (stable@2024.04.09 from yt-dlp/yt-dlp)
[youtube] Extracting URL: https://www.youtube.com/watch?v=EmqFoojMeD8
[youtube] EmqFoojMeD8: Downloading webpage
[youtube] EmqFoojMeD8: Downloading ios player API JSON
[debug] Loading youtube-nsig.018e9916 from cache
[debug] [youtube] Decrypted nsig za2abuZT6gFDyjgm => pMKSTacJbOpPtw
[debug] Loading youtube-nsig.018e9916 from cache
[debug] [youtube] Decrypted nsig L4KlgskfSYGw8Z2g => u8knrSqo5fD9GQ
[youtube] EmqFoojMeD8: Downloading m3u8 information
[info] EmqFoojMeD8: Downloading subtitles: en, live_chat
[debug] Sort order given by extractor: quality, res, fps, hdr:12, source, vcodec:vp9.2, channels, acodec, lang, proto
[debug] Formats sorted by: hasvid, ie_pref, quality, res, fps, hdr:12(7), source, vcodec:vp9.2(10), channels, acodec, lang, proto, size, br, asr, vext, aext, hasaud, id
[debug] Default format spec: bestvideo*+bestaudio/best
[info] EmqFoojMeD8: Downloading 1 format(s): 303+251
[info] Writing video subtitles to: SM64’s Unopenable Door Has Finally Been Opened! [EmqFoojMeD8].en.vtt
[debug] Invoking http downloader on "https://www.youtube.com/api/timedtext?v=EmqFoojMeD8&ei=VuZMZpPCG6y_6dsPnfGyiAk&caps=asr&opi=112496729&xoaf=5&hl=en&ip=0.0.0.0&ipbits=0&expire=1716340934&sparams=ip%2Cipbits%2Cexpire%2Cv%2Cei%2Ccaps%2Copi%2Cxoaf&signature=675EAE5541E8B166C2CC3FCCF385AD0E04D8879D.6A4FF529A4DBBACD5C49C757BF37AF908AD68B00&key=yt8&lang=en&fmt=vtt"
[download] Destination: SM64’s Unopenable Door Has Finally Been Opened! [EmqFoojMeD8].en.vtt
[download] 100% of   19.72KiB in 00:00:01 at 16.15KiB/s
[info] Writing video subtitles to: SM64’s Unopenable Door Has Finally Been Opened! [EmqFoojMeD8].live_chat.json
[debug] Invoking youtube_live_chat downloader on "https://www.youtube.com/watch?v=EmqFoojMeD8&bpctr=9999999999&has_verified=1"
[youtube_live_chat] Downloading live chat
[youtube_live_chat] Total fragments: unknown (live)
[download] Destination: SM64’s Unopenable Door Has Finally Been Opened! [EmqFoojMeD8].live_chat.json
[download] 100% of    4.80MiB in 00:00:20 at 234.36KiB/s
[debug] Invoking http downloader on "https://rr3---sn-uxaxpu5ap5-o52l.googlevideo.com/videoplayback?expire=1716337336&ei=WOZMZpWjFJ7Qi9oPxK-Y8AU&ip=RE.DA.CT.ED&id=o-APRd-OSyd9KVO89lChHTf-kQynTEtWw-1HNfSUb8a-Ty&itag=303&source=youtube&requiressl=yes&xpc=EgVo2aDSNQ%3D%3D&mh=UR&mm=31%2C29&mn=sn-uxaxpu5ap5-o52l%2Csn-hpa7knle&ms=au%2Crdu&mv=m&mvi=3&pl=24&initcwndbps=1763750&vprv=1&svpuc=1&mime=video%2Fwebm&rqh=1&gir=yes&clen=249120735&dur=708.132&lmt=1715845906383279&mt=1716315421&fvip=3&keepalive=yes&c=IOS&txp=4535434&sparams=expire%2Cei%2Cip%2Cid%2Citag%2Csource%2Crequiressl%2Cxpc%2Cvprv%2Csvpuc%2Cmime%2Crqh%2Cgir%2Cclen%2Cdur%2Clmt&sig=AJfQdSswRQIgQ7fV4ahnmgwokLV5Y_unc21QWb0yu2Ohcyo92WQh11kCIQD_3EzaqA-u1eI2AsJO4gHMmH_F1Rq29XUKnJ4-_Z485A%3D%3D&lsparams=mh%2Cmm%2Cmn%2Cms%2Cmv%2Cmvi%2Cpl%2Cinitcwndbps&lsig=AHWaYeowRQIgRmOyrasXedsSzXfUlWIed0fqBkWb0ow6OxHDiEW5nLACIQDbf18xSwTi0blOFxaMfN1UkQLKZp1Zcg9tJl0TpbX5gg%3D%3D"
[download] Destination: SM64’s Unopenable Door Has Finally Been Opened! [EmqFoojMeD8].f303.webm
[download] 100% of  237.58MiB in 00:03:02 at 1.30MiB/s
[debug] Invoking http downloader on "https://rr3---sn-uxaxpu5ap5-o52l.googlevideo.com/videoplayback?expire=1716337334&ei=VuZMZpPCG6y_6dsPnfGyiAk&ip=RE.DA.CT.ED&id=o-ANU1KlIDDHK0UUa1BzTxcU9J9wN9b2Q0lKBjZmxwIlO4&itag=251&source=youtube&requiressl=yes&xpc=EgVo2aDSNQ%3D%3D&mh=UR&mm=31%2C29&mn=sn-uxaxpu5ap5-o52l%2Csn-hpa7znzr&ms=au%2Crdu&mv=m&mvi=3&pl=24&initcwndbps=1763750&bui=AWRWj2RI05STGnFdv1EOmmOo08ifKybu0jpLNCVQwkSSBLY2XKRra08bxIeY4EYm0R_hjg4LhYhZHBhf&spc=UWF9f0Ym1xrbq-g8YEFciHaOx-IG4ueDQUnR2V7xU1Icu98lWqohO-L9h3L7&vprv=1&svpuc=1&mime=audio%2Fwebm&ns=JtJ_PZL15fApM0UPs79fZUIQ&rqh=1&gir=yes&clen=12380580&dur=708.201&lmt=1715836960775213&mt=1716315421&fvip=3&keepalive=yes&c=WEB&sefc=1&txp=4532434&n=u8knrSqo5fD9GQ&sparams=expire%2Cei%2Cip%2Cid%2Citag%2Csource%2Crequiressl%2Cxpc%2Cbui%2Cspc%2Cvprv%2Csvpuc%2Cmime%2Cns%2Crqh%2Cgir%2Cclen%2Cdur%2Clmt&sig=AJfQdSswRAIgfM2WBLr8aC-nyzmRwUkIvfvYqS4cDQplNMd1uUMbZ9kCICGN7GufHths4hERpW4Ot-qtQ7pAfMlVs-T4lBRd-7Rs&lsparams=mh%2Cmm%2Cmn%2Cms%2Cmv%2Cmvi%2Cpl%2Cinitcwndbps&lsig=AHWaYeowRQIgcDSlN-lIdlgnG-pzwkg3n7Qreh-FBVZR41PtuzC674ICIQDlmYMZbdxvtPztk5X8OnWRaYA3YLjy9qGbklkuBR1s-A%3D%3D"
[download] Destination: SM64’s Unopenable Door Has Finally Been Opened! [EmqFoojMeD8].f251.webm
[download] 100% of   11.81MiB in 00:02:57 at 68.22KiB/s
[Merger] Merging formats into "SM64’s Unopenable Door Has Finally Been Opened! [EmqFoojMeD8].webm"
[debug] ffmpeg command line: ffmpeg -y -loglevel repeat+info -i 'file:SM64’s Unopenable Door Has Finally Been Opened! [EmqFoojMeD8].f303.webm' -i 'file:SM64’s Unopenable Door Has Finally Been Opened! [EmqFoojMeD8].f251.webm' -c copy -map 0:v:0 -map 1:a:0 -movflags +faststart 'file:SM64’s Unopenable Door Has Finally Been Opened! [EmqFoojMeD8].temp.webm'
Deleting original file SM64’s Unopenable Door Has Finally Been Opened! [EmqFoojMeD8].f251.webm (pass -k to keep)
Deleting original file SM64’s Unopenable Door Has Finally Been Opened! [EmqFoojMeD8].f303.webm (pass -k to keep)
WARNING: JSON subtitles cannot be embedded
[EmbedSubtitle] Embedding subtitles in "SM64’s Unopenable Door Has Finally Been Opened! [EmqFoojMeD8].webm"
[debug] ffmpeg command line: ffmpeg -y -loglevel repeat+info -i 'file:SM64’s Unopenable Door Has Finally Been Opened! [EmqFoojMeD8].webm' -i 'file:SM64’s Unopenable Door Has Finally Been Opened! [EmqFoojMeD8].en.vtt' -map 0 -dn -ignore_unknown -c copy -map -0:s -map 1:0 -metadata:s:s:0 language=eng -metadata:s:s:0 handler_name=English -metadata:s:s:0 title=English -movflags +faststart 'file:SM64’s Unopenable Door Has Finally Been Opened! [EmqFoojMeD8].temp.webm'
Deleting original file SM64’s Unopenable Door Has Finally Been Opened! [EmqFoojMeD8].en.vtt (pass -k to keep)
@Riteo added the enhancement (New feature or request) and triage (Untriaged issue) labels May 18, 2024
@pukkandan removed the triage (Untriaged issue) label May 18, 2024
@pukkandan
Member

This is only possible in mkv. You can look at how --embed-info-json is implemented for help with the implementation.
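For reference, attaching a file to a Matroska container with ffmpeg might look like the sketch below. It only builds the argument list and is not taken from yt-dlp's source: `build_attach_command` and its `existing_attachments` parameter are hypothetical, and the assumption that the new attachment becomes attachment stream number `existing_attachments` in the output has not been verified against files that already carry attachments.

```python
import os


def build_attach_command(video_path, json_path, out_path, existing_attachments=0):
    """Build an ffmpeg argument list that attaches a JSON file to an mkv.

    Hypothetical helper. `existing_attachments` is the assumed number of
    attachment streams already present in the input file.
    """
    stream = f't:{existing_attachments}'  # 't' selects attachment streams
    return [
        'ffmpeg', '-y',
        '-i', video_path,
        '-map', '0', '-c', 'copy',       # copy every existing stream unchanged
        '-attach', json_path,            # output option: adds an attachment stream
        # Matroska attachments need a mimetype tag to be muxed.
        f'-metadata:s:{stream}', 'mimetype=application/json',
        f'-metadata:s:{stream}', f'filename={os.path.basename(json_path)}',
        out_path,
    ]
```

The resulting list could be passed to `subprocess.run(cmd, check=True)`; the output file must use the mkv extension for the attachment to be written.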

@Riteo
Contributor Author

Riteo commented May 19, 2024

Oh, I see. mkv only is fine for me personally. I'll try whipping up a patch in the coming days if that's ok!

@Riteo
Contributor Author

Riteo commented May 19, 2024

Oops, I just noticed that the verbose output doesn't show the warning (I forgot to add --sub-lang all). Sorry. I'll fix it up later.

Edit: there, done.

@Riteo Riteo linked a pull request May 21, 2024 that will close this issue
5 tasks
2 participants