feat(youtube): add video metadata and transcript commands #25
Merged
jackwener merged 3 commits into jackwener:main on Mar 17, 2026
- youtube video: fetch metadata (title, views, description, etc.) from ytInitialPlayerResponse and ytInitialData
- youtube transcript: fetch subtitles via Android InnerTube API to bypass PoToken requirement on Web client caption URLs
- Two output modes: --mode grouped (sentence merging, speaker detection, chapter headings) and --mode raw (precise sub-second timestamps matching bilibili/subtitle format)
- CJK support with 30s time-window fallback for unpunctuated captions
- Language selection with --lang and stderr warning on fallback
- URL normalization for watch, youtu.be, shorts, embed, live formats
Tests for sentence grouping, CJK punctuation, time-gap flush, speaker detection, chapter insertion, and timestamp formatting.
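The grouping behaviour those tests cover (sentence-end flush plus the 30s time-window fallback for unpunctuated CJK captions) could look roughly like the sketch below. This is an illustration, not the code in transcript-group.ts: the `Segment` and `Paragraph` shapes, the `groupSegments` name, and the exact punctuation set are assumptions, and speaker detection and chapter insertion are left out.

```typescript
// Hypothetical shapes; the PR's transcript-group.ts may use different ones.
interface Segment { start: number; text: string } // start is in seconds
interface Paragraph { start: number; text: string }

// ASCII and full-width sentence terminators, optionally followed by a closing quote or bracket.
// The exact set is an assumption.
const SENTENCE_END = /[.!?\u3002\uff01\uff1f]["')\]]*$/;
const MAX_WINDOW_SECONDS = 30; // the 30s fallback named in the commit message

function hasCjk(text: string): boolean {
  return /[\u3040-\u30ff\u3400-\u9fff]/.test(text);
}

function groupSegments(segments: Segment[], windowSeconds = MAX_WINDOW_SECONDS): Paragraph[] {
  const out: Paragraph[] = [];
  let parts: string[] = [];
  let groupStart = 0;

  const flush = (): void => {
    if (parts.length === 0) return;
    // CJK text is joined without spaces, everything else with a single space.
    const separator = parts.some(hasCjk) ? '' : ' ';
    out.push({ start: groupStart, text: parts.join(separator) });
    parts = [];
  };

  for (const seg of segments) {
    const text = seg.text.trim();
    if (text === '') continue;
    // Time-window fallback: close the open group once it spans windowSeconds or more.
    if (parts.length > 0 && seg.start - groupStart >= windowSeconds) flush();
    if (parts.length === 0) groupStart = seg.start;
    parts.push(text);
    // Sentence boundary: close the group when the segment ends a sentence.
    if (SENTENCE_END.test(text)) flush();
  }
  flush();
  return out;
}
```

With punctuated input the sentence rule does the work; with unpunctuated input only the time window ever closes a group, which is what keeps CJK auto-captions from collapsing into one unbounded paragraph.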
… transcript-group
- Extract shared parseVideoId() into utils.ts to eliminate URL parsing duplication
- Modernize page.evaluate code: var→const/let, &&-chains→optional chaining, for→array methods
- Move transcript-group.ts and test to src/clis/youtube/ (YouTube-specific logic)
- Use replaceAll() and template strings for cleaner entity decoding
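For readers who want a feel for what a shared parseVideoId() helper has to handle, here is a minimal sketch covering the URL formats listed in the PR description. Only the name `parseVideoId` comes from the commit message. The signature, the regular expressions, the `toWatchUrl` helper, and the assumption that video IDs are 11 characters from `[A-Za-z0-9_-]` are illustrative guesses, not the code in utils.ts.

```typescript
// Assumption: video IDs are 11 characters from [A-Za-z0-9_-].
const ID = '[A-Za-z0-9_-]{11}';
const BARE_ID = new RegExp(`^${ID}$`);
// youtu.be/<id>
const SHORT_LINK = new RegExp(`^(?:https?://)?youtu\\.be/(${ID})(?=[/?#&]|$)`);
// youtube.com/shorts/<id>, /embed/<id>, /live/<id>, /v/<id>
const PATH_FORM = new RegExp(
  `^(?:https?://)?(?:[\\w-]+\\.)?youtube\\.com/(?:shorts|embed|live|v)/(${ID})(?=[/?#&]|$)`,
);
// youtube.com/watch?v=<id>, with v= anywhere in the query string
const WATCH_FORM = new RegExp(
  `^(?:https?://)?(?:[\\w-]+\\.)?youtube\\.com/watch\\?(?:[^#]*&)?v=(${ID})(?=[&#]|$)`,
);

// Returns the video ID, or null when the input matches none of the known formats.
function parseVideoId(input: string): string | null {
  const value = input.trim();
  if (BARE_ID.test(value)) return value;
  for (const pattern of [SHORT_LINK, PATH_FORM, WATCH_FORM]) {
    const id = pattern.exec(value)?.[1];
    if (id !== undefined) return id;
  }
  return null;
}

// Hypothetical helper: normalize any accepted input to the watch?v= form.
function toWatchUrl(input: string): string | null {
  const id = parseVideoId(input);
  return id === null ? null : `https://www.youtube.com/watch?v=${id}`;
}
```

Keeping the parsing in one place means both commands reject unknown hosts the same way and navigate to the same normalized URL.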
Owner
Thanks @AlexZhangji for the great contribution! Squash merged with a small refactoring commit. All tests pass.
Summary
- `youtube video`: fetch video metadata (title, channel, views, description, publish date, etc.) from `ytInitialPlayerResponse` and `ytInitialData`
- `youtube transcript`: fetch video transcript/subtitles with two output modes via `--mode`

Transcript design
The problem: YouTube's Web client caption URLs now require a PoToken (Proof of Origin Token) generated by BotGuard at runtime. Fetching the `baseUrl` from `ytInitialPlayerResponse` directly returns empty responses.

The solution: Call the InnerTube `/v1/player` API with an Android client context (`clientName: ANDROID`), which returns caption URLs that work without PoToken. This is the same approach used by youtube-transcript-api and Defuddle.

Two output modes via `--mode`:
- `grouped` (default): sentence merging, speaker detection (`>>` markers), chapter headings via the `/v1/next` API
- `raw`: precise sub-second timestamps (`"start": "15.33s"`), matching the `bilibili/subtitle` format

CJK support: Chinese/Japanese auto-captions typically lack punctuation, so sentence-boundary grouping alone produces unbounded paragraphs. Added a 30-second time-window fallback alongside CJK punctuation detection (`。!?`).

Language selection: `--lang` specifies the preferred caption language. If it is unavailable, the command falls back to another track and prints a stderr warning listing the available tracks.

Files changed
- `src/clis/youtube/video.ts`
- `src/clis/youtube/transcript.ts`
- `src/transcript-group.ts`
- `src/transcript-group.test.ts`

URL support
Handles all common YouTube URL formats: `watch?v=`, `youtu.be/`, `/shorts/`, `/embed/`, `/live/`, `/v/`, and bare video IDs. All variants are normalized to `watch?v=` before navigation.

Test plan
- `youtube video --url <url>` returns metadata fields
- `youtube transcript --url <url>` returns grouped paragraphs with chapters
- `youtube transcript --url <url> --mode raw` returns precise sub-second timestamps
- `youtube transcript --url <url> --lang zh` warns and falls back for English-only videos
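
To make the transcript design more concrete, the sketch below shows one way the Android-context player request and the `--lang` fallback could be expressed. It is a hedged illustration, not the code in transcript.ts. The function names, the `CaptionTrack` shape (modelled on the commonly observed `captionTracks` entries in the player response), the full endpoint URL, and the fallback policy of preferring a manual track over an auto-generated one are all assumptions. `clientVersion` is left as a parameter because the PR description does not state a value.

```typescript
// Assumed shape of one caption track entry in the player response.
interface CaptionTrack {
  baseUrl: string;
  languageCode: string;
  kind?: string; // 'asr' is assumed to mark auto-generated captions
}

interface PlayerRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

// Builds the POST request for the InnerTube /v1/player endpoint with an Android client context.
// The caller supplies clientVersion; no specific value is implied here.
function buildAndroidPlayerRequest(videoId: string, clientVersion: string): PlayerRequest {
  return {
    url: 'https://www.youtube.com/youtubei/v1/player', // assumed full endpoint URL
    init: {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({
        context: { client: { clientName: 'ANDROID', clientVersion } },
        videoId,
      }),
    },
  };
}

interface TrackChoice {
  track: CaptionTrack;
  fallback: boolean;   // true when the requested language was not available
  available: string[]; // language codes to show in the warning
}

// Picks the requested language if present, otherwise falls back (hypothetical policy).
function pickCaptionTrack(tracks: CaptionTrack[], lang?: string): TrackChoice | null {
  const first = tracks[0];
  if (first === undefined) return null;
  const available = tracks.map((t) => t.languageCode);
  const wanted = lang?.toLowerCase();
  if (wanted !== undefined) {
    const match =
      tracks.find((t) => t.languageCode.toLowerCase() === wanted) ??
      tracks.find((t) => t.languageCode.toLowerCase().startsWith(`${wanted}-`));
    if (match) return { track: match, fallback: false, available };
  }
  // Assumed fallback order: first manual track, else the first track of any kind.
  const manual = tracks.find((t) => t.kind !== 'asr');
  return { track: manual ?? first, fallback: wanted !== undefined, available };
}
```

A caller would send the request with `fetch(req.url, req.init)`, read the caption tracks from the response, and write a warning listing `available` to stderr whenever `fallback` is true, which corresponds to the last item in the test plan.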