
feat(youtube): add video metadata and transcript commands #25

Merged: jackwener merged 3 commits into jackwener:main from AlexZhangji:feat/youtube-video-transcript on Mar 17, 2026

@AlexZhangji
Contributor

Summary

  • youtube video — fetch video metadata (title, channel, views, description, publish date, etc.) from ytInitialPlayerResponse and ytInitialData
  • youtube transcript — fetch video transcript/subtitles with two output modes via --mode

Transcript design

The problem: YouTube's Web client caption URLs now require a PoToken (Proof of Origin Token) generated by BotGuard at runtime. Fetching the baseUrl from ytInitialPlayerResponse directly returns empty responses.

The solution: Call the InnerTube /v1/player API with an Android client context (clientName: ANDROID), which returns caption URLs that work without PoToken. This is the same approach used by youtube-transcript-api and Defuddle.
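
As a rough illustration of the request described above, the sketch below builds an InnerTube player request with an Android client context and reads caption tracks out of the response. It is not the code from this PR: the endpoint URL, the response field names (`captions.playerCaptionsTracklistRenderer.captionTracks`), and the helper names `buildPlayerRequest` and `extractCaptionTracks` are assumptions based on the commonly observed InnerTube JSON shape, and `clientVersion` is a caller-supplied placeholder.

```typescript
// Sketch only: field names follow the commonly observed InnerTube JSON shape
// and are assumptions, not code taken from this PR.
interface CaptionTrack {
  baseUrl: string;
  languageCode: string;
  kind?: string; // assumed: "asr" marks auto-generated captions
}

interface PlayerResponse {
  captions?: {
    playerCaptionsTracklistRenderer?: { captionTracks?: CaptionTrack[] };
  };
}

// Assumed endpoint for the /v1/player call mentioned in the description.
const PLAYER_ENDPOINT = "https://www.youtube.com/youtubei/v1/player";

// Build the POST request; clientVersion is supplied by the caller because
// the value the PR uses is not stated in the description.
function buildPlayerRequest(videoId: string, clientVersion: string) {
  return {
    url: PLAYER_ENDPOINT,
    method: "POST" as const,
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      context: { client: { clientName: "ANDROID", clientVersion } },
      videoId,
    }),
  };
}

// Return the caption tracks, or an empty list when the video has none.
function extractCaptionTracks(response: PlayerResponse): CaptionTrack[] {
  return response.captions?.playerCaptionsTracklistRenderer?.captionTracks ?? [];
}
```

The request object can be handed to whatever HTTP client the CLI already uses; keeping the builder and the extractor free of network calls makes both easy to unit test.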

Two output modes via --mode:

| Mode | Use case | Format |
| --- | --- | --- |
| grouped (default) | Human reading, LLM summarization | Sentences merged into paragraphs, speaker detection (>> markers), chapter headings via /v1/next API |
| raw | Programmatic consumption, timestamp search | Every caption segment with precise sub-second timestamps ("start": "15.33s"), matching bilibili/subtitle format |
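
To make the raw format concrete, here is a minimal sketch of turning caption segments into rows with a sub-second `start` string. Only the `"start": "15.33s"` shape comes from the description; the `CaptionSegment` type, its field names, and `toRawRows` are hypothetical.

```typescript
// Hypothetical input shape: start time in seconds plus the caption text.
interface CaptionSegment {
  startSec: number;
  text: string;
}

interface RawRow {
  start: string; // e.g. "15.33s", two decimal places
  text: string;
}

// Format every segment individually, with no merging or grouping.
function toRawRows(segments: CaptionSegment[]): RawRow[] {
  return segments.map((seg) => ({
    start: `${seg.startSec.toFixed(2)}s`,
    text: seg.text,
  }));
}
```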

CJK support: Chinese/Japanese auto-captions typically lack punctuation, so sentence-boundary grouping alone produces unbounded paragraphs. Added a 30-second time-window fallback alongside CJK punctuation detection (。!?).
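
The grouping rule can be sketched roughly as follows. This is a simplified illustration, not the logic in transcript-group.ts: it closes a group at any sentence-ending punctuation mark (Latin or CJK) and, as a fallback, once a group spans 30 seconds. The `Segment` and `Paragraph` types, the `groupSegments` name, and the rule of joining CJK text without spaces are assumptions.

```typescript
interface Segment {
  start: number; // seconds
  text: string;
}

interface Paragraph {
  start: number; // start time of the first segment in the group
  text: string;
}

// Sentence-ending punctuation, including full-width CJK marks.
const SENTENCE_END = /[.!?。！？]$/;
// Rough CJK test (kana plus common CJK ideographs); an assumption for spacing.
const CJK_CHAR = /[\u3040-\u30ff\u4e00-\u9fff]/;

function groupSegments(segments: Segment[], maxWindowSec = 30): Paragraph[] {
  const out: Paragraph[] = [];
  let current: Paragraph | null = null;

  for (const seg of segments) {
    const text = seg.text.trim();
    if (text === "") continue;

    if (current === null) {
      current = { start: seg.start, text };
    } else {
      // Join CJK text without a space, everything else with one.
      const cjkJoin =
        CJK_CHAR.test(current.text.slice(-1)) || CJK_CHAR.test(text.charAt(0));
      current.text += (cjkJoin ? "" : " ") + text;
    }

    const endsSentence = SENTENCE_END.test(text);
    const windowExceeded = seg.start - current.start >= maxWindowSec;
    if (endsSentence || windowExceeded) {
      out.push(current);
      current = null;
    }
  }

  if (current !== null) out.push(current); // flush the trailing group
  return out;
}
```

Without the time-window check, an unpunctuated caption track would collapse into a single group covering the whole video; the window bounds each group regardless of punctuation.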

Language selection: --lang specifies preferred caption language. If unavailable, falls back with a stderr warning showing available tracks.
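
One way the fallback could look is sketched below. The description only states that an unavailable language falls back with a warning listing the available tracks; the `Track` type, the `selectTrack` name, the prefix match (so that `zh` matches `zh-Hans`), and the choice of the first track as the fallback are all assumptions.

```typescript
interface Track {
  languageCode: string;
  baseUrl: string;
}

interface Selection {
  track: Track;
  warning?: string; // set only when the preferred language was unavailable
}

function selectTrack(tracks: Track[], preferred?: string): Selection | null {
  const fallback = tracks[0];
  if (fallback === undefined) return null; // no captions at all
  if (!preferred) return { track: fallback };

  // Exact match first, then a regional variant such as "zh-Hans" for "zh".
  const match =
    tracks.find((t) => t.languageCode === preferred) ??
    tracks.find((t) => t.languageCode.startsWith(`${preferred}-`));
  if (match) return { track: match };

  const available = tracks.map((t) => t.languageCode).join(", ");
  return {
    track: fallback,
    warning: `Language "${preferred}" not available; using "${fallback.languageCode}". Available: ${available}`,
  };
}
```

Returning the warning instead of printing it leaves the caller free to write it to stderr (for example with `console.error`) so that stdout stays clean for the transcript itself.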

Files changed

| File | Description |
| --- | --- |
| src/clis/youtube/video.ts | New: video metadata command |
| src/clis/youtube/transcript.ts | New: transcript command (Android InnerTube + dual mode) |
| src/transcript-group.ts | New: sentence grouping, speaker detection, chapter insertion |
| src/transcript-group.test.ts | Unit tests (9 tests covering grouping, CJK, speakers, chapters) |

URL support

Handles all common YouTube URL formats: watch?v=, youtu.be/, /shorts/, /embed/, /live/, /v/, and bare video IDs. All variants are normalized to watch?v= before navigation.
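
The normalization step might be sketched like this. It is not the `parseVideoId()` that the follow-up commit moved into utils.ts; the regex-based approach, the `normalizeToWatchUrl` helper, and the assumption that video IDs are 11 URL-safe characters are illustrative only.

```typescript
// Assumption: video IDs are 11 characters from [A-Za-z0-9_-].
const BARE_ID = /^[A-Za-z0-9_-]{11}$/;

function parseVideoId(input: string): string | null {
  const trimmed = input.trim();
  if (BARE_ID.test(trimmed)) return trimmed; // bare video ID

  // watch?v=<id>; the v parameter may appear anywhere in the query string.
  const query = trimmed.match(/[?&]v=([A-Za-z0-9_-]{11})(?![A-Za-z0-9_-])/);
  if (query) return query[1] ?? null;

  // youtu.be/<id> and path-style formats: /shorts/, /embed/, /live/, /v/.
  const path = trimmed.match(
    /(?:youtu\.be\/|\/(?:shorts|embed|live|v)\/)([A-Za-z0-9_-]{11})(?![A-Za-z0-9_-])/
  );
  if (path) return path[1] ?? null;

  return null;
}

// Rewrite any supported input to the canonical watch?v= form.
function normalizeToWatchUrl(input: string): string | null {
  const id = parseVideoId(input);
  return id === null ? null : `https://www.youtube.com/watch?v=${id}`;
}
```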

Test plan

  • youtube video --url <url> returns metadata fields
  • youtube transcript --url <url> returns grouped paragraphs with chapters
  • youtube transcript --url <url> --mode raw returns precise sub-second timestamps
  • youtube transcript --url <url> --lang zh warns and falls back for English-only videos
  • Long video (6h+, 13k segments) and CJK transcripts produce reasonably sized groups
  • Unit tests pass (169/169, including 9 new)
  • Existing tests unaffected

AlexZhangji and others added 3 commits March 16, 2026 17:50
- youtube video: fetch metadata (title, views, description, etc.)
  from ytInitialPlayerResponse and ytInitialData
- youtube transcript: fetch subtitles via Android InnerTube API
  to bypass PoToken requirement on Web client caption URLs
- Two output modes: --mode grouped (sentence merging, speaker
  detection, chapter headings) and --mode raw (precise sub-second
  timestamps matching bilibili/subtitle format)
- CJK support with 30s time-window fallback for unpunctuated captions
- Language selection with --lang and stderr warning on fallback
- URL normalization for watch, youtu.be, shorts, embed, live formats
Tests for sentence grouping, CJK punctuation, time-gap flush,
speaker detection, chapter insertion, and timestamp formatting.
… transcript-group

- Extract shared parseVideoId() into utils.ts to eliminate URL parsing duplication
- Modernize page.evaluate code: var→const/let, &&-chains→optional chaining, for→array methods
- Move transcript-group.ts and test to src/clis/youtube/ (YouTube-specific logic)
- Use replaceAll() and template strings for cleaner entity decoding

@jackwener merged commit 8e74904 into jackwener:main on Mar 17, 2026
4 checks passed
@jackwener
Owner

Thanks @AlexZhangji for the great contribution! Squash merged with a small refactoring commit. All tests pass.
