7emotions/video-transcript-skill

Video Transcript Skill

An OpenCode skill that transcribes online videos to text, enabling AI agents to learn from video content. It works with Bilibili (B站), YouTube, Vimeo, Twitch, and any other platform supported by yt-dlp.

What it does

When you send a video link to your AI agent, this skill:

  1. Fetches existing subtitles if available (instant)
  2. Falls back to downloading audio and transcribing with local Whisper (1-2 min for typical videos)
  3. Returns the full transcript your agent can read and summarize
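The subtitle-first fallback above can be sketched in Python. This is a minimal illustration, not the skill's actual code: `fetch_subtitles` and `transcribe_with_whisper` are hypothetical stand-ins for the real yt-dlp and Whisper calls.

```python
# Sketch of the subtitle-first fallback; both injected callables are
# hypothetical stand-ins, not functions from the real skill.
def get_transcript(url, fetch_subtitles, transcribe_with_whisper):
    """Prefer existing subtitles (instant); fall back to local Whisper."""
    subs = fetch_subtitles(url)           # step 1: platform subtitles, if any
    if subs is not None:
        return subs
    return transcribe_with_whisper(url)   # step 2: download audio + transcribe

# Stand-in usage: no subtitles available, so the Whisper path runs.
transcript = get_transcript(
    "https://youtu.be/VIDEO_ID",
    lambda url: None,                     # simulate "no subtitles found"
    lambda url: "full whisper transcript",
)
print(transcript)  # full whisper transcript
```

The agent then reads the returned transcript and summarizes it as usual.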

Supported Platforms

Bilibili (bilibili.com, b23.tv) · YouTube (youtube.com, youtu.be) · Vimeo · Twitch · and hundreds more
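A rough sketch of how URL-to-platform routing might look. The domain table and `detect_platform` helper are illustrative only; in practice yt-dlp's own extractors handle the real dispatch.

```python
from urllib.parse import urlparse

# Illustrative domain table; anything unlisted falls through to yt-dlp's
# generic extractor support.
PLATFORMS = {
    "bilibili.com": "bilibili", "b23.tv": "bilibili",
    "youtube.com": "youtube",   "youtu.be": "youtube",
    "vimeo.com": "vimeo",       "twitch.tv": "twitch",
}

def detect_platform(url: str) -> str:
    host = (urlparse(url).hostname or "").removeprefix("www.")
    return PLATFORMS.get(host, "generic (yt-dlp extractor)")

print(detect_platform("https://b23.tv/abc123"))   # bilibili
print(detect_platform("https://example.com/v1"))  # generic (yt-dlp extractor)
```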

Setup

See references/setup.md for full installation instructions.

Quick check:

```shell
yt-dlp --version && ffmpeg -version | head -1 && whisper --help > /dev/null && echo "All good"
```

Performance

CPU transcription with the faster-whisper `small` model on ~4 cores:

| Video length | Transcription time |
|--------------|--------------------|
| 5 min        | ~1 min             |
| 15 min       | ~3 min             |
| 30 min       | ~6.5 min           |
| 1 hour       | ~13 min            |
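The figures above work out to roughly one fifth of the video's length. A back-of-envelope estimator, where the 0.22 ratio is simply fitted to the table and not a guaranteed figure:

```python
# Rough estimate: faster-whisper "small" on ~4 CPU cores runs at about
# 4.5x real time, i.e. transcription time ≈ 0.22 × video length.
# The ratio is fitted to the benchmark table, not a measured constant.
def estimate_transcription_minutes(video_minutes: float, ratio: float = 0.22) -> float:
    return round(video_minutes * ratio, 1)

for length in (5, 15, 30, 60):
    print(f"{length} min video -> ~{estimate_transcription_minutes(length)} min")
```

Actual times vary with hardware, model size, and audio quality.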

Tools Used

The video-toolkit MCP server (configured in opencode.jsonc) exposes the following native tools:

  • video-toolkit_get-transcript — fetch existing subtitles (platform-dependent, instant)
  • video-toolkit_generate-subtitles — AI transcription via local Whisper
  • video-toolkit_list-transcript-languages — list available subtitle languages
  • video-toolkit_download-video — download video to local storage
  • video-toolkit_list-downloads — list downloaded videos
  • video-toolkit_transcribe-audio — transcribe local audio files

License

MIT
