PlayZone30/youtube-transcript-api-js

✨ YouTube Transcript API ✨

A JavaScript/TypeScript API that retrieves the transcript or subtitles of a YouTube video. It supports automatically generated subtitles, translation into other languages, and HTTP/HTTPS proxies, and it works without a headless browser.

🚀 Built with modern JavaScript/TypeScript for maximum performance and developer experience!

✨ Features

  • ✅ Auto-Generated Transcript Detection - Automatically detects and fetches auto-generated subtitles
  • ✅ Translation Support - Translate transcripts to 17+ languages
  • ✅ Multiple Format Support - JSON, SRT, WebVTT, Text, Pretty Print
  • ✅ Advanced Proxy Support - HTTP/HTTPS proxy agents with http-proxy-agent and https-proxy-agent
  • ✅ Invidious Fallback - Alternative YouTube frontend support for when YouTube blocks requests
  • ✅ Multiple Instance Failover - Automatic failover between multiple Invidious instances
  • ✅ Dynamic Configuration - Runtime configuration changes for proxy and Invidious settings
  • ✅ TypeScript Support - Full type safety and IntelliSense
  • ✅ CLI Interface - Command-line tool for easy usage (some features may not work as expected)
  • ✅ Error Handling - Comprehensive error handling with specific error types
  • ✅ High Performance - Optimized for speed and efficiency
  • ✅ No Headless Browser - No headless browser required
  • ✅ Modern JavaScript - Built with ES6+ and TypeScript

📦 Install

Install the package using npm:

npm install @playzone/youtube-transcript

Or using yarn:

yarn add @playzone/youtube-transcript

🚀 Quick Start

Basic Usage

const { YouTubeTranscriptApi } = require('@playzone/youtube-transcript');

// CommonJS has no top-level await, so the calls run inside an async function
async function main() {
  const api = new YouTubeTranscriptApi();

  // Fetch transcript for a video
  const transcript = await api.fetch('3bSukjgCGcc');
  console.log(transcript.snippets[0].text); // "Hyderabad, capital of Telangana"
}

main().catch(console.error);

TypeScript Usage

import { YouTubeTranscriptApi } from '@playzone/youtube-transcript';

const api = new YouTubeTranscriptApi();
const transcript = await api.fetch('3bSukjgCGcc');
console.log(transcript.snippets[0].text);
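
Once a transcript has been fetched, a common next step is to flatten its snippets into a single string. The helper below is a minimal sketch that relies only on the `text` field shown in the examples above; the name `transcriptToText` and the sample data are illustrative and not part of the library.

```javascript
// Joins snippet texts into one string, collapsing stray whitespace.
// Assumes each snippet exposes a `text` string, as in the Quick Start example.
function transcriptToText(snippets) {
  return snippets
    .map((snippet) => snippet.text.replace(/\s+/g, ' ').trim())
    .filter((line) => line.length > 0)
    .join(' ');
}

// Stand-in data; in real use this would be `transcript.snippets`.
const sampleSnippets = [
  { text: 'Hyderabad, capital of Telangana' },
  { text: '  is known for\nits history. ' },
  { text: '' },
];

console.log(transcriptToText(sampleSnippets));
```

In real code, the call would probably look like `transcriptToText(transcript.snippets)`.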

Enhanced API with Advanced Proxy Support

const { EnhancedYouTubeTranscriptApi } = require('@playzone/youtube-transcript');

// Basic enhanced API
const api = new EnhancedYouTubeTranscriptApi();

// With proxy support
const apiWithProxy = new EnhancedYouTubeTranscriptApi({
  enabled: true,
  http: 'http://proxy.example.com:8080',
  https: 'https://proxy.example.com:8080'
});

// With Invidious fallback
const apiWithInvidious = new EnhancedYouTubeTranscriptApi({}, {
  enabled: true,
  instanceUrls: ['https://yewtu.be', 'https://invidious.io']
});

const transcript = await apiWithProxy.fetch('3bSukjgCGcc');

📖 API Documentation

Core API

The main YouTubeTranscriptApi class provides the following methods:

fetch(videoId: string, options?: FetchOptions): Promise<FetchedTranscript>

Fetches a transcript for the given video ID.

const api = new YouTubeTranscriptApi();

// Basic fetch
const transcript = await api.fetch('3bSukjgCGcc');

// With language preference
const preferred = await api.fetch('3bSukjgCGcc', {
  languages: ['en', 'es']
});

// Exclude auto-generated transcripts
const manualOnly = await api.fetch('3bSukjgCGcc', {
  excludeGenerated: true
});

// Exclude manual transcripts
const generatedOnly = await api.fetch('3bSukjgCGcc', {
  excludeManuallyCreated: true
});
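
The options above can be combined into a "prefer manual, fall back to anything" strategy. The following is a rough sketch, assuming that `fetch` rejects when no manually created transcript matches; `fetchPreferManual` is a hypothetical helper, and `api` can be any object with a compatible `fetch(videoId, options)` method.

```javascript
// Tries manually created transcripts first, then retries without the filter.
// `api` is any object with a fetch(videoId, options) method, for example a
// YouTubeTranscriptApi instance. Only options documented above are used.
async function fetchPreferManual(api, videoId, languages) {
  try {
    // First attempt: manually created transcripts only.
    return await api.fetch(videoId, { languages, excludeGenerated: true });
  } catch (error) {
    // Assumption: a failure here means no manual transcript matched.
    // Production code should inspect the error type before retrying.
    return api.fetch(videoId, { languages });
  }
}
```

Retrying on every error keeps the sketch short, but it also retries after failures such as a blocked IP, where a second request is unlikely to help.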

list(videoId: string): Promise<TranscriptList>

Lists all available transcripts for a video.

const api = new YouTubeTranscriptApi();
const transcriptList = await api.list('3bSukjgCGcc');

// Find specific transcript
const transcript = await transcriptList.findTranscript(['en']);

// Check if translatable
if (transcript.isTranslatable) {
  const translated = transcript.translate('es');
  const translatedData = await translated.fetch();
}
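
The translatability check can be wrapped in a small helper so that callers always get some transcript back. This sketch uses `isTranslatable` and `translate` as shown above, and additionally assumes that the object returned by `findTranscript` has its own `fetch()` method, as the translated one does; `fetchInLanguage` is a hypothetical name.

```javascript
// Returns the transcript translated to `targetLanguage` when possible,
// otherwise the transcript in its original language.
// Assumption: `transcript.fetch()` exists on untranslated transcripts too.
async function fetchInLanguage(transcript, targetLanguage) {
  if (!transcript.isTranslatable) {
    return transcript.fetch();
  }
  return transcript.translate(targetLanguage).fetch();
}
```

With the objects from the example above, usage might look like `await fetchInLanguage(await transcriptList.findTranscript(['en']), 'es')`.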

Enhanced API

The EnhancedYouTubeTranscriptApi provides advanced features:

Constructor Options

const api = new EnhancedYouTubeTranscriptApi(proxyOptions, invidiousOptions);

Proxy Options:

{
  enabled: boolean,        // Enable/disable proxy
  http?: string,          // HTTP proxy URL
  https?: string          // HTTPS proxy URL
}

Invidious Options:

{
  enabled: boolean,                    // Enable/disable Invidious
  instanceUrls: string | string[],    // Invidious instance URLs
  timeout?: number                    // Request timeout (default: 10000ms)
}
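
Proxy settings often live in environment variables rather than in code. The sketch below builds a proxy options object of the shape described above from such variables. The variable names `HTTP_PROXY` and `HTTPS_PROXY` are a common convention and an assumption here (nothing in this README says the library reads them itself), and `proxyOptionsFromEnv` is a hypothetical helper.

```javascript
// Builds { enabled, http?, https? } from an environment-like object.
// Assumption: HTTP_PROXY / HTTPS_PROXY (or lowercase variants) hold proxy URLs.
function proxyOptionsFromEnv(env) {
  const http = env.HTTP_PROXY || env.http_proxy;
  const https = env.HTTPS_PROXY || env.https_proxy;
  // The proxy is enabled only when at least one URL is present.
  const options = { enabled: Boolean(http || https) };
  if (http) options.http = http;
  if (https) options.https = https;
  return options;
}

console.log(proxyOptionsFromEnv({ HTTP_PROXY: 'http://proxy.example.com:8080' }));
console.log(proxyOptionsFromEnv({}));
```

The result could then be passed as the first constructor argument, for example `new EnhancedYouTubeTranscriptApi(proxyOptionsFromEnv(process.env))`.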

Dynamic Configuration

const api = new EnhancedYouTubeTranscriptApi();

// Change proxy settings at runtime
api.setProxyOptions({
  enabled: true,
  http: 'http://new-proxy.com:8080'
});

// Change Invidious settings at runtime
api.setInvidiousOptions({
  enabled: true,
  instanceUrls: ['https://yewtu.be']
});

🌐 Proxy Support

Generic Proxy Configuration

const { YouTubeTranscriptApi, GenericProxyConfig } = require('@playzone/youtube-transcript');

const api = new YouTubeTranscriptApi(new GenericProxyConfig(
  'http://user:pass@proxy.example.com:8080',
  'https://user:pass@proxy.example.com:8080'
));

Webshare Rotating Proxies

const { YouTubeTranscriptApi, WebshareProxyConfig } = require('@playzone/youtube-transcript');

const api = new YouTubeTranscriptApi(new WebshareProxyConfig(
  'your-username',
  'your-password',
  ['us', 'de'], // Optional: filter by country
  5 // retries
));

Enhanced Proxy Support

The enhanced API provides advanced proxy support with better reliability and fallback mechanisms:

const { EnhancedYouTubeTranscriptApi } = require('@playzone/youtube-transcript');

// Basic enhanced proxy
const api = new EnhancedYouTubeTranscriptApi({
  enabled: true,
  http: 'http://proxy.example.com:8080',
  https: 'https://proxy.example.com:8080'
});

// With Invidious fallback
const apiWithFallback = new EnhancedYouTubeTranscriptApi({
  enabled: true,
  http: 'http://proxy.example.com:8080'
}, {
  enabled: true,
  instanceUrls: ['https://yewtu.be', 'https://invidious.io'],
  timeout: 10000
});

// Dynamic configuration
api.setProxyOptions({
  enabled: true,
  http: 'http://new-proxy.com:8080'
});

api.setInvidiousOptions({
  enabled: true,
  instanceUrls: 'https://yewtu.be'
});

Key Enhanced Features

  • HTTP/HTTPS Proxy Agents: Uses http-proxy-agent and https-proxy-agent for better proxy handling
  • Invidious Fallback: Alternative YouTube frontend when YouTube blocks requests
  • Multiple Instance Support: Automatic failover between multiple Invidious instances
  • Dynamic Configuration: Change proxy and Invidious settings at runtime
  • Browser Compatibility: Proxy agents only used in Node.js environments
  • Keep-Alive Optimization: Enhanced connection handling when not using proxies
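
To make the idea of failover between multiple instances concrete, here is a generic sketch of the pattern: try each candidate in order and return the first success. It illustrates the technique only and is not the library's internal implementation; `firstSuccessful` and the `attempt` callback are hypothetical names.

```javascript
// Generic failover: call `attempt` for each candidate until one succeeds.
// Collects failure messages so the final error explains what was tried.
async function firstSuccessful(candidates, attempt) {
  const failures = [];
  for (const candidate of candidates) {
    try {
      return await attempt(candidate);
    } catch (error) {
      failures.push({ candidate, message: error.message });
    }
  }
  const summary = failures
    .map((failure) => `${failure.candidate}: ${failure.message}`)
    .join('; ');
  throw new Error(`All candidates failed (${summary})`);
}

// Demo with a fake request function: the first instance is "down".
firstSuccessful(['https://first.example', 'https://second.example'], async (url) => {
  if (url.includes('first')) throw new Error('timeout');
  return `served by ${url}`;
}).then(console.log);
```

Candidates are tried sequentially rather than in parallel, which avoids sending requests to instances that turn out not to be needed.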

🖥️ CLI Usage

Basic Commands

# Fetch transcript
npx @playzone/youtube-transcript 3bSukjgCGcc

# List available transcripts
npx @playzone/youtube-transcript --list-transcripts 3bSukjgCGcc

# Fetch with specific language
npx @playzone/youtube-transcript 3bSukjgCGcc --languages en es

# Exclude auto-generated transcripts
npx @playzone/youtube-transcript 3bSukjgCGcc --exclude-generated

# Exclude manual transcripts
npx @playzone/youtube-transcript 3bSukjgCGcc --exclude-manually-created

# Translate transcript
npx @playzone/youtube-transcript 3bSukjgCGcc --translate es

# Output in specific format
npx @playzone/youtube-transcript 3bSukjgCGcc --format json
npx @playzone/youtube-transcript 3bSukjgCGcc --format srt
npx @playzone/youtube-transcript 3bSukjgCGcc --format webvtt

CLI Options

| Option | Description | Example |
| --- | --- | --- |
| --languages | Preferred languages (space-separated) | --languages en es fr |
| --exclude-generated | Exclude auto-generated transcripts | --exclude-generated |
| --exclude-manually-created | Exclude manual transcripts | --exclude-manually-created |
| --translate | Translate to specific language | --translate es |
| --format | Output format (json, srt, webvtt, text, pretty) | --format json |
| --list-transcripts | List available transcripts | --list-transcripts |
| --http-proxy | HTTP proxy URL | --http-proxy http://proxy:8080 |
| --https-proxy | HTTPS proxy URL | --https-proxy https://proxy:8080 |
| --webshare-proxy-username | Webshare proxy username | --webshare-proxy-username user |
| --webshare-proxy-password | Webshare proxy password | --webshare-proxy-password pass |
⚠️ CLI Limitations

Important Note: Some CLI commands may not work as expected due to implementation issues. These issues are specific to the CLI and do not affect the programmatic API.

Known CLI Issues:

  • --list-transcripts flag may show full transcript instead of language list
  • Some format options may not work correctly
  • Translation commands may not work as expected

Recommended Usage: Use the programmatic API for all features as it provides full functionality and reliability.
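
For scripts that would otherwise shell out to the CLI, a thin wrapper over the programmatic API is one way around these limitations. The sketch below covers only the argument-parsing half: it maps a subset of the flags listed above onto the `fetch` options documented earlier. `parseCliArgs` is a hypothetical helper, not part of the package.

```javascript
// Parses a small subset of the CLI flags into { videoId, options }.
// Flag names mirror the CLI options table; the parser itself is illustrative.
// Limitation: the video ID must come before --languages, otherwise it would
// be consumed as a language code.
function parseCliArgs(argv) {
  const result = { videoId: null, options: {} };
  for (let i = 0; i < argv.length; i += 1) {
    const arg = argv[i];
    if (arg === '--languages') {
      const languages = [];
      // Consume the following values until the next flag.
      while (i + 1 < argv.length && !argv[i + 1].startsWith('--')) {
        languages.push(argv[i + 1]);
        i += 1;
      }
      result.options.languages = languages;
    } else if (arg === '--exclude-generated') {
      result.options.excludeGenerated = true;
    } else if (arg === '--exclude-manually-created') {
      result.options.excludeManuallyCreated = true;
    } else if (!arg.startsWith('--') && result.videoId === null) {
      result.videoId = arg;
    }
  }
  return result;
}

console.log(parseCliArgs(['3bSukjgCGcc', '--languages', 'en', 'es', '--exclude-generated']));
```

The parsed result could then feed the programmatic API, roughly as `api.fetch(parsed.videoId, parsed.options)`.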

🚨 Error Handling

The library provides comprehensive error handling with specific error types:

const {
  NoTranscriptFound,
  TranscriptsDisabled,
  VideoUnavailable,
  NotTranslatable,
  TranslationLanguageNotAvailable,
  IpBlocked
} = require('@playzone/youtube-transcript');

try {
  const transcript = await api.fetch('invalid-video-id');
} catch (error) {
  if (error instanceof NoTranscriptFound) {
    console.log('No transcript found for this video');
  } else if (error instanceof VideoUnavailable) {
    console.log('Video is unavailable');
  } else if (error instanceof TranscriptsDisabled) {
    console.log('Transcripts are disabled for this video');
  } else if (error instanceof IpBlocked) {
    console.log('IP address is blocked, try using a proxy');
  } else {
    // Not one of the cases handled above: rethrow instead of swallowing it
    throw error;
  }
}
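
When the same error-to-message mapping is needed in several places, the `instanceof` chain can be turned into data. This is a minimal sketch: `describeError` is a hypothetical helper, and the two `Example...` classes are stand-ins that only exist so the snippet runs on its own. In real use, pass the error classes imported from the package instead.

```javascript
// Returns the message of the first rule whose class matches the error.
// `rules` is a list of [ErrorClass, message] pairs, checked in order.
function describeError(error, rules, fallback = 'Unexpected error') {
  for (const [ErrorClass, message] of rules) {
    if (error instanceof ErrorClass) return message;
  }
  return fallback;
}

// Stand-in classes for demonstration only (not the library's classes).
class ExampleNoTranscriptFound extends Error {}
class ExampleIpBlocked extends Error {}

const rules = [
  [ExampleNoTranscriptFound, 'No transcript found for this video'],
  [ExampleIpBlocked, 'IP address is blocked, try using a proxy'],
];

console.log(describeError(new ExampleIpBlocked(), rules));
console.log(describeError(new TypeError('boom'), rules));
```

Because rules are checked in order, more specific classes should be listed before any base class they extend.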

📊 Output Formats

The library supports multiple output formats:

JSON Format

const transcript = await api.fetch('3bSukjgCGcc');
console.log(JSON.stringify(transcript.toRawData(), null, 2));

SRT Format

const { FormatterLoader } = require('@playzone/youtube-transcript');
const formatter = new FormatterLoader().load('srt');
const srtOutput = formatter.formatTranscripts([transcript]);
console.log(srtOutput);

WebVTT Format

const formatter = new FormatterLoader().load('webvtt');
const webvttOutput = formatter.formatTranscripts([transcript]);
console.log(webvttOutput);
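
For readers unfamiliar with SRT, the hand-rolled sketch below shows what that format looks like: a counter, a `HH:MM:SS,mmm --> HH:MM:SS,mmm` time range, and the text. It is not the library's formatter. It assumes that each snippet carries `start` and `duration` in seconds next to `text`, which this README does not confirm; `toSrtTimestamp` and `toSrt` are hypothetical names.

```javascript
// Formats a time in seconds as an SRT timestamp: HH:MM:SS,mmm
function toSrtTimestamp(totalSeconds) {
  const totalMs = Math.round(totalSeconds * 1000);
  const ms = totalMs % 1000;
  const totalSecs = Math.floor(totalMs / 1000);
  const s = totalSecs % 60;
  const m = Math.floor(totalSecs / 60) % 60;
  const h = Math.floor(totalSecs / 3600);
  const pad = (value, width) => String(value).padStart(width, '0');
  return `${pad(h, 2)}:${pad(m, 2)}:${pad(s, 2)},${pad(ms, 3)}`;
}

// Assumption: snippets look like { text, start, duration } with seconds.
function toSrt(snippets) {
  return snippets
    .map((snippet, index) => {
      const start = toSrtTimestamp(snippet.start);
      const end = toSrtTimestamp(snippet.start + snippet.duration);
      return `${index + 1}\n${start} --> ${end}\n${snippet.text}\n`;
    })
    .join('\n');
}

console.log(toSrt([
  { text: 'Hello', start: 0, duration: 1.5 },
  { text: 'World', start: 1.5, duration: 2 },
]));
```

WebVTT differs mainly in its `WEBVTT` header line and in using a period instead of a comma before the milliseconds.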

🔧 Development

Prerequisites

  • Node.js 16+
  • npm or yarn

Setup

git clone https://github.com/playzone/youtube-transcript-api-js.git
cd youtube-transcript-api-js
npm install

Build

npm run build

Test

npm test

Lint

npm run lint
npm run lint:fix

📁 Project Structure

youtube-transcript-api-js/
├── api/                   # Core API implementation
├── cli/                   # CLI implementation
├── errors/                # Error classes
├── formatters/            # Output formatters
├── proxies/               # Proxy configurations
├── transcripts/           # Transcript models and parsing
├── enhanced-api/          # Enhanced API with advanced features
└── index.ts               # Main exports

🎯 Use Cases

  • Content Analysis: Analyze video content through transcripts
  • Accessibility: Generate subtitles for videos
  • Language Learning: Get transcripts in different languages
  • SEO: Extract text content for search optimization
  • Data Mining: Collect and analyze video content at scale
  • Automation: Integrate transcript fetching into automated workflows

🀝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Acknowledgments

  • Built with modern JavaScript/TypeScript
  • Inspired by the need for reliable YouTube transcript access
  • Enhanced with advanced proxy and fallback mechanisms

📞 Support

If you encounter any issues or have questions:

  1. Check the Issues page
  2. Create a new issue with detailed information
  3. Use the programmatic API for reliable functionality

Made with ❤️ by the PlayZone
