# yt-dlp Demonstration Notebook

In this Jupyter notebook, we explore yt-dlp, a command-line tool forked from youtube-dl that downloads videos from many online sources and adds extra features and options.

See the [yt-dlp page on PyPI](https://pypi.org/project/yt-dlp/) for more information.

In this notebook, we will:

1. Install yt-dlp.
2. Showcase how to download videos from different sources.
3. Demonstrate how to fetch subtitles.
4. Discuss some other helpful commands.

## 1. Installation

In [1]:
!pip install yt-dlp;




[notice] A new release of pip available: 22.3.1 -> 24.0
[notice] To update, run: python.exe -m pip install --upgrade pip
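To confirm that the install cell worked, one option is to ask Python's packaging metadata which version is present. This is a minimal sketch using only the standard library; the helper name `installed_version` is hypothetical and not part of yt-dlp.

```python
from importlib import metadata


def installed_version(distribution):
    """Return the installed version string of a distribution, or None if it is absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


version = installed_version("yt-dlp")
if version is None:
    print("yt-dlp is not installed in this environment")
else:
    print(f"yt-dlp {version} is installed")
```

Running `!yt-dlp --version` in a cell gives the same information from the command line.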


## 2. Downloading videos from different sources

The downloaded video file is saved in your session storage if you run this notebook in Google Colab, or in the notebook's working directory (normally the folder containing the notebook) if you run it elsewhere.

The basic usage of the command is:

```
yt-dlp [OPTIONS] [--] URL [URL...]
```

You can explore options [here](https://pypi.org/project/yt-dlp/#video-selection).
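The usage line above can also be assembled programmatically, which avoids shell quoting problems when the command is later passed to `subprocess.run` as a list. The sketch below only builds and prints the command; `build_command` is a hypothetical helper, and the `--format mp4` option is borrowed from a later cell in this notebook.

```python
import shlex


def build_command(urls, options=()):
    """Build an argument list following: yt-dlp [OPTIONS] [--] URL [URL...]"""
    if not urls:
        raise ValueError("at least one URL is required")
    # "--" marks the end of the options, so everything after it is read as a URL
    return ["yt-dlp", *options, "--", *urls]


cmd = build_command(
    ["https://www.youtube.com/watch?v=dQw4w9WgXcQ"],
    options=["--format", "mp4"],
)
# shlex.join renders the list as a single POSIX-shell-safe string for display
print(shlex.join(cmd))
```

Passing `cmd` to `subprocess.run(cmd, check=True)` would start the download without involving a shell.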

### Basic YouTube Example


In [2]:
!yt-dlp https://www.youtube.com/watch?v=dQw4w9WgXcQ

[youtube] Extracting URL: https://www.youtube.com/watch?v=dQw4w9WgXcQ
[youtube] dQw4w9WgXcQ: Downloading webpage
[youtube] dQw4w9WgXcQ: Downloading ios player API JSON
[youtube] dQw4w9WgXcQ: Downloading android player API JSON
[youtube] dQw4w9WgXcQ: Downloading m3u8 information
[info] dQw4w9WgXcQ: Downloading 1 format(s): 18
[download] Destination: Rick Astley - Never Gonna Give You Up (Official Music Video) [dQw4w9WgXcQ].mp4

[download]   0.0% of    8.68MiB at  Unknown B/s ETA Unknown
[download]   0.0% of    8.68MiB at    1.47MiB/s ETA 00:05  
[download]   0.1% of    8.68MiB at    3.43MiB/s ETA 00:02
[download]   0.2% of    8.68MiB at    7.36MiB/s ETA 00:01
[download]   0.3% of    8.68MiB at    3.18MiB/s ETA 00:02
[download]   0.7% of    8.68MiB at    3.83MiB/s ETA 00:02
[download]   1.4% of    8.68MiB at    5.15MiB/s ETA 00:01
[download]   2.9% of    8.68MiB at    7.20MiB/s ETA 00:01
[download]   5.8% of    8.68MiB at   10.06MiB/s ETA 00:00
[download]  11.5% of    8.68MiB at   12.01M



### Downloading from Vimeo and setting the format to MP4

In [3]:
!yt-dlp --format mp4 https://vimeo.com/794492622

[vimeo] Extracting URL: https://vimeo.com/794492622
[vimeo] 794492622: Downloading webpage
[vimeo] 794492622: Downloading JSON metadata
[vimeo] 794492622: Downloading JSON metadata
[vimeo] 794492622: Downloading jwt token
[vimeo] 794492622: Downloading JSON metadata
[vimeo] 794492622: Downloading akfire_interconnect_quic m3u8 information
[vimeo] 794492622: Downloading akfire_interconnect_quic m3u8 information
[vimeo] 794492622: Downloading fastly_skyfire m3u8 information
[vimeo] 794492622: Downloading fastly_skyfire m3u8 information
[vimeo] 794492622: Downloading akfire_interconnect_quic MPD information
[vimeo] 794492622: Downloading akfire_interconnect_quic MPD information
[vimeo] 794492622: Downloading fastly_skyfire MPD information
[vimeo] 794492622: Downloading fastly_skyfire MPD information
[info] 794492622: Downloading 1 format(s): dash-fastly_skyfire_sep-video-4def9485
[dashsegments] Total fragments: 36
[download] Destination: Rick Astley - Never Gonna Give You Up (Official Musi



## 3. Fetching Subtitles
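This example downloads a video from YouTube and, if a subtitle track is available, downloads that as well (`--write-subs`). Depending on availability, you may also be able to specify the language of the subtitle track. We also pass `--embed-subs` to embed the subtitles into the MP4; embedding is a post-processing step that needs ffmpeg, so without ffmpeg installed the video and the `.vtt` file are still written but the run ends with an `ffmpeg not found` error.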

You can explore options [here](https://pypi.org/project/yt-dlp/#video-selection).
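Video URLs copied from a browser often carry extra query parameters joined with `&`, and an unquoted `&` on a `!` shell line is interpreted by the shell as a command separator rather than as part of the URL. Quoting the URL is one remedy. Another, sketched below on the assumption that only the `v` parameter matters for a YouTube watch URL, is to strip the other parameters first. The helper name `keep_video_id_only` is hypothetical.

```python
from urllib.parse import parse_qs, urlencode, urlsplit, urlunsplit


def keep_video_id_only(url):
    """Drop every query parameter except `v`; return the URL unchanged if `v` is absent."""
    parts = urlsplit(url)
    video_ids = parse_qs(parts.query).get("v")
    if not video_ids:
        return url
    query = urlencode({"v": video_ids[0]})
    # The fragment is dropped as well, since it is not needed to identify the video
    return urlunsplit((parts.scheme, parts.netloc, parts.path, query, ""))


clean_url = keep_video_id_only(
    "https://www.youtube.com/watch?v=020g-0hhCAU&ab_channel=Cocomelon-NurseryRhymes"
)
print(clean_url)
```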

In [4]:
# Quote the URL and drop the extra &ab_channel=... parameter: an unquoted & is treated by the shell as a command separator
!yt-dlp --write-subs --embed-subs --format mp4 "https://www.youtube.com/watch?v=020g-0hhCAU"

[youtube] Extracting URL: https://www.youtube.com/watch?v=020g-0hhCAU
[youtube] 020g-0hhCAU: Downloading webpage
[youtube] 020g-0hhCAU: Downloading ios player API JSON
[youtube] 020g-0hhCAU: Downloading android player API JSON
[youtube] 020g-0hhCAU: Downloading m3u8 information
[info] 020g-0hhCAU: Downloading subtitles: en
[info] 020g-0hhCAU: Downloading 1 format(s): 18
[info] Writing video subtitles to: Baby Shark ｜ @CoComelon Nursery Rhymes & Kids Songs [020g-0hhCAU].en.vtt
[download] Destination: Baby Shark ｜ @CoComelon Nursery Rhymes & Kids Songs [020g-0hhCAU].en.vtt

[download]    1.00KiB at  Unknown B/s (00:00:00)
[download]    2.53KiB at  Unknown B/s (00:00:00)
[download] 100% of    2.53KiB in 00:00:00 at 120.07KiB/s
[download] Destination: Baby Shark ｜ @CoComelon Nursery Rhymes & Kids Songs [020g-0hhCAU].mp4

[download]   0.0% of    7.77MiB at  989.46KiB/s ETA 00:08
[download]   0.0% of    7.77MiB at    2.90MiB/s ETA 00:02
[download]   0.1% of    7.77MiB at    3.40MiB/s ETA 00:

ERROR: Postprocessing: ffmpeg not found. Please install or provide the path using --ffmpeg-location


In [6]:
subtitle_file_path = "Baby Shark ｜ @CoComelon Nursery Rhymes & Kids Songs [020g-0hhCAU].en.vtt"

with open(subtitle_file_path, "r", encoding="utf-8") as file:
    subtitle_text = file.read()

print(subtitle_text)


WEBVTT
Kind: captions
Language: en

00:00:08.830 --> 00:00:11.413
(upbeat music)

00:00:24.630 --> 00:00:27.169
♪ Baby Shark do do do do ♪

00:00:27.169 --> 00:00:29.610
♪ Baby Shark do do do do ♪

00:00:29.610 --> 00:00:32.040
♪ Baby Shark do do do do ♪

00:00:32.040 --> 00:00:34.021
♪ Baby shark ♪

00:00:34.021 --> 00:00:36.789
♪ Mommy Shark do do do do ♪

00:00:36.789 --> 00:00:39.250
♪ Mommy Shark do do do do ♪

00:00:39.250 --> 00:00:41.669
♪ Mommy Shark do do do do ♪

00:00:41.669 --> 00:00:43.559
♪ Mommy Shark ♪

00:00:43.559 --> 00:00:46.399
♪ Daddy Shark do do do do ♪

00:00:46.399 --> 00:00:48.840
♪ Daddy Shark do do do do ♪

00:00:48.840 --> 00:00:51.238
♪ Daddy Shark do do do do ♪

00:00:51.238 --> 00:00:53.109
♪ Daddy Shark ♪

00:00:53.109 --> 00:00:56.029
♪ Grandma Shark do do do do do ♪

00:00:56.029 --> 00:00:58.438
♪ Grandma Shark do do do do do ♪

00:00:58.438 --> 00:01:00.840
♪ Grandma Shark do do do do do ♪

00:01:00.840 --> 00:01:02.700
♪ Grandma Shark ♪

00:01:02.
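The subtitle file printed above is plain WebVTT text, so its cues can be pulled into Python structures with a few lines of standard-library code. The following is a minimal sketch, not a full WebVTT parser: it assumes every timestamp uses the `hh:mm:ss.mmm` form seen in this file and ignores cue settings and styling. The names `parse_cues` and `to_seconds` are hypothetical.

```python
import re

# Matches a cue timing line such as "00:00:08.830 --> 00:00:11.413"
CUE_TIMING = re.compile(r"(\d{2}:\d{2}:\d{2}\.\d{3}) --> (\d{2}:\d{2}:\d{2}\.\d{3})")


def to_seconds(timestamp):
    """Convert "hh:mm:ss.mmm" to a float number of seconds."""
    hours, minutes, seconds = timestamp.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


def parse_cues(vtt_text):
    """Return a list of (start_seconds, end_seconds, text) tuples."""
    cues = []
    # Cues are separated by blank lines; the header block has no timing line and is skipped
    for block in re.split(r"\n\s*\n", vtt_text.strip()):
        lines = block.splitlines()
        for index, line in enumerate(lines):
            match = CUE_TIMING.match(line)
            if match:
                text = " ".join(lines[index + 1:]).strip()
                cues.append((to_seconds(match[1]), to_seconds(match[2]), text))
                break
    return cues


sample = """WEBVTT
Kind: captions
Language: en

00:00:08.830 --> 00:00:11.413
(upbeat music)

00:00:24.630 --> 00:00:27.169
♪ Baby Shark do do do do ♪
"""

cues = parse_cues(sample)
for start, end, text in cues:
    print(f"{start:7.2f} to {end:7.2f}  {text}")
```

Calling `parse_cues(subtitle_text)` on the text read in the cell above would process the whole file the same way.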

## 4. Other helpful commands

### Modifying Metadata

In [7]:
# Set the title to "Series name S01E05" (fields the video lacks are rendered as "NA")
!yt-dlp --parse-metadata "%(series)s S%(season_number)02dE%(episode_number)02d:%(title)s" https://www.youtube.com/watch?v=BaW_jenozKc

[youtube] Extracting URL: https://www.youtube.com/watch?v=BaW_jenozKc
[youtube] BaW_jenozKc: Downloading webpage
[youtube] BaW_jenozKc: Downloading ios player API JSON
[youtube] BaW_jenozKc: Downloading android player API JSON
[youtube] BaW_jenozKc: Downloading player 8fc6998a
[youtube] BaW_jenozKc: Downloading m3u8 information
[MetadataParser] Parsed title from '%(series)s S%(season_number)02dE%(episode_number)02d': 'NA SNAENA'
[info] BaW_jenozKc: Downloading 1 format(s): 22
[download] Destination: NA SNAENA [BaW_jenozKc].mp4

[download]   0.5% of  214.71KiB at  937.90KiB/s ETA 00:00
[download]   1.4% of  214.71KiB at    2.75MiB/s ETA 00:00
[download]   3.3% of  214.71KiB at    4.28MiB/s ETA 00:00
[download]   7.0% of  214.71KiB at    6.87MiB/s ETA 00:00
[download]  14.4% of  214.71KiB at    3.49MiB/s ETA 00:00
[download]  29.3% of  214.71KiB at    3.76MiB/s ETA 00:00
[download]  59.1% of  214.71KiB at    4.79MiB/s ETA 00:00
[download] 100.0% of  214.71KiB at    5.98MiB/s ETA 00:00
[d
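The `NA SNAENA` title in the output above arises because the test video has no `series`, `season_number`, or `episode_number` metadata, so each field in the part of the template before the colon is replaced by `NA`. The sketch below is a simplified illustration of that substitution, not yt-dlp's actual implementation: it handles only plain `%(name)spec` fields, and the function name `expand` is hypothetical.

```python
import re

# Matches fields such as %(series)s or %(season_number)02d
FIELD = re.compile(r"%\((?P<name>\w+)\)(?P<spec>[-0-9.]*[a-zA-Z])")


def expand(template, info, placeholder="NA"):
    """Fill a %-style template from `info`, using `placeholder` for missing fields."""

    def substitute(match):
        value = info.get(match["name"])
        if value is None:
            # The numeric format spec is ignored for missing fields
            return placeholder
        return ("%" + match["spec"]) % (value,)

    return FIELD.sub(substitute, template)


template = "%(series)s S%(season_number)02dE%(episode_number)02d"

without_metadata = expand(template, {})
with_metadata = expand(
    template, {"series": "Show", "season_number": 1, "episode_number": 5}
)
print(without_metadata)
print(with_metadata)
```

With an empty dictionary the sketch reproduces the `NA SNAENA` string from the log; with all three fields present it yields a title in the `Series name S01E05` shape.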



### Extract information to JSON

In [8]:
import json
import yt_dlp

URL = 'https://www.youtube.com/watch?v=BaW_jenozKc'

# ℹ️ See help(yt_dlp.YoutubeDL) for a list of available options and public functions
ydl_opts = {}
with yt_dlp.YoutubeDL(ydl_opts) as ydl:
    info = ydl.extract_info(URL, download=False)

    # ℹ️ ydl.sanitize_info makes the info json-serializable
    print(json.dumps(ydl.sanitize_info(info)))

[youtube] Extracting URL: https://www.youtube.com/watch?v=BaW_jenozKc
[youtube] BaW_jenozKc: Downloading webpage
[youtube] BaW_jenozKc: Downloading ios player API JSON
[youtube] BaW_jenozKc: Downloading android player API JSON




[youtube] BaW_jenozKc: Downloading m3u8 information
{"id": "BaW_jenozKc", "title": "youtube-dl test video \"'/\\\u00e4\u21ad\ud835\udd50", "formats": [{"format_id": "sb0", "format_note": "storyboard", "ext": "mhtml", "protocol": "mhtml", "acodec": "none", "vcodec": "none", "url": "https://i.ytimg.com/sb/BaW_jenozKc/storyboard3_L0/default.jpg?sqp=-oaymwENSDfyq4qpAwVwAcABBqLzl_8DBgjTibKxBg==&sigh=rs$AOn4CLABxW_yJUIWenSX90flV2i-46mVtA", "width": 48, "height": 27, "fps": 10.0, "rows": 10, "columns": 10, "fragments": [{"url": "https://i.ytimg.com/sb/BaW_jenozKc/storyboard3_L0/default.jpg?sqp=-oaymwENSDfyq4qpAwVwAcABBqLzl_8DBgjTibKxBg==&sigh=rs$AOn4CLABxW_yJUIWenSX90flV2i-46mVtA", "duration": 10.0}], "resolution": "48x27", "aspect_ratio": 1.78, "filesize_approx": null, "http_headers": {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/92.0.4515.159 Safari/537.36", "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
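The JSON dump above is long, so in practice it helps to pick out only the fields of interest. The sketch below works on an ordinary dictionary shaped like the output above; `sample_info` is a hand-trimmed stand-in containing only keys visible in that output, and `summarize_formats` is a hypothetical helper.

```python
def summarize_formats(info):
    """Return (format_id, ext, resolution) for each entry under "formats"."""
    rows = []
    for fmt in info.get("formats", []):
        # .get() keeps the summary robust when a format lacks one of the keys
        rows.append((fmt.get("format_id"), fmt.get("ext"), fmt.get("resolution")))
    return rows


# Hand-trimmed stand-in for the dictionary returned by extract_info
sample_info = {
    "id": "BaW_jenozKc",
    "formats": [
        {"format_id": "sb0", "ext": "mhtml", "resolution": "48x27"},
    ],
}

for format_id, ext, resolution in summarize_formats(sample_info):
    print(f"{format_id:>6}  {ext:<6} {resolution}")
```

In the notebook, `summarize_formats(ydl.sanitize_info(info))` would apply the same summary to the real result.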

### Extract audio

In [None]:
import yt_dlp

URLS = ['https://www.youtube.com/watch?v=BaW_jenozKc']

ydl_opts = {
    'format': 'm4a/bestaudio/best',
    # ℹ️ See help(yt_dlp.postprocessor) for a list of available Postprocessors and their arguments
    'postprocessors': [{  # Extract audio using ffmpeg
        'key': 'FFmpegExtractAudio',
        'preferredcodec': 'm4a',
    }]
}

with yt_dlp.YoutubeDL(ydl_opts) as ydl:
    error_code = ydl.download(URLS)
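Because `FFmpegExtractAudio` relies on ffmpeg, it can be useful to check for ffmpeg before adding the postprocessor. The sketch below builds the same options dictionary as the cell above and leaves the postprocessor out when ffmpeg is not found on `PATH`. `build_audio_opts` and its `ffmpeg_available` parameter are hypothetical, and the sketch only constructs the dictionary without starting a download.

```python
import shutil


def build_audio_opts(codec="m4a", ffmpeg_available=None):
    """Build yt-dlp options for audio extraction, skipping ffmpeg steps if it is missing."""
    if ffmpeg_available is None:
        # shutil.which returns None when no ffmpeg executable is on PATH
        ffmpeg_available = shutil.which("ffmpeg") is not None

    opts = {"format": f"{codec}/bestaudio/best"}
    if ffmpeg_available:
        opts["postprocessors"] = [{
            "key": "FFmpegExtractAudio",
            "preferredcodec": codec,
        }]
    return opts


print(build_audio_opts())
```

The resulting dictionary can be passed to `yt_dlp.YoutubeDL(...)` exactly like `ydl_opts` above.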

### Filter video

In [2]:
import yt_dlp

URLS = ['https://www.youtube.com/watch?v=BaW_jenozKc']

def longer_than_a_minute(info, *, incomplete):
    """Download only videos longer than a minute (or with unknown duration)"""
    duration = info.get('duration')
    if duration and duration < 60:
        return 'The video is too short'

ydl_opts = {
    'match_filter': longer_than_a_minute,
}

with yt_dlp.YoutubeDL(ydl_opts) as ydl:
    error_code = ydl.download(URLS)

[youtube] Extracting URL: https://www.youtube.com/watch?v=BaW_jenozKc
[youtube] BaW_jenozKc: Downloading webpage
[youtube] BaW_jenozKc: Downloading ios player API JSON
[youtube] BaW_jenozKc: Downloading android player API JSON




[youtube] BaW_jenozKc: Downloading m3u8 information
[download] The video is too short
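A `match_filter` function signals its decision through its return value: returning `None` lets the download proceed, while returning a string skips the video and prints that string as the reason. Because the function takes a plain dictionary, it can be tried out without any network access. The sketch below generalizes the filter above into a factory with lower and upper bounds; `duration_between` is a hypothetical name.

```python
def duration_between(min_seconds=None, max_seconds=None):
    """Create a match_filter that accepts videos whose duration lies within the bounds."""

    def match_filter(info, *, incomplete):
        duration = info.get("duration")
        if duration is None:
            return None  # unknown duration: let the video through
        if min_seconds is not None and duration < min_seconds:
            return f"Skipping: shorter than {min_seconds} s"
        if max_seconds is not None and duration > max_seconds:
            return f"Skipping: longer than {max_seconds} s"
        return None  # None means "download this video"

    return match_filter


at_least_a_minute = duration_between(min_seconds=60)

# Call the filter directly with hand-made info dictionaries
print(at_least_a_minute({"duration": 10}, incomplete=False))
print(at_least_a_minute({"duration": 120}, incomplete=False))
```

The returned function can be supplied as the `'match_filter'` value in `ydl_opts`, in the same way as `longer_than_a_minute`.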


### Add a logger and a progress hook

In [3]:
import yt_dlp

URLS = ['https://www.youtube.com/watch?v=BaW_jenozKc']

class MyLogger:
    def debug(self, msg):
        # For compatibility with youtube-dl, both debug and info are passed into debug
        # You can distinguish them by the prefix '[debug] '
        if msg.startswith('[debug] '):
            pass
        else:
            self.info(msg)

    def info(self, msg):
        pass

    def warning(self, msg):
        pass

    def error(self, msg):
        print(msg)


# ℹ️ See "progress_hooks" in help(yt_dlp.YoutubeDL)
def my_hook(d):
    if d['status'] == 'finished':
        print('Done downloading, now post-processing ...')


ydl_opts = {
    'logger': MyLogger(),
    'progress_hooks': [my_hook],
}

with yt_dlp.YoutubeDL(ydl_opts) as ydl:
    ydl.download(URLS)

Done downloading, now post-processing ...
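A progress hook receives a dictionary on every update, not only when the download finishes. The sketch below formats a few of the keys described under "progress_hooks" in `help(yt_dlp.YoutubeDL)` (`status`, `filename`, `downloaded_bytes`, `total_bytes`, `total_bytes_estimate`) and exercises the hook with hand-made dictionaries instead of a real download. The names `describe_progress` and `verbose_hook` are hypothetical.

```python
def describe_progress(d):
    """Turn a progress dictionary into a short human-readable message."""
    status = d.get("status")
    if status == "finished":
        return f"finished: {d.get('filename', '?')}"
    if status == "downloading":
        # total_bytes may be missing, in which case an estimate is sometimes present
        total = d.get("total_bytes") or d.get("total_bytes_estimate")
        done = d.get("downloaded_bytes")
        if total and done is not None:
            return f"downloading: {100 * done / total:.1f}%"
        return "downloading: size unknown"
    return f"status: {status}"


def verbose_hook(d):
    """Progress hook that prints one line per update."""
    print(describe_progress(d))


# Simulated updates, as they might arrive during a download
verbose_hook({"status": "downloading", "downloaded_bytes": 50, "total_bytes": 200})
verbose_hook({"status": "finished", "filename": "video.mp4"})
```

Adding `verbose_hook` to the `'progress_hooks'` list in `ydl_opts` would print these messages during an actual download.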


### Add a custom PostProcessor

In [4]:
import yt_dlp

URLS = ['https://www.youtube.com/watch?v=BaW_jenozKc']

# ℹ️ See help(yt_dlp.postprocessor.PostProcessor)
class MyCustomPP(yt_dlp.postprocessor.PostProcessor):
    def run(self, info):
        self.to_screen('Doing stuff')
        return [], info


with yt_dlp.YoutubeDL() as ydl:
    # ℹ️ "when" can take any value in yt_dlp.utils.POSTPROCESS_WHEN
    ydl.add_post_processor(MyCustomPP(), when='pre_process')
    ydl.download(URLS)

[youtube] Extracting URL: https://www.youtube.com/watch?v=BaW_jenozKc
[youtube] BaW_jenozKc: Downloading webpage
[youtube] BaW_jenozKc: Downloading ios player API JSON
[youtube] BaW_jenozKc: Downloading android player API JSON




[youtube] BaW_jenozKc: Downloading m3u8 information
[MyCustom] Doing stuff
[info] BaW_jenozKc: Downloading 1 format(s): 22
[download] youtube-dl test video ＂'⧸⧹ä↭𝕐 [BaW_jenozKc].mp4 has already been downloaded
[download] 100% of  214.71KiB
