<a href="https://colab.research.google.com/github/gu-ma/hgk-ml-workshop/blob/main/notebooks/youtube_dl.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Download a list of YouTube videos and save them to a Google Drive folder

Download a video or playlist from YouTube using [youtube-dl](https://youtube-dl.org/). Other [websites are supported as well](https://github.com/yt-dlp/yt-dlp/blob/master/supportedsites.md): just paste the URL into the field below.

_Note: you can test this notebook by running the cells below in order._

> We are going to use a fork of youtube-dl called [yt-dlp](https://github.com/yt-dlp/yt-dlp), because youtube-dl download speeds seem to be throttled in Colab.

## Setup

In [None]:
# Install + Import + Config
# Install yt-dlp only if it is not already available in this runtime
try:
    import yt_dlp
except ImportError:
    ! pip install yt-dlp
    import yt_dlp

import os

## Connect Google Drive

In [None]:
from google.colab import drive
drive.mount('/content/drive')

## Download files from YouTube

In [None]:
#@title Video + Playlist Downloader - Simple Mode { vertical-output: true }
url = "https://www.youtube.com/playlist?list=PLOWSbqDrrI5RV1I7JKTWQnvJpjgqC_yzD" #@param {type:"string"}
name = "playlist01" #@param { type: 'string' }
playlist_download_all = False #@param { type: 'boolean' }
playlist_start = 1 #@param { type: 'number' }
playlist_end = 5 #@param { type: 'number' }
clear_dir = False #@param { type: 'boolean' }

# Clear / Make dir
output_dir = name
if clear_dir and os.path.isdir(output_dir):
  for file in os.scandir(output_dir):
    os.remove(file.path)
os.makedirs(output_dir, exist_ok=True)

# Download options
# yt-dlp playlist indices are 1-based; None means "download the whole playlist"
playlist_items = None if playlist_download_all else f'{int(playlist_start)}:{int(playlist_end)}'

ydl_opts = {
  'outtmpl': f'{output_dir}/%(title)s.%(ext)s',
  'format': 'bestvideo[ext=mp4]+bestaudio[ext=m4a]/best[ext=mp4]/best',
  'playlist_items': playlist_items
}

print(ydl_opts)

# Download
with yt_dlp.YoutubeDL(ydl_opts) as ydl:
  ydl.download([url])
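
The options dictionary above can also be built by a small helper, which makes it easier to reuse the same settings for several playlists. The sketch below is only an illustration: `build_ydl_opts` and its parameter names are hypothetical and not part of yt-dlp. It assumes, based on the yt-dlp documentation, that playlist indices are 1-based.

```python
def build_ydl_opts(output_dir, playlist_start=1, playlist_end=5, download_all=False):
    """Return a yt-dlp options dict similar to the one used in the cell above."""
    opts = {
        # Save each file as <output_dir>/<video title>.<extension>
        'outtmpl': f'{output_dir}/%(title)s.%(ext)s',
        # Prefer MP4 video + M4A audio, then fall back to the best available format
        'format': 'bestvideo[ext=mp4]+bestaudio[ext=m4a]/best[ext=mp4]/best',
    }
    if not download_all:
        # Assumption: indices are 1-based, so "1:5" selects the first five entries
        opts['playlist_items'] = f'{int(playlist_start)}:{int(playlist_end)}'
    return opts


opts = build_ydl_opts('playlist01', playlist_start=1, playlist_end=3)
print(opts['playlist_items'])  # prints 1:3
```

The returned dictionary can be passed to `yt_dlp.YoutubeDL(...)` exactly as `ydl_opts` is in the cell above.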

In [None]:
#@title Zip and copy to gdrive { vertical-output: true }

#@markdown Path to the destination on Google Drive. Right-click your directory in the file browser on the left ⬅️, choose "Copy path", then paste it below
gdrive_output_dir = "/content/drive/MyDrive/AI/hgk_workshop" #@param { type: 'string' }

#@markdown Compress the content? (leave it unchecked)
zip_it = False #@param { type: 'boolean' }

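# Make sure the destination folder exists on Google Drive
os.makedirs(gdrive_output_dir, exist_ok=True)

if zip_it:

  # Remove any previous archive, then zip the download folder
  if os.path.exists(f'{output_dir}.zip'):
    os.remove(f'{output_dir}.zip')

  ! zip -r "{output_dir}.zip" "{output_dir}"

  # Copy the archive to the Google Drive folder
  ! cp "{output_dir}.zip" "{gdrive_output_dir}"

else:

  # Copy the download folder to Google Drive
  ! cp -r "{output_dir}" "{gdrive_output_dir}"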
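
The zip-and-copy step can also be done without shell commands, using only the Python standard library. This is a minimal sketch of that alternative, not the notebook's own method: `zip_and_copy` is a hypothetical helper, and the demo uses temporary folders in place of the download folder and the mounted Drive folder.

```python
import os
import shutil
import tempfile


def zip_and_copy(source_dir, destination_dir):
    """Zip the contents of source_dir and copy the archive into destination_dir."""
    os.makedirs(destination_dir, exist_ok=True)
    name = os.path.basename(os.path.normpath(source_dir))
    with tempfile.TemporaryDirectory() as tmp:
        # make_archive appends ".zip" to the base name and returns the archive path
        archive_path = shutil.make_archive(os.path.join(tmp, name), 'zip', root_dir=source_dir)
        # shutil.copy returns the path of the copied file
        return shutil.copy(archive_path, destination_dir)


# Demo with temporary folders standing in for the download and Drive folders
with tempfile.TemporaryDirectory() as workspace:
    src = os.path.join(workspace, 'playlist01')
    os.makedirs(src)
    with open(os.path.join(src, 'clip.mp4'), 'wb') as f:
        f.write(b'placeholder')
    copied = zip_and_copy(src, os.path.join(workspace, 'drive'))
    print(os.path.basename(copied))  # prints playlist01.zip
```

In Colab, the call would look like `zip_and_copy(output_dir, gdrive_output_dir)` once the drive is mounted.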

## Example: doing the same from the command line

In [None]:
! yt-dlp --playlist-items 1:1 "https://www.youtube.com/playlist?list=PLOWSbqDrrI5RV1I7JKTWQnvJpjgqC_yzD"

Example commands:


```
  - Download a video or playlist (with the default options from command below):
    yt-dlp "https://www.youtube.com/watch?v=oHg5SJYRHA0"

  - Download a video with a defined format. In this case merging the best video format with the best audio format (Default):
    yt-dlp --format "bv*+ba/b" "https://www.youtube.com/watch?v=oHg5SJYRHA0"

  - Extract audio from videos (requires ffmpeg and ffprobe):
    yt-dlp --extract-audio "https://www.youtube.com/watch?v=oHg5SJYRHA0"

  - Specify audio format of extracted audio (best(default), aac, flac, mp3, m4a, opus, vorbis, wav, alac):
    yt-dlp --extract-audio --audio-format mp3 "https://www.youtube.com/watch?v=oHg5SJYRHA0"

  - Specify audio quality of extracted audio (between 0 (best) and 10 (worst), default = 5):
    yt-dlp --extract-audio --audio-format mp3 --audio-quality 0 "https://www.youtube.com/watch?v=oHg5SJYRHA0"

  - Download all playlists of YouTube channel/user keeping each playlist in separate directory:
    yt-dlp -o "%(uploader)s/%(playlist)s/%(playlist_index)s - %(title)s.%(ext)s" "https://www.youtube.com/user/TheLinuxFoundation/playlists"

  - Download Udemy course keeping each chapter in separate directory under MyVideos directory in your home:
    yt-dlp -u user -p password -P "~/MyVideos" -o "%(playlist)s/%(chapter_number)s - %(chapter)s/%(title)s.%(ext)s" "https://www.udemy.com/java-tutorial"

  - Download entire series season keeping each series and each season in separate directory under C:/MyVideos:
    yt-dlp -P "C:/MyVideos" -o "%(series)s/%(season_number)s - %(season)s/%(episode_number)s - %(episode)s.%(ext)s" "https://videomore.ru/kino_v_detalayah/5_sezon/367617"
```
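
The audio extraction commands above can also be expressed through the Python API used earlier in this notebook. The sketch below builds options that should be roughly equivalent to `yt-dlp --extract-audio --audio-format mp3 --audio-quality 0`. The helper names `audio_extraction_opts` and `download_audio` are hypothetical; the `FFmpegExtractAudio` postprocessor key and its `preferredcodec` / `preferredquality` fields are the ones yt-dlp documents for embedding. ffmpeg must be installed for the conversion to run.

```python
def audio_extraction_opts(codec='mp3', quality='0', output_dir='.'):
    """Build options roughly equivalent to:
    yt-dlp --extract-audio --audio-format <codec> --audio-quality <quality>
    """
    return {
        # Download the best audio-only stream, or the best combined one as a fallback
        'format': 'bestaudio/best',
        'outtmpl': f'{output_dir}/%(title)s.%(ext)s',
        # Convert the downloaded file to the requested audio codec with ffmpeg
        'postprocessors': [{
            'key': 'FFmpegExtractAudio',
            'preferredcodec': codec,
            'preferredquality': quality,
        }],
    }


def download_audio(url, **kwargs):
    # Imported here so the options helper can be inspected without yt-dlp installed
    import yt_dlp
    with yt_dlp.YoutubeDL(audio_extraction_opts(**kwargs)) as ydl:
        return ydl.download([url])


print(audio_extraction_opts()['postprocessors'][0]['preferredcodec'])  # prints mp3
```

Calling `download_audio(url, codec='mp3', output_dir='playlist01')` would then perform the download and conversion.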

More examples, this time for output templates:
```
$ yt-dlp --get-filename -o "test video.%(ext)s" BaW_jenozKc
test video.webm    # Literal name with correct extension

$ yt-dlp --get-filename -o "%(title)s.%(ext)s" BaW_jenozKc
youtube-dl test video ''_ä↭𝕐.webm    # All kinds of weird characters

$ yt-dlp --get-filename -o "%(title)s.%(ext)s" BaW_jenozKc --restrict-filenames
youtube-dl_test_video_.webm    # Restricted file name

# Download YouTube playlist videos in separate directory indexed by video order in a playlist
$ yt-dlp -o "%(playlist)s/%(playlist_index)s - %(title)s.%(ext)s" "https://www.youtube.com/playlist?list=PLwiyx1dc3P2JR9N8gQaQN_BCvlSlap7re"

# Download YouTube playlist videos in separate directories according to their uploaded year
$ yt-dlp -o "%(upload_date>%Y)s/%(title)s.%(ext)s" "https://www.youtube.com/playlist?list=PLwiyx1dc3P2JR9N8gQaQN_BCvlSlap7re"

# Prefix playlist index with " - " separator, but only if it is available
$ yt-dlp -o '%(playlist_index|)s%(playlist_index& - |)s%(title)s.%(ext)s' BaW_jenozKc "https://www.youtube.com/user/TheLinuxFoundation/playlists"

# Download video as "C:\MyVideos\uploader\title.ext", subtitles as "C:\MyVideos\subs\uploader\title.ext"
# and put all temporary files in "C:\MyVideos\tmp"
$ yt-dlp -P "C:/MyVideos" -P "temp:tmp" -P "subtitle:subs" -o "%(uploader)s/%(title)s.%(ext)s" BaW_jenoz --write-subs

# Download video as "C:\MyVideos\uploader\title.ext" and subtitles as "C:\MyVideos\uploader\subs\title.ext"
$ yt-dlp -P "C:/MyVideos" -o "%(uploader)s/%(title)s.%(ext)s" -o "subtitle:%(uploader)s/subs/%(title)s.%(ext)s" BaW_jenozKc --write-subs

# Stream the video being downloaded to stdout
$ yt-dlp -o - BaW_jenozKc
```
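
The simple `%(field)s` placeholders in the output templates above follow Python's `%` string formatting, so their basic behavior can be tried out without downloading anything. The metadata dictionary below is invented for illustration. The sketch also ignores what yt-dlp adds on top, such as filename sanitization, index padding, and the extended syntax (`>`, `&`, `|`) used in some of the examples.

```python
# Hypothetical metadata for illustration; yt-dlp fills these fields from the video
info = {
    'title': 'My video',
    'ext': 'webm',
    'playlist': 'My playlist',
    'playlist_index': 3,
}

# Same shape as the "-o" templates above: one directory per playlist
template = '%(playlist)s/%(playlist_index)s - %(title)s.%(ext)s'

# Plain Python %-formatting with a dict fills in the named fields
print(template % info)  # prints My playlist/3 - My video.webm
```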
