# Parallelism

Multiprocessing vs Multithreading

### 1. `pytube` Overview

> `pytube` is a lightweight, Pythonic, dependency-free, library (and command-line utility) for downloading YouTube Videos.

- link: https://pytube.io/en/latest/index.html

In [None]:
from pytube import YouTube

# YouTube URL to be downloaded
url = "https://www.youtube.com/watch?v=tWFejQSKIYg"
youtube_clip = YouTube(url)

In [None]:
for item in youtube_clip.streams:
    print(item)

In [None]:
youtube_clip.title

In [None]:
youtube_clip.thumbnail_url

- Using the `on_complete_callback` hook of the `pytube` library

In [None]:
def on_complete(stream, file_handle):
    print("Downloaded!")

youtube_clip = YouTube(url,
                       on_complete_callback=on_complete)

In [None]:
# download() returns the path of the saved file, so name the result accordingly
saved_path = youtube_clip.streams.filter(progressive=True, file_extension='mp4').first().download("file")

### 2. `tqdm` Overview

> Instantly make your loops show a smart progress meter - just wrap any iterable with `tqdm(iterable)`, and you're done!

- link: https://github.com/tqdm/tqdm

In [None]:
from tqdm import tqdm, trange
from time import sleep

In [None]:
for i in trange(100):
    sleep(0.01)

In [None]:
pbar = tqdm(total=100)
for i in range(10):
    sleep(0.1)
    pbar.update(10)
pbar.close()

### 3. Multiprocessing

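The download process and the UI process exchange progress updates through a message queue (`multiprocessing.Queue`).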
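
A minimal sketch of the message-queue pattern used in this section, with the YouTube download replaced by a plain loop so that it runs offline. The names `produce`, `consume`, and `run_demo` are hypothetical; the message dictionaries mirror the ones used in the cells below.

```python
import multiprocessing


def produce(message_queue, steps):
    """Send one progress message per step, then a completion message."""
    for step in range(1, steps + 1):
        message_queue.put({"type": "on_progress", "progress_rate": 100 * step / steps})
    message_queue.put({"type": "on_complete"})


def consume(message_queue):
    """Collect progress values until the completion message arrives."""
    rates = []
    while True:
        message = message_queue.get()  # blocks until a message is available
        if message["type"] == "on_complete":
            return rates
        rates.append(message["progress_rate"])


def run_demo(steps=4):
    """Run the producer in a child process and consume in the parent."""
    message_queue = multiprocessing.Queue()
    worker = multiprocessing.Process(target=produce, args=(message_queue, steps))
    worker.start()
    rates = consume(message_queue)
    worker.join()
    return rates
```

Calling `run_demo()` from a script's `if __name__ == "__main__":` block should return `[25.0, 50.0, 75.0, 100.0]`. Inside a notebook this relies on the `fork` start method, since child processes started with `spawn` cannot import functions defined in notebook cells.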

In [16]:
import multiprocessing
from tqdm import tqdm
from functools import partial
import os
from pytube import YouTube
import psutil

- UI function: reads messages from the queue and updates a `tqdm` progress bar
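
The progress bar in the cell below is created with `total=100`, so the floating-point percentages arriving on the queue have to be turned into whole-number steps that add up to exactly 100. A small sketch of that bookkeeping without `tqdm`, assuming a hypothetical helper named `progress_increments`:

```python
def progress_increments(rates):
    """Convert cumulative percentages into integer bar updates.

    Mirrors the bookkeeping in draw_ui: each update is the whole-number
    distance from the last integer position, and a final update fills
    whatever is left up to 100.
    """
    prev = 0
    increments = []
    for rate in rates:
        increments.append(int(rate - prev))  # step for an on_progress message
        prev = int(rate)
    increments.append(100 - prev)  # step for the on_complete message
    return increments


steps = progress_increments([12.5, 37.9, 99.6])
print(steps)  # -> [12, 25, 62, 1]
```

Because `prev` is always an integer, the steps never overshoot, and the last one brings the total to 100 even when the final progress message reports slightly less.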

In [20]:
def draw_ui(message_queue):
    print("UI process starting ... PID:{}, PPID:{}".format(os.getpid(), psutil.Process(os.getpid()).ppid()), flush=True)
    prev = 0
    tqdm_bar = None
    while True:
        message = message_queue.get()
        if message["type"] == "on_progress":
            if tqdm_bar is None:
                tqdm_bar = tqdm(total=100, desc="Downloading...")
            cur_rate = message["progress_rate"]
            tqdm_bar.update(int(cur_rate-prev))
            prev = int(cur_rate)
        elif message["type"] == "on_complete":
            if tqdm_bar is None:
                tqdm_bar = tqdm(total=100, desc="Downloading...")
            tqdm_bar.update(100-prev)
            tqdm_bar.close()
            break

- Downloading function: `functools.partial` binds the queue to the `pytube` callbacks, which then post progress messages
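
`pytube` calls its progress callback with three fixed arguments, so the queue has to be attached to the callback beforehand. A sketch of how `functools.partial` does this, using a hypothetical `fake_downloader` in place of `pytube` and a standard-library `queue.Queue` in place of the multiprocessing queue:

```python
import queue
from functools import partial


def on_progress(stream, chunk, bytes_remaining, message_queue):
    """Callback that needs one more argument than the caller supplies."""
    message_queue.put({"type": "on_progress", "bytes_remaining": bytes_remaining})


def fake_downloader(on_progress_callback):
    """Hypothetical stand-in for a library that calls back with three arguments."""
    for bytes_remaining in (200, 100, 0):
        on_progress_callback(None, b"", bytes_remaining)


message_queue = queue.Queue()

# partial fixes message_queue, leaving a callable that takes the three
# positional arguments the downloader passes.
fake_downloader(partial(on_progress, message_queue=message_queue))

received = [message_queue.get()["bytes_remaining"] for _ in range(3)]
print(received)  # -> [200, 100, 0]
```

A closure or a `lambda` would work as well; `partial` keeps the callback a plain module-level function with the extra argument filled in.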

In [21]:
def on_progress(stream, chunk, bytes_remaining, message_queue):
    total_size = stream.filesize
    bytes_downloaded = total_size - bytes_remaining
    progress = (bytes_downloaded / total_size) * 100
    message_queue.put({"type":"on_progress", "progress_rate":progress})

def on_complete(stream, file_handle, message_queue):
    message_queue.put({"type":"on_complete"})

def download(url, message_queue):
    print("Download process starting ... PID:{}, PPID:{}".format(os.getpid(), psutil.Process(os.getpid()).ppid()), flush=True)
    on_progress_with_MQ = partial(on_progress, message_queue=message_queue)
    on_complete_with_MQ = partial(on_complete, message_queue=message_queue)
    youtube_clip = YouTube(
                        url,
                        on_progress_callback=on_progress_with_MQ,
                        on_complete_callback=on_complete_with_MQ)
    # Adaptive (DASH) streams hold video or audio only; use progressive=True for a combined file
    youtube_stream = youtube_clip.streams.filter(
                        adaptive=True,
                        file_extension='mp4').first()
    youtube_stream.download("multiprocessing")

- Multiprocessing: run the UI and the download in separate processes that share one queue

In [23]:
# Charlie Puth - One Call Away [Official Video]
url = "https://www.youtube.com/watch?v=BxuY9FET9Y4"

print("main process running ... PID:{}".format(os.getpid()), flush=True)

message_queue = multiprocessing.Queue()

process1 = multiprocessing.Process(target=draw_ui, args=(message_queue,))
process2 = multiprocessing.Process(target=download, args=(url, message_queue,))

process1.start()
process2.start()

process1.join()
process2.join()

main process running ... PID:1215
UI process starting ... PID:14454, PPID:1215
Download process starting ... PID:14457, PPID:1215


Downloading...: 100%|██████████| 100/100 [00:23<00:00,  4.28it/s]
