<a href="https://colab.research.google.com/github/karen-pal/notebooks/blob/master/videogrep_workshop.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Videogrep

Here's a quick notebook about using [Videogrep](https://github.com/antiboredom/videogrep) from within Google Colab.

## Install dependencies

In [1]:
!pip install videogrep
!pip install yt-dlp
!pip install vosk

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting videogrep
  Downloading videogrep-2.1.2-py3-none-any.whl (41.2 MB)
Collecting moviepy<2.0.0,>=1.0.3
  Downloading moviepy-1.0.3.tar.gz (388 kB)
Collecting beautifulsoup4<5.0.0,>=4.11.1
  Downloading beautifulsoup4-4.11.1-py3-none-any.whl (128 kB)
Collecting soupsieve>1.2
  Downloading soupsieve-2.3.2.post1-py3-none-any.whl (37 kB)
Collecting proglog<=1.0.0
  Downloading proglog-0.1.10-py3-none-any.whl (6.1 kB)
Collecting imageio_ffmpeg>=0.2.0
  Downloading imageio_ffmpeg-0.4.7-py3-none-manylinux2010_x86_64.whl (26.9 MB)
Building wheels for collected packages: moviepy
  Building wheel for moviepy (setup.py) ...

## Set up a preview function for videos

In [2]:
from IPython.display import HTML
from base64 import b64encode


def preview(video_path, video_width=600):
    """Preview a video in ipython. From:
    https://stackoverflow.com/questions/57377185/how-play-mp4-video-in-google-colab"""
    # Read the raw bytes; read-only binary mode is enough and the file is closed afterwards
    with open(video_path, "rb") as f:
        video_file = f.read()
    # Embed the video in the page as a base64 data URL
    video_url = f"data:video/mp4;base64,{b64encode(video_file).decode()}"
    return HTML(f"""<video width={video_width} controls><source src="{video_url}"></video>""")

## Download a video to work with

Use `yt-dlp` to download a YouTube video. The option `-f 22` downloads a smaller 1280x720 version of the video; `-o shell2.mp4` saves the video as `shell2.mp4`; and `--write-auto-sub` downloads the video's auto-generated subtitle file. Some later cells read `shell.mp4`, so use whichever file name you downloaded to.
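
The same download can also be driven from Python, which is handy if you want to loop over several URLs. The following is a minimal sketch, assuming the `yt_dlp` package installed above; `build_options` and `download` are hypothetical helper names, and the dictionary keys are the `YoutubeDL` option names that should correspond to the three command-line flags described here.

```python
def build_options(output_name):
    """Map the command-line flags used in this notebook onto yt-dlp option names."""
    return {
        "format": "22",             # -f 22
        "outtmpl": output_name,     # -o shell2.mp4
        "writeautomaticsub": True,  # --write-auto-sub
    }


def download(url, output_name="shell2.mp4"):
    """Download one video with the options above (needs network access)."""
    # Imported here so the option mapping can be inspected without yt-dlp installed
    import yt_dlp

    with yt_dlp.YoutubeDL(build_options(output_name)) as ydl:
        ydl.download([url])
```

Calling `download("https://www.youtube.com/watch?v=n0G5I5tGf2c")` should then behave like the shell command in the next cell.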

In [15]:
!yt-dlp "https://www.youtube.com/watch?v=n0G5I5tGf2c" -f 22 -o shell2.mp4 --write-auto-sub

[youtube] n0G5I5tGf2c: Downloading webpage
[youtube] n0G5I5tGf2c: Downloading android player API JSON
[info] n0G5I5tGf2c: Downloading subtitles: en
[info] n0G5I5tGf2c: Downloading 1 format(s): 22
[info] Writing video subtitles to: shell2.en.vtt
[download] Destination: shell2.en.vtt
[download] 100% of    1.67KiB in 00:00:00 at 6.93KiB/s
[download] Destination: shell2.mp4
[download] 100% of   30.31MiB in 00:00:04 at 6.32MiB/s


## Print out the most common words in the video

`--ngrams 5` lists the most frequent five-word sequences in the transcript; use `--ngrams 1` to count single words.

In [20]:
!videogrep --input shell2.mp4 --ngrams 5

     2
the   these  1
  these   1
 these    1
these     1
    it 1
   it  1
  it   1
 it    1
it     1
    dc 1
   dc  1
  dc   1
 dc   and 1
dc   and  1
  and   1
 and    1
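
To make the idea of n-gram counting concrete, here is a minimal sketch in plain Python. It is not videogrep's implementation; `count_ngrams` and the sample word list are hypothetical and only illustrate how runs of `n` consecutive words can be tallied.

```python
from collections import Counter


def count_ngrams(words, n):
    """Count every run of n consecutive words in a list of words."""
    # Zipping n shifted copies of the list yields each window of n words
    windows = zip(*(words[i:] for i in range(n)))
    return Counter(" ".join(window) for window in windows)


# Hypothetical transcript, already split into words
words = "the shell the shell the hand".split()

for phrase, count in count_ngrams(words, 2).most_common(3):
    print(phrase, count)
```

With `n=1` the same function counts single words, which is usually the more useful starting point when looking for search terms.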


## Create a supercut

`--search-type fragment` tells videogrep to extract only the matching words or phrases instead of the whole sentences that contain them.

`--search` tells videogrep what word or phrase to look for.

In [10]:
!videogrep --input shell.mp4 --search-type fragment --search hand

[+] Creating clips.
[+] Concatenating clips.
[+] Writing ouput file.
Moviepy - Building video supercut.mp4.
MoviePy - Writing audio in temp-audio1668645791.306925.m4a
MoviePy - Done.
Moviepy - Writing video supercut.mp4

Moviepy - Done !
Moviepy - video ready supercut.mp4
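
The difference between cutting whole sentences and cutting fragments can be sketched in a few lines of Python. This is an illustration only, not videogrep's internals: the transcript layout, the timestamps, and the function names `sentence_clips` and `fragment_clips` are all hypothetical, and the fragment version handles single words only.

```python
import re

# Hypothetical transcript: one sentence with word-level timestamps in seconds
sentence = {
    "content": "he raised his hand slowly",
    "start": 10.0,
    "end": 12.5,
    "words": [
        {"word": "he", "start": 10.0, "end": 10.2},
        {"word": "raised", "start": 10.2, "end": 10.8},
        {"word": "his", "start": 10.8, "end": 11.0},
        {"word": "hand", "start": 11.0, "end": 11.6},
        {"word": "slowly", "start": 11.6, "end": 12.5},
    ],
}


def sentence_clips(sentences, query):
    """Return (start, end) for every whole sentence that matches the query."""
    return [(s["start"], s["end"]) for s in sentences if re.search(query, s["content"])]


def fragment_clips(sentences, query):
    """Return (start, end) for each individual word that matches the query."""
    return [
        (w["start"], w["end"])
        for s in sentences
        for w in s["words"]
        if re.search(query, w["word"])
    ]


print(sentence_clips([sentence], "hand"))
print(fragment_clips([sentence], "hand"))
```

The sentence version returns the span of the whole line, while the fragment version returns only the span of the matched word, which is why fragment supercuts need a transcript with per-word timing.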


## Preview the video

In [11]:
preview("supercut.mp4")

## Transcribe a video

You can use the `--transcribe` option to transcribe a video (this relies on the `vosk` package installed earlier). Sometimes this yields better results than YouTube's auto-generated subtitles.

In [None]:
!videogrep --input shell.mp4 --transcribe

Transcribing shell.mp4


## Create another supercut

`--output` sets the file name of the supercut; without it, videogrep writes `supercut.mp4`.

In [13]:
!videogrep --input shell.mp4 --search-type fragment --search swallowed --output billion.mp4

[+] Creating clips.
[+] Concatenating clips.
[+] Writing ouput file.
Moviepy - Building video billion.mp4.
MoviePy - Writing audio in temp-audio1668646401.717737.m4a
MoviePy - Done.
Moviepy - Writing video billion.mp4

Moviepy - Done !
Moviepy - video ready billion.mp4


In [14]:
preview("billion.mp4")