<a href="https://colab.research.google.com/github/nateraw/encoded-video/blob/main/examples/encoded_video_demo.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [1]:
%%capture
! pip install encoded-video

In [2]:
! wget -nc  https://dl.fbaipublicfiles.com/pytorchvideo/projects/archery.mp4

--2021-10-28 16:01:17--  https://dl.fbaipublicfiles.com/pytorchvideo/projects/archery.mp4
Resolving dl.fbaipublicfiles.com (dl.fbaipublicfiles.com)... 104.22.74.142, 172.67.9.4, 104.22.75.142, ...
Connecting to dl.fbaipublicfiles.com (dl.fbaipublicfiles.com)|104.22.74.142|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 549197 (536K) [video/mp4]
Saving to: ‘archery.mp4’


2021-10-28 16:01:19 (781 KB/s) - ‘archery.mp4’ saved [549197/549197]



In [3]:
import numpy as np
from IPython.display import Audio

from encoded_video import read_video, write_video, video_to_bytes, bytes_to_video

In [4]:
def show_video_info(vid):
    """Print each entry of a video dictionary (as returned by read_video or EncodedVideo.get_clip), showing the shape for arrays and the value otherwise."""
    for k, v in vid.items():
        if isinstance(v, np.ndarray):
            print(f"{k} shape: {v.shape}")
        else:
            print(f"{k}: {v}")

In [5]:
vid = read_video('archery.mp4')
show_video_info(vid)

video shape: (300, 240, 320, 3)
audio shape: (441344,)
duration: 10.01
fps: 29.97002997002997
audio_fps: 44100
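
The printed values can be cross-checked against each other with a little arithmetic. The sketch below hard-codes the numbers from the output above instead of calling the library, so it makes no claim about how `read_video` derives `duration`. It only shows that the frame count divided by `fps` gives 10.01 s, and that the audio track is marginally shorter than that.

```python
# Values copied from the output above (not read from the file).
num_frames = 300
fps = 29.97002997002997        # equal to 30000 / 1001
num_audio_samples = 441344
audio_fps = 44100

# Length of each stream in seconds.
video_seconds = num_frames / fps
audio_seconds = num_audio_samples / audio_fps

print(f"video: {video_seconds:.4f} s")
print(f"audio: {audio_seconds:.4f} s")
```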


Both the video frames and the audio samples are represented as NumPy arrays.

In [6]:
video_arr = vid['video']  # (T, H, W, C)
audio_arr = vid['audio']  # (S,)
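
Because the video is a plain `(T, H, W, C)` array, ordinary NumPy indexing is enough to pull out frames or a time range. The following is a minimal sketch: `slice_clip` is a hypothetical helper written for this example (it is not part of `encoded_video`), and `demo_video` is a small all-zero stand-in whose `uint8` dtype is an assumption.

```python
import numpy as np

def slice_clip(video, fps, start_sec, end_sec):
    """Return the frames of a (T, H, W, C) array between two timestamps."""
    start = int(round(start_sec * fps))
    end = int(round(end_sec * fps))
    return video[start:end]

# Stand-in with the same axis layout as vid['video'] but a smaller frame size.
demo_video = np.zeros((300, 24, 32, 3), dtype=np.uint8)
demo_fps = 30000 / 1001

clip = slice_clip(demo_video, demo_fps, start_sec=2.0, end_sec=4.0)
first_frame = demo_video[0]       # one frame: (H, W, C)
channel_0 = demo_video[..., 0]    # first channel of every frame: (T, H, W)

print(clip.shape, first_frame.shape, channel_0.shape)
```

Slicing along the first axis returns a view, so selecting a clip this way does not copy the frame data.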

Let's listen to the audio track.

In [7]:
Audio(data=vid['audio'], rate=vid['audio_fps'])

We can write the video, together with its audio, to disk with `write_video`:

In [8]:
write_video(
    'out.mp4',
    vid['video'],
    fps=30,  # the source reported ~29.97 fps; the output is written at 30
    audio_array=np.expand_dims(vid['audio'], 0),
    audio_fps=vid['audio_fps'],
    audio_codec='aac'
)
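
The call above passes `np.expand_dims(vid['audio'], 0)`, which turns the one-dimensional audio array of shape `(S,)` into a two-dimensional array of shape `(1, S)`. Presumably `write_video` expects a leading channel axis; that reading is an assumption based on the call itself, not on the library's documentation. The sketch below uses a silent stand-in array (its `float32` dtype is also an assumption) to show the shape change on its own.

```python
import numpy as np

# Stand-in for vid['audio']: same length, all zeros.
mono = np.zeros(441344, dtype=np.float32)

# Insert a new axis of length 1 at position 0.
with_channel_axis = np.expand_dims(mono, 0)

print(mono.shape)               # (441344,)
print(with_channel_axis.shape)  # (1, 441344)

# Equivalent spellings of the same reshape.
same_a = mono[np.newaxis, :]
same_b = mono.reshape(1, -1)
```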

Alternatively, we can serialize the video to bytes in memory, without writing a file:

In [11]:
out_bytes = video_to_bytes(
    vid['video'],
    fps=30,
    audio_array=np.expand_dims(vid['audio'], 0),
    audio_fps=vid['audio_fps'],
    audio_codec='aac'
)
assert isinstance(out_bytes, bytes)

The video can then be loaded back directly from those bytes:

In [12]:
restored_video = bytes_to_video(out_bytes)
show_video_info(restored_video)
Audio(data=restored_video['audio'], rate=restored_video['audio_fps'])

video shape: (300, 240, 320, 3)
audio shape: (442368,)
duration: 10.031020408163265
fps: 30.0
audio_fps: 44100
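
The restored clip does not report exactly the same metadata as the original: `fps` is now 30.0, the value passed when encoding, and the audio has grown from 441344 to 442368 samples. The sketch below redoes the arithmetic with numbers copied from the two outputs in this notebook. The 1024-sample difference matches the usual AAC frame length, so encoder padding is a plausible explanation, but that is an inference from the numbers and not something the output confirms.

```python
# Values copied from the outputs of read_video and bytes_to_video above.
audio_fps = 44100
samples_before = 441344
samples_after = 442368
reported_duration_after = 10.031020408163265

# How many samples the round trip added.
extra_samples = samples_after - samples_before
print(extra_samples)                 # 1024

# Audio length after the round trip, compared with the reported duration.
audio_seconds_after = samples_after / audio_fps
print(audio_seconds_after, reported_duration_after)

# 300 frames at the new frame rate.
video_seconds_after = 300 / 30.0
print(video_seconds_after)           # 10.0
```

In these numbers the reported `duration` lines up with the audio length, not with the 10.0 s implied by the frame count.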
