
# SBS 3D Video Generation Pipeline

This notebook outlines the process of converting a monocular video into a side-by-side (SBS) 3D video.



## Setup and Preparation

Import necessary libraries and define the input video path.

Personally, I opt for the file structure `datasets/d{index}/[set of input/output folder for frames]`

In [1]:
import timeit
import os
import subprocess

dataset = 0
# Define the path to the input video
dataset_directory = f'datasets/d{dataset}/'
input_video_path = dataset_directory + 'test.mp4'
final_video_filename = dataset_directory + 'test_SBS.mp4'

## Import and Setup Depth Anything Project

Run depth anything model for video input, specify if wishing to include depth model locally

In [2]:
local_path = False
local_path_directory = "your_local_directory"
current_working_directory = os.getcwd()

In [3]:
if local_path:
    local_path_directory = r"depth_models"
    # Clone the repository
    os.system("git clone https://github.com/LiheYoung/Depth-Anything")
    # Change directory to the cloned repository
    os.chdir("Depth-Anything")
    # Create a Conda environment named 'depth-anything' with Python 3.11
    os.system("conda create -n depth-anything python=3.11")
    # Activate the Conda environment
    os.system("conda activate depth-anything")
    # Install PyTorch and other dependencies
    os.system("conda install pytorch torchvision torchaudio pytorch-cuda=11.8 -c pytorch -c nvidia")
    # Install additional Python packages from requirements.txt
    os.system("pip install -r requirements.txt")
    # Print a message to indicate successful setup
    print("Project setup complete.")

In [4]:
# if not local path, specify remote path for depth-anything
remote_path = r"..\Depth-Anything"
if not local_path: local_path_directory = remote_path
print(f"Path set {local_path_directory}")

Path set ..\Depth-Anything


In [5]:
def checkPath(path):
    if os.path.exists(path): print(f"The file '{path}' exists.")
    else: print(f"The file '{path}' does not exist.")

In [8]:
rgbd_frames = dataset_directory + 'rgbd_in/'
os.makedirs(rgbd_frames, exist_ok=True)

print(f"dataset directory {dataset_directory}\n video dire {input_video_path}\n outdir {dataset_directory}")
print(f"local path {local_path_directory}")

# Check if the file exists
file_path = os.path.join(local_path_directory, "run_depth_only.py")

# Get the current working directory
current_working_directory = os.getcwd()

# Define the full paths to input video and dataset directory
input_video_path_full = os.path.join(current_working_directory, input_video_path)
dataset_directory_full = os.path.join(current_working_directory, dataset_directory)
checkPath(input_video_path_full)
checkPath(dataset_directory_full)


os.chdir(local_path_directory)
!python {local_path_directory}/run_depth_only.py --encoder vitl --video-path {input_video_path_full} --outdir {dataset_directory_full}

dataset directory datasets/d0/
 video dire datasets/d0/test.mp4
 outdir datasets/d0/
local path ..\Depth-Anything
The file 'C:\Users\abahrema\Documents\Tools\sbs-generator\datasets/d0/test.mp4' exists.
The file 'C:\Users\abahrema\Documents\Tools\sbs-generator\datasets/d0/' exists.
Cuda available? True
Model directory C:\Users\abahrema\Documents\Tools\Depth-Anything\checkpoints\depth_anything_vitb14
Loading weights from local directory
Total parameters: 97.47M
Progress 1/1, Processing C:\Users\abahrema\Documents\Tools\sbs-generator\datasets/d0/test.mp4


xFormers not available
xFormers not available



## Extract Frames from Video

Use ffmpeg to extract frames from the input color video and depth video.


In [5]:
# switch directory back
output_depth_video = dataset_directory + "test_video_depth.mp4"
os.chdir(current_working_directory)
print(f"Changed directory to {current_working_directory}")

Changed directory to C:\Users\abahrema\Documents\Tools\sbs-generator


In [17]:
# Create directory if non-existent
output_frames_path = dataset_directory + 'rgbd_in/frame%d.jpg'
output_dir = os.path.dirname(output_frames_path)
os.makedirs(output_dir, exist_ok=True)
print(f"output_frames_path: {output_frames_path}, output_dir: {output_dir}")
print(f"video {input_video_path}")
# execute ffmpeg command for color
!ffmpeg -i {input_video_path} -q:v 2 {output_frames_path} 
output_frames_path = dataset_directory + 'rgbd_in/frame%d.png' # for depth
!ffmpeg -i {output_depth_video} -q:v 2 {output_frames_path}


output_frames_path: datasets/d0/rgbd_in/frame%d.jpg, output_dir: datasets/d0/rgbd_in
video datasets/d0/test.mp4


ffmpeg version 2024-01-28-git-e0da916b8f-full_build-www.gyan.dev Copyright (c) 2000-2024 the FFmpeg developers
  built with gcc 12.2.0 (Rev10, Built by MSYS2 project)
  configuration: --enable-gpl --enable-version3 --enable-static --pkg-config=pkgconf --disable-w32threads --disable-autodetect --enable-fontconfig --enable-iconv --enable-gnutls --enable-libxml2 --enable-gmp --enable-bzlib --enable-lzma --enable-libsnappy --enable-zlib --enable-librist --enable-libsrt --enable-libssh --enable-libzmq --enable-avisynth --enable-libbluray --enable-libcaca --enable-sdl2 --enable-libaribb24 --enable-libaribcaption --enable-libdav1d --enable-libdavs2 --enable-libuavs3d --enable-libzvbi --enable-librav1e --enable-libsvtav1 --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxavs2 --enable-libxvid --enable-libaom --enable-libjxl --enable-libopenjpeg --enable-libvpx --enable-mediafoundation --enable-libass --enable-frei0r --enable-libfreetype --enable-libfribidi --enable-libharfbuzz --e


## Image Preprocessing

Rename and pair color and depth images as needed. Run the script or run the function inside notebook.


In [None]:
rgbd_frames = dataset_directory + 'rgbd_in/'
# Example for renaming images (adjust according to your script)
!python sbs_rename_directory.py {rgbd_frames}

In [20]:
import os
import re

def get_frame_number(filename):
    match = re.search(r"frame(\d+)_", filename)
    if match:
        return int(match.group(1))
    else:
        raise ValueError(f"Invalid filename format: {filename}")
        
def rename_files(source_dir):
    os.makedirs(source_dir, exist_ok=True)

    # Process color images
    color_files = sorted([f for f in os.listdir(source_dir) if f.startswith("frame") and f.endswith(".jpg")], 
                         key=lambda x: int(x.split("frame")[1].split(".")[0]))
    counter = 1
    for filename in color_files:
        new_name = f"color{counter}.jpg"
        os.rename(os.path.join(source_dir, filename), os.path.join(source_dir, new_name))
        counter += 1
    print(f"Renamed {counter} color files in {source_dir}.")

    # Process depth images
    depth_files = sorted([f for f in os.listdir(source_dir) if f.startswith("frame") and f.endswith(".png")], 
                         key=lambda x: int(x.split("frame")[1].split(".")[0]))
    counter = 1
    for filename in depth_files:
        new_name = f"depth{counter}.png"
        os.rename(os.path.join(source_dir, filename), os.path.join(source_dir, new_name))
        counter += 1
    print(f"Renamed {counter} depth files in {source_dir}.")
    
source_dir = dataset_directory + "rgbd_in/"
rename_files(source_dir)

Renamed 1 color files in datasets/d0/rgbd_in/.
Renamed 81 depth files in datasets/d0/rgbd_in/.



## Generate Stereo Views

Run the script to generate left and right eye views or run the function inside notebook.


In [None]:
stereo_input_dir =  dataset_directory + "rgbd_in/"
stereo_output_dir = dataset_directory + "stereo_out_frames/"
os.makedirs(stereo_input_dir, exist_ok=True)
os.makedirs(stereo_output_dir, exist_ok=True)

!python sbs_generate_stereoviews.py {stereo_input_dir} {stereo_output_dir}

In [None]:
%time

import cv2
import numpy as np
import os

def process_images(input_dir, output_dir, scale_factor):
    if not os.path.exists(output_dir):
        os.makedirs(output_dir)

    color_images = sorted([f for f in os.listdir(input_dir) if f.startswith('color')])
    depth_images = sorted([f for f in os.listdir(input_dir) if f.startswith('depth')])

    for color_image_path, depth_image_path in zip(color_images, depth_images):
        color_image = cv2.imread(os.path.join(input_dir, color_image_path))
        depth_map = cv2.imread(os.path.join(input_dir, depth_image_path), cv2.IMREAD_GRAYSCALE)

        if color_image is None:
            print(f"Error: Color image not found at {os.path.join(input_dir, color_image_path)}")
            continue

        if depth_map is None:
            print(f"Error: Depth map not found at {os.path.join(input_dir, depth_image_path)}")
            continue

        # Function to shift pixels based on depth map
        def shift_pixels(image, depth_map, direction):
            shifted_image = np.zeros_like(image)
            for y in range(image.shape[0]):
                for x in range(image.shape[1]):
                    disparity = calculate_disparity(depth_map[y, x])
                    new_x = x + disparity * direction
                    if 0 <= new_x < image.shape[1]:
                        shifted_image[y, new_x] = image[y, x]
            return shifted_image

        # Calculate disparity (example function, adjust as needed)
        def calculate_disparity(depth_value):
            # Simple linear mapping, adjust the scale factor as needed
            return int(depth_value * scale_factor)

        # Create left and right eye images
        left_eye_image = shift_pixels(color_image, depth_map, 1)
        right_eye_image = shift_pixels(color_image, depth_map, -1)

        frame_number = color_image_path.split('color')[1].split('.')[0]
        
        if not os.path.exists(os.path.join(output_dir,'leftEye')):
            os.makedirs(os.path.join(output_dir,'leftEye'))
        if not os.path.exists(os.path.join(output_dir,'rightEye')):
            os.makedirs(os.path.join(output_dir,'rightEye'))
        
        left_eye_output_path = os.path.join(output_dir, f'leftEye/leftEye{frame_number}.jpg')
        right_eye_output_path = os.path.join(output_dir, f'rightEye/rightEye{frame_number}.jpg')

        # Save the left and right eye images
        cv2.imwrite(left_eye_output_path, left_eye_image)
        cv2.imwrite(right_eye_output_path, right_eye_image)

        print(f"Processed frame {frame_number}.")

# Example usage
stereo_input_dir =  dataset_directory + "rgbd_in/"
stereo_output_dir = dataset_directory + "stereo_out_frames/"
os.makedirs(stereo_input_dir, exist_ok=True)
os.makedirs(stereo_output_dir, exist_ok=True)
process_images(stereo_input_dir, stereo_output_dir, 0.05)

CPU times: total: 0 ns
Wall time: 0 ns
Processed frame 1.
Processed frame 10.
Processed frame 11.
Processed frame 12.
Processed frame 13.
Processed frame 14.
Processed frame 15.
Processed frame 16.
Processed frame 17.
Processed frame 18.
Processed frame 19.
Processed frame 2.
Processed frame 20.
Processed frame 21.
Processed frame 22.
Processed frame 23.
Processed frame 24.
Processed frame 25.
Processed frame 26.
Processed frame 27.
Processed frame 28.
Processed frame 29.
Processed frame 3.
Processed frame 30.
Processed frame 31.
Processed frame 32.
Processed frame 33.
Processed frame 34.
Processed frame 35.
Processed frame 36.
Processed frame 37.
Processed frame 38.
Processed frame 39.
Processed frame 4.
Processed frame 40.
Processed frame 41.
Processed frame 42.
Processed frame 43.
Processed frame 44.
Processed frame 45.
Processed frame 46.
Processed frame 47.
Processed frame 48.
Processed frame 49.
Processed frame 5.
Processed frame 50.
Processed frame 51.
Processed frame 52.
Proces


## Inpainting Process

Run the script for inpainting left and right eye images or run the function inside notebook.


In [None]:
stereo_output_dir = dataset_directory + "stereo_out_frames/"
stereo_postprocess_dir = dataset_directory + "stereo_postprocess_frames/"
os.makedirs(stereo_output_dir, exist_ok=True)
os.makedirs(stereo_postprocess_dir, exist_ok=True)

!python sbs_inpaint_stereoviews.py {stereo_output_dir} {stereo_postprocess_dir}

In [None]:
import cv2
import numpy as np
import os

def create_mask_for_black_streaks(image):
    # Convert the image to grayscale
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    
    # Use adaptive thresholding to better capture the black streaks
    mask = cv2.adaptiveThreshold(gray, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY_INV, 3, 8)
    
    # Dilate the mask to include the edges of the black streaks
    kernel = np.ones((5,5), np.uint8)
    mask = cv2.dilate(mask, kernel, iterations=1)
    
    return mask

def inpaint_black_streaks(image, mask):
    # Inpaint the black streaks in the image
    inpainted_image = cv2.inpaint(image, mask, 5, cv2.INPAINT_TELEA)
    
    return inpainted_image

def process_images(input_dir, output_dir, save_masks=False):
    if not os.path.exists(output_dir):
        os.makedirs(output_dir)
    if not os.path.exists(os.path.join(output_dir,'leftEye')):
        os.makedirs(os.path.join(output_dir,'leftEye'))
    if not os.path.exists(os.path.join(output_dir,'rightEye')):
        os.makedirs(os.path.join(output_dir,'rightEye'))
    
    if save_masks:
        if not os.path.exists(os.path.join(output_dir,'leftEyeMask')):
            os.makedirs(os.path.join(output_dir,'leftEyeMask'))
        if not os.path.exists(os.path.join(output_dir,'rightEyeMask')):
            os.makedirs(os.path.join(output_dir,'rightEyeMask'))

    left_path = input_dir + "leftEye/"
    right_path = input_dir + "rightEye/"
    left_eye_images = sorted([f for f in os.listdir(left_path) if f.startswith('leftEye')])
    right_eye_images = sorted([f for f in os.listdir(right_path) if f.startswith('rightEye')])

    for left_eye_image_path, right_eye_image_path in zip(left_eye_images, right_eye_images):
        left_eye_image = cv2.imread(os.path.join(left_path, left_eye_image_path))
        right_eye_image = cv2.imread(os.path.join(right_path, right_eye_image_path))

        if left_eye_image is None or right_eye_image is None:
            print(f"Error: Image not found at {os.path.join(input_dir, left_eye_image_path)} or {os.path.join(input_dir, right_eye_image_path)}")
            continue

        # Create masks for the black streaks in both left and right eye images
        left_eye_mask = create_mask_for_black_streaks(left_eye_image)
        right_eye_mask = create_mask_for_black_streaks(right_eye_image)

        # Inpaint the black streaks in both left and right eye images
        left_eye_post = inpaint_black_streaks(left_eye_image, left_eye_mask)
        right_eye_post = inpaint_black_streaks(right_eye_image, right_eye_mask)

        frame_number = left_eye_image_path.split('leftEye')[1].split('.')[0]
        left_eye_post_output_path = os.path.join(output_dir + "leftEye/", f'leftEyePost{frame_number}.jpg')
        right_eye_post_output_path = os.path.join(output_dir + "rightEye/", f'rightEyePost{frame_number}.jpg')
        # Save the processed images and masks
        cv2.imwrite(left_eye_post_output_path, left_eye_post)
        cv2.imwrite(right_eye_post_output_path, right_eye_post)
        
        if (save_masks):
            left_eye_mask_output_path = os.path.join(output_dir + "leftEyeMask/", f'leftEyeMask{frame_number}.jpg')
            right_eye_mask_output_path = os.path.join(output_dir + "rightEyeMask/", f'rightEyeMask{frame_number}.jpg')
            cv2.imwrite(left_eye_mask_output_path, left_eye_mask)
            cv2.imwrite(right_eye_mask_output_path, right_eye_mask)

        print(f"Processed frame {frame_number}.")

                          
stereo_output_dir = dataset_directory + "stereo_out_frames/"
stereo_postprocess_dir = dataset_directory + "stereo_postprocess_frames/"
os.makedirs(stereo_output_dir, exist_ok=True)
os.makedirs(stereo_postprocess_dir, exist_ok=True)                          
# Example usage
process_images(stereo_output_dir, stereo_postprocess_dir)



## Create Videos from Images

Use ffmpeg to create left and right eye videos. 

**Note** what the appropriate frame rate should be based on your input video.


In [None]:
left_eye_dir = dataset_directory + "stereo_postprocess_frames/" + "leftEye/"
right_eye_dir = dataset_directory + "stereo_postprocess_frames/" + "rightEye/"
os.makedirs(left_eye_dir, exist_ok=True)
os.makedirs(right_eye_dir, exist_ok=True)
left_eye_dir += "leftEyePost%d.jpg"
right_eye_dir += "rightEyePost%d.jpg"

left_eye_vid = dataset_directory + "left_eye.mp4"
right_eye_vid = dataset_directory + "right_eye.mp4"


!ffmpeg -framerate 24 -i {left_eye_dir} -c:v libx264 -pix_fmt yuv420p -vf "fps=24" {left_eye_vid}
!ffmpeg -framerate 24 -i {right_eye_dir} -c:v libx264 -pix_fmt yuv420p -vf "fps=24" {right_eye_vid}



## Merge Videos and Inject Metadata

Combine the left and right eye videos into an SBS video and inject 3D metadata.


In [None]:
left_eye_vid = dataset_directory + "left_eye.mp4"
right_eye_vid = dataset_directory + "right_eye.mp4"
output_vid = dataset_directory + "output.SBS.mp4"

!ffmpeg -i {left_eye_vid} -i {right_eye_vid} -filter_complex "[0:v][1:v]hstack=inputs=2[v]" -map "[v]" {output_vid}
!ffmpeg -i {output_vid} -vf "scale=2*iw:ih" -c:v libx264 -x264opts "frame-packing=3" -aspect 2:1 {final_video_filename}


## Upload SBS video to Quest headset

First register device connection, then push file, and lastly force the file system to update without restarting the device.


In [None]:
!adb devices
!adb push {final_video_filename} /sdcard/Movies/
!adb shell am force-stop com.android.providers.media.module


## Cleanup and Finalization

(Optional) Cleanup temporary files and display/export the final video path.


In [None]:
# Example cleanup (adjust as needed)

#rgbd frame directories
rgbd_frames = dataset_directory + 'rgbd_in/'
#stereo directories
stereo_input_dir =  dataset_directory + "input_frames/"
stereo_output_dir = dataset_directory + "stereo_out_frames/"
#inpainting directories
stereo_postprocess_dir = dataset_directory + "stereo_postprocess_frames/"

#video directories
left_eye_vid = dataset_directory + "left_eye.mp4"
right_eye_vid = dataset_directory + "right_eye.mp4"
output_vid = dataset_directory + "output.SBS.mp4"

# delete all directories and videos
!rm -rf {rgbd_frames}
!rm -rf {stereo_input_dir}
!rm -rf {stereo_output_dir}
!rm -rf {stereo_postprocess_dir}
!rm {left_eye_vid} {right_eye_vid} {output_vid}