
# SBS 3D Video Generation Pipeline

This notebook outlines the process of converting a monocular video into a side-by-side (SBS) 3D video.



## Setup and Preparation

Import necessary libraries and define the input video path.


In [None]:
import timeit
import os
import subprocess

# Define the path to the input video
input_video_path = 'datasets/data_in/input.mp4'



## Extract Frames from Video

Use ffmpeg to extract frames from the input video.


In [None]:

!ffmpeg -i {input_video_path} -q:v 2 datasets/data_in/frame%d.jpg
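
The same extraction can also be driven from Python with the `subprocess` module imported earlier, which makes it easy to create the output directory first and to stop the notebook if ffmpeg reports an error. The sketch below is only an illustration: the helper names `build_extract_command` and `extract_frames` are hypothetical, and it assumes `ffmpeg` is available on the `PATH`.

```python
import os
import subprocess

def build_extract_command(video_path, frames_dir, quality=2):
    """Build the ffmpeg argument list that writes frames as frame<N>.jpg into frames_dir."""
    pattern = os.path.join(frames_dir, "frame%d.jpg")
    return ["ffmpeg", "-i", video_path, "-q:v", str(quality), pattern]

def extract_frames(video_path, frames_dir, quality=2):
    """Create frames_dir if needed, then run ffmpeg (raises CalledProcessError on failure)."""
    os.makedirs(frames_dir, exist_ok=True)
    subprocess.run(build_extract_command(video_path, frames_dir, quality), check=True)

# Example call (not executed here):
# extract_frames("datasets/data_in/input.mp4", "datasets/data_in")
```

Passing the command as a list avoids shell quoting problems when the video path contains spaces.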



## Generate Depth Images

In this step, we generate a depth map for each color frame. This notebook uses [Marigold](https://github.com/prs-eth/Marigold?tab=readme-ov-file), a depth estimation library that runs on both Mac and Windows; its `run.py` script also accepts an `--apple_silicon` flag for Apple Silicon Macs.

For Windows users, [PatchFusion](https://zhyever.github.io/patchfusion/) is a recommended alternative. Any depth estimation model that runs on your operating system can be used in this step; the only requirement is one depth map per color frame.

**Ensure that your datasets are placed in the specified input directory, and the output directory is set up to receive the depth maps.**
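
Because the later steps pair color frames and depth maps one to one, it can help to confirm that no frame is missing a depth map before continuing. The sketch below is a minimal check under assumed naming: color frames are `frame<N>.jpg`, depth maps are `frame<N>` followed by an optional suffix and `.png`. The function names are hypothetical and the patterns should be adjusted to whatever your depth tool actually writes.

```python
import os
import re

def frame_numbers(filenames, extension):
    """Collect frame indices from names like 'frame12.jpg' (a suffix before the extension is tolerated)."""
    pattern = re.compile(r"^frame(\d+).*" + re.escape(extension) + r"$")
    numbers = set()
    for name in filenames:
        match = pattern.match(name)
        if match:
            numbers.add(int(match.group(1)))
    return numbers

def frames_missing_depth(color_dir, depth_dir):
    """Return the sorted frame indices that have a color image but no depth map."""
    color = frame_numbers(os.listdir(color_dir), ".jpg")
    depth = frame_numbers(os.listdir(depth_dir), ".png")
    return sorted(color - depth)
```

An empty list from `frames_missing_depth` means every color frame has a counterpart.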

### Running Marigold on Mac

To generate depth maps with Marigold on an Apple Silicon Mac, run the following command from the directory containing the Marigold repository:

In [None]:
!python run.py --input_rgb_dir datasets/d0/data_in/ --output_dir datasets/d0/data_out/ --apple_silicon


## Image Preprocessing

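Rename the color frames to `color<N>.jpg` and the depth maps to `depth<N>.png`, numbered from 0, so that the stereo step can pair them by index.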


In [None]:
# Example for renaming images (adjust according to your script)
!python sbs_rename_directory.py

For color, run the following cell:

In [None]:
import os

source_dir = "./datasets/d2/battle/battle_in/"
target_prefix = "color"

def get_frame_number(filename):
    return int(filename.split("frame")[1].split(".")[0])


file_list = os.listdir(source_dir)
frame_files = sorted([f for f in file_list if f.startswith("frame") and f.endswith(".jpg")], key=get_frame_number)
counter = 0

for filename in frame_files:
    new_name = f"{target_prefix}{counter}.jpg"
    os.rename(os.path.join(source_dir, filename), os.path.join(source_dir, new_name))
    counter += 1

For depth, run the following cell:

In [4]:
import os

source_dir = "./datasets/d2/battle/battle_in/"
target_prefix = "depth"

def get_frame_number(filename):
    return int(filename.split("frame")[1].split(".")[0])


file_list = os.listdir(source_dir)
frame_files = sorted([f for f in file_list if f.startswith("frame") and f.endswith(".png")], key=get_frame_number)
counter = 0

for filename in frame_files:
    new_name = f"{target_prefix}{counter}.png"
    os.rename(os.path.join(source_dir, filename), os.path.join(source_dir, new_name))
    counter += 1

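
The two renaming cells differ only in prefix and extension, so they can be folded into one function. The sketch below is one possible consolidation, not the contents of `sbs_rename_directory.py`; the name `rename_frames` is hypothetical. It sorts by the numeric frame index (so `frame10` comes after `frame2`) and renumbers from 0.

```python
import os
import re

def rename_frames(source_dir, target_prefix, extension):
    """Rename frame<N><extension> files to <target_prefix>0, 1, 2, ... in numeric frame order."""
    pattern = re.compile(r"^frame(\d+)" + re.escape(extension) + r"$")
    matches = []
    for name in os.listdir(source_dir):
        match = pattern.match(name)
        if match:
            matches.append((int(match.group(1)), name))

    renamed = []
    for counter, (_, name) in enumerate(sorted(matches)):
        new_name = f"{target_prefix}{counter}{extension}"
        os.rename(os.path.join(source_dir, name), os.path.join(source_dir, new_name))
        renamed.append(new_name)
    return renamed

# Example calls (not executed here):
# rename_frames("./datasets/d2/battle/battle_in/", "color", ".jpg")
# rename_frames("./datasets/d2/battle/battle_in/", "depth", ".png")
```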


## Generate Stereo Views

Run the script to generate left and right eye views.


In [None]:
!python sbs_generate_stereoviews.py datasets/d2/data_in/ datasets/d2/data_out/  

In [None]:
%%time

import cv2
import numpy as np
import os

def process_images(input_dir, output_dir, scale_factor):
    if not os.path.exists(output_dir):
        os.makedirs(output_dir)

    color_images = sorted([f for f in os.listdir(input_dir) if f.startswith('color')])
    depth_images = sorted([f for f in os.listdir(input_dir) if f.startswith('depth')])

    for color_image_path, depth_image_path in zip(color_images, depth_images):
        color_image = cv2.imread(os.path.join(input_dir, color_image_path))
        depth_map = cv2.imread(os.path.join(input_dir, depth_image_path), cv2.IMREAD_GRAYSCALE)

        if color_image is None:
            print(f"Error: Color image not found at {os.path.join(input_dir, color_image_path)}")
            continue

        if depth_map is None:
            print(f"Error: Depth map not found at {os.path.join(input_dir, depth_image_path)}")
            continue

        # Function to shift pixels based on depth map
        def shift_pixels(image, depth_map, direction):
            shifted_image = np.zeros_like(image)
            for y in range(image.shape[0]):
                for x in range(image.shape[1]):
                    disparity = calculate_disparity(depth_map[y, x])
                    new_x = x + disparity * direction
                    if 0 <= new_x < image.shape[1]:
                        shifted_image[y, new_x] = image[y, x]
            return shifted_image

        # Calculate disparity (example function, adjust as needed)
        def calculate_disparity(depth_value):
            # Simple linear mapping, adjust the scale factor as needed
            return int(depth_value * scale_factor)

        # Create left and right eye images
        left_eye_image = shift_pixels(color_image, depth_map, 1)
        right_eye_image = shift_pixels(color_image, depth_map, -1)

        frame_number = color_image_path.split('color')[1].split('.')[0]
        
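        # cv2.imwrite does not create directories, so make the per-eye folders used below
        os.makedirs(os.path.join(output_dir, 'leftEye'), exist_ok=True)
        os.makedirs(os.path.join(output_dir, 'rightEye'), exist_ok=True)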
        
        left_eye_output_path = os.path.join(output_dir, f'leftEye/leftEye{frame_number}.jpg')
        right_eye_output_path = os.path.join(output_dir, f'rightEye/rightEye{frame_number}.jpg')

        # Save the left and right eye images
        cv2.imwrite(left_eye_output_path, left_eye_image)
        cv2.imwrite(right_eye_output_path, right_eye_image)

        print(f"Processed frame {frame_number}.")

# Example usage
process_images('./datasets/d2/data_in/', './datasets/d2/data_out/', 0.05)
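
To get a feel for the `scale_factor` argument, it helps to work out the pixel shifts it produces. The sketch below repeats the linear mapping from `calculate_disparity` for an 8-bit depth map (values 0 to 255); `disparity_range` is a hypothetical helper added for illustration.

```python
def disparity_for_depth(depth_value, scale_factor):
    """Same linear mapping as calculate_disparity: depth * scale, truncated to whole pixels."""
    return int(depth_value * scale_factor)

def disparity_range(scale_factor, max_depth=255):
    """Smallest and largest per-eye shift, in pixels, for an 8-bit depth map."""
    return disparity_for_depth(0, scale_factor), disparity_for_depth(max_depth, scale_factor)

print(disparity_range(0.05))  # (0, 12)
```

With `scale_factor=0.05`, each eye shifts a pixel by at most 12 columns, so the two views differ by up to 24 columns. Whether a high depth value means near or far depends on the depth tool's output convention, so the shift direction may need to be flipped for a given model.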

In [None]:
%%time

import cv2
import numpy as np
import os
from concurrent.futures import ThreadPoolExecutor, as_completed

def shift_pixels(image, depth_map, direction, scale_factor):
    shifted_image = np.zeros_like(image)
    for y in range(image.shape[0]):
        for x in range(image.shape[1]):
            disparity = int(depth_map[y, x] * scale_factor)
            new_x = x + disparity * direction
            if 0 <= new_x < image.shape[1]:
                shifted_image[y, new_x] = image[y, x]
    return shifted_image

def process_single_image_pair(color_image_path, depth_image_path, input_dir, output_dir, scale_factor):
    try:
        color_image = cv2.imread(os.path.join(input_dir, color_image_path))
        depth_map = cv2.imread(os.path.join(input_dir, depth_image_path), cv2.IMREAD_GRAYSCALE)

        if color_image is None or depth_map is None:
            return f"Error: Image not found at {os.path.join(input_dir, color_image_path)} or {os.path.join(input_dir, depth_image_path)}"

        left_eye_image = shift_pixels(color_image, depth_map, 1, scale_factor)
        right_eye_image = shift_pixels(color_image, depth_map, -1, scale_factor)

        frame_number = color_image_path.split('color')[1].split('.')[0]
        left_eye_output_path = os.path.join(output_dir, f'leftEye{frame_number}.jpg')
        right_eye_output_path = os.path.join(output_dir, f'rightEye{frame_number}.jpg')

        cv2.imwrite(left_eye_output_path, left_eye_image)
        cv2.imwrite(right_eye_output_path, right_eye_image)

        return f"Processed frame {frame_number}."
    except Exception as e:
        return f"Error processing {color_image_path}: {e}"

def process_images(input_dir, output_dir, scale_factor):
    if not os.path.exists(output_dir):
        os.makedirs(output_dir)

    color_images = sorted([f for f in os.listdir(input_dir) if f.startswith('color')])
    depth_images = sorted([f for f in os.listdir(input_dir) if f.startswith('depth')])

    with ThreadPoolExecutor(max_workers=8) as executor:
        futures = [
            executor.submit(process_single_image_pair, color_image_path, depth_image_path, input_dir, output_dir, scale_factor)
            for color_image_path, depth_image_path in zip(color_images, depth_images)
        ]

        # Report each result as soon as its worker finishes
        for future in as_completed(futures):
            print(future.result())

# Example usage
process_images('./datasets/data_in/', './datasets/data_out_fast/', 0.05)
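
The nested Python loops in `shift_pixels` are likely to dominate the runtime, and threads may help little because the loop body is pure Python. A vectorized NumPy version is one alternative. The sketch below is an assumption-laden rewrite, not part of the original scripts: `shift_pixels_vectorized` is a hypothetical name, and when several source pixels land on the same destination NumPy does not guarantee which one is kept, so results can differ from the loop version at occlusion boundaries.

```python
import numpy as np

def shift_pixels_vectorized(image, depth_map, direction, scale_factor):
    """Shift every pixel horizontally by int(depth * scale_factor) * direction."""
    height, width = depth_map.shape
    # Per-pixel disparity in whole pixels (truncated, like int() in the loop version)
    disparity = (depth_map.astype(np.float64) * scale_factor).astype(np.int64)
    # Destination column and source row for every pixel
    cols = np.arange(width)[np.newaxis, :] + disparity * direction
    rows = np.broadcast_to(np.arange(height)[:, np.newaxis], (height, width))
    # Keep only pixels that land inside the frame
    valid = (cols >= 0) & (cols < width)
    shifted = np.zeros_like(image)
    shifted[rows[valid], cols[valid]] = image[valid]
    return shifted
```

It takes the same arguments as `shift_pixels` in the threaded cell, so it can be swapped in inside `process_single_image_pair` for a comparison.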



## Inpainting Process

Run the script for inpainting left and right eye images.


In [None]:
!python sbs_inpaint_stereoviews.py data_out/ data_out_final/

In [None]:
import cv2
import numpy as np
import os

def create_mask_for_black_streaks(image):
    # Convert the image to grayscale
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    
    # Use adaptive thresholding to better capture the black streaks
    mask = cv2.adaptiveThreshold(gray, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY_INV, 3, 8)
    
    # Dilate the mask to include the edges of the black streaks
    kernel = np.ones((5,5), np.uint8)
    mask = cv2.dilate(mask, kernel, iterations=1)
    
    return mask

def inpaint_black_streaks(image, mask):
    # Inpaint the black streaks in the image
    inpainted_image = cv2.inpaint(image, mask, 5, cv2.INPAINT_TELEA)
    
    return inpainted_image

def process_images(input_dir, output_dir):
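    # cv2.imwrite does not create directories, so make the per-eye output folders up front
    os.makedirs(os.path.join(output_dir, "leftEye"), exist_ok=True)
    os.makedirs(os.path.join(output_dir, "rightEye"), exist_ok=True)

    left_path = os.path.join(input_dir, "leftEye")
    right_path = os.path.join(input_dir, "rightEye")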
    left_eye_images = sorted([f for f in os.listdir(left_path) if f.startswith('leftEye')])
    right_eye_images = sorted([f for f in os.listdir(right_path) if f.startswith('rightEye')])

    for left_eye_image_path, right_eye_image_path in zip(left_eye_images, right_eye_images):
        left_eye_image = cv2.imread(os.path.join(left_path, left_eye_image_path))
        right_eye_image = cv2.imread(os.path.join(right_path, right_eye_image_path))

        if left_eye_image is None or right_eye_image is None:
            print(f"Error: Image not found at {os.path.join(left_path, left_eye_image_path)} or {os.path.join(right_path, right_eye_image_path)}")
            continue

        # Create masks for the black streaks in both left and right eye images
        left_eye_mask = create_mask_for_black_streaks(left_eye_image)
        right_eye_mask = create_mask_for_black_streaks(right_eye_image)

        # Inpaint the black streaks in both left and right eye images
        left_eye_post = inpaint_black_streaks(left_eye_image, left_eye_mask)
        right_eye_post = inpaint_black_streaks(right_eye_image, right_eye_mask)

        frame_number = left_eye_image_path.split('leftEye')[1].split('.')[0]
        left_eye_post_output_path = os.path.join(output_dir, "leftEye", f'leftEyePost{frame_number}.jpg')
        right_eye_post_output_path = os.path.join(output_dir, "rightEye", f'rightEyePost{frame_number}.jpg')

        # Save the inpainted images (the masks are not written to disk)
        cv2.imwrite(left_eye_post_output_path, left_eye_post)
        cv2.imwrite(right_eye_post_output_path, right_eye_post)

        print(f"Processed frame {frame_number}.")

# Example usage
process_images('./datasets/d2/data_out/', './datasets/d2/data_out_post/')
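
The adaptive threshold above reacts to local contrast, so it can also mark dark image detail that is not a hole. Since the pixel-shifting step leaves unfilled pixels at zero, another option is to mask near-black pixels directly. The sketch below is a swapped-in alternative, not the method used by `sbs_inpaint_stereoviews.py`; the name `create_hole_mask` and the default `threshold` are assumptions. A small threshold is used instead of an exact zero test because the views were saved as JPEG, which may perturb pixel values slightly.

```python
import numpy as np

def create_hole_mask(image, threshold=8):
    """Return a uint8 mask (255 = inpaint) for pixels at or below threshold in every channel."""
    holes = np.all(image <= threshold, axis=2)
    return holes.astype(np.uint8) * 255
```

The result is a single-channel 8-bit mask, so it can be passed to `cv2.inpaint` in place of the output of `create_mask_for_black_streaks`. Genuinely black scene content will be masked as well, so the threshold may need tuning per video.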



## Create Videos from Images

Use ffmpeg to create left and right eye videos.


In [None]:

!ffmpeg -framerate 24 -i './datasets/d2/data_out_post/leftEye/leftEyePost%d.jpg' -c:v libx264 -pix_fmt yuv420p -vf "fps=24" left_eye.mp4
!ffmpeg -framerate 24 -i './datasets/d2/data_out_post/rightEye/rightEyePost%d.jpg' -c:v libx264 -pix_fmt yuv420p -vf "fps=24" right_eye.mp4
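
ffmpeg's `%d` image-sequence input expects consecutively numbered files, and a gap in the numbering (for example a frame skipped after a read error in an earlier step) will typically end the sequence early. The sketch below is a small pre-flight check; `missing_frame_indices` is a hypothetical helper that takes a list of filenames such as the output of `os.listdir`.

```python
import re

def missing_frame_indices(filenames, prefix, extension=".jpg"):
    """Return indices absent between the lowest and highest numbered <prefix><N><extension> file."""
    pattern = re.compile("^" + re.escape(prefix) + r"(\d+)" + re.escape(extension) + "$")
    found = sorted(int(m.group(1)) for m in map(pattern.match, filenames) if m)
    if not found:
        return []
    present = set(found)
    return [i for i in range(found[0], found[-1] + 1) if i not in present]

# Example call (not executed here):
# import os
# missing_frame_indices(os.listdir("./datasets/d2/data_out_post/leftEye/"), "leftEyePost")
```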



## Merge Videos and Inject Metadata

Combine the left and right eye videos into an SBS video and inject 3D metadata.


In [None]:

!ffmpeg -i left_eye.mp4 -i right_eye.mp4 -filter_complex "[0:v][1:v]hstack=inputs=2[v]" -map "[v]" datasets/d2/output.SBS.mp4
!ffmpeg -i datasets/d2/output.SBS.mp4 -vf "scale=2*iw:ih" -c:v libx264 -x264opts "frame-packing=3" -aspect 2:1 datasets/d2/output_final_sbs.mp4
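
For a single frame, the `hstack` step amounts to placing the left-eye image next to the right-eye image. The sketch below shows the equivalent NumPy operation purely as an illustration; `make_sbs_frame` is a hypothetical helper, and it is stricter than ffmpeg in that it requires both inputs to have identical shapes.

```python
import numpy as np

def make_sbs_frame(left, right):
    """Place two equally sized frames side by side: (H, W, C) + (H, W, C) -> (H, 2W, C)."""
    if left.shape != right.shape:
        raise ValueError(f"Frame shapes differ: {left.shape} vs {right.shape}")
    return np.hstack([left, right])
```

The output is twice as wide as each input, with the left-eye view occupying the left half.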



## Cleanup and Finalization

(Optional) Clean up temporary files and display the final video path.


In [None]:

# Example cleanup (adjust as needed)
# !rm -rf video_images/
# !rm left_eye.mp4 right_eye.mp4

# Display the final video path (the file written by the merge step)
final_video_path = 'datasets/d2/output_final_sbs.mp4'
final_video_path
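
The commented `rm` commands only work in Unix-like shells. A portable alternative is to do the cleanup in Python. The sketch below is one way to do that; `remove_paths` is a hypothetical helper, and the paths in the example call are placeholders to replace with the intermediate files you actually want to delete.

```python
import os
import shutil

def remove_paths(paths):
    """Delete each existing file or directory in paths and return the ones that were removed."""
    removed = []
    for path in paths:
        if os.path.isdir(path):
            shutil.rmtree(path)
            removed.append(path)
        elif os.path.isfile(path):
            os.remove(path)
            removed.append(path)
    return removed

# Example call (not executed here; deletes data permanently):
# remove_paths(["left_eye.mp4", "right_eye.mp4"])
```

Paths that do not exist are skipped silently, so the cell can be re-run safely.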
