## PyTorch3D in Colab

The `torch` and `torchvision` modules are required. If `pytorch3d` is not installed, install it using the installation cell below. This notebook modifies the original tutorial to install PyTorch3D from my own pre-built wheel, which is faster than installing from source (a source build takes several minutes).

**⚠️ WARNING: If installing PyTorch3D from the current wheel fails, build a new wheel (see the optional build steps later in this notebook).**

PyTorch3D takes a long time to install from source in Colab. Instead of installing from source every time a Colab instance is started, this notebook uses a pre-built wheel. The wheel is downloaded from my Dropbox (shared link). Another copy of the wheel is stored in my Google Drive at: `/content/drive/MyDrive/research/projects/slosh_project/slosh_project_team_files/Colab_wheels/`

## Load the CAD model file

We will load a CAD model file (e.g., `ply` or `obj`) and create a **Meshes** object. **Meshes** is a data structure provided by PyTorch3D for working with **batches of meshes of different sizes**. It has several useful class methods that are used in the rendering pipeline.
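To make "batches of meshes of different sizes" concrete, the sketch below mimics, in plain Python, the padded and packed layouts that batching meshes with different vertex counts calls for. The helper names `to_padded` and `to_packed` are hypothetical and are not part of PyTorch3D; `Meshes` exposes analogous views through `verts_list()`, `verts_padded()`, and `verts_packed()`.

```python
from typing import List, Tuple

Vertex = Tuple[float, float, float]


def to_padded(verts_list: List[List[Vertex]],
              pad: Vertex = (0.0, 0.0, 0.0)) -> List[List[Vertex]]:
    """Pad every mesh's vertex list to the length of the largest mesh."""
    max_len = max(len(v) for v in verts_list)
    return [list(v) + [pad] * (max_len - len(v)) for v in verts_list]


def to_packed(verts_list: List[List[Vertex]]) -> Tuple[List[Vertex], List[int]]:
    """Concatenate all vertices and record which mesh each one came from."""
    packed, mesh_idx = [], []
    for i, verts in enumerate(verts_list):
        packed.extend(verts)
        mesh_idx.extend([i] * len(verts))
    return packed, mesh_idx


# Two toy "meshes" (vertices only): a triangle and a quad.
triangle = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
quad = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0), (1.0, 1.0, 1.0), (0.0, 1.0, 1.0)]

padded = to_padded([triangle, quad])
packed, mesh_idx = to_packed([triangle, quad])

print([len(v) for v in padded])   # vertex count per mesh after padding
print(len(packed), mesh_idx)      # total vertices and their mesh indices
```

The padded layout suits batched tensor operations, while the packed layout avoids wasting memory on padding; keeping both views available is the main convenience a batched mesh container provides.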

## Create a renderer

A **renderer** in PyTorch3D is composed of a **rasterizer** and a **shader**, each of which has a number of subcomponents such as a **camera** (orthographic/perspective). Here, we initialize some of these components and use default values for the rest.


# General settings (User input)


⚠️ <b>Attention:</b> Replace the values below with your GitHub username and email.


⚠️ <b>Attention:</b> Press Enter or run the cells to accept the default values.


In [None]:
#@title Settings for GitHub Access

# Set name and email for github cloning using #@param
git_username = "eraldoribeiro" #@param {type:"string"}
git_email = "eribeiro@fit.edu" #@param {type:"string"}

repository_name = "point3D_from_depth" #@param {type:"string"}
organization_name = "ribeiro-computer-vision" #@param {type:"string"}

In [None]:
#@title Path to PyTorch3D (pre-built) wheel

# Shared Dropbox link to the pre-built PyTorch3D wheel
dropbox_link = "https://www.dropbox.com/scl/fi/fqvlnyponcbekjd01omhj/pytorch3d-0.7.8-cp312-cp312-linux_x86_64.whl?rlkey=563mfx35rog42z1c8y7qn31sk&dl=0" #@param {type:"string"}


In [None]:
#@title Path to the Mast3r checkpoints file (Google Drive)

checkpoints_gdrive_path = "/content/drive/MyDrive/teaching/tutorials_files/mast3r_checkpoints" #@param {type:"string"}

checkpoints_file_name =  "MASt3R_ViTLarge_BaseDecoder_512_catmlpdpt_metric.pth" #@param {type:"string"}



In [None]:
#@title Path to mesh file (.obj)

# Path to mesh file
obj_path = "point3D_from_depth/assets/StarShip_small.obj" #@param {type:"string"}

In [None]:
#@title 📷 Camera Intrinsics
# Focal lengths
focal_length_x = 900  #@param {type:"number"}
focal_length_y = 900  #@param {type:"number"}

# Principal point
principal_point_x = 128  #@param {type:"number"}
principal_point_y = 128  #@param {type:"number"}

# Image dimensions
image_width = 256   #@param {type:"number"}
image_height = 256  #@param {type:"number"}

# --- Aliases for convenience in code ---
fx, fy = focal_length_x, focal_length_y
cx, cy = principal_point_x, principal_point_y

# Print the 3x3 intrinsic matrix K
print("\nK =")
print(f"[[{fx:8.2f} {0.0:8.2f} {cx:8.2f}]")
print(f" [{0.0:8.2f} {fy:8.2f} {cy:8.2f}]")
print(f" [{0.0:8.2f} {0.0:8.2f} {1.0:8.2f}]]\n")

# --- Image-size aliases ---
W = image_width
H = image_height
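The intrinsics above define the matrix $K$ printed by the settings cell. As a minimal sketch of what these numbers mean, the snippet below projects a 3D point given in camera coordinates to pixel coordinates with the standard pinhole model. The function `project_point` is hypothetical, and the sketch assumes the common computer-vision convention (x to the right, y down, z forward). PyTorch3D cameras follow their own axis and NDC conventions, so this is an illustration rather than a drop-in replacement.

```python
def project_point(X, Y, Z, fx, fy, cx, cy):
    """Project a camera-frame 3D point to pixel coordinates (pinhole model)."""
    if Z <= 0:
        raise ValueError("Point must be in front of the camera (Z > 0).")
    u = fx * X / Z + cx  # horizontal pixel coordinate
    v = fy * Y / Z + cy  # vertical pixel coordinate
    return u, v


# Default values from the settings cell.
fx, fy, cx, cy = 900.0, 900.0, 128.0, 128.0
W, H = 256, 256

u, v = project_point(0.01, -0.02, 1.0, fx, fy, cx, cy)
inside = 0 <= u < W and 0 <= v < H
print(f"pixel = ({u:.1f}, {v:.1f}), inside image: {inside}")
```

With a focal length of 900 pixels and a 256-pixel-wide image, the field of view is narrow: a point one unit away stays inside the image only if its lateral offset is below roughly 128/900, about 0.14 units.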

---
# ⚙️ Setting up


In [None]:
# Set this to True if you want to mount gdrive
mount_gdrive = True

In [None]:
!pip --quiet install ipython-autotime
%load_ext autotime

### 📂 Clone Repository & 🔑 Mount Google Drive & Install PyTorch3D/dependencies

Clone the repository and mount **Google Drive** (requires user interaction).  
This will also set up the environment and install the necessary libraries.


**Set name and email for GitHub cloning**

<div style="border-left: 5px solid #FFA500; padding: 12px; background-color: #FFF4E5; font-size: 18px;">
  ⚠️ <b>Attention:</b> Replace the information with your GitHub email and username.
</div>


In [None]:
# Pass the Python variables from the settings cell to git ({...} is expanded by IPython)
!git config --global user.name "{git_username}"
!git config --global user.email "{git_email}"

#### 🔽 Clone the Repository

The next cell defines `gh_clone`, a helper that reads a GitHub token from the Colab secrets (stored under `GH_TOKEN` by default) and uses it to clone the repository containing the notebooks and helper functions you'll need.


In [None]:
def gh_clone(user, repo, token_key="GH_TOKEN"):
    from google.colab import userdata
    token = userdata.get(token_key)
    url = f"https://{user}:{token}@github.com/{user}/{repo}.git"
    !git clone $url
    %cd $repo
    !git remote set-url origin $url
    del token


The next cell will **clone the repository** containing this notebook and the helper functions you'll need.

If the `git clone` command fails (for example, due to missing secrets or permissions), you can open a **Terminal** in Colab and manually run the `git clone` command there.

In Colab, only the current notebook can be opened in the editor. However, we can still edit the Python files that contain our library of functions and track them with git, as we would on a local machine. Changes are saved to GitHub only if we commit and push them before the Colab instance disconnects. Colab sometimes disconnects without warning, so make sure changes to files or notebooks are saved to GitHub or Google Drive.




In [None]:
gh_clone(organization_name, repository_name)

# ✅ Verify that the repository was cloned
import os
repo_name = "/content/" + repository_name
if os.path.exists(repo_name):
    print(f"✅ Repository '{repo_name}' successfully cloned!")
else:
    print(f"❌ Repository '{repo_name}' not found. Try cloning manually.")

#### 🔽 Mount Google Drive

In [None]:
import os

local_path = os.getcwd()
print("Current local path:", local_path)

# Mount Google Drive only when running in Colab
if 'google.colab' in str(get_ipython()):
    print('Running on Colab')
    local_path = "/content/"
    # Import here so the cell also runs outside Colab
    from google.colab import drive
    if mount_gdrive:
        drive.mount('/content/drive', force_remount=True)
else:
    print('Not running on Colab')

os.chdir(local_path)



### ⚙️ Install PyTorch3D

#### ⚡ Install PyTorch3D from Wheel

Installing PyTorch3D from source can take 8 to 10 minutes or longer.

Here, **PyTorch3D is installed from a wheel** for a faster setup of about 2 minutes in Colab.

- If the installer instead tries to **build from source**, it means the wheel is outdated or missing.  
- In that case, you can **create your own wheel directly in Colab**, save it to **Google Drive** (or Dropbox), and reuse it later for faster installation.
- To create your own PyTorch3D wheel in Colab, follow the instructions in the cell after these installation cells.
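One quick way to tell whether a wheel is likely outdated is to compare the tags in its file name with the running interpreter. The sketch below parses the standard wheel file-name layout (`name-version[-build]-python-abi-platform.whl`) from the Dropbox link used in this notebook. The helpers `wheel_tags` and `matches_running_python` are hypothetical and are not part of `installation_tools`. The check covers only the Python version; a PyTorch3D wheel must also match the installed PyTorch and CUDA versions, which the file name here does not record.

```python
import sys
from pathlib import PurePosixPath
from urllib.parse import urlparse


def wheel_tags(url_or_path: str) -> dict:
    """Split a wheel file name into its version and compatibility tags."""
    name = PurePosixPath(urlparse(url_or_path).path).name
    if not name.endswith(".whl"):
        raise ValueError(f"Not a wheel file: {name}")
    parts = name[: -len(".whl")].split("-")
    # Layout: name-version[-build]-python-abi-platform
    return {
        "package": parts[0],
        "version": parts[1],
        "python": parts[-3],
        "abi": parts[-2],
        "platform": parts[-1],
    }


def matches_running_python(tags: dict) -> bool:
    """True if the wheel's CPython tag equals the running interpreter's."""
    current = f"cp{sys.version_info.major}{sys.version_info.minor}"
    return tags["python"] == current


tags = wheel_tags(
    "https://www.dropbox.com/scl/fi/fqvlnyponcbekjd01omhj/"
    "pytorch3d-0.7.8-cp312-cp312-linux_x86_64.whl"
    "?rlkey=563mfx35rog42z1c8y7qn31sk&dl=0"
)
print(tags)
print("Matches this interpreter:", matches_running_python(tags))
```

If the check prints `False`, pip will reject the wheel for this interpreter, which is a good cue to rebuild the wheel as described in the optional build steps.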



In [None]:
#  ---------------------------- IMPORTS -----------------------------------------
# Stdlib
import os
import sys
import math
import shutil
from pathlib import Path
from typing import Optional, Tuple, Literal, Dict, Any

# Third-party
import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F
import cv2
import imageio
import requests
import matplotlib.pyplot as plt
import matplotlib.cm as cm
from tqdm.notebook import tqdm
from skimage import img_as_ubyte

# set path for libraries
sys.path.append(repo_name)


In [None]:
# --- Config ---
mount_gdrive = False

# --- Imports ---
import importlib, os, sys, shutil, subprocess, urllib.request, pathlib
import installation_tools as install_tools
importlib.reload(install_tools)

# --- Short helpers (no notebook magics) ---
def run(cmd, check=True):
    print("$", " ".join(cmd))
    try:
        subprocess.run(cmd, check=check)
    except subprocess.CalledProcessError as e:
        print(f"Command failed ({e.returncode}): {' '.join(cmd)}")
        if check:
            raise

def pip_install(*pkgs, extra=None, check=True):
    args = [sys.executable, "-m", "pip", "install"]
    if extra:
        args += extra
    args += list(pkgs)
    run(args, check=check)

def conda_available():
    return shutil.which("conda") is not None

def conda_install(*pkgs):
    if not conda_available():
        print("conda not available; skipping conda installs.")
        return
    # Use -c conda-forge channel and auto-yes
    run(["conda", "install", "-y", "-c", "conda-forge", *pkgs], check=False)

# --- Detect platform ---
pm = install_tools.PlatformManager()
platform, local_path = pm.platform, pm.local_path
print("Detected:", platform, local_path)

# --- Optional: Mount GDrive if on Colab ---
if mount_gdrive and platform == "Colab":
    pm.mount_gdrive()

# --- Lightning AI specific environment tweaks ---
if platform == "LightningAI":
    # conda piece (if conda exists in the image)
    conda_install("libstdcxx-ng=13")
    # pip pins / extras
    pip_install("numpy<2.0", check=False)
    pip_install("scikit-image", "gradio", "moviepy", "plotly", check=False)
    # If requirements.txt exists in CWD, install it
    if os.path.exists("requirements.txt"):
        pip_install("-r", "requirements.txt")

# --- Install PyTorch3D (handles platform differences & fallbacks) ---
installer = install_tools.PyTorch3DInstaller(
    platform, local_path, dropbox_wheel_url=dropbox_link
)
installer.install()



# --- Extra libraries (quiet-ish) ---
# Original line had: trimesh pyrender opencv-python matplotlib pytorch-lightning
pip_install("trimesh", "pyrender", "opencv-python", "matplotlib", "pytorch-lightning", check=False)

# --- Download plot_image_grid.py if missing ---
filename = "plot_image_grid.py"
url = "https://raw.githubusercontent.com/facebookresearch/pytorch3d/main/docs/tutorials/utils/plot_image_grid.py"
if not os.path.exists(filename):
    print(f"Downloading {filename} ...")
    try:
        urllib.request.urlretrieve(url, filename)
        print("Saved to", pathlib.Path(filename).resolve())
    except Exception as e:
        print("Download failed:", e)

# --- gdown ---
pip_install("gdown", extra=["--quiet"], check=False)
print("✅ Setup complete.")


**Install and import colorama module (color printing)**

In [None]:
!pip install colorama
from colorama import Fore, Back, Style, init

# ---------- pretty print helpers ----------
RESET="\033[0m"; BOLD="\033[1m"
C={"ok":"\033[1;32m","info":"\033[1;36m","step":"\033[1;35m","warn":"\033[1;33m"}
CYAN  = "\033[1;36m"; GREEN = "\033[1;32m"; YELLOW = "\033[1;33m"


def say(kind,msg): print(f"{C[kind]}{msg}{RESET}")
torch.set_printoptions(precision=4, sci_mode=False)
np.set_printoptions(precision=4, suppress=True)




#### 🛠️ (Optional) Build Your Own PyTorch3D Wheel

If the pre-built wheel does not match your setup, you can **build PyTorch3D from source** and save the wheel to Google Drive.  
This way, you only build once and reuse the `.whl` file in future Colab sessions.



###### Steps

#### 🔽 1. Clone PyTorch3D
```python
!git clone https://github.com/facebookresearch/pytorch3d.git
%cd pytorch3d
```
#### 🔽 2. Build the wheel (this may take several minutes)
```python
!pip install build
!python -m build --wheel
```

#### 🔽 3. Find the wheel file
```python
import glob, os
wheels = glob.glob("dist/*.whl")
print("📦 Built wheels:", wheels)
```

#### 🔽 4. Copy the wheel to Google Drive (adjust path if needed)
```python
import os
from google.colab import drive
drive.mount('/content/drive')
save_path = "/content/drive/MyDrive/pytorch3d_wheels/"
os.makedirs(save_path, exist_ok=True)

# `wheels` is the list found in step 3
for w in wheels:
    !cp $w $save_path
print("✅ Wheel(s) saved to:", save_path)
```

#### PyTorch3D imports
The following cell requires PyTorch3D. Run it only after PyTorch3D has been installed.

In [None]:
# ---------------------------- IMPORTS -----------------------------------------
# PyTorch3D: IO & data structures
from pytorch3d.io import load_obj, load_ply, load_objs_as_meshes
from pytorch3d.structures import Meshes

# PyTorch3D: transforms
from pytorch3d.transforms import Rotate, Translate

# PyTorch3D: rendering
from pytorch3d.renderer import (
    FoVPerspectiveCameras,
    PerspectiveCameras,
    look_at_view_transform,
    look_at_rotation,
    camera_position_from_spherical_angles,
    RasterizationSettings,
    MeshRenderer,
    MeshRasterizer,
    BlendParams,
    SoftSilhouetteShader,
    SoftPhongShader,
    HardPhongShader,
    PointLights,
    DirectionalLights,
    Materials,
    TexturesUV,
    TexturesVertex,
)
from pytorch3d.renderer.cameras import CamerasBase

# PyTorch3D: visualization helpers (optional)
from pytorch3d.vis.plotly_vis import AxisArgs, plot_batch_individually, plot_scene
from pytorch3d.vis.texture_vis import texturesuv_image_matplotlib

# Project utils path (adjust as needed)
sys.path.append(os.path.abspath(''))
# ------------------------------------------------------------------------------


# Start of the original tutorial (PyTorch3D)
Original notebook: https://pytorch3d.org/tutorials/bundle_adjustment

In [None]:
# Copyright (c) Meta Platforms, Inc. and affiliates. All rights reserved.

# Absolute camera orientation given set of relative camera pairs

This tutorial showcases the `cameras`, `transforms` and `so3` API.

The problem we deal with is defined as follows:

Given an optical system of $N$ cameras with extrinsics $\{g_1, ..., g_N | g_i \in SE(3)\}$, and a set of relative camera positions $\{g_{ij} | g_{ij}\in SE(3)\}$ that map between coordinate frames of randomly selected pairs of cameras $(i, j)$, we search for the absolute extrinsic parameters $\{g_1, ..., g_N\}$ that are consistent with the relative camera motions.

More formally:
$$
g_1, ..., g_N =
{\arg \min}_{g_1, ..., g_N} \sum_{g_{ij}} d(g_{ij}, g_i^{-1} g_j),
$$
where $d(g_i, g_j)$ is a suitable metric that compares the extrinsics of cameras $g_i$ and $g_j$.
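For reference, the metric implemented later in this notebook by `calc_camera_distance` can be written out as follows for two cameras $g_a = (R_a, T_a)$ and $g_b = (R_b, T_b)$. This is a reading of that code rather than a formula stated in the original tutorial text:

$$
d(g_a, g_b) = \bigl(1 - \cos\theta_{ab}\bigr) + \lVert T_a - T_b \rVert_2^2,
\qquad
\cos\theta_{ab} = \frac{\operatorname{tr}\!\left(R_a R_b^\top\right) - 1}{2},
$$

where $\theta_{ab}$ is the relative rotation angle between $R_a$ and $R_b$, and each of the two terms is averaged over the batch of camera pairs before they are summed.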

Visually, the problem can be described as follows. The picture below depicts the situation at the beginning of our optimization. The ground truth cameras are plotted in purple while the randomly initialized estimated cameras are plotted in orange:
![Initialization](https://github.com/facebookresearch/pytorch3d/blob/main/docs/tutorials/data/bundle_adjustment_initialization.png?raw=1)

Our optimization seeks to align the estimated (orange) cameras with the ground truth (purple) cameras, by minimizing the discrepancies between pairs of relative cameras. Thus, the solution to the problem should look as follows:
![Solution](https://github.com/facebookresearch/pytorch3d/blob/main/docs/tutorials/data/bundle_adjustment_final.png?raw=1)

In practice, the camera extrinsics $g_{ij}$ and $g_i$ are represented using objects from the `SfMPerspectiveCameras` class initialized with the corresponding rotation and translation matrices `R_absolute` and `T_absolute` that define the extrinsic parameters $g = (R, T); R \in SO(3); T \in \mathbb{R}^3$. In order to ensure that `R_absolute` is a valid rotation matrix, we represent it using an exponential map (implemented with `so3_exp_map`) of the axis-angle representation of the rotation `log_R_absolute`.

Note that the solution to this problem could only be recovered up to an unknown global rigid transformation $g_{glob} \in SE(3)$. Thus, for simplicity, we assume knowledge of the absolute extrinsics of the first camera $g_0$. We set $g_0$ as a trivial camera $g_0 = (I, \vec{0})$.


## 0. Install and Import Modules

Ensure `torch` and `torchvision` are installed. The original tutorial's installation cell is kept below, commented out, because this notebook already installs PyTorch3D from the pre-built wheel above:

In [None]:
# import os
# import sys
# import torch
# import subprocess
# need_pytorch3d=False
# try:
#     import pytorch3d
# except ModuleNotFoundError:
#     need_pytorch3d=True
# if need_pytorch3d:
#     pyt_version_str=torch.__version__.split("+")[0].replace(".", "")
#     version_str="".join([
#         f"py3{sys.version_info.minor}_cu",
#         torch.version.cuda.replace(".",""),
#         f"_pyt{pyt_version_str}"
#     ])
#     !pip install iopath
#     if sys.platform.startswith("linux"):
#         print("Trying to install wheel for PyTorch3D")
#         !pip install --no-index --no-cache-dir pytorch3d -f https://dl.fbaipublicfiles.com/pytorch3d/packaging/wheels/{version_str}/download.html
#         pip_list = !pip freeze
#         need_pytorch3d = not any(i.startswith("pytorch3d==") for  i in pip_list)
#     if need_pytorch3d:
#         print(f"failed to find/install wheel for {version_str}")
# if need_pytorch3d:
#     print("Installing PyTorch3D from source")
#     !pip install ninja
#     !pip install 'git+https://github.com/facebookresearch/pytorch3d.git@stable'

In [None]:
# imports
import torch
from pytorch3d.transforms.so3 import (
    so3_exp_map,
    so3_relative_angle,
)
from pytorch3d.renderer.cameras import (
    SfMPerspectiveCameras,
)

# add path for demo utils
import sys
import os
sys.path.append(os.path.abspath(''))

# set for reproducibility
torch.manual_seed(42)
if torch.cuda.is_available():
    device = torch.device("cuda:0")
else:
    device = torch.device("cpu")
    print("WARNING: CPU only, this will be slow!")

If using **Google Colab**, fetch the utils file for plotting the camera scene, and the ground truth camera positions:

In [None]:
!wget https://raw.githubusercontent.com/facebookresearch/pytorch3d/main/docs/tutorials/utils/camera_visualization.py
from camera_visualization import plot_camera_scene

!mkdir -p data
!wget -P data https://raw.githubusercontent.com/facebookresearch/pytorch3d/main/docs/tutorials/data/camera_graph.pth

OR if running **locally** uncomment and run the following cell:

In [None]:
# from utils import plot_camera_scene

## 1. Set up Cameras and load ground truth positions

In [None]:
# load the SE3 graph of relative/absolute camera positions
camera_graph_file = './data/camera_graph.pth'
(R_absolute_gt, T_absolute_gt), \
    (R_relative, T_relative), \
    relative_edges = \
        torch.load(camera_graph_file)

# create the relative cameras
cameras_relative = SfMPerspectiveCameras(
    R = R_relative.to(device),
    T = T_relative.to(device),
    device = device,
)

# create the absolute ground truth cameras
cameras_absolute_gt = SfMPerspectiveCameras(
    R = R_absolute_gt.to(device),
    T = T_absolute_gt.to(device),
    device = device,
)

# the number of absolute camera positions
N = R_absolute_gt.shape[0]

## 2. Define optimization functions

### Relative cameras and camera distance
We now define two functions crucial for the optimization.

**`calc_camera_distance`** compares a pair of cameras. This function is important as it defines the loss that we are minimizing. The method utilizes the `so3_relative_angle` function from the SO3 API.

**`get_relative_camera`** computes the parameters of a relative camera that maps between a pair of absolute cameras. Here we utilize the `compose` and `inverse` class methods from the PyTorch3D Transforms API.
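To illustrate the composition $g_i^{-1} g_j$ without any PyTorch3D objects, the plain-Python sketch below computes the relative pose between two cameras and checks it on one point. It assumes PyTorch3D's row-vector convention, $x_{cam} = x_{world} R + T$. The helper names (`matmul`, `transpose`, `apply`, `relative_pose`, `rot_z`) are hypothetical; the notebook itself obtains the same result through `Transform3d.inverse()` and `compose()`.

```python
import math


def matmul(A, B):
    """Multiply two 3x3 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def transpose(A):
    """Transpose a 3x3 matrix."""
    return [[A[j][i] for j in range(3)] for i in range(3)]


def apply(x, R, T):
    """Row-vector convention: x_cam = x_world @ R + T."""
    return [sum(x[k] * R[k][j] for k in range(3)) + T[j] for j in range(3)]


def relative_pose(R_i, T_i, R_j, T_j):
    """Pose that maps camera-i coordinates to camera-j coordinates."""
    R_rel = matmul(transpose(R_i), R_j)
    # T_rel = T_j - T_i @ R_rel
    T_rel = [T_j[c] - sum(T_i[k] * R_rel[k][c] for k in range(3))
             for c in range(3)]
    return R_rel, T_rel


def rot_z(angle):
    """An example rotation matrix about the z-axis."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]


R_i, T_i = rot_z(0.3), [1.0, 0.0, 2.0]
R_j, T_j = rot_z(-0.8), [0.0, 0.5, 1.0]
R_rel, T_rel = relative_pose(R_i, T_i, R_j, T_j)

# Mapping world -> camera j directly must agree with
# world -> camera i -> camera j through the relative pose.
x_world = [0.2, -0.4, 3.0]
x_i = apply(x_world, R_i, T_i)
x_j_direct = apply(x_world, R_j, T_j)
x_j_via_rel = apply(x_i, R_rel, T_rel)
max_err = max(abs(a - b) for a, b in zip(x_j_direct, x_j_via_rel))
print("max abs difference:", max_err)
```

The printed difference should be at the level of floating-point round-off, confirming that the relative pose is consistent with the two absolute poses.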

In [None]:
def calc_camera_distance(cam_1, cam_2):
    """
    Calculates the divergence of a batch of pairs of cameras cam_1, cam_2.
    The distance is composed of the cosine of the relative angle between
    the rotation components of the camera extrinsics and the l2 distance
    between the translation vectors.
    """
    # rotation distance
    R_distance = (1.-so3_relative_angle(cam_1.R, cam_2.R, cos_angle=True)).mean()
    # translation distance
    T_distance = ((cam_1.T - cam_2.T)**2).sum(1).mean()
    # the final distance is the sum
    return R_distance + T_distance

def get_relative_camera(cams, edges):
    """
    For each pair of indices (i,j) in "edges" generate a camera
    that maps from the coordinates of the camera cams[i] to
    the coordinates of the camera cams[j]
    """

    # first generate the world-to-view Transform3d objects of each
    # camera pair (i, j) according to the edges argument
    trans_i, trans_j = [
        SfMPerspectiveCameras(
            R = cams.R[edges[:, i]],
            T = cams.T[edges[:, i]],
            device = device,
        ).get_world_to_view_transform()
         for i in (0, 1)
    ]

    # compose the relative transformation as g_i^{-1} g_j
    trans_rel = trans_i.inverse().compose(trans_j)

    # generate a camera from the relative transform
    matrix_rel = trans_rel.get_matrix()
    cams_relative = SfMPerspectiveCameras(
                        R = matrix_rel[:, :3, :3],
                        T = matrix_rel[:, 3, :3],
                        device = device,
                    )
    return cams_relative

## 3. Optimization
Finally, we start the optimization of the absolute cameras.

We use SGD with momentum and optimize over `log_R_absolute` and `T_absolute`.
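As a reminder of what the optimizer does, here is a minimal sketch of the SGD-with-momentum update on a one-dimensional toy objective. It follows the update rule documented for `torch.optim.SGD` with default dampening and no Nesterov momentum; the function `sgd_momentum` and the toy objective are assumptions made for illustration only.

```python
def sgd_momentum(grad_fn, x0, lr=0.1, momentum=0.9, n_iter=500):
    """Minimize a scalar function of one variable, given its gradient."""
    x, velocity = x0, 0.0
    for _ in range(n_iter):
        g = grad_fn(x)
        velocity = momentum * velocity + g  # accumulate past gradients
        x = x - lr * velocity               # step along the accumulated direction
    return x


# Toy objective f(x) = (x - 3)^2, whose gradient is 2 * (x - 3).
x_min = sgd_momentum(lambda x: 2.0 * (x - 3.0), x0=-5.0)
print(x_min)
```

The iterate oscillates around the minimizer at $x = 3$ with shrinking amplitude, which is typical of momentum methods; in the notebook the same rule is applied jointly to `log_R_absolute` and `T_absolute`.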

As mentioned earlier, `log_R_absolute` is the axis-angle representation of the rotation part of our absolute cameras. We can obtain the 3x3 rotation matrix `R_absolute` that corresponds to `log_R_absolute` with:

`R_absolute = so3_exp_map(log_R_absolute)`
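The exponential map turns any 3-vector into a valid rotation matrix, which is why optimizing over `log_R_absolute` needs no orthogonality constraint. The sketch below implements the standard Rodrigues formula, $R = I + \frac{\sin\theta}{\theta} K + \frac{1 - \cos\theta}{\theta^2} K^2$, for a single vector in plain Python. The function `exp_map` is hypothetical and is not PyTorch3D's implementation, which is batched and differentiable.

```python
import math


def exp_map(w):
    """Rodrigues' formula: axis-angle vector w (3 floats) -> 3x3 rotation matrix."""
    wx, wy, wz = w
    theta = math.sqrt(wx * wx + wy * wy + wz * wz)  # rotation angle
    # Skew-symmetric (hat) matrix of w.
    K = [[0.0, -wz, wy], [wz, 0.0, -wx], [-wy, wx, 0.0]]
    K2 = [[sum(K[i][k] * K[k][j] for k in range(3)) for j in range(3)]
          for i in range(3)]
    if theta < 1e-8:
        a, b = 1.0, 0.5  # limits of sin(t)/t and (1 - cos(t))/t^2 as t -> 0
    else:
        a = math.sin(theta) / theta
        b = (1.0 - math.cos(theta)) / (theta * theta)
    eye = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]
    return [[eye[i][j] + a * K[i][j] + b * K2[i][j] for j in range(3)]
            for i in range(3)]


# A quarter turn about the z-axis.
R = exp_map([0.0, 0.0, math.pi / 2])
for row in R:
    print([round(v, 6) for v in row])
```

A zero vector maps to the identity, which is how the notebook pins the first camera: multiplying `log_R_absolute` by `camera_mask` zeroes its first row before the exponential map is applied.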


In [None]:
# initialize the absolute log-rotations/translations with random entries
log_R_absolute_init = torch.randn(N, 3, dtype=torch.float32, device=device)
T_absolute_init = torch.randn(N, 3, dtype=torch.float32, device=device)

# furthermore, we know that the first camera is a trivial one
#    (see the description above)
log_R_absolute_init[0, :] = 0.
T_absolute_init[0, :] = 0.

# instantiate a copy of the initialization of log_R / T
log_R_absolute = log_R_absolute_init.clone().detach()
log_R_absolute.requires_grad = True
T_absolute = T_absolute_init.clone().detach()
T_absolute.requires_grad = True

# the mask that specifies which cameras are going to be optimized
#     (since we know the first camera is already correct,
#      we only optimize over the 2nd-to-last cameras)
camera_mask = torch.ones(N, 1, dtype=torch.float32, device=device)
camera_mask[0] = 0.

# init the optimizer
optimizer = torch.optim.SGD([log_R_absolute, T_absolute], lr=.1, momentum=0.9)

# run the optimization
n_iter = 5000  # fix the number of iterations
for it in range(n_iter):
    # re-init the optimizer gradients
    optimizer.zero_grad()

    # compute the absolute camera rotations as
    # an exponential map of the logarithms (=axis-angles)
    # of the absolute rotations
    R_absolute = so3_exp_map(log_R_absolute * camera_mask)

    # get the current absolute cameras
    cameras_absolute = SfMPerspectiveCameras(
        R = R_absolute,
        T = T_absolute * camera_mask,
        device = device,
    )

    # compute the relative cameras as a composition of the absolute cameras
    cameras_relative_composed = \
        get_relative_camera(cameras_absolute, relative_edges)

    # compare the composed cameras with the ground truth relative cameras
    # camera_distance corresponds to $d$ from the description
    camera_distance = \
        calc_camera_distance(cameras_relative_composed, cameras_relative)

    # our loss function is the camera_distance
    camera_distance.backward()

    # apply the gradients
    optimizer.step()

    # plot and print status message
    if it % 200==0 or it==n_iter-1:
        status = 'iteration=%3d; camera_distance=%1.3e' % (it, camera_distance)
        plot_camera_scene(cameras_absolute, cameras_absolute_gt, status)

print('Optimization finished.')


## 4. Conclusion

In this tutorial we learnt how to initialize a batch of SfM Cameras, set up loss functions for bundle adjustment, and run an optimization loop.