<a href="https://colab.research.google.com/github/wangjiajiTHU/BlenderNeuralangelo/blob/main/Neuralangelo_%F0%9F%97%BF%F0%9F%8F%9B.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Neuralangelo (Colab demo)
This is a Google Colab example for running Neuralangelo.  

<img src="https://github.com/NVlabs/neuralangelo/raw/main/assets/teaser.gif">

**Neuralangelo: High-Fidelity Neural Surface Reconstruction**  
[Zhaoshuo Li](https://mli0603.github.io/),
[Thomas Müller](https://tom94.net/),
[Alex Evans](https://research.nvidia.com/person/alex-evans),
[Russell H. Taylor](https://www.cs.jhu.edu/~rht/),
[Mathias Unberath](https://mathiasunberath.github.io/),
[Ming-Yu Liu](https://mingyuliu.net/),
[Chen-Hsuan Lin](https://chenhsuanlin.bitbucket.io/)  
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2023

[Project page](https://research.nvidia.com/labs/dir/neuralangelo/) | [Paper](https://arxiv.org/abs/2306.03092/)

### 💻 Get started with the full code! --> [GitHub repo](https://github.com/nvlabs/neuralangelo)


### Notes
- This is a preview of how Neuralangelo works. To fully reproduce the results, please check out our [GitHub repo](https://github.com/nvlabs/neuralangelo).
- Please make sure to connect to a runtime session with a GPU device.
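
A quick way to confirm that the runtime actually has a GPU is to ask `nvidia-smi` for the device list. The snippet below is a minimal sketch; the helper name `gpu_visible` is ours and is not part of the Neuralangelo repo.

```python
import shutil
import subprocess


def gpu_visible() -> bool:
    """Return True if `nvidia-smi` is installed and lists at least one GPU."""
    if shutil.which("nvidia-smi") is None:
        return False
    # `nvidia-smi -L` prints one line per device, e.g. "GPU 0: ...".
    result = subprocess.run(["nvidia-smi", "-L"], capture_output=True, text=True)
    return result.returncode == 0 and "GPU" in result.stdout


if gpu_visible():
    print("GPU runtime detected.")
else:
    print("No GPU detected. Switch the runtime type to GPU before continuing.")
```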

First, clone the Neuralangelo repo.

In [None]:
# @title { vertical-output: true }
!git clone https://github.com/nvlabs/neuralangelo
%cd /content/neuralangelo
!git submodule update --init --recursive

Install COLMAP (takes ~2 minutes).

In [None]:
# @title { vertical-output: true }
%cd /content
!apt-get install \
    ninja-build \
    build-essential \
    libboost-program-options-dev \
    libboost-filesystem-dev \
    libboost-graph-dev \
    libboost-system-dev \
    libeigen3-dev \
    libflann-dev \
    libfreeimage-dev \
    libmetis-dev \
    libgoogle-glog-dev \
    libgtest-dev \
    libsqlite3-dev \
    libglew-dev \
    qtbase5-dev \
    libqt5opengl5-dev \
    libcgal-dev \
    libceres-dev
!apt-get install xvfb
# libglvnd is needed for COLMAP to run on Google Colab (https://github.com/colmap/colmap/issues/1271#issuecomment-931900582)
!git clone https://github.com/NVIDIA/libglvnd
%cd /content/libglvnd
!apt-get install libxext-dev libx11-dev x11proto-gl-dev
!apt-get install autoconf automake libtool
!apt-get install libffi-dev
!./autogen.sh
!./configure
!make -j4
!make install
# Download and extract the pre-compiled COLMAP library
%cd /content
!gdown 1PyyNKY2mt4dHlYN5WcPnvWdV2nufQru1
!tar -C /usr -zxf colmap-3.8.tar.gz
# Install other Python libraries for the data preparation scripts
!pip install \
    addict \
    k3d \
    opencv-python-headless \
    pillow \
    plotly \
    pyyaml \
    trimesh

Download the Lego toy example video and visualize.

In [None]:
# @title { vertical-output: true }
%cd /content/neuralangelo
# Download a toy example video. The video will be saved as lego.mp4
!gdown 1yWoZ4Hk3FgmV3pd34ZbW7jEqgqyJgzHy
# Take a look at the video.
from IPython.display import HTML
from base64 import b64encode
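# Read the video bytes; the context manager closes the file afterwards.
with open("lego.mp4", "rb") as video_file:
    mp4 = video_file.read()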
data_url = "data:video/mp4;base64," + b64encode(mp4).decode()
HTML(f"""<video src="{data_url}" width=400 controls></video>""")

Preprocess the Lego toy example video (end-to-end version, including COLMAP).
- There are 200 frames in the video. We set `DOWNSAMPLE_RATE=2` to downsample the video and extract 100 frames.
- We set the scene type to `object` (simple object, not modeling varying appearances).
- In most cases, COLMAP should be able to register all 100 images. In the rare case that it doesn't:
  - Try rerunning it. Structure from motion in COLMAP is not fully deterministic.
  - Neuralangelo can still run with only a subset of registered images, but the quality may degrade.
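
The frame arithmetic above can be sketched as follows. This assumes the preprocessing keeps every `DOWNSAMPLE_RATE`-th frame starting from the first one; the helper names are hypothetical and only meant for a quick sanity check of how many images COLMAP registered.

```python
import math


def expected_frame_count(total_frames: int, downsample_rate: int) -> int:
    """Frames kept when every `downsample_rate`-th frame is extracted (assumed rule)."""
    return math.ceil(total_frames / downsample_rate)


def registration_ratio(num_registered: int, total_frames: int, downsample_rate: int) -> float:
    """Fraction of the extracted frames that COLMAP managed to register."""
    return num_registered / expected_frame_count(total_frames, downsample_rate)


print(expected_frame_count(200, 2))    # 100 frames for the Lego video
print(registration_ratio(95, 200, 2))  # 0.95 if, say, 95 images were registered
```

A ratio noticeably below 1 is a hint to rerun the preprocessing step.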

In [None]:
# @title { vertical-output: true }
%cd /content/neuralangelo
# Set variables.
SEQUENCE = "lego"
PATH_TO_VIDEO = "lego.mp4"
DOWNSAMPLE_RATE = 2
SCENE_TYPE = "object"  # {outdoor,indoor,object}
# Run the script.
colmap_path = f"datasets/{SEQUENCE}_ds{DOWNSAMPLE_RATE}"
!rm -rf {colmap_path}
!bash projects/neuralangelo/scripts/preprocess.sh {SEQUENCE} {PATH_TO_VIDEO} {DOWNSAMPLE_RATE} {SCENE_TYPE}
# Check whether we have 100 images registered.
import os
num_images = len(os.listdir(f"{colmap_path}/images"))
print("----------------------------------------")
print(f"Number of registered images: {num_images}")

Let's inspect the COLMAP results. First, we load the COLMAP data.

In [None]:
# @title { vertical-output: true }
%cd /content/neuralangelo
# Import Python libraries.
import numpy as np
import torch
import k3d
import json
import plotly.graph_objs as go
from collections import OrderedDict
# Import imaginaire modules.
from projects.nerf.utils import camera, visualize
from third_party.colmap.scripts.python.read_write_model import read_model
# Read the COLMAP data.
cameras, images, points_3D = read_model(path=f"{colmap_path}/sparse", ext=".bin")
# Convert camera poses.
images = OrderedDict(sorted(images.items()))
qvecs = torch.from_numpy(np.stack([image.qvec for image in images.values()]))
tvecs = torch.from_numpy(np.stack([image.tvec for image in images.values()]))
Rs = camera.quaternion.q_to_R(qvecs)
poses = torch.cat([Rs, tvecs[..., None]], dim=-1)  # [N,3,4]
print(f"# images: {len(poses)}")
# Get the sparse 3D points and the colors.
xyzs = torch.from_numpy(np.stack([point.xyz for point in points_3D.values()]))
rgbs = np.stack([point.rgb for point in points_3D.values()])
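# Pack RGB into one integer per point. Cast to uint32 first so the shifts cannot overflow uint8.
rgbs_u32 = rgbs.astype(np.uint32)
rgbs_int32 = (rgbs_u32[:, 0] << 16) | (rgbs_u32[:, 1] << 8) | rgbs_u32[:, 2]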
print(f"# points: {len(xyzs)}")
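
For reference, here is a NumPy-only sketch of the kind of conversion the cell above performs when it turns COLMAP quaternions into rotation matrices. COLMAP stores `qvec` as `(qw, qx, qy, qz)` and the pose maps world coordinates to camera coordinates, so the camera center in world space is `-R^T t`. The function names are ours; the notebook itself uses `camera.quaternion.q_to_R`, whose internals may differ.

```python
import numpy as np


def qvec_to_rotmat(qvec):
    """Convert a (qw, qx, qy, qz) quaternion to a 3x3 rotation matrix."""
    q = np.asarray(qvec, dtype=float)
    w, x, y, z = q / np.linalg.norm(q)  # normalize to guard against drift
    return np.array([
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ])


def camera_center(qvec, tvec):
    """World-space camera center for a world-to-camera pose (R, t)."""
    R = qvec_to_rotmat(qvec)
    return -R.T @ np.asarray(tvec, dtype=float)


# The identity quaternion leaves the axes untouched.
print(qvec_to_rotmat([1.0, 0.0, 0.0, 0.0]))
print(camera_center([1.0, 0.0, 0.0, 0.0], [1.0, 2.0, 3.0]))
```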

This is where you should visualize and adjust the bounding sphere for Neuralangelo.
- Use the forms to tune `readjust_center` and `readjust_scale` to adjust the bounding sphere.
  - The bounding sphere should ideally *just* encapsulate the target object/scene.
  - In the Lego toy example case, setting `readjust_scale=0.5` would be a good choice.
- Also check whether the camera trajectory matches what you see in the video.
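
To complement the visual check, one rough numeric cue is the fraction of sparse points that fall inside the adjusted sphere. This is only a sketch with a hypothetical helper name and synthetic points. The sparse COLMAP points usually include background, so the goal is not 100% coverage but a sphere that tightly covers the object.

```python
import numpy as np


def fraction_inside_sphere(points, center, radius):
    """Fraction of 3D points whose distance to `center` is at most `radius`."""
    points = np.asarray(points, dtype=float)
    center = np.asarray(center, dtype=float)
    dists = np.linalg.norm(points - center, axis=-1)
    return float(np.mean(dists <= radius))


# Synthetic stand-in for the sparse points; in the notebook you could pass xyzs.numpy().
rng = np.random.default_rng(0)
demo_points = rng.uniform(-1.0, 1.0, size=(1000, 3))
for scale in (1.0, 0.5):
    inside = fraction_inside_sphere(demo_points, center=[0.0, 0.0, 0.0], radius=1.0 * scale)
    print(f"readjust_scale={scale}: {inside:.1%} of points inside")
```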

In [None]:
# @title { vertical-output: true }
# Visualize the bounding sphere.
json_fname = f"{colmap_path}/transforms.json"
with open(json_fname) as file:
    meta = json.load(file)
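# Convert to a NumPy array: with a plain list, `center += readjust_center` below would extend the list instead of adding element-wise.
center = np.array(meta["sphere_center"])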
radius = meta["sphere_radius"]
# ------------------------------------------------------------------------------------
# These variables can be adjusted to make the bounding sphere fit the region of interest.
# The adjusted values can then be set in the config as data.readjust.center and data.readjust.scale
readjust_x = 0.  # @param {type:"number"}
readjust_y = 0.  # @param {type:"number"}
readjust_z = 0.  # @param {type:"number"}
readjust_scale = 1.  # @param {type:"number"}
readjust_center = np.array([readjust_x, readjust_y, readjust_z])
# ------------------------------------------------------------------------------------
center += readjust_center
radius *= readjust_scale
# Sample random points on the sphere surface to visualize the bounding sphere.
sphere_points = np.random.randn(100000, 3)
sphere_points = sphere_points / np.linalg.norm(sphere_points, axis=-1, keepdims=True)
sphere_points = sphere_points * radius + center

Visualize the bounding sphere in the 3D interactive visualizer.
- If the bounding sphere doesn't look right, readjust in the above form and rerun the code block.
- You can modify `vis_depth` to adjust the size of the cameras.

In [None]:
# @title { vertical-output: true }
vis_depth = 0.2
# Visualize with Plotly.
x, y, z = xyzs.T  # unpack the transposed [3,N] points into per-axis coordinates
colors = rgbs / 255.0
sphere_x, sphere_y, sphere_z = sphere_points.T
sphere_colors = ["#4488ff"] * len(sphere_points)
traces_poses = visualize.plotly_visualize_pose(poses, vis_depth=vis_depth, xyz_length=0.02, center_size=0.01, xyz_width=0.005, mesh_opacity=0.05)
trace_points = go.Scatter3d(x=x, y=y, z=z, mode="markers", marker=dict(size=1, color=colors, opacity=1), hoverinfo="skip")
trace_sphere = go.Scatter3d(x=sphere_x, y=sphere_y, z=sphere_z, mode="markers", marker=dict(size=0.5, color=sphere_colors, opacity=0.7), hoverinfo="skip")
traces_all = traces_poses + [trace_points, trace_sphere]
layout = go.Layout(scene=dict(xaxis=dict(showspikes=False, backgroundcolor="rgba(0,0,0,0)", gridcolor="rgba(0,0,0,0.1)"),
                              yaxis=dict(showspikes=False, backgroundcolor="rgba(0,0,0,0)", gridcolor="rgba(0,0,0,0.1)"),
                              zaxis=dict(showspikes=False, backgroundcolor="rgba(0,0,0,0)", gridcolor="rgba(0,0,0,0.1)"),
                              xaxis_title="X", yaxis_title="Y", zaxis_title="Z", dragmode="orbit",
                              aspectratio=dict(x=1, y=1, z=1), aspectmode="data"), height=800)
fig = go.Figure(data=traces_all, layout=layout)
fig.show()

Now we are ready to run Neuralangelo. First, we install PyTorch and other libraries (takes ~1 minute).
- Ignore the warning about restarting the runtime for `ipywidgets`.

In [None]:
# @title { vertical-output: true }
%cd /content/neuralangelo
# Install PyTorch.
!pip install torch torchvision --index-url https://download.pytorch.org/whl/cu118
# Install Neuralangelo dependencies (excluding tiny-cuda-nn).
with open("requirements.txt") as file, open("requirements1.txt", "w") as file1:
    for line in file:
        if "tiny-cuda-nn" in line:
            continue
        file1.write(line)
!pip install -r requirements1.txt
# Download and extract the pre-compiled tiny-cuda-nn.
%cd /content
!gdown 1Ah-SKJufHE6BmjF96ic-zUjZ_ic90JZ5
!pip install tinycudann-1.7-cp310-cp310-linux_x86_64.whl

Let's run Neuralangelo! We use a simplified setup by adjusting the following hyperparameters:
- Set `max_iter` to 20k optimization steps.
- Disable validation steps (by setting `validation_iter` to a large number).
- Use a smaller `model.object.sdf.encoding.coarse2fine.step` to add progressive levels faster.
- Use a smaller `model.object.sdf.encoding.hashgrid.dict_size`, since the object is relatively simple.
- Shorten the learning rate schedule to match (`optim.sched.warm_up_end` and `optim.sched.two_steps`).
- Set `data.readjust.scale=0.5`, as suggested above.

Neuralangelo under this setup takes ~2 hours on a T4 GPU.  
(Maybe grab a ☕️ or 🧋 while we wait!)
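
To build intuition for the `coarse2fine.step` setting listed above, here is a simplified model of a coarse-to-fine schedule: after a warm-up, one more hash grid level is unlocked every `step` iterations, never dropping below an initial number of levels. The defaults `init_active_level=4` and `total_levels=16` are assumptions for illustration, and the exact rule in the repo may differ.

```python
def active_levels(iteration, step, warm_up_end, init_active_level=4, total_levels=16):
    """Number of hash grid levels in use at `iteration` (simplified model)."""
    unlocked = max((iteration - warm_up_end) // step, 1)
    return max(init_active_level, min(total_levels, unlocked))


# With the values used in this notebook (step=200, warm_up_end=200):
for it in (0, 1000, 1200, 2000, 3400, 20000):
    print(f"iteration {it:>5}: {active_levels(it, step=200, warm_up_end=200)} active levels")
```

Under this model, all levels are active well before the 20k iterations end, which is the point of choosing a small `step` for a short run.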

In [None]:
# @title { vertical-output: true }
%cd /content/neuralangelo
GROUP = "test_exp"
NAME = "lego"
!torchrun --nproc_per_node=1 train.py \
    --logdir=logs/{GROUP}/{NAME} \
    --show_pbar \
    --config=projects/neuralangelo/configs/custom/lego.yaml \
    --data.readjust.scale=0.5 \
    --max_iter=20000 \
    --validation_iter=99999999 \
    --model.object.sdf.encoding.coarse2fine.step=200 \
    --model.object.sdf.encoding.hashgrid.dict_size=19 \
    --optim.sched.warm_up_end=200 \
    --optim.sched.two_steps=[12000,16000]

Finally, let's extract the 3D mesh!
- We set `--resolution` (3D mesh resolution) to 300 because of Colab's resource constraints.
- Add `--textured` to extract the colors as well.
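
As a rough sketch of how `--resolution` and `--block_res` interact: if we assume the bounding volume is sampled at `resolution` points per axis and processed in cubic blocks of at most `block_res` points per side (so that memory stays bounded), the number of blocks follows from a ceiling division. How the script actually partitions the volume may differ, and the helper name is hypothetical.

```python
import math


def blocks_per_axis(resolution: int, block_res: int) -> int:
    """Blocks needed along one axis to cover `resolution` samples."""
    return math.ceil(resolution / block_res)


resolution, block_res = 300, 128
per_axis = blocks_per_axis(resolution, block_res)
print(f"{per_axis} blocks per axis, {per_axis ** 3} blocks in total")
print(f"up to {block_res ** 3:,} SDF samples per block")
```

Raising `--resolution` increases the total number of samples cubically, which is why a modest value is used on Colab.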

In [None]:
# @title { vertical-output: true }
%cd /content/neuralangelo
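mesh_fname = f"logs/{GROUP}/{NAME}/mesh.ply"
# Note: the checkpoint file name encodes the epoch and iteration. If yours differs, look in logs/{GROUP}/{NAME} and update --checkpoint.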
!torchrun --nproc_per_node=1 projects/neuralangelo/scripts/extract_mesh.py \
    --config=logs/{GROUP}/{NAME}/config.yaml \
    --checkpoint=logs/{GROUP}/{NAME}/epoch_00400_iteration_000020000_checkpoint.pt \
    --output_file={mesh_fname} \
    --resolution=300 --block_res=128 \
    --textured

Visualize our final textured 3D mesh! 🤩

In [None]:
# @title { vertical-output: true }
import numpy as np
import trimesh
# Load the mesh.
mesh = trimesh.load(mesh_fname)
print(f"# vertices: {len(mesh.vertices)}")
print(f"# faces: {len(mesh.faces)}")
# Create a Trimesh scene and visualize the mesh.
scene = trimesh.Scene()
scene.add_geometry(mesh)
scene.show()

Some final notes on the resulting bulldozer model:
- It looks darker due to insufficient lighting in the 3D visualizer.
- The quality of the reconstructed geometry can be further improved by
    - training longer
    - training with more GPUs
    - extracting the mesh with higher resolution
- There are random white blobs around the bulldozer. This is normal because the geometry is ambiguous in the white background regions (without other regularizations). It is typically not an issue for real-world video captures.
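
If the floating blobs get in the way, one optional clean-up is to keep only the largest connected component of the mesh. The sketch below is a plain-Python illustration using union-find over the face indices; it is not part of Neuralangelo, and the function name is hypothetical. It returns the indices of the faces to keep.

```python
from collections import Counter


def largest_component_faces(faces):
    """Return indices of the faces in the largest vertex-connected component."""
    parent = {}

    def find(v):
        parent.setdefault(v, v)
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving keeps the trees shallow
            v = parent[v]
        return v

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # Vertices that share a face belong to the same component.
    for a, b, c in faces:
        union(a, b)
        union(b, c)

    roots = [find(face[0]) for face in faces]
    if not roots:
        return []
    biggest_root, _ = Counter(roots).most_common(1)[0]
    return [i for i, root in enumerate(roots) if root == biggest_root]


# Two triangles sharing an edge, plus one isolated triangle (the "blob").
demo_faces = [(0, 1, 2), (1, 2, 3), (4, 5, 6)]
print(largest_component_faces(demo_faces))  # [0, 1]
```

In the notebook, `mesh.faces` could be passed in directly. trimesh also offers `mesh.split(only_watertight=False)` to separate a mesh into its connected components.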

### 💻 Get started with the full code! --> [GitHub repo](https://github.com/nvlabs/neuralangelo)