# ShAPO: Implicit Representations for Multi-Object Shape, Appearance, and Pose Optimization
    
<img src="https://raw.githubusercontent.com/zubair-irshad/shapo/master/demo/mesh_models.png" width=70% height=auto>

<center>
    
Made by [![Twitter URL](https://img.shields.io/twitter/url/https/twitter.com/zubairirshad.svg?style=social&label=Follow%20%40zubairirshad)](https://twitter.com/mzubairirshad)

Code in [![GitHub stars](https://img.shields.io/github/stars/zubair-irshad/shapo?style=social)](https://github.com/zubair-irshad/shapo)

Page at [![](https://img.shields.io/badge/Project-Page-blue?style=flat&logo=Google%20chrome&logoColor=blue)](https://zubair-irshad.github.io/projects/ShAPO.html)

</center>




In [1]:
import IPython

IPython.display.HTML('<h2>5-Minute Presentation</h2><iframe width="560" height="315" src="https://www.youtube.com/embed/LMg7NDcLDcA" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>')

### Explore Inference and Optimization of ShAPO: Implicit Representations for Multi-Object Shape, Appearance, and Pose Optimization

This Colab notebook explores the inference and optimization stages of ShAPO, proposed in our work [ShAPO: Implicit Representations for Multi-Object Shape, Appearance, and Pose Optimization](https://zubair-irshad.github.io/projects/ShAPO.html).
#### Make sure that you have enabled the GPU under Runtime -> Change runtime type!


We will then reproduce the following results from the paper:

1. [**Single Shot inference**](#Single-Shot-inference)

    1.1 [Visualize peak and depth output](#Visualize-Peaks-and-Depth-output)
    
    1.2 [Decode shape with predicted textures from shape and appearance embeddings](#Decode-shape-with-predicted-textures-from-shape-and-appearance-embeddings)
    
    1.3 [Project 3D Pointclouds and 3D bounding boxes on 2D image](#Project-3D-Pointclouds-and-3D-bounding-boxes-on-2D-image)
    
    
2. [**Shape, Appearance and Pose Optimization**](#Shape-Appearance-and-Pose-Optimization)

    2.1 [Core optimization loop](#Core-optimization-loop)
    
    2.2 [Visualizing optimized 3D output](#Visualizing-optimized-3D-output)

Let's get started! The whole notebook takes about 5 minutes to run.
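
Before installing anything, it can help to confirm that the runtime actually exposes a GPU. The snippet below is a minimal sketch that only assumes the `nvidia-smi` tool is on the `PATH`; the helper name `gpu_visible` is hypothetical and not part of ShAPO.

```python
import shutil
import subprocess


def gpu_visible() -> bool:
    """Return True if `nvidia-smi` exists and lists at least one GPU."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return False
    try:
        result = subprocess.run(
            [exe, "-L"], capture_output=True, text=True, timeout=30, check=False
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    # `nvidia-smi -L` prints one line per device, each starting with "GPU <index>:"
    return result.returncode == 0 and any(
        line.startswith("GPU ") for line in result.stdout.splitlines()
    )


print("GPU visible:", gpu_visible())
```

If this prints `False`, switch the runtime type to GPU before continuing.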


In [4]:
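# Install Python 3.8 and register it as an alternative for python3 (-y answers prompts automatically)
!apt-get install -y python3.8
!apt-get update -y
!update-alternatives --install /usr/bin/python3 python3 /usr/bin/python3.8 1
# Select the Python version: "3" is the menu entry fed to the prompt and may differ on your runtime
!update-alternatives --config python3 <<< 3
!apt install -y python3-pip
!apt install -y python3.8-distutils
!python --version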

Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
python3.8 is already the newest version (3.8.20-1+jammy1).
0 upgraded, 0 newly installed, 0 to remove and 37 not upgraded.
Hit:1 https://cloud.r-project.org/bin/linux/ubuntu jammy-cran40/ InRelease
Hit:2 https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64  InRelease
Get:3 http://security.ubuntu.com/ubuntu jammy-security InRelease [129 kB]
Hit:4 http://archive.ubuntu.com/ubuntu jammy InRelease
Hit:5 https://r2u.stat.illinois.edu/ubuntu jammy InRelease
Get:6 http://archive.ubuntu.com/ubuntu jammy-updates InRelease [128 kB]
Hit:7 https://ppa.launchpadcontent.net/deadsnakes/ppa/ubuntu jammy InRelease
Hit:8 https://ppa.launchpadcontent.net/graphics-drivers/ppa/ubuntu jammy InRelease
Hit:9 https://ppa.launchpadcontent.net/ubuntugis/ppa/ubuntu jammy InRelease
Get:10 http://archive.ubuntu.com/ubuntu jammy-backports InRelease [127 kB]
Fetched 384 kB in 1s (380 kB/s)
Reading pa

In [5]:
!apt-get --purge remove cuda nvidia* libnvidia-*
!dpkg -l | grep cuda- | awk '{print $2}' | xargs -n1 dpkg --purge
!apt-get remove cuda-*
!apt autoremove
!apt-get update

Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
Note, selecting 'nvidia-driver-550-server' for glob 'nvidia*'
Note, selecting 'nvidia-firmware-550-server-550.144.03' for glob 'nvidia*'
Note, selecting 'nvidia-kernel-source-575-open' for glob 'nvidia*'
Note, selecting 'nvidia-firmware-535-535.154.05' for glob 'nvidia*'
Note, selecting 'nvidia-docker2' for glob 'nvidia*'
Note, selecting 'nvidia-firmware-560-server-560.28.03' for glob 'nvidia*'
Note, selecting 'nvidia-driver-570-server' for glob 'nvidia*'
Note, selecting 'nvidia-cuda-toolkit-doc' for glob 'nvidia*'
Note, selecting 'nvidia-imex' for glob 'nvidia*'
Note, selecting 'nvidia-dkms-450-server' for glob 'nvidia*'
Note, selecting 'nvidia-firmware-535-server-535.154.05' for glob 'nvidia*'
Note, selecting 'nvidia-headless-390' for glob 'nvidia*'
Note, selecting 'nvidia-cuda-toolkit-gcc' for glob 'nvidia*'
Note, selecting 'nvidia-headless-418' for glob 'nvidia*'
Note, selecting 'nvidia

In [6]:
!wget https://developer.nvidia.com/compute/cuda/10.1/Prod/local_installers/cuda-repo-ubuntu1810-10-1-local-10.1.105-418.39_1.0-1_amd64.deb
!dpkg -i cuda-repo-ubuntu1810-10-1-local-10.1.105-418.39_1.0-1_amd64.deb

--2025-07-05 21:25:51--  https://developer.nvidia.com/compute/cuda/10.1/Prod/local_installers/cuda-repo-ubuntu1810-10-1-local-10.1.105-418.39_1.0-1_amd64.deb
Resolving developer.nvidia.com (developer.nvidia.com)... 23.211.118.193, 23.211.118.195
Connecting to developer.nvidia.com (developer.nvidia.com)|23.211.118.193|:443... connected.
HTTP request sent, awaiting response... 302 Moved Temporarily
Location: https://developer.nvidia.com/downloads/compute/cuda/10.1/prod/local_installers/cuda-repo-ubuntu1810-10-1-local-10.1.105-418.39_1.0-1_amd64.deb [following]
--2025-07-05 21:25:51--  https://developer.nvidia.com/downloads/compute/cuda/10.1/prod/local_installers/cuda-repo-ubuntu1810-10-1-local-10.1.105-418.39_1.0-1_amd64.deb
Reusing existing connection to developer.nvidia.com:443.
HTTP request sent, awaiting response... 302 Moved Temporarily
Location: https://developer.download.nvidia.com/compute/cuda/10.1/secure/Prod/local_installers/cuda-repo-ubuntu1810-10-1-local-10.1.105-418.39_1.0-1

In [7]:
!sudo apt-key add /var/cuda-repo-10-1-local-10.1.105-418.39/7fa2af80.pub
!apt-get update

OK
Get:1 file:/var/cuda-repo-10-1-local-10.1.105-418.39  InRelease
Ign:1 file:/var/cuda-repo-10-1-local-10.1.105-418.39  InRelease
Get:2 file:/var/cuda-repo-10-1-local-10.1.105-418.39  Release [574 B]
Get:2 file:/var/cuda-repo-10-1-local-10.1.105-418.39  Release [574 B]
Get:3 file:/var/cuda-repo-10-1-local-10.1.105-418.39  Release.gpg [833 B]
Get:3 file:/var/cuda-repo-10-1-local-10.1.105-418.39  Release.gpg [833 B]
Hit:4 https://cloud.r-project.org/bin/linux/ubuntu jammy-cran40/ InRelease
Get:5 file:/var/cuda-repo-10-1-local-10.1.105-418.39  Packages [24.3 kB]
Hit:6 http://security.ubuntu.com/ubuntu jammy-security InRelease
Hit:7 https://r2u.stat.illinois.edu/ubuntu jammy InRelease
Hit:8 http://archive.ubuntu.com/ubuntu jammy InRelease
Hit:9 http://archive.ubuntu.com/ubuntu jammy-updates InRelease
Hit:10 http://archive.ubuntu.com/ubuntu jammy-backports InRelease
Hit:11 https://ppa.launchpadcontent.net/deadsnakes/ppa/ubuntu jammy InRelease
Hit:12 https://ppa.launchpadcontent.net/graphic

In [7]:
!apt-cache madison cuda

      cuda | 10.1.105-1 | file:/var/cuda-repo-10-1-local-10.1.105-418.39  Packages


In [8]:
!apt-get install cuda=10.1.105-1

Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
The following additional packages will be installed:
  cpp-12 cuda-10-1 cuda-command-line-tools-10-1 cuda-compiler-10-1
  cuda-cudart-10-1 cuda-cudart-dev-10-1 cuda-cufft-10-1 cuda-cufft-dev-10-1
  cuda-cuobjdump-10-1 cuda-cupti-10-1 cuda-curand-10-1 cuda-curand-dev-10-1
  cuda-cusolver-10-1 cuda-cusolver-dev-10-1 cuda-cusparse-10-1
  cuda-cusparse-dev-10-1 cuda-demo-suite-10-1 cuda-documentation-10-1
  cuda-driver-dev-10-1 cuda-drivers cuda-gdb-10-1
  cuda-gpu-library-advisor-10-1 cuda-libraries-10-1 cuda-libraries-dev-10-1
  cuda-license-10-1 cuda-memcheck-10-1 cuda-misc-headers-10-1 cuda-npp-10-1
  cuda-npp-dev-10-1 cuda-nsight-10-1 cuda-nsight-compute-10-1
  cuda-nsight-systems-10-1 cuda-nvcc-10-1 cuda-nvdisasm-10-1 cuda-nvgraph-10-1
  cuda-nvgraph-dev-10-1 cuda-nvjpeg-10-1 cuda-nvjpeg-dev-10-1
  cuda-nvml-dev-10-1 cuda-nvprof-10-1 cuda-nvprune-10-1 cuda-nvrtc-10-1
  cuda-nvrtc-dev-10-1

In [9]:
!nvcc --version
# 10.1

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Fri_Feb__8_19:08:17_PST_2019
Cuda compilation tools, release 10.1, V10.1.105


In [10]:
!pip install -U setuptools

Collecting setuptools
  Downloading setuptools-75.3.2-py3-none-any.whl (1.3 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.3/1.3 MB 17.0 MB/s eta 0:00:00
Installing collected packages: setuptools
  Attempting uninstall: setuptools
    Found existing installation: setuptools 68.1.2
    Not uninstalling setuptools at /usr/lib/python3/dist-packages, outside environment /usr
    Can't uninstall 'setuptools'. No files were found to uninstall.
Successfully installed setuptools-75.3.2

In [11]:
#Run this on Google Colab
!git clone https://github.com/zubair-irshad/shapo.git

Cloning into 'shapo'...
remote: Enumerating objects: 504, done.
remote: Counting objects: 100% (123/123), done.
remote: Compressing objects: 100% (58/58), done.
remote: Total 504 (delta 78), reused 93 (delta 65), pack-reused 381 (from 1)
Receiving objects: 100% (504/504), 37.68 MiB | 39.50 MiB/s, done.
Resolving deltas: 100% (273/273), done.


In [19]:
!pip install --upgrade pip
!cd shapo && pip install -r requirements.txt -f https://download.pytorch.org/whl/torch_stable.html -v
!pip install torch==1.7.1+cu101 torchvision==0.8.2+cu101 -f https://download.pytorch.org/whl/torch_stable.html

Using pip 25.0.1 from /usr/local/lib/python3.8/dist-packages/pip (python 3.8)
Looking in links: https://download.pytorch.org/whl/torch_stable.html
Collecting argparse (from -r requirements.txt (line 1))
  Obtaining dependency information for argparse from https://files.pythonhosted.org/packages/f2/94/3af39d34be01a24a6e65433d19e107099374224905f1e0cc6bbe1fd22a2f/argparse-1.4.0-py2.py3-none-any.whl.metadata
  Using cached argparse-1.4.0-py2.py3-none-any.whl.metadata (2.8 kB)
Collecting trimesh (from -r requirements.txt (line 3))
  Obtaining dependency information for trimesh from https://files.pythonhosted.org/packages/90/2f/03e3829bf98fdf77b8669174957f47e8a977385ffd2c810e818a603a32ca/trimesh-4.6.13-py3-none-any.whl.metadata
  Using cached trimesh-4.6.13-py3-none-any.whl.metadata (18 kB)
ERROR: Operation cancelled by user
Looking in links: https://download.pytorch.org/whl/torch_stable.html


In [None]:
# Download the test subset, the pretrained SDF/RGB weights, and the checkpoints (URLs quoted so the shell does not glob "?")
!cd shapo && wget "https://www.dropbox.com/s/cvqyhr67zpxyq36/test_subset.tar.xz?dl=1" -O test_subset.tar.xz && tar -xvf test_subset.tar.xz
!cd shapo && wget "https://www.dropbox.com/s/929kz7zuxw8jajy/sdf_rgb_pretrained.tar.xz?dl=1" -O sdf_rgb_pretrained.tar.xz && tar -xvf sdf_rgb_pretrained.tar.xz
!cd shapo && wget "https://www.dropbox.com/s/nrsl67ir6fml9ro/ckpts.tar.xz?dl=1" -O ckpts.tar.xz && tar -xvf ckpts.tar.xz
# Arrange everything under test_data/, which is the default --data_dir used below
!cd shapo && mkdir -p test_data && mv test_subset/* test_data && mv sdf_rgb_pretrained test_data

In [14]:
!cd shapo && pip install --ignore-installed open3d

Collecting open3d
  Using cached open3d-0.19.0-cp38-cp38-manylinux_2_31_x86_64.whl.metadata (4.3 kB)
Collecting numpy>=1.18.0 (from open3d)
  Using cached numpy-1.24.4-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (5.6 kB)
Collecting dash>=2.6.0 (from open3d)
  Using cached dash-3.1.1-py3-none-any.whl.metadata (10 kB)
Collecting werkzeug>=3.0.0 (from open3d)
  Using cached werkzeug-3.0.6-py3-none-any.whl.metadata (3.7 kB)
Collecting flask>=3.0.0 (from open3d)
  Using cached flask-3.0.3-py3-none-any.whl.metadata (3.2 kB)
Collecting nbformat>=5.7.0 (from open3d)
  Using cached nbformat-5.10.4-py3-none-any.whl.metadata (3.6 kB)
Collecting configargparse (from open3d)
  Using cached configargparse-1.7.1-py3-none-any.whl.metadata (24 kB)
Collecting ipywidgets>=8.0.4 (from open3d)
  Using cached ipywidgets-8.1.7-py3-none-any.whl.metadata (2.4 kB)
Collecting addict (from open3d)
  Using cached addict-2.4.0-py3-none-any.whl.metadata (1.0 kB)
Collecting pillow>=9.3.0 (from o

In [16]:
!pip list

Package                   Version
------------------------- -------------
addict                    2.4.0
asttokens                 3.0.0
attrs                     25.3.0
backcall                  0.2.0
blinker                   1.8.2
cachetools                5.5.2
certifi                   2025.6.15
charset-normalizer        3.4.2
click                     8.1.8
colour                    0.1.5
comm                      0.2.2
ConfigArgParse            1.7.1
configparser              7.1.0
contourpy                 1.1.1
cryptography              3.4.8
cycler                    0.12.1
dash                      3.1.1
dbus-python               1.2.18
decorator                 5.2.1
distro                    1.7.0
eval_type_backport        0.2.2
executing                 2.2.0
fastjsonschema            2.21.1
Flask                     3.0.3
fonttools                 4.57.0
future                    1.0.0
grpcio                    1.70.0
httplib2                  0.20.2
idna               

# Dependencies

In [17]:
import argparse
import pathlib
import cv2
import numpy as np
import torch
import torch.nn.functional as F
import open3d as o3d
import matplotlib.pyplot as plt
import os
import sys
import time
import pytorch_lightning as pl
import _pickle as cPickle

# Make the cloned repository importable before loading its modules
sys.path.append('shapo')
from simnet.lib.net import common
from simnet.lib import camera
from simnet.lib.net.panoptic_trainer import PanopticModel
from utils.nocs_utils import load_img_NOCS, create_input_norm
from utils.viz_utils import depth2inv, viz_inv_depth
from utils.transform_utils import get_gt_pointclouds, transform_coordinates_3d, calculate_2d_projections
from utils.transform_utils import project, get_pc_absposes, transform_pcd_to_canonical
from utils.viz_utils import save_projected_points, draw_bboxes, line_set_mesh, display_gird, draw_geometries, show_projected_points
from sdf_latent_codes.get_surface_pointcloud import get_surface_pointclouds_octgrid_viz, get_surface_pointclouds
from sdf_latent_codes.get_rgb import get_rgbnet, get_rgb_from_rgbnet

ModuleNotFoundError: No module named 'open3d'

# ShAPO Model (Setup)

In [None]:
# Read the network arguments from the config file, as if they were passed on the command line
sys.argv = ['', '@shapo/configs/net_config.txt']
parser = argparse.ArgumentParser(fromfile_prefix_chars='@')
common.add_train_args(parser)
app_group = parser.add_argument_group('app')
app_group.add_argument('--app_output', default='inference', type=str)
app_group.add_argument('--result_name', default='shapo_inference', type=str)
app_group.add_argument('--data_dir', default='shapo/test_data', type=str)

hparams = parser.parse_args()
min_confidence = 0.50  # confidence threshold passed to compute_shape_pose_and_appearance
use_gpu = True
hparams.checkpoint = 'shapo/ckpts/shapo_real.ckpt'
model = PanopticModel(hparams, 0, None, None)
model.eval()
if use_gpu:
    model.cuda()
# Image identifiers of the small test subset, one per line
with open(os.path.join(hparams.data_dir, 'Real', 'test_list_subset.txt')) as f:
    data_path = f.read().splitlines()
_CAMERA = camera.NOCS_Real()
sdf_pretrained_dir = os.path.join(hparams.data_dir, 'sdf_rgb_pretrained')
rgb_model_dir = os.path.join(hparams.data_dir, 'sdf_rgb_pretrained', 'rgb_net_weights')

# Single Shot inference
Note that this part is similar to [CenterSnap](https://zubair-irshad.github.io/projects/CenterSnap.html), except that we predict *SDF embeddings* instead of *pointcloud embeddings*. We also predict *appearance embeddings* and *segmentation masks* for the downstream optimization.
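
To make the role of `min_confidence` concrete: detections are read off as peaks of a predicted heatmap, and weak peaks are discarded. The following is a minimal sketch of that idea, assuming a dense 2D heatmap and a simple 3x3 local-maximum test. The helper `find_peaks` and the toy heatmap are hypothetical and are not the ShAPO implementation.

```python
from typing import List, Tuple


def find_peaks(heatmap: List[List[float]], min_confidence: float) -> List[Tuple[int, int, float]]:
    """Return (row, col, score) for every strict 3x3 local maximum above the threshold."""
    rows, cols = len(heatmap), len(heatmap[0])
    peaks = []
    for r in range(rows):
        for c in range(cols):
            score = heatmap[r][c]
            if score < min_confidence:
                continue
            # Collect the up-to-8 neighbours, clipping the window at the image border
            neighbours = [
                heatmap[rr][cc]
                for rr in range(max(0, r - 1), min(rows, r + 2))
                for cc in range(max(0, c - 1), min(cols, c + 2))
                if (rr, cc) != (r, c)
            ]
            if all(score > n for n in neighbours):
                peaks.append((r, c, score))
    return peaks


# Toy heatmap with two clear object centres
heatmap = [
    [0.1, 0.2, 0.1, 0.0],
    [0.2, 0.9, 0.3, 0.1],
    [0.1, 0.3, 0.2, 0.6],
    [0.0, 0.1, 0.4, 0.3],
]
print(find_peaks(heatmap, min_confidence=0.5))
```

Raising the threshold keeps only the most confident centres, which is the trade-off `min_confidence` controls in the cell below.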

In [None]:
# num selects the test image: 0 to 3 (small subset of data)
num = 0
img_full_path = os.path.join(hparams.data_dir, 'Real', data_path[num])
img_vis = cv2.imread(img_full_path + '_color.png')

left_linear, depth, actual_depth = load_img_NOCS(img_full_path + '_color.png', img_full_path + '_depth.png')
input = create_input_norm(left_linear, depth)[None, :, :, :]

if use_gpu:
    input = input.to(torch.device('cuda:0'))

with torch.no_grad():
    # A single forward pass returns both the segmentation output and the pose output
    seg_output, _, _, pose_output = model.forward(input)
    shape_emb_outputs, appearance_emb_outputs, abs_pose_outputs, peak_output, scores_out, output_indices = pose_output.compute_shape_pose_and_appearance(min_confidence, is_target=False)

## Visualize Peaks and Depth output

In [None]:
display_gird(img_vis, depth, peak_output)

## Decode shape with predicted textures from shape and appearance embeddings



**Note:** The expected output here is colored pointclouds. Although our shape representation is implicit (i.e. an SDF), we output only pointclouds here for computational reasons, since extracting a mesh with marching cubes would take longer. If you are interested in getting a mesh, please see the `save_mesh` function in `save_canonical_mesh.py`.
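
One common way to get surface points from an SDF without marching cubes is to sample points, keep those close to the zero level set, and move each one along the SDF gradient by its signed distance. The sketch below illustrates this on an analytic sphere that stands in for the learned decoder; `sphere_sdf`, `sdf_gradient`, and `surface_points` are hypothetical names, and this is not necessarily how `get_surface_pointclouds_octgrid_viz` works internally.

```python
import math
import random
from typing import List, Tuple

Point = Tuple[float, float, float]


def sphere_sdf(p: Point, radius: float = 0.5) -> float:
    """Signed distance to a sphere at the origin: negative inside, positive outside."""
    return math.sqrt(p[0] ** 2 + p[1] ** 2 + p[2] ** 2) - radius


def sdf_gradient(p: Point, eps: float = 1e-4) -> Point:
    """Central-difference estimate of the SDF gradient (the surface normal direction)."""
    grads = []
    for axis in range(3):
        hi, lo = list(p), list(p)
        hi[axis] += eps
        lo[axis] -= eps
        grads.append((sphere_sdf(tuple(hi)) - sphere_sdf(tuple(lo))) / (2 * eps))
    return (grads[0], grads[1], grads[2])


def surface_points(n_samples: int, band: float = 0.05, seed: int = 0) -> List[Point]:
    """Sample the cube [-1, 1]^3 and project near-surface samples onto the zero level set."""
    rng = random.Random(seed)
    points = []
    for _ in range(n_samples):
        p = (rng.uniform(-1, 1), rng.uniform(-1, 1), rng.uniform(-1, 1))
        d = sphere_sdf(p)
        if abs(d) > band:
            continue  # too far from the surface to be worth projecting
        g = sdf_gradient(p)
        points.append((p[0] - d * g[0], p[1] - d * g[1], p[2] - d * g[2]))
    return points


pts = surface_points(20000)
worst = max(abs(sphere_sdf(p)) for p in pts)
print(len(pts), "surface points, largest residual distance:", worst)
```

The gradient used for the projection doubles as a per-point normal, which is why a pointcloud extracted this way can carry normals as well.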

`Click on orbital rotation on the top right side to move the colored pointclouds smoothly.`

**Note:** The single-shot shape and pose predictions are already quite good, whereas the predicted appearance is not yet faithful to the image. We therefore run an optimization against the single-view RGB-D observation. Please see section **2** (Shape, Appearance and Pose Optimization).

In [None]:
rotated_pcds = []
points_2d = []
box_obb = []
axes = []
lod = 7  # Choose an LoD from 3 to 7; higher values give finer detail but use more memory
is_oct_grid = True

# Visualize the network output for each detected object
for j in range(len(shape_emb_outputs)):
    shape_emb = shape_emb_outputs[j]
    appearance_emb = appearance_emb_outputs[j]
    nrm_dsdf = None  # normals are only returned by the octree-grid extraction
    if is_oct_grid:
        pcd_dsdf, nrm_dsdf = get_surface_pointclouds_octgrid_viz(shape_emb, lod=lod, sdf_latent_code_dir=sdf_pretrained_dir)
    else:
        pcd_dsdf = get_surface_pointclouds(shape_emb)
    rgbnet = get_rgbnet(rgb_model_dir)
    pred_rgb = get_rgb_from_rgbnet(shape_emb, pcd_dsdf, appearance_emb, rgbnet)
    rotated_pc, rotated_box, _ = get_pc_absposes(abs_pose_outputs[j], pcd_dsdf)
    pcd = o3d.geometry.PointCloud()
    pcd.points = o3d.utility.Vector3dVector(np.copy(rotated_pc))
    pcd.colors = o3d.utility.Vector3dVector(pred_rgb.detach().cpu().numpy())
    if nrm_dsdf is not None:
        pcd.normals = o3d.utility.Vector3dVector(nrm_dsdf)
    rotated_pcds.append(pcd)

    # Draw the 3D bounding box around the object
    cylinder_segments = line_set_mesh(rotated_box)
    for k in range(len(cylinder_segments)):
        rotated_pcds.append(cylinder_segments[k])

    # Draw a 3D coordinate frame at the object pose
    mesh_frame = o3d.geometry.TriangleMesh.create_coordinate_frame(size=0.1, origin=[0, 0, 0])
    T = abs_pose_outputs[j].camera_T_object
    mesh_t = mesh_frame.transform(T)
    rotated_pcds.append(mesh_t)

    # 2D output: project the pointcloud, the bounding box, and the axes into the image
    points_mesh = camera.convert_points_to_homopoints(rotated_pc.T)
    points_2d.append(project(_CAMERA.K_matrix, points_mesh).T)
    points_obb = camera.convert_points_to_homopoints(np.array(rotated_box).T)
    box_obb.append(project(_CAMERA.K_matrix, points_obb).T)
    xyz_axis = 0.3*np.array([[0, 0, 0], [0, 0, 1], [0, 1, 0], [1, 0, 0]]).transpose()
    sRT = abs_pose_outputs[j].camera_T_object @ abs_pose_outputs[j].scale_matrix
    transformed_axes = transform_coordinates_3d(xyz_axis, sRT)
    axes.append(calculate_2d_projections(transformed_axes, _CAMERA.K_matrix[:3,:3]))
draw_geometries(rotated_pcds)

## Project 3D Pointclouds and 3D bounding boxes on 2D image

In [None]:
color_img = np.copy(img_vis)
projected_points_img = show_projected_points(color_img, points_2d)
colors_box = [(63, 237, 234)]
im = np.copy(img_vis)
for k in range(len(colors_box)):
    # Use a separate loop variable so that the points_2d list is not overwritten
    for box_2d, axis in zip(box_obb, axes):
        box_2d = np.array(box_2d)
        im = draw_bboxes(im, box_2d, axis, colors_box[k])

plt.gca().invert_yaxis()
plt.axis('off')
# OpenCV loads images as BGR; reverse the channel order for matplotlib
plt.imshow(im[...,::-1])
plt.show()
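
The projection step in this section follows the standard pinhole camera model: a camera-frame point (x, y, z) maps to pixel (fx * x / z + cx, fy * y / z + cy). Below is a minimal sketch of that formula, assuming zero skew. The helper `project_points` and the intrinsics are illustrative only; the notebook itself relies on `project` with `_CAMERA.K_matrix`.

```python
from typing import List, Sequence, Tuple


def project_points(
    K: Sequence[Sequence[float]], points_cam: Sequence[Sequence[float]]
) -> List[Tuple[float, float]]:
    """Project 3D camera-frame points (x, y, z) with z > 0 to pixel coordinates (u, v)."""
    fx, fy = K[0][0], K[1][1]
    cx, cy = K[0][2], K[1][2]
    pixels = []
    for x, y, z in points_cam:
        if z <= 0:
            raise ValueError("point is behind the camera")
        # Perspective division followed by the intrinsic scaling and offset
        pixels.append((fx * x / z + cx, fy * y / z + cy))
    return pixels


# Illustrative intrinsics, not the NOCS camera parameters
K = [
    [600.0, 0.0, 320.0],
    [0.0, 600.0, 240.0],
    [0.0, 0.0, 1.0],
]
print(project_points(K, [(0.0, 0.0, 1.0), (0.1, -0.05, 2.0)]))
```

A point on the optical axis lands on the principal point (cx, cy), which is a quick sanity check for any intrinsics matrix.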

# Shape Appearance and Pose Optimization

Here we run the core optimization loop: we update the shape and appearance latent codes, together with the absolute poses, so that they better fit the single-view RGB-D observation available at test time.
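
As a rough intuition for this kind of test-time fitting, the toy below adjusts the centre and radius of a sphere by gradient descent so that partially observed surface points end up on its zero level set. It is only a sketch under strong assumptions: an analytic sphere replaces the learned SDF decoder, there is no appearance term, and `hemisphere_points` and `fit_sphere` are hypothetical helpers rather than the `Optimizer` used in this notebook.

```python
import math
import random


def hemisphere_points(center, radius, n, seed=0):
    """Sample the half of a sphere whose normals point toward -z (a crude single view)."""
    rng = random.Random(seed)
    points = []
    while len(points) < n:
        d = [rng.gauss(0.0, 1.0) for _ in range(3)]
        norm = math.sqrt(sum(v * v for v in d))
        if norm < 1e-9:
            continue
        d = [v / norm for v in d]
        if d[2] > 0:  # flip so that only one side of the object is observed
            d[2] = -d[2]
        points.append(tuple(c + radius * v for c, v in zip(center, d)))
    return points


def fit_sphere(points, center, radius, lr=0.4, steps=500):
    """Gradient descent on mean((|p - c| - r)^2) over the centre c and the radius r."""
    cx, cy, cz = center
    n = len(points)
    for _ in range(steps):
        gcx = gcy = gcz = gr = 0.0
        for x, y, z in points:
            dx, dy, dz = x - cx, y - cy, z - cz
            dist = math.sqrt(dx * dx + dy * dy + dz * dz) or 1e-12
            residual = dist - radius  # signed distance of the observation to the sphere
            # d(residual^2)/dc = -2 * residual * (p - c) / dist, d(residual^2)/dr = -2 * residual
            gcx += -2.0 * residual * dx / dist
            gcy += -2.0 * residual * dy / dist
            gcz += -2.0 * residual * dz / dist
            gr += -2.0 * residual
        cx -= lr * gcx / n
        cy -= lr * gcy / n
        cz -= lr * gcz / n
        radius -= lr * gr / n
    return (cx, cy, cz), radius


observed = hemisphere_points(center=(0.0, 0.0, 1.0), radius=0.5, n=200)
center_fit, radius_fit = fit_sphere(observed, center=(0.1, -0.1, 1.1), radius=0.3)
print("centre:", center_fit, "radius:", radius_fit)
```

Because only one side of the object is observed, depth and size are partly coupled and converge more slowly than the in-plane position, which is one reason a good initial estimate matters.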

In [None]:
# First define some parameters for the optimization and import the relevant functions
from sdf_latent_codes.get_surface_pointcloud import get_sdfnet
from sdf_latent_codes.get_rgb import get_rgbnet
from utils.transform_utils import get_abs_pose_vector_from_matrix, get_abs_pose_from_vector
from utils.nocs_utils import get_masks_out, get_aligned_masks_segout, get_masked_textured_pointclouds
from opt.optimization_all import Optimizer

optimization_out = {}
do_optim = True
# Per-object results collected by the optimization loop
latent_opt = []
RT_opt = []
scale_opt = []
appearance_opt = []
colored_opt_pcds = []
colored_opt_meshes = []
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
psi, theta, phi, t = (0, 0, 0, 0)
shape_latent_noise = np.random.normal(loc=0, scale=0.02, size=64)
add_noise = False  # set to True to perturb the shape code with shape_latent_noise before optimizing
viz_type = None  # leave as None to skip creating per-object optim_images_<k> folders

# get masks and masked pointclouds of each object in the image
depth_ = np.array(depth, dtype=np.float32)*255.0
seg_output.convert_to_numpy_from_torch()
masks_out = get_masks_out(seg_output, depth_)
masks_out = get_aligned_masks_segout(masks_out, output_indices, depth_)
masked_pointclouds, areas, masked_rgb = get_masked_textured_pointclouds(masks_out, depth_, left_linear[:,:,::-1], camera = _CAMERA)
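
The masked pointclouds above are the optimization targets. Conceptually, each one comes from back-projecting the depth pixels inside an object mask through the camera intrinsics. The snippet below is a minimal sketch of that step with a pinhole model and toy values; `backproject_masked_depth` is a hypothetical helper and is not `get_masked_textured_pointclouds`.

```python
from typing import List, Sequence, Tuple


def backproject_masked_depth(
    depth: Sequence[Sequence[float]],
    mask: Sequence[Sequence[bool]],
    fx: float,
    fy: float,
    cx: float,
    cy: float,
) -> List[Tuple[float, float, float]]:
    """Lift every masked pixel with valid depth to a 3D point in the camera frame."""
    points = []
    for v, (depth_row, mask_row) in enumerate(zip(depth, mask)):
        for u, (z, keep) in enumerate(zip(depth_row, mask_row)):
            if not keep or z <= 0:
                continue  # outside the object mask, or missing depth
            points.append(((u - cx) * z / fx, (v - cy) * z / fy, z))
    return points


# Toy 2x3 depth map and mask; the intrinsics are illustrative only
depth_map = [[0.0, 1.0, 1.0],
             [2.0, 2.0, 0.0]]
object_mask = [[True, True, False],
               [False, True, True]]
object_points = backproject_masked_depth(depth_map, object_mask, fx=2.0, fy=2.0, cx=1.0, cy=0.5)
print(object_points)
```

Pixels with zero depth are skipped, so a mask that covers mostly invalid depth can yield very few 3D points for an object.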


In [None]:
# Helper function to build the textured shape at its absolute pose after the optimization loop
def draw_colored_shape(emb, abs_pose, appearance_emb, rgbnet, sdf_latent_code_dir, is_oct_grid=False):
    nrm_dsdf = None  # normals are only returned by the octree-grid extraction
    if is_oct_grid:
        lod = 7
        pcd_dsdf, nrm_dsdf = get_surface_pointclouds_octgrid_viz(emb, lod=lod, sdf_latent_code_dir=sdf_latent_code_dir)
    else:
        pcd_dsdf = get_surface_pointclouds(emb)

    pred_rgb = get_rgb_from_rgbnet(emb, pcd_dsdf, appearance_emb, rgbnet)

    rotated_pc, rotated_box, _ = get_pc_absposes(abs_pose, pcd_dsdf)
    pcd = o3d.geometry.PointCloud()
    pcd.points = o3d.utility.Vector3dVector(np.copy(rotated_pc))
    pcd.colors = o3d.utility.Vector3dVector(pred_rgb.detach().cpu().numpy())
    if nrm_dsdf is not None:
        pcd.normals = o3d.utility.Vector3dVector(nrm_dsdf)
    return pcd

## Core optimization loop

This cell takes a couple of minutes to run per image. You can play around with the optimization parameters to trade speed against accuracy: for example, a lower LoD or 100 optimization steps would suffice in most cases.
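
To see why a lower LoD is cheaper, consider a regular octree in which level n splits each axis into 2^n cells. The sketch below tabulates the resulting upper bound on cell counts for the LoD range used in this notebook. It assumes that standard layout; the actual ShAPO grid may be sparser, and `octree_cells` is a hypothetical helper.

```python
def octree_cells(lod: int) -> int:
    """Upper bound on the number of cells at one octree level: (2**lod)**3 == 8**lod."""
    if lod < 0:
        raise ValueError("lod must be non-negative")
    return 8 ** lod


for lod in range(3, 8):
    side = 2 ** lod
    print(f"LoD {lod}: {side}^3 grid, up to {octree_cells(lod):,} cells")
```

Each extra level multiplies the dense bound by 8. A surface-adaptive octree only refines cells near the surface, so its growth is closer to 4x per level, but the trend is the same.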

In [None]:
# Core optimization loop
for k in range(len(shape_emb_outputs)):
    print("Starting optimization, object:", k, "\n", "----------------------------", "\n")
    if viz_type is not None:
        # NOTE: output_path is not defined in this notebook; set it before enabling viz_type
        optim_foldername = str(output_path) + '/optim_images_' + str(k)
        os.makedirs(optim_foldername, exist_ok=True)
    else:
        optim_foldername = None

    # Optimization starts here
    abs_pose = abs_pose_outputs[k]
    mask_area = areas[k]
    RT, s = get_abs_pose_vector_from_matrix(abs_pose.camera_T_object, abs_pose.scale_matrix, add_noise=False)

    if masked_pointclouds[k] is not None:
        shape_emb = shape_emb_outputs[k]
        appearance_emb = appearance_emb_outputs[k]
        decoder = get_sdfnet(sdf_latent_code_dir=sdf_pretrained_dir)
        rgbnet = get_rgbnet(rgb_model_dir)
        params = {}
        weights = {}

        if add_noise:
            shape_emb += shape_latent_noise

        # Set the latent vectors and the absolute pose to optimize
        params['latent'] = shape_emb
        params['RT'] = RT
        params['scale'] = np.array(s)
        params['appearance'] = appearance_emb
        weights['3d'] = 1

        optimizer = Optimizer(params, rgbnet, device, weights, mask_area)
        # Optimize the initial pose estimate
        iters_optim = 200
        optimizer.optimize_oct_grid(
            iters_optim,
            masked_pointclouds[k],
            masked_rgb[k],
            decoder,
            rgbnet,
            optim_foldername,
            viz_type=viz_type
        )

        # Save the latent vectors after optimization
        latent_opt.append(params['latent'].detach().cpu().numpy())
        RT_opt.append(params['RT'].detach().cpu().numpy())
        scale_opt.append(params['scale'].detach().cpu().numpy())
        appearance_opt.append(params['appearance'].detach().cpu().numpy())
        abs_pose = get_abs_pose_from_vector(params['RT'].detach().cpu().numpy(), params['scale'].detach().cpu().numpy())
        obj_colored = draw_colored_shape(params['latent'].detach().cpu().numpy(), abs_pose, params['appearance'].detach().cpu().numpy(), rgbnet, sdf_pretrained_dir, is_oct_grid=True)
        colored_opt_pcds.append(obj_colored)
    else:
        # No masked pointcloud for this object: keep the regressed estimates
        latent_opt.append(shape_emb_outputs[k])
        RT_opt.append(RT)
        scale_opt.append(np.array(s))
        appearance_opt.append(appearance_emb_outputs[k])
    print("Done with optimization, object:", k, "\n", "----------------------------", "\n")

## Visualizing optimized 3D output

Finally, we visualize the optimized 3D shapes, appearance, and poses. Notice the difference from the regressed output, especially in appearance.

In [None]:
draw_geometries(colored_opt_pcds)