
In [1]:
# Copyright (c) Meta Platforms, Inc. and affiliates. All rights reserved.

# Demo data
In this demo, we show how to access the CO3D (v2) dataset using the provided data loaders. We also visualise the images, the COLMAP-reconstructed point clouds, and the cameras.

## 0. Install and import modules

Ensure `torch` and `torchvision` are installed. If `pytorch3d` is not installed, install it using the following cell:


In [2]:
import os
import sys
import torch
need_pytorch3d=False
try:
    import pytorch3d
except ModuleNotFoundError:
    need_pytorch3d=True
if need_pytorch3d:
    if torch.__version__.startswith("1.12.") and sys.platform.startswith("linux"):
        # We try to install PyTorch3D via a released wheel.
        pyt_version_str=torch.__version__.split("+")[0].replace(".", "")
        version_str="".join([
            f"py3{sys.version_info.minor}_cu",
            torch.version.cuda.replace(".",""),
            f"_pyt{pyt_version_str}"
        ])
        !pip install fvcore iopath omegaconf
        !pip install --no-index --no-cache-dir pytorch3d -f https://dl.fbaipublicfiles.com/pytorch3d/packaging/wheels/{version_str}/download.html
    else:
        # We try to install PyTorch3D from source.
        !curl -LO https://github.com/NVIDIA/cub/archive/1.10.0.tar.gz
        !tar xzf 1.10.0.tar.gz
        os.environ["CUB_HOME"] = os.getcwd() + "/cub-1.10.0"
        !pip install 'git+https://github.com/facebookresearch/pytorch3d.git@stable'

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting fvcore
  Downloading fvcore-0.1.5.post20220512.tar.gz (50 kB)
Collecting iopath
  Downloading iopath-0.1.10.tar.gz (42 kB)
Collecting omegaconf
  Downloading omegaconf-2.2.3-py3-none-any.whl (79 kB)
Collecting yacs>=0.1.6
  Downloading yacs-0.1.8-py3-none-any.whl (14 kB)
Collecting portalocker
  Downloading portalocker-2.6.0-py2.py3-none-any.whl (15 kB)
Collecting antlr4-python3-runtime==4.9.*
  Downloading antlr4-python3-runtime-4.9.3.tar.gz (117 kB)
Building wheels for collected packages: fvcore, iopath, antlr4-python3-runtime
  Building wheel for fvcore (setup.py) ... done
  Created wheel for fvcore: filename=fv

Looking in links: https://dl.fbaipublicfiles.com/pytorch3d/packaging/wheels/py37_cu113_pyt1121/download.html
Collecting pytorch3d
  Downloading https://dl.fbaipublicfiles.com/pytorch3d/packaging/wheels/py37_cu113_pyt1121/pytorch3d-0.7.1-cp37-cp37m-linux_x86_64.whl (47.2 MB)
Installing collected packages: pytorch3d
Successfully installed pytorch3d-0.7.1
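
The wheel URL in the log above is built from the Python, CUDA and PyTorch versions. As a minimal sketch, the same string construction can be isolated in a helper; `wheel_tag` is a hypothetical name introduced here for illustration, and the exact `+cu113` suffix of the torch version string is an assumption.

```python
def wheel_tag(python_minor: int, cuda_version: str, torch_version: str) -> str:
    """Build the directory name used in the PyTorch3D wheel URL (illustrative helper)."""
    # Drop any local build suffix such as "+cu113", then remove the dots.
    pyt_version_str = torch_version.split("+")[0].replace(".", "")
    return "".join([
        f"py3{python_minor}_cu",
        cuda_version.replace(".", ""),
        f"_pyt{pyt_version_str}",
    ])


# Values consistent with the Colab run logged above: Python 3.7, CUDA 11.3, torch 1.12.1.
print(wheel_tag(7, "11.3", "1.12.1+cu113"))  # py37_cu113_pyt1121
```

If no wheel exists for a given combination, the `pip install --no-index` call fails, which is why the install cell only attempts this route for `torch` 1.12 on Linux and builds from source otherwise.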


In [3]:
import base64
import IPython
import imageio
import numpy as np
import torch
from PIL import Image
from pytorch3d.implicitron.dataset.json_index_dataset_map_provider_v2 import JsonIndexDatasetMapProviderV2
from pytorch3d.implicitron.tools.config import expand_args_fields
from pytorch3d.renderer import join_cameras_as_batch
from pytorch3d.vis.plotly_vis import plot_batch_individually, plot_scene

Here is a subset of CO3D: one apple sequence. We provide it separately to save time and space.

In [None]:
# prepare the data
!wget https://dl.fbaipublicfiles.com/pytorch3d/data/implicitron_tutorial/co3d_apple_1_sequence.tar.gz
!tar -xzf co3d_apple_1_sequence.tar.gz

## 1. Calling data loaders

In [None]:
output_resolution = 80
torch.set_printoptions(sci_mode=False)

In [None]:
CO3D_ROOT = "."

dataset_provider = JsonIndexDatasetMapProviderV2(
    category="apple",
    subset_name="manyview_dev_0",
    dataset_root=CO3D_ROOT,
    dataset_JsonIndexDataset_args={
        "load_depths": False,
        "load_point_clouds": True,
    },
)
# This is a lightweight provider that loads only metadata, not blobs such as images;
# e.g. at test time, images can be hidden or unknown.
dataset_provider_cameras_only = JsonIndexDatasetMapProviderV2(
    category="apple",
    subset_name="manyview_dev_0",
    dataset_root=CO3D_ROOT,
    dataset_JsonIndexDataset_args={
        "box_crop": False,
        "load_images": False,
        "load_depths": False,
        "load_depth_masks": False,
        "load_masks": False,
    },
)

In [None]:
dataset_map = dataset_provider.get_dataset_map()

In [None]:
train_list = list(dataset_map.train)

### What is FrameData?

The elements of `train_list` are `FrameData` dataclass objects that
contain all data fields relevant to the current image / viewpoint.

```python
@dataclass
class FrameData(Mapping[str, Any]):
    image_path: Union[str, List[str], None] = None
    image_rgb: Optional[torch.Tensor] = None
    depth_map: Optional[torch.Tensor] = None
    bbox_xywh: Optional[torch.Tensor] = None
    camera: Optional[PerspectiveCameras] = None
    # many more fields
    ...
```

They can be collated with `FrameData.collate` and unpacked with `**`.
Below, we will be using `image_rgb` and `camera` fields.
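
To illustrate why a dataclass that implements `Mapping` can be unpacked with `**`, here is a minimal sketch using a toy class. `ToyFrame`, its two fields, and `describe` are hypothetical stand-ins and not part of PyTorch3D; the toy `collate` only gathers fields into lists, whereas the real `FrameData.collate` is assumed to do more (for example, batching tensors).

```python
from collections.abc import Mapping
from dataclasses import dataclass, fields
from typing import Any


@dataclass
class ToyFrame(Mapping):
    """Hypothetical stand-in for FrameData; not part of PyTorch3D."""

    image_path: Any = None
    frame_number: Any = None

    # __getitem__, __iter__ and __len__ are what make `**frame` work.
    def __getitem__(self, key):
        if key not in {f.name for f in fields(self)}:
            raise KeyError(key)
        return getattr(self, key)

    def __iter__(self):
        return iter(f.name for f in fields(self))

    def __len__(self):
        return len(fields(self))

    @classmethod
    def collate(cls, batch):
        # Toy collation: gather each field of the batch items into a list.
        return cls(**{f.name: [getattr(b, f.name) for b in batch] for f in fields(cls)})


def describe(image_path=None, frame_number=None):
    return f"frame {frame_number} from {image_path}"


frames = [ToyFrame("a.jpg", 0), ToyFrame("b.jpg", 1)]
batch = ToyFrame.collate(frames)
print(batch.image_path)       # ['a.jpg', 'b.jpg']
print(describe(**frames[0]))  # frame 0 from a.jpg
```

The same pattern lets a frame be passed directly to a function whose keyword arguments match the field names.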

## 2. Visualisation

In [None]:
def to_numpy_image(image):
    # Converts an image of shape (C, H, W) with values in [0, 1], where C=3 or 1,
    # to a numpy uint8 image of shape (H, W, 3).
    return (image * 255).to(torch.uint8).permute(1, 2, 0).detach().cpu().expand(-1, -1, 3).numpy()

def resize_image(image):
    # Resizes images of shape (B, C, H, W) to (B, C, output_resolution, output_resolution).
    return torch.nn.functional.interpolate(image, size=(output_resolution, output_resolution))

def show_gif(fname):
    """Show a gif in a bento notebook"""
    with open(fname, "rb") as fd:
        b64 = base64.b64encode(fd.read()).decode("ascii")
    return IPython.display.HTML(f'<img src="data:image/gif;base64,{b64}" />')
    
images_to_display = [[to_numpy_image(resize_image(a.image_rgb[None])[0])] for a in train_list]
n_rows = 7
n_images = len(images_to_display)
blank_image = images_to_display[0][0] * 0
n_per_row = 1+(n_images-1)//n_rows
for _ in range(n_per_row*n_rows - n_images):
    images_to_display.append([blank_image])

split = []
for row in range(n_rows):
    split.append(images_to_display[row*n_per_row:(row+1)*n_per_row])  
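
The grid layout in the cell above relies on ceiling division plus padding, so that every row has the same length and `np.block` can tile the rows. A minimal sketch of the same arithmetic on plain lists follows; `split_into_rows` and `filler` are hypothetical names introduced for illustration.

```python
def split_into_rows(items, n_rows, filler):
    """Pad `items` with `filler` and split them into `n_rows` rows of equal length."""
    n_items = len(items)
    # Ceiling division: the smallest row length such that n_rows rows hold every item.
    n_per_row = 1 + (n_items - 1) // n_rows
    padded = list(items) + [filler] * (n_per_row * n_rows - n_items)
    return [padded[r * n_per_row:(r + 1) * n_per_row] for r in range(n_rows)]


rows = split_into_rows(list(range(10)), n_rows=3, filler=None)
print(rows)  # [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, None, None]]
```

In the notebook, the filler is an all-black image with the same shape as the real ones, so the padded cells appear as blank tiles.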


In [None]:
plot_batch_individually(dataset_map.train[0].sequence_point_cloud)

Let’s now show the *training* images.

In [None]:
Image.fromarray(np.block(split))

In [None]:
imageio.mimsave('renders.gif', [im[0] for im in images_to_display])
show_gif('renders.gif')

In [None]:
tr_cameras = [training_frame.camera for training_frame in dataset_map.train]
plot = plot_scene({"k": {i: camera for i, camera in enumerate(tr_cameras)}}, camera_scale=0.25)
plot.layout.scene.aspectmode = "data"
plot

In [None]:
union = {i: camera for i, camera in enumerate(tr_cameras)}
union["cloud"] = dataset_map.train[0].sequence_point_cloud
plot_scene({"k": union})

In [None]:
dataset_map_cameras = dataset_provider_cameras_only.get_dataset_map()
train_cameras = join_cameras_as_batch([frame.camera for frame in dataset_map_cameras.train])
#train_cameras=train_cameras[[0,1]]
val_cameras = join_cameras_as_batch([frame.camera for frame in dataset_map_cameras.val])
test_cameras = join_cameras_as_batch([frame.camera for frame in dataset_map_cameras.test])
#test_cameras = test_cameras[[10,11]]

In [None]:
plot_scene(
    {
        name: {"name": cameras}
        for name, cameras in [("train", train_cameras), ("test", test_cameras)]
    },
    ncols=2,
)