# S3DIS

This notebook lets you instantiate the **[S3DIS](http://buildingparser.stanford.edu/dataset.html)** dataset from scratch and visualize **3D+2D spherical samples**.

Note that you will need **at least 300G** available for the S3DIS raw dataset and **at least 64G** for the processed files at **5cm voxel resolution** and **1024x512 image resolution**.

The S3DIS dataset is composed of **6 Folds**, each representing a distinct building. Images used here are equirectangular panoramic pictures. 
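
To make the equirectangular format concrete, here is a minimal sketch of how a 3D point expressed in camera-centered coordinates lands on a panoramic image. The function name `project_equirectangular` and the axis convention (x pointing at the image center, z pointing up) are assumptions made for illustration; the dataset code may use a different convention.

```python
import math

def project_equirectangular(x, y, z, width=1024, height=512):
    """Map a point in camera-centered coordinates to (u, v) pixel coordinates.

    Assumed convention (not taken from the dataset code): the x axis points
    at the image center and the z axis points up.
    """
    r = math.sqrt(x * x + y * y + z * z)
    if r == 0:
        raise ValueError("the point coincides with the camera center")
    lon = math.atan2(y, x)   # azimuth in [-pi, pi]
    lat = math.asin(z / r)   # elevation in [-pi/2, pi/2]
    u = (0.5 - lon / (2 * math.pi)) * width
    v = (0.5 - lat / math.pi) * height
    # Clamp so that points on the image border map to valid pixel indices
    return min(int(u), width - 1), min(int(v), height - 1)

print(project_equirectangular(1.0, 0.0, 0.0))   # (512, 256), the image center
```

The full 360° of azimuth spans the image width and the 180° of elevation spans its height, which is why panoramas are stored with a 2:1 aspect ratio such as 1024x512.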

The images and point clouds provided by the [2D-3D-semantics](https://github.com/alexsax/2D-3D-Semantics) repository are not all aligned. The `S3DISFusedDataset` class from `torch_points3d.datasets.segmentation.multimodal.s3dis` deals with correcting some room and camera orientations before processing the data.

This dataset is small enough to be loaded entirely into **64G of RAM**. For training, we sample it with **3D spheres and all images seeing at least one point in each sphere**.
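
The sampling idea can be sketched in a few lines of plain Python. This is only an illustration, not the code used by `S3DISFusedDataset`: the helper names and the `visible_points_per_image` structure (a dictionary from image id to the set of point indices visible in that image) are hypothetical.

```python
import math

def sample_sphere(points, center, radius):
    """Return the indices of the points lying within `radius` of `center`."""
    return [i for i, p in enumerate(points) if math.dist(p, center) <= radius]

def images_seeing(point_indices, visible_points_per_image):
    """Return the ids of the images that see at least one of the given points.

    `visible_points_per_image` is a hypothetical mapping
    {image_id: set of point indices visible in that image}.
    """
    selected = set(point_indices)
    return sorted(
        image_id
        for image_id, visible in visible_points_per_image.items()
        if selected & set(visible)
    )

points = [(0.0, 0.0, 0.0), (0.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
visibility = {"pano_a": {0}, "pano_b": {2}, "pano_c": {1, 2}}

in_sphere = sample_sphere(points, center=(0.0, 0.0, 0.0), radius=1.0)
print(in_sphere)                             # [0, 1]
print(images_seeing(in_sphere, visibility))  # ['pano_a', 'pano_c']
```

Here `pano_b` is dropped because the only point it sees lies outside the sphere.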

In [None]:
# Select your GPU
I_GPU = 0

In [None]:
# Uncomment to use autoreload
# %load_ext autoreload
# %autoreload 2

import os
import os.path as osp
import sys
import torch
import numpy as np
from time import time
from omegaconf import OmegaConf
start = time()
import warnings
warnings.filterwarnings('ignore')

torch.cuda.set_device(I_GPU)
DIR = os.path.dirname(os.getcwd())
ROOT = os.path.join(DIR, "..")
sys.path.insert(0, ROOT)
sys.path.insert(0, DIR)

from torch_points3d.utils.config import hydra_read
from torch_geometric.data import Data
from torch_points3d.core.multimodal.data import MMData
from torch_points3d.visualization.multimodal_data import visualize_mm_data
from torch_points3d.core.multimodal.image import SameSettingImageData, ImageData
from torch_points3d.datasets.segmentation.multimodal.s3dis import S3DISFusedDataset, OBJECT_LABEL, OBJECT_COLOR
COLORS = OBJECT_COLOR.tolist()
CLASSES = list(OBJECT_LABEL.keys()) + ['unlabelled']

If `visualize_mm_data` does not throw any error but the visualization does not appear, you may need to change your plotly renderer below.

In [None]:
import plotly.io as pio

pio.renderers.default = 'jupyterlab'        # for local notebook
# pio.renderers.default = 'iframe_connected'  # for remote notebook. Other working (but seemingly slower) options are: 'sphinx_gallery' and 'iframe'

## Dataset creation

The following will instantiate the dataset. If the data is not found at `DATA_ROOT`, the folder structure will be created and the raw dataset downloaded there.

**Memory-friendly tip**: if you have already downloaded the dataset once and simply want to instantiate a new dataset with different preprocessing (*e.g.* a different 3D or 2D resolution, mapping parameterization, etc.), I recommend manually replicating the folder hierarchy of your existing dataset and creating a symlink to its `raw/` directory to avoid downloading and storing (very) large files twice.
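
A minimal sketch of this symlink trick is shown below. The helper `link_raw_dir` and the `old_root`, `new_root` and `s3dis` folder names are placeholders chosen for the demonstration; copy the actual hierarchy from your existing dataset directory rather than relying on these names.

```python
import os
import tempfile

def link_raw_dir(existing_dataset_dir, new_dataset_dir):
    """Create `new_dataset_dir` and symlink its `raw/` to the existing one."""
    existing_raw = os.path.join(os.path.abspath(existing_dataset_dir), "raw")
    if not os.path.isdir(existing_raw):
        raise FileNotFoundError(f"no raw/ directory found in {existing_dataset_dir}")
    os.makedirs(new_dataset_dir, exist_ok=True)
    new_raw = os.path.join(new_dataset_dir, "raw")
    # Leave any pre-existing raw/ entry untouched
    if not os.path.lexists(new_raw):
        os.symlink(existing_raw, new_raw, target_is_directory=True)
    return new_raw

# Demonstration on throwaway directories (placeholder folder names)
with tempfile.TemporaryDirectory() as tmp:
    old_dir = os.path.join(tmp, "old_root", "s3dis")
    os.makedirs(os.path.join(old_dir, "raw"))
    new_raw = link_raw_dir(old_dir, os.path.join(tmp, "new_root", "s3dis"))
    print(os.path.islink(new_raw))
```

Only `raw/` is shared between the two directories, so each dataset keeps its own processed files.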

You will find the config file governing dataset creation at `conf/data/segmentation/multimodal/s3disfused-sparse.yaml`. You may edit this file, or create new configs inheriting from it using Hydra, and create the associated dataset by modifying `dataset_config` accordingly in the following cell.

In [None]:
# Set your dataset root directory, where the data was/will be downloaded
DATA_ROOT = '/path/to/your/dataset/root/directory'
train_is_trainval = False                                 # set to True if you want the Train set to be Train+Val
sample_per_epoch = 2000                                   # number of spheres sampled in the Train set. Corrects class imbalance. Set to 0 for regularly-sampled spheres
fold = 5                                                  # fold that will be used as Test

dataset_config = 'segmentation/multimodal/s3disfused-sparse'   
models_config = 'segmentation/multimodal/sparseconv3d'    # this does not really matter here, but is expected by hydra for config parsing
model_name = 'Res16UNet34-L4-early-ade20k-interpolate'    # this does not really matter here, but is expected by hydra for config parsing

overrides = [
    'task=segmentation',
    f'data={dataset_config}',
    f'models={models_config}',
    f'model_name={model_name}',
    f'data.dataroot={DATA_ROOT}',
    f'data.fold={fold}',
]

cfg = hydra_read(overrides)
# print(OmegaConf.to_yaml(cfg))

The dataset will now be created based on the parsed configuration. I recommend having **at least 300G** available for the S3DIS raw dataset and **at least 64G** for the processed files at **5cm voxel resolution** and **1024x512 image resolution**.
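
Before launching a long download, it can be worth checking the available disk space. The sketch below uses only the standard library; the helper names are made up for this example, and `"."` is a placeholder for an existing directory on the same filesystem as `DATA_ROOT`. The figures above are taken to mean gigabytes of 10**9 bytes, which is an assumption.

```python
import shutil

def free_space_gb(path):
    """Return the free space, in GB (10**9 bytes), on the filesystem holding `path`."""
    return shutil.disk_usage(path).free / 1e9

def has_enough_space(path, required_gb):
    """Tell whether at least `required_gb` GB are free on the filesystem of `path`."""
    return free_space_gb(path) >= required_gb

# 300 GB for the raw data plus 64 GB for the processed files
print(f"{free_space_gb('.'):.0f} GB free")
print(has_enough_space(".", 300 + 64))
```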

As long as you do not change core dataset parameters, preprocessing should only be performed once for your dataset. It may take some time, **mostly depending on the 3D and 2D resolutions** you choose to work with (the higher the resolution, the slower). As a rule of thumb, it took me **between 1 and 2 hours** (excluding download) to preprocess the 2D and 3D data of the S3DIS dataset.

In [None]:
# Dataset instantiation
start = time()
dataset = S3DISFusedDataset(cfg.data)
# print(dataset)
print(f"Time = {time() - start:0.1f} sec.")

To visualize the multimodal samples produced by the dataset, we need to remove some of the dataset transforms that affect points, images and mappings. The `sample_real_data` function will be used to get samples without breaking mapping consistency for visualization.

At training and evaluation time, these transforms are used for data augmentation, dynamic-size batching (see our [paper](https://arxiv.org/submit/4264152)), etc.
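
The underlying pattern is to swap an attribute temporarily and to restore it afterwards. A generic sketch is given below, assuming nothing about the real dataset classes: `temporary_attribute` and `ToyDataset` are hypothetical names used only for this illustration.

```python
from contextlib import contextmanager

@contextmanager
def temporary_attribute(obj, name, value):
    """Set `obj.name` to `value` inside the `with` block, then restore it."""
    original = getattr(obj, name)
    setattr(obj, name, value)
    try:
        yield obj
    finally:
        # Runs even if the body raises, so the object is never left altered
        setattr(obj, name, original)

class ToyDataset:
    """Hypothetical stand-in for a dataset exposing a `transform` attribute."""

    def __init__(self):
        self.transform = lambda x: x + 100   # plays the role of an augmentation

    def __getitem__(self, idx):
        return self.transform(idx)

toy = ToyDataset()
with temporary_attribute(toy, "transform", lambda x: x):
    raw_sample = toy[3]       # transform disabled
augmented_sample = toy[3]     # transform restored
print(raw_sample, augmented_sample)   # 3 103
```

Restoring inside a `finally` clause matters in a notebook, where a failed cell would otherwise leave the dataset with its transforms stripped for all subsequent cells.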

In [None]:
from torch_geometric.transforms import Center, RandomRotate
from torch_points3d.core.data_transform import RandomNoise, RandomScaleAnisotropic, RandomSymmetry, \
    DropFeature, XYZFeature, AddFeatsByKeys
from torch_points3d.core.data_transform.multimodal.image import ToFloatImage, AddPixelHeightFeature, \
    PickImagesFromMappingArea, PickImagesFromMemoryCredit, CropImageGroups, Normalize
from torch_points3d.datasets.base_dataset import BaseDataset
from torch_points3d.datasets.base_dataset_multimodal import BaseDatasetMM

def sample_real_data(tg_dataset, idx=0):
    """
    Temporarily remove the 3D and 2D transforms affecting the point 
    positions and images from the dataset to better visualize points 
    and images relative positions.
    """
    # Remove some 3D transforms
    transform = tg_dataset.transform
    tg_dataset.transform = BaseDataset.remove_transform(transform, [Center, RandomNoise,
        RandomRotate, RandomScaleAnisotropic, RandomSymmetry, DropFeature, AddFeatsByKeys])
    
    # Remove some 2D transforms
    transform_image = tg_dataset.transform_image
    tg_dataset.transform_image = BaseDatasetMM.remove_multimodal_transform(
        transform_image,
        [ToFloatImage, AddPixelHeightFeature, Normalize, PickImagesFromMemoryCredit])
    
    # Get a sample from the dataset, restoring the transforms even if
    # sampling fails
    try:
        out = tg_dataset[idx]
    finally:
        tg_dataset.transform = transform
        tg_dataset.transform_image = transform_image
    return out

## Visualize a single multimodal sample

We can now pick samples from the train, val and test datasets.

In [None]:
# Pick a multimodal spherical sample
mm_data = sample_real_data(dataset.train_dataset)             # pick a random sphere in the Train set, with balanced class probabilities
# mm_data = sample_real_data(dataset.val_dataset)               # pick a random sphere in the Val set
# mm_data = sample_real_data(dataset.test_dataset[0], idx=100)  # pick a random sphere in the Test set

visualize_mm_data(mm_data, class_names=CLASSES, class_colors=COLORS, figsize=1000, voxel=0.05, show_3d=True, show_2d=True, front='y', alpha=3, pointsize=5)

## Visualize a whole S3DIS fold

We can also view a whole fold.

To allow rapid visualization, note that the point cloud is subsampled to `voxel=0.1` and the maximum number of points shown is set to `max_points=500000`. In addition, `show_2d=False` prevents images from being displayed, which would overload the viewer.
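
To give an intuition of what these two parameters do, here is a rough pure-Python sketch of voxel subsampling followed by a cap on the number of points. It is not the implementation used by `visualize_mm_data`; the function `voxel_subsample` and its `seed` argument are assumptions made for this example.

```python
import random

def voxel_subsample(points, voxel=0.1, max_points=500000, seed=0):
    """Keep one point per `voxel`-sized cell, then at most `max_points` points."""
    kept = {}
    for p in points:
        cell = tuple(int(c // voxel) for c in p)
        kept.setdefault(cell, p)   # the first point seen in each cell wins
    subsampled = list(kept.values())
    if len(subsampled) > max_points:
        # Random draw without replacement to respect the cap
        subsampled = random.Random(seed).sample(subsampled, max_points)
    return subsampled

points = [(0.01, 0.01, 0.0), (0.02, 0.03, 0.0), (0.55, 0.0, 0.0), (1.2, 0.0, 0.0)]
print(len(voxel_subsample(points, voxel=0.1)))                # 3
print(len(voxel_subsample(points, voxel=0.1, max_points=2)))  # 2
```

The first two points fall in the same 10 cm cell, so only one of them is kept. A larger `voxel` or a smaller `max_points` makes the viewer lighter at the cost of detail.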

In [None]:
from torch_points3d.core.multimodal.data import MMData

i_fold = 0  # index among the loaded folds; it does not exactly correspond to the real fold names
mm_data_large = MMData(dataset.train_dataset._datas[i_fold], dataset.train_dataset._images[i_fold], dataset.train_dataset._mappings[i_fold])
# mm_data_large = MMData(dataset.val_dataset._datas[i_fold], dataset.val_dataset._images[i_fold], dataset.val_dataset._mappings[i_fold])
# mm_data_large = MMData(dataset.test_dataset[0]._datas[0], dataset.test_dataset[0]._images[0], dataset.test_dataset[0]._mappings[0])
visualize_mm_data(mm_data_large, class_names=CLASSES, class_colors=COLORS, figsize=1600, voxel=0.1, show_2d=False, max_points=500000)