# Downsampling
Downsampling reduces the number of points in a point cloud. The reasons for using downsampling are as follows:

- Saving memory and processing time: Processing a point cloud with many points can require a large amount of memory and computation time. Downsampling such a point cloud reduces this burden.
- Reducing the complexity of point clouds: Some points in a point cloud are redundant for processing. Downsampling can remove such points.

This section introduces the following downsampling methods:
- Random Sampling
- FPS (Furthest point sampling)
- Voxel grid sampling

In [1]:
%load_ext autoreload
%autoreload 2

## Random sampling
This method randomly samples $N$ points from the point cloud, where $N$ is a number chosen by the user. This subsection uses the following code:

In [2]:
# for random sampling
from tutlibs.sampling import random_sampling
from tutlibs.operator import gather

# for description
import numpy as np
from tutlibs.io import Points as io
from tutlibs.visualization import JupyterVisualizer as jv
from tutlibs.transformation import Transformation as tr
from tutlibs.utils import single_color
import inspect

Jupyter environment detected. Enabling Open3D WebVisualizer.
[Open3D INFO] WebRTC GUI backend enabled.
[Open3D INFO] WebRTCWindowSystem: HTTP handshake server disabled.


In [3]:
# load point cloud data.
coords, _, _ = io.read('../data/bunny_pc.ply')

# get sample indices from random sampling function
idxs = random_sampling(coords, 500)

# get sample points
sampled_coords = gather(coords, idxs)

# visualize samples and origin.
sampled_coords = tr.translation(sampled_coords, np.array([1, 0, 0]))
obj_points = jv.point(coords, single_color("#ff0000",len(coords)))
obj_sampled_points = jv.point(sampled_coords)
jv.display([obj_points, obj_sampled_points])

Output()

In the above output, the red points are the original points and the blue points are the sampled points.
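
The source of `random_sampling` is not shown in this section, so the following is only a minimal NumPy sketch of the idea, not the `tutlibs` implementation. It assumes sampling without replacement (the library may behave differently), and the names `random_sampling_sketch`, `coords_demo`, `idxs_demo`, and `sampled_demo` are hypothetical. Plain NumPy indexing stands in for `gather` here.

```python
import numpy as np


def random_sampling_sketch(coords: np.ndarray, num_sample: int, seed=None) -> np.ndarray:
    """Return indices of num_sample points drawn uniformly at random.

    Assumption: points are drawn without replacement, so no index repeats.
    """
    rng = np.random.default_rng(seed)
    num_points = coords.shape[0]
    return rng.choice(num_points, size=num_sample, replace=False)


# Hypothetical demo data: 1000 random points in the unit cube.
coords_demo = np.random.default_rng(0).random((1000, 3))

# Get sample indices, then pick the sampled coordinates by indexing.
idxs_demo = random_sampling_sketch(coords_demo, 100, seed=0)
sampled_demo = coords_demo[idxs_demo]  # shape (100, 3)
```

Because every point is equally likely to be picked, dense regions of the cloud contribute more samples than sparse regions.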

## FPS (Furthest point sampling)
FPS iteratively selects the point furthest from the points sampled so far, until the requested number of points is reached. FPS is therefore useful when we want to specify the number of samples and obtain points that are spread evenly in 3D space. This subsection uses the following code:

In [4]:
# for FPS
from tutlibs.sampling import furthest_point_sampling
from tutlibs.operator import gather

# for description
import numpy as np
from tutlibs.io import Points as io
from tutlibs.visualization import JupyterVisualizer as jv
from tutlibs.transformation import Transformation as tr
from tutlibs.utils import single_color
import inspect

In [5]:
# load point cloud data.
coords, _, _ = io.read('../data/bunny_pc.ply')

# get sample indices from FPS function
idxs = furthest_point_sampling(coords, 500)

# get sample points
sampled_coords = gather(coords, idxs)

# visualize samples and origin.
sampled_coords = tr.translation(sampled_coords, np.array([1, 0, 0]))
obj_points = jv.point(coords, single_color("#ff0000",len(coords)))
obj_sampled_points = jv.point(sampled_coords)
jv.display([obj_points, obj_sampled_points])

Output()

In the above output, the red points are the original points and the blue points are the sampled points. The FPS function `furthest_point_sampling` outputs $N$ sample indices into the point coordinates `coords`.

Next, let's look at the contents of `furthest_point_sampling`.

In [6]:
print(inspect.getsource(furthest_point_sampling))

def furthest_point_sampling(coords: np.ndarray, num_sample: int) -> np.ndarray:
    """Furthest point sampling

    Args:
        coords: xyz coordinates, (N, 3)
        num_sample: number of samples

    Returns:
        indices: sample indices, (num_sample)
    """
    N, _ = coords.shape

    min_square_dists = np.full(N, 2 ** 16 - 1, dtype=np.float32)
    sample_indices = np.zeros(num_sample, dtype=np.int32)

    # Get first index
    sample_indices[0] = 0
    for i in range(1, num_sample):
        # compute square distances between coords and previous sample.
        previous_sample = coords[sample_indices[i - 1]]
        relative_coords = coords - previous_sample[np.newaxis, :]  # (N, 3) - (1, 3)
        square_dists = np.sum(relative_coords ** 2, axis=1)  # (N)

        # update minimum distance between coords and samples.
        min_dist_mask = square_dists < min_square_dists
        min_square_dists[min_dist_mask] = square_dists[min_dist_mask]

        # get new furthest poin


In the above implementation, FPS returns sample indices into `coords`.
The FPS algorithm iterates until it has collected the requested number of points. Each iteration proceeds as follows:

1. Find the point that is furthest from all sampling points obtained up to the current iteration, that is, the point whose distance to its nearest sampled point is largest. This point must not already be a sampled point.
2. Add this furthest point as a new sampling point.
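
The two steps above can be sketched as a complete loop. This is a minimal illustration that follows the same structure as the printed source, not the library code itself: the names `fps_sketch`, `line_demo`, and `fps_idxs_demo` are hypothetical, and the sketch initializes the running minimum with `np.inf` so that it does not depend on the scale of the coordinates.

```python
import numpy as np


def fps_sketch(coords: np.ndarray, num_sample: int) -> np.ndarray:
    """Furthest point sampling sketch; returns indices into coords."""
    num_points = coords.shape[0]

    # For every point, the squared distance to its nearest sample so far.
    min_square_dists = np.full(num_points, np.inf)
    sample_indices = np.zeros(num_sample, dtype=np.int64)

    # Assumption: the first sample is simply the point at index 0.
    sample_indices[0] = 0
    for i in range(1, num_sample):
        # Squared distances between all points and the previous sample.
        previous_sample = coords[sample_indices[i - 1]]
        square_dists = np.sum((coords - previous_sample) ** 2, axis=1)

        # Keep, per point, the distance to its nearest sample.
        min_square_dists = np.minimum(min_square_dists, square_dists)

        # Step 1 and 2: the point with the largest nearest-sample
        # distance becomes the next sample.
        sample_indices[i] = np.argmax(min_square_dists)

    return sample_indices


# Hypothetical demo: five points on a line at x = 0, 1, 2, 3, 10.
line_demo = np.array(
    [[0, 0, 0], [1, 0, 0], [2, 0, 0], [3, 0, 0], [10, 0, 0]], dtype=float
)
fps_idxs_demo = fps_sketch(line_demo, 3)
print(fps_idxs_demo)  # [0 4 3]
```

In the demo, the first sample is index 0, the second is the far point at $x = 10$ (index 4), and the third is index 3, which is the remaining point furthest from both samples. A point that is already sampled has a nearest-sample distance of zero, so it is not selected again as long as unsampled, non-duplicate points remain.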


## Voxel grid sampling
Voxel grid sampling divides space into a voxel grid and takes the average coordinates of the points inside each voxel. The samples therefore contain one point per occupied voxel.   
**Note**: in this subsection, the voxel grid is a 3D space divided into tiny cubes arranged in a grid, and a voxel is one of these tiny cubes.

This subsection uses the following code:

In [4]:
# for voxel grid sampling
import numpy as np
from tutlibs.sampling import voxel_grid_sampling

# for description
from tutlibs.io import Points as io
from tutlibs.visualization import JupyterVisualizer as jv
from tutlibs.transformation import Transformation as tr
from tutlibs.utils import single_color
import inspect

In [5]:
coords, rgb, _ = io.read('../data/bunny_pc.ply')
sampled_coords = voxel_grid_sampling(coords, 0.1)

# visualization
sampled_coords = tr.translation(sampled_coords, np.array([1, 0, 0]))
obj_points = jv.point(coords, single_color("#ff0000",len(coords)))
obj_sampled_points = jv.point(sampled_coords)
jv.display([obj_points, obj_sampled_points])

Output()

In the above output, the red points are the original points and the blue points are the sampled points. The voxel grid sampling function `voxel_grid_sampling` outputs the coordinates of the samples, one per occupied voxel.

Next, let's look at the contents of the `voxel_grid_sampling`.

In [9]:
print(inspect.getsource(voxel_grid_sampling))

def voxel_grid_sampling(coords: np.ndarray, voxel_size: float) -> np.ndarray:
    """Voxel grid sampling

    Args:
        coords: coords (N, C)
        voxel_size: voxel grid size

    Returns:
        samples: sample coords (M, C)
    """
    N, C = coords.shape
    
    # get voxel indices.
    indices_float = coords / voxel_size
    indices = indices_float.astype(np.int32)

    # calculate the average coordinate of the point for each voxel.
    _, voxel_labels = np.unique(indices, axis=0, return_inverse=True)
    df = pd.DataFrame(data=np.concatenate(
        [voxel_labels[:, np.newaxis], coords], axis=1), columns=np.arange(C+1))
    voxel_mean_df = df.groupby(0).mean()

    # use average coordinates as samples.
    samples = voxel_mean_df.to_numpy()

    return samples



In the above implementation, `voxel_grid_sampling` returns the coordinates of the samples. The voxel grid sampling process is as follows:

1. Assign each point to a voxel.
2. For each voxel, calculate the average of the coordinates of the points in that voxel. These averages are the sampling result.
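
The two steps above can also be written with NumPy alone. The following is a hedged sketch rather than the library function: the names `voxel_grid_sampling_sketch`, `points_demo`, and `samples_demo` are hypothetical. One deliberate difference from the printed source is that the sketch uses `np.floor` to compute voxel indices, whereas `astype(np.int32)` truncates toward zero, so coordinates just below and just above zero would share the same voxel index.

```python
import numpy as np


def voxel_grid_sampling_sketch(coords: np.ndarray, voxel_size: float) -> np.ndarray:
    """Average the coordinates of the points that fall into each voxel."""
    # Step 1: assign each point to a voxel by its integer grid index.
    voxel_indices = np.floor(coords / voxel_size).astype(np.int64)

    # Give every occupied voxel a label 0..M-1 and count its points.
    _, voxel_labels, counts = np.unique(
        voxel_indices, axis=0, return_inverse=True, return_counts=True
    )
    voxel_labels = voxel_labels.reshape(-1)  # ensure shape (N,)

    # Step 2: sum the coordinates per voxel, then divide by the counts.
    sums = np.zeros((counts.shape[0], coords.shape[1]))
    np.add.at(sums, voxel_labels, coords)
    return sums / counts[:, np.newaxis]


# Hypothetical demo: the first two points share a voxel, the others do not.
points_demo = np.array(
    [
        [0.01, 0.01, 0.01],
        [0.03, 0.03, 0.03],
        [0.25, 0.00, 0.00],
        [-0.01, 0.00, 0.00],
    ]
)
samples_demo = voxel_grid_sampling_sketch(points_demo, 0.1)
print(samples_demo.shape)  # (3, 3)
```

Unlike random sampling and FPS, the number of samples is not chosen directly. It follows from the voxel size: larger voxels merge more points and give fewer samples.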