# 14 - PointNet++ Set Abstraction Layer (Code Study)

This notebook is a detailed code study of the PointNet++ Set Abstraction (SA) layer.

It is based on the Keras layer implementation of PointNet++ from https://github.com/dgriffiths3/pointnet2-tensorflow2.

Follow the notebook to better understand the Set Abstraction layer and its implementation.

## Setup TensorFlow

In [None]:
# Change X to the GPU number you want to use,
# otherwise you will get a Python error
# e.g. USE_GPU = 4
USE_GPU = X

In [2]:
# Import TensorFlow 
import tensorflow as tf

# Print the installed TensorFlow version
print(f'TensorFlow version: {tf.__version__}\n')

# Get all GPU devices on this server
gpu_devices = tf.config.list_physical_devices('GPU')

# Print the name and the type of all GPU devices
print('Available GPU Devices:')
for gpu in gpu_devices:
    print(' ', gpu.name, gpu.device_type)
    
# Set only the GPU specified as USE_GPU to be visible
tf.config.set_visible_devices(gpu_devices[USE_GPU], 'GPU')

# Get all visible GPU  devices on this server
visible_devices = tf.config.get_visible_devices('GPU')

# Print the name and the type of all visible GPU devices
print('\nVisible GPU Devices:')
for gpu in visible_devices:
    print(' ', gpu.name, gpu.device_type)
    
# Set the visible device(s) to not allocate all available memory at once,
# but rather let the memory grow whenever needed
for gpu in visible_devices:
    tf.config.experimental.set_memory_growth(gpu, True)
    
# Import Keras
from tensorflow import keras

# Print the installed Keras version
print(f'\nKeras version: {keras.__version__}\n')

TensorFlow version: 2.3.1

Available GPU Devices:
  /physical_device:GPU:0 GPU
  /physical_device:GPU:1 GPU
  /physical_device:GPU:2 GPU
  /physical_device:GPU:3 GPU
  /physical_device:GPU:4 GPU
  /physical_device:GPU:5 GPU
  /physical_device:GPU:6 GPU
  /physical_device:GPU:7 GPU

Visible GPU Devices:
  /physical_device:GPU:4 GPU

Keras version: 2.4.0



## Prepare TensorFlow CUDA operations

In this section, the TensorFlow operations implemented in CUDA (NVIDIA's platform and programming model for GPU computing) are prepared for use in the PointNet++ Jupyter notebooks.

**Note that this section of the notebook only needs to be executed once to install and compile the TensorFlow CUDA operations.** There should be no harm in executing it repeatedly, though.



**Make sure that the file 'tf_ops.zip' is in the same folder as this notebook and all other notebooks you want to use these TensorFlow operations with.**

First, the source code and the scripts are unzipped.

In [3]:
!unzip -o -q "tf_ops.zip"

Then, the access permissions of the 'compile_ops.sh' script file are changed, and the CUDA code that contains the TensorFlow operations is compiled.

In [4]:
!chmod u+x "tf_ops/compile_ops.sh"
!tf_ops/compile_ops.sh

2021-02-04 11:27:25.989666: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.1
2021-02-04 11:27:27.393458: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.1


**I suggest you immediately comment out the 3 lines of Linux commands above, as they no longer need to be repeated and only cost extra time.**

Now, the CUDA operations should be ready to be used.

## Load a point cloud patch

The data for PointNet++ is provided as a set of files that contain circular patches.

Load a patch with the xyz-coordinates of 100,000 points and their labels. A patch is a circular cutout from the original point cloud. Here, one of the patches in the middle of the dataset is loaded, as the patches at the boundary are not really shaped like a circle anymore. (The dataset is too small and most patches are already close to the border. However, the network still performs well considering the overall size of the dataset.)

In [5]:
import numpy as np
import pandas as pd

You can reproduce this code step by step by checking the result of each line of code with print(x) and print(x.shape). (Replace x with the variable you want to examine.) The array dimensions are especially interesting in order to understand what is happening. As the points still contain their original coordinates, the patches need to be centered on the center of the axis-aligned bounding box of all contained points. Note that the centering is done with vector operations that work on all points at once. Check the documentation of the NumPy functions to learn more.

In [6]:
from pathlib import Path
import os

# directory of PointNet data
data_dir = str(Path.home()) + r'/coursematerial/GIS/ISPRS/PointNet++'

# filename and path of one patch
filename = r'Vaihingen3D_Training_0016.csv'
filepath = os.path.join(data_dir, 'patches', filename)

# read csv file as Pandas DataFrame
xyz_df = pd.read_csv(filepath, sep=' ')

# extract x,y,z-columns from DataFrame and convert to NumPy array
xyz = xyz_df[['x','y','z']].to_numpy()

# center by center point of bounding box
# (the names xyz_min/xyz_max avoid shadowing the built-in min() and max())
xyz_min = np.min(xyz, axis=0)
xyz_max = np.max(xyz, axis=0)
xyz = xyz - np.expand_dims(0.5 * (xyz_max + xyz_min), axis=0)

# extract the labels from DataFrame
labels = xyz_df[['labels']].to_numpy()

Inspecting the xyz array, you notice that there are 100,000 points per patch.

In [7]:
print(xyz, '\n\nShape:', xyz.shape)

[[ 1.00000000e-02 -8.50000000e-02 -1.89500000e+00]
 [ 1.00000000e-02  1.85000001e-01 -1.84500000e+00]
 [ 1.00000000e-02  1.95000000e-01 -1.83500000e+00]
 ...
 [ 4.49000000e+01 -3.23450000e+01 -4.92500000e+00]
 [-5.24900000e+01 -1.71350000e+01 -3.62500000e+00]
 [-5.24900000e+01 -1.71350000e+01 -3.62500000e+00]] 

Shape: (100000, 3)
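
The centering step can be checked on a tiny example. The following is a minimal NumPy sketch with made-up coordinates (the array `pts` is hypothetical and not part of the dataset). It shows that after subtracting the center of the bounding box, the minimum and maximum of every axis are symmetric around zero.

```python
import numpy as np

# three hypothetical points with large "world" coordinates
pts = np.array([[10.0, 200.0, 5.0],
                [14.0, 204.0, 9.0],
                [12.0, 210.0, 6.0]])

# per-axis minimum and maximum define the axis-aligned bounding box
pts_min = pts.min(axis=0)
pts_max = pts.max(axis=0)

# subtract the center of the box; broadcasting applies it to every point
centered = pts - 0.5 * (pts_min + pts_max)

print(centered.min(axis=0), centered.max(axis=0))
```

Note that the bounding-box center is generally not the same as the mean of the points, so the centered coordinates do not necessarily average to zero.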


Also, inspect the labels, which are only used to output a colorized point cloud of the patch.

In [8]:
print(labels, '\n\nShape:', labels.shape)

[[5]
 [5]
 [5]
 ...
 [2]
 [2]
 [2]] 

Shape: (100000, 1)


## Output colorized point cloud

Define a helper function that colorizes the points of a point cloud (xyz) according to their labels (y) and saves them in a csv file with the given filename.

The code is quite condensed and might be difficult to follow in detail. But a detailed understanding is not necessary for this course.

In [9]:
def save_colorized_point_cloud(xyz, y, filename):

    color_map = np.array([
        [255, 255, 125],
        [  0, 255, 255],
        [255, 255, 255],
        [255, 255,   0],
        [  0, 255, 125],
        [  0,   0, 255],
        [  0, 125, 255],
        [125, 255,   0],
        [  0, 255,   0]])
    
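    # map every label to its position among the sorted unique labels
    # (flattened, so that the inverse indices are one-dimensional
    # regardless of the NumPy version)
    u, inverses = np.unique(np.ravel(y), return_inverse=True)
    
    # look up one RGB color per point
    colors = color_map[inverses]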
    
    df = pd.DataFrame(xyz, columns=['x', 'y', 'z'])    

    df['red'] = pd.Series(data=colors[:,0], name='red')
    df['green'] = pd.Series(data=colors[:,1], name='green')
    df['blue'] = pd.Series(data=colors[:,2], name='blue')
    
    df.to_csv(filename, index=False, header=False)
    
    print(f'Saved "{filename}"')

Save the just loaded point cloud patch as colorized point cloud. You can download the saved file and view it with CloudCompare on your computer.

In [10]:
save_colorized_point_cloud(xyz, labels, 'Patch.csv')

Saved "Patch.csv"
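
The most condensed part of the helper function is the color lookup. The following minimal sketch (with hypothetical labels and a made-up three-color table) shows what `np.unique(..., return_inverse=True)` returns. Note that a color is selected by the position of a label among the labels that are present, not by the label value itself.

```python
import numpy as np

# hypothetical labels for five points (column vector like in the notebook)
y = np.array([[5], [5], [2], [7], [2]])

# flatten first, so the inverse indices are one-dimensional
u, inverses = np.unique(y.ravel(), return_inverse=True)

print(u)         # sorted unique labels
print(inverses)  # position of every label within u

# a small made-up color table; row i is used for the i-th unique label
color_map = np.array([[255, 0, 0], [0, 255, 0], [0, 0, 255]])
colors = color_map[inverses]
print(colors.shape)
```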


## Sample and Group

This part studies the sampling & grouping part of the Set Abstraction layer, which is then only followed by a multi-layer perceptron and max pooling.

### Farthest point sampling

Import the Python functions that form the interface to the TensorFlow operations.

In [11]:
from tf_ops.tf_ops import(
    farthest_point_sample,
    gather_point,
    query_ball_point,
    group_point,
)

Sample 8192 points from the 100,000 points of the patch using farthest point sampling. The function returns the indices of the points.

The function, however, expects a tensor of shape BxNx3, where B is the batch size. Our current NumPy array has shape Nx3. Therefore, we need to expand the dimensions first.

In [12]:
x = np.expand_dims(xyz, axis=0)

print(x.shape)

(1, 100000, 3)


There is also a different notation that uses indexing with np.newaxis and the ellipsis (...), which stands for all remaining dimensions.

In [13]:
x = xyz[np.newaxis, ...]

print(x.shape)

(1, 100000, 3)


TensorFlow will convert our NumPy array into a TensorFlow tensor first, before the **farthest_point_sample()** operation is executed. The return value, the array of indices, is also a TensorFlow tensor. In recent versions of TensorFlow, the framework converts automatically between NumPy arrays and TensorFlow tensors.

As we will use the xyz array more often from now on in the form of a tensor (that also includes a batch dimension), it is better to convert it once and keep it in a variable. Otherwise, we would need to expand the dimensions every time. Here, we reuse the variable xyz. So, be aware that xyz is from now on a tensor of shape 1xNx3. The data type of the tensor needs to be specified as a 32-bit floating-point type, because the network works on this data type as well.

(It might take a few seconds as TensorFlow is used the first time and initializes first.)

In [14]:
xyz = tf.convert_to_tensor(np.expand_dims(xyz, axis=0), dtype=tf.float32)
xyz

<tf.Tensor: shape=(1, 100000, 3), dtype=float32, numpy=
array([[[ 1.0000e-02, -8.5000e-02, -1.8950e+00],
        [ 1.0000e-02,  1.8500e-01, -1.8450e+00],
        [ 1.0000e-02,  1.9500e-01, -1.8350e+00],
        ...,
        [ 4.4900e+01, -3.2345e+01, -4.9250e+00],
        [-5.2490e+01, -1.7135e+01, -3.6250e+00],
        [-5.2490e+01, -1.7135e+01, -3.6250e+00]]], dtype=float32)>

Now, the farthest point sampling can be applied to our xyz tensor. Here, one of the CUDA operations is called, which are not part of TensorFlow itself, but are provided by PointNet++. (As a side note, a lot of neural network architectures for point clouds and their implementations that followed PointNet++ also use farthest point sampling and the CUDA operations of PointNet++.)

In [15]:
fps_idx = farthest_point_sample(8192, xyz)

print(fps_idx, '\n\nShape:', fps_idx.shape)

tf.Tensor([[    0 99677 99945 ... 58857 95615 92527]], shape=(1, 8192), dtype=int32) 

Shape: (1, 8192)


As you can see, the first point (at index 0) is always in the result set. The algorithm then iteratively identifies, among the remaining points, the point that is farthest away from the points already in the result set (i.e. the point with the largest distance to its nearest already selected point). It then includes this point in the result set and continues until the requested number of points has been sampled.
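
To make the procedure concrete, here is a minimal NumPy sketch of farthest point sampling for a single (unbatched) point set. It is not the CUDA operation used above, and the function name `farthest_point_sample_np` is hypothetical. Starting at index 0 is an assumption that matches the output shown above.

```python
import numpy as np

def farthest_point_sample_np(n_samples, pts):
    """Return indices of n_samples points chosen by farthest point sampling."""
    n = pts.shape[0]
    idx = np.zeros(n_samples, dtype=np.int64)   # the first sample is index 0
    # squared distance of every point to its nearest already selected point
    min_dist = np.full(n, np.inf)
    for i in range(1, n_samples):
        last = pts[idx[i - 1]]
        d = np.sum((pts - last) ** 2, axis=1)
        min_dist = np.minimum(min_dist, d)
        # the next sample is the point farthest from the selected set
        idx[i] = np.argmax(min_dist)
    return idx

rng = np.random.default_rng(0)
pts = rng.uniform(-1.0, 1.0, size=(1000, 3))
fps = farthest_point_sample_np(16, pts)
print(fps[0], len(np.unique(fps)))
```

Every iteration only needs the distances to the most recently selected point, because the distances to all earlier selections are already stored in `min_dist`.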

### Gather sampled points

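At the moment, fps_idx only contains the indices of the sampled points, but not the points themselves. The TensorFlow function **gather()** takes from the input tensor the points at the rows identified by the indices. We need to provide the axis from which we want to gather the points (axis=1), as otherwise the data is gathered from the first dimension, which is our batch dimension. In addition, batch_dims=1 states that the first dimension of both tensors is a batch dimension, so that the indices of each batch element are only applied to the points of the same batch element.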

In [16]:
spts_xyz = tf.gather(xyz, fps_idx, axis=1, batch_dims=1)

spts_xyz.shape

TensorShape([1, 8192, 3])
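
The effect of gathering along axis 1 with one batch dimension can be reproduced in NumPy with integer array indexing. This is only a sketch with small random arrays to illustrate the indexing pattern; the names `pts`, `idx` and `batch` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
B, N, S = 2, 10, 4
pts = rng.normal(size=(B, N, 3))
idx = rng.integers(0, N, size=(B, S))

# pair every batch index b with its own row of indices idx[b]
batch = np.arange(B)[:, np.newaxis]      # shape (B, 1), broadcasts against (B, S)
gathered = pts[batch, idx]               # shape (B, S, 3)

print(gathered.shape)
```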

Now, we have a tensor of the sampled coordinates (just like our input xyz tensor).

Also create a tensor of labels to colorize and save the output with the helper function. The class powerline (class label 7) is used for all points. (Because the helper function assigns colors by the rank of each unique label, a tensor with a single label value is drawn in the first color of the color map, a light yellow.) The TensorFlow function **fill()** constructs a tensor with given dimensions, and fills it with the given value.

In [17]:
spts_labels = tf.fill(dims=(8192,1), value=7)
spts_labels

<tf.Tensor: shape=(8192, 1), dtype=int32, numpy=
array([[7],
       [7],
       [7],
       ...,
       [7],
       [7],
       [7]], dtype=int32)>

Reshape the sample points tensor so that the batch dimension is removed, convert both TensorFlow tensors to NumPy arrays, and call the function to colorize and save the point cloud. You can now inspect the sampled points in CloudCompare. Approximately every 12th point is used (8192 of 100,000), and the sampled points are evenly distributed with regard to the original point cloud. Particularly notice how the sampled points are at the base of buildings, on the facade if there is a small cluster in the original point cloud, and also on the roofs. If an even sampling over the 2D space had been used, then points that are on top of each other would not appear in the sampled points. Farthest point sampling, in contrast, generates a homogeneous point set in 3D from the point cloud.

In [18]:
save_colorized_point_cloud(tf.reshape(spts_xyz, shape=(8192,3)).numpy(), spts_labels.numpy(), 'SamplePoints.csv')

Saved "SamplePoints.csv"


The CUDA function **gather_point()** from PointNet++ does the same thing as the (general purpose) TensorFlow function shown above. It is only given here because it is part of the original implementation. Note that no axis and batch dimensions need to be specified.

In [19]:
spts_from_tf_ops = gather_point(xyz, fps_idx)

spts_from_tf_ops.shape

TensorShape([1, 8192, 3])

We can compare the two tensors element-wise with the equality (==) operator. This results in a tensor of Boolean values (True or False) with the same dimensions as the two input tensors (on either side of the operator). This tensor is reduced afterwards by checking if all Boolean values are True: the function **reduce_all()** only returns True if all values of the input tensor are True, and otherwise returns False.

In [20]:
spts_xyz == spts_from_tf_ops

<tf.Tensor: shape=(1, 8192, 3), dtype=bool, numpy=
array([[[ True,  True,  True],
        [ True,  True,  True],
        [ True,  True,  True],
        ...,
        [ True,  True,  True],
        [ True,  True,  True],
        [ True,  True,  True]]])>

In [21]:
tf.reduce_all(spts_xyz == spts_from_tf_ops)

<tf.Tensor: shape=(), dtype=bool, numpy=True>

Using either the **gather_point()** function of PointNet++ or the **gather()** function of TensorFlow, the sampled points are now stored in a tensor with their xyz-coordinates.
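
The exact comparison with == works here because both functions only copy coordinate values. As a side note, values that are computed in different ways can differ in the last bits, and a comparison with a tolerance is then the safer choice. The following NumPy sketch (with made-up values) shows both kinds of comparison.

```python
import numpy as np

a = np.array([0.1, 0.2, 0.3], dtype=np.float32)
b = a.copy()                       # copied values are bit-identical
c = (a * 3.0) / 3.0                # recomputed values may differ in the last bits

print(np.all(a == b))              # exact comparison, like reduce_all on ==
print(np.allclose(a, c))           # comparison with a tolerance
```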

### Query fixed radius (ball) points

For every sampled point, retrieve all points from the input within a given fixed radius. This is sometimes also referred to as a ball query. 

In [22]:
# define variables for radius and number of points
ball_query_radius = 1.0
ball_query_n_pts = 16

In [23]:
pts_idx, pts_cnt = query_ball_point(ball_query_radius, ball_query_n_pts, xyz, spts_xyz)

Check the tensor dimensions. For every sample point, there are now 16 indices of neighbor points.

In [24]:
pts_idx

<tf.Tensor: shape=(1, 8192, 16), dtype=int32, numpy=
array([[[    0,     1,     2, ...,    13,    14,    15],
        [96475, 97034, 97049, ..., 99681, 99922, 96475],
        [97680, 97882, 98392, ..., 97680, 97680, 97680],
        ...,
        [58723, 58857, 58916, ..., 60574, 58723, 58723],
        [93013, 95577, 95615, ..., 93013, 93013, 93013],
        [91420, 91932, 92527, ..., 91420, 91420, 91420]]], dtype=int32)>

You might have noticed that in some rows (of this cutout of the result tensor) the first index is repeated in the last columns. This happens if fewer than the requested number of points (here 16) have been found in the ball region: the remaining slots are then filled with the index of the first point found. (Note that this first point is not necessarily the sample point itself. Duplicate points have no effect on the feature extraction of PointNet++, because the features are aggregated with max pooling in the end. So there is no harm in these repeated points.)
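
That repeated points do not change the result of max pooling can be verified with a few lines of NumPy. The feature values in this sketch are random and purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
group = rng.normal(size=(5, 8))           # 5 unique points with 8 features each

# pad the group to 16 rows by repeating the first row
padded = np.concatenate([group, np.repeat(group[:1], 11, axis=0)], axis=0)

# max pooling over the point axis gives the same feature vector
print(np.array_equal(group.max(axis=0), padded.max(axis=0)))
```

This would not hold for other aggregations such as the mean or the sum, where repeated points change the result.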

The second output of the query_ball_point() function is a tensor (pts_cnt) that gives the number of unique points in each query region. However, it is not used any further, because the number of unique points does not matter, as just mentioned.
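
The behavior described above can be imitated with a small NumPy sketch for a single (unbatched) point set. It is not the CUDA operation itself, and the function name `query_ball_point_np` is hypothetical. The sketch assumes that points are collected in index order and that free slots are filled with the first index found, which is consistent with the output shown above.

```python
import numpy as np

def query_ball_point_np(radius, nsample, pts, centers):
    """Return (idx, cnt): up to nsample neighbor indices per center, padded."""
    idx = np.zeros((centers.shape[0], nsample), dtype=np.int64)
    cnt = np.zeros(centers.shape[0], dtype=np.int64)
    for s, c in enumerate(centers):
        d2 = np.sum((pts - c) ** 2, axis=1)
        # indices of the points inside the ball, in index order
        found = np.flatnonzero(d2 < radius ** 2)[:nsample]
        cnt[s] = len(found)
        if len(found) > 0:
            idx[s, :] = found[0]                 # pad with the first index found
            idx[s, :len(found)] = found
    return idx, cnt

pts = np.array([[0.0, 0.0, 0.0], [0.5, 0.0, 0.0], [0.0, 0.6, 0.0], [5.0, 5.0, 5.0]])
centers = pts[[1, 3]]
idx, cnt = query_ball_point_np(1.0, 4, pts, centers)
print(idx)
print(cnt)
```

The second center is an isolated point, so its group consists of four copies of its own index and its count is 1.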

In [25]:
pts_cnt

<tf.Tensor: shape=(1, 8192), dtype=int32, numpy=array([[16, 15, 11, ..., 14,  6,  3]], dtype=int32)>

### Group points

As no points are returned (in the sense of coordinates), but rather the indices of points, the CUDA function **group_point()** gathers the points (for every sample point) and groups them accordingly into a tensor of dimension BxSxGx3, where S is the number of sampling points (from farthest point sampling), and G the number of points per group (from the ball query).

In [26]:
spts_groups_xyz = group_point(xyz, pts_idx)

spts_groups_xyz.shape

TensorShape([1, 8192, 16, 3])

This is, once again, the same as using the gather() function from TensorFlow. We can check this by applying the reduce_all() function to the comparison with the result of the TensorFlow gather() function.

In [27]:
tf.reduce_all(spts_groups_xyz == tf.gather(xyz, pts_idx, axis=1, batch_dims=1))

<tf.Tensor: shape=(), dtype=bool, numpy=True>

### Zero-center every group

Before each group can be processed by a small PointNet module (with an MLP and max pooling), every group must be zero-centered according to the sampling point it originates from. If we did not center the groups, each group would be in its own coordinate space and the PointNet module could not learn shared features. Rather, it would try to learn features for all the different coordinate spaces, but would fail to generalize.

Let us first inspect one of those groups before it is zero-centered. We use the group at index 1000 in the hope that it is somewhere in the middle of the patch. Groups at the border of the patch (e.g. at index 0) have degenerate neighborhoods, meaning that all points are on one side of the sampling point, because the sampling point is at the border.

(The slicing operator takes a slice from the tensor at the first batch element (index 0), for point 1000, all neighbor points (:), and all coordinates (:). Remember that the colon (:) separates the start and end index of a dimension, where the end index itself is not included. For example, the indices 4 to 8 are given as 4:9. Omitting both the start and the end index takes everything from this dimension.)
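
The slicing rules can be tried out on small NumPy arrays, which follow the same conventions as TensorFlow tensors for this kind of basic slicing. The arrays `a` and `t` in this sketch are made up for illustration.

```python
import numpy as np

a = np.arange(12)

print(a[4:9])        # indices 4, 5, 6, 7, 8 (the end index 9 is excluded)
print(a[:].shape)    # a bare colon takes the whole dimension

# the same rules apply per dimension of a multi-dimensional array
t = np.zeros((1, 2000, 16, 3))
print(t[0, 1000, :, :].shape)
print(t[0, 1000].shape)       # trailing full slices can be omitted
```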

In [28]:
spts_groups_xyz[0, 1000, :, :]

<tf.Tensor: shape=(16, 3), dtype=float32, numpy=
array([[ 4.36 , 11.835, -2.615],
       [ 5.3  , 11.715, -2.495],
       [ 4.37 , 12.135, -2.605],
       [ 5.49 , 11.715, -2.025],
       [ 5.49 , 11.775, -2.635],
       [ 5.3  , 11.945, -2.625],
       [ 5.5  , 12.075, -2.625],
       [ 5.3  , 12.245, -2.605],
       [ 5.5  , 12.385, -2.595],
       [ 5.3  , 12.535, -2.595],
       [ 5.5  , 12.665, -2.615],
       [ 4.36 , 11.835, -2.615],
       [ 4.36 , 11.835, -2.615],
       [ 4.36 , 11.835, -2.615],
       [ 4.36 , 11.835, -2.615],
       [ 4.36 , 11.835, -2.615]], dtype=float32)>

For the purpose of centering, the tensor of the sampling points (which were used as the center points of the ball query from which the groups originate) is expanded to 4 dimensions so that it fits the tensor of the groups. The new dimension of size 1 is inserted at index 2, i.e. between the 2nd (index 1) and the former 3rd dimension. Now the group tensor and the sample point tensor have the same number of dimensions and almost the same shape, with the exception of the number of points in the 3rd dimension.

In [29]:
spts_xyz.shape

TensorShape([1, 8192, 3])

In [30]:
tf.expand_dims(spts_xyz, 2).shape

TensorShape([1, 8192, 1, 3])

Because we want to subtract the same sample point from all neighbor points, the sample point coordinates are repeated as many times as there are points in each group in the third dimension. This is accomplished with the **tile()** function of TensorFlow.

In [31]:
tf.tile(tf.expand_dims(spts_xyz, 2), [1, 1, ball_query_n_pts, 1]).shape

TensorShape([1, 8192, 16, 3])

Now, both tensors that are to be used for the subtraction have the same dimensions and the tensors can be subtracted element-wise.

In [32]:
spts_groups_xyz_centered = spts_groups_xyz - tf.tile(tf.expand_dims(spts_xyz, 2), [1, 1, ball_query_n_pts, 1])

spts_groups_xyz_centered.shape

TensorShape([1, 8192, 16, 3])
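
As a side note, the explicit tiling mirrors the original implementation, but it is not strictly required, since a dimension of size 1 is broadcast automatically in an element-wise operation. The following NumPy sketch with small random arrays compares both variants. TensorFlow follows the same broadcasting rules, so subtracting the expanded (but not tiled) tensor should give the same result there as well.

```python
import numpy as np

rng = np.random.default_rng(0)
B, S, G = 1, 6, 4
groups = rng.normal(size=(B, S, G, 3))     # grouped neighbor coordinates
centers = rng.normal(size=(B, S, 3))       # one sample point per group

# explicit version: repeat every center G times along a new axis
tiled = np.tile(centers[:, :, np.newaxis, :], (1, 1, G, 1))
centered_tiled = groups - tiled

# broadcasting version: the axis of size 1 is stretched automatically
centered_bcast = groups - centers[:, :, np.newaxis, :]

print(np.array_equal(centered_tiled, centered_bcast))
```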

Compare this tensor with the one we started with: the points are now centered on the sample point. (In many cases, probably for the border points, the first point of the group is the sampling point. For group 1000, however, this is not the case. Here, the sampling point is at index 5 of the group, which is the row that contains only zeros.)

In [33]:
spts_groups_xyz_centered[0, 1000, :, :]

<tf.Tensor: shape=(16, 3), dtype=float32, numpy=
array([[-0.94000006, -0.10999966,  0.00999999],
       [ 0.        , -0.22999954,  0.13000011],
       [-0.9300003 ,  0.19000053,  0.01999998],
       [ 0.18999958, -0.22999954,  0.5999999 ],
       [ 0.18999958, -0.17000008, -0.00999999],
       [ 0.        ,  0.        ,  0.        ],
       [ 0.19999981,  0.13000011,  0.        ],
       [ 0.        ,  0.3000002 ,  0.01999998],
       [ 0.19999981,  0.44000053,  0.02999997],
       [ 0.        ,  0.59000015,  0.02999997],
       [ 0.19999981,  0.72000027,  0.00999999],
       [-0.94000006, -0.10999966,  0.00999999],
       [-0.94000006, -0.10999966,  0.00999999],
       [-0.94000006, -0.10999966,  0.00999999],
       [-0.94000006, -0.10999966,  0.00999999],
       [-0.94000006, -0.10999966,  0.00999999]], dtype=float32)>

These zero-centered xyz-coordinates of the groups can now be fed into a PointNet module to extract geometric features for each group and by this for the sampling point. But PointNet++ also integrates further input features along with the xyz-coordinates before it applies a PointNet module.

### Group features

If there are further features per point as input besides the xyz-coordinates, then these features can be gathered with the group_point() function as well. Features could be input features like intensity, return number, etc., but also features extracted by the previous set abstraction layer of the PointNet++ architecture.

Let us create a tensor with random numbers as a feature tensor with the respective shape that fits the input points. We assume 5 features.

In [34]:
features = tf.random.uniform(shape=[1, 100000, 5])

The features are also grouped per sampling point with the same indices as used to group the xyz-coordinates. (Compare with above.)

In [35]:
features_groups = group_point(features, pts_idx)

features_groups.shape

TensorShape([1, 8192, 16, 5])

The feature groups do not need to be centered in any way. So, we are already finished.

Both the zero-centered xyz-coordinates as well as the feature groups have the same tensor shapes with the exception of the last dimension.

In [36]:
print(spts_groups_xyz_centered.shape)

print(features_groups.shape)

(1, 8192, 16, 3)
(1, 8192, 16, 5)


Therefore, we can concatenate them into a single tensor by their last (-1) dimension.

In [37]:
output = tf.concat([spts_groups_xyz_centered, features_groups], axis=-1)

print(output.shape)

(1, 8192, 16, 8)


This output tensor can now be used in a PointNet module for feature extraction. It does not matter that the geometric and other features are in the same tensor. Either the PointNet module learns something useful from this combination or, if not, it can learn to ignore certain (feature) channels by setting the weights for these channels to (nearly) 0. As PointNet applies a large number of filters, each filter can learn a different combination of features. This makes concatenation the most flexible and most general approach to handling features.

### PointNet++ implementation of the Set Abstraction layer

The first set abstraction layer might receive no input features, simply because the input data does not have any information besides the xyz-coordinates. Then, only the grouped points are the input to the PointNet module of the set abstraction layer. At the second set abstraction layer, there are always the features from the previous layer as additional input besides the coordinates. These features are provided by the variable called points (which is a confusing name and a better name would maybe be 'features').

PointNet++ can optionally exclude the grouped xyz-coordinates and continue only with the grouped features. Then the concatenation (as seen above) is not executed and only the grouped features are returned by the sample and group function. This might be interesting for the second and higher set abstraction layers. But then, no further features are derived from the geometry in higher set abstraction layers. The default, however, is to use the xyz-coordinates.

In the following cell, an implementation of sample and group is given. You should recognize most of the parts (with the exception that there is also an option to use k nearest neighbors instead of a fixed radius ball query). Note that knn_point() is not among the functions imported in this notebook, so only the default ball query (knn=False) can be executed here.

As already mentioned above, the naming of the variables is sometimes a little confusing:

- points contains the feature tensor (the one we generated from random values)
- npoint is the number of points for farthest point sampling
- nsample is the number of points per group in the fixed radius ball query (or in the k nearest neighbor search)
- radius is the radius of the fixed radius ball query
- knn is a Boolean variable and determines if k nearest neighbor search should be used instead of the fixed radius ball query
- use_xyz is a Boolean variable and determines if the grouped xyz-coordinates should be included in the output tensor
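
To illustrate the k nearest neighbor alternative to the ball query, here is a minimal NumPy sketch for a single (unbatched) point set. The function `knn_point_np` is hypothetical and not the PointNet++ operation. It only mirrors the call pattern of the implementation, in which the second return value holds the indices.

```python
import numpy as np

def knn_point_np(k, pts, centers):
    """Return (dist, idx) of the k nearest points for every center."""
    # pairwise squared distances, shape (S, N)
    d2 = np.sum((centers[:, np.newaxis, :] - pts[np.newaxis, :, :]) ** 2, axis=-1)
    idx = np.argsort(d2, axis=1)[:, :k]            # k smallest distances per row
    dist = np.sqrt(np.take_along_axis(d2, idx, axis=1))
    return dist, idx

pts = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [3.0, 0.0, 0.0], [7.0, 0.0, 0.0]])
centers = pts[[0, 3]]
dist, idx = knn_point_np(2, pts, centers)
print(idx)
print(dist)
```

In contrast to the ball query, every group always contains exactly k different points, but the spatial extent of a group depends on the local point density.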

In [38]:
def sample_and_group(npoint, radius, nsample, xyz, points, knn=False, use_xyz=True):

    new_xyz = gather_point(xyz, farthest_point_sample(npoint, xyz))
    if knn:
        _,idx = knn_point(nsample, xyz, new_xyz)
    else:
        idx, pts_cnt = query_ball_point(radius, nsample, xyz, new_xyz)
    grouped_xyz = group_point(xyz, idx)
    grouped_xyz -= tf.tile(tf.expand_dims(new_xyz, 2), [1,1,nsample,1])
    if points is not None:
        grouped_points = group_point(points, idx)
        if use_xyz:
            new_points = tf.concat([grouped_xyz, grouped_points], axis=-1)
        else:
            new_points = grouped_points
    else:
        new_points = grouped_xyz

    return new_xyz, new_points, idx, grouped_xyz

The function **sample_and_group()** implements all the above functionality in a single function and returns the sampled points, the grouped points with concatenated features, the indices from the ball query, and the grouped xyz-coordinates. Not all of this information is used further in PointNet++.

In the following, the function is called without input features.

In [39]:
new_xyz, new_points, idx, grouped_xyz = sample_and_group(8192, 1.0, 16, xyz, None)

new_points.shape

TensorShape([1, 8192, 16, 3])

The resulting new_points tensor is the same as the zero-centered groups of sampled points.

In [40]:
tf.reduce_all(new_points == spts_groups_xyz_centered)

<tf.Tensor: shape=(), dtype=bool, numpy=True>

When the function is called with our random feature tensor, the result is the same as what we calculated above.

In [41]:
new_xyz, new_points, idx, grouped_xyz = sample_and_group(8192, 1.0, 16, xyz, features)

tf.reduce_all(new_points == output)

<tf.Tensor: shape=(), dtype=bool, numpy=True>

## Set Abstraction (SA) layer

In the following, a (simplified) implementation of the Set Abstraction layer is given as a TensorFlow custom layer. A custom layer is a class that is derived from the Keras base class Layer and can be used in a custom neural network model.

In the constructor (**\_\_init__()** method), the class takes the settings (hyperparameters) of the layer, which should be self-explanatory with the explanations above. As mlp, it takes a list of filters that are used for the multi-layer perceptron of the PointNet module, e.g., like [64, 64, 128]. This way, the multi-layer perceptron can be specified once in the constructor and the custom layer remembers it.

The **build()** method is called automatically by Keras the first time the layer is applied to an input (typically when the model is built), before call() is executed. It constructs the convolutional 2D layers with the filter numbers provided by mlp and stores them in a list. Here, the ReLU activation function is used and a kernel_size of (1,1).

The **call()** method contains the forward pass of the layer. Keras invokes it whenever the layer is applied to input tensors, e.g. when the model is built as well as during training with fit() and during prediction. It does not do anything surprising: it calls sample_and_group() for the data (xyz, points) and the specified hyperparameters, then the convolutional layers in the list, and finally max pooling. The remainder of the code is just managing tensor dimensions.

In [42]:
from tensorflow.keras.layers import Layer

class Pointnet_SA(Layer):

    def __init__(self, npoint, radius, nsample, mlp, knn=False, use_xyz=True):
        super(Pointnet_SA, self).__init__()

        self.npoint = npoint
        self.radius = radius
        self.nsample = nsample
        self.mlp = mlp
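        # keep the argument instead of hard-coding False; knn=True additionally
        # requires knn_point(), which is not imported in this notebook
        self.knn = knn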
        self.use_xyz = use_xyz

        self.mlp_list = []
        
    def build(self, input_shape):

        # construct the convolutional layers for the MLP
        for i, n_filters in enumerate(self.mlp):
            self.mlp_list.append(keras.layers.Conv2D(n_filters, kernel_size=[1,1], activation='relu'))

        super(Pointnet_SA, self).build(input_shape)

    def call(self, xyz, points, training=True):

        # expand dimensions if batch dimension is missing
        if points is not None:
            if len(points.shape) < 3:
                points = tf.expand_dims(points, axis=0)

        new_xyz, new_points, idx, grouped_xyz = sample_and_group(
            self.npoint, self.radius, self.nsample, xyz, points, self.knn, use_xyz=self.use_xyz)

        # call the convolutional layers for the MLP
        for i, mlp_layer in enumerate(self.mlp_list):
            new_points = mlp_layer(new_points, training=training)

        # max pooling
        new_points = tf.math.reduce_max(new_points, axis=2, keepdims=True)

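        # tf.squeeze() without an axis removes every dimension of size 1,
        # i.e. the pooled group axis and also the batch axis if the batch size is 1
        return new_xyz, tf.squeeze(new_points)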
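
The part of call() that follows sample and group can be imitated in NumPy to follow the tensor shapes. This is only a sketch with random data and hypothetical weights `W` and `b`. It relies on the fact that a convolution with kernel size (1,1) applies the same weight matrix to every point independently.

```python
import numpy as np

rng = np.random.default_rng(0)
B, S, G, C_in, C_out = 1, 6, 4, 8, 5
grouped = rng.normal(size=(B, S, G, C_in))     # stands in for the sample and group output

# a 1x1 convolution applies the same weights to every point independently
W = rng.normal(size=(C_in, C_out))             # hypothetical weights
b = np.zeros(C_out)
features = np.maximum(grouped @ W + b, 0.0)    # ReLU, shape (B, S, G, C_out)

# max pooling over the group axis leaves one feature vector per sample point
pooled = features.max(axis=2, keepdims=True)   # shape (B, S, 1, C_out)

print(np.squeeze(pooled, axis=2).shape)        # only the group axis is removed
print(np.squeeze(pooled).shape)                # the batch axis of size 1 is removed, too
```

The last two lines show why the batch dimension can be missing after a layer if the batch size is 1, which is the case that the expand_dims() call at the beginning of call() handles.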

Please continue with the code study of the Feature Propagation layer.