# From 2D to 3D
If you have completed the 2D MNIST notebook tutorial, you already know how to build a simple classifier for images. But there is only so much you can do with images. Instead, we want to focus on <tt>3D</tt> objects. So how exactly can one learn on <tt>3D</tt> geometry?<br>
For <tt>2D</tt> the data representation is straightforward: images. For <tt>3D</tt> we have multiple possibilities: voxelgrids, point clouds, meshes and multiview-based approaches, to name just a few. In this tutorial we will be using a voxelgrid, which is basically an image with <tt>3D</tt> pixels (voxel cells).
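
To make the idea of a voxelgrid concrete, here is a minimal sketch that builds one by hand with NumPy. The helper `make_sphere_voxels` and its parameters are hypothetical and not part of the tutorial's dataset pipeline; it only illustrates that a voxelgrid is a dense 3D array whose cells are either occupied (1) or empty (0).

```python
import numpy as np


def make_sphere_voxels(resolution=48, radius=0.4):
    """Return a (resolution, resolution, resolution) occupancy grid of a sphere."""
    # Cell-centre coordinates in the unit interval [0, 1] along one axis.
    coords = (np.arange(resolution) + 0.5) / resolution
    x, y, z = np.meshgrid(coords, coords, coords, indexing="ij")
    # A cell is occupied if its centre lies inside the sphere around (0.5, 0.5, 0.5).
    inside = (x - 0.5) ** 2 + (y - 0.5) ** 2 + (z - 0.5) ** 2 <= radius**2
    return inside.astype(np.uint8)


voxels = make_sphere_voxels()
print(voxels.shape)  # (48, 48, 48)
print(voxels.mean())  # fraction of occupied cells
```

Such an array can be passed to `k3d.voxels` in the same way as the dataset samples shown later.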

This tutorial will be a bit more hands-on than the last one, as you will transform the <tt>2D</tt> network of the previous MNIST tutorial into a <tt>3D</tt> network yourself.

## Installing dependencies

In [None]:
!pip install tensorflow matplotlib renumics-spotlight scipy umap-learn k3d datasets

## Defining imports
Again, we start by defining some imports. We will use the new imports `k3d` and `ipywidgets` for <tt>3D</tt> plotting.

In [1]:
import os

os.environ["TF_CPP_MIN_LOG_LEVEL"] = "2"

import k3d
import numpy as np
from ipywidgets import GridspecLayout, Label, VBox
from tensorflow.keras.layers import Conv3D, Dense, Flatten, Input, MaxPool3D
from tensorflow.keras.models import Model

## Loading the dataset
This time the dataset is a bit more interesting.
We use a subset of the DMU-Net dataset, a research dataset of engineering CAD models.
![](imgs/dmunet.png)
*Some DMU-Net examples. [Source.](https://www.researchgate.net/publication/325170238_Deep_learning_for_big_data_applications_in_CAD_and_PLM_-_Research_review_opportunities_and_case_study)*

We train on 3 simple classes: Nuts, Screws and Gear Wheels.
The dataset is already processed for you and saved as voxelgrids of size <tt>48x48x48</tt>.

In [2]:
import datasets

ds = datasets.load_dataset("renumics/dmu_tiny")
ds_train = ds["train"]
ds_test = ds["test"]

Repo card metadata block was not found. Setting CardData to empty.


In [8]:
from renumics import spotlight

spotlight.show(ds, dtype={"mesh": spotlight.Mesh, "mesh_voxelized": spotlight.Mesh})

VBox(children=(Label(value='Spotlight running on http://127.0.0.1:62177/'), HBox(children=(Button(description=…

In [25]:
train_geometries = np.array(ds_train["voxel"])
train_geometries.shape

(292, 48, 48, 48)

In [3]:
class_names = ["Nut", "Screw", "GearWheel"]
train_geometries = np.array(ds_train["voxel"])
# Convert the labels to NumPy arrays so that Keras accepts them as targets.
train_labels = np.array(ds_train["label"])
test_geometries = np.array(ds_test["voxel"])
test_labels = np.array(ds_test["label"])
# Add a trailing channel axis: (N, 48, 48, 48) -> (N, 48, 48, 48, 1).
train_geometries = train_geometries.reshape(*train_geometries.shape, 1)
test_geometries = test_geometries.reshape(*test_geometries.shape, 1)
# Stack the test samples first, followed by the train samples.
all_geometries = np.append(test_geometries, train_geometries, 0)

Again, let's view an example from the train set, this time in 3D. You can zoom, shift and rotate as you like.

In [4]:
sample_idx = 5
print("Label: " + class_names[train_labels[sample_idx]])
plot = k3d.plot(menu_visibility=False)
plot += k3d.voxels(
    train_geometries[sample_idx].squeeze().astype(np.uint8), bounds=[0, 1, 0, 1, 0, 1]
)
plot.display()

Label: Screw


Output()

## Building the network
This is where you need to write your own code, or at least make some small code changes. The lines of the following code cell are copied from the <tt>2D</tt> MNIST tutorial. Your job is to make them work for the brand-new <tt>3D</tt> DMU-Net dataset. If you miss a change, the resulting error messages will point you to it.
<br>
Make all appropriate changes to adapt the MNIST model layers to the new data.

In [None]:
input_geometry = Input((48, 48, 1))
x = Conv3D(filters=16, kernel_size=(3, 3), padding="same", activation="relu")(input_geometry)
x = MaxPool2D(pool_size=(2, 2))(x)
x = Conv3D(filters=32, kernel_size=(3, 3), padding="same", activation="relu")(x)
x = MaxPool2D(pool_size=(2, 2))(x)
x = Flatten()(x)
x = Dense(units=128, activation="relu")(x)
class_probs = Dense(units=10, activation="softmax")(x)

<details style="border-radius: 2px;border:1px solid #55AA55;background:#CCFFCC;">
<summary>Click for solution.</summary>
Change the input shape to <tt>48x48x48x1</tt>.<br>
Replace all <tt>Conv2D</tt> and <tt>MaxPool2D</tt> layers with <tt>Conv3D</tt> and <tt>MaxPool3D</tt> layers, respectively.<br>
Add a third dimension to <tt>kernel_size</tt> and <tt>pool_size</tt>.<br>
    Change the number of output units to <tt>3</tt>.
</details>

These changes were pretty easy. However, you should also be aware of how the dimensions of the tensors change throughout the layers, from input to output. Can you guess the tensor shape right before the <tt>Flatten</tt> layer?
<details style="border-radius: 2px;border:1px solid #55AA55;background:#CCFFCC;">
<summary>Click for solution.</summary>
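The tensor shape is <tt>12x12x12x32</tt>.<br>
The input shape is <tt>48x48x48x1</tt>. We apply max pooling twice, and each pooling step halves all three spatial dimensions (<tt>48</tt> → <tt>24</tt> → <tt>12</tt>). The last dimension equals the number of filters of the last convolution, which is <tt>32</tt>.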
</details>
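
This kind of shape bookkeeping can be checked without running Keras. The sketch below is plain Python and uses a different, made-up configuration (a <tt>64x64x64x1</tt> input with three convolution/pooling stages) so that it does not give away the exercise. The helpers `conv_same` and `max_pool` are hypothetical; they assume stride-1 convolutions with `padding="same"` and pooling whose stride equals the pool size.

```python
def conv_same(shape, filters):
    """Shape after a stride-1 convolution with padding="same": only channels change."""
    *spatial, _channels = shape
    return (*spatial, filters)


def max_pool(shape, pool_size=2):
    """Shape after max pooling with stride == pool_size and no padding."""
    *spatial, channels = shape
    return (*(s // pool_size for s in spatial), channels)


shape = (64, 64, 64, 1)  # hypothetical input: depth, height, width, channels
for filters in (8, 16, 32):
    shape = max_pool(conv_same(shape, filters))
print(shape)  # (8, 8, 8, 32)

# Number of features that a Flatten layer would produce from this shape.
flat_size = 1
for dim in shape:
    flat_size *= dim
print(flat_size)  # 16384
```

Calling `model.summary()` on the real model reports the same information per layer.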

Those are all the changes you need to make. Now we construct and compile the model.

In [None]:
model = Model(input_geometry, class_probs)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])

## Training
Again, we train the model. This time we have a much smaller dataset to train on, but as we are working in 3D, the training does take some time.

In [None]:
model.fit(
    train_geometries,
    train_labels,
    validation_data=(test_geometries, test_labels),
    epochs=5,
    batch_size=16,
)

## Visualization
Lastly, we visualize some test samples and their respective predicted class labels.

In [None]:
predictions = model.predict(test_geometries)
rows, cols = 4, 3
grid = GridspecLayout(rows, cols)
for i in range(rows):
    for j in range(cols):
        sample_idx = i * cols + j
        plot = k3d.plot(height=300, menu_visibility=False, grid_visible=False)
        plot += k3d.voxels(
            test_geometries[sample_idx].squeeze().astype(np.uint8), bounds=[0, 1, 0, 1, 0, 1]
        )
        grid[i, j] = VBox(
            [
                Label(
                    value="Pred: {} ({:.2f}), Label: {} ({:.2f})".format(
                        class_names[np.argmax(predictions[sample_idx])],
                        predictions[sample_idx, np.argmax(predictions[sample_idx])],
                        class_names[test_labels[sample_idx]],
                        predictions[sample_idx, test_labels[sample_idx]],
                    )
                ),
                plot,
            ]
        )
grid

Let's look at the misclassified samples.

In [None]:
wrong_indices = (np.argmax(predictions, axis=-1) != test_labels).nonzero()[0]
rows, cols = 4, 2
grid = GridspecLayout(rows, cols)
for i in range(rows):
    for j in range(cols):
        if i * cols + j >= len(wrong_indices):
            break
        sample_idx = wrong_indices[i * cols + j]
        plot = k3d.plot(height=300, menu_visibility=False, grid_visible=False)
        plot += k3d.voxels(
            test_geometries[sample_idx].squeeze().astype(np.uint8), bounds=[0, 1, 0, 1, 0, 1]
        )
        grid[i, j] = VBox(
            [
                Label(
                    value="{} | Pred: {} ({:.2f}), Label: {} ({:.2f})".format(
                        sample_idx,  # test_ids is not defined in this notebook, so show the sample index
                        class_names[np.argmax(predictions[sample_idx])],
                        predictions[sample_idx, np.argmax(predictions[sample_idx])],
                        class_names[test_labels[sample_idx]],
                        predictions[sample_idx, test_labels[sample_idx]],
                    )
                ),
                plot,
            ]
        )
grid

Great work, you managed to build and train your first simple <tt>3D</tt> model for voxelgrids.

## Qualitative analysis of the model
In order to interpret the model results, we compute several additional outputs:
- We take the output of the last CNN-layer as a similarity measure ("embedding")
- We compute the entropy of the softmax layer
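
As a rough illustration of the second point, the entropy of a softmax output measures how uncertain the prediction is. The following sketch computes it with plain NumPy; `softmax_entropy` and the example probabilities are made up for illustration and are not outputs of the trained model.

```python
import numpy as np


def softmax_entropy(probs, eps=1e-12):
    """Shannon entropy (in nats) of each row of class probabilities."""
    probs = np.asarray(probs, dtype=np.float64)
    # eps avoids log(0) for classes with zero probability.
    return -np.sum(probs * np.log(probs + eps), axis=1)


example_probs = np.array(
    [
        [1.0, 0.0, 0.0],  # fully confident prediction
        [1 / 3, 1 / 3, 1 / 3],  # maximally uncertain prediction for 3 classes
    ]
)
example_entropies = softmax_entropy(example_probs)
print(example_entropies)
```

A confident prediction has an entropy close to 0, while a uniform distribution over the 3 classes reaches the maximum of `log(3)`. Samples with high entropy are therefore good candidates for manual inspection.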



In [None]:
import shutil

shutil.copyfile("dmu_dataset_base.h5", "dmu_dataset.h5")

In [None]:
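from renumics.spotlight import Dataset
from scipy import stats

# Labels in the same order as all_geometries (test samples first, then train samples).
all_labels = np.append(test_labels, train_labels)

# `head` is not defined earlier in this notebook. Assumption: use the output of the
# layer right before the softmax layer as the embedding; adjust this to pick another layer.
head = model.layers[-2].output
embedding_model = Model(input_geometry, head)
predicted_embeddings = embedding_model.predict(all_geometries)
predictions_softmax = model.predict(all_geometries)
predictions = np.argmax(predictions_softmax, axis=1)
entropies = stats.entropy(predictions_softmax, axis=1)

# Boolean mask of misclassified samples.
false = predictions != all_labels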

with Dataset("dmu_dataset.h5", "a") as dataset:
    dataset.append_float_column("entropy_1", entropies)
    dataset.append_categorical_column("prediction_1", ["0", "1", "2"], predictions.astype("U"))
    dataset.append_embedding_column("embedding_1", predicted_embeddings)
    dataset.append_int_column("false_1", false.astype(int))