# 3D Convolutions : Understanding + Use Case - Drug Discovery
[3D Convolutions : Understanding + Use Case (kaggle_page)](https://www.kaggle.com/code/shivamb/3d-convolutions-understanding-use-case)
[3D-MNIST | basic CNN | Adorable visualisations](https://www.kaggle.com/code/michaelcripman/3d-mnist-basic-cnn-adorable-visualisations)
This note explains the 3D MNIST dataset, 3D convolutions, and how to implement them.

### Convolutions
A convolution extracts low-dimensional features from the input data while preserving the spatial and positional relationships in that data.

* 1D Convolutions   
The kernel slides along one axis, so it picks up local patterns within a window.   
![](https://i.imgur.com/5UQz1zI.jpg)   

* 2D Convolutions   
![](https://tensorflowkorea.files.wordpress.com/2016/08/no_padding_no_strides1.gif)

* 3D Convolutions   
![](https://i.imgur.com/jriyCTU.png?1)   

* Dilated Convolutions   
![](https://tensorflowkorea.files.wordpress.com/2016/08/padding_strides_transposed.gif)  
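Convolving a 3×3 input after adding padding and inserting gaps between its elements produces an output of the same size as convolving a 5×5 input. Note that a dilated convolution in the strict sense inserts the gaps between the kernel elements rather than between the input elements.   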
![](https://tensorflowkorea.files.wordpress.com/2016/08/same_padding_no_strides.gif)   
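
To make the sliding-window idea from the figures above concrete, here is a minimal NumPy sketch of a single-channel 3D convolution with no padding and stride 1 (strictly a cross-correlation, which is what deep learning layers compute). The function name `naive_conv3d` and the toy inputs are illustrative assumptions, not part of the original notebook. A real layer such as `Conv3D` also handles channels, bias, and a much faster implementation.

```python
import numpy as np

def naive_conv3d(volume, kernel):
    """Valid (no padding), stride-1 3D cross-correlation of a single channel."""
    kd, kh, kw = kernel.shape
    d, h, w = volume.shape
    # Each axis shrinks by (kernel size - 1) when no padding is used
    out = np.zeros((d - kd + 1, h - kh + 1, w - kw + 1))
    for z in range(out.shape[0]):
        for y in range(out.shape[1]):
            for x in range(out.shape[2]):
                window = volume[z:z + kd, y:y + kh, x:x + kw]
                out[z, y, x] = np.sum(window * kernel)
    return out

# Toy data: a 16x16x16 volume of ones and a 3x3x3 kernel of ones
volume = np.ones((16, 16, 16))
kernel = np.ones((3, 3, 3))
result = naive_conv3d(volume, kernel)
print(result.shape)  # (14, 14, 14)
```

With an all-ones volume and an all-ones 3×3×3 kernel, every output value is the sum of 27 ones, which is a quick sanity check on the window indexing.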


In [3]:
pip install plotly


In [4]:
from keras.layers import Conv3D, MaxPool3D, Flatten, Dense
from keras.layers import Dropout, Input, BatchNormalization
from sklearn.metrics import confusion_matrix, accuracy_score
from plotly.offline import iplot, init_notebook_mode
from keras.losses import categorical_crossentropy
from keras.optimizers import Adadelta
import plotly.graph_objs as go
from matplotlib.pyplot import cm
from keras.models import Model
import numpy as np
import keras
import h5py

init_notebook_mode(connected=True)
%matplotlib inline

In [6]:
with h5py.File('./data/full_dataset_vectors.h5', 'r') as dataset:
    x_train = dataset["X_train"][:]
    x_test = dataset["X_test"][:]
    y_train = dataset["y_train"][:]
    y_test = dataset["y_test"][:]

print ("x_train shape: ", x_train.shape)
print ("y_train shape: ", y_train.shape)

print ("x_test shape:  ", x_test.shape)
print ("y_test shape:  ", y_test.shape)

x_train shape:  (10000, 4096)
y_train shape:  (10000,)
x_test shape:   (2000, 4096)
y_test shape:   (2000,)


The training set contains 10,000 3D digits. Each one is a 16 × 16 × 16 volume stored as a flattened vector of 4,096 values.
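
The link between the 4,096-value rows and the 16 × 16 × 16 volumes can be sketched as a plain reshape. The array `flat` below is a made-up stand-in for `x_train`, and the axis order of the resulting grid is an assumption, since the notes do not state how the voxels were flattened.

```python
import numpy as np

# Hypothetical stand-in for x_train: 4 flattened samples of 4096 values each
flat = np.arange(4 * 4096, dtype=np.float64).reshape(4, 4096)

# 4096 = 16 * 16 * 16, so each row can be viewed as a 16x16x16 voxel grid
volumes = flat.reshape(-1, 16, 16, 16)
print(volumes.shape)  # (4, 16, 16, 16)
```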


In [46]:
with h5py.File("./data/train_point_clouds.h5", "r") as points_dataset:
    print(points_dataset['0']["img"])
    print(points_dataset['0']["points"])
    print(points_dataset['0'].attrs["label"])

<HDF5 dataset "img": shape (30, 30), type "<f8">
<HDF5 dataset "points": shape (25700, 3), type "<f8">
5


In [58]:
with h5py.File("./data/train_point_clouds.h5", "r") as points_dataset:        
    digits = []
    for i in range(10):
        digit = (points_dataset[str(i)]["img"][:], 
                 points_dataset[str(i)]["points"][:], 
                 points_dataset[str(i)].attrs["label"]) 
        digits.append(digit)
print(digits[0][1].shape)

n = 2  # index of the digit to plot
points = digits[n][1]  # (num_points, 3) array of x, y, z coordinates
x_c, y_c, z_c = points[:, 0], points[:, 1], points[:, 2]

trace1 = go.Scatter3d(x=x_c, y=y_c, z=z_c, mode='markers', 
                      marker=dict(size=1, color=z_c, colorscale='Viridis', opacity=0.7))

data = [trace1]
# Use the label of the plotted digit (index n), not the label of digits[0]
layout = go.Layout(height=500, width=600, title="Digit: " + str(digits[n][2]) + " in 3D space")
fig = go.Figure(data=data, layout=layout)
iplot(fig)

(25700, 3)


Add an RGB channel dimension to each 16 × 16 × 16 volume, turning it into a 4D array of shape (16, 16, 16, 3).
The labels are also one-hot encoded.

In [59]:
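# Pre-allocate arrays with room for 3 colour channels per voxel
xtrain = np.empty((x_train.shape[0], 4096, 3))
xtest = np.empty((x_test.shape[0], 4096, 3))

def add_rgb_dimension(array):
    # Map each scalar voxel value to a colour from the "Oranges" colormap
    scalar_map = cm.ScalarMappable(cmap="Oranges")
    # to_rgba returns RGBA; drop the alpha channel to keep only RGB
    array = scalar_map.to_rgba(array)[:, :-1]
    return array

for i in range(x_train.shape[0]):
    xtrain[i] = add_rgb_dimension(x_train[i])
for i in range(x_test.shape[0]):
    xtest[i] = add_rgb_dimension(x_test[i])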

xtrain = xtrain.reshape(x_train.shape[0], 16, 16, 16, 3)
xtest = xtest.reshape(x_test.shape[0], 16, 16, 16, 3)

y_train = keras.utils.to_categorical(y_train, 10)
y_test = keras.utils.to_categorical(y_test, 10)

Designing the model architecture
Input layer: (16, 16, 16, 3)
Output layer: 10 units with softmax, one per digit class

In [62]:
input_layer = Input((16, 16, 16, 3))

conv_layer1 = Conv3D(filters=8, kernel_size=(3, 3, 3), activation='relu')(input_layer)
conv_layer2 = Conv3D(filters=16, kernel_size=(3, 3, 3), activation='relu')(conv_layer1)

pooling_layer1 = MaxPool3D(pool_size=(2, 2, 2))(conv_layer2)

conv_layer3 = Conv3D(filters=32, kernel_size=(3, 3, 3), activation='relu')(pooling_layer1)
conv_layer4 = Conv3D(filters=64, kernel_size=(3, 3, 3), activation='relu')(conv_layer3)
pooling_layer2 = MaxPool3D(pool_size=(2, 2, 2))(conv_layer4)

pooling_layer2 = BatchNormalization()(pooling_layer2)
flatten_layer = Flatten()(pooling_layer2)

dense_layer1 = Dense(units=2048, activation='relu')(flatten_layer)
dense_layer1 = Dropout(0.4)(dense_layer1)
dense_layer2 = Dense(units=512, activation='relu')(dense_layer1)
dense_layer2 = Dropout(0.4)(dense_layer2)
output_layer = Dense(units=10, activation='softmax')(dense_layer2)

model = Model(inputs=input_layer, outputs=output_layer)
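
As a rough check on the architecture above, the spatial size after each layer can be traced by hand. This sketch assumes the Keras defaults used in the cell (no padding, stride 1 for `Conv3D`, and non-overlapping 2 × 2 × 2 windows for `MaxPool3D`). The helper names `conv_out` and `pool_out` are made up for illustration.

```python
def conv_out(size, kernel=3, stride=1, padding=0):
    """Output length along one axis for a convolution without dilation."""
    return (size + 2 * padding - kernel) // stride + 1

def pool_out(size, pool=2):
    """Output length along one axis for non-overlapping max pooling."""
    return size // pool

size = 16
trace = [size]
# Same layer order as the model: two convs, pool, two convs, pool
for layer in ["conv", "conv", "pool", "conv", "conv", "pool"]:
    size = conv_out(size) if layer == "conv" else pool_out(size)
    trace.append(size)

# The last Conv3D layer has 64 filters, so Flatten sees size^3 * 64 values
flattened_units = size ** 3 * 64
print(trace)            # [16, 14, 12, 6, 4, 2, 1]
print(flattened_units)  # 64
```

Under these assumptions the volume shrinks to a single 1 × 1 × 1 cell with 64 channels, so the first `Dense` layer receives only 64 features.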

Compile the model and start training.

In [63]:
model.compile(loss=categorical_crossentropy, optimizer=Adadelta(learning_rate=0.1), metrics=['acc'])
model.fit(x=xtrain, y=y_train, batch_size=128, epochs=50, validation_split=0.2)

Epoch 1/50
...
Epoch 50/50


<keras.callbacks.History at 0x1604df19cf0>

In [64]:
pred = model.predict(xtest)
pred = np.argmax(pred, axis=1)
pred



array([7, 3, 2, ..., 3, 9, 4], dtype=int64)
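
The predicted class indices can then be compared with the true labels. The sketch below uses small made-up arrays in place of `y_test` and `pred`, and computes accuracy and a confusion matrix with NumPy only. The `accuracy_score` and `confusion_matrix` functions imported at the top of the notebook compute the same quantities when given `(y_true, y_pred)`.

```python
import numpy as np

# Hypothetical stand-ins for the notebook's variables
y_true_onehot = np.eye(3)[[0, 1, 2, 2, 1]]  # one-hot labels, like y_test after to_categorical
pred_labels = np.array([0, 1, 2, 1, 1])     # class indices, like pred after argmax

# Convert one-hot rows back to class indices before comparing
true_labels = np.argmax(y_true_onehot, axis=1)
accuracy = np.mean(pred_labels == true_labels)

# Confusion matrix: rows are true classes, columns are predicted classes
num_classes = y_true_onehot.shape[1]
conf_mat = np.zeros((num_classes, num_classes), dtype=int)
np.add.at(conf_mat, (true_labels, pred_labels), 1)

print(accuracy)  # 0.8
print(conf_mat)
```

Because `y_test` was one-hot encoded earlier, it has to be converted back with `np.argmax` before it can be compared with `pred`.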