<div style="width:100%;text-align: center;"> <img align=middle src="https://www.mdpi.com/cancers/cancers-11-01937/article_deploy/html/images/cancers-11-01937-g001.png" alt="Heat beating" style="height:366px;margin-top:3rem;"> </div>

# <h1 style='background:#FAC213; border:0; color:white'><center>Prostate Cancer Classification</center></h1>

# **<span style="color:#cd486b;">About the Dataset</span>**

The dataset contains prostate cancer images labeled as Benign or Grade 3 to 5, intended for Gleason score classification.

# **<span style="color:#cd486b;">About the files</span>**

The dataset contains 2 folders: one with the training data and one with the test data.
The test split is roughly 0.2 of the whole dataset: the test set contains 40 images and the training set contains 153.
The images have a resolution of 240x240 pixels in the RGB color model.
Both folders contain the same 4 classes:

> Benign

> Grade 3

> Grade 4

> Grade 5
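
As a quick sanity check of the split ratio quoted above, the test fraction can be computed directly from the two image counts. This is only a small arithmetic sketch; the variable names are illustrative.

```python
# Image counts reported for this dataset.
n_train = 153
n_test = 40

# Fraction of all images that is held out for testing.
test_fraction = n_test / (n_train + n_test)
print(f"test fraction = {test_fraction:.3f}")
```

The fraction comes out slightly above 0.2, which matches the stated split after rounding.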

In [1]:
# Standard-library imports; silence warnings to keep the cell outputs readable
import os
import warnings
warnings.filterwarnings("ignore")

In [2]:
#Imports
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.preprocessing import image
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, Flatten, MaxPooling2D, Dense, Dropout, GlobalAveragePooling2D
from tensorflow.keras import optimizers, losses
import seaborn as sns
import matplotlib.pyplot as plt
import pandas as pd

# **<span style="color:#cd486b;">Get Data and apply some augmentation</span>**


In [3]:
train_dir = "../input/prostate-cancer-classification-ukm/Prostate_Split/Train/"
test_dir = "../input/prostate-cancer-classification-ukm/Prostate_Split/Test/"

In [4]:
train_datagen = ImageDataGenerator(rescale = 1./ 255, rotation_range = 40, width_shift_range = 0.2, height_shift_range = 0.2,
                                  shear_range = 0.2, zoom_range = 0.2, horizontal_flip = True, fill_mode = 'nearest')

In [5]:
test_datagen = ImageDataGenerator(rescale = 1./ 255)

In [6]:
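# NOTE: shuffle = False yields the training images in directory (class) order;
# shuffle = True is the usual choice for training. Kept as in the recorded run.
train_data = train_datagen.flow_from_directory(directory = train_dir, batch_size = 32, target_size = (240,240), class_mode = "categorical", shuffle = False)
test_data = test_datagen.flow_from_directory(directory = test_dir, batch_size = 32, target_size = (240,240), class_mode = "categorical")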

Found 153 images belonging to 4 classes.
Found 40 images belonging to 4 classes.
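
The training generator above is created with `shuffle = False`, which means images are yielded class by class. The sketch below illustrates what that does to batch composition. The per-class counts are hypothetical (chosen only so that they sum to 153; the real per-class split is not printed in this notebook), and plain Python lists stand in for the generator.

```python
# Hypothetical per-class training counts (they sum to the 153 images found above;
# the real per-class numbers are not shown in this notebook).
class_counts = {"Benign": 39, "G3": 38, "G4": 38, "G5": 38}
batch_size = 32

# With shuffle=False, files are yielded in class order, one class after another.
labels = [name for name, n in class_counts.items() for _ in range(n)]

# Cut the ordered label list into consecutive batches, as the generator would.
batches = [labels[i:i + batch_size] for i in range(0, len(labels), batch_size)]

for i, batch in enumerate(batches):
    print(f"batch {i}: size={len(batch)}, classes={sorted(set(batch))}")
```

Under these assumptions every batch contains at most two classes, and the first batch contains only `Benign`. Each gradient step therefore sees a heavily skewed sample, which is one plausible reason training can stall; shuffling the training data avoids this.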


In [7]:
print(len(os.listdir(test_dir + 'Benign')))
print(len(os.listdir(test_dir + 'G3')))
print(len(os.listdir(test_dir + 'G4')))
print(len(os.listdir(test_dir + 'G5')))

10
10
10
10


In [8]:
import plotly.express as px

class_names = ['Benign', 'G3', 'G4', 'G5'] 

n_benign = len(os.listdir(train_dir + 'Benign'))
n_g3 = len(os.listdir(train_dir + 'G3'))
n_g4 = len(os.listdir(train_dir + 'G4'))
n_g5 = len(os.listdir(train_dir + 'G5'))
n_images = [n_benign, n_g3, n_g4, n_g5]
px.pie(names=class_names, values=n_images)

# **<span style="color:#cd486b;">Model</span>**


In [9]:
model = tf.keras.models.Sequential([
    Conv2D(16, (3,3), activation = 'relu', input_shape = (240,240, 3)),
    MaxPooling2D(2,2),
    Conv2D(32, (3,3), activation = 'relu'),
    MaxPooling2D(2,2),
    Conv2D(32, (3,3), activation = 'relu'),
    MaxPooling2D(2,2),
    Conv2D(32, (3,3), activation = 'relu'),
    MaxPooling2D(2,2),
    Flatten(),
    Dense(512, activation = 'relu'),
    Dropout(0.2),
    Dense(4, activation = 'softmax') # 4 classes
])


In [10]:
print(len(model.layers))

12


In [11]:
model.summary()

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d (Conv2D)              (None, 238, 238, 16)      448       
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 119, 119, 16)      0         
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 117, 117, 32)      4640      
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 58, 58, 32)        0         
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 56, 56, 32)        9248      
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 28, 28, 32)        0         
_________________________________________________________________
conv2d_3 (Conv2D)            (None, 26, 26, 32)        9
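
The output shapes and parameter counts in the summary above can be reproduced by hand. The sketch below assumes what the model definition states: 3x3 kernels with the default `'valid'` padding, 2x2 max pooling, and a 240x240x3 input. The helper name `conv_params` is illustrative. The values for layers beyond the truncated summary are computed here, not copied from the notebook output.

```python
def conv_params(kernel, c_in, c_out):
    """Weights plus one bias per filter for a square 2D convolution."""
    return (kernel * kernel * c_in + 1) * c_out

size, channels = 240, 3
conv_param_counts = []
for filters in (16, 32, 32, 32):
    conv_param_counts.append(conv_params(3, channels, filters))
    size = size - 2    # a 3x3 convolution with 'valid' padding trims 2 pixels
    size = size // 2   # 2x2 max pooling halves the size (rounding down)
    channels = filters

flat = size * size * channels       # length of the Flatten output
dense_params = flat * 512 + 512     # Dense(512): weights plus biases
output_params = 512 * 4 + 4         # Dense(4): weights plus biases

print(conv_param_counts, size, flat, dense_params, output_params)
```

The first three convolution counts match the 448, 4640 and 9248 shown in the summary. Most of the model's parameters sit in the first dense layer, which is large relative to a training set of 153 images.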

# **<span style="color:#cd486b;">Failed Experiment</span>**


In [12]:
model.compile(loss = 'categorical_crossentropy', optimizer = tf.keras.optimizers.Adam(), metrics = ['accuracy'])
# NOTE: len(test_data) is 2 (40 images, batch size 32), so validation_steps evaluates to int(0.5) = 0
history = model.fit(train_data, epochs = 20, steps_per_epoch = len(train_data), 
                    validation_data = test_data, validation_steps = int(0.25 * len(test_data)))

Epoch 1/20
...
Epoch 20/20
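
The `validation_steps` expression in the `fit` call above is worth checking by hand. A directory iterator's length is the number of batches, that is, the image count divided by the batch size, rounded up. The sketch below redoes that arithmetic with plain Python; the variable names are illustrative.

```python
import math

n_test_images = 40
batch_size = 32

# Number of batches in the test generator, i.e. what len(test_data) returns.
n_test_batches = math.ceil(n_test_images / batch_size)

# The expression used in the fit call.
validation_steps = int(0.25 * n_test_batches)

# Covering the whole test set would need every batch.
full_validation_steps = n_test_batches

print(n_test_batches, validation_steps, full_validation_steps)
```

The expression evaluates to 0 steps, so validation may effectively be skipped during training. Passing `validation_steps = len(test_data)`, or omitting the argument, would cover all 40 test images.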


In [13]:
loss_test, acc_test = model.evaluate(test_data)
print("Test: accuracy = %f  ;  loss = %f" % (acc_test, loss_test))

Test: accuracy = 0.250000  ;  loss = 1.347763
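
To put this result in context, it helps to compare it with a classifier that has learned nothing. The test set has 10 images in each of the 4 classes, so always predicting a single class, or predicting uniformly, gives the baseline computed below. This is a small arithmetic sketch, not part of the original run.

```python
import math

n_classes = 4

# Accuracy of a constant or uniform guess on a balanced 4-class test set.
chance_accuracy = 1 / n_classes

# Categorical cross-entropy when every class gets probability 1/4.
chance_loss = -math.log(1 / n_classes)

print(f"chance accuracy = {chance_accuracy:.2f}, chance loss = {chance_loss:.4f}")
```

The reported accuracy of 0.25 equals the chance level, and the reported loss of about 1.35 is close to the uniform-prediction loss, which suggests the network is producing nearly the same output for every image.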


# **<span style="color:#cd486b;">Try Transfer Learning</span>**


In [14]:
# Transfer learning: frozen ResNet50V2 backbone with a new classification head

base_model = tf.keras.applications.ResNet50V2(include_top = False)
base_model.trainable = False  # freeze the pretrained weights

inputs = tf.keras.layers.Input(shape = (240, 240, 3), name = 'InputLayer')
x = base_model(inputs)
x = tf.keras.layers.GlobalAveragePooling2D(name = 'global_average_pooling_layer')(x)
# NOTE: 'softmax' on a hidden layer is unusual ('relu' is the common choice); kept as in the recorded run
x = tf.keras.layers.Dense(512, activation = 'softmax', name = 'Dense_layer')(x)
x = Dropout(0.2)(x)
outputs = tf.keras.layers.Dense(4, activation = 'softmax', name = 'output_layer')(x)

model = tf.keras.Model(inputs, outputs)

Downloading data from https://storage.googleapis.com/tensorflow/keras-applications/resnet/resnet50v2_weights_tf_dim_ordering_tf_kernels_notop.h5
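
The hidden `Dense_layer` in the model above uses a `softmax` activation. The sketch below illustrates, with plain Python and made-up numbers, how that differs from `relu` on a 512-unit layer. The pre-activation values are random and purely hypothetical; they are not taken from the trained model.

```python
import math
import random

def softmax(values):
    """Numerically stable softmax over a list of floats."""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def relu(values):
    """Element-wise max(0, v)."""
    return [max(0.0, v) for v in values]

random.seed(0)
# Hypothetical pre-activations for a 512-unit hidden layer.
pre_activations = [random.gauss(0.0, 1.0) for _ in range(512)]

soft = softmax(pre_activations)
rect = relu(pre_activations)

print(f"softmax: sum={sum(soft):.3f}, mean={sum(soft) / len(soft):.5f}, max={max(soft):.4f}")
print(f"relu:    max={max(rect):.3f}")
```

Softmax forces the 512 outputs to sum to 1, so each unit carries a very small value on average and the units compete with each other. That makes it a poor fit for a hidden layer, whereas `relu` leaves each unit free to scale independently. Swapping the hidden activation to `'relu'` would be a natural next experiment.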


In [15]:
print(len(model.layers))

6


In [16]:
model.summary()

Model: "model"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
InputLayer (InputLayer)      [(None, 240, 240, 3)]     0         
_________________________________________________________________
resnet50v2 (Functional)      (None, None, None, 2048)  23564800  
_________________________________________________________________
global_average_pooling_layer (None, 2048)              0         
_________________________________________________________________
Dense_layer (Dense)          (None, 512)               1049088   
_________________________________________________________________
dropout_1 (Dropout)          (None, 512)               0         
_________________________________________________________________
output_layer (Dense)         (None, 4)                 2052      
Total params: 24,615,940
Trainable params: 1,051,140
Non-trainable params: 23,564,800
_________________________________________
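
The parameter counts in this summary can be verified with a few lines of arithmetic. The backbone count is taken directly from the summary; the two dense layers follow the usual weights-plus-biases formula. Variable names are illustrative.

```python
# Frozen ResNet50V2 backbone, as reported in the summary.
backbone_params = 23_564_800

# Dense(512) on the 2048-dimensional pooled features: weights plus biases.
dense_params = 2048 * 512 + 512

# Dense(4) output layer: weights plus biases.
output_params = 512 * 4 + 4

trainable = dense_params + output_params
total = backbone_params + trainable

print(dense_params, output_params, trainable, total)
```

Only the roughly one million parameters of the new head are trained; the backbone stays fixed because `base_model.trainable = False`.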

In [17]:

model.compile(loss = 'categorical_crossentropy', optimizer = tf.keras.optimizers.Adam(learning_rate = 0.001), metrics = ['accuracy'])
# NOTE: len(test_data) is 2 (40 images, batch size 32), so validation_steps evaluates to int(0.5) = 0
history = model.fit(train_data, epochs = 20, steps_per_epoch = len(train_data), 
                    validation_data = test_data, validation_steps = int(0.25 * len(test_data)))

Epoch 1/20
...
Epoch 20/20


# **<span style="color:#cd486b;">Plots</span>**


In [18]:
loss_test, acc_test = model.evaluate(test_data)
print("Test: accuracy = %f  ;  loss = %f" % (acc_test, loss_test))
model.save("resnet50v2_model.h5")

Test: accuracy = 0.700000  ;  loss = 1.279350
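
With only 40 test images, a single accuracy figure carries a lot of uncertainty. One common way to quantify it is a Wilson score interval for a binomial proportion. The sketch below is an addition for illustration, not part of the original analysis, and uses the standard z value of 1.96 for an approximate 95% interval.

```python
import math

n = 40            # number of test images
accuracy = 0.70   # reported test accuracy
z = 1.96          # normal quantile for an approximate 95% interval

correct = round(accuracy * n)   # number of correctly classified images
p = correct / n

# Wilson score interval for a binomial proportion.
denom = 1 + z * z / n
center = (p + z * z / (2 * n)) / denom
half_width = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
low, high = center - half_width, center + half_width

print(f"{correct}/{n} correct, approx. 95% interval: [{low:.3f}, {high:.3f}]")
```

The interval is wide, roughly 0.55 to 0.82, so small differences between models on this test set should be read with caution.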


In [19]:
results = pd.DataFrame(history.history)
results.tail()

|    | loss     | accuracy |
|----|----------|----------|
| 15 | 1.299824 | 0.627451 |
| 16 | 1.293975 | 0.607843 |
| 17 | 1.287922 | 0.601307 |
| 18 | 1.272921 | 0.647059 |
| 19 | 1.259306 | 0.686275 |


In [20]:
model.evaluate(test_data)



[1.279349684715271, 0.699999988079071]

# **<span style="color:#850E35;">Conclusion</span>**

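The CNN trained from scratch reached only 25% test accuracy, which is chance level for four balanced classes. Transfer learning with ResNet50V2 did much better, reaching 70% test accuracy. The final training accuracy (about 0.69) is close to the test accuracy, so these numbers do not show clear overfitting; the loss is still high (about 1.28 on the test set), which leaves room for improvement. Next, we can try Inception or DenseNet models as well.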

-----------------------------------------------------------------------

**<span style="color:#A77979;">This is my very first Computer Vision Notebook. This dataset belongs to UKM.</span>**

**<span style="color:#A77979;">Please share your feedback and suggestions and help me improve 😇</span>**