This notebook was created at 18:54 17/10/20

Cholesteric or Nematic classification of liquid crystal phases using deep learning techniques

This notebook builds on previous notebooks that attempted multi-class phase classification from liquid crystal (LC) textures. Here a balanced dataset is used, restricted to the nematic and cholesteric phases because there are too few images of the other phases.

In [2]:
import numpy as np
import matplotlib.pyplot as plt
from keras.models import Model
from keras.layers import Input, Dense, Activation, ZeroPadding2D, BatchNormalization, Flatten, Conv2D
from keras.layers import AveragePooling2D, MaxPooling2D, Dropout, GlobalMaxPooling2D, GlobalAveragePooling2D
from keras.layers.experimental.preprocessing import Rescaling
from keras.preprocessing import image_dataset_from_directory
from sklearn.metrics import confusion_matrix

Load in the data.

In [3]:
train_directory = "C:/Users/Jason/Documents/University/Year_4/MPhys_Project(s)/Liquid_crystals-machine_learning/LiquidCrystalMachineLearning/Images_balanced/Train"
test_directory = "C:/Users/Jason/Documents/University/Year_4/MPhys_Project(s)/Liquid_crystals-machine_learning/LiquidCrystalMachineLearning/Images_balanced/Test"
image_size = (368,640)

# Convert images to grayscale, as colour isn't an important feature at this stage
train_dataset = image_dataset_from_directory(train_directory,
                            labels="inferred",
                            label_mode="categorical",
                            color_mode="grayscale",
                            batch_size=64,
                            image_size=image_size,
                            shuffle=True
                        )
val_dataset = image_dataset_from_directory(test_directory,
                            labels="inferred",
                            label_mode="categorical",
                            color_mode="grayscale",
                            batch_size=64,
                            image_size=image_size,
                            shuffle=True
                        )

Found 316 files belonging to 2 classes.
Found 35 files belonging to 2 classes.


Let's see if the files imported as expected.

In [4]:
print(train_dataset.element_spec)
print(train_dataset.class_names)
for data, labels in train_dataset:
    print(data.shape)
    print(data.dtype)
    print(labels.shape)
    print(labels.dtype)

(TensorSpec(shape=(None, 368, 640, 1), dtype=tf.float32, name=None), TensorSpec(shape=(None, 2), dtype=tf.float32, name=None))
['Cholesteric', 'Nematic']
(64, 368, 640, 1)
<dtype: 'float32'>
(64, 2)
<dtype: 'float32'>
(64, 368, 640, 1)
<dtype: 'float32'>
(64, 2)
<dtype: 'float32'>
(64, 368, 640, 1)
<dtype: 'float32'>
(64, 2)
<dtype: 'float32'>
(64, 368, 640, 1)
<dtype: 'float32'>
(64, 2)
<dtype: 'float32'>
(60, 368, 640, 1)
<dtype: 'float32'>
(60, 2)
<dtype: 'float32'>
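
The batch shapes printed above follow from the file count and the batch size. As a quick check (plain arithmetic, no Keras needed; the variable names are just for illustration):

```python
import math

num_files = 316   # from "Found 316 files belonging to 2 classes."
batch_size = 64   # value passed to image_dataset_from_directory

# Number of batches per pass over the data, and the size of the final partial batch
num_batches = math.ceil(num_files / batch_size)
last_batch = num_files - (num_batches - 1) * batch_size

print(num_batches, last_batch)  # 5 60
```

This matches the four batches of 64 followed by one batch of 60 seen in the output.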


In [5]:
print(val_dataset.element_spec)
print(val_dataset.class_names)
for data, labels in val_dataset:
    print(data.shape)
    print(data.dtype)
    print(labels.shape)
    print(labels.dtype)

(TensorSpec(shape=(None, 368, 640, 1), dtype=tf.float32, name=None), TensorSpec(shape=(None, 2), dtype=tf.float32, name=None))
['Cholesteric', 'Nematic']
(35, 368, 640, 1)
<dtype: 'float32'>
(35, 2)
<dtype: 'float32'>
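
The label tensors have shape `(batch, 2)` because `label_mode="categorical"` gives one-hot labels. A minimal sketch of that encoding, assuming the class order printed by `class_names` (the helper `one_hot` is hypothetical, written here only for illustration):

```python
import numpy as np

class_names = ["Cholesteric", "Nematic"]  # order printed by class_names above

def one_hot(index, num_classes):
    """Return a float32 vector with a 1 at `index` and 0 elsewhere."""
    vec = np.zeros(num_classes, dtype=np.float32)
    vec[index] = 1.0
    return vec

# Encode a "Nematic" label, then decode it back with argmax
nematic_label = one_hot(class_names.index("Nematic"), len(class_names))
decoded = class_names[int(np.argmax(nematic_label))]

print(nematic_label, decoded)
```

So index 0 corresponds to cholesteric and index 1 to nematic whenever `np.argmax` is applied to labels or predictions later on.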


Let's define our pipeline.

In [68]:
image_shape = (image_size[0], image_size[1], 1)
X_inputs = Input(shape = image_shape)
# Rescale images to have values in range [0,1]
X = Rescaling(scale = 1/255)(X_inputs)


# Apply convolutional and pooling layers
X = Conv2D(filters=32, kernel_size=(3,3), activation="relu")(X)
X = MaxPooling2D(pool_size=(3,3))(X)
X = Conv2D(filters=32, kernel_size=(3,3), activation="relu")(X)
X = MaxPooling2D(pool_size=(3,3))(X)

# Apply fully connected layer
X = Flatten()(X)
X = Dense(units=128, activation="relu")(X)
X = Dense(units=64, activation="relu")(X)
# Output layer
num_classes = 2
X_outputs = Dense(units=num_classes, activation="softmax")(X)

model = Model(inputs = X_inputs, outputs = X_outputs)

Let's see what this model looks like.

In [69]:
model.summary()

Model: "functional_9"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
input_5 (InputLayer)         [(None, 368, 640, 1)]     0         
_________________________________________________________________
rescaling_4 (Rescaling)      (None, 368, 640, 1)       0         
_________________________________________________________________
conv2d_8 (Conv2D)            (None, 366, 638, 32)      320       
_________________________________________________________________
max_pooling2d_8 (MaxPooling2 (None, 122, 212, 32)      0         
_________________________________________________________________
conv2d_9 (Conv2D)            (None, 120, 210, 32)      9248      
_________________________________________________________________
max_pooling2d_9 (MaxPooling2 (None, 40, 70, 32)        0         
_________________________________________________________________
flatten_4 (Flatten)          (None, 89600)            
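
The output shapes and parameter counts in the summary can be checked by hand. The sketch below assumes the Keras defaults used in the model cell: stride-1 convolutions with "valid" padding, and pooling with a stride equal to the pool size. The helper names are hypothetical.

```python
def conv_out(size, kernel):
    # "valid" convolution with stride 1 trims (kernel - 1) pixels
    return size - kernel + 1

def pool_out(size, pool):
    # "valid" pooling with stride == pool size drops any remainder
    return size // pool

def conv_params(kernel, in_channels, filters):
    # one kernel x kernel x in_channels weight block plus one bias per filter
    return (kernel * kernel * in_channels + 1) * filters

h, w = 368, 640
h, w = conv_out(h, 3), conv_out(w, 3)   # 366, 638
h, w = pool_out(h, 3), pool_out(w, 3)   # 122, 212
h, w = conv_out(h, 3), conv_out(w, 3)   # 120, 210
h, w = pool_out(h, 3), pool_out(w, 3)   # 40, 70
flat = h * w * 32                       # 89600 features after Flatten

print(h, w, flat)
print(conv_params(3, 1, 32), conv_params(3, 32, 32))  # 320 9248
```

With 89600 flattened features feeding a 128-unit dense layer, most of the model's weights sit in that first fully connected layer rather than in the convolutions.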

Now we need to compile, train and test the model.

In [70]:
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
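
For reference, here is a minimal NumPy sketch of what the `categorical_crossentropy` loss measures for one-hot labels and softmax outputs. The clipping constant `eps` and the example probabilities are assumptions made for illustration, not values taken from the trained model.

```python
import numpy as np

def categorical_crossentropy(y_true, y_prob, eps=1e-7):
    """Per-sample cross-entropy between one-hot labels and predicted probabilities."""
    y_prob = np.clip(y_prob, eps, 1.0 - eps)  # avoid log(0)
    return -np.sum(y_true * np.log(y_prob), axis=-1)

# Two made-up samples: a confident correct prediction and a 50/50 guess
y_true = np.array([[0.0, 1.0], [1.0, 0.0]])
y_prob = np.array([[0.1, 0.9], [0.5, 0.5]])

losses = categorical_crossentropy(y_true, y_prob)
mean_loss = losses.mean()
print(losses, mean_loss)
```

A two-class model that always outputs 0.5 for each class has a loss of ln 2 (about 0.693), which is a useful baseline when reading the training log.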

In [71]:
model.fit(train_dataset, epochs=15)

Epoch 1/15
Epoch 2/15
Epoch 3/15
Epoch 4/15
Epoch 5/15
Epoch 6/15
Epoch 7/15
Epoch 8/15
Epoch 9/15
Epoch 10/15
Epoch 11/15
Epoch 12/15
Epoch 13/15
Epoch 14/15
Epoch 15/15


<tensorflow.python.keras.callbacks.History at 0x21e4bf3da08>

Let's see how the model does on unseen data.

In [73]:
loss, acc = model.evaluate(val_dataset)



Let's see the confusion matrix on our predictions to understand how our model is performing on the unseen data.

In [74]:
predictions = model.predict(val_dataset)
y_pred = np.argmax(predictions, axis = 1)
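
One caveat: `val_dataset` was created with `shuffle=True`, and a shuffled dataset is typically reshuffled on every pass. If so, predictions from one pass and labels from a separate pass may not line up sample for sample. A minimal sketch of one way around this is to collect both in a single pass. The helper `collect_labels_and_predictions` is hypothetical, and `predict_fn` is assumed to be something like `model.predict_on_batch`.

```python
import numpy as np

def collect_labels_and_predictions(dataset, predict_fn):
    """Iterate once over (images, one-hot labels) batches and return
    aligned integer arrays (y_true, y_pred)."""
    y_true, y_pred = [], []
    for images, labels in dataset:
        probs = np.asarray(predict_fn(images))        # shape (batch, num_classes)
        y_pred.append(np.argmax(probs, axis=1))
        y_true.append(np.argmax(np.asarray(labels), axis=1))
    return np.concatenate(y_true), np.concatenate(y_pred)

# Intended usage with the objects defined in this notebook (assumption):
# y_true, y_pred = collect_labels_and_predictions(val_dataset, model.predict_on_batch)
```

Creating the validation dataset with `shuffle=False` would be another way to keep the order fixed between passes.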

In [75]:
# Get true labels
y_true = np.argmax(np.concatenate([labels for data, labels in val_dataset], axis=0), axis=1)
print(y_true)
print(y_pred)

print("Confusion matrix:")
print(confusion_matrix(y_true=y_true, y_pred=y_pred, normalize="true"))

[1 0 0 0 0 1 1 0 1 1 1 1 0 1 0 1 0 1 1 1 1 0 0 0 0 0 0 1 0 0 0 1 0 0 1]
[0 0 0 0 1 0 0 0 0 0 0 0 0 0 1 0 0 0 0 1 0 1 1 0 0 0 0 0 1 1 1 0 0 0 0]
Confusion matrix:
[[0.63157895 0.36842105]
 [0.9375     0.0625    ]]
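
To make the normalised matrix easier to read, the raw counts can be rebuilt from the two arrays printed above. This is a NumPy-only sketch that pairs the arrays exactly as printed, so it is only meaningful if both are in the same sample order.

```python
import numpy as np

# Arrays copied from the output above (0 = Cholesteric, 1 = Nematic)
y_true = np.array([1,0,0,0,0,1,1,0,1,1,1,1,0,1,0,1,0,1,1,1,1,0,0,0,0,0,0,1,0,0,0,1,0,0,1])
y_pred = np.array([0,0,0,0,1,0,0,0,0,0,0,0,0,0,1,0,0,0,0,1,0,1,1,0,0,0,0,0,1,1,1,0,0,0,0])

# Rows are true classes, columns are predicted classes
counts = np.zeros((2, 2), dtype=int)
np.add.at(counts, (y_true, y_pred), 1)

# Divide each row by its total, as normalize="true" does
row_normalised = counts / counts.sum(axis=1, keepdims=True)
accuracy = np.trace(counts) / counts.sum()

print(counts)          # [[12  7]
                       #  [15  1]]
print(row_normalised)
print(accuracy)        # 13/35, about 0.371
```

Under this pairing, 12 of 19 cholesteric images and only 1 of 16 nematic images are classified correctly, so the model predicts cholesteric for most inputs.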


An imbalanced training dataset (many more cholesteric images than other phases) is one major flaw with the current model.