### EXPLORATION OF POSSIBLE USE OF NEURAL NETWORKS IN ACNE DIAGNOSIS 
CLEMENT-ANDI CLEMENT EDET 
COMPUTATIONAL PHYSICS REPORT

### Abstract

Acne is a skin condition, most often affecting the face, that is common among teenagers and young adults. Conventional diagnosis involves booking an appointment with a doctor and undergoing an examination, which may include reviewing family history and asking about diet and the skin care products used. This is generally time consuming for both doctors and patients. The question this paper tries to answer is: “Is it possible to cut out delays in the diagnosis process?” A good model would save medical professionals a lot of time, especially when probing patients for relevant information. Patients may not know details such as their family history, which could lead medical professionals down the wrong path, and they may conceal their habits or the skin products they use. A model that relies as little as possible on subjective information would therefore be of great help to all parties.

### Introduction
This project explores an additional approach to medical diagnosis and aims to contribute to the growing field of machine learning and its applications in modern medicine. The model used is small and was trained on a limited amount of data; even so, it should give a glimpse of the potential of neural networks in medical diagnosis.

### Tools
The software and libraries used for this project include Jupyter Notebook, PIL, Matplotlib, TensorFlow and Keras.

Jupyter Notebook was chosen because individual parts of the code can be run in isolation and adjusted until the desired results are reached, without rerunning everything. This was helpful because training models is computationally expensive, especially on a CPU alone. Jupyter Notebook also keeps the code, data and outputs in one place. The graphs persist, which allows the results to be analyzed over long periods. JSON files were also used as backups in case data was lost.

 

PIL (the Python Imaging Library) adds image processing capabilities to the Python interpreter. It is also a very popular library with a lot of support.

 

Matplotlib was chosen due to its data visualization prowess. 

TensorFlow is a powerful and flexible machine learning framework, and Keras is an API that aids the building and training of neural networks with TensorFlow.

### Procedure

Setting up the neural network was of utmost importance. The placement of layers, variables and constants all play a crucial role in a well-functioning neural network.

In the case of a convolutional neural network the following layers are the most common: 

Convolution: this layer activates various features in an image by passing the image through a set of convolution filters. A convolution filter changes the value of a pixel based on the values of its neighboring pixels. A small matrix known as a kernel defines the filter and is used to extract features from the image.

The kernel is multiplied element-wise with the region of the image beneath it, and the results are summed. This process is repeated for every location in the image. The result is a new image in which each pixel is a weighted sum of pixels in the original image.
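The sliding multiply-and-sum described above can be sketched in plain Python. This is only an illustration with a made-up 4x4 image and 2x2 kernel, not the Keras layer used in the project, and it assumes no padding and a stride of 1.

```python
def convolve2d(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and return the weighted sums.

    Like CNN libraries, this does not flip the kernel (strictly, a cross-correlation).
    """
    k_rows, k_cols = len(kernel), len(kernel[0])
    out_rows = len(image) - k_rows + 1
    out_cols = len(image[0]) - k_cols + 1
    output = []
    for r in range(out_rows):
        row = []
        for c in range(out_cols):
            # Multiply the kernel with the image region beneath it and sum the results.
            total = 0
            for i in range(k_rows):
                for j in range(k_cols):
                    total += image[r + i][c + j] * kernel[i][j]
            row.append(total)
        output.append(row)
    return output


# Hypothetical pixel values and kernel, chosen only for illustration.
image = [
    [1, 2, 3, 0],
    [4, 5, 6, 1],
    [7, 8, 9, 2],
    [1, 0, 1, 3],
]
kernel = [
    [1, 0],
    [0, -1],
]
feature_map = convolve2d(image, kernel)
print(feature_map)  # [[-4, -4, 2], [-4, -4, 4], [7, 7, 6]]
```

A 4x4 image and a 2x2 kernel give a 3x3 feature map, because the kernel fits in three positions along each axis.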

Rectified Linear Unit (ReLU): this applies a function that maps negative values to zero whilst leaving positive values unchanged. Thus, only positive activations are carried to the next layer.
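As a minimal sketch of this mapping, written in plain Python with made-up input values:

```python
def relu(values):
    """Map negative values to zero and leave positive values unchanged."""
    return [max(0, v) for v in values]


# Hypothetical activations, for illustration only.
activations = relu([-4, -4, 2, 7, 0])
print(activations)  # [0, 0, 2, 7, 0]
```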

Pooling: this reduces the spatial size of a layer's output, which in turn reduces the number of parameters the network needs to learn in later layers. A max pooling layer breaks the image into small blocks and takes the maximum pixel value in each block. These values form a new, compressed version of the initial image. This layer makes models more resistant to small variations in the input.
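Max pooling can be sketched as follows. The 2x2 block size and the pixel values are assumptions for illustration, and the sketch assumes the image dimensions divide evenly by the block size.

```python
def max_pool2d(image, size=2):
    """Non-overlapping max pooling; assumes both dimensions divide evenly by `size`."""
    pooled = []
    for r in range(0, len(image), size):
        row = []
        for c in range(0, len(image[0]), size):
            # Collect the pixels of one block and keep only the largest.
            block = [image[r + i][c + j] for i in range(size) for j in range(size)]
            row.append(max(block))
        pooled.append(row)
    return pooled


# Hypothetical pixel values, for illustration only.
image = [
    [1, 3, 2, 0],
    [4, 6, 1, 1],
    [7, 2, 9, 2],
    [1, 0, 3, 8],
]
pooled = max_pool2d(image)
print(pooled)  # [[6, 2], [7, 9]]

# A small change that does not exceed the block maximum leaves the output unchanged.
perturbed = [row[:] for row in image]
perturbed[0][1] = 5
print(max_pool2d(perturbed) == pooled)  # True
```

The second call illustrates the resistance to small input variations: changing one pixel from 3 to 5 does not alter the pooled result, because 6 is still the maximum of that block.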

 

Other layers used: 

Flatten: this converts the multi-dimensional output of a previous layer into a one-dimensional array. It connects the convolution and pooling layers, which extract features, with the fully connected layers, which carry out classification and regression tasks.

Dense: this is a fully connected layer in which every neuron is connected to every neuron in the previous layer. This is where classification or regression is carried out, including the model's final prediction.
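The hand-off from Flatten to Dense can be sketched in plain Python. The feature map, weights and biases below are hypothetical values chosen for illustration; in the actual model they are learned by Keras.

```python
def flatten(feature_map):
    """Convert a 2D feature map into a one-dimensional list, row by row."""
    return [value for row in feature_map for value in row]


def dense(inputs, weights, biases):
    """Fully connected layer: weights[k] holds one weight per input for output neuron k."""
    return [
        sum(w * x for w, x in zip(neuron_weights, inputs)) + bias
        for neuron_weights, bias in zip(weights, biases)
    ]


# Hypothetical values, for illustration only.
feature_map = [[6, 2], [7, 9]]
flat = flatten(feature_map)
weights = [
    [0.5, 0.0, -0.5, 0.0],  # output neuron 0
    [0.0, 1.0, 0.0, 1.0],   # output neuron 1
]
biases = [1.0, -1.0]
outputs = dense(flat, weights, biases)
print(flat)     # [6, 2, 7, 9]
print(outputs)  # [0.5, 10.0]
```

Every output neuron uses all four flattened inputs, which is what "fully connected" means here.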

Training proceeds in two phases: forward propagation and backpropagation.

Forward propagation involves moving input data through the hidden layers to the output layer. Each neuron receives input values, multiplies them by its weights and adds a bias. It then applies an activation function and passes the output to the next layer. In the model developed here, a ReLU function is used because it is computationally cheaper than the sigmoid and tanh functions. It also tends to speed up the convergence of gradient descent towards a minimum of the loss function.

Gradient descent is an optimization algorithm used to minimize the loss (error) by repeatedly adjusting the weights in the direction that reduces it.
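A minimal sketch of gradient descent on a one-parameter toy loss is shown here. The loss function, learning rate and step count are assumptions chosen so the behaviour is easy to follow; they are not taken from the project.

```python
def gradient_descent(grad, start, learning_rate=0.1, steps=100):
    """Repeatedly step against the gradient to reduce the loss."""
    w = start
    for _ in range(steps):
        w -= learning_rate * grad(w)
    return w


# Toy loss (w - 3)^2 has gradient 2 * (w - 3) and its minimum at w = 3.
w_min = gradient_descent(lambda w: 2 * (w - 3), start=0.0)
print(abs(w_min - 3) < 1e-6)  # True
```

Each step shrinks the distance to the minimum by a constant factor here, which is why the result is very close to 3 after 100 steps.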

Backpropagation is the process of adjusting weights to minimize error. First, the loss (the error between predicted and actual values) is calculated. Next, the gradient of the loss with respect to each weight is computed using the chain rule. Finally, the weights are updated using gradient descent. The Adam optimizer was used due to its efficiency on large problems with lots of data and parameters.
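The three steps above can be sketched for a single sigmoid neuron. This is an illustration under stated assumptions: it uses a squared-error loss, one made-up training example and plain gradient descent rather than the Adam optimizer used in the project.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def loss(w, b, x, y):
    """Squared error between the neuron's prediction and the target."""
    return (sigmoid(w * x + b) - y) ** 2


def gradients(w, b, x, y):
    """Chain rule: dL/dw = dL/dpred * dpred/dz * dz/dw, and likewise for b."""
    pred = sigmoid(w * x + b)
    dloss_dpred = 2.0 * (pred - y)
    dpred_dz = pred * (1.0 - pred)
    dz_dw, dz_db = x, 1.0
    return dloss_dpred * dpred_dz * dz_dw, dloss_dpred * dpred_dz * dz_db


# Hypothetical starting weights and a single made-up example.
w, b = 0.5, 0.0
x, y = 2.0, 1.0
learning_rate = 0.5

initial_loss = loss(w, b, x, y)
for _ in range(200):
    grad_w, grad_b = gradients(w, b, x, y)
    # Update step: move each parameter against its gradient.
    w -= learning_rate * grad_w
    b -= learning_rate * grad_b
final_loss = loss(w, b, x, y)

print(final_loss < initial_loss)  # True
```

In a full network the same chain rule is applied layer by layer, from the output back to the first layer, which is where the name backpropagation comes from.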

Dropout: this is a regularization technique that randomly drops a certain percentage of neurons during training to prevent overfitting. Neurons are thus forced to learn more robust features that do not depend on any single neuron.
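A minimal sketch of dropout follows, using the common "inverted" form in which the surviving activations are scaled up so their expected sum is unchanged. The 50% rate, the seed and the activations are assumptions for illustration; the rate used in the project is not stated here.

```python
import random


def dropout(activations, rate, training=True, rng=random):
    """Inverted dropout: zero each activation with probability `rate`, rescale the rest."""
    if not training or rate == 0.0:
        # At inference time every neuron is kept and nothing is rescaled.
        return list(activations)
    keep_prob = 1.0 - rate
    return [a / keep_prob if rng.random() < keep_prob else 0.0 for a in activations]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
activations = [1.0] * 10
dropped = dropout(activations, rate=0.5, rng=rng)
unchanged = dropout(activations, rate=0.5, training=False)
print(dropped)
print(unchanged == activations)  # True
```

With a rate of 0.5, each surviving value becomes 2.0 and each dropped value becomes 0.0; which positions are dropped changes on every training step.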

L2 regularization is a regularization technique that helps prevent overfitting by adding a penalty term to the loss function. The penalty is the sum of the squared values of all the model weights, multiplied by a regularization parameter.

This was preferred over L1 because it does not force weights to be exactly zero. Instead, it pushes them towards smaller values. This prevents a single feature from dominating the prediction.
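The two penalties can be compared with a short sketch. The weights, the regularization parameter `lam` and the data loss are hypothetical values, not those of the trained model.

```python
def l2_penalty(weights, lam):
    """Regularization parameter times the sum of squared weights."""
    return lam * sum(w * w for w in weights)


def l1_penalty(weights, lam):
    """Regularization parameter times the sum of absolute weights."""
    return lam * sum(abs(w) for w in weights)


# Hypothetical values, for illustration only.
weights = [0.5, -2.0, 0.0, 1.0]
lam = 0.01
data_loss = 0.30

total_loss = data_loss + l2_penalty(weights, lam)
print(round(total_loss, 4))  # 0.3525

# The L2 gradient, 2 * lam * w, shrinks along with the weight, so weights are pulled
# towards zero ever more gently. The L1 gradient has constant size lam, which is what
# can drive weights all the way to zero.
l2_gradient = [2 * lam * w for w in weights]
```

Note how the largest weight (-2.0) contributes most of the L2 penalty, so it is also the one pushed hardest towards a smaller value.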

Below is a summary of the model used for this project.

![image.png](attachment:7243a67a-5030-4b12-b1c8-d90d7a5ba89a.png)

Fig 1 - Model Summary

The datasets were stored in directories named after the kind of images they contained.
The training data were put in the "train" directory and the validation data were put in the "valid" directory.

Within these directories there were "acne" and "clean" subdirectories, which allowed the model to label the images correctly.
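The idea of labelling images by their parent directory can be sketched with the standard library. The function name and the class order (and therefore which class receives label 1) are assumptions for illustration; the project itself relied on Keras to read the directories.

```python
from pathlib import Path


def collect_labelled_images(split_dir, class_names=("clean", "acne")):
    """Return (path, label) pairs, labelling each image by its parent directory.

    The label is the index of the directory name in `class_names`, so with the
    assumed order above "clean" is 0 and "acne" is 1.
    """
    samples = []
    for label, class_name in enumerate(class_names):
        for path in sorted((Path(split_dir) / class_name).glob("*")):
            if path.is_file():
                samples.append((path, label))
    return samples


# Example usage, assuming the layout described above exists on disk:
# train_samples = collect_labelled_images("train")
# valid_samples = collect_labelled_images("valid")
```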

### Result

For a model to be a good fit, the training loss should decrease to a point of stability. The validation loss should also decrease to a point of stability, with a small gap between the two curves called the generalization gap. The validation loss curve should also lie mostly above the training loss curve.

The model created fits this description. The loss decreases steadily for both curves until they flatten out close to each other, leaving only a small generalization gap.

![image.png](attachment:2567132e-01c1-4521-93d6-921aeb68078d.png)

Fig 2 - Accuracy and Loss graphs for training and validation sets.
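The fit criteria described before Fig 2 can also be checked numerically. The sketch below uses hypothetical per-epoch losses, not values read from Fig 2, and the helper names, window length and tolerance are assumptions.

```python
def generalization_gap(train_loss, val_loss, last_n=3):
    """Mean validation loss minus mean training loss over the final `last_n` epochs."""
    train_tail = train_loss[-last_n:]
    val_tail = val_loss[-last_n:]
    return sum(val_tail) / len(val_tail) - sum(train_tail) / len(train_tail)


def has_plateaued(losses, last_n=3, tolerance=0.01):
    """True if the loss varied by no more than `tolerance` over the final epochs."""
    tail = losses[-last_n:]
    return max(tail) - min(tail) <= tolerance


# Hypothetical loss histories, for illustration only.
train_loss = [0.90, 0.60, 0.40, 0.30, 0.25, 0.245, 0.242]
val_loss = [0.95, 0.70, 0.50, 0.38, 0.33, 0.325, 0.322]

gap = generalization_gap(train_loss, val_loss)
print(round(gap, 3))                                         # 0.08
print(has_plateaued(train_loss) and has_plateaued(val_loss)) # True
```

A small positive gap together with two plateaued curves matches the description of a good fit; a gap that keeps widening while the training loss falls would instead suggest overfitting.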

The confusion matrices of the model on the test and validation sets are shown below.

![image.png](attachment:2c6bb443-2675-4a43-94b5-2547f787188e.png)

Fig 3 - Confusion Matrix for the test set.

![image.png](attachment:4e394c6c-d587-4f87-857b-f5a9d2a1cec0.png)

Fig 4 - Confusion Matrix for the validation set.

The confusion matrix gives us a way to interpret the results of the model.
Specifically, it allows us to quantify the false positive, true positive, false negative and true negative predictions of the model.
In the context of this model, these terms are defined as follows:
1. False Positive: the model predicts an image as including acne but the image does not.
2. True Positive: the model predicts the image as including acne correctly.
3. False Negative: the model predicts the image as including no acne but the image has acne.
4. True Negative: the model predicts the image as including no acne correctly.

The true negative and false positive categories correspond to the blocks on the first row.
The false negative and true positive categories correspond to the blocks on the second row.
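Given that row layout, the usual summary metrics can be computed directly from the four counts. The counts below are hypothetical and are not read from Fig 3 or Fig 4; only the layout follows the description above.

```python
def confusion_metrics(matrix):
    """`matrix` is [[TN, FP], [FN, TP]], matching the row layout described above."""
    (tn, fp), (fn, tp) = matrix
    total = tn + fp + fn + tp
    return {
        "accuracy": (tp + tn) / total,      # all correct predictions
        "precision": tp / (tp + fp),        # predicted acne that really is acne
        "recall": tp / (tp + fn),           # real acne images that were found
        "specificity": tn / (tn + fp),      # real clean images that were found
    }


# Hypothetical counts, for illustration only.
hypothetical = [[30, 10], [2, 58]]
metrics = confusion_metrics(hypothetical)
print(metrics["accuracy"])     # 0.88
print(metrics["specificity"])  # 0.75
```

In this made-up example, recall is high while specificity is lower, which is the pattern of a model that handles acne images well but produces false positives on clean images.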

As can be seen, the model does very well at correctly predicting images with acne.
However, it does not do as well at predicting images with no acne.
It also seems to be biased towards false positives, which may allow us to infer that the training set contained more images with acne than without.

The model recorded 94.30% accuracy on the test set.
However, it recorded only 76.79% accuracy on the validation set.
Combining the results from both sets gives an overall accuracy of 84.43%.
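The combined figure is not the simple mean of the two accuracies, which is what one would expect if it was pooled over all images, so that each set counts in proportion to its size. The sketch below assumes that pooling; the helper name is hypothetical, and the only numbers used are the three accuracies reported above.

```python
def pooled_accuracy(correct_counts, total_counts):
    """Accuracy over several sets combined: total correct divided by total images."""
    return 100.0 * sum(correct_counts) / sum(total_counts)


test_acc, val_acc, combined_acc = 94.30, 76.79, 84.43

# The unweighted mean differs from the reported combined accuracy.
simple_mean = (test_acc + val_acc) / 2

# If combined = share * test + (1 - share) * val, the implied share of test images is:
test_share = (combined_acc - val_acc) / (test_acc - val_acc)

print(round(simple_mean, 3))  # 85.545
print(round(test_share, 2))   # 0.44
```

Under this assumption, the test set would make up a little under half of the combined images, which is why the combined accuracy sits slightly closer to the validation figure.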

Once the results were obtained, they were compared with those in the reference paper (Shen et al., 2018).
That paper had 6 classes: cyst, blackhead, normal skin, pustule, whitehead and nodule.
The model developed in that paper had 86.80% accuracy on average.

Thus, the model developed here comes close to it in classification accuracy.
However, it must be noted that the two models perform very different tasks.
The model developed in this paper classifies images into only two labels, whilst the other classifies them into 6.
Thus, it must be conceded that the reference paper's model may well be the much better model.

### Discussion

This project shows that convolutional neural networks are extremely useful in classification tasks. In this case, the model was trained on a modest dataset and was able to learn the features of images with acne extremely well. The model also worked well on images well outside the range of what it was trained on.

To improve this model, more data would need to be found, especially images with no acne.
Apart from this

Given the results seen, the potential of neural networks in acne diagnosis is very high, especially with advancements in GPU technology and potentially massive datasets.

### References

Shen, X., Zhang, J., Yan, C. et al. An Automatic Diagnosis Method of Facial Acne Vulgaris Based on Convolutional Neural Network. Sci Rep 8, 5839 (2018). https://doi.org/10.1038/s41598-018-24204-6 