# FIND STROKE - DEEP NEURAL NETWORK LEARNING

![image.png](attachment:0cd7972e-be94-4165-9ed5-caaa5b25e364.png)

# CONTENTS 

## - BACKGROUND 
## - THE PROBLEM
## - DATA COLLECTION 
## - MODEL TRAINING 
## - CONCLUSION

# BACKGROUND

One big challenge in medicine today is diagnosis: obtaining the correct diagnosis early enough in the course of a disease to permit interventions that will improve the patient's condition.

In the emergency care setting, patients who present with typical or atypical stroke symptoms within a specific time window from onset are potentially eligible for specific treatments. These treatments could be life-saving, but they depend on a timely diagnosis of the stroke.

There are numerous tasks for which AI/Deep Learning is currently in use in Healthcare and across other industries.
This project explored the potential of AI within the clinical specialty of Radiology (imaging/diagnostics), where the interpretation of imaging is currently performed by Radiologists (clinician specialists).

A Deep Learning model was trained in this project to detect abnormalities in radiology imaging - specifically in the identification of the different types of strokes on CT brain imaging.

Deep learning is a computer technique to extract and transform data through the use of multiple layers of neural networks, with use cases that include image classification, human speech recognition and tabular data. Each layer takes its inputs from previous layers and progressively refines them. The layers are trained by algorithms that minimise their errors and improve their accuracy. In this way, the network learns to perform a specific task.

A neural network is a function that is flexible enough to approximate a very wide range of input-to-output mappings, just by varying its weights.
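
The idea of a function whose behaviour is set by its weights can be sketched with a single linear neuron trained by gradient descent. This is a toy illustration only, not the model used in this project: the function name, the learning rate, the number of epochs and the toy data are all assumptions made for the example.

```python
import random

def train_linear_neuron(samples, learning_rate=0.05, epochs=500, seed=0):
    """Fit y = w * x + b by gradient descent on the mean squared error."""
    rng = random.Random(seed)
    w, b = rng.uniform(-1, 1), rng.uniform(-1, 1)  # start from random weights
    n = len(samples)
    for _ in range(epochs):
        # Gradients of the mean squared error with respect to w and b
        grad_w = sum(2 * (w * x + b - y) * x for x, y in samples) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in samples) / n
        # Move each weight a small step in the direction that reduces the error
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b

# Toy data generated from y = 3x + 1 (hypothetical, for illustration)
samples = [(x / 10, 3 * (x / 10) + 1) for x in range(-10, 11)]
w, b = train_linear_neuron(samples)
```

After training, `w` and `b` should sit close to the values used to generate the toy data. A deep network applies the same principle to millions of weights across many layers.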

Software used in this project includes PyTorch, fastai and Jupyter.

# THE PROBLEM

The problem was defined as a **Classification Supervised Machine Learning** one. The objective here was to build an image classifier to identify and distinguish between 3 categories of CT brain images: normal brain, brain with ischaemic stroke and brain with haemorrhagic stroke. Each image was labelled using its filename.

# DATA COLLECTION 

The Bing Image Search API was used to find the URLs of the relevant images, which were then downloaded. Each image was placed into the folder corresponding to its category: `brain_types = 'ischaemicStroke', 'haemorrhagicStroke', 'normalBrain'`.
20% of the data was held out as a validation set, with the remaining 80% of the data (the training set) used to train the model. The validation set was used to demonstrate model accuracy i.e. how well the model performed on unseen images. 
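
The folder-based labelling and the 80/20 split can be sketched in plain Python. This is a minimal illustration that assumes the label is read from the name of the folder an image sits in; the helper names, the seed and the file paths are hypothetical and not taken from the project.

```python
import random
from pathlib import PurePath

BRAIN_TYPES = ("ischaemicStroke", "haemorrhagicStroke", "normalBrain")

def label_from_path(image_path):
    """Return the category of an image, taken from its parent folder name."""
    label = PurePath(image_path).parent.name
    if label not in BRAIN_TYPES:
        raise ValueError(f"unexpected folder {label!r} for {image_path}")
    return label

def split_train_valid(items, valid_fraction=0.2, seed=42):
    """Shuffle items reproducibly and hold out valid_fraction for validation."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_valid = int(len(shuffled) * valid_fraction)
    return shuffled[n_valid:], shuffled[:n_valid]

# Hypothetical file paths, for illustration only
paths = [f"images/{kind}/{i:03d}.jpg" for kind in BRAIN_TYPES for i in range(10)]
train, valid = split_train_valid(paths)
```

Fixing the seed keeps the same images in the validation set on every run, so results from different training runs remain comparable.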

A DataLoaders object was used to assemble the data into a format suitable for model training (to separate the data into "train" and "valid" buckets). This object provided the data for the model.
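
With fastai, a DataLoaders object of this kind is typically built from a `DataBlock`. The sketch below shows one plausible way to do it and is not the project's actual code: the folder name `brain_images`, the seed and the `min_scale` value are assumptions.

```python
from pathlib import Path

IMAGE_ROOT = Path("brain_images")  # hypothetical folder holding the three category sub-folders

def build_dataloaders(image_root=IMAGE_ROOT, image_size=128, valid_pct=0.2, seed=42):
    """Assemble training and validation DataLoaders from labelled image folders."""
    # Imported lazily so this sketch can be loaded without fastai installed.
    from fastai.vision.all import (
        CategoryBlock, DataBlock, ImageBlock, RandomResizedCrop,
        RandomSplitter, get_image_files, parent_label,
    )

    brains = DataBlock(
        blocks=(ImageBlock, CategoryBlock),  # inputs are images, targets are categories
        get_items=get_image_files,           # collect every image file under image_root
        splitter=RandomSplitter(valid_pct=valid_pct, seed=seed),  # hold out 20% for validation
        get_y=parent_label,                  # label = name of the parent folder
        item_tfms=RandomResizedCrop(image_size, min_scale=0.5),   # min_scale is an assumed value
    )
    return brains.dataloaders(image_root)
```

Calling `dls = build_dataloaders()` would then expose the two buckets as `dls.train` and `dls.valid`.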

The images were all of different sizes, so a randomly selected part of each image was cropped and resized to ensure that the dimensions were constant across all of the images used to train the model. Because a different part of the image is selected each time, this approach also allowed the model to recognise and focus on different features present in each image. This ensured the data was assembled in a format fit for model training.

# MODEL TRAINING 

Only a small amount of data was available for this problem (approximately 300 images across all categories of brain imaging), so the model was trained using a function that randomly cropped and resized each image to 128 pixels, focusing on a different part of each image during each epoch.
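
The effect of random cropping can be sketched without any imaging library by choosing the crop window alone. This is a simplified illustration and not fastai's exact algorithm: here `min_scale` is treated as a fraction of the shorter side and the window is always square, which are assumptions made for the example.

```python
import random

def random_crop_box(width, height, min_scale=0.5, rng=random):
    """Pick a random square window whose side is at least min_scale of the shorter image side."""
    short_side = min(width, height)
    side = rng.randint(max(1, int(short_side * min_scale)), short_side)
    left = rng.randint(0, width - side)
    top = rng.randint(0, height - side)
    return left, top, left + side, top + side

rng = random.Random(0)
# A new window is drawn on every call, i.e. on every epoch for the same image.
boxes = [random_crop_box(512, 400, rng=rng) for _ in range(3)]
```

Each window would then be resized to 128 by 128 pixels, so the network always receives inputs of the same size while seeing a slightly different view of every image in each epoch.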

The model, a convolutional neural network, was subsequently trained and fine-tuned. The learner took as inputs the DataLoaders object containing the data and the architecture (resnet18), with **"error_rate"** used as the metric of choice (an alternative is "accuracy"). The error rate as well as the validation loss improved with each subsequent epoch (i.e. each complete pass of the model through the dataset).
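
In fastai, a training step like this is usually only a few lines. The sketch below assumes a recent fastai release (where the learner factory is `vision_learner`; older releases call it `cnn_learner`) and an arbitrary number of epochs. It is an illustration and not the project's exact code.

```python
def train_stroke_classifier(dls, epochs=4):
    """Fine-tune a pretrained resnet18 on the DataLoaders, reporting error_rate."""
    # Imported lazily so this sketch can be loaded without fastai installed.
    from fastai.vision.all import error_rate, resnet18, vision_learner

    learn = vision_learner(dls, resnet18, metrics=error_rate)
    learn.fine_tune(epochs)  # the number of epochs is an assumed value
    return learn

def error_rate_from_accuracy(accuracy):
    """error_rate is the complement of accuracy on the validation set."""
    return 1.0 - accuracy
```

The two metrics carry the same information: an accuracy of 0.9 corresponds to an error rate of 0.1, so the choice between them is a matter of presentation.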

![image.png](attachment:0d526c68-b3b6-4964-882c-82a41253fe4b.png)


A confusion matrix was plotted to check for misclassification of normal CT brain imaging as ischaemic or haemorrhagic stroke (or vice versa). This was calculated using the validation dataset.


![image.png](attachment:9a05d17c-b9b0-4681-abda-0cc20c161b36.png)

The rows represent the actual haemorrhagic stroke, ischaemic stroke and normal brain images in the validation set respectively. The columns represent the images which the model predicted as haemorrhagic strokes, ischaemic strokes and normal brain imaging respectively. The diagonal of the matrix shows the images which were classified correctly (28 ischaemic stroke brain images, 15 normal brain images and no haemorrhagic stroke brain images). The off-diagonal cells show the numbers that were incorrectly classified (2 ischaemic strokes incorrectly classified as haemorrhagic, 2 normal brain images incorrectly classified as haemorrhagic strokes, and so on).
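
How such a matrix is built can be sketched in a few lines of plain Python. The labels below are made up for illustration and are not the project's results; the function names are likewise hypothetical.

```python
LABELS = ("haemorrhagicStroke", "ischaemicStroke", "normalBrain")

def confusion_matrix(actual, predicted, labels=LABELS):
    """Count images per (actual, predicted) pair: rows are actual, columns are predicted."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for a, p in zip(actual, predicted):
        matrix[index[a]][index[p]] += 1
    return matrix

def accuracy(matrix):
    """Fraction of images on the diagonal, i.e. classified correctly."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total

# Made-up labels for illustration; these are not the project's results.
actual = ["ischaemicStroke", "ischaemicStroke", "normalBrain", "haemorrhagicStroke"]
predicted = ["ischaemicStroke", "haemorrhagicStroke", "normalBrain", "haemorrhagicStroke"]
matrix = confusion_matrix(actual, predicted)
```

In this toy case one ischaemic stroke image lands in the haemorrhagic column, so it appears off the diagonal in the second row, and three of the four images are classified correctly.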

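A subsequent deep-dive into the errors revealed a combination of a dataset problem (images that were incorrectly labelled into one of the 3 categories) and a model problem (incorrect image classification). The images were subsequently sorted by their "loss", which is higher when the model is wrong or is not confident in a correct answer, so that the most problematic images could be reviewed first.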

This made it possible to clean the data based on model outputs and subsequently re-train the model to improve its accuracy.
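
Sorting images by loss can be sketched as follows, assuming the loss is cross-entropy (a common choice for classification, though the loss used in this project is not stated). The file names and probabilities are hypothetical.

```python
import math

def cross_entropy(probabilities, true_label):
    """Loss for one image: -log of the probability given to its assigned label."""
    return -math.log(probabilities[true_label])

def rank_by_loss(predictions):
    """Return (file name, loss) pairs sorted from highest to lowest loss."""
    scored = [(name, cross_entropy(probs, label)) for name, probs, label in predictions]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Hypothetical model outputs: (file name, class probabilities, assigned label)
predictions = [
    ("a.jpg", {"normalBrain": 0.90, "ischaemicStroke": 0.05, "haemorrhagicStroke": 0.05}, "normalBrain"),
    ("b.jpg", {"normalBrain": 0.10, "ischaemicStroke": 0.10, "haemorrhagicStroke": 0.80}, "ischaemicStroke"),
    ("c.jpg", {"normalBrain": 0.30, "ischaemicStroke": 0.60, "haemorrhagicStroke": 0.10}, "ischaemicStroke"),
]
ranked = rank_by_loss(predictions)
```

Images at the top of the ranking are either mislabelled in the dataset or genuinely hard for the model, which makes them the natural first candidates for manual review. fastai provides `plot_top_losses` on its `ClassificationInterpretation` object for the same purpose.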

# CONCLUSION

The objective of this project was to explore the potential of Deep Learning in healthcare imaging/diagnostics, specifically in addressing the problem of detecting stroke abnormalities on CT brain imaging.

There is a role for AI in support of such diagnostic radiology workflows - by immediately flagging up imaging with detected abnormalities for urgent review by the radiologist.

This could potentially increase the efficiency of such radiology workflow processes, enhance the quality of clinical decision-making as well as the quality of patient care provision.

This is one way in which AI and other such tech enablers could be used to plug the ever-increasing gap between demand for and supply of clinical services. Workforce growth is not currently keeping pace with increasing demands on the health service, with such pressures projected to rise in the face of increasing life expectancies, multiple chronic conditions per patient and other such trends.

This is of course counterbalanced by the need to demonstrate clear evidence of the performance and efficacy of these AI-driven solutions through their ability to reduce time or costs of diagnosis and potentially improve the quality of clinical care provision.