This repo hosts the slide deck and code from my presentation at the following event:
One Week ONLINE Short Term Training Programme (STTP) on "Research Issues and Challenges in Deep learning-based Medical Image analysis and Medical Diagnosis", organised by the Department of Electronics and Communication Engineering of Sri Ramakrishna Engineering College, Coimbatore,
Tamilnadu
India
contact: rajesh.thennan@gmail.com
Image data was obtained from https://www.kaggle.com/praveengovi/coronahack-chest-xraydataset and then simplified.
The images folder and chestXrayCoronaMetadata.csv must be accessible in the current working directory.
Use the parallel processing scripts if you have more than 4 CPU threads; otherwise use the sequential script.
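A minimal sketch of that decision, assuming the standard library's `os.cpu_count()` is an acceptable proxy for the number of CPU threads. The function name `choose_dataprep` and its `threshold` parameter are hypothetical and not part of this repo:

```python
import os


def choose_dataprep(cpu_threads=None, threshold=4):
    """Return which data-prep variant to run for a given CPU thread count."""
    if cpu_threads is None:
        # os.cpu_count() may return None when the count cannot be determined
        cpu_threads = os.cpu_count() or 1
    return "parallel" if cpu_threads > threshold else "sequential"


if __name__ == "__main__":
    print(f"Detected {os.cpu_count()} CPU threads: use the {choose_dataprep()} scripts")
```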
Seq_Dataprep.py
1. Read all images and find the smallest dimension
2. Tabular data read - tabular data with image path and class
3. Image data read and preprocessing
3.1 Read the corresponding images and convert to grayscale
3.2 Resize each image to the smallest dimension
3.3 Scale down further if required (not done here)
3.4 0-255 scaling - data scaling
4. Cleanup - remove rows with missing data
5. Shuffle
6. Change labels to integers
7. Balance
8. Shuffle again
9. Sequential combine - OPTIONAL
10. Split train and test
11. Split X and Y
12. Scale tabular data
13. Export as NumPy arrays, ready for the model
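Several of the steps above (3.1, 3.4, 5 to 8 and 10) can be sketched with NumPy alone. This is an illustration on synthetic data, not the code in Seq_Dataprep.py. The function names, the placeholder labels, the undersampling strategy for balancing, and the 80/20 split are all assumptions:

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # fixed seed so the shuffles are repeatable


def to_gray_unit(img_rgb):
    """Steps 3.1 and 3.4: RGB uint8 images -> grayscale floats in about [0, 1]."""
    weights = np.array([0.299, 0.587, 0.114], dtype=np.float32)  # luma weights
    return (img_rgb.astype(np.float32) @ weights) / 255.0


def encode_labels(labels):
    """Step 6: map string labels to integers 0..k-1 (sorted label order)."""
    classes, codes = np.unique(labels, return_inverse=True)
    return classes, codes


def shuffle(images, codes):
    """Steps 5 and 8: apply one shared permutation to images and labels."""
    order = rng.permutation(len(codes))
    return images[order], codes[order]


def balance(images, codes):
    """Step 7 (assumed strategy): undersample each class to the smallest class."""
    n_min = np.bincount(codes).min()
    keep = np.concatenate([
        rng.choice(np.flatnonzero(codes == c), size=n_min, replace=False)
        for c in np.unique(codes)
    ])
    return images[keep], codes[keep]


def train_test_split(images, codes, test_fraction=0.2):
    """Step 10: hold out the last test_fraction of the already shuffled rows."""
    n_train = len(codes) - int(round(len(codes) * test_fraction))
    return images[:n_train], codes[:n_train], images[n_train:], codes[n_train:]


# Synthetic stand-in data: 10 RGB images of 8x8 pixels with imbalanced labels.
raw = rng.integers(0, 256, size=(10, 8, 8, 3), dtype=np.uint8)
labels = np.array(["class_a"] * 3 + ["class_b"] * 7)  # placeholder label names

images = to_gray_unit(raw)
classes, codes = encode_labels(labels)
images, codes = shuffle(images, codes)
images, codes = balance(images, codes)
images, codes = shuffle(images, codes)
x_train, y_train, x_test, y_test = train_test_split(images, codes)
# Step 13 would then export each array, e.g. np.save("x_train.npy", x_train)
```

Shuffling before balancing and again afterwards mirrors the order of the list: the second shuffle matters because undersampling class by class leaves the rows grouped by label.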
If you choose sequential processing, skip the parallel processing files and go straight to modelBuild.py
Parallel_Dataprep_1.py, Parallel_Dataprep_2.py and Parallel_Dataprep_3.py
- Simplified parallel processing equivalent of Seq_Dataprep.py
Parallel_Dataprep_V2.py
- Slightly more complex process.
- Tedious to troubleshoot.
- But gets the job done in one file.
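For readers new to the idea, the general pattern behind parallel data prep can be sketched as: split the work into chunks, process each chunk in a separate process, and recombine the results in order. The sketch below uses the standard library's `concurrent.futures` and synthetic images. It is not taken from the Parallel_Dataprep files, and all function names are hypothetical:

```python
import os
from concurrent.futures import ProcessPoolExecutor

import numpy as np


def split_chunks(items, n_chunks):
    """Split a list into n_chunks contiguous, nearly equal parts."""
    size, extra = divmod(len(items), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        end = start + size + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def preprocess_chunk(chunk):
    """Worker: scale each 0-255 image in the chunk to the 0-1 range."""
    return [img.astype(np.float32) / 255.0 for img in chunk]


def preprocess_parallel(images, workers):
    """Fan chunks out to worker processes and stitch the results back in order."""
    chunks = split_chunks(images, workers)
    with ProcessPoolExecutor(max_workers=workers) as pool:
        results = pool.map(preprocess_chunk, chunks)  # map preserves chunk order
        return [img for part in results for img in part]


if __name__ == "__main__":
    # Synthetic stand-in for images that were already read from disk
    images = [np.full((4, 4), i, dtype=np.uint8) for i in range(10)]
    scaled = preprocess_parallel(images, workers=min(4, os.cpu_count() or 1))
    print(len(scaled), scaled[-1].max())
```

The `if __name__ == "__main__":` guard is needed on platforms where worker processes re-import the main module; without it, each worker would try to start its own pool.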
modelBuild.py
1. Concatenated model, combining an ANN from tabular data and a CNN from image data
2. Uses the TensorFlow callbacks TensorBoard and ModelCheckpoint
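A minimal sketch of such a two-branch model with the Keras functional API. The layer sizes, input shapes, number of classes, log directory and checkpoint file name are assumptions for illustration and do not reflect the actual architecture in modelBuild.py:

```python
import tensorflow as tf
from tensorflow.keras import layers


def build_model(n_tabular=4, image_shape=(64, 64, 1), n_classes=2):
    """Two-input model: dense branch for tabular data, conv branch for images."""
    # ANN branch for the tabular features
    tab_in = layers.Input(shape=(n_tabular,), name="tabular")
    t = layers.Dense(16, activation="relu")(tab_in)

    # CNN branch for the grayscale images
    img_in = layers.Input(shape=image_shape, name="image")
    c = layers.Conv2D(8, 3, activation="relu")(img_in)
    c = layers.MaxPooling2D()(c)
    c = layers.Flatten()(c)

    # Concatenate both branches and classify
    merged = layers.Concatenate()([t, c])
    out = layers.Dense(n_classes, activation="softmax")(merged)

    model = tf.keras.Model(inputs=[tab_in, img_in], outputs=out)
    model.compile(
        optimizer="adam",
        loss="sparse_categorical_crossentropy",  # expects integer labels
        metrics=["accuracy"],
    )
    return model


callbacks = [
    tf.keras.callbacks.TensorBoard(log_dir="logs"),
    tf.keras.callbacks.ModelCheckpoint(
        "best_model.keras", monitor="val_loss", save_best_only=True
    ),
]

model = build_model()
# Training would then look like this, with x_tab, x_img and y loaded from the
# exported NumPy arrays:
# model.fit([x_tab, x_img], y, validation_split=0.2, epochs=10, callbacks=callbacks)
```

With `save_best_only=True`, ModelCheckpoint overwrites the saved file only when the monitored validation loss improves, and the TensorBoard logs can be viewed with `tensorboard --logdir logs`.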