# CVR-Net: A deep convolutional neural network for coronavirus recognition from chest radiography images
**Authors:** Md. Kamrul Hasan and Md. Ashraful Alam


Corresponding author: <br>
**Md. Kamrul Hasan**  <br>
Erasmus Scholar [2017-2019] <br>
M.Sc. in Medical Imaging and Applications ([MAIA](https://maiamaster.udg.edu/)) <br>
& <br>
Assistant Professor <br>
Department of Electrical and Electronic Engineering (EEE) <br>
Khulna University of Engineering & Technology (KUET) <br>
Khulna-9203, Bangladesh <br>


E-mail: kamruleeekuet@gmail.com or m.k.hasan@eee.kuet.ac.bd<br>
Google Scholar: https://scholar.google.com/citations?user=36WXELIAAAAJ&hl=en


**Md. Ashraful Alam**  <br>
E-mail: ashrafulalam16e@gmail.com  <br>
GitHub: https://github.com/ashraful16


### Overview of this notebook

Before the data can be folded for training, validation, and testing, the images
from one or more data sources must first be merged according to their class labels.
That is, the merged dataset has to be separated class-wise, with each class saved in its own folder.
This notebook is then run once per class; remember to change the source
and destination directories to match the class. Once the folded data is prepared, it can be used for the training, validation, and testing of your experiments. We merged multiple datasets from [Kaggle](https://www.kaggle.com/paultimothymooney/chest-xray-pneumonia), [GitHub](https://github.com/ieee8023/covid-chestxray-dataset), and the [MICCAI grand challenge](https://covid-ct.grand-challenge.org/); they are described in the following table.

| Datasets  | Task types      | Class categories                 | No. of images        |
|-----------|-----------------|----------------------------------|----------------------|
| Dataset-1 | Task-1: 2-class | Normal (NOR)                     | 5,856                |
|           |                 | Novel Corona Positive (NCP)      | 500                  |
|           | Task-2: 3-class | Normal (NOR)                     | 1,583                |
|           |                 | Common Pneumonia (CPN)           | 4,273                |
|           |                 | Novel Corona Positive (NCP)      | 500                  |
|           | Task-3: 4-class | Normal (NOR)                     | 1,583                |
|           |                 | Common Pneumonia Bacterial (CPB) | 2,780                |
|           |                 | Common Pneumonia Viral (CPV)     | 1,493                |
|           |                 | Novel Corona Positive (NCP)      | 500                  |
| Dataset-2 | Task-4: 3-class | Normal (NOR)                     | 1,648                |
|           |                 | Common Pneumonia (CPN)           | 4,371                |
|           |                 | Novel Corona Positive (NCP)      | 500                  |
| Dataset-3 | Task-5: 2-class | Normal (NOR)                     | Train/Test = 292/105 |
|           |                 | Novel Corona Positive (NCP)      | Train/Test = 251/98  |
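The class-wise merge that has to happen before folding could be sketched as follows. This is only an illustration: the helper name `merge_class_images` is hypothetical, and it assumes that every source keeps the images of a class in a sub-folder named after the class label (for example `02_CPN`). File names are prefixed with the source folder name so that identically named images from different sources do not overwrite each other.

```python
import shutil
from pathlib import Path


def merge_class_images(source_dirs, class_name, merged_root):
    """Copy every file of one class from several sources into one folder.

    Assumption (hypothetical layout): each source stores the class in a
    sub-folder named after the class label, e.g. <source>/02_CPN/.
    Returns the number of copied files.
    """
    target = Path(merged_root) / class_name
    target.mkdir(parents=True, exist_ok=True)
    copied = 0
    for source in map(Path, source_dirs):
        class_dir = source / class_name
        if not class_dir.is_dir():
            continue  # this source does not contain the class
        for image_path in sorted(class_dir.iterdir()):
            if image_path.is_file():
                # Prefix with the source name so equal file names do not clash.
                new_name = f"{source.name}_{image_path.name}"
                shutil.copy(image_path, target / new_name)
                copied += 1
    return copied
```

The resulting `<merged_root>/<class_name>` folder can then serve as the raw class directory for the folding steps.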

### Loading of different packages and APIs

In [None]:
import glob
import os
import shutil

import numpy as np

### Define Source and Destination Directory

In [None]:
######################## define the directory for the folded training data ########################
desination_dir=os.getcwd()+"\\Folded_CPN"         #  class-wise directory for the folded data
os.makedirs(desination_dir, exist_ok=True)        #  do not fail if the directory already exists
######################## define the class-wise raw data directory #################################

#  Target classes for the different classification tasks:
#      2-class: 01_NOR, 02_NCP
#      3-class: 01_NOR, 02_CPN, 03_NCP
#      4-class: 01_NOR, 02_CPB, 03_CPV, 04_NCP

train_class_dir="\\train\\02_CPN"                 #  define classwise raw data directory 
base_dir=os.getcwd()+train_class_dir
base_dir

'C:\\Users\\ml\\Desktop\\covid_temp_data\\data\\train\\02_CPN'
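The hard-coded `\\` separators above only work on Windows. As a hedged alternative, the same two directories can be assembled with `os.path.join`, which picks the separator of the current platform. The helper name `class_paths` and its `root` parameter are hypothetical; the `train/<class>` and `Folded_<class>` layout follows the notebook.

```python
import os


def class_paths(class_label, root=None):
    """Build the raw-data and fold directories for one class label.

    `class_label` is expected to look like "02_CPN"; the numeric prefix is
    dropped for the destination folder, giving "Folded_CPN".
    """
    root = os.getcwd() if root is None else root
    raw_dir = os.path.join(root, "train", class_label)
    folded_dir = os.path.join(root, "Folded_" + class_label.split("_", 1)[1])
    return raw_dir, folded_dir


raw_dir, folded_dir = class_paths("02_CPN", root="data")
print(raw_dir)
print(folded_dir)
```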

### Define the file pattern and count the matching files

In [None]:
##########  define the pattern that matches every file directly inside the class directory  ##########
im_pat= base_dir+'\\*.*'                          #  recursive=True only matters for patterns containing '**'

total_images =len(glob.glob(im_pat, recursive=True))
print('Total images found in this class:',total_images)

Total images found in this class: 4273
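The pattern `*.*` matches every file whose name contains a dot, so stray non-image files in the class folder would be counted and folded as well. A minimal sketch of a stricter listing is given below; the helper `list_images` and the extension set are assumptions and should be adapted to the files actually present in the dataset.

```python
import glob
import os

# Assumed set of image extensions; adjust it to the files in your dataset.
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg"}


def list_images(folder):
    """Return the sorted image paths found directly inside `folder`."""
    paths = glob.glob(os.path.join(folder, "*"))
    return sorted(
        p for p in paths
        if os.path.isfile(p)
        and os.path.splitext(p)[1].lower() in IMAGE_EXTENSIONS
    )
```

Sorting the result makes the list independent of the order in which the file system returns the entries.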


### Shuffle data 

In [None]:

#########################  collect the path of every image in a list  ############################
image_names=glob.glob(im_pat, recursive=True)

### Create a list of all image indices and shuffle it to ensure a random selection of images ####
shuffle_indx=list(range(len(image_names)))

np.random.shuffle(shuffle_indx)
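Because the shuffle above is unseeded, each run produces different folds. If reproducible folds are wanted, one option is a seeded NumPy generator, as in the sketch below. The function name `shuffled_indices` and the seed value are illustrative choices, not part of the original notebook.

```python
import numpy as np


def shuffled_indices(n_items, seed=42):
    """Return a reproducible random permutation of 0 .. n_items-1."""
    rng = np.random.default_rng(seed)  # the seed value is an arbitrary choice
    return rng.permutation(n_items).tolist()


# The same seed always yields the same order.
assert shuffled_indices(10) == shuffled_indices(10)
```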

In [None]:
############# Find how many extra images remain when the data is folded into five  ####################
a=len(shuffle_indx)
i=a%5                                         # number of images left over after five equal folds

print(i)

3
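The leftover count is simply the remainder of dividing the number of images by five. For the 4,273 CPN images, `divmod` gives both the base fold size and the remainder in one step (the variable names here are illustrative):

```python
n_images = 4273  # number of CPN images reported above
fold_size, extra = divmod(n_images, 5)
print(fold_size, extra)  # 854 3

# Sanity check: five equal folds plus the leftovers cover every image.
assert 5 * fold_size + extra == n_images
```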


### Slice list into five folds

In [None]:
########################### define the list index range of every fold ##################################
fold_size=len(shuffle_indx)//5
sp1=fold_size+i                               # any extra images go to the first fold
sp2=fold_size+sp1
sp3=fold_size+sp2
sp4=fold_size+sp3

########################### Slice the index list according to the fold ranges #########################
split1=shuffle_indx[:sp1]
split2=shuffle_indx[sp1:sp2]
split3=shuffle_indx[sp2:sp3]
split4=shuffle_indx[sp3:sp4]
split5=shuffle_indx[sp4:]

In [None]:
print(len(split1),len(split2),len(split3),len(split4),len(split5)) # print the number of images in each fold

857 854 854 854 854
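The notebook assigns all leftover images to the first fold. A different option, shown here only as a sketch and not as the authors' method, is `numpy.array_split`, which spreads the leftovers so that fold sizes differ by at most one. The helper name `balanced_folds` is hypothetical.

```python
import numpy as np


def balanced_folds(indices, n_folds=5):
    """Split `indices` into n_folds parts whose sizes differ by at most one."""
    parts = np.array_split(np.asarray(indices), n_folds)
    return [part.tolist() for part in parts]


folds = balanced_folds(list(range(4273)))
print([len(f) for f in folds])  # [855, 855, 855, 854, 854]
```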

### Save every fold to the respective directory

#### Fold 1

In [None]:
################ Copy the images selected for the first fold into the Fold_1 directory ############
des_dir=desination_dir+'\\Fold_1\\'
os.makedirs(des_dir, exist_ok=True)           # create the fold directory if it does not exist yet
for idx in split1:
    shutil.copy(image_names[idx], des_dir)


#### Fold 2

In [None]:
################ Copy the images selected for the second fold into the Fold_2 directory ############
des_dir=desination_dir+'\\Fold_2\\'
os.makedirs(des_dir, exist_ok=True)           # create the fold directory if it does not exist yet
for idx in split2:
    shutil.copy(image_names[idx], des_dir)


#### Fold 3

In [None]:
################ Copy the images selected for the third fold into the Fold_3 directory #############
des_dir=desination_dir+'\\Fold_3\\'
os.makedirs(des_dir, exist_ok=True)           # create the fold directory if it does not exist yet
for idx in split3:
    shutil.copy(image_names[idx], des_dir)


#### Fold 4

In [None]:
################ Copy the images selected for the fourth fold into the Fold_4 directory ############
des_dir=desination_dir+'\\Fold_4\\'
os.makedirs(des_dir, exist_ok=True)           # create the fold directory if it does not exist yet
for idx in split4:
    shutil.copy(image_names[idx], des_dir)


#### Fold 5

In [None]:
################ Copy the images selected for the fifth fold into the Fold_5 directory #############
des_dir=desination_dir+'\\Fold_5\\'
os.makedirs(des_dir, exist_ok=True)           # create the fold directory if it does not exist yet
for idx in split5:
    shutil.copy(image_names[idx], des_dir)

print("Data is divided into 5 folds successfully.")

Data is divided into 5 folds successfully.
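Once every class has been folded, the `Fold_1` to `Fold_5` directories can be rotated to build training, validation, and test sets. The sketch below shows one common rotation and is not necessarily the protocol used in the paper: one fold is held out for testing, the next fold (an assumed choice) is used for validation, and the remaining three are used for training. The helper names `fold_files` and `rotate_folds` are hypothetical.

```python
import glob
import os


def fold_files(folded_dir, fold_number):
    """List the files stored in Fold_<fold_number> of one class directory."""
    pattern = os.path.join(folded_dir, f"Fold_{fold_number}", "*")
    return sorted(glob.glob(pattern))


def rotate_folds(folded_dir, test_fold, n_folds=5):
    """Assign each fold to the train, val, or test split for one rotation."""
    val_fold = test_fold % n_folds + 1  # assumed choice: the next fold validates
    splits = {"train": [], "val": [], "test": []}
    for k in range(1, n_folds + 1):
        if k == test_fold:
            name = "test"
        elif k == val_fold:
            name = "val"
        else:
            name = "train"
        splits[name].extend(fold_files(folded_dir, k))
    return splits
```

Calling `rotate_folds(folded_dir, test_fold)` for `test_fold` from 1 to 5 gives the five rotations of a 5-fold cross-validation for that class.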
