# Lung X-Ray Database learning - ANN Classification

## Based on two sources
(1) https://www.kaggle.com/tawsifurrahman/covid19-radiography-database  
(2) https://www.kaggle.com/paultimothymooney/chest-xray-pneumonia

### (1) Chest x-ray (256x256) images of:
- 3616 COVID-positive patients
- 10,192 "normal" (healthy) patients
- 6012 patients with lung opacity (non-COVID lung infection)
- 1345 viral pneumonia patients
### (2) Chest x-ray images (varying sizes, grouped into test/train/validation) of:
- 234 + 1341 + 8 "normal" patients
- 390 + 3875 + 8 (bacterial/viral) pneumonia patients

Note: 1341 of the "normal" images in set (1) were taken from source (2) (see readme.md.txt for the detailed image sources).

# 1. Data preparation

Both datasets shall be used. Four problems arise when combining them.  

### Problem 1:
Group 3 of set (1) (lung opacity, non-COVID) does not specify the cause of the infection, whereas set (2) distinguishes between viral and bacterial pneumonia (lung inflammation).  
Two possible variants:  
1. Merge the viral samples from sets (1) and (2). Merge the bacterial samples (2) with the non-COVID samples (1).  
1. Merge the viral samples from sets (1) and (2) and keep all bacterial samples (2) as their own group. Omit the non-COVID samples (1).  

As I would prefer to work with known variables, **I will for now go with variant 2**.  

The four categories will therefore be:
1. healthy/normal
1. bacterial pneumonia
1. viral pneumonia
1. COVID-19
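
The category scheme above can be written down as a small lookup table. The following is only a sketch: the folder names follow the paths used in this notebook, while the integer label values and the names `CATEGORIES`, `SOURCE_TO_CATEGORY` and `label_for` are assumptions made for illustration.

```python
# Hypothetical label scheme for variant 2; the integer values are an assumption.
CATEGORIES = {
    "normal": 0,
    "bacterial_pneumonia": 1,
    "viral_pneumonia": 2,
    "covid": 3,
}

# (dataset, folder) -> category; folder names follow the paths used in this notebook.
SOURCE_TO_CATEGORY = {
    ("set1", "Normal"): "normal",
    ("set1", "Viral Pneumonia"): "viral_pneumonia",
    ("set1", "COVID"): "covid",
    ("set2", "NORMAL"): "normal",
    ("set2", "PNEUMONIA_BAC"): "bacterial_pneumonia",
    ("set2", "PNEUMONIA_VIR"): "viral_pneumonia",
}

def label_for(dataset, folder):
    """Return the integer label, or None for folders that variant 2 omits."""
    category = SOURCE_TO_CATEGORY.get((dataset, folder))
    return None if category is None else CATEGORIES[category]
```

Any folder that is not listed, such as the non-COVID lung opacity group of set (1), maps to `None` and can be skipped when the data is merged.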

### Problem 2:
Image sizes differ between data sets.  
Set (1) mostly uses normalised sizes of 256x256. Set (2) uses larger, varying sizes.  

**Set (2) and (1) shall be normalised to fit 256x256.**  

### Problem 3:
Only set (2) is split up into training, testing and validation data.  

After resizing and merging, the combined data will be split into three groups:  
1. training data (~80% of total)  
1. test data (~15% of total)  
1. validation data (~5% of total)  

Whether the categories are balanced will be assessed once the final number of resized and merged images per category is known.
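
A minimal sketch of such an 80/15/5 split, applied to the file names of one category, could look as follows. The function name `split_files`, the fixed seed and the dummy file names are assumptions for illustration, not part of the original notebook.

```python
import random

def split_files(files, train=0.80, test=0.15, seed=42):
    """Shuffle a list of file names and split it into train/test/validation parts."""
    files = sorted(files)                 # fixed starting order for reproducibility
    random.Random(seed).shuffle(files)    # seeded shuffle, independent of global state
    n_train = round(len(files) * train)
    n_test = round(len(files) * test)
    return (files[:n_train],
            files[n_train:n_train + n_test],
            files[n_train + n_test:])     # the remainder (~5%) becomes validation data

# Dummy file names standing in for one category's images.
train_set, test_set, val_set = split_files([f"img_{i}.png" for i in range(1000)])
print(len(train_set), len(test_set), len(val_set))  # 800 150 50
```

Splitting each category separately keeps the class proportions the same in all three groups, which makes a later balance check easier.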

### Problem 4:
The volume of data is possibly too high.  

The images are 256x256 grayscale => each image produces 65,536 input values.  
Since this project has to be finished in effectively less than 14 days, working with that many inputs puts a heavy load on the network.

**Idea: create or re-train an autoencoder to reduce the input to a three-digit number of values per image if possible, effectively reducing the data load by a factor of ~80.**  
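
The arithmetic behind these numbers can be checked quickly. The variable names are illustrative, and treating the factor of ~80 as an exact integer divisor is an assumption.

```python
# Number of inputs per 256x256 grayscale image (one value per pixel).
width, height = 256, 256
inputs_per_image = width * height

# Target reduction factor taken from the text (~80).
reduction_factor = 80
encoded_size = inputs_per_image // reduction_factor

print(inputs_per_image)  # 65536
print(encoded_size)      # 819
```

A reduction by a factor of about 80 therefore corresponds to an encoded representation of roughly 800 values per image.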

## Solving Problem 2
Resizing images in set (2)

In [3]:
## resizing images of set (2)

from PIL import Image
import os

input1 = os.path.join("data", "chest_xray", "resize", "NORMAL")
input2 = os.path.join("data", "chest_xray", "resize", "PNEUMONIA")
output = os.path.join("data", "chest_xray", "resize", "OUTPUT")

def resize_jpg(path, path_out, dir_out):
    """Resize every image in `path` to 256x256 and save it as PNG in path_out/dir_out."""
    # create the target folder (and any missing parents) if it does not exist yet
    os.makedirs(os.path.join(path_out, dir_out), exist_ok=True)

    for item in os.listdir(path):
        # strip the file extension (.jpeg in set (2)) instead of slicing a fixed length
        name = os.path.splitext(item)[0]
        im = Image.open(os.path.join(path, item))
        # Image.ANTIALIAS was removed in Pillow 10; Image.LANCZOS is the same filter
        imResize = im.resize((256, 256), Image.LANCZOS)
        print(name)

        # PNG is lossless, so the JPEG-style `quality` option is dropped
        imResize.save(os.path.join(path_out, dir_out, name + '_re.png'), 'PNG')


In [34]:
resize_jpg(input1, output, "NORMAL")

IM-0001-0001
IM-0003-0001
IM-0005-0001
IM-0006-0001
IM-0007-0001
...

In [36]:
resize_jpg(input2, output, "PNEUMONIA")

person1000_bacteria_2931
person1000_virus_1681
person1001_bacteria_2932
person1002_bacteria_2933
person1003_bacteria_2934
...

In [7]:
# separate files in the PNEUMONIA folder by whether the filename contains "bacteria" or "virus"
dirs = os.path.join(output, "PNEUMONIA")
bac = "bacteria"
vir = "virus"

# create the target folders first; os.rename fails if they do not exist
os.makedirs(os.path.join(output, "PNEUMONIA_BAC"), exist_ok=True)
os.makedirs(os.path.join(output, "PNEUMONIA_VIR"), exist_ok=True)

print(dirs)
for item in os.listdir(dirs):
    if bac in item:
        os.rename(os.path.join(dirs, item), os.path.join(output, "PNEUMONIA_BAC", item))
    elif vir in item:
        os.rename(os.path.join(dirs, item), os.path.join(output, "PNEUMONIA_VIR", item))

data\chest_xray\resize\OUTPUT\PNEUMONIA
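
The filename rule used for the separation can also be checked without touching any files. The sketch below is an illustration only: the helper `pneumonia_kind` is hypothetical, and the sample names merely follow the naming pattern of set (2).

```python
from collections import Counter

def pneumonia_kind(filename):
    """Classify a set (2) file name as 'bacteria', 'virus', or None if neither appears."""
    for kind in ("bacteria", "virus"):
        if kind in filename:
            return kind
    return None

# File names in the naming style of the resized set (2) pneumonia images.
sample = ["person1000_bacteria_2931_re.png",
          "person1000_virus_1681_re.png",
          "person1001_bacteria_2932_re.png"]
counts = Counter(pneumonia_kind(name) for name in sample)
print(counts["bacteria"], counts["virus"])  # 2 1
```

Counting the matches per kind in this way gives a first impression of how balanced the bacterial and viral groups are.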


Resizing images in set (1)

In [4]:
output_path_set1 = os.path.join("data", "COVID-19_Radiography_Dataset", "RESIZE")

input_set1_normal = os.path.join("data", "COVID-19_Radiography_Dataset", "Normal")
input_set1_virpneu = os.path.join("data", "COVID-19_Radiography_Dataset", "Viral Pneumonia")
input_set1_covid = os.path.join("data", "COVID-19_Radiography_Dataset", "COVID")

def resize_png(path, path_out, dir_out):
    """Resize every image in `path` to 256x256 and save it as PNG in path_out/dir_out."""
    # create the target folder (and any missing parents) if it does not exist yet
    os.makedirs(os.path.join(path_out, dir_out), exist_ok=True)

    for item in os.listdir(path):
        # strip the file extension (.png in set (1)) instead of slicing a fixed length
        name = os.path.splitext(item)[0]
        im = Image.open(os.path.join(path, item))
        # Image.ANTIALIAS was removed in Pillow 10; Image.LANCZOS is the same filter
        imResize = im.resize((256, 256), Image.LANCZOS)
        print(name)

        # PNG is lossless, so the JPEG-style `quality` option is dropped
        imResize.save(os.path.join(path_out, dir_out, name + '_re.png'), 'PNG')

In [60]:
resize_png(input_set1_normal,output_path_set1,"Normal")

Normal-1
Normal-10
Normal-100
Normal-1000
Normal-10000
Normal-10001
Normal-10002
Normal-10003
Normal-10004
Normal-10005
Normal-10006
Normal-10007
Normal-10008
Normal-10009
Normal-1001
Normal-10010
Normal-10011
Normal-10012
Normal-10013
Normal-10014
Normal-10015
Normal-10016
Normal-10017
Normal-10018
Normal-10019
Normal-1002
Normal-10020
Normal-10021
Normal-10022
Normal-10023
Normal-10024
Normal-10025
Normal-10026
Normal-10027
Normal-10028
Normal-10029
Normal-1003
Normal-10030
Normal-10031
Normal-10032
Normal-10033
Normal-10034
Normal-10035
Normal-10036
Normal-10037
Normal-10038
Normal-10039
Normal-1004
Normal-10040
Normal-10041
Normal-10042
Normal-10043
Normal-10044
Normal-10045
Normal-10046
Normal-10047
Normal-10048
Normal-10049
Normal-1005
Normal-10050
Normal-10051
Normal-10052
Normal-10053
Normal-10054
Normal-10055
Normal-10056
Normal-10057
Normal-10058
Normal-10059
Normal-1006
Normal-10060
Normal-10061
Normal-10062
Normal-10063
Normal-10064
Normal-10065
Normal-10066
Normal-10067
No

Normal-144
Normal-1440
Normal-1441
Normal-1442
Normal-1443
Normal-1444
Normal-1445
Normal-1446
Normal-1447
Normal-1448
Normal-1449
Normal-145
Normal-1450
Normal-1451
Normal-1452
Normal-1453
Normal-1454
Normal-1455
Normal-1456
Normal-1457
Normal-1458
Normal-1459
Normal-146
Normal-1460
Normal-1461
Normal-1462
Normal-1463
Normal-1464
Normal-1465
Normal-1466
Normal-1467
Normal-1468
Normal-1469
Normal-147
Normal-1470
Normal-1471
Normal-1472
Normal-1473
Normal-1474
Normal-1475
Normal-1476
Normal-1477
Normal-1478
Normal-1479
Normal-148
Normal-1480
Normal-1481
Normal-1482
Normal-1483
Normal-1484
Normal-1485
Normal-1486
Normal-1487
Normal-1488
Normal-1489
Normal-149
Normal-1490
Normal-1491
Normal-1492
Normal-1493
Normal-1494
Normal-1495
Normal-1496
Normal-1497
Normal-1498
Normal-1499
Normal-15
Normal-150
Normal-1500
Normal-1501
Normal-1502
Normal-1503
Normal-1504
Normal-1505
Normal-1506
Normal-1507
Normal-1508
Normal-1509
Normal-151
Normal-1510
Normal-1511
Normal-1512
Normal-1513
Normal-1514
No

Normal-2069
Normal-207
Normal-2070
Normal-2071
Normal-2072
Normal-2073
Normal-2074
Normal-2075
Normal-2076
Normal-2077
Normal-2078
Normal-2079
Normal-208
Normal-2080
Normal-2081
Normal-2082
Normal-2083
Normal-2084
Normal-2085
Normal-2086
Normal-2087
Normal-2088
Normal-2089
Normal-209
Normal-2090
Normal-2091
Normal-2092
Normal-2093
Normal-2094
Normal-2095
Normal-2096
Normal-2097
Normal-2098
Normal-2099
Normal-21
Normal-210
Normal-2100
Normal-2101
Normal-2102
Normal-2103
Normal-2104
Normal-2105
Normal-2106
Normal-2107
Normal-2108
Normal-2109
Normal-211
Normal-2110
Normal-2111
Normal-2112
Normal-2113
Normal-2114
Normal-2115
Normal-2116
Normal-2117
Normal-2118
Normal-2119
Normal-212
Normal-2120
Normal-2121
Normal-2122
Normal-2123
Normal-2124
Normal-2125
Normal-2126
Normal-2127
Normal-2128
Normal-2129
Normal-213
Normal-2130
Normal-2131
Normal-2132
Normal-2133
Normal-2134
Normal-2135
Normal-2136
Normal-2137
Normal-2138
Normal-2139
Normal-214
Normal-2140
Normal-2141
Normal-2142
Normal-2143
No

Normal-27
Normal-270
Normal-2700
Normal-2701
Normal-2702
Normal-2703
Normal-2704
Normal-2705
Normal-2706
Normal-2707
Normal-2708
Normal-2709
Normal-271
Normal-2710
Normal-2711
Normal-2712
Normal-2713
Normal-2714
Normal-2715
Normal-2716
Normal-2717
Normal-2718
Normal-2719
Normal-272
Normal-2720
Normal-2721
Normal-2722
Normal-2723
Normal-2724
Normal-2725
Normal-2726
Normal-2727
Normal-2728
Normal-2729
Normal-273
Normal-2730
Normal-2731
Normal-2732
Normal-2733
Normal-2734
Normal-2735
Normal-2736
Normal-2737
Normal-2738
Normal-2739
Normal-274
Normal-2740
Normal-2741
Normal-2742
Normal-2743
Normal-2744
Normal-2745
Normal-2746
Normal-2747
Normal-2748
Normal-2749
Normal-275
Normal-2750
Normal-2751
Normal-2752
Normal-2753
Normal-2754
Normal-2755
Normal-2756
Normal-2757
Normal-2758
Normal-2759
Normal-276
Normal-2760
Normal-2761
Normal-2762
Normal-2763
Normal-2764
Normal-2765
Normal-2766
Normal-2767
Normal-2768
Normal-2769
Normal-277
Normal-2770
Normal-2771
Normal-2772
Normal-2773
Normal-2774
No

Normal-3321
Normal-3322
Normal-3323
Normal-3324
Normal-3325
Normal-3326
Normal-3327
Normal-3328
Normal-3329
Normal-333
Normal-3330
Normal-3331
Normal-3332
Normal-3333
Normal-3334
Normal-3335
Normal-3336
Normal-3337
Normal-3338
Normal-3339
Normal-334
Normal-3340
Normal-3341
Normal-3342
Normal-3343
Normal-3344
Normal-3345
Normal-3346
Normal-3347
Normal-3348
Normal-3349
Normal-335
Normal-3350
Normal-3351
Normal-3352
Normal-3353
Normal-3354
Normal-3355
Normal-3356
Normal-3357
Normal-3358
Normal-3359
Normal-336
Normal-3360
Normal-3361
Normal-3362
Normal-3363
Normal-3364
Normal-3365
Normal-3366
Normal-3367
Normal-3368
Normal-3369
Normal-337
Normal-3370
Normal-3371
Normal-3372
Normal-3373
Normal-3374
Normal-3375
Normal-3376
Normal-3377
Normal-3378
Normal-3379
Normal-338
Normal-3380
Normal-3381
Normal-3382
Normal-3383
Normal-3384
Normal-3385
Normal-3386
Normal-3387
Normal-3388
Normal-3389
Normal-339
Normal-3390
Normal-3391
Normal-3392
Normal-3393
Normal-3394
Normal-3395
Normal-3396
Normal-3397

Normal-3947
Normal-3948
Normal-3949
Normal-395
Normal-3950
Normal-3951
Normal-3952
Normal-3953
Normal-3954
Normal-3955
Normal-3956
Normal-3957
Normal-3958
Normal-3959
Normal-396
Normal-3960
Normal-3961
Normal-3962
Normal-3963
Normal-3964
Normal-3965
Normal-3966
Normal-3967
Normal-3968
Normal-3969
Normal-397
Normal-3970
Normal-3971
Normal-3972
Normal-3973
Normal-3974
Normal-3975
Normal-3976
Normal-3977
Normal-3978
Normal-3979
Normal-398
Normal-3980
Normal-3981
Normal-3982
Normal-3983
Normal-3984
Normal-3985
Normal-3986
Normal-3987
Normal-3988
Normal-3989
Normal-399
Normal-3990
Normal-3991
Normal-3992
Normal-3993
Normal-3994
Normal-3995
Normal-3996
Normal-3997
Normal-3998
Normal-3999
Normal-4
Normal-40
Normal-400
Normal-4000
Normal-4001
Normal-4002
Normal-4003
Normal-4004
Normal-4005
Normal-4006
Normal-4007
Normal-4008
Normal-4009
Normal-401
Normal-4010
Normal-4011
Normal-4012
Normal-4013
Normal-4014
Normal-4015
Normal-4016
Normal-4017
Normal-4018
Normal-4019
Normal-402
Normal-4020
Norma

Normal-4577
Normal-4578
Normal-4579
Normal-458
Normal-4580
Normal-4581
Normal-4582
Normal-4583
Normal-4584
Normal-4585
Normal-4586
Normal-4587
Normal-4588
Normal-4589
Normal-459
Normal-4590
Normal-4591
Normal-4592
Normal-4593
Normal-4594
Normal-4595
Normal-4596
Normal-4597
Normal-4598
Normal-4599
Normal-46
Normal-460
Normal-4600
Normal-4601
Normal-4602
Normal-4603
Normal-4604
Normal-4605
Normal-4606
Normal-4607
Normal-4608
Normal-4609
Normal-461
Normal-4610
Normal-4611
Normal-4612
Normal-4613
Normal-4614
Normal-4615
Normal-4616
Normal-4617
Normal-4618
Normal-4619
Normal-462
Normal-4620
Normal-4621
Normal-4622
Normal-4623
Normal-4624
Normal-4625
Normal-4626
Normal-4627
Normal-4628
Normal-4629
Normal-463
Normal-4630
Normal-4631
Normal-4632
Normal-4633
Normal-4634
Normal-4635
Normal-4636
Normal-4637
Normal-4638
Normal-4639
Normal-464
Normal-4640
Normal-4641
Normal-4642
Normal-4643
Normal-4644
Normal-4645
Normal-4646
Normal-4647
Normal-4648
Normal-4649
Normal-465
Normal-4650
Normal-4651
No

Normal-5202
Normal-5203
Normal-5204
Normal-5205
Normal-5206
Normal-5207
Normal-5208
Normal-5209
Normal-521
Normal-5210
Normal-5211
Normal-5212
Normal-5213
Normal-5214
Normal-5215
Normal-5216
Normal-5217
Normal-5218
Normal-5219
Normal-522
Normal-5220
Normal-5221
Normal-5222
Normal-5223
Normal-5224
Normal-5225
Normal-5226
Normal-5227
Normal-5228
Normal-5229
Normal-523
Normal-5230
Normal-5231
Normal-5232
Normal-5233
Normal-5234
Normal-5235
Normal-5236
Normal-5237
Normal-5238
Normal-5239
Normal-524
Normal-5240
Normal-5241
Normal-5242
Normal-5243
Normal-5244
Normal-5245
Normal-5246
Normal-5247
Normal-5248
Normal-5249
Normal-525
Normal-5250
Normal-5251
Normal-5252
Normal-5253
Normal-5254
Normal-5255
Normal-5256
Normal-5257
Normal-5258
Normal-5259
Normal-526
Normal-5260
Normal-5261
Normal-5262
Normal-5263
Normal-5264
Normal-5265
Normal-5266
Normal-5267
Normal-5268
Normal-5269
Normal-527
Normal-5270
Normal-5271
Normal-5272
Normal-5273
Normal-5274
Normal-5275
Normal-5276
Normal-5277
Normal-5278

Normal-5831
Normal-5832
Normal-5833
Normal-5834
Normal-5835
Normal-5836
Normal-5837
Normal-5838
Normal-5839
Normal-584
Normal-5840
Normal-5841
Normal-5842
Normal-5843
Normal-5844
Normal-5845
Normal-5846
Normal-5847
Normal-5848
Normal-5849
Normal-585
Normal-5850
Normal-5851
Normal-5852
Normal-5853
Normal-5854
Normal-5855
Normal-5856
Normal-5857
Normal-5858
Normal-5859
Normal-586
Normal-5860
Normal-5861
Normal-5862
Normal-5863
Normal-5864
Normal-5865
Normal-5866
Normal-5867
Normal-5868
Normal-5869
Normal-587
Normal-5870
Normal-5871
Normal-5872
Normal-5873
Normal-5874
Normal-5875
Normal-5876
Normal-5877
Normal-5878
Normal-5879
Normal-588
Normal-5880
Normal-5881
Normal-5882
Normal-5883
Normal-5884
Normal-5885
Normal-5886
Normal-5887
Normal-5888
Normal-5889
Normal-589
Normal-5890
Normal-5891
Normal-5892
Normal-5893
Normal-5894
Normal-5895
Normal-5896
Normal-5897
Normal-5898
Normal-5899
Normal-59
Normal-590
Normal-5900
Normal-5901
Normal-5902
Normal-5903
Normal-5904
Normal-5905
Normal-5906
N

Normal-6461
Normal-6462
Normal-6463
Normal-6464
Normal-6465
Normal-6466
Normal-6467
Normal-6468
Normal-6469
Normal-647
Normal-6470
Normal-6471
Normal-6472
Normal-6473
Normal-6474
Normal-6475
Normal-6476
Normal-6477
Normal-6478
Normal-6479
Normal-648
Normal-6480
Normal-6481
Normal-6482
Normal-6483
Normal-6484
Normal-6485
Normal-6486
Normal-6487
Normal-6488
Normal-6489
Normal-649
Normal-6490
Normal-6491
Normal-6492
Normal-6493
Normal-6494
Normal-6495
Normal-6496
Normal-6497
Normal-6498
Normal-6499
Normal-65
Normal-650
Normal-6500
Normal-6501
Normal-6502
Normal-6503
Normal-6504
Normal-6505
Normal-6506
Normal-6507
Normal-6508
Normal-6509
Normal-651
Normal-6510
Normal-6511
Normal-6512
Normal-6513
Normal-6514
Normal-6515
Normal-6516
Normal-6517
Normal-6518
Normal-6519
Normal-652
Normal-6520
Normal-6521
Normal-6522
Normal-6523
Normal-6524
Normal-6525
Normal-6526
Normal-6527
Normal-6528
Normal-6529
Normal-653
Normal-6530
Normal-6531
Normal-6532
Normal-6533
Normal-6534
Normal-6535
Normal-6536
N

Normal-7089
Normal-709
Normal-7090
Normal-7091
Normal-7092
Normal-7093
Normal-7094
Normal-7095
Normal-7096
Normal-7097
Normal-7098
Normal-7099
Normal-71
Normal-710
Normal-7100
Normal-7101
Normal-7102
Normal-7103
Normal-7104
Normal-7105
Normal-7106
Normal-7107
Normal-7108
Normal-7109
Normal-711
Normal-7110
Normal-7111
Normal-7112
Normal-7113
Normal-7114
Normal-7115
Normal-7116
Normal-7117
Normal-7118
Normal-7119
Normal-712
Normal-7120
Normal-7121
Normal-7122
Normal-7123
Normal-7124
Normal-7125
Normal-7126
Normal-7127
Normal-7128
Normal-7129
Normal-713
Normal-7130
Normal-7131
Normal-7132
Normal-7133
Normal-7134
Normal-7135
Normal-7136
Normal-7137
Normal-7138
Normal-7139
Normal-714
Normal-7140
Normal-7141
Normal-7142
Normal-7143
Normal-7144
Normal-7145
Normal-7146
Normal-7147
Normal-7148
Normal-7149
Normal-715
Normal-7150
Normal-7151
Normal-7152
Normal-7153
Normal-7154
Normal-7155
Normal-7156
Normal-7157
Normal-7158
Normal-7159
Normal-716
Normal-7160
Normal-7161
Normal-7162
Normal-7163
...

Normal-7710
Normal-7711
Normal-7712
Normal-7713
Normal-7714
Normal-7715
Normal-7716
Normal-7717
Normal-7718
Normal-7719
Normal-772
Normal-7720
Normal-7721
Normal-7722
Normal-7723
Normal-7724
Normal-7725
Normal-7726
Normal-7727
Normal-7728
Normal-7729
Normal-773
Normal-7730
Normal-7731
Normal-7732
Normal-7733
Normal-7734
Normal-7735
Normal-7736
Normal-7737
Normal-7738
Normal-7739
Normal-774
Normal-7740
Normal-7741
Normal-7742
Normal-7743
Normal-7744
Normal-7745
Normal-7746
Normal-7747
Normal-7748
Normal-7749
Normal-775
Normal-7750
Normal-7751
Normal-7752
Normal-7753
Normal-7754
Normal-7755
Normal-7756
Normal-7757
Normal-7758
Normal-7759
Normal-776
Normal-7760
Normal-7761
Normal-7762
Normal-7763
Normal-7764
Normal-7765
Normal-7766
Normal-7767
Normal-7768
Normal-7769
Normal-777
Normal-7770
Normal-7771
Normal-7772
Normal-7773
Normal-7774
Normal-7775
Normal-7776
Normal-7777
Normal-7778
Normal-7779
Normal-778
Normal-7780
Normal-7781
Normal-7782
Normal-7783
Normal-7784
Normal-7785
Normal-7786

Normal-8339
Normal-834
Normal-8340
Normal-8341
Normal-8342
Normal-8343
Normal-8344
Normal-8345
Normal-8346
Normal-8347
Normal-8348
Normal-8349
Normal-835
Normal-8350
Normal-8351
Normal-8352
Normal-8353
Normal-8354
Normal-8355
Normal-8356
Normal-8357
Normal-8358
Normal-8359
Normal-836
Normal-8360
Normal-8361
Normal-8362
Normal-8363
Normal-8364
Normal-8365
Normal-8366
Normal-8367
Normal-8368
Normal-8369
Normal-837
Normal-8370
Normal-8371
Normal-8372
Normal-8373
Normal-8374
Normal-8375
Normal-8376
Normal-8377
Normal-8378
Normal-8379
Normal-838
Normal-8380
Normal-8381
Normal-8382
Normal-8383
Normal-8384
Normal-8385
Normal-8386
Normal-8387
Normal-8388
Normal-8389
Normal-839
Normal-8390
Normal-8391
Normal-8392
Normal-8393
Normal-8394
Normal-8395
Normal-8396
Normal-8397
Normal-8398
Normal-8399
Normal-84
Normal-840
Normal-8400
Normal-8401
Normal-8402
Normal-8403
Normal-8404
Normal-8405
Normal-8406
Normal-8407
Normal-8408
Normal-8409
Normal-841
Normal-8410
Normal-8411
Normal-8412
Normal-8413
...

Normal-8971
Normal-8972
Normal-8973
Normal-8974
Normal-8975
Normal-8976
Normal-8977
Normal-8978
Normal-8979
Normal-898
Normal-8980
Normal-8981
Normal-8982
Normal-8983
Normal-8984
Normal-8985
Normal-8986
Normal-8987
Normal-8988
Normal-8989
Normal-899
Normal-8990
Normal-8991
Normal-8992
Normal-8993
Normal-8994
Normal-8995
Normal-8996
Normal-8997
Normal-8998
Normal-8999
Normal-9
Normal-90
Normal-900
Normal-9000
Normal-9001
Normal-9002
Normal-9003
Normal-9004
Normal-9005
Normal-9006
Normal-9007
Normal-9008
Normal-9009
Normal-901
Normal-9010
Normal-9011
Normal-9012
Normal-9013
Normal-9014
Normal-9015
Normal-9016
Normal-9017
Normal-9018
Normal-9019
Normal-902
Normal-9020
Normal-9021
Normal-9022
Normal-9023
Normal-9024
Normal-9025
Normal-9026
Normal-9027
Normal-9028
Normal-9029
Normal-903
Normal-9030
Normal-9031
Normal-9032
Normal-9033
Normal-9034
Normal-9035
Normal-9036
Normal-9037
Normal-9038
Normal-9039
Normal-904
Normal-9040
Normal-9041
Normal-9042
Normal-9043
Normal-9044
Normal-9045
...

Normal-9599
Normal-96
Normal-960
Normal-9600
Normal-9601
Normal-9602
Normal-9603
Normal-9604
Normal-9605
Normal-9606
Normal-9607
Normal-9608
Normal-9609
Normal-961
Normal-9610
Normal-9611
Normal-9612
Normal-9613
Normal-9614
Normal-9615
Normal-9616
Normal-9617
Normal-9618
Normal-9619
Normal-962
Normal-9620
Normal-9621
Normal-9622
Normal-9623
Normal-9624
Normal-9625
Normal-9626
Normal-9627
Normal-9628
Normal-9629
Normal-963
Normal-9630
Normal-9631
Normal-9632
Normal-9633
Normal-9634
Normal-9635
Normal-9636
Normal-9637
Normal-9638
Normal-9639
Normal-964
Normal-9640
Normal-9641
Normal-9642
Normal-9643
Normal-9644
Normal-9645
Normal-9646
Normal-9647
Normal-9648
Normal-9649
Normal-965
Normal-9650
Normal-9651
Normal-9652
Normal-9653
Normal-9654
Normal-9655
Normal-9656
Normal-9657
Normal-9658
Normal-9659
Normal-966
Normal-9660
Normal-9661
Normal-9662
Normal-9663
Normal-9664
Normal-9665
Normal-9666
Normal-9667
Normal-9668
Normal-9669
Normal-967
Normal-9670
Normal-9671
Normal-9672
Normal-9673
...

In [61]:
resize_png(input_set1_virpneu,output_path_set1,"Viral Pneumonia")

Viral Pneumonia-1
Viral Pneumonia-10
Viral Pneumonia-100
Viral Pneumonia-1000
Viral Pneumonia-1001
Viral Pneumonia-1002
Viral Pneumonia-1003
Viral Pneumonia-1004
Viral Pneumonia-1005
Viral Pneumonia-1006
Viral Pneumonia-1007
Viral Pneumonia-1008
Viral Pneumonia-1009
Viral Pneumonia-101
Viral Pneumonia-1010
Viral Pneumonia-1011
Viral Pneumonia-1012
Viral Pneumonia-1013
Viral Pneumonia-1014
Viral Pneumonia-1015
Viral Pneumonia-1016
Viral Pneumonia-1017
Viral Pneumonia-1018
Viral Pneumonia-1019
Viral Pneumonia-102
Viral Pneumonia-1020
Viral Pneumonia-1021
Viral Pneumonia-1022
Viral Pneumonia-1023
Viral Pneumonia-1024
Viral Pneumonia-1025
Viral Pneumonia-1026
Viral Pneumonia-1027
Viral Pneumonia-1028
Viral Pneumonia-1029
Viral Pneumonia-103
Viral Pneumonia-1030
Viral Pneumonia-1031
Viral Pneumonia-1032
Viral Pneumonia-1033
Viral Pneumonia-1034
Viral Pneumonia-1035
Viral Pneumonia-1036
Viral Pneumonia-1037
Viral Pneumonia-1038
Viral Pneumonia-1039
Viral Pneumonia-104
Viral Pneumonia-1040
...

Viral Pneumonia-149
Viral Pneumonia-15
Viral Pneumonia-150
Viral Pneumonia-151
Viral Pneumonia-152
Viral Pneumonia-153
Viral Pneumonia-154
Viral Pneumonia-155
Viral Pneumonia-156
Viral Pneumonia-157
Viral Pneumonia-158
Viral Pneumonia-159
Viral Pneumonia-16
Viral Pneumonia-160
Viral Pneumonia-161
Viral Pneumonia-162
Viral Pneumonia-163
Viral Pneumonia-164
Viral Pneumonia-165
Viral Pneumonia-166
Viral Pneumonia-167
Viral Pneumonia-168
Viral Pneumonia-169
Viral Pneumonia-17
Viral Pneumonia-170
Viral Pneumonia-171
Viral Pneumonia-172
Viral Pneumonia-173
Viral Pneumonia-174
Viral Pneumonia-175
Viral Pneumonia-176
Viral Pneumonia-177
Viral Pneumonia-178
Viral Pneumonia-179
Viral Pneumonia-18
Viral Pneumonia-180
Viral Pneumonia-181
Viral Pneumonia-182
Viral Pneumonia-183
Viral Pneumonia-184
Viral Pneumonia-185
Viral Pneumonia-186
Viral Pneumonia-187
Viral Pneumonia-188
Viral Pneumonia-189
Viral Pneumonia-19
Viral Pneumonia-190
Viral Pneumonia-191
Viral Pneumonia-192
Viral Pneumonia-193
...

Viral Pneumonia-524
Viral Pneumonia-525
Viral Pneumonia-526
Viral Pneumonia-527
Viral Pneumonia-528
Viral Pneumonia-529
Viral Pneumonia-53
Viral Pneumonia-530
Viral Pneumonia-531
Viral Pneumonia-532
Viral Pneumonia-533
Viral Pneumonia-534
Viral Pneumonia-535
Viral Pneumonia-536
Viral Pneumonia-537
Viral Pneumonia-538
Viral Pneumonia-539
Viral Pneumonia-54
Viral Pneumonia-540
Viral Pneumonia-541
Viral Pneumonia-542
Viral Pneumonia-543
Viral Pneumonia-544
Viral Pneumonia-545
Viral Pneumonia-546
Viral Pneumonia-547
Viral Pneumonia-548
Viral Pneumonia-549
Viral Pneumonia-55
Viral Pneumonia-550
Viral Pneumonia-551
Viral Pneumonia-552
Viral Pneumonia-553
Viral Pneumonia-554
Viral Pneumonia-555
Viral Pneumonia-556
Viral Pneumonia-557
Viral Pneumonia-558
Viral Pneumonia-559
Viral Pneumonia-56
Viral Pneumonia-560
Viral Pneumonia-561
Viral Pneumonia-562
Viral Pneumonia-563
Viral Pneumonia-564
Viral Pneumonia-565
Viral Pneumonia-566
Viral Pneumonia-567
Viral Pneumonia-568
Viral Pneumonia-569
...

Viral Pneumonia-897
Viral Pneumonia-898
Viral Pneumonia-899
Viral Pneumonia-9
Viral Pneumonia-90
Viral Pneumonia-900
Viral Pneumonia-901
Viral Pneumonia-902
Viral Pneumonia-903
Viral Pneumonia-904
Viral Pneumonia-905
Viral Pneumonia-906
Viral Pneumonia-907
Viral Pneumonia-908
Viral Pneumonia-909
Viral Pneumonia-91
Viral Pneumonia-910
Viral Pneumonia-911
Viral Pneumonia-912
Viral Pneumonia-913
Viral Pneumonia-914
Viral Pneumonia-915
Viral Pneumonia-916
Viral Pneumonia-917
Viral Pneumonia-918
Viral Pneumonia-919
Viral Pneumonia-92
Viral Pneumonia-920
Viral Pneumonia-921
Viral Pneumonia-922
Viral Pneumonia-923
Viral Pneumonia-924
Viral Pneumonia-925
Viral Pneumonia-926
Viral Pneumonia-927
Viral Pneumonia-928
Viral Pneumonia-929
Viral Pneumonia-93
Viral Pneumonia-930
Viral Pneumonia-931
Viral Pneumonia-932
Viral Pneumonia-933
Viral Pneumonia-934
Viral Pneumonia-935
Viral Pneumonia-936
Viral Pneumonia-937
Viral Pneumonia-938
Viral Pneumonia-939
Viral Pneumonia-94
Viral Pneumonia-940
...

In [62]:
resize_png(input_set1_covid,output_path_set1,"COVID")

COVID-1
COVID-10
COVID-100
COVID-1000
COVID-1001
COVID-1002
COVID-1003
COVID-1004
COVID-1005
COVID-1006
COVID-1007
COVID-1008
COVID-1009
COVID-101
COVID-1010
COVID-1011
COVID-1012
COVID-1013
COVID-1014
COVID-1015
COVID-1016
COVID-1017
COVID-1018
COVID-1019
COVID-102
COVID-1020
COVID-1021
COVID-1022
COVID-1023
COVID-1024
COVID-1025
COVID-1026
COVID-1027
COVID-1028
COVID-1029
COVID-103
COVID-1030
COVID-1031
COVID-1032
COVID-1033
COVID-1034
COVID-1035
COVID-1036
COVID-1037
COVID-1038
COVID-1039
COVID-104
COVID-1040
COVID-1041
COVID-1042
COVID-1043
COVID-1044
COVID-1045
COVID-1046
COVID-1047
COVID-1048
COVID-1049
COVID-105
COVID-1050
COVID-1051
COVID-1052
COVID-1053
COVID-1054
COVID-1055
COVID-1056
COVID-1057
COVID-1058
COVID-1059
COVID-106
COVID-1060
COVID-1061
COVID-1062
COVID-1063
COVID-1064
COVID-1065
COVID-1066
COVID-1067
COVID-1068
COVID-1069
COVID-107
COVID-1070
COVID-1071
COVID-1072
COVID-1073
COVID-1074
COVID-1075
COVID-1076
COVID-1077
COVID-1078
COVID-1079
COVID-108
COVID-1080
...

COVID-1688
COVID-1689
COVID-169
COVID-1690
COVID-1691
COVID-1692
COVID-1693
COVID-1694
COVID-1695
COVID-1696
COVID-1697
COVID-1698
COVID-1699
COVID-17
COVID-170
COVID-1700
COVID-1701
COVID-1702
COVID-1703
COVID-1704
COVID-1705
COVID-1706
COVID-1707
COVID-1708
COVID-1709
COVID-171
COVID-1710
COVID-1711
COVID-1712
COVID-1713
COVID-1714
COVID-1715
COVID-1716
COVID-1717
COVID-1718
COVID-1719
COVID-172
COVID-1720
COVID-1721
COVID-1722
COVID-1723
COVID-1724
COVID-1725
COVID-1726
COVID-1727
COVID-1728
COVID-1729
COVID-173
COVID-1730
COVID-1731
COVID-1732
COVID-1733
COVID-1734
COVID-1735
COVID-1736
COVID-1737
COVID-1738
COVID-1739
COVID-174
COVID-1740
COVID-1741
COVID-1742
COVID-1743
COVID-1744
COVID-1745
COVID-1746
COVID-1747
COVID-1748
COVID-1749
COVID-175
COVID-1750
COVID-1751
COVID-1752
COVID-1753
COVID-1754
COVID-1755
COVID-1756
COVID-1757
COVID-1758
COVID-1759
COVID-176
COVID-1760
COVID-1761
COVID-1762
COVID-1763
COVID-1764
COVID-1765
COVID-1766
COVID-1767
COVID-1768
COVID-1769
COVID-177

COVID-2376
COVID-2377
COVID-2378
COVID-2379
COVID-238
COVID-2380
COVID-2381
COVID-2382
COVID-2383
COVID-2384
COVID-2385
COVID-2386
COVID-2387
COVID-2388
COVID-2389
COVID-239
COVID-2390
COVID-2391
COVID-2392
COVID-2393
COVID-2394
COVID-2395
COVID-2396
COVID-2397
COVID-2398
COVID-2399
COVID-24
COVID-240
COVID-2400
COVID-2401
COVID-2402
COVID-2403
COVID-2404
COVID-2405
COVID-2406
COVID-2407
COVID-2408
COVID-2409
COVID-241
COVID-2410
COVID-2411
COVID-2412
COVID-2413
COVID-2414
COVID-2415
COVID-2416
COVID-2417
COVID-2418
COVID-2419
COVID-242
COVID-2420
COVID-2421
COVID-2422
COVID-2423
COVID-2424
COVID-2425
COVID-2426
COVID-2427
COVID-2428
COVID-2429
COVID-243
COVID-2430
COVID-2431
COVID-2432
COVID-2433
COVID-2434
COVID-2435
COVID-2436
COVID-2437
COVID-2438
COVID-2439
COVID-244
COVID-2440
COVID-2441
COVID-2442
COVID-2443
COVID-2444
COVID-2445
COVID-2446
COVID-2447
COVID-2448
COVID-2449
COVID-245
COVID-2450
COVID-2451
COVID-2452
COVID-2453
COVID-2454
COVID-2455
COVID-2456
COVID-2457
COVID-245

COVID-3054
COVID-3055
COVID-3056
COVID-3057
COVID-3058
COVID-3059
COVID-306
COVID-3060
COVID-3061
COVID-3062
COVID-3063
COVID-3064
COVID-3065
COVID-3066
COVID-3067
COVID-3068
COVID-3069
COVID-307
COVID-3070
COVID-3071
COVID-3072
COVID-3073
COVID-3074
COVID-3075
COVID-3076
COVID-3077
COVID-3078
COVID-3079
COVID-308
COVID-3080
COVID-3081
COVID-3082
COVID-3083
COVID-3084
COVID-3085
COVID-3086
COVID-3087
COVID-3088
COVID-3089
COVID-309
COVID-3090
COVID-3091
COVID-3092
COVID-3093
COVID-3094
COVID-3095
COVID-3096
COVID-3097
COVID-3098
COVID-3099
COVID-31
COVID-310
COVID-3100
COVID-3101
COVID-3102
COVID-3103
COVID-3104
COVID-3105
COVID-3106
COVID-3107
COVID-3108
COVID-3109
COVID-311
COVID-3110
COVID-3111
COVID-3112
COVID-3113
COVID-3114
COVID-3115
COVID-3116
COVID-3117
COVID-3118
COVID-3119
COVID-312
COVID-3120
COVID-3121
COVID-3122
COVID-3123
COVID-3124
COVID-3125
COVID-3126
COVID-3127
COVID-3128
COVID-3129
COVID-313
COVID-3130
COVID-3131
COVID-3132
COVID-3133
COVID-3134
COVID-3135
COVID-313

COVID-499
COVID-5
COVID-50
COVID-500
COVID-501
COVID-502
COVID-503
COVID-504
COVID-505
COVID-506
COVID-507
COVID-508
COVID-509
COVID-51
COVID-510
COVID-511
COVID-512
COVID-513
COVID-514
COVID-515
COVID-516
COVID-517
COVID-518
COVID-519
COVID-52
COVID-520
COVID-521
COVID-522
COVID-523
COVID-524
COVID-525
COVID-526
COVID-527
COVID-528
COVID-529
COVID-53
COVID-530
COVID-531
COVID-532
COVID-533
COVID-534
COVID-535
COVID-536
COVID-537
COVID-538
COVID-539
COVID-54
COVID-540
COVID-541
COVID-542
COVID-543
COVID-544
COVID-545
COVID-546
COVID-547
COVID-548
COVID-549
COVID-55
COVID-550
COVID-551
COVID-552
COVID-553
COVID-554
COVID-555
COVID-556
COVID-557
COVID-558
COVID-559
COVID-56
COVID-560
COVID-561
COVID-562
COVID-563
COVID-564
COVID-565
COVID-566
COVID-567
COVID-568
COVID-569
COVID-57
COVID-570
COVID-571
COVID-572
COVID-573
COVID-574
COVID-575
COVID-576
COVID-577
COVID-578
COVID-579
COVID-58
COVID-580
COVID-581
COVID-582
COVID-583
COVID-584
COVID-585
COVID-586
COVID-587
COVID-588
COVID-589
...

## Merging datasets

In [19]:
import os
import shutil

# set up source and destination paths
path_dest = os.path.join("data", "DATASET")
path_set1 = os.path.join("data", "COVID-19_Radiography_Dataset", "RESIZE")
path_set2 = os.path.join("data", "chest_xray", "resize", "OUTPUT")

# destination folders, one per category
dest_normal = os.path.join(path_dest, "raw", "HEALTHY")
dest_bacpneu = os.path.join(path_dest, "raw", "BACTERIAL_PNEUMONIA")
dest_virpneu = os.path.join(path_dest, "raw", "VIRAL_PNEUMONIA")
dest_covid = os.path.join(path_dest, "raw", "COVID19")

# source folders of set (1)
src_set1_normal = os.path.join(path_set1, "Normal")
src_set1_virpneu = os.path.join(path_set1, "Viral Pneumonia")
src_set1_covid = os.path.join(path_set1, "COVID")

# source folders of set (2)
# src_set2_normal = os.path.join(path_set2, "NORMAL")  # not copied: set (1) already includes all but 242 of these images
src_set2_bacpneu = os.path.join(path_set2, "PNEUMONIA_BAC")
src_set2_virpneu = os.path.join(path_set2, "PNEUMONIA_VIR")

In [6]:
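# Copy data from the source folders to merge the datasets.
# shutil.copyfile only copies a single file and fails on a folder
# (PermissionError on Windows), so whole folders are copied with shutil.copytree.
# dirs_exist_ok=True (Python 3.8+) allows copying into an existing folder.

# NORMAL dataset
shutil.copytree(src_set1_normal, dest_normal, dirs_exist_ok=True)

# bacterial pneumonia dataset
shutil.copytree(src_set2_bacpneu, dest_bacpneu, dirs_exist_ok=True)

# viral pneumonia dataset (both sets go into the same folder)
shutil.copytree(src_set1_virpneu, dest_virpneu, dirs_exist_ok=True)
shutil.copytree(src_set2_virpneu, dest_virpneu, dirs_exist_ok=True)

# COVID-19 dataset
shutil.copytree(src_set1_covid, dest_covid, dirs_exist_ok=True)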
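
Because two viral pneumonia folders are merged into one destination, files that share a name would overwrite one another without warning. The following is a minimal sketch of a per-file copy that stops on such a collision. `copy_folder_contents` is a hypothetical helper and not part of the notebook.

```python
import os
import shutil


def copy_folder_contents(src_dir, dest_dir):
    """Copy every file in src_dir into dest_dir, refusing to overwrite.

    Hypothetical helper for illustration; returns the number of files copied.
    """
    os.makedirs(dest_dir, exist_ok=True)
    copied = 0
    for name in sorted(os.listdir(src_dir)):
        src_file = os.path.join(src_dir, name)
        if not os.path.isfile(src_file):
            continue  # skip sub-folders; only plain files are merged
        dest_file = os.path.join(dest_dir, name)
        if os.path.exists(dest_file):
            # a file with this name was already copied from another source
            raise FileExistsError(f"{name} already exists in {dest_dir}")
        shutil.copy2(src_file, dest_file)
        copied += 1
    return copied
```

Under this assumption, a call such as `copy_folder_contents(src_set2_virpneu, dest_virpneu)` would raise instead of replacing an image that was copied earlier from set (1).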

In [20]:
# count elements in folders
n_count = len(os.listdir(dest_normal))
print("Healthy images: " + str(n_count))

b_count = len(os.listdir(dest_bacpneu))
print("Bacterial Pneumonia images: " + str(b_count))

v_count = len(os.listdir(dest_virpneu))
print("Viral Pneumonia images: " + str(v_count))

c_count = len(os.listdir(dest_covid))
print("COVID19 images: " + str(c_count))

# expected split sizes for 80% / 15% / 5%
print()
for label, count in [("Healthy", n_count), ("Bacterial", b_count),
                     ("Viral", v_count), ("COVID", c_count)]:
    print(label + " splits: train(" + str(int(count*.8)) + "), test(" +
          str(int(count*.15)) + "), val(" + str(int(count*.05)) + ")")

Healthy images: 10192
Bacterial Pneumonia images: 2780
Viral Pneumonia images: 2838
COVID19 images: 3616

Healthy splits: train(8153), test(1528), val(509)
Bacterial splits: train(2224), test(417), val(139)
Viral splits: train(2270), test(425), val(141)
COVID splits: train(2892), test(542), val(180)
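
Whether the categories are balanced can be read off the counts above. As a small sketch, the snippet below turns the four printed counts into class shares and a largest-to-smallest ratio. The dictionary keys are simply the folder names used earlier, and the numbers are copied from the output rather than recomputed from disk.

```python
# Image counts per class, taken from the output above
counts = {
    "HEALTHY": 10192,
    "BACTERIAL_PNEUMONIA": 2780,
    "VIRAL_PNEUMONIA": 2838,
    "COVID19": 3616,
}

total = sum(counts.values())
shares = {label: count / total for label, count in counts.items()}

for label, share in shares.items():
    print(f"{label}: {share:.1%}")

# ratio between the largest and the smallest class
imbalance_ratio = max(counts.values()) / min(counts.values())
print(f"largest/smallest class ratio: {imbalance_ratio:.2f}")
```

The healthy class accounts for slightly more than half of all 19426 images, so the data is not balanced across the four categories.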


### Create data splits based on the percentages set out at the start
train (80%), test (15%), val (5%)


In [41]:
from sklearn.model_selection import train_test_split
import random

# NORMAL
all_normal = os.listdir(dest_normal)

random.shuffle(all_normal)

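# split twice: hold out 5% for validation, then split the remaining 95% 84/16
# (about 80% train and 15% test of the total)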
rest_normal, val_normal = train_test_split(all_normal, train_size=0.95, test_size=0.05)
train_normal, test_normal = train_test_split(rest_normal, train_size=0.84, test_size=0.16)

print("NORMAL")
print("train:" + str(len(train_normal)))
print("test:" + str(len(test_normal)))
print("val:" + str(len(val_normal)))

NORMAL
train:8132
test:1550
val:510
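
The two-stage split works because 0.95 × 0.84 ≈ 0.80 and 0.95 × 0.16 ≈ 0.15 of the total. Neither `random.shuffle` nor `train_test_split` is given a seed here, so every run produces a different split. As an alternative sketch, not the notebook's method, the same three-way cut can be made in one step with a fixed seed using only the standard library. `three_way_split` and the placeholder file names are assumptions for illustration.

```python
import random


def three_way_split(items, train=0.80, test=0.15, seed=42):
    """Shuffle items and cut them into train/test/val lists.

    Whatever is left after the train and test shares goes to validation.
    The fixed seed makes the split repeatable.
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * train)
    n_test = int(len(shuffled) * test)
    train_items = shuffled[:n_train]
    test_items = shuffled[n_train:n_train + n_test]
    val_items = shuffled[n_train + n_test:]
    return train_items, test_items, val_items


# example with placeholder file names
names = [f"img-{i}.png" for i in range(1000)]
train_set, test_set, val_set = three_way_split(names)
print(len(train_set), len(test_set), len(val_set))
```

With `train_test_split` itself, passing the same `random_state` to both calls would give a comparable repeatable result.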


In [65]:
# bacterial pneu
all_bacpneu = os.listdir(dest_bacpneu)

random.shuffle(all_bacpneu)

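# split twice: hold out 5% for validation, then split the remaining 95% 84/16
# (about 80% train and 15% test of the total)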
rest_bacpneu, val_bacpneu = train_test_split(all_bacpneu, train_size=0.95, test_size=0.05)
train_bacpneu, test_bacpneu = train_test_split(rest_bacpneu, train_size=0.84, test_size=0.16)

print("PNEUMONIA_BAC")
print("train:" + str(len(train_bacpneu)))
print("test:" + str(len(test_bacpneu)))
print("val:" + str(len(val_bacpneu)))

PNEUMONIA_BAC
train:2218
test:423
val:139


In [68]:
# viral pneu
all_virpneu = os.listdir(dest_virpneu)

random.shuffle(all_virpneu)

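# split twice: hold out 5% for validation, then split the remaining 95% 84/16
# (about 80% train and 15% test of the total)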
rest_virpneu, val_virpneu = train_test_split(all_virpneu, train_size=0.95, test_size=0.05)
train_virpneu, test_virpneu = train_test_split(rest_virpneu, train_size=0.84, test_size=0.16)

print("PNEUMONIA_VIR")
print("train:" + str(len(train_virpneu)))
print("test:" + str(len(test_virpneu)))
print("val:" + str(len(val_virpneu)))

PNEUMONIA_VIR
train:2264
test:432
val:142


In [71]:
# COVID-19
all_covid = os.listdir(dest_covid)

random.shuffle(all_covid)

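# split twice: hold out 5% for validation, then split the remaining 95% 84/16
# (about 80% train and 15% test of the total)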
rest_covid, val_covid = train_test_split(all_covid, train_size=0.95, test_size=0.05)
train_covid, test_covid = train_test_split(rest_covid, train_size=0.84, test_size=0.16)

print("COVID")
print("train:" + str(len(train_covid)))
print("test:" + str(len(test_covid)))
print("val:" + str(len(val_covid)))

COVID
train:2885
test:550
val:181


In [61]:

def move_splits(train, test, val, source, dest_str):
    """Move the files of each split from source into splits/<split>/<dest_str>."""
    for split_name, items in (("train", train), ("test", test), ("val", val)):
        target_dir = os.path.join(path_dest, "splits", split_name, dest_str)
        # create the class folder (and any missing parent folders) once per split
        os.makedirs(target_dir, exist_ok=True)
        for item in items:
            # os.rename moves the file, so it no longer exists in source afterwards
            os.rename(os.path.join(source, item), os.path.join(target_dir, item))

In [62]:
move_splits(train_normal, test_normal, val_normal, dest_normal, "HEALTHY")

In [66]:
move_splits(train_bacpneu, test_bacpneu, val_bacpneu, dest_bacpneu, "PNEUMONIA_BAC")

In [69]:
move_splits(train_virpneu, test_virpneu, val_virpneu, dest_virpneu, "PNEUMONIA_VIR")

In [72]:
move_splits(train_covid, test_covid, val_covid, dest_covid, "COVID")
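
After the moves, a quick sanity check is to count the files per split and class and compare them with the sizes printed earlier. The following is a minimal sketch, assuming the `splits/<split>/<class>` layout created above. `count_split_files` is a hypothetical helper.

```python
import os


def count_split_files(splits_root):
    """Return {split: {class_name: file_count}} for a splits folder."""
    summary = {}
    for split in sorted(os.listdir(splits_root)):
        split_dir = os.path.join(splits_root, split)
        if not os.path.isdir(split_dir):
            continue  # ignore stray files next to the split folders
        summary[split] = {}
        for class_name in sorted(os.listdir(split_dir)):
            class_dir = os.path.join(split_dir, class_name)
            if os.path.isdir(class_dir):
                summary[split][class_name] = len(os.listdir(class_dir))
    return summary
```

It could be called as `count_split_files(os.path.join(path_dest, "splits"))`. The result for `"HEALTHY"` should then match the train, test and val sizes printed for NORMAL.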