
Lung Disease Detection From X-ray Images using Ensemble CNNS


Songloading/Image-Classfication-Tasks


Image-Classfication-Tasks

Week 1

  • Learning Objectives: Basic image classification tasks with simple models using PyTorch and JAX.

  • Learning Outcomes: Built a LeNet-5 model in both PyTorch and JAX for image classification.

  • Findings & Conclusion:

| Framework | Model Building | Model Training | Others |
|---|---|---|---|
| PyTorch | Model is a class; each layer is defined as a variable; the forward function manually goes through each layer | Has a dataloader that can be enumerated; usually uses a built-in loss function; usually calls optimizer.step() to update | Provides plenty of datasets |
| JAX | Model is treated as a function; stax, for example, returns the model parameters and the forward function | Has to define a data_stream method; has to define the loss function yourself; updates the state on each batch and passes it to the next | Has to define the dataset manually |
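
The contrast between the two styles can be sketched without either framework. The example is plain Python, and every name in it (`StatefulModel`, `predict`, `update`) is a hypothetical stand-in rather than a PyTorch or JAX API. It fits y = 2x with a single weight, once by mutating an object in place (the PyTorch-like pattern) and once by threading the parameter through a pure update function (the JAX-like pattern).

```python
# Framework-free sketch: the names here are illustrative, not PyTorch or JAX APIs.


class StatefulModel:
    """PyTorch-like style: the parameter lives on the object and is updated in place."""

    def __init__(self):
        self.w = 0.0

    def forward(self, x):
        return self.w * x

    def step(self, x, y, lr=0.1):
        # Gradient of the squared error (w*x - y)**2 with respect to w.
        grad = 2.0 * (self.forward(x) - y) * x
        self.w -= lr * grad  # in-place update, analogous to optimizer.step()


def predict(w, x):
    """JAX-like style: the model is a pure function of its parameters."""
    return w * x


def update(w, batch, lr=0.1):
    # Returns a new parameter value instead of mutating anything.
    x, y = batch
    grad = 2.0 * (predict(w, x) - y) * x
    return w - lr * grad


batches = [(1.0, 2.0), (2.0, 4.0), (1.5, 3.0)] * 20  # samples of y = 2x

model = StatefulModel()
for x, y in batches:
    model.step(x, y)

w = 0.0
for batch in batches:
    w = update(w, batch)  # the state from one batch is passed to the next

print(model.w, w)
```

Both loops perform the same arithmetic, so they reach the same weight; the difference is only where the state lives.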

Week 2

Week 3

  • Learning Objectives: Load custom data into each of the three previously learned pipelines.
  • Learning Outcomes: Loaded a custom dataset using Julia, PyTorch, and Knet.

Week 4-10: Classify the X-Ray dataset using different models with high accuracy

  • Outline:
    The NIH Chest X-rays dataset contains about 110,000 chest X-ray images in 15 classes (14 diseases, plus one for "No Finding"). We are going to build or reuse different models to classify them.
  • Dataset & Preparation:
    If you only need to load the data, Python's zipfile module is an easy way to do it. You probably do not want to unzip the whole dataset (~90 GB) if you do not plan to train. The code below collects all the image paths and, assuming you want to do binary classification, turns the labels into binary values.
 import os
 import zipfile as z

 import pandas as pd

 # data_path must point to the downloaded dataset archive (.zip)
 zf = z.ZipFile(data_path)
 df = pd.read_csv(zf.open('Data_Entry_2017.csv'))  # load image names and labels
 df = df.loc[:, "Image Index":"Finding Labels"]

 # Data preparation: map each image name to its path inside the archive
 img_paths = {os.path.basename(name): name for name in zf.namelist() if name.endswith('.png')}
 df['path'] = df['Image Index'].map(img_paths.get)
 df.drop(['Image Index'], axis=1, inplace=True)  # keep path and labels only

 # Make the labels binary: 0 for "No Finding", 1 for any disease
 df['label'] = (df['Finding Labels'] != "No Finding").astype(int)
 print(df['label'].value_counts())  # check the class balance
 df.drop(['Finding Labels'], axis=1, inplace=True)

If you print the data frame, you should see something like this:

                                 path  label
0  images_001/images/00000001_000.png      1
1  images_001/images/00000001_001.png      1
2  images_001/images/00000001_002.png      1
3  images_001/images/00000002_000.png      0
4  images_001/images/00000003_000.png      1
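
To try the same idea without downloading anything, here is a minimal sketch that uses only the standard library. It builds a tiny in-memory archive as a stand-in for the real dataset; the image names, rows, and byte contents are made up for illustration, and only the CSV file name and column names are taken from the code in this README. It shows that paths, labels, and individual images can all be read straight from the zip without extracting it.

```python
import csv
import io
import os
import zipfile

# Build a tiny in-memory archive as a stand-in for the real dataset zip.
# The image names, rows, and contents are made up for illustration.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr(
        "Data_Entry_2017.csv",
        "Image Index,Finding Labels\n"
        "a_000.png,Cardiomegaly\n"
        "b_000.png,No Finding\n",
    )
    zf.writestr("images_001/images/a_000.png", b"fake image bytes A")
    zf.writestr("images_001/images/b_000.png", b"fake image bytes B")

buffer.seek(0)
with zipfile.ZipFile(buffer) as zf:
    # Index archive members by base name, without extracting anything.
    img_paths = {
        os.path.basename(name): name
        for name in zf.namelist()
        if name.endswith(".png")
    }

    # Read the label file straight from the archive.
    with zf.open("Data_Entry_2017.csv") as f:
        text = io.TextIOWrapper(f, encoding="utf-8", newline="")
        rows = list(csv.DictReader(text))

    # Binary label: 0 for "No Finding", 1 for anything else.
    records = [
        {
            "path": img_paths[row["Image Index"]],
            "label": 0 if row["Finding Labels"] == "No Finding" else 1,
        }
        for row in rows
    ]

    # Load a single image's bytes on demand.
    first_image = zf.read(records[0]["path"])

print(records)
```

With the real archive, the bytes returned by `zf.read` could be passed to an image decoder, so only the images actually used are ever decompressed.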
