Skip to content

Kaggle RSNA-STR Pulmonary Embolism detection code for 55th solution

Notifications You must be signed in to change notification settings

stanleyjzheng/RSNA-STR

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

36 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

RSNA-STR Code

About

I created this with the intention to centralize all code and make it more maintainable. Docstrings have been written where possible, and code is more organized and systematic than the Kaggle notebooks.

Progress

Done so far:

Todo in order of importance:

  • Make Stage 2 GRU training more representative of real inference (add train transforms, improve RSNADataset)
  • 3d augmentations (currently committed but not working and extremely experimental)
  • Docstrings

About config.json

Parameters to be changed:

  • img_size: Image size (square)
  • lr: Learning rate
  • epochs
  • train_bs/valid_bs: training/validation batch size
  • save_path: Model save path
  • verbose_step: Number of steps between printing metrics
  • num_workers: Number of threads to run concurrent processes with
  • efbnet: Which efficientnet architecture to use. For example, 'efbnet': 'efficientnet-b7'
  • train_folds: Nested list with folds to train with. Dimension 0 is the number of folds to run.

Using on Discovery Cluster

Please run ./run.sh to install dependencies. Note that this is untested, and I have no clue how to download the 910gb dataset. dataset.sh contains a small script to clone the Kaggle RSNA dataset into input as well as a few sanity check models. Then, simply run each script with python3 script_to_run.py

Using on Kaggle

To use these scripts on Kaggle, simply add the following datasets:

Then, the following lines can be added to the beginning of a script or notebook. The config file will be created during each run (since updating config.json requires a new Kaggle dataset).

package_path = '../input/efficientnet-pytorch-07/efficientnet_pytorch-0.7.0'
import sys; sys.path.append(package_path); sys.path.append('./')

bash_commands = [
            'cp ../input/gdcm-conda-install/gdcm.tar .',
            'tar -xvzf gdcm.tar',
            'conda install --offline ./gdcm/gdcm-2.8.9-py37h71b2a6d_0.tar.bz2',
            'cp ../input/rsna-str-github/utils.py .'
            ]

import subprocess
for bashCommand in bash_commands:
    process = subprocess.Popen(bashCommand.split(), stdout=subprocess.PIPE)
    output, error = process.communicate()

CFG = {
    'train': True,
    'train_img_path': '../input/rsna-str-pulmonary-embolism-detection/train',
    'test_img_path': '../input/rsna-str-pulmonary-embolism-detection/test',
    'cv_fold_path': '../input/rsna-str-github/rsna_train_splits_fold_20.csv',
    'train_path': '../input/rsna-str-pulmonary-embolism-detection/train.csv',
    'test_path': '../input/rsna-str-pulmonary-embolism-detection/test.csv',
    'image_target_cols': [
        'pe_present_on_image', # only image level
        'rv_lv_ratio_gte_1', # exam level
        'rv_lv_ratio_lt_1', # exam level
        'leftsided_pe', # exam level
        'chronic_pe', # exam level
        'rightsided_pe', # exam level
        'acute_and_chronic_pe', # exam level
        'central_pe', # exam level
        'indeterminate' # exam level
    ],
    'img_num': 200,
    'img_size': 256,
    'lr': 0.0005,
    'epochs': 1,
    'device': 'cuda', # cuda, cpu
    'train_bs': 64,
    'valid_bs': 256,
    'accum_iter': 1,
    'verbose_step': 1,
    'num_workers': 4,
    'efbnet': 'efficientnet-b0',
    
    'train_folds': [list(range(0, 16))],
    
    'valid_folds': [list(range(17, 21))],
    
    'model_path': '../input/kh-rsna-model',
    'save_path': '.',
    'tag': 'efb0_stage1_multilabel',
}
import json
with open('config.json', 'w+') as outfile:
    json.dump(CFG, outfile, indent=4)

About

Kaggle RSNA-STR Pulmonary Embolism detection code for 55th solution

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages