- https://www.kaggle.com/bakeryproducts
- https://www.kaggle.com/kulyabin
- https://www.kaggle.com/sherlockkay
Our rig was a single-node machine:
- Linux 4.15.0-142-generic #146-Ubuntu SMP Tue Apr 13 01:11:19 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
- Description: Ubuntu 18.04.5 LTS
- Release: 18.04
- Codename: bionic
- 16 CPUs
- 4x GPU V100 32 GB
- 256 GB RAM
- a lot of memory: one folded model produces ~50 GB of data at the end of the inference phase
A result of LB 0.925 / PB 0.949 can be achieved without ensembling, with a single 4-fold Unet-timm-regnety16 model and under 2 hours of training on the default config. Models used in the final solution:
- Unet-regnety16
- Unet-regnetx32
- UnetPlusPlus-regnety16
- Unet-regnety16 with scse attention decoder
Training time for one model is ~2 hours, 50-80 epochs on mixed data from grid sampling and object sampling. Inference time on Kaggle was ~7 hours for all images; locally, ~30 minutes for 5 images.
Data structure:

    -src                        // source code
    -input
        -HUBMAP                 // source folder extracted from the zip file, can be read-only
        -bigmasks               // full-size tiff masks; generated by data_gen.py
        -CUTS                   // tiles cut from the tiffs, paired with masks; data_gen.py
            -glomi_x33_1024
                -imgs
                    -1e2425f28
                        0000001.png
                        0000002.png
                        ...
                    -2f6ecfcdf
                        0000001.png
                        0000002.png
                        ...
                    ...
                -masks
                    ...
            -grid_x33_1024
                ...
        -SPLITS                 // train and val splits; we use a 4-fold split, see split_gen.py; data_gen.py
            -glomi_split
                -0e
                    -train
                        -imgs
                            -1e2425f28
                                ...
                        -masks
                            ...
                    -val
                        -imgs
                            -2f6ecfcdf
                                ...
                        -masks
                            ...
                -2a
                -18
                -cc
            -grid_split
                -0e
                -2a
                -18
                -cc
    -output                     // results folder
        -2021_feb_07_12_13_55   // trained-model folder; it is meant to stay as one structure, the whole folder is selected in the inference module
            -model              // model checkpoints, selected by best val / EMA val score
                e100.pth
            -logs               // tensorboard logs
            -src                // copy of the source code from root/src, for reproducibility
            -cfg.yaml           // model config
        ...
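For illustration only, here is a minimal sketch of how tiles and masks laid out as above can be read, and how the glomeruli-centered and grid cuts are mixed into one training set. This is not the repo's data.py; the class, paths handling, and loader parameters here are illustrative assumptions.

```python
# Minimal sketch of a dataset over the CUTS layout above and of mixing
# glomeruli-centered (glomi_x33_1024) and grid (grid_x33_1024) tiles.
# NOT the repo's data.py; names and parameters are illustrative.
from pathlib import Path

import numpy as np
import torch
from PIL import Image
from torch.utils.data import ConcatDataset, DataLoader, Dataset


class TileDataset(Dataset):
    """Pairs imgs/<image_id>/<tile>.png with masks/<image_id>/<tile>.png."""

    def __init__(self, root):
        self.images = sorted(Path(root, "imgs").rglob("*.png"))

    def __len__(self):
        return len(self.images)

    def __getitem__(self, idx):
        img_path = self.images[idx]
        mask_path = Path(str(img_path).replace("/imgs/", "/masks/"))
        img = np.array(Image.open(img_path).convert("RGB"))
        mask = np.array(Image.open(mask_path).convert("L"))
        img = torch.from_numpy(img).permute(2, 0, 1).float() / 255.0
        mask = torch.from_numpy(mask).unsqueeze(0).float() / 255.0
        return img, mask


if __name__ == "__main__":
    train_ds = ConcatDataset([
        TileDataset("input/CUTS/glomi_x33_1024"),  # object (glomeruli) sampled tiles
        TileDataset("input/CUTS/grid_x33_1024"),   # grid sampled tiles
    ])
    train_dl = DataLoader(train_ds, batch_size=8, shuffle=True, num_workers=4)
```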
All scripts should be run from the root directory:
python3 src/data_gen.py
One specific requirement is the shallow library in src/shallow:
pip3 install -r requirements.txt
cd src/shallow/ && pip3 install .
We use rasterio as the TIFF reader; installing rasterio through pip should be enough.
However, it uses GDAL under the hood, which can be a pain to install, so just in case:
apt-get update
apt-get install libgdal-dev
export CPLUS_INCLUDE_PATH=/usr/include/gdal
export C_INCLUDE_PATH=/usr/include/gdal
gdal-config --version
pip install GDAL==2.2.3
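A quick sanity check that rasterio (and GDAL underneath) is working is to open one of the competition TIFFs. The path below is just an example; substitute any .tiff from input/HUBMAP.

```python
# Sanity check for the rasterio/GDAL install: open a TIFF and print its shape.
# The path is an example; use any .tiff from input/HUBMAP.
import rasterio

with rasterio.open("input/HUBMAP/train/1e2425f28.tiff") as src:
    print(src.count, src.height, src.width)  # bands, rows, cols
```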
This command reads the input/HUBMAP folder with the source data and produces cut tiles in 4-fold splits for training:
python3 src/data_gen.py
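Conceptually, the tile cutting boils down to reading fixed-size windows from the full-resolution TIFF. The sketch below shows only that windowed-read idea; the tile size, stride, sampling, and filtering in data_gen.py may differ.

```python
# Minimal sketch of grid tile cutting with rasterio windowed reads.
# Tile size, stride, and paths are illustrative; data_gen.py has its own logic
# (e.g. object-centered sampling and empty-tile filtering).
import numpy as np
import rasterio
from rasterio.windows import Window

TILE = 1024  # the CUTS folder names above suggest 1024 px tiles

with rasterio.open("input/HUBMAP/train/1e2425f28.tiff") as src:
    for y in range(0, src.height - TILE + 1, TILE):
        for x in range(0, src.width - TILE + 1, TILE):
            tile = src.read(window=Window(x, y, TILE, TILE))  # (bands, TILE, TILE)
            tile = np.moveaxis(tile, 0, -1)                   # to HWC for saving
            # ... save the tile and the matching crop of the bigmask as PNGs
```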
PyTorch, albumentations, BCE+Dice loss, 4 folds, full precision.
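For reference, here is a minimal PyTorch sketch of a BCE + soft-Dice loss; the weighting and smoothing are assumptions, and the repo's exact loss may differ.

```python
# Minimal sketch of a BCE + soft-Dice loss for binary segmentation logits.
import torch
import torch.nn.functional as F


def bce_dice_loss(logits, targets, smooth=1.0):
    # logits, targets: tensors of shape (B, 1, H, W); targets in {0, 1}
    bce = F.binary_cross_entropy_with_logits(logits, targets)
    probs = torch.sigmoid(logits)
    intersection = (probs * targets).sum(dim=(2, 3))
    union = probs.sum(dim=(2, 3)) + targets.sum(dim=(2, 3))
    dice = (2.0 * intersection + smooth) / (union + smooth)
    return bce + (1.0 - dice.mean())
```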
Run
start.sh
This will start training in the background (screen); the default mode is 4 GPUs (we used V100 32 GB), and it will create a timestamped folder with results and logs in output.
All default parameters are in src/configs/unet_gelb.yaml.
The input data keys that connect 'train_0e_33' to the folder input/SPLITS/glomi_split/0e/ are defined in data.py.
Single GPU usage:

    PARALLEL:
        DDP: False
Single fold usage:

    DATA:
        TRAIN:
            DATASETS: ['train_0e_33', 'grid_0e_33']
            GPU_PRELOAD: False
            PRELOAD: False
            CACHE: False
            MULTIPLY: {'rate': 1}
        VALID:
            FOLDS: ['val_0e_33']
            PRELOAD: False
    TRAIN:
        NUM_FOLDS: 1
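If you keep such overrides in a separate file, one simple way to apply them over the defaults is a recursive dict merge. This is only a sketch using PyYAML; my_override.yaml is a hypothetical file, and the repo's own config loader may handle overrides differently.

```python
# Sketch only: merge a partial YAML override (like the blocks above) over the
# default config. The repo's config system may do this differently.
import yaml


def deep_merge(base, override):
    """Recursively override nested dict values."""
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(base.get(key), dict):
            deep_merge(base[key], value)
        else:
            base[key] = value
    return base


with open("src/configs/unet_gelb.yaml") as f:
    cfg = yaml.safe_load(f)
with open("my_override.yaml") as f:  # hypothetical file holding e.g. the single-fold block
    cfg = deep_merge(cfg, yaml.safe_load(f))
```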
Inference works like this:
python3 src/run_inference.py --model_folder TIMESTAMPED_FOLDER_WITH_RESULTS_FROM_TRAINING --test_folder FOLDER_WITH_TEST_DATA
i.e.
python3 src/run_inference.py --model_folder output/2021_feb_07_12_13_55 --test_folder input/HUBMAP/test
The script will select the best model from the checkpoints based on val score and run inference on the .tiff images in the specified folder.
Results will be placed in model_folder.
There are switchable options in the script, such as:
- save RLE (sketched below)
- do TTA (sketched below)
- save masks
- multiprocessing with one image per GPU
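The RLE option refers to Kaggle-style run-length encoding of the predicted binary masks. Below is a minimal sketch of that encoding, not necessarily the repo's exact implementation.

```python
# Minimal sketch of Kaggle-style run-length encoding for a binary mask.
# Pixels are numbered top-to-bottom, then left-to-right, starting from 1.
import numpy as np


def rle_encode(mask):
    """mask: 2D numpy array with 1 = foreground, 0 = background."""
    pixels = mask.T.flatten()                       # column-major pixel order
    pixels = np.concatenate([[0], pixels, [0]])
    runs = np.where(pixels[1:] != pixels[:-1])[0] + 1
    runs[1::2] -= runs[::2]                         # turn run ends into run lengths
    return " ".join(str(x) for x in runs)
```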
This script is quite different from the Kaggle inference notebook, since locally we own the hardware and can use more resources.
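The TTA option averages predictions over augmented views of the input. Here is a minimal sketch of flip-based TTA for a model that returns per-pixel logits; the repo's actual augmentation set may differ.

```python
# Minimal sketch of flip test-time augmentation for a segmentation model that
# maps a (B, C, H, W) batch to per-pixel logits of the same spatial size.
import torch


@torch.no_grad()
def predict_tta(model, batch):
    preds = []
    for dims in (None, [3], [2], [2, 3]):           # identity, h-flip, v-flip, both
        x = batch if dims is None else torch.flip(batch, dims)
        y = torch.sigmoid(model(x))
        preds.append(y if dims is None else torch.flip(y, dims))
    return torch.stack(preds).mean(dim=0)           # average over the 4 views
```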
Ensembling the results of several models:
- Each model of the future ensemble should go through the inference phase
- Model folders should be listed in src/ensemble.py
- Run
python3 src/ensemble.py
- Model predictions will be averaged and saved to a new folder (see the sketch after this list)
- Run
python3 src/run_inference.py --do_rle --model_folder output/2021_feb_07_12_13_55 --test_folder input/HUBMAP/test
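Conceptually, the ensembling step averages per-pixel probability maps across the listed model folders. The sketch below shows only that averaging idea; the prediction file names, format, and output layout are assumptions here, and ensemble.py defines the real ones.

```python
# Sketch only: average saved per-image probability maps from several model folders.
# File names/format below are assumptions; ensemble.py defines the actual layout.
from pathlib import Path

import numpy as np

model_folders = [
    Path("output/2021_feb_07_12_13_55"),
    Path("output/2021_feb_08_09_01_10"),  # hypothetical second trained-model folder
]

image_id = "some_test_image"              # hypothetical test image id
preds = [np.load(f / "predicts" / f"{image_id}.npy") for f in model_folders]
avg = np.mean(preds, axis=0)              # pixel-wise average of probabilities
# ... threshold `avg` and run RLE encoding for the submission
```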
Gleb Sokolov, gleb.m.sokolov@gmail.com