
Adapted preprocess_fn support for Stage1 and refactored some code #14

Merged 14 commits on Aug 23, 2019

Conversation

@jchen42703 (Owner) commented Aug 21, 2019

Major Changes

  • refactored the prediction portion of Stage1 into run_classification_prediction
  • refactored the prediction portion of Stage2 to use run_seg_prediction like in segmentation_only.SegmentationOnlyInference
  • ensemble_classification_from_df now saves the output .csv file with index=False (a16a482) to prevent an extra column that causes an error when submitting to Kaggle.
  • Made it so that output .csv's from segmentation_only.SegmentationOnlyInference no longer have NaNs (not a fatal bug, since Kaggle handles NaNs as empty masks, but it's cleaner this way).
  • refactored load_pretrained_classification_model to better handle the edge case where model_name="efficientnet" and pretrained="nih".

Additions

  • inference.classification.Stage1
    • New args: fpaths_batch_size, n_tta_iter_per_image, tta_then_preprocess, preprocess_fn, **kwargs (see the usage sketch after this list)
    • Stage1 can now run on batched lists of filepaths to reduce the memory overhead.
    • preprocess_fn is now compatible with Stage1.
      • Did so because I changed the resizing (cv2 to skimage) when training some classification models.
    • With tta_then_preprocess, Stage1(tta=True) can now properly be used for the grayscale classification models that were trained with data augmentation.
    • TTA_Classification, TTA_Classification_All
  • io.utils.resize_and_preprocess
    • New function, used when training the grayscale classification models, that downsamples images with anti-aliasing enabled (see the resize sketch further below).
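To show how the new arguments fit together, here is a rough usage sketch (not taken from the PR itself; the values for fpaths_batch_size and n_tta_iter_per_image are illustrative, and the import paths assume the repo's package layout):

    from inference.classification import Stage1
    from io.utils import resize_and_preprocess

    # `model` and `test_fpaths` are assumed to already exist
    sub_df = Stage1(model, test_fpaths, channels=3, img_size=512, batch_size=32,
                    tta=True, threshold=0.5, model_name="efficientnet", save_p=True,
                    # new arguments from this PR:
                    fpaths_batch_size=1000,        # predict on batched lists of filepaths
                    n_tta_iter_per_image=4,        # TTA iterations per image
                    tta_then_preprocess=True,      # run TTA first, then preprocess_fn
                    preprocess_fn=resize_and_preprocess)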

Did so because I changed the resizing (cv2 to skimage) when training some classification models
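A minimal sketch of what such a resize-with-anti-aliasing step could look like (an illustration built on skimage.transform.resize, not the repo's exact io.utils.resize_and_preprocess implementation):

    import numpy as np
    from skimage.transform import resize

    def resize_and_preprocess_sketch(img, img_size=512, preprocess_fn=None):
        """Downsample to (img_size, img_size) with anti-aliasing, then optionally
        apply a model-specific preprocess_fn (e.g. ImageNet-style scaling)."""
        resized = resize(img, (img_size, img_size), anti_aliasing=True,
                         preserve_range=True).astype(np.float32)
        return preprocess_fn(resized) if preprocess_fn is not None else resized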
@jchen42703 jchen42703 added the enhancement New feature or request label Aug 21, 2019
@jchen42703 jchen42703 self-assigned this Aug 21, 2019
@jchen42703 (Owner, Author) commented

Received TypeError: ufunc 'invert' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe'' when I ran

    sub_df = Stage1(models[model_name], test_fpaths, channels=3, img_size=512, batch_size=32, tta=True, threshold=0.5, model_name=model_name, save_p=True,
                    preprocess_fn=resize_and_preprocess,)
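For reference, this error comes from applying np.invert (a bitwise operation defined only for integer and boolean dtypes) to an already-preprocessed float image, which matches the ordering problem discussed below; a minimal reproduction:

    import numpy as np

    uint8_img = np.array([[0, 255]], dtype=np.uint8)
    print(np.invert(uint8_img))               # fine: [[255   0]]

    float_img = uint8_img.astype(np.float32) / 255.0
    try:
        np.invert(float_img)                  # raises the TypeError quoted above
    except TypeError as e:
        print(e)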

Also fixed model_name.lower() to only apply when model_name is a string, so the function will run properly when model_name is not a string.
@jchen42703 (Owner, Author) commented Aug 21, 2019

I think we need to change the workflow from preprocess->TTA->predict to TTA->preprocess->predict, as done in io.generators_grayscale.GrayscaleClassificationGenerator. Also, it's extremely memory-intensive. File batching would be a nice addition as well.
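A rough sketch of the proposed ordering (function and argument names here are illustrative, not the repo's exact API):

    import numpy as np

    def predict_with_tta(model, raw_img, tta_transforms, preprocess_fn):
        """raw_img: uint8 H x W (x C) image; tta_transforms: callables on uint8 arrays."""
        preds = []
        for tta in tta_transforms:
            augmented = tta(raw_img)                     # TTA first (e.g. np.invert, flips)
            batch = preprocess_fn(augmented)[None, ...]  # then preprocess and add batch dim
            preds.append(model.predict(batch))
        return np.mean(preds, axis=0)                    # average over TTA iterations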

Inconsistent with Stage2, but it's not really necessary to do so for models in this repository.
…nce.classification

Because inference.classification.Stage1 is extremely memory-intensive due to extensive TTA.
…on code into run_classification_prediction

Also, fixed an import bug with preprocess_input (it was imported from the wrong module). I refactored the prediction code into run_classification_prediction to be consistent with segmentation-only's run_segmentation_prediction and because it's cleaner. The major change is refactoring the prediction process to run on batched filepath lists to deal with the memory issues when doing TTA.
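The batching itself can be as simple as chunking the filepath list (a sketch of the idea only; the run_classification_prediction call shown is hypothetical usage):

    def batch_fpaths(fpaths, fpaths_batch_size):
        """Yield successive chunks of at most fpaths_batch_size filepaths."""
        for i in range(0, len(fpaths), fpaths_batch_size):
            yield fpaths[i:i + fpaths_batch_size]

    # hypothetical usage:
    # for fpaths_batch in batch_fpaths(test_fpaths, fpaths_batch_size=1000):
    #     preds = run_classification_prediction(model, fpaths_batch, ...)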
Did so for more control over the inference speed.
Refactored the save path into the variable save_path. Also updated the rest of the functions (TTA_Classification_All, run_classification_prediction, and Stage1) to be compatible with TTA_Classification.
…der of TTA/preprocessing

This was to address the issue where grayscale models were being trained with [TTA->preprocess_input] instead of the inference pipeline's [preprocess_input->TTA]. It's also cleaner this way because np.invert is only supposed to work on integers in [0, 255], not the preprocessed/resized inputs. The current implementation is pretty hacky, so it'll need some refactoring in the future, but it's fine for now.
…s_fn in the arguments

Reduced the number of **kwargs occurrences in the function arguments (only Stage1 keeps it) by having Stage1 turn preprocess_fn into a partial'd function. Fixed the documentation to accommodate this and added support in TTA_Classification for cases where preprocess_fn is None.
Also, fixed a bug where batch_test_fpaths was called instead of test_fpaths_batched.
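A sketch of the partial'd preprocess_fn idea (the preprocess function and its extra keyword arguments here are made up for illustration):

    import numpy as np
    from functools import partial

    def example_preprocess(img, img_size=512, invert=False):
        # stand-in for the real preprocess_fn
        return img

    # bind the extra kwargs once inside Stage1 ...
    preprocess_fn = partial(example_preprocess, img_size=256, invert=True)
    # ... so downstream helpers (TTA_Classification, run_classification_prediction)
    # only ever need a single-argument call:
    out = preprocess_fn(np.zeros((1024, 1024), dtype=np.uint8))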
…ation dataframe

If index=True, then there will be an extra column "Unnamed: 0", which causes an error when submitting to Kaggle.
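For illustration (the column names here are made up), the difference looks like this:

    import pandas as pd

    df = pd.DataFrame({"ImageId": ["a", "b"], "Label": [0, 1]})

    df.to_csv("sub_with_index.csv")                            # default index=True
    print(pd.read_csv("sub_with_index.csv").columns.tolist())  # ['Unnamed: 0', 'ImageId', 'Label']

    df.to_csv("sub.csv", index=False)
    print(pd.read_csv("sub.csv").columns.tolist())             # ['ImageId', 'Label']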
Did so because I use it for grayscale classification model inference anyways.
…ng it

Did so to make Stage2 less bulky and reduce repeat code.
Kaggle handles NaNs as empty masks anyway, but this just makes the results cleaner.
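A sketch of the cleanup (the column names are assumptions, not taken from the repo): replace missing masks with empty strings before saving.

    import pandas as pd

    sub_df = pd.DataFrame({"ImageId": ["a", "b"], "EncodedPixels": ["1 2 3", None]})
    # replace missing masks (NaN) with empty strings before saving
    sub_df["EncodedPixels"] = sub_df["EncodedPixels"].fillna("")
    sub_df.to_csv("submission.csv", index=False)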
@jchen42703 (Owner, Author) commented Aug 23, 2019

ValueError: need at least one array to stack after running segmentation.Stage2. It traces back to run_seg_prediction. This is for commit 3eb3f6e.
Edit: this is not a bug, just a problem with how I ran it; I accidentally fed an empty list into seg_model.
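For reference, np.stack on an empty list raises exactly this message, consistent with an empty list being fed into the segmentation model:

    import numpy as np

    try:
        np.stack([])
    except ValueError as e:
        print(e)   # need at least one array to stack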

@jchen42703 jchen42703 changed the title Added preprocess_fn support for Stage1 Adapted preprocess_fn support for Stage1 and refactored some code. Aug 23, 2019
@jchen42703 jchen42703 changed the title Adapted preprocess_fn support for Stage1 and refactored some code. Adapted preprocess_fn support for Stage1 and refactored some code Aug 23, 2019
@jchen42703 (Owner, Author) commented Aug 23, 2019

cascade.create_submission needs to be updated to handle the new arguments.

@jchen42703 (Owner, Author) commented

That's currently not a priority. I'll raise it as a separate issue for the future.
