
Adapted preprocess_fn support for Stage1 and refactored some code #14

Merged 14 commits on Aug 23, 2019

Conversation

@jchen42703 (Owner) commented Aug 21, 2019

Major Changes

  • refactored the prediction portion of Stage1 into run_classification_prediction
  • refactored the prediction portion of Stage2 to use run_seg_prediction like in segmentation_only.SegmentationOnlyInference
  • ensemble_classification_from_df now saves the output .csv file with index=False (a16a482) to prevent an extra column that causes an error when submitting to Kaggle.
  • Made it so that output .csv's from segmentation_only.SegmentationOnlyInference no longer have NaNs (not a fatal bug, since Kaggle handles NaNs as empty masks, but it's cleaner this way).
  • refactored load_pretrained_classification_model to better handle the edge case where model_name="efficientnet" and pretrained="nih".

Additions

  • inference.classification.Stage1
    • New args: fpaths_batch_size, n_tta_iter_per_image, tta_then_preprocess, preprocess_fn, **kwargs (see the usage sketch after this list)
    • Stage1 can now run on batched lists of filepaths to reduce the memory overhead.
    • preprocess_fn is now compatible with Stage1.
      • Did so because I changed the resizing (cv2 to skimage) when training some classification models.
    • With tta_then_preprocess, Stage1(tta=True) can now properly be used for the grayscale classification models that were trained with data augmentation.
    • TTA_Classification, TTA_Classification_All
  • io.utils.resize_and_preprocess
    • New function, used when training the grayscale classification models, that downsamples images with anti-aliasing enabled (see the resize sketch further below).
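To show how the new arguments fit together, here is a rough usage sketch (not taken from the PR itself; the values for fpaths_batch_size and n_tta_iter_per_image are illustrative, and the import paths assume the repo's package layout):

    from inference.classification import Stage1
    from io.utils import resize_and_preprocess

    # `model` and `test_fpaths` are assumed to already exist
    sub_df = Stage1(model, test_fpaths, channels=3, img_size=512, batch_size=32,
                    tta=True, threshold=0.5, model_name="efficientnet", save_p=True,
                    # new arguments from this PR:
                    fpaths_batch_size=1000,        # predict on batched lists of filepaths
                    n_tta_iter_per_image=4,        # TTA iterations per image
                    tta_then_preprocess=True,      # run TTA first, then preprocess_fn
                    preprocess_fn=resize_and_preprocess)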

Did so because I changed the resizing (cv2 to skimage) when training some classification models
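A minimal sketch of what such a resize-with-anti-aliasing step could look like (an illustration built on skimage.transform.resize, not the repo's exact io.utils.resize_and_preprocess implementation):

    import numpy as np
    from skimage.transform import resize

    def resize_and_preprocess_sketch(img, img_size=512, preprocess_fn=None):
        """Downsample to (img_size, img_size) with anti-aliasing, then optionally
        apply a model-specific preprocess_fn (e.g. ImageNet-style scaling)."""
        resized = resize(img, (img_size, img_size), anti_aliasing=True,
                         preserve_range=True).astype(np.float32)
        return preprocess_fn(resized) if preprocess_fn is not None else resized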
@jchen42703 jchen42703 added the enhancement New feature or request label Aug 21, 2019
@jchen42703 jchen42703 self-assigned this Aug 21, 2019
@jchen42703 (Owner, Author) commented

Received TypeError: ufunc 'invert' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe'' when I ran

    sub_df = Stage1(models[model_name], test_fpaths, channels=3, img_size=512, batch_size=32, tta=True, threshold=0.5, model_name=model_name, save_p=True,
                    preprocess_fn=resize_and_preprocess,)
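For reference, this error comes from applying np.invert (a bitwise operation defined only for integer and boolean dtypes) to an already-preprocessed float image, which matches the ordering problem discussed below; a minimal reproduction:

    import numpy as np

    uint8_img = np.array([[0, 255]], dtype=np.uint8)
    print(np.invert(uint8_img))               # fine: [[255   0]]

    float_img = uint8_img.astype(np.float32) / 255.0
    try:
        np.invert(float_img)                  # raises the TypeError quoted above
    except TypeError as e:
        print(e)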

Also fixed model_name.lower() to only apply when model_name is a string, so the function will run properly when model_name is not a string.
@jchen42703 (Owner, Author) commented Aug 21, 2019

I think we need to change the workflow from preprocess->TTA->predict to TTA->preprocess->predict, as done in io.generators_grayscale.GrayscaleClassificationGenerator. Also, it's extremely memory-intensive. File batching would be a nice addition as well.
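A rough sketch of the proposed ordering (function and argument names here are illustrative, not the repo's exact API):

    import numpy as np

    def predict_with_tta(model, raw_img, tta_transforms, preprocess_fn):
        """raw_img: uint8 H x W (x C) image; tta_transforms: callables on uint8 arrays."""
        preds = []
        for tta in tta_transforms:
            augmented = tta(raw_img)                     # TTA first (e.g. np.invert, flips)
            batch = preprocess_fn(augmented)[None, ...]  # then preprocess and add batch dim
            preds.append(model.predict(batch))
        return np.mean(preds, axis=0)                    # average over TTA iterations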

Inconsistent with Stage2, but it's not really necessary to do so for models in this repository.
…nce.classification

Because inference.classification.Stage1 is extremely memory-intensive due to extensive TTA.
…on code into run_classification_prediction

Also, fixed an import bug with preprocess_input (it was imported from the wrong module). I refactored the prediction code into run_classification_prediction to be consistent with segmentation-only's run_segmentation_prediction and because it's cleaner. The major change is refactoring the prediction process to run on batched filepath lists to deal with the memory issues when doing TTA.
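The batching itself can be as simple as chunking the filepath list (a sketch of the idea only; the run_classification_prediction call shown is hypothetical usage):

    def batch_fpaths(fpaths, fpaths_batch_size):
        """Yield successive chunks of at most fpaths_batch_size filepaths."""
        for i in range(0, len(fpaths), fpaths_batch_size):
            yield fpaths[i:i + fpaths_batch_size]

    # hypothetical usage:
    # for fpaths_batch in batch_fpaths(test_fpaths, fpaths_batch_size=1000):
    #     preds = run_classification_prediction(model, fpaths_batch, ...)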
Did so for more control over the inference speed.
Refactored the save path into the variable save_path. Also updated the rest of the functions (TTA_Classification_All, run_classification_prediction, and Stage1) to be compatible with TTA_Classification.
…der of TTA/preprocessing

This was to address the issue where grayscale models were being trained with [TTA->preprocess_input] instead of the inference pipeline's [preprocess_input->TTA]. It's also cleaner this way because np.invert is only supposed to work on integers in [0, 255], not the preprocessed/resized inputs. The current implementation is pretty hacky, so it'll need some refactoring in the future, but it's fine for now.
…s_fn in the arguments

Reduced the number of **kwargs occurrences in the function arguments (only Stage1 keeps it) by having Stage1 turn preprocess_fn into a partial'd function. Fixed the documentation to accommodate this and added support in TTA_Classification for cases where preprocess_fn is None.
Also, fixed a bug where batch_test_fpaths was called instead of test_fpaths_batched.
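A sketch of the partial'd preprocess_fn idea (the preprocess function and its extra keyword arguments here are made up for illustration):

    import numpy as np
    from functools import partial

    def example_preprocess(img, img_size=512, invert=False):
        # stand-in for the real preprocess_fn
        return img

    # bind the extra kwargs once inside Stage1 ...
    preprocess_fn = partial(example_preprocess, img_size=256, invert=True)
    # ... so downstream helpers (TTA_Classification, run_classification_prediction)
    # only ever need a single-argument call:
    out = preprocess_fn(np.zeros((1024, 1024), dtype=np.uint8))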
…ation dataframe

If index=True, then there will be an extra column "Unnamed: 0", which causes an error when submitting to Kaggle.
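For illustration (the column names here are made up), the difference looks like this:

    import pandas as pd

    df = pd.DataFrame({"ImageId": ["a", "b"], "Label": [0, 1]})

    df.to_csv("sub_with_index.csv")                            # default index=True
    print(pd.read_csv("sub_with_index.csv").columns.tolist())  # ['Unnamed: 0', 'ImageId', 'Label']

    df.to_csv("sub.csv", index=False)
    print(pd.read_csv("sub.csv").columns.tolist())             # ['ImageId', 'Label']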
Did so because I use it for grayscale classification model inference anyways.
…ng it

Did so to make Stage2 less bulky and reduce repeat code.
Kaggle handles NaNs as empty masks anyway, but this just makes the results cleaner.
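A sketch of the cleanup (the column names are assumptions, not taken from the repo): replace missing masks with empty strings before saving.

    import pandas as pd

    sub_df = pd.DataFrame({"ImageId": ["a", "b"], "EncodedPixels": ["1 2 3", None]})
    # replace missing masks (NaN) with empty strings before saving
    sub_df["EncodedPixels"] = sub_df["EncodedPixels"].fillna("")
    sub_df.to_csv("submission.csv", index=False)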
@jchen42703 (Owner, Author) commented Aug 23, 2019

ValueError: need at least one array to stack after running segmentation.Stage2. It traces back to run_seg_prediction. This is for commit 3eb3f6e.
Edit: this is not a bug, just a problem with how I ran it; I accidentally fed an empty list into seg_model.
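For reference, np.stack on an empty list raises exactly this message, consistent with an empty list being fed into the segmentation model:

    import numpy as np

    try:
        np.stack([])
    except ValueError as e:
        print(e)   # need at least one array to stack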

@jchen42703 jchen42703 changed the title Added preprocess_fn support for Stage1 Adapted preprocess_fn support for Stage1 and refactored some code. Aug 23, 2019
@jchen42703 jchen42703 changed the title Adapted preprocess_fn support for Stage1 and refactored some code. Adapted preprocess_fn support for Stage1 and refactored some code Aug 23, 2019
@jchen42703 (Owner, Author) commented Aug 23, 2019

cascade.create_submission needs to be updated to handle the new arguments.

@jchen42703 (Owner, Author) commented

That's currently not a priority. I'll raise it as a separate issue for the future.
