# Inference with a trained fastai/PyTorch model
This notebook is paired with a [training notebook](https://www.kaggle.com/bjoernholzhauer/fastai-how-to-set-up-efficientnet-b4-0-945-lb) that trains a fastai/PyTorch model using transfer learning on an EfficientNet-B4 pre-trained on ImageNet. It follows the strategy of training the model in one notebook (which can have internet access) and then running inference in a second notebook (which can have the internet switched off, as required for submission). I borrowed much of the set-up from [a notebook by Zach Mueller in another competition](https://www.kaggle.com/muellerzr/fastai-abhishek-inference).

A possible further improvement would be to take the correlation between outputs into account: at the moment, I ignore that some diagnoses rarely or never co-occur, while others co-occur more often.

# Loading packages and defining augmentations

I'm probably re-defining too many things from the training notebook here, but it does little harm.

In [None]:
import torch
torch.cuda.is_available()


In [None]:
%cd ../input/efficientnet-pytorch/EfficientNet-PyTorch/EfficientNet-PyTorch-master
from efficientnet_pytorch import EfficientNet
%cd -


In [None]:
from fastai.vision.all import *
import albumentations
import re

In [None]:
class AlbumentationsTransform(RandTransform):
    "A transform handler for multiple `Albumentation` transforms"
    split_idx,order=None,2
    def __init__(self, train_aug, valid_aug): store_attr()
    
    def before_call(self, b, split_idx):
        self.idx = split_idx
    
    def encodes(self, img: PILImage):
        if self.idx == 0:
            aug_img = self.train_aug(image=np.array(img))['image']
        else:
            aug_img = self.valid_aug(image=np.array(img))['image']
        return PILImage.create(aug_img)
    
#def get_x(row): return data_path/row['image_id']
#def get_y(row): return row['label']

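class MyModel(Module):
    "EfficientNet-B4 backbone with a dropout + linear classification head"
    # `load_learner` needs this class definition to unpickle the model;
    # unpickling itself does not call `__init__`.
    def __init__(self, num_classes):
        self.effnet = EfficientNet.from_pretrained("efficientnet-b4")
        self.dropout = nn.Dropout(0.1)
        self.out = nn.Linear(1792, num_classes)  # B4 feature maps have 1792 channels

    def forward(self, image):
        batch_size = image.shape[0]
        x = self.effnet.extract_features(image)
        # Global average pooling, then flatten to (batch_size, 1792)
        x = F.adaptive_avg_pool2d(x, 1).reshape(batch_size, -1)
        return self.out(self.dropout(x))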
    
def get_train_aug(): return albumentations.Compose([            
            albumentations.RandomResizedCrop(380,380, scale=(0.85, 1.0)),
            albumentations.ShiftScaleRotate(shift_limit=0.025, scale_limit=0.1, rotate_limit=10, p=0.5),
            albumentations.HueSaturationValue(
                hue_shift_limit=0.2, 
                sat_shift_limit=0.2, 
                val_shift_limit=0.2, 
                p=0.5
            ),
            albumentations.RandomBrightnessContrast(
                brightness_limit=(-0.1,0.1),
                contrast_limit=(-0.1, 0.1), 
                p=0.5
            ),
            albumentations.CoarseDropout(p=0.5),
            albumentations.Cutout(p=0.5)
])    
    
def get_valid_aug(): return albumentations.Compose([
    albumentations.Resize(385,385),
    albumentations.CenterCrop(380,380, p=1.)    
], p=1.)

item_tfms = AlbumentationsTransform(get_train_aug(), get_valid_aug())    
    

# Load the pre-trained model

Now we use fastai's `load_learner` function to load the model trained in the training notebook. During training, we used mixed precision (i.e. both 16-bit and 32-bit floating-point numbers in the model) to make training faster and use less memory, but for inference we switch back to full 32-bit precision.

In [None]:
learn = load_learner(Path('../input/fastai-first-try/baseline-b4'), 
                     cpu=False)
learn.to_native_fp32()
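
To give a feel for what the extra bits buy, here is a small standard-library sketch (not fastai code) that rounds the same number to 16-bit and to 32-bit floating point and compares the rounding errors. The helper names `to_half` and `to_single` are made up for this illustration.

```python
import struct

def to_half(x):
    # Round-trip through IEEE 754 half precision (16-bit, struct format 'e')
    return struct.unpack('<e', struct.pack('<e', x))[0]

def to_single(x):
    # Round-trip through IEEE 754 single precision (32-bit, struct format 'f')
    return struct.unpack('<f', struct.pack('<f', x))[0]

value = 0.1
half_error = abs(to_half(value) - value)
single_error = abs(to_single(value) - value)

# The 16-bit representation is noticeably further from the original value
print(half_error, single_error)
```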

# Set-up data loading

The only tricky bit is that for the training data I had specified the path `../input/ranzcr-clip-catheter-line-classification/train`, whereas here I want to use images from `../input/ranzcr-clip-catheter-line-classification/test`. I therefore prepend `../test/` to the image names, so that the column in the dataframe becomes e.g. `../test/1.2.826.0.1.3680043.8.498.10003659706701445041816900371598078663` and the dataloader then turns this into `../input/ranzcr-clip-catheter-line-classification/train/../test/1.2.826.0.1.3680043.8.498.10003659706701445041816900371598078663.jpg`. Clearly, this set-up is clumsy, and you can easily make it more elegant.

In [None]:
path = Path("../input")
data_path = path/'ranzcr-clip-catheter-line-classification/test'
# Strip the '.jpg' extension (escaped dot, anchored at the end) and prepend '../test/'
df = pd.DataFrame({'StudyInstanceUID': ['../test/' + re.sub(r'\.jpg$', '', file) for file in os.listdir(data_path)]})
df
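
The path trick described above can be checked without any images. The following is a minimal sketch using only the standard library; `build_image_path` is a hypothetical stand-in for however the dataloader joins folder, image name and extension (I am assuming a plain join here), not a fastai function.

```python
import posixpath

# Folder the dataloader was set up with during training (from the text above)
train_dir = "../input/ranzcr-clip-catheter-line-classification/train"
uid = "1.2.826.0.1.3680043.8.498.10003659706701445041816900371598078663"

def build_image_path(base_dir, name, suffix=".jpg"):
    # Hypothetical helper: join the folder, the image name and the extension
    return posixpath.join(base_dir, name + suffix)

# The '../test/' prefix steps out of 'train' and into the sibling 'test' folder
raw_path = build_image_path(train_dir, "../test/" + uid)
resolved_path = posixpath.normpath(raw_path)

print(raw_path)
print(resolved_path)
```

The first printed path still contains `train/../test`, while the normalized one points directly at the `test` folder, which is the file the operating system actually opens.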


In [None]:
test_dl = learn.dls.test_dl(df)
test_dl.show_batch(figsize=(12,12))

# Get predictions using test-time-augmentation (TTA)

Now we use [test-time augmentation](https://docs.fast.ai/learner.html#TTA) (TTA): we create `n=15` augmented versions of each image in the test set using the training augmentations and average the predicted probabilities across them. We then also predict once using the less "aggressive" validation augmentations and combine the two sets of predicted probabilities in a 75:25 ratio (`beta=0.25` is the weight given to the non-augmented predictions).

In [None]:
preds, _ = learn.tta(dl=test_dl, n=15, beta=0.25)
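
The blending step can be sketched in plain Python for a single image. This reflects my understanding of what `learn.tta` does with `beta` (average over the augmented passes, then linear interpolation towards the non-augmented prediction); `blend_tta` and the toy numbers are made up for illustration and are not part of fastai.

```python
def blend_tta(aug_preds, plain_preds, beta=0.25):
    """Blend TTA predictions for one image.

    aug_preds:   list of n probability lists (one per augmented pass)
    plain_preds: probability list from the validation-style transforms
    """
    n = len(aug_preds)
    # Average each class probability over the n augmented passes
    aug_mean = [sum(class_probs) / n for class_probs in zip(*aug_preds)]
    # Linear interpolation: (1 - beta) * aug_mean + beta * plain
    return [a + beta * (p - a) for a, p in zip(aug_mean, plain_preds)]

# Toy example: two augmented passes, two classes (invented numbers)
aug = [[0.8, 0.2], [0.6, 0.4]]
plain = [0.9, 0.1]
blended = blend_tta(aug, plain, beta=0.25)
print(blended)
```

With `beta=0.25` the augmented average contributes 75% and the plain prediction 25%, so the toy result is roughly `[0.75, 0.25]` up to floating-point rounding.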

# Create submission.csv
Note that here I remove the `../test/` string that I added above to facilitate data loading.

In [None]:
submit = pd.DataFrame(preds, columns=['ETT - Abnormal', 'ETT - Borderline',
       'ETT - Normal', 'NGT - Abnormal', 'NGT - Borderline',
       'NGT - Incompletely Imaged', 'NGT - Normal', 'CVC - Abnormal',
       'CVC - Borderline', 'CVC - Normal', 'Swan Ganz Catheter Present'])

# Remove only a leading, literal '../test/' (dots escaped so they do not match any character)
submit.insert(0, 'StudyInstanceUID', [re.sub(r'^\.\./test/', '', astring) for astring in df.StudyInstanceUID])

submit
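
The round trip from file name to dataloader name and back to the submission ID can be tested in isolation. The sketch below uses escaped, anchored regular expressions; the helper names and the second file name are hypothetical and only serve to show what an unescaped `.` can do.

```python
import re

def to_loader_name(filename):
    # Drop a trailing '.jpg' (dot escaped, anchored at the end), then add the prefix
    return "../test/" + re.sub(r"\.jpg$", "", filename)

def to_submission_uid(loader_name):
    # Remove only a leading, literal '../test/'
    return re.sub(r"^\.\./test/", "", loader_name)

filename = "1.2.826.0.1.3680043.8.498.10003659706701445041816900371598078663.jpg"
loader_name = to_loader_name(filename)
uid = to_submission_uid(loader_name)
print(loader_name)
print(uid)

# An unescaped '.' matches any character, so r'.jpg' also removes '_jpg'
# from this invented file name, whereas the escaped pattern does not.
tricky = "scan_jpg_1.jpg"
unescaped_result = re.sub(r".jpg", "", tricky)
escaped_result = re.sub(r"\.jpg$", "", tricky)
print(unescaped_result, escaped_result)
```

For the numeric IDs in this competition both patterns give the same result, so this is a robustness point rather than a fix for a visible error.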


In [None]:
submit.to_csv('submission.csv',
              index=False)