# SIIM-ISIC Melanoma Classification

This is my solution to the [SIIM-ISIC Melanoma Classification](https://www.kaggle.com/c/siim-isic-melanoma-classification) competition using ResNet34.

In [None]:
%reload_ext autoreload
%autoreload 2
%matplotlib inline

In [None]:
from fastai import *
from fastai.vision import *

In [None]:
import warnings
warnings.filterwarnings("ignore", category=UserWarning, module="torch.nn.functional")

In [None]:
path = Path('../input/siim-isic-melanoma-classification')
path_512 = Path('../input/siim-isic-melanoma-classification-jpeg512')

To make training easier, we will use [resized images](https://www.kaggle.com/itacdonev/siim-isic-melanoma-classification-jpeg512) (thank you, [stats](https://www.kaggle.com/itacdonev)). With the original images in the `jpeg/train` folder, one complete epoch would take 5 hours (see Version 1), because the images come in different sizes and fastai would need to resize them on the fly for every batch. With images that are resized beforehand, each epoch takes less time, so we can run more epochs.
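The pre-resizing idea can be sketched roughly as below. This is not how the linked dataset was produced: `resize_jobs`, `resize_all`, the directory arguments, and the square 512-pixel target are assumptions made for illustration, and the resizing step itself relies on Pillow being installed.

```python
from pathlib import Path


def resize_jobs(src_dir, dst_dir, pattern="*.jpg"):
    """Pair every source image with its output path, skipping finished ones."""
    src_dir, dst_dir = Path(src_dir), Path(dst_dir)
    jobs = []
    for src in sorted(src_dir.glob(pattern)):
        dst = dst_dir / src.name
        if not dst.exists():  # lets an interrupted run resume where it stopped
            jobs.append((src, dst))
    return jobs


def resize_all(src_dir, dst_dir, size=512):
    """Resize each pending image to size x size once, ahead of training."""
    from PIL import Image  # Pillow; imported lazily so planning works without it

    Path(dst_dir).mkdir(parents=True, exist_ok=True)
    for src, dst in resize_jobs(src_dir, dst_dir):
        with Image.open(src) as im:
            # Hypothetical choice: squash to a square, ignoring aspect ratio
            im.convert("RGB").resize((size, size)).save(dst, quality=95)
```

Paying the resizing cost once, outside the training loop, is what removes the per-batch work described above.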

In [None]:
np.random.seed(2)  # makes the random train/validation split reproducible
data = ImageDataBunch.from_csv(
    path_512, folder='train512', csv_labels='train.csv', label_col=7, suffix='.jpg',
    ds_tfms=get_transforms(), size=128, num_workers=0,
).normalize(imagenet_stats)

In [None]:
data.classes, data.c, len(data.train_ds), len(data.valid_ds), data.batch_size

In [None]:
data.show_batch(rows=3, figsize=(12,9))

In [None]:
learn = cnn_learner(data, models.resnet34, metrics=AUROC(), model_dir='/kaggle/working')

I load the `stage-8` model I saved in the previous version and use it as the starting point for training a better model.

In [None]:
! wget link-to-stage-8.pth

In [None]:
learn.load('stage-8')
learn.data = data

We add a `SaveModelCallback` that saves the model each time the monitored metric (`auroc`) improves during `fit_one_cycle`, so the `stage-9` checkpoint ends up holding the best epoch.

We will also experiment with a much lower learning rate (`1e-7`).

In [None]:
# Save a `stage-9` checkpoint every time the validation AUROC improves
save_best = callbacks.SaveModelCallback(learn, every='improvement', monitor='auroc', name='stage-9')
learn.fit_one_cycle(10, slice(1e-7), callbacks=[save_best])
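The save-on-improvement behaviour can be illustrated with a small standalone sketch. This is not fastai's implementation: `SaveOnImprovement`, its `save_fn` argument, and the AUROC values are made up for the example.

```python
class SaveOnImprovement:
    """Call save_fn whenever the monitored value beats the best seen so far."""

    def __init__(self, save_fn, higher_is_better=True):
        self.save_fn = save_fn
        self.higher_is_better = higher_is_better
        self.best = None

    def on_epoch_end(self, epoch, value):
        if self.best is None:
            improved = True  # the first epoch always counts as the best so far
        elif self.higher_is_better:
            improved = value > self.best
        else:
            improved = value < self.best
        if improved:
            self.best = value
            self.save_fn(epoch, value)
        return improved


saved = []
tracker = SaveOnImprovement(lambda epoch, value: saved.append(epoch))
for epoch, auroc in enumerate([0.80, 0.85, 0.83, 0.90]):  # made-up scores
    tracker.on_epoch_end(epoch, auroc)
print(saved)  # [0, 1, 3]: epoch 2 scored lower than epoch 1, so nothing was saved
```

Because each save overwrites the previous one under the same name, loading that name afterwards gives the best epoch rather than the last one.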

In [None]:
learn.load('stage-9')

In [None]:
learn.export('/kaggle/working/export.pkl')

In [None]:
learner = load_learner('/kaggle/working')

As pointed out [here](https://www.kaggle.com/edkahara/fast-ai-v3-melanoma-classification#912974), the probability of malignancy is always `outputs[1]`. This means previous versions may have submitted the probability of the wrong class as the target. This terrible oversight has been corrected.

In [None]:
img = open_image(path/'jpeg/test/ISIC_0052060.jpg')
pred_class, pred_idx, outputs = learner.predict(img)

# Get the probability of malignancy
prob_malignant = float(outputs[1])

print(pred_class)
print(prob_malignant)
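Hard-coding index 1 works as long as the class list is `[0, 1]`, which is what the `data.classes` cell above should show. As a hedged alternative, the index can be looked up from the class list so the mapping is explicit. `positive_class_prob` is a hypothetical helper, and plain lists stand in for the class list and the prediction tensor.

```python
def positive_class_prob(classes, outputs, positive=1):
    """Return the probability assigned to the `positive` label.

    classes: class labels in the order the model outputs them
    outputs: per-class probabilities, in that same order
    """
    idx = list(classes).index(positive)
    return float(outputs[idx])


# Stand-in values: label 1 (malignant) sits at index 1, so its probability is 0.1
print(positive_class_prob([0, 1], [0.9, 0.1]))
```

In the notebook this would presumably be called with the learner's class list and the `outputs` returned by `learner.predict`.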

In [None]:
import csv, os, re

test_dir = path/'jpeg/test'
test = os.listdir(test_dir)
# Sort numerically by the digits in each file name (e.g. ISIC_0052060.jpg -> 52060)
test.sort(key=lambda f: int(re.sub(r'\D', '', f)))

with open('/kaggle/working/submission.csv', 'w', newline='') as file:
    writer = csv.writer(file)
    writer.writerow(['image_name', 'target'])

    for image_file in test:
        image_name = Path(image_file).stem
        img = open_image(test_dir/image_file)
        pred_class, pred_idx, outputs = learner.predict(img)
        # outputs[1] is the probability of malignancy
        target = float(outputs[1])
        writer.writerow([image_name, target])
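Before submitting, it can be worth sanity-checking the file that was just written. The sketch below is a minimal check, not part of the original solution: `check_submission` and its `expected_rows` parameter are hypothetical, and it only assumes the `image_name,target` header used above with probabilities between 0 and 1.

```python
import csv


def check_submission(csv_path, expected_rows=None):
    """Return a list of problems found in a submission file (empty if none)."""
    problems = []
    with open(csv_path, newline="") as f:
        rows = list(csv.reader(f))
    if not rows or rows[0] != ["image_name", "target"]:
        problems.append("header should be image_name,target")
    body = rows[1:]
    names = [r[0] for r in body if r]
    if len(names) != len(set(names)):
        problems.append("duplicate image names")
    for r in body:
        try:
            # Each row needs exactly two fields and a probability in [0, 1]
            ok = len(r) == 2 and 0.0 <= float(r[1]) <= 1.0
        except ValueError:
            ok = False
        if not ok:
            problems.append(f"bad row: {r}")
    if expected_rows is not None and len(body) != expected_rows:
        problems.append(f"expected {expected_rows} rows, found {len(body)}")
    return problems
```

For example, `check_submission('/kaggle/working/submission.csv', expected_rows=len(test))` should return an empty list if every test image got a valid prediction.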