# Melanoma Classification: The Naive Approach
## Strictly for Beginners

![Imgur](https://i.imgur.com/7U9tlVA.jpg)

This notebook is aimed at beginners who want to get started with computer vision. It uses the fastai library, one of the easier libraries for computer vision work. The official fastai page is the place to get the hang of it or learn more, but for the sake of your understanding I have explained most of what is used here. Thank me later.

In [None]:
import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)

import os
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        os.path.join(dirname, filename)

**This competition provides metadata as well as images. I process both and get predictions from each, so that the combined result can be used to build the classifier.**

# Training on the Metadata

**I used XGBoost to train on the metadata and predict the targets.**

In [None]:
import warnings
warnings.filterwarnings('ignore')  # silence library warnings to keep the output readable

import numpy as np
import pandas as pd

import xgboost as xgb
from sklearn.metrics import accuracy_score

In [None]:
train = pd.read_csv('../input/siim-isic-melanoma-classification/train.csv')
test = pd.read_csv('../input/siim-isic-melanoma-classification/test.csv')
sub = pd.read_csv('../input/siim-isic-melanoma-classification/sample_submission.csv')
train.head()

In [None]:
train.target.value_counts()

In [None]:
train['sex'] = train['sex'].fillna('na')
train['age_approx'] = train['age_approx'].fillna(0)
train['anatom_site_general_challenge'] = train['anatom_site_general_challenge'].fillna('na')

test['sex'] = test['sex'].fillna('na')
test['age_approx'] = test['age_approx'].fillna(0)
test['anatom_site_general_challenge'] = test['anatom_site_general_challenge'].fillna('na')

In [None]:
# Turn the string columns into integer codes (starting at 1) so XGBoost can use them.
train['sex'] = train['sex'].astype("category").cat.codes + 1
train['anatom_site_general_challenge'] = train['anatom_site_general_challenge'].astype("category").cat.codes + 1
train.head()

In [None]:
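# Note: the codes are assigned per DataFrame, so train and test only agree
# when both frames contain the same set of category values.
test['sex'] = test['sex'].astype("category").cat.codes + 1
test['anatom_site_general_challenge'] = test['anatom_site_general_challenge'].astype("category").cat.codes + 1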
test.head()
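
Encoding train and test separately, as in the two cells above, gives matching codes only if both frames contain exactly the same category values. A minimal sketch of one way to avoid that dependency is to build a single category list from both frames first. The helper name `encode_shared` and the toy frames are assumptions for illustration, not part of the original notebook.

```python
import pandas as pd

# Toy stand-ins for the real train/test frames (values are illustrative only).
train_demo = pd.DataFrame({"sex": ["male", "female", "na", "male"]})
test_demo = pd.DataFrame({"sex": ["female", "male", "female"]})

def encode_shared(train_col, test_col):
    """Encode two columns with one shared, sorted category list (codes start at 1)."""
    categories = sorted(set(train_col) | set(test_col))
    dtype = pd.CategoricalDtype(categories=categories)
    train_codes = train_col.astype(dtype).cat.codes + 1
    test_codes = test_col.astype(dtype).cat.codes + 1
    return train_codes, test_codes, categories

train_codes, test_codes, categories = encode_shared(train_demo["sex"], test_demo["sex"])
print(categories)
print(train_codes.tolist())
print(test_codes.tolist())
```

Here `'na'` appears only in the toy train frame, yet `'female'` and `'male'` still map to the same integers in both frames because the category list is shared.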

In [None]:
x_train = train[['sex', 'age_approx','anatom_site_general_challenge']]
y_train = train['target']

x_test = test[['sex', 'age_approx','anatom_site_general_challenge']]

train_DMatrix = xgb.DMatrix(x_train, label= y_train)
test_DMatrix = xgb.DMatrix(x_test)


In [None]:
# Parameters for the native xgb.train API; the XGBClassifier below sets its own values.
param = {
    'booster': 'gbtree',
    'eta': 0.3,
    'num_class': 2,
    'max_depth': 5
}
epochs = 100

In [None]:
clf = xgb.XGBClassifier(n_estimators=1000,
                        max_depth=8,
                        objective='binary:logistic',  # the target is binary (0 = benign, 1 = melanoma)
                        seed=0,
                        nthread=-1,
                        learning_rate=0.15,
                        # weight the rare positive class: negative rows / positive rows
                        scale_pos_weight=(32542/584))
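
The hard-coded `32542/584` looks like the number of benign rows divided by the number of melanoma rows in `train.csv`. As a rough sketch, the same ratio can be computed from the label column instead of being typed in by hand. The toy series `y_demo` is an assumption for illustration; in the notebook the real column would be `train['target']`.

```python
import pandas as pd

# Toy label column: 1 = melanoma, 0 = benign (illustrative, not the real data).
y_demo = pd.Series([0, 0, 0, 0, 0, 0, 1, 0, 1, 0])

counts = y_demo.value_counts()
n_negative = int(counts.get(0, 0))
n_positive = int(counts.get(1, 0))

# Ratio of negatives to positives; fall back to 1.0 if there are no positives.
scale_pos_weight = n_negative / n_positive if n_positive else 1.0
print(n_negative, n_positive, scale_pos_weight)
```

Computing the ratio this way keeps it correct if the training rows are filtered or resampled later.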

In [None]:
clf.fit(x_train, y_train)

In [None]:
sub.head()

In [None]:
sub["meta_target"] = clf.predict_proba(x_test)[:,1]

In [None]:
#sub.to_csv('submission.csv', index = False)

Now that we have predicted the target from the metadata, we can move on to training on the images. For this we use the fastai library.

# Training on the Images


In [None]:
from fastai.imports import *
from fastai import *
from fastai.vision import *
from torchvision.models import *

**Since the number of images is very large, training on the whole dataset takes a long time even on GPUs. So I train on a subset of the images and make predictions with the resulting model.**

In [None]:
train.head()

In [None]:
train.shape

In [None]:
label1 = train[train['target']==1]

In [None]:
label1.shape

In [None]:
label2 = train[train['target']==0].iloc[:584]
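
The cell above takes the first 584 benign rows in file order. A possible alternative, shown here only as a sketch on a toy frame, is to draw the benign rows at random so the subset does not depend on how the CSV happens to be sorted. The frame `demo` and the seed `42` are assumptions for illustration.

```python
import pandas as pd

# Toy frame standing in for train.csv (illustrative values only).
demo = pd.DataFrame({
    "image_name": [f"ISIC_{i:04d}" for i in range(10)],
    "target": [0, 0, 1, 0, 0, 0, 1, 0, 0, 1],
})

positives = demo[demo["target"] == 1]
negatives = demo[demo["target"] == 0]

# Draw as many negatives as there are positives; random_state fixes the draw.
negatives_sampled = negatives.sample(n=len(positives), random_state=42)

# Concatenate and shuffle so the two classes are not stored in two blocks.
balanced = (
    pd.concat([positives, negatives_sampled])
    .sample(frac=1, random_state=42)
    .reset_index(drop=True)
)
print(balanced["target"].value_counts().to_dict())
```

With the real data, `demo` would be `train` and the result would play the role of `train_small`.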

In [None]:
label2.head()

In [None]:
train_small = pd.concat([label1,label2])

In [None]:
train_small.head()

In [None]:
train_small.shape

In [None]:
train_small['image_name'] = train_small['image_name']+'.jpg'

In [None]:
test['image_name'] = test['image_name']+'.jpg'

In [None]:
train_small.head()

In [None]:
train_small.to_csv('train_jpg.csv', index = False)

In [None]:
test.to_csv('test_jpg.csv', index = False)

In [None]:
tfms = get_transforms(flip_vert=True)

In [None]:
path = "/kaggle/"
data = ImageDataBunch.from_csv(path, folder='input/siim-isic-melanoma-classification/jpeg/train',
                               valid_pct=0.2,
                               csv_labels='working/train_jpg.csv',
                               ds_tfms=tfms,
                               fn_col='image_name',
                               label_col='target',
                               bs=32,
                               size=256).normalize(imagenet_stats)
test_data = ImageDataBunch.from_csv(path, folder='input/siim-isic-melanoma-classification/jpeg/test',
                                    valid_pct=0.2,
                                    csv_labels='working/test_jpg.csv',
                                    ds_tfms=tfms,
                                    fn_col='image_name',
                                    bs=32,
                                    size=256).normalize(imagenet_stats)


In [None]:
data.show_batch(rows=3,figsize=(8,8));

In [None]:
learn = cnn_learner(data, models.resnet34, metrics=error_rate)
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
learn.model.to(device)

In [None]:
#learn.model_dir = "/kaggle/working"
learn.lr_find()

In [None]:
learn.recorder.plot()

In [None]:
learn.fit_one_cycle(8,slice(0.015));

In [None]:
learn.freeze()
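
The notebook stops after training, but the introduction mentions combining the metadata result with the image result. One simple way to do that, offered here as an assumption rather than the author's method, is a weighted average of the two probability columns. The function `blend_probabilities`, the parameter `image_weight`, and the toy numbers are all hypothetical.

```python
import numpy as np

def blend_probabilities(meta_prob, image_prob, image_weight=0.7):
    """Weighted average of two probability arrays for the same rows.

    image_weight is a hypothetical tuning knob, not a value from the notebook.
    """
    meta_prob = np.asarray(meta_prob, dtype=float)
    image_prob = np.asarray(image_prob, dtype=float)
    if meta_prob.shape != image_prob.shape:
        raise ValueError("both prediction arrays must have the same shape")
    if not 0.0 <= image_weight <= 1.0:
        raise ValueError("image_weight must lie in [0, 1]")
    return image_weight * image_prob + (1.0 - image_weight) * meta_prob

# Toy predictions for three test rows (illustrative values only).
meta_demo = [0.10, 0.80, 0.40]
image_demo = [0.20, 0.60, 0.90]
blended = blend_probabilities(meta_demo, image_demo, image_weight=0.5)
print(blended)
```

In the notebook, `meta_prob` would correspond to `sub["meta_target"]` and `image_prob` to the image model's melanoma probabilities for the same test rows, in the same order.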

# Resources

[Fastai documentation](https://docs.fast.ai/tutorial.resources.html)


[Fastai Course](https://course.fast.ai/)

## Happy Coding!