In [1]:
import os
from pathlib import Path
import sys

# If we're using Google Colab, we set the environment variable to point to the relevant folder in our Google Drive:
if 'COLAB_GPU' in os.environ:
    from google.colab import drive
    drive.mount('/content/drive')
    os.environ['SKIN_LESION_CLASSIFICATION'] = '/content/drive/MyDrive/Colab Notebooks/skin-lesion-classification'

# Otherwise, we use the environment variable on our local system:
project_environment_variable = "SKIN_LESION_CLASSIFICATION"

# Path to the root directory of the project:
project_path = Path(os.environ.get(project_environment_variable))

# Relative path to /scripts (from where custom modules will be imported):
scripts_path = project_path.joinpath("scripts")

# Add this path to sys.path so that Python will look there for modules:
sys.path.append(str(scripts_path))

# Now import path_setup from our custom utils module to create a dictionary of paths to all subdirectories of our root directory:
from utils import path_setup
path = path_setup.subfolders(project_path)

path['project'] : D:\projects\skin-lesion-classification
path['images'] : D:\projects\skin-lesion-classification\images
path['models'] : D:\projects\skin-lesion-classification\models
path['expository'] : D:\projects\skin-lesion-classification\expository
path['literature'] : D:\projects\skin-lesion-classification\literature
path['notebooks'] : D:\projects\skin-lesion-classification\notebooks
path['presentation'] : D:\projects\skin-lesion-classification\presentation
path['scripts'] : D:\projects\skin-lesion-classification\scripts


In [2]:
from typing import Type, Union      # For type hints
from processing import process      # Custom module for processing metadata

data_dir: Path = path["images"]     # Path to directory containing metadata.csv file
csv_filename: str = "metadata.csv"  # The filename
tvr: int = 3                        # Ratio of training set to validation set. See discussion below for explanation.
seed: int = 0                       # Random seed for parts of the process where randomness is called for.
keep_first: bool = False            # If False, a random image of each lesion is labeled 't1' (or 'v1'); if True, the first image listed for the lesion is used.
stratified: bool = True             # If True, we stratify classes so that the proportions remain as stable as possible after train/val split. 
                                    # If False, the proportions will be roughly similar.
to_classify: list = ["mel",         # These are the lesion types we are interested in classifying. Any missing ones will be grouped together as the 0-label class.
                     "bcc", 
                     "akiec", 
                     "nv"]

<details>
    <summary><b><i>Train/validation split explanation: click here to expand/collapse</i></b></summary>
    
We partition our dataset based on ```lesion_id```, **not** on ```image_id```: that way, every lesion will be represented in training or in validation, but not both.

For each binary classification task, we will train a model on either
* **exactly one** image for every lesion in our training set; or
* **all** images of every lesion in our training set.

In both cases, we will validate our model on either
* **exactly one** image for every lesion in our validation set; or
* **all** images of every lesion in our validation set.

**However**, we will make only one prediction per lesion (```lesion_id```) in our validation set, i.e. in the second case (validate on all images), if there are multiple images of a lesion in the validation set, we will combine the predictions for the multiple images into a single prediction for the lesion.

Accordingly, we proceed as follows: 
1. Randomly select (without replacement) a proportion of our $7470$ distinct ```lesion_id```s and label them with ```t``` (train).
2. Label the remaining ```lesion_id```s with ```v``` (validate).
3. For each ```lesion_id``` labeled with a ```t```:
    * Select an ```image_id``` and label it ```t1```.
    * Label all (if any) remaining ```image_id```s corresponding to this ```lesion_id``` with ```ta```.
4.  For each ```lesion_id``` labeled with a ```v```:
    * Select an ```image_id``` and label it ```v1```.
    * Label all (if any) remaining ```image_id```s corresponding to this ```lesion_id``` with ```va```.

In Step 1, the number of ```lesion_id```s randomly selected to be labeled ```t``` will be such that the ratio of ```t```s to ```v```s is as close as possible to a specified ratio (we default to $3$, i.e. $\approx75\%$ of lesions are represented in training). In Steps 3 and 4, the first substep can be done randomly (our default choice), or we can simply choose the "first" image in our table that corresponds to the lesion. 
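
As a quick check of the arithmetic, a train-to-validation ratio of `tvr` corresponds to a training fraction of `tvr / (tvr + 1)`. The sketch below applies that fraction to the per-diagnosis lesion counts reported later in this notebook and rounds down. The rounding rule is an assumption (the `process` implementation is not shown here), although it happens to reproduce the reported training and validation counts.

```python
# Lesion counts per diagnosis, copied from the dx_dist output in this notebook.
lesions_per_dx = {"nv": 5403, "other": 898, "mel": 614, "bcc": 327, "akiec": 228}

tvr = 3  # ratio of training lesions to validation lesions

# Integer arithmetic for floor(n * tvr / (tvr + 1)); rounding down is an assumption.
train_per_dx = {dx: n * tvr // (tvr + 1) for dx, n in lesions_per_dx.items()}
val_per_dx = {dx: n - train_per_dx[dx] for dx, n in lesions_per_dx.items()}

total = sum(lesions_per_dx.values())
train_total = sum(train_per_dx.values())

print(train_per_dx)
print(f"train: {train_total} of {total} lesions ({100 * train_total / total:.2f}%)")
```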

The four train/val scenarios we consider are:
* ```t1v1```: train on precisely those images labeled ```t1``` and validate on precisely those labeled ```v1```.
* ```t1va```: train on precisely those images labeled ```t1``` and validate on precisely those labeled ```v1``` **or** ```va```.
* ```tav1```: train on precisely those images labeled ```t1``` **or** ```ta``` and validate on precisely those labeled ```v1```.
* ```tava```: train on precisely those images labeled ```t1``` **or** ```ta``` and validate on precisely those labeled ```v1``` **or** ```va```.

The mnemonic is ```t``` for training, ```v``` for validation, ```1``` for one-image-per-lesion, and ```a``` for all images.
</details>
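
The four steps of the split can be sketched in plain Python as follows. This is a minimal, unstratified illustration rather than the `process` implementation: the function `label_images`, its arguments, the `SCENARIOS` table, and the toy lesion IDs are all hypothetical.

```python
import random
from collections import defaultdict

def label_images(rows, tvr=3, seed=0, keep_first=False):
    """Assign 't1'/'ta'/'v1'/'va' to each (lesion_id, image_id) pair.

    Splits by lesion_id so no lesion appears in both training and validation.
    Stratification by diagnosis is omitted to keep the sketch short.
    """
    rng = random.Random(seed)

    images_by_lesion = defaultdict(list)
    for lesion_id, image_id in rows:
        images_by_lesion[lesion_id].append(image_id)

    # Steps 1 and 2: choose training lesions without replacement; the rest validate.
    lesion_ids = sorted(images_by_lesion)
    n_train = len(lesion_ids) * tvr // (tvr + 1)
    train_lesions = set(rng.sample(lesion_ids, n_train))

    # Steps 3 and 4: one image per lesion gets the '1' label, the others get 'a'.
    labels = {}
    for lesion_id, image_ids in images_by_lesion.items():
        prefix = "t" if lesion_id in train_lesions else "v"
        chosen = image_ids[0] if keep_first else rng.choice(image_ids)
        for image_id in image_ids:
            labels[image_id] = prefix + ("1" if image_id == chosen else "a")
    return labels

# The four scenarios, expressed as (training labels, validation labels).
SCENARIOS = {
    "t1v1": ({"t1"}, {"v1"}),
    "t1va": ({"t1"}, {"v1", "va"}),
    "tav1": ({"t1", "ta"}, {"v1"}),
    "tava": ({"t1", "ta"}, {"v1", "va"}),
}

# Toy data: 8 lesions, every second one photographed twice (made-up IDs).
rows = [(f"L{i}", f"img{i}_{j}") for i in range(8) for j in range(1 + i % 2)]
labels = label_images(rows)
train_labels, val_labels = SCENARIOS["tava"]
train_images = [img for img, lab in labels.items() if lab in train_labels]
val_images = [img for img, lab in labels.items() if lab in val_labels]
```

Because the split is made on lesions before any image is labeled, the training and validation image lists can never share a lesion, whichever scenario is selected.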

In [3]:
# Create an instance of the process class with attribute values as above.
metadata = process(data_dir=data_dir,
                   csv_filename=csv_filename,
                   tvr=tvr,
                   seed=seed,
                   keep_first=keep_first,
                   stratified=stratified,
                   to_classify=to_classify)

Successfully loaded file 'D:\projects\skin-lesion-classification\images\metadata.csv'.
Inserted 'num_images' column in dataframe, to the right of 'lesion_id' column.
Created label_dict (maps labels to indices).
Inserted 'label' column in dataframe, to the right of 'dx' column.
Added 'set' column to dataframe, with values 't1', 'v1', 'ta', and 'va', to the right of 'localization' column.


In [5]:
# Let's have a look at our metadata dataframe, which is now just an attribute of the metadata instance of the process class.
metadata.df.head()

Unnamed: 0,lesion_id,num_images,image_id,dx,label,dx_type,age,sex,localization,set
0,HAM_0000118,2,ISIC_0027419,bkl,0,histo,80.0,male,scalp,ta
1,HAM_0000118,2,ISIC_0025030,bkl,0,histo,80.0,male,scalp,t1
2,HAM_0002730,2,ISIC_0026769,bkl,0,histo,80.0,male,scalp,va
3,HAM_0002730,2,ISIC_0025661,bkl,0,histo,80.0,male,scalp,v1
4,HAM_0001466,2,ISIC_0031633,bkl,0,histo,75.0,male,ear,va


In [6]:
for across in ["lesions", "images"]:
    for subset in ["all", "train", "val"]:
        process.dx_dist(metadata, subset = subset, across = across)

DISTRIBUTION OF LESIONS BY DIAGNOSIS: OVERALL


dx,nv,other,mel,bcc,akiec
freq,5403.0,898.0,614.0,327.0,228.0
%,72.33,12.02,8.22,4.38,3.05


Total lesions: 7470.

DISTRIBUTION OF LESIONS BY DIAGNOSIS: TRAIN


dx,nv,other,mel,bcc,akiec
freq,4052.0,673.0,460.0,245.0,171.0
%,72.34,12.02,8.21,4.37,3.05


Total lesions: 5601 (74.98% of all lesions).

DISTRIBUTION OF LESIONS BY DIAGNOSIS: VAL


dx,nv,other,mel,bcc,akiec
freq,1351.0,225.0,154.0,82.0,57.0
%,72.28,12.04,8.24,4.39,3.05


Total lesions: 1869 (25.02% of all lesions).

DISTRIBUTION OF IMAGES BY DIAGNOSIS: OVERALL


dx,nv,other,mel,bcc,akiec
freq,6705.0,1356.0,1113.0,514.0,327.0
%,66.95,13.54,11.11,5.13,3.27


Total images: 10015.

DISTRIBUTION OF IMAGES BY DIAGNOSIS: TRAIN


dx,nv,other,mel,bcc,akiec
freq,5007.0,1008.0,831.0,384.0,250.0
%,66.94,13.48,11.11,5.13,3.34


Total images: 7480 (74.69% of all images).

DISTRIBUTION OF IMAGES BY DIAGNOSIS: VAL


dx,nv,other,mel,bcc,akiec
freq,1698.0,348.0,282.0,130.0,77.0
%,66.98,13.73,11.12,5.13,3.04


Total images: 2535 (25.31% of all images).



In [8]:
# There are some implicit attributes of our process class:
metadata_hidden_attributes = metadata.get_hidden_attributes()
print(list(metadata_hidden_attributes.keys()))
# E.g.:
print(metadata_hidden_attributes["_label_codes"])

['_csv_file_path', '_label_dict', '_label_codes', '_num_labels', '_df_train1', '_df_train_a', '_df_val1', '_df_val_a', '_df_sample_batch']
{0: 'other', 1: 'mel', 2: 'nv', 3: 'bcc', 4: 'akiec'}
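
To make the role of `_label_codes` concrete, the sketch below maps a raw `dx` value to its integer label, sending any diagnosis outside `to_classify` to the `other` class. The dictionary is copied from the output above; the helper `dx_to_label` is hypothetical and not part of the `process` class.

```python
label_codes = {0: "other", 1: "mel", 2: "nv", 3: "bcc", 4: "akiec"}

# Invert the mapping: label word -> integer code.
label_dict = {word: code for code, word in label_codes.items()}

def dx_to_label(dx):
    """Return the integer label for a diagnosis; unlisted diagnoses map to 'other'."""
    return label_dict.get(dx, label_dict["other"])

print(dx_to_label("mel"))  # 1
print(dx_to_label("bkl"))  # 0, since 'bkl' is not one of the classes of interest
```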


In [9]:
# Now let's set values for the attributes of our resnet18 class (the model we will use with our processed data).
# One of the attributes has to do with image transformations.

import torchvision.transforms as transforms

transform = transforms.Compose([
    transforms.CenterCrop((300, 300)),
    transforms.Resize((224, 224)),  # Resize images to fit ResNet input size
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),  # Normalize with ImageNet stats
])
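
For reference, `transforms.Normalize` computes `(x - mean) / std` separately for each channel, after `ToTensor` has scaled 8-bit pixel values to the range [0, 1]. The plain-Python sketch below reproduces that arithmetic for a single made-up RGB pixel; the helper `normalize_pixel` is hypothetical and only illustrates the formula.

```python
IMAGENET_MEAN = [0.485, 0.456, 0.406]
IMAGENET_STD = [0.229, 0.224, 0.225]

def normalize_pixel(rgb_uint8):
    """Scale an 8-bit RGB pixel to [0, 1], then standardize each channel."""
    scaled = [value / 255 for value in rgb_uint8]
    return [(x - m) / s for x, m, s in zip(scaled, IMAGENET_MEAN, IMAGENET_STD)]

normalized = normalize_pixel((255, 0, 128))
print([round(v, 3) for v in normalized])
```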

In [28]:
import pandas as pd
from typing import List, Callable

df: pd.DataFrame = metadata.df                   # Background dataset for the model. metadata._df_sample_batch is a random selection of 64 rows of metadata.df. We use it for testing our code.
train_set: Union[pd.DataFrame, list, str] = "t1" # "t1" (one image per lesion in training set); ["t1", "ta"] (all images for each lesion in training set); can also specify another sub-dataframe of self.df.
val_set: Union[pd.DataFrame, list, str] = "v1"   # Similar to train_set above.
label_codes: dict = metadata._label_codes        # Correspondence between label codes like 0 and label words like 'other'.
data_dir: Path = path["images"]                  # Path to directory where images are stored.
model_dir: Path = path["models"]                 # Path to directory where models/model info/model results are stored.
transform: Callable = transform                  # Composed transform applied to each image before it is fed to ResNet-18.
batch_size: int = 32                             # Mini-batch size: default 32.
epochs: int = 3                                  # Number of epochs (all layers unfrozen from the start): default 10.
base_learning_rate: float = 0.001                # Learning rate to start with: default 0.001. Using Adam optimizer.
filename_stem: str = "rn18mc"                    # For saving model and related files. train set and num epochs will be appended automatically. Default "rn18mc".
filename_suffix: str = "test"                    # Something descriptive and unique for future reference and to avoid over-writing other files. Default empty string "".

In [336]:
# To reload custom module after editing, so we don't have to restart kernel and start over from the beginning every time.
import importlib
import multiclass_models
# from multiclass_models import resnet18
importlib.reload(multiclass_models);

In [337]:
# Create an instance of the resnet18 class with attribute values as above.
from multiclass_models import resnet18

resnet18mc_test = resnet18(                                  # This instance is for testing the code.
    df=metadata.df.sample(n=16, random_state=metadata.seed), # Just a small number of rows for testing the code.
    train_set=train_set,
    val_set=val_set,
    label_codes=label_codes,
    data_dir=data_dir,
    model_dir=model_dir,
    transform=transform,
    batch_size=batch_size,
    epochs=3,                                                # Just a few epochs for testing the code.
    base_learning_rate=base_learning_rate,
    filename_stem=filename_stem,
    filename_suffix="test",                                  # Suffix "test" because we're just testing the code.
    Print=True,                                              # Print stuff while testing code to help find errors.
)

<details><summary>Click here to read about an error we found and (hopefully) corrected.</summary>
    
Below, we were catching a ```RuntimeError``` when we tested on a dataset of size ```k``` (as in ```metadata.df.sample(n=k, random_state=metadata.seed)``` above) for these values of ```k```: ```1,12,13,14,15,16```.

Sizes ```2,3,...,11,17,18,19,20,...,33``` were fine. Unfortunately, the same error was thrown when training on the full training set (on Google Colab) of size ~7,500, after a good 30 to 60 minutes.

We thought it might have something to do with the size of the dataset modulo the batch size (```32```), but received no errors for sizes ```44,45```. Also, the number of epochs seemed irrelevant: changing the number of epochs to ```1``` or ```2```, while keeping all else constant, had no effect as far as this error was concerned.

We noted that the error was thrown, at least in some cases (we did not re-check all sizes ```k```), not during the training loop but during validation. We checked the validation sets ```resnet18mc_test._df_val``` that were being fed into the dataloader at this stage, and noticed that for all of the problematic ```k```, and none of the others, there was precisely one image in the validation set. This points to ```.squeeze()```, which removes every dimension of size 1: with a single image, an output of shape ```torch.Size([1, 5])``` becomes ```torch.Size([5])``` and loses its batch dimension. We changed the line ```loss = criterion(outputs.squeeze(), labels)``` to ```loss = criterion(outputs, labels)``` in the validation loop, and found that the ```RuntimeError``` was no longer thrown. We also made the same change to the training loop. Regardless of whether ```.squeeze()``` is used, ```outputs.shape``` always seems to be of the form ```torch.Size([m, 5])``` (when the code works), which is the right shape, so squeezing was redundant.
</details>
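
The shape bookkeeping behind this error can be sketched without PyTorch. `Tensor.squeeze()` called with no arguments drops every dimension of size 1, so a mini-batch containing a single image loses its batch dimension. The helper `squeezed_shape` below is a hypothetical stand-in that mimics this on plain tuples. The last lines compute the final mini-batch size for the 5601 `t1` training images and 1869 `v1` validation images reported earlier with `batch_size = 32`, assuming the data loader keeps the last, smaller batch.

```python
def squeezed_shape(shape):
    """Mimic Tensor.squeeze() on a shape tuple: drop all dimensions of size 1."""
    return tuple(dim for dim in shape if dim != 1)

num_classes = 5

for batch in (1, 2, 11, 32):
    outputs_shape = (batch, num_classes)
    print(batch, outputs_shape, "->", squeezed_shape(outputs_shape))
# Only the single-image batch changes: (1, 5) becomes (5,).

def last_batch_size(num_images, batch_size):
    """Size of the final mini-batch when the remainder is kept (not dropped)."""
    remainder = num_images % batch_size
    return remainder if remainder else batch_size

print(last_batch_size(5601, 32))  # 1: the final t1 training batch is a single image
print(last_batch_size(1869, 32))  # 13
```

If the full run used the `t1` set, a final batch of one image would be consistent with the error surfacing only late in the first epoch, though the notebook does not confirm which set was in use at the time.
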
<details><summary>Click here to see the error details.</summary>

```
    
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
Input In [83], in <cell line: 2>()
      1 # Train the model on the specified training data by calling the train method:
----> 2 resnet18mc_test.train()

File D:\projects\skin-lesion-classification\scripts\multiclass_models.py:195, in resnet18.train(self)
    193                     val_outputs = model(val_images)
    194 #                     val_loss = criterion(val_outputs.squeeze(), val_labels.long())
--> 195                     val_loss = criterion(val_outputs.squeeze(), val_labels)
    196                     if self.Print:
    197                         print(f"outputs.shape: {outputs.shape}")

File ~\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\nn\modules\module.py:1511, in Module._wrapped_call_impl(self, *args, **kwargs)
   1509     return self._compiled_call_impl(*args, **kwargs)  # type: ignore[misc]
   1510 else:
-> 1511     return self._call_impl(*args, **kwargs)

File ~\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\nn\modules\module.py:1520, in Module._call_impl(self, *args, **kwargs)
   1515 # If we don't have any hooks, we want to skip the rest of the logic in
   1516 # this function, and just call forward.
   1517 if not (self._backward_hooks or self._backward_pre_hooks or self._forward_hooks or self._forward_pre_hooks
   1518         or _global_backward_pre_hooks or _global_backward_hooks
   1519         or _global_forward_hooks or _global_forward_pre_hooks):
-> 1520     return forward_call(*args, **kwargs)
   1522 try:
   1523     result = None

File ~\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\nn\modules\loss.py:1179, in CrossEntropyLoss.forward(self, input, target)
   1178 def forward(self, input: Tensor, target: Tensor) -> Tensor:
-> 1179     return F.cross_entropy(input, target, weight=self.weight,
   1180                            ignore_index=self.ignore_index, reduction=self.reduction,
   1181                            label_smoothing=self.label_smoothing)

File ~\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\nn\functional.py:3059, in cross_entropy(input, target, weight, size_average, ignore_index, reduce, reduction, label_smoothing)
   3057 if size_average is not None or reduce is not None:
   3058     reduction = _Reduction.legacy_get_string(size_average, reduce)
-> 3059 return torch._C._nn.cross_entropy_loss(input, target, weight, _Reduction.get_enum(reduction), ignore_index, label_smoothing)

RuntimeError: 0D or 1D target tensor expected, multi-target not supported
```
</details>

In [338]:
print("Training set".upper())
display(resnet18mc_test._df_train)
print("Validation set".upper())
display(resnet18mc_test._df_val)

TRAINING SET


Unnamed: 0,lesion_id,num_images,image_id,dx,label,dx_type,age,sex,localization,set
6144,HAM_0002695,1,ISIC_0028664,nv,2,follow_up,45.0,male,back,t1
4658,HAM_0000370,1,ISIC_0025998,nv,2,follow_up,70.0,male,trunk,t1
2767,HAM_0005536,1,ISIC_0026798,bcc,3,histo,45.0,male,lower extremity,t1
1359,HAM_0005084,2,ISIC_0027261,mel,1,histo,75.0,male,ear,t1
4097,HAM_0007158,1,ISIC_0027206,nv,2,follow_up,50.0,female,lower extremity,t1
4503,HAM_0003274,1,ISIC_0031348,nv,2,follow_up,40.0,female,upper extremity,t1
5529,HAM_0003796,1,ISIC_0030696,nv,2,follow_up,50.0,male,trunk,t1
5508,HAM_0005574,1,ISIC_0027823,nv,2,follow_up,35.0,female,trunk,t1
1087,HAM_0000215,1,ISIC_0025484,bkl,0,consensus,70.0,female,lower extremity,t1
1328,HAM_0002838,1,ISIC_0030818,mel,1,histo,65.0,female,lower extremity,t1


VALIDATION SET


Unnamed: 0,lesion_id,num_images,image_id,dx,label,dx_type,age,sex,localization,set
1070,HAM_0003691,1,ISIC_0027428,bkl,0,consensus,75.0,female,back,v1


In [339]:
# Train the model on the specified training data by calling the train method:
resnet18mc_test.train()

image_id, label, ohe-label: ISIC_0028664, 2, tensor([0., 0., 1., 0., 0.])
image_id, label, ohe-label: ISIC_0024505, 0, tensor([1., 0., 0., 0., 0.])
image_id, label, ohe-label: ISIC_0031348, 2, tensor([0., 0., 1., 0., 0.])
image_id, label, ohe-label: ISIC_0025484, 0, tensor([1., 0., 0., 0., 0.])
image_id, label, ohe-label: ISIC_0027206, 2, tensor([0., 0., 1., 0., 0.])
image_id, label, ohe-label: ISIC_0030696, 2, tensor([0., 0., 1., 0., 0.])
image_id, label, ohe-label: ISIC_0026798, 3, tensor([0., 0., 0., 1., 0.])
image_id, label, ohe-label: ISIC_0027261, 1, tensor([0., 1., 0., 0., 0.])
image_id, label, ohe-label: ISIC_0025998, 2, tensor([0., 0., 1., 0., 0.])
image_id, label, ohe-label: ISIC_0030818, 1, tensor([0., 1., 0., 0., 0.])
image_id, label, ohe-label: ISIC_0027823, 2, tensor([0., 0., 1., 0., 0.])
outputs.shape: torch.Size([11, 5])
loss: 1.4874520301818848
Validating...
image_id, label, ohe-label: ISIC_0027428, 0, tensor([1., 0., 0., 0., 0.])
outputs.shape: torch.Size([11, 5])
val

In [340]:
# Let's look at the training and validation loss for each epoch:
resnet18mc_test.epoch_losses

{'train_loss': array([1.48745203, 0.05748529, 0.00850889]),
 'val_loss': array([3.96511078, 7.2349515 , 7.86598682])}

In [341]:
# The model will be saved as a .pth file in the directory given by model_dir attribute.
# Sans .pth extension, the filename is
resnet18mc_test._filename

'rn18mc_t1_3e_test'
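
Judging from this output, the filename appears to be assembled from `filename_stem`, the training set, the number of epochs, and `filename_suffix`. The sketch below reproduces the string for the values used here; the helper `build_filename` and its handling of an empty suffix are assumptions, not the actual code in `multiclass_models`.

```python
def build_filename(stem, train_set, epochs, suffix=""):
    """Join the parts with underscores, e.g. 'rn18mc_t1_3e_test'."""
    parts = [stem, train_set, f"{epochs}e"]
    if suffix:  # assumed: an empty suffix adds nothing
        parts.append(suffix)
    return "_".join(parts)

print(build_filename("rn18mc", "t1", 3, "test"))  # rn18mc_t1_3e_test
```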

In [342]:
# We can feed our entire dataframe through the trained model to obtain predictions for all lesions/images.
# The model weights can be loaded from a previously saved .pth file if they are no longer in memory.
inference_df = resnet18mc_test.inference()
display(inference_df)

image_id, label, ohe-label: ISIC_0028664, 2, tensor([0., 0., 1., 0., 0.])
image_id, label, ohe-label: ISIC_0025998, 2, tensor([0., 0., 1., 0., 0.])
image_id, label, ohe-label: ISIC_0032817, 2, tensor([0., 0., 1., 0., 0.])
image_id, label, ohe-label: ISIC_0026577, 2, tensor([0., 0., 1., 0., 0.])
image_id, label, ohe-label: ISIC_0026798, 3, tensor([0., 0., 0., 1., 0.])
image_id, label, ohe-label: ISIC_0027261, 1, tensor([0., 1., 0., 0., 0.])
image_id, label, ohe-label: ISIC_0027206, 2, tensor([0., 0., 1., 0., 0.])
image_id, label, ohe-label: ISIC_0031348, 2, tensor([0., 0., 1., 0., 0.])
image_id, label, ohe-label: ISIC_0025752, 3, tensor([0., 0., 0., 1., 0.])
image_id, label, ohe-label: ISIC_0030696, 2, tensor([0., 0., 1., 0., 0.])
image_id, label, ohe-label: ISIC_0027823, 2, tensor([0., 0., 1., 0., 0.])
image_id, label, ohe-label: ISIC_0027428, 0, tensor([1., 0., 0., 0., 0.])
image_id, label, ohe-label: ISIC_0025484, 0, tensor([1., 0., 0., 0., 0.])
image_id, label, ohe-label: ISIC_00298

Unnamed: 0,image_id,prob_other,prob_mel,prob_nv,prob_bcc,prob_akiec
0,ISIC_0028664,0.000146,1.4e-05,0.999183,0.000205,0.000452
1,ISIC_0025998,7.6e-05,3.1e-05,0.999511,0.000121,0.000261
2,ISIC_0032817,0.020995,0.010367,0.80987,0.101917,0.056851
3,ISIC_0026577,0.003524,0.000575,0.985103,0.006438,0.004359
4,ISIC_0026798,4.6e-05,7e-06,2.2e-05,0.999108,0.000818
5,ISIC_0027261,0.011225,0.913997,0.027289,0.022313,0.025177
6,ISIC_0027206,0.00028,0.000102,0.996684,0.001162,0.001773
7,ISIC_0031348,8.9e-05,7.3e-05,0.998228,0.000425,0.001186
8,ISIC_0025752,0.000249,0.000788,0.989949,0.003074,0.00594
9,ISIC_0030696,0.001458,4.3e-05,0.99241,0.004207,0.001881
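
Each row of this table is a probability distribution over the five classes, so a hard prediction can be read off as the class with the largest probability. The sketch below does this for three rows copied from the table. The helper `predicted_class` is hypothetical, and taking the arg-max is one reasonable decision rule rather than something prescribed by the notebook.

```python
CLASSES = ["other", "mel", "nv", "bcc", "akiec"]  # order of the prob_* columns

# image_id -> probabilities, copied from the inference table.
rows = {
    "ISIC_0026798": [4.6e-05, 7e-06, 2.2e-05, 0.999108, 0.000818],
    "ISIC_0027261": [0.011225, 0.913997, 0.027289, 0.022313, 0.025177],
    "ISIC_0025752": [0.000249, 0.000788, 0.989949, 0.003074, 0.00594],
}

def predicted_class(probs):
    """Return (class name, probability) for the most probable class."""
    best = max(range(len(probs)), key=probs.__getitem__)
    return CLASSES[best], probs[best]

for image_id, probs in rows.items():
    name, p = predicted_class(probs)
    print(f"{image_id}: {name} ({p:.3f})")
```

Note that `ISIC_0025752` is logged above with label 3 (`bcc`) but receives its highest probability for `nv`.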


In [343]:
# Or we can make predictions for individual lesions/images:
display(resnet18mc_test.prediction("HAM_0005084"))
display(resnet18mc_test.prediction("ISIC_0032817"))

image_id, label, ohe-label: ISIC_0027261, 1, tensor([0., 1., 0., 0., 0.])


Unnamed: 0,image_id,prob_other,prob_mel,prob_nv,prob_bcc,prob_akiec
0,ISIC_0027261,0.011225,0.913997,0.027289,0.022313,0.025177


image_id, label, ohe-label: ISIC_0032817, 2, tensor([0., 0., 1., 0., 0.])


Unnamed: 0,image_id,prob_other,prob_mel,prob_nv,prob_bcc,prob_akiec
0,ISIC_0032817,0.020995,0.010367,0.80987,0.101917,0.056851
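
The split explanation earlier states that, when a lesion has several images, their predictions are combined into a single prediction per lesion, but it does not say how. One common choice is to average the per-image class probabilities; the sketch below assumes that rule. The function `combine_by_lesion`, the lesion IDs, and the probabilities are all hypothetical.

```python
from collections import defaultdict
from statistics import fmean

def combine_by_lesion(image_probs):
    """Average per-image class probabilities into one vector per lesion.

    image_probs: iterable of (lesion_id, [p_class0, p_class1, ...]) pairs.
    Averaging is an assumption; it is not taken from the project code.
    """
    grouped = defaultdict(list)
    for lesion_id, probs in image_probs:
        grouped[lesion_id].append(probs)
    # zip(*rows) transposes the rows so that each class is averaged separately.
    return {lesion_id: [fmean(col) for col in zip(*rows)]
            for lesion_id, rows in grouped.items()}

# Toy example: two images of one lesion, one image of another (made-up numbers).
toy = [
    ("HAM_A", [0.2, 0.8]),
    ("HAM_A", [0.4, 0.6]),
    ("HAM_B", [0.9, 0.1]),
]
lesion_probs = combine_by_lesion(toy)
print(lesion_probs)
```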


In [344]:
resnet18mc_test._filename

'rn18mc_t1_3e_test'

In [345]:
resnet18mc_test.state_dict

OrderedDict([('conv1.weight',
              tensor([[[[-1.2556e-02, -8.2634e-03, -3.9593e-03,  ...,  5.4480e-02,
                          1.4946e-02, -1.4829e-02],
                        [ 8.9535e-03,  7.4038e-03, -1.1206e-01,  ..., -2.7338e-01,
                         -1.3121e-01,  1.6132e-03],
                        [-9.0797e-03,  5.6960e-02,  2.9335e-01,  ...,  5.1757e-01,
                          2.5418e-01,  6.1433e-02],
                        ...,
                        [-2.9677e-02,  1.3925e-02,  7.0462e-02,  ..., -3.3498e-01,
                         -4.2270e-01, -2.5994e-01],
                        [ 2.8468e-02,  3.8837e-02,  6.0719e-02,  ...,  4.1171e-01,
                          3.9146e-01,  1.6394e-01],
                        [-1.5873e-02, -5.8048e-03, -2.6217e-02,  ..., -1.5282e-01,
                         -8.4355e-02, -7.8847e-03]],
              
                       [[-9.0741e-03, -2.4283e-02, -3.2384e-02,  ...,  3.5077e-02,
                          7.4047

In [347]:
# Let's check that the code works if our state dictionary is no longer in memory, so that we have to load it from a .pth file.
resnet18mc_test.state_dict = None
display(resnet18mc_test.inference(filename = resnet18mc_test._filename))

image_id, label, ohe-label: ISIC_0028664, 2, tensor([0., 0., 1., 0., 0.])
image_id, label, ohe-label: ISIC_0025998, 2, tensor([0., 0., 1., 0., 0.])
image_id, label, ohe-label: ISIC_0032817, 2, tensor([0., 0., 1., 0., 0.])
image_id, label, ohe-label: ISIC_0026577, 2, tensor([0., 0., 1., 0., 0.])
image_id, label, ohe-label: ISIC_0026798, 3, tensor([0., 0., 0., 1., 0.])
image_id, label, ohe-label: ISIC_0027261, 1, tensor([0., 1., 0., 0., 0.])
image_id, label, ohe-label: ISIC_0027206, 2, tensor([0., 0., 1., 0., 0.])
image_id, label, ohe-label: ISIC_0031348, 2, tensor([0., 0., 1., 0., 0.])
image_id, label, ohe-label: ISIC_0025752, 3, tensor([0., 0., 0., 1., 0.])
image_id, label, ohe-label: ISIC_0030696, 2, tensor([0., 0., 1., 0., 0.])
image_id, label, ohe-label: ISIC_0027823, 2, tensor([0., 0., 1., 0., 0.])
image_id, label, ohe-label: ISIC_0027428, 0, tensor([1., 0., 0., 0., 0.])
image_id, label, ohe-label: ISIC_0025484, 0, tensor([1., 0., 0., 0., 0.])
image_id, label, ohe-label: ISIC_00298

Unnamed: 0,image_id,prob_other,prob_mel,prob_nv,prob_bcc,prob_akiec
0,ISIC_0028664,0.000146,1.4e-05,0.999183,0.000205,0.000452
1,ISIC_0025998,7.6e-05,3.1e-05,0.999511,0.000121,0.000261
2,ISIC_0032817,0.020995,0.010367,0.80987,0.101917,0.056851
3,ISIC_0026577,0.003524,0.000575,0.985103,0.006438,0.004359
4,ISIC_0026798,4.6e-05,7e-06,2.2e-05,0.999108,0.000818
5,ISIC_0027261,0.011225,0.913997,0.027289,0.022313,0.025177
6,ISIC_0027206,0.00028,0.000102,0.996684,0.001162,0.001773
7,ISIC_0031348,8.9e-05,7.3e-05,0.998228,0.000425,0.001186
8,ISIC_0025752,0.000249,0.000788,0.989949,0.003074,0.00594
9,ISIC_0030696,0.001458,4.3e-05,0.99241,0.004207,0.001881
