*   This notebook was developed and run on Google Colab. It lets the user train the models described in our paper "DamageMap: A post-wildfire damaged buildings classifier". The models output "0" for an undamaged building and "1" for a damaged building.

The dataset used to train the models should consist of separate images of building roofs. Prepare the dataset in the following way:

*   The images of the dataset should be placed in a folder that contains 2 subfolders. The first (in alphabetical order) subfolder should contain the images of the undamaged buildings, because they will automatically get the label "0" (and we want the model to predict "0" for undamaged buildings). The second (in alphabetical order) subfolder should contain the images of damaged buildings.
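
The alphabetical rule above can be checked without loading any images. The sketch below mimics how `torchvision.datasets.ImageFolder` assigns labels (it sorts the class folder names and numbers them from 0). The folder names are the ones used later in this notebook; the helper `folder_labels` is a hypothetical name introduced only for illustration.

```python
import os
import tempfile

def folder_labels(root):
    """Map each class subfolder of `root` to the label ImageFolder would give it."""
    classes = sorted(entry.name for entry in os.scandir(root) if entry.is_dir())
    return {name: index for index, name in enumerate(classes)}

# Build a throwaway dataset folder; the creation order is deliberately reversed
# to show that only the alphabetical order of the names matters.
with tempfile.TemporaryDirectory() as root:
    for name in ["b_destroyed", "a_not_destroyed"]:
        os.makedirs(os.path.join(root, name))
    labels = folder_labels(root)

print(labels)  # {'a_not_destroyed': 0, 'b_destroyed': 1}
```

Prefixing the folder names with `a_` and `b_`, as done here, is one simple way to force the undamaged class to come first.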

The following cell allows Google Colab to get access to the files of your Google Drive.

In [None]:
from google.colab import drive

drive.mount('/content/drive', force_remount=True)

%cd drive/My\ Drive

Mounted at /content/drive
/content/drive/My Drive


Importing *necessary* libraries.

In [None]:
import torch
import torch.nn as nn
import torch.optim as optim
import torchvision
from torchvision import datasets, models, transforms
from torch.utils.data import DataLoader, Dataset
from torch.utils.data import sampler, RandomSampler, SubsetRandomSampler
from torch.utils.tensorboard import SummaryWriter
from PIL import Image, ImageOps
import torchvision.datasets as dset
import torchvision.transforms as T
import cv2

import numpy as np
from sklearn.metrics import confusion_matrix

import seaborn as sns
import matplotlib.pyplot as plt
import time
import os
import copy
print("PyTorch Version: ",torch.__version__)
print("Torchvision Version: ",torchvision.__version__)

PyTorch Version:  1.8.1+cu101
Torchvision Version:  0.9.1+cu101


If a GPU is available, the following cell lets the model use it to train faster. Training a network this deep without a GPU is not advised.

In [None]:
USE_GPU = True

if USE_GPU and torch.cuda.is_available():
    device = torch.device('cuda')
else:
    device = torch.device('cpu')

print('using device:', device)

using device: cuda


Load and prepare the dataset that the model will train on.

In [None]:
# This folder contains 3 subfolders with the 'train', 'val', and 'test' sets. Each one consists of
# two subfolders, one for undamaged (first alphabetically) and one for damaged building images.
FOLDERNAME = 'damaged_structures_detector/xbd_images'
BATCH_SIZE = 128                                           # Capped by GPU memory

first_transform = transforms.Compose([                     # This is the first transform that we apply to the dataset. It just resizes and crops the building images
        transforms.Resize(224),
        transforms.CenterCrop(224),
        transforms.ToTensor()
    ])

train_set = datasets.ImageFolder(root=os.path.join(FOLDERNAME,'train'), transform = first_transform)
train_loader = DataLoader(train_set, batch_size=BATCH_SIZE, shuffle=False, num_workers = 16)

# The next lines calculate the per-channel mean and standard deviation of the training set images (each averaged over the per-image values). We will later normalize the images based on these two values.
mean = 0.
std = 0.
nb_samples = 0.
for data, labels in train_loader:
    batch_samples = data.size(0)
    data = data.view(batch_samples, data.size(1), -1)
    mean += data.mean(2).sum(0)
    std += data.std(2).sum(0)
    nb_samples += batch_samples

mean /= nb_samples
std /= nb_samples
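
Note that the loop above averages the per-image standard deviations, which is not the same quantity as the standard deviation of all training pixels pooled together. The pure-Python sketch below illustrates the difference on two made-up single-channel images. The pixel values are invented for illustration, and the pooled variant is shown only for comparison, not as the method used in the paper. `statistics.stdev` applies the same (n-1) correction as the default `Tensor.std`.

```python
import statistics

# Two tiny single-channel "images", flattened to lists of pixel values.
images = [
    [0.0, 0.0, 1.0, 1.0],   # high-contrast image
    [0.5, 0.5, 0.5, 0.5],   # flat image
]

# What the loop above computes: the average of the per-image statistics.
mean_of_means = statistics.fmean([statistics.fmean(img) for img in images])
mean_of_stds = statistics.fmean([statistics.stdev(img) for img in images])

# For comparison: statistics over all pixels pooled together.
all_pixels = [p for img in images for p in img]
pooled_mean = statistics.fmean(all_pixels)
pooled_std = statistics.stdev(all_pixels)

print("mean of per-image means:", mean_of_means, "| pooled mean:", pooled_mean)
print("mean of per-image stds: {:.4f} | pooled std: {:.4f}".format(mean_of_stds, pooled_std))
```

With equally sized images the two means agree, while the averaged standard deviation comes out smaller than the pooled one in this example. Either value can be used for normalization as long as the same statistics are applied at training and prediction time.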

The next cell applies the second transformation (the one described in the paper) and prepares the dataset for the training process.

In [None]:
# This is the 2nd and last transformation. It adds random color jitter (brightness, contrast, saturation)
# and normalizes by the mean and std of the train set.
data_transform = transforms.Compose([
        transforms.Resize(224),
        transforms.CenterCrop(224),
        transforms.ColorJitter(brightness=0.7, contrast=0.6, saturation=0.3, hue=0),
        transforms.ToTensor(),
        transforms.Normalize(mean=mean,
                             std=std)
    ])

train_dataset = datasets.ImageFolder(root=os.path.join(FOLDERNAME,'train'),transform = data_transform)  # using this transformation we prepare the dataset on which we will train the model
val_dataset = datasets.ImageFolder(root=os.path.join(FOLDERNAME,'val'),transform = data_transform)      # using this transformation we prepare the dataset on which we will tune the training hyperparameters
test_dataset = datasets.ImageFolder(root=os.path.join(FOLDERNAME,'test'),transform = data_transform)    # using this transformation we prepare the dataset on which we will evaluate the model's performance

The following cell simply repeats the three dataset definitions from the previous cell, because Google Colab sometimes fails to load the datasets on the first attempt.

In [None]:
train_dataset = datasets.ImageFolder(root=os.path.join(FOLDERNAME,'train'),transform = data_transform)
val_dataset = datasets.ImageFolder(root=os.path.join(FOLDERNAME,'val'),transform = data_transform)
test_dataset = datasets.ImageFolder(root=os.path.join(FOLDERNAME,'test'),transform = data_transform)
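
The `transforms.Normalize` step in `data_transform` subtracts the per-channel mean and divides by the per-channel standard deviation. The sketch below reproduces that arithmetic for a single RGB pixel in plain Python. The statistics and the pixel are hypothetical values chosen for readability, not the ones computed from the training set, and `normalize_pixel` is an illustrative helper.

```python
def normalize_pixel(pixel, mean, std):
    """Apply (value - mean) / std to each channel of one RGB pixel."""
    return [(v - m) / s for v, m, s in zip(pixel, mean, std)]

# Hypothetical statistics and pixel; ToTensor has already scaled values to [0, 1].
example_mean = [0.4, 0.4, 0.4]
example_std = [0.2, 0.2, 0.2]
pixel = [0.5, 0.25, 0.75]

normalized = normalize_pixel(pixel, example_mean, example_std)
print([round(v, 3) for v in normalized])  # [0.5, -0.75, 1.75]
```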

In the next few cells we use the datasets created above to prepare the PyTorch DataLoaders.

In [None]:
# Intended Input Image Size
HEIGHT = 224
WIDTH = 224
CHANNELS = 3

# Number of training examples
NUM_TRAIN = len(train_dataset)

# Number of Validation examples
NUM_VAL = len(val_dataset)

# Number of Test Examples
NUM_TEST = len(test_dataset)

# Batch Size for Training (capped by GPU)
BATCH_SIZE = 64


# Shuffle the Data with Random Samplers
train_indices = list(range(NUM_TRAIN))
np.random.shuffle(train_indices)

val_indices = list(range(NUM_VAL))
np.random.shuffle(val_indices)

test_indices = list(range(NUM_TEST))
np.random.shuffle(test_indices)

train_idx, val_idx, test_idx = train_indices, val_indices, test_indices

In [None]:
train_sampler = SubsetRandomSampler(train_idx)
val_sampler   = SubsetRandomSampler(val_idx)
test_sampler  = SubsetRandomSampler(test_idx)


loader_train = DataLoader(train_dataset, batch_size=BATCH_SIZE, sampler=train_sampler, num_workers = 4) ## can switch to "shuffle = True" if sampler does not work 

loader_val = DataLoader(val_dataset, batch_size=BATCH_SIZE, sampler=val_sampler, num_workers = 4)

loader_test = DataLoader(test_dataset, batch_size=BATCH_SIZE, sampler=test_sampler, num_workers = 4)


# Create a dictionary that holds the train/val/test dataloaders, so they can be looked up by phase name when necessary
dataloaders_dict = {'train': loader_train, 'val': loader_val, 'test': loader_test}

In [None]:
loader_val.dataset.classes  # print the two classes to confirm that the undamaged folder comes first and therefore gets the "0" label

['a_not_destroyed', 'b_destroyed']

Choose the type of model to train and a few other parameters.

In [None]:
# Models to choose from [resnet, alexnet, vgg, squeezenet, densenet, inception]
MODEL_NAME = "resnet"

# Number of classes in the dataset
NUM_CLASSES = 2 # (undamaged "0", damaged "1")

# Flag for feature extracting. When False, we finetune the whole model, 
# when True we only update the reshaped (output) layer params
FEATURE_EXTRACT = False

The following helper function sets the ``.requires_grad`` attribute of the
parameters in the model to False when we are feature extracting. By
default, when we load a pretrained model all of the parameters have
``.requires_grad=True``, which is fine if we are training from scratch
or finetuning. However, if we are feature extracting and only want to
compute gradients for the newly initialized layer then we want all of
the other parameters to not require gradients.

In [None]:
def set_parameter_requires_grad(model, feature_extracting):
    if feature_extracting:
        for param in model.parameters():
            param.requires_grad = False

The following function takes the model name, the number of classes, the input width and number of channels, the feature-extraction flag, and whether to use pretrained weights, and initializes the model. It only requires that the input images are square and have an even number of pixels in each dimension (this can be easily modified if the input needs to be different).

In [None]:
def initialize_model(model_name, num_classes, width, channels, feature_extract, use_pretrained=True):
    # Initialize these variables which will be set in this if statement. Each of these
    #   variables is model specific.
    model_ft = None
    input_size = 0

    if model_name == "resnet":
        """ Resnet18
        """                
        model_ft = models.resnet18(pretrained=use_pretrained)
        set_parameter_requires_grad(model_ft, feature_extract)
        num_ftrs = model_ft.fc.in_features
        model_ft.fc = nn.Linear(num_ftrs, num_classes)
        input_size = 224

        if(width != input_size):       
          W1 = width
          if (input_size > W1):
            F = 1
            P = int((input_size - W1) / 2)
          else:
            P = 0
            F = W1 - input_size +1
          
          first_conv_layer = nn.Conv2d(channels, 3, kernel_size=F, stride=1, padding=P, dilation=1, groups=1, bias=True)
          model_ft = nn.Sequential(first_conv_layer, model_ft)
  

    elif model_name == "alexnet":
        """ Alexnet
        """
        model_ft = models.alexnet(pretrained=use_pretrained)
        set_parameter_requires_grad(model_ft, feature_extract)
        num_ftrs = model_ft.classifier[6].in_features
        model_ft.classifier[6] = nn.Linear(num_ftrs,num_classes)
        input_size = 224

        if(width != input_size):       
          W1 = width
          if (input_size > W1):
            F = 1
            P = int((input_size - W1) / 2)
          else:
            P = 0
            F = W1 - input_size +1
          
          first_conv_layer = [nn.Conv2d(channels, 3, kernel_size=F, stride=1, padding=P, dilation=1, groups=1, bias=True)]
          first_conv_layer.extend(list(model_ft.features))  
          model_ft.features= nn.Sequential(*first_conv_layer)

    elif model_name == "vgg":
        """ VGG11_bn
        """
        model_ft = models.vgg11_bn(pretrained=use_pretrained)
        set_parameter_requires_grad(model_ft, feature_extract)
        num_ftrs = model_ft.classifier[6].in_features
        model_ft.classifier[6] = nn.Linear(num_ftrs,num_classes)
        input_size = 224

        if(width != input_size):       
          W1 = width
          if (input_size > W1):
            F = 1
            P = int((input_size - W1) / 2)
          else:
            P = 0
            F = W1 - input_size +1
          
          first_conv_layer = [nn.Conv2d(channels, 3, kernel_size=F, stride=1, padding=P, dilation=1, groups=1, bias=True)]
          first_conv_layer.extend(list(model_ft.features))  
          model_ft.features= nn.Sequential(*first_conv_layer)

    elif model_name == "squeezenet":
        """ Squeezenet
        """
        model_ft = models.squeezenet1_0(pretrained=use_pretrained)
        set_parameter_requires_grad(model_ft, feature_extract)
        model_ft.classifier[1] = nn.Conv2d(512, num_classes, kernel_size=(1,1), stride=(1,1))
        model_ft.num_classes = num_classes
        input_size = 224

        if(width != input_size):       
          W1 = width
          if (input_size > W1):
            F = 1
            P = int((input_size - W1) / 2)
          else:
            P = 0
            F = W1 - input_size +1
          
          first_conv_layer = [nn.Conv2d(channels, 3, kernel_size=F, stride=1, padding=P, dilation=1, groups=1, bias=True)]
          first_conv_layer.extend(list(model_ft.features))  
          model_ft.features= nn.Sequential(*first_conv_layer)
        

    elif model_name == "densenet":
        """ Densenet
        """
        model_ft = models.densenet121(pretrained=use_pretrained)
        set_parameter_requires_grad(model_ft, feature_extract)
        num_ftrs = model_ft.classifier.in_features
        model_ft.classifier = nn.Linear(num_ftrs, num_classes) 
        input_size = 224

        if(width != input_size):       
          W1 = width
          if (input_size > W1):
            F = 1
            P = int((input_size - W1) / 2)
          else:
            P = 0
            F = W1 - input_size +1
          
          first_conv_layer = [nn.Conv2d(channels, 3, kernel_size=F, stride=1, padding=P, dilation=1, groups=1, bias=True)]
          first_conv_layer.extend(list(model_ft.features))  
          model_ft.features= nn.Sequential(*first_conv_layer)

    elif model_name == "inception":
        """ Inception v3 
        Be careful, expects (299,299) sized images and has auxiliary output
        """
        model_ft = models.inception_v3(pretrained=use_pretrained)
        set_parameter_requires_grad(model_ft, feature_extract)
        # Handle the auxiliary net
        num_ftrs = model_ft.AuxLogits.fc.in_features
        model_ft.AuxLogits.fc = nn.Linear(num_ftrs, num_classes)
        # Handle the primary net
        num_ftrs = model_ft.fc.in_features
        model_ft.fc = nn.Linear(num_ftrs,num_classes)
        input_size = 299

        if(width != input_size):       
          W1 = width
          if (input_size > W1):
            F = 1
            P = int((input_size - W1) / 2)
          else:
            P = 0
            F = W1 - input_size +1
          
          first_conv_layer = nn.Conv2d(channels, 3, kernel_size=F, stride=1, padding=P, dilation=1, groups=1, bias=True)
          model_ft.AuxLogits = nn.Sequential(first_conv_layer, model_ft.AuxLogits)
          model_ft = nn.Sequential(first_conv_layer, model_ft)


    else:
        # Raise instead of calling exit(), which would shut down the notebook kernel
        raise ValueError("Invalid model name: {}".format(model_name))
    
    return model_ft, input_size
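
The extra first convolution in each branch above is sized so that its output matches the `input_size` the pretrained network expects. The sketch below replays that arithmetic with the standard output-size formula for a convolution with dilation 1. The helper names and the example widths are illustrative assumptions, not part of the original code.

```python
def adapter_conv_params(width, input_size):
    """Return (kernel_size F, padding P) as chosen by the branches above."""
    if input_size > width:
        # Small input: 1x1 kernel, zero-pad both sides up to input_size.
        return 1, int((input_size - width) / 2)
    # Large input: no padding, shrink with a (width - input_size + 1) kernel.
    return width - input_size + 1, 0

def conv_output_width(width, kernel_size, padding, stride=1):
    """Output width of a convolution with dilation 1."""
    return (width + 2 * padding - kernel_size) // stride + 1

results = {}
for width in (128, 300, 127):
    F, P = adapter_conv_params(width, 224)
    results[width] = conv_output_width(width, F, P)

print(results)  # {128: 224, 300: 224, 127: 223}
```

The odd width 127 ends up one pixel short of 224 because the padding is split evenly between both sides, which is why the function asks for an even number of pixels when the target size is 224.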

The next cell initializes a new model.

In [None]:
%%capture
# Initialize the model for training

model_ft, input_size = initialize_model(MODEL_NAME, NUM_CLASSES, WIDTH, CHANNELS, FEATURE_EXTRACT, use_pretrained=True)

# uncomment the next 2 lines if you want to print the parameters of the model we just instantiated
#for param in model_ft.parameters():
#  print(param.data)

The next cell loads the latest state of the model we are about to train. Run it only if you have saved some checkpoints of the same model from previous training sessions.

In [None]:
CHECKPOINT_PATH = "damaged_structures_detector/checkpoints/LinearModel_checkpoint.pth"    # path of the saved checkpoint
loaded_checkpoint = torch.load(CHECKPOINT_PATH, map_location=device)                      # load the checkpoint dictionary onto the current device
loaded_checkpoint                                                                         # display the parameters saved in the checkpoint

{'epoch': 6,
 'model_state': OrderedDict([('linear1.weight',
               tensor([[ 0.0658,  0.0528,  0.0408,  ..., -0.0403, -0.0279, -0.0068],
                       [-0.0685, -0.0506, -0.0438,  ...,  0.0382,  0.0291,  0.0069]],
                      device='cuda:0')),
              ('linear1.bias', tensor([-2.4732,  2.4736], device='cuda:0'))]),
 'optim_state': {'param_groups': [{'dampening': 0,
    'lr': 0.001,
    'momentum': 0.9,
    'nesterov': True,
    'params': [139815825681000, 139815825681072],
    'weight_decay': 0.001}],
  'state': {139815825681000: {'momentum_buffer': tensor([[-0.0674, -0.0668, -0.0662,  ..., -0.3024, -0.3098, -0.3107],
            [ 0.0674,  0.0669,  0.0662,  ...,  0.3024,  0.3098,  0.3107]],
           device='cuda:0')},
   139815825681072: {'momentum_buffer': tensor([ 0.2182, -0.2182], device='cuda:0')}}},
 'val_acc': tensor(0.7051, device='cuda:0', dtype=torch.float64),
 'val_loss': 18.30215422591746}

The next cell uses the loaded checkpoint to update our model, so that we do not have to train it from scratch.

In [None]:
# Update Parameters of the model

model_ft.load_state_dict(loaded_checkpoint['model_state'])

The next cell moves the model to the GPU memory, where it will be trained. Then it gathers and prints the parameters that will be trained (based on our earlier choice of feature extraction or finetuning). Afterwards it initializes or loads an optimizer for the training.

In [None]:
# Send the model to GPU
model_ft = model_ft.to(device)

#  Gather the parameters to be optimized/updated in this run. If we are
#  finetuning, we will be updating all parameters. However, if we are
#  using the feature extraction method, we will only update the parameters
#  that we have just initialized, i.e. the parameters whose requires_grad
#  is True.
params_to_update = model_ft.parameters()
print("Params to learn:")
if FEATURE_EXTRACT:
    params_to_update = []
    for name,param in model_ft.named_parameters():
        if param.requires_grad == True:
            params_to_update.append(param)
            print("\t",name)
else:
    for name,param in model_ft.named_parameters():
        if param.requires_grad == True:
            print("\t",name)

# Initiate the optimizer (this one was used to create the model of the paper but below there are other options that were tried)
optimizer_ft = optim.SGD(params_to_update, lr=0.001, momentum=0.9,nesterov = True, weight_decay=1e-3)

# Load optimizer parameters from checkpoint (uncomment if there is a checkpoint and you want to use its optimizer)
#optimizer_ft.load_state_dict(loaded_checkpoint['optim_state'])

## If we want Adam optimizer
#optimizer_ft = optim.Adam(params_to_update, lr=0.001, betas=(0.9, 0.999), eps=1e-08, weight_decay=0.01)

## Adagrad
#optimizer_ft = optim.Adagrad(params_to_update, lr=0.01, lr_decay=0, weight_decay=0, eps=1e-10)

## RMSProp
#optimizer_ft = optim.RMSprop(params_to_update, lr=0.01, alpha=0.99, eps=1e-08, weight_decay=0, momentum=0)

print(optimizer_ft)

Params to learn:
	 linear1.weight
	 linear1.bias
SGD (
Parameter Group 0
    dampening: 0
    lr: 0.001
    momentum: 0.9
    nesterov: True
    weight_decay: 0.001
)
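
To make the printed hyperparameters concrete, the sketch below applies one scalar update of SGD with Nesterov momentum and weight decay. It follows the update rule as we understand `optim.SGD` to implement it when `dampening` is 0; the function `sgd_nesterov_step`, the starting parameter, and the gradient values are hypothetical and only serve the illustration.

```python
def sgd_nesterov_step(param, grad, buf, lr=0.001, momentum=0.9, weight_decay=1e-3):
    """One SGD step on a scalar parameter; returns (new_param, new_buffer)."""
    g = grad + weight_decay * param                  # L2 penalty added to the gradient
    buf = g if buf is None else momentum * buf + g   # momentum buffer (dampening = 0)
    step = g + momentum * buf                        # Nesterov look-ahead
    return param - lr * step, buf

param, buf = 1.0, None
history = []
for grad in (0.5, 0.5):                              # made-up gradients
    param, buf = sgd_nesterov_step(param, grad, buf)
    history.append(param)

print(["{:.7f}".format(p) for p in history])
```

With a constant gradient, the second step is larger than the first because the momentum buffer has started to accumulate.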


The following cell defines the function that will be used for the training of the model

In [None]:
def train_model(model, dataloaders, criterion, optimizer, num_train, num_val, best_acc=0.0, num_epochs=25, save_checkpoint=False, is_inception=False):
    since = time.time()

    val_acc_history = []
    val_loss_history = []
    train_acc_history = []
    train_loss_history = []
    
    best_model_wts = copy.deepcopy(model.state_dict())

    for epoch in range(num_epochs):
        print('Epoch {}/{}'.format(epoch + 1, num_epochs))
        print('-' * 10)

        # Each epoch has a training and validation phase
        for phase in ['train', 'val']:
            if phase == 'train':
                model.train()  # Set model to training mode
            else:
                model.eval()   # Set model to evaluate mode

            running_loss = 0.0
            running_corrects = 0

            # Iterate over data.
            for inputs, labels in dataloaders[phase]:
                inputs = inputs.to(device)
                labels = labels.to(device)

                optimizer.zero_grad()

                # forward
                # track history if only in train
                with torch.set_grad_enabled(phase == 'train'):
                    # Get model outputs and calculate loss
                    # Special case for inception because in training it has an auxiliary output. In train
                    #   mode we calculate the loss by summing the final output and the auxiliary output
                    #   but in testing we only consider the final output.
                    if is_inception and phase == 'train':
                        # From https://discuss.pytorch.org/t/how-to-optimize-inception-model-with-auxiliary-classifiers/7958
                        outputs, aux_outputs = model(inputs)
                        loss1 = criterion(outputs, labels)
                        loss2 = criterion(aux_outputs, labels)
                        loss = loss1 + 0.4*loss2
                    else:
                        outputs = model(inputs)
                        loss = criterion(outputs, labels)
                        
                    _, preds = torch.max(outputs, 1)
          
                    # backward + optimize only if in training phase
                    if phase == 'train':
                        loss.backward()
                        optimizer.step()

                # train statistics
                running_loss += loss.item() * inputs.size(0)
                running_corrects += torch.sum(preds == labels.data)
            
            
            epoch_loss = running_loss / len(dataloaders[phase].dataset)
            epoch_acc = running_corrects.double() / len(dataloaders[phase].dataset)

            if phase == 'train':
              epoch_loss = running_loss / num_train
              epoch_acc = running_corrects.double() / num_train

            if phase == 'val':
              epoch_loss = running_loss / num_val
              epoch_acc = running_corrects.double() / num_val

            
            print('{} Loss: {:.4f} Acc: {:.4f}'.format(phase, epoch_loss, epoch_acc))

            # deep copy the model
            if phase == 'val' and epoch_acc > best_acc:
                best_acc = epoch_acc
                best_model_wts = copy.deepcopy(model.state_dict())
               
                if save_checkpoint:
                  # The model just reached a new best validation accuracy, so we save a checkpoint with its parameters.
                  # Use the function arguments (model, optimizer) rather than the notebook globals.
                  checkpoint = {
                     "epoch" : epoch,
                     "model_state" : model.state_dict(),
                     "optim_state" : optimizer.state_dict(),
                     "val_loss" : epoch_loss,
                     "val_acc" : epoch_acc
                  }
                  CHECKPOINT_PATH = "damaged_structures_detector/checkpoints/checkpoint.pth"
                  torch.save(checkpoint, CHECKPOINT_PATH)

                  # We also save the new best model as a whole object, ready to be loaded and used to predict
                  MODEL_PATH = "damaged_structures_detector/checkpoints/best_model.pth"
                  torch.save(model, MODEL_PATH)

            if phase == 'val':
                val_acc_history.append(epoch_acc)
                val_loss_history.append(epoch_loss)
            
            if phase == 'train':
                train_acc_history.append(epoch_acc)
                train_loss_history.append(epoch_loss)


    time_elapsed = time.time() - since
    print('Training complete in {:.0f}m {:.0f}s'.format(time_elapsed // 60, time_elapsed % 60))
    print('Best val Acc: {:.4f}'.format(best_acc))

    # When training is over load the best model weights, and return it
    model.load_state_dict(best_model_wts)
    return model, (val_acc_history, val_loss_history, train_acc_history, train_loss_history)
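
Inside `train_model`, each batch loss is multiplied by the batch size before being accumulated. The criterion returns the mean loss of the batch (the default reduction of `nn.CrossEntropyLoss`), so this turns it back into a sum that can be divided by the dataset size. The sketch below shows why this matters when the last batch is smaller; the loss values and batch sizes are made up.

```python
# (mean loss of the batch, number of images in the batch); the last batch is smaller.
batches = [(0.50, 64), (0.70, 64), (0.20, 8)]

running_loss = 0.0
num_samples = 0
for batch_mean_loss, batch_size in batches:
    running_loss += batch_mean_loss * batch_size   # undo the per-batch averaging
    num_samples += batch_size

epoch_loss = running_loss / num_samples                         # per-image average, as in train_model
naive_loss = sum(loss for loss, _ in batches) / len(batches)    # average of batch means

print("weighted: {:.4f}  naive: {:.4f}".format(epoch_loss, naive_loss))
```

The naive average lets the 8-image batch count as much as a 64-image one, which biases the reported epoch loss.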

In [None]:
# Number of epochs to train for 
NUM_EPOCHS = 15

# If the following parameter is True, the model is saved (overwriting the previous best one) every time its validation accuracy during training exceeds the best accuracy so far, starting from BEST_ACC.
SAVE_CHECKPOINT = True

BEST_ACC = 0 # Best accuracy of current model. Replace with "loaded_checkpoint['val_acc']" if you are using a loaded model and not a new one!!!

# Setup the loss function
criterion = nn.CrossEntropyLoss()

# Train and evaluate
model_ft, hist = train_model(model_ft, dataloaders_dict, criterion, optimizer_ft, NUM_TRAIN, NUM_VAL, BEST_ACC, NUM_EPOCHS, SAVE_CHECKPOINT, is_inception=(MODEL_NAME=="inception"))


# Plot the training curves of validation accuracy vs. number 
#  of training epochs for the transfer learning method


plt.title(" Accuracy vs. Number of Training Epochs")
plt.xlabel("Training Epochs")
plt.ylabel("Accuracy")
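# The accuracies are stored as tensors (on the GPU when one is used), so convert them to plain floats before plotting
plt.plot(range(1,NUM_EPOCHS+1),[float(acc) for acc in hist[0]],label="Validation")
plt.plot(range(1,NUM_EPOCHS+1),[float(acc) for acc in hist[2]],label="Train")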
plt.ylim((0,1.))
plt.xticks(np.arange(1, NUM_EPOCHS+1, 1.0))
plt.legend()
plt.show()


plt.title(" Loss vs. Number of Training Epochs")
plt.xlabel("Training Epochs")
plt.ylabel("Loss")
plt.plot(range(1,NUM_EPOCHS+1),hist[1],label="Validation")
plt.plot(range(1,NUM_EPOCHS+1),hist[3],label="Train")
plt.xticks(np.arange(1, NUM_EPOCHS+1, 1.0))
plt.legend()
plt.show()
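
The loss plotted above comes from `nn.CrossEntropyLoss`, and the accuracy from taking the class with the largest output (`torch.max(outputs, 1)` in `train_model`). The sketch below works through both for a single image in plain Python. The two logits are invented values, and `cross_entropy` is an illustrative helper written for one sample, not the library function.

```python
import math

def cross_entropy(logits, label):
    """Negative log-softmax of the true class, computed stably."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]

logits = [2.0, 0.0]   # hypothetical model outputs: [undamaged, damaged]

pred = max(range(len(logits)), key=logits.__getitem__)   # index of the largest logit
loss_if_undamaged = cross_entropy(logits, 0)
loss_if_damaged = cross_entropy(logits, 1)

print("prediction:", pred)
print("loss if label is 0: {:.4f}".format(loss_if_undamaged))
print("loss if label is 1: {:.4f}".format(loss_if_damaged))
```

The prediction is "0" (undamaged), and the loss is small when that matches the label and larger by exactly the logit gap when it does not.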

The following cells demonstrate some useful actions related to saving and loading checkpoints and models. None of them is mandatory for training, but they provide useful tools for anyone who is interested in using this code!

Save Model or Checkpoint

In [None]:
# FILE = "model_ft.pth"       # File name
# # torch.save(model_ft.state_dict(), FILE)   # Save the current state/parameter of the model

# checkpoint = {           # note that you can add to the checkpoint whatever you think that might be useful
#     "epoch" : NUM_EPOCHS,
#     "model_state" : model_ft.state_dict(),
#     "optim_state" : optimizer_ft.state_dict()
# }
# CHECKPOINT_PATH = "damaged_structures_detector/checkpoints/checkpoint.pth"
# torch.save(checkpoint, CHECKPOINT_PATH)
# MODEL_PATH = "damaged_structures_detector/checkpoints/best_model.pth"  # Here we are not saving the state of the model, but instead we are saving the model as an object. That means that when we load it we can use it easily but we can not change anything on it!
# torch.save(model_ft, MODEL_PATH)

Load Model (Needs initialization first, because here we are just loading the saved state of an old model to a new model)

In [None]:
# # loaded_model, input_size = initialize_model(MODEL_NAME, NUM_CLASSES, WIDTH, CHANNELS, FEATURE_EXTRACT, use_pretrained=True)
# # loaded_model.load_state_dict(torch.load(FILE))
# # loaded_model.eval()

# loaded_checkpoint = torch.load(CHECKPOINT_PATH)
# epoch = loaded_checkpoint["epoch"]
# model_state = loaded_checkpoint["model_state"]
# optim_state = loaded_checkpoint["optim_state"]

# optimizer_ft = optim.SGD(params_to_update, lr=0.001, momentum=0.9)

# optimizer_ft.load_state_dict(optim_state)

Save on GPU, Load on CPU

In [None]:
# device = torch.device("cuda")
# model.to(device)
# torch.save(model.state_dict(), FILE)

# device = torch.device('cpu')
# model, input_size = initialize_model(MODEL_NAME, NUM_CLASSES, WIDTH, CHANNELS, FEATURE_EXTRACT, use_pretrained=True)  # initialize_model returns (model, input_size)
# model.load_state_dict(torch.load(FILE, map_location=device))

Save on GPU, Load on GPU

In [None]:
# device = torch.device("cuda")
# model.to(device)
# torch.save(model.state_dict(), FILE)

# model, input_size = initialize_model(MODEL_NAME, NUM_CLASSES, WIDTH, CHANNELS, FEATURE_EXTRACT, use_pretrained=True)  # initialize_model returns (model, input_size)
# model.load_state_dict(torch.load(FILE))
# model.to(device)

Save on CPU, Load on GPU

In [None]:
# torch.save(model_ft.state_dict(), FILE)

# device = torch.device("cuda")
# model, input_size = initialize_model(MODEL_NAME, NUM_CLASSES, WIDTH, CHANNELS, FEATURE_EXTRACT, use_pretrained=True)  # initialize_model returns (model, input_size)
# model.load_state_dict(torch.load(FILE, map_location="cuda:0"))
# model.to(device)