## Neural Rock Train Model Notebook

The following cell clones the repository from GitHub, installs its requirements, and mounts the Google Drive where the dataset is stored.

In [1]:
import os

if 'google.colab' in str(get_ipython()):
    print('Running on CoLab')
    from getpass import getpass
    import urllib.parse

    user = input('User name: ')
    password = getpass('Password: ')
    # Percent-encode the password so special characters survive inside the clone URL
    password = urllib.parse.quote(password, safe='')

    cmd_string = 'git clone https://{0}:{1}@github.com/LukasMosser/neural_rock_typing.git'.format(user, password)

    os.system(cmd_string)
    cmd_string, password = "", "" # removing the password from the variable
    os.chdir("./neural_rock_typing")
    os.system('pip install -r requirements.txt')
    os.system('pip install -e .')

    from google.colab import drive
    drive.mount('/content/drive', force_remount=True)
else:
    print('Not running on CoLab')
    %load_ext autoreload
    %autoreload 2
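
The setup cell percent-encodes the password before placing it in the clone URL. A minimal sketch of why that step matters, using only the standard library; the helper name `encode_credential` and the sample password are invented for illustration:

```python
from urllib.parse import quote, unquote

def encode_credential(value: str) -> str:
    """Percent-encode a credential so it can sit inside a URL.

    safe="" also encodes "/", which quote() leaves untouched by default.
    """
    return quote(value, safe="")

# Hypothetical password containing characters that are special in URLs
raw = "p@ss/w:rd"
encoded = encode_credential(raw)
print(encoded)

# Decoding restores the original string
assert unquote(encoded) == raw
```

Without the encoding, characters such as `@`, `:` or `/` in the password would be parsed as URL delimiters rather than as part of the credential.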

### A hack needed to make PyTorch Lightning work with Colab again

In [2]:
!pip install wandb
!pip install git+https://github.com/PyTorchLightning/pytorch-lightning


Looking in indexes: https://pypi.org/simple, https://pypi.ngc.nvidia.com
^C
ERROR: Operation cancelled by user
Looking in indexes: https://pypi.org/simple, https://pypi.ngc.nvidia.com
Collecting git+https://github.com/PyTorchLightning/pytorch-lightning
  Cloning https://github.com/PyTorchLightning/pytorch-lightning to /tmp/pip-req-build-2shk7l4f
  Running command git clone -q https://github.com/PyTorchLightning/pytorch-lightning /tmp/pip-req-build-2shk7l4f
  Running command git submodule update --init --recursive -q
^C
ERROR: Operation cancelled by user


In [2]:
import pytorch_lightning as pl

## Log in to Weights & Biases for Logging

In [3]:
!wandb login 

[34m[1mwandb[0m: Currently logged in as: [33mlukas-mosser[0m (use `wandb login --relogin` to force relogin)


## Basic Imports

In [4]:
import sys
import os
import argparse
from pathlib import Path
import json
import pandas as pd
import wandb
from torchvision import transforms
from torch.utils.data import DataLoader, ConcatDataset
import pytorch_lightning as pl
from pytorch_lightning.loggers import WandbLogger, TensorBoardLogger
from pytorch_lightning.callbacks import ModelCheckpoint
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder

from neural_rock.dataset import SimpleThinSectionDataset
from neural_rock.model import NeuralRockModel, make_vgg11_model, make_resnet18_model
from neural_rock.plot import visualize_batch
from neural_rock.utils import MEAN_TRAIN, STD_TRAIN

## Hyperparameters

In [5]:
wandb_name = 'lukas-mosser'
project_name = 'neural_rock_simple'

labelset = "Lucia_class"
dataset_fname = "Leg194_dataset.csv"
learning_rate = 3e-4
batch_size = 16
weight_decay = 1e-5
dropout = 0.5

model = 'vgg'
frozen = True
    
train_dataset_mult = 50
val_dataset_mult = 50

seed_dataset = 42

base_path = "../data"

In [6]:
pl.seed_everything(seed_dataset)

df = pd.read_csv(base_path+"/"+dataset_fname)
df.head()


label_encoder = LabelEncoder()

# Take an explicit copy so that adding the "y" column does not modify a slice of df
valid_rows = df[df[labelset].notnull() & df["Xppl"].notnull()].copy()

valid_rows["y"] = label_encoder.fit_transform(valid_rows[labelset])

index = valid_rows.index

# Stratified 50/50 split so both halves keep the same class proportions
train_index, test_index = train_test_split(index, test_size=0.5, stratify=valid_rows["y"])

df_train = valid_rows.loc[train_index].reset_index()
df_val = valid_rows.loc[test_index].reset_index()

print(len(df_train), len(df_val))

Global seed set to 42


40 41
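
The cell above keeps only rows that have both a label and an image path, encodes the labels as integers, and splits the rows 50/50 with stratification. A self-contained sketch of the same pattern on invented toy data (the column values below are placeholders, not taken from `Leg194_dataset.csv`), taking an explicit `.copy()` before adding the `y` column so that pandas does not raise a `SettingWithCopyWarning`:

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder

# Toy stand-in for the real CSV: 12 complete rows plus 2 incomplete ones
labels = ["1", "2", "3"] * 4 + [None, "1"]
paths = [f"img_{i}.jpg" for i in range(13)] + [None]
toy = pd.DataFrame({"Lucia_class": labels, "Xppl": paths})

# Keep rows with both a label and an image path; copy to avoid chained assignment
valid = toy[toy["Lucia_class"].notnull() & toy["Xppl"].notnull()].copy()

# Map class names to integer ids
encoder = LabelEncoder()
valid["y"] = encoder.fit_transform(valid["Lucia_class"])

# Stratified split: each half receives the same share of every class
train_idx, val_idx = train_test_split(
    valid.index, test_size=0.5, stratify=valid["y"], random_state=42
)
train_part = valid.loc[train_idx].reset_index()
val_part = valid.loc[val_idx].reset_index()

print(len(train_part), len(val_part))
print(train_part["y"].value_counts().sort_index().tolist())
```

Passing `random_state` here is a choice made for the sketch; the notebook relies on `pl.seed_everything` instead.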


## Perform Training Sweep across 12 Models

We train a ResNet and a VGG network, each with a frozen and with a trainable feature extractor, for each labelset: Lucia, Dunham, and Dominant Pore Type.

With 2 architectures, 2 feature-extractor settings, and 3 labelsets, this gives a total of 12 models.
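
One way to arrive at the count of 12 is 2 architectures × 2 feature-extractor settings (frozen or trainable) × 3 labelsets. A small sketch enumerating the sweep under that assumption; only `"Lucia_class"` appears as a column name in this notebook, so the other two labelset names below are hypothetical placeholders:

```python
from itertools import product

architectures = ["vgg", "resnet"]
frozen_options = [True, False]
# "Dunham_class" and "DominantPore" are assumed names, not confirmed by the notebook
labelsets = ["Lucia_class", "Dunham_class", "DominantPore"]

# Every combination of labelset, architecture and frozen flag
configs = list(product(labelsets, architectures, frozen_options))

for labelset_name, model_name, is_frozen in configs:
    print(labelset_name, model_name, is_frozen)

print(len(configs), "configurations")
```

Each tuple corresponds to one setting of the `labelset`, `model` and `frozen` hyperparameters defined earlier.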

In [8]:
# Data Augmentation used for Training
data_transforms = {
  'train': transforms.Compose([
      transforms.RandomHorizontalFlip(),
      transforms.RandomRotation(degrees=360),
      transforms.RandomCrop((512, 512)),
      transforms.ColorJitter(hue=0.5),
      transforms.Resize((224, 224)),
      transforms.Normalize((0.485, 0.456, 0.406), (0.229, 0.224, 0.225))
  ]),
  'val':
      transforms.Compose([
          transforms.RandomCrop((512, 512)),
          transforms.Resize((224, 224)),
          transforms.Normalize((0.485, 0.456, 0.406), (0.229, 0.224, 0.225))
      ])
}

# Load the Datasets
train_dataset_base = SimpleThinSectionDataset(base_path, df_train, transform=data_transforms['train'])

val_dataset_base = SimpleThinSectionDataset(base_path, df_val, transform=data_transforms['val'])

In [11]:
# Setup dataloaders
train_loader = DataLoader(train_dataset_base, batch_size=batch_size, shuffle=True, num_workers=0, pin_memory=False)
val_loader = DataLoader(val_dataset_base, batch_size=batch_size, shuffle=False, num_workers=0, pin_memory=False)
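
With 40 training rows and `batch_size = 16`, the number of optimizer steps per epoch is small. A back-of-the-envelope sketch, assuming the dataset length equals the number of rows in `df_train` and that the last partial batch is kept (the `DataLoader` default):

```python
import math

n_train = 40        # rows in df_train, as printed earlier
batch_size = 16
max_steps = 15000   # value passed to pl.Trainer in the training loop

# The final batch holds the remaining 8 samples
steps_per_epoch = math.ceil(n_train / batch_size)

# Number of epochs needed to reach the step budget
epochs_to_max_steps = math.ceil(max_steps / steps_per_epoch)

print(steps_per_epoch, epochs_to_max_steps)
```

Three steps per epoch is consistent with the logged run summary further down, where epoch 22 ends at global step 68 (23 epochs × 3 steps, counted from zero).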

In [None]:
for seed in range(10):
    # Set the base path for the models to be stored in the Google Drive
    path = Path("./data/models/{0:}/{1:}/{2:}".format(labelset, model, str(frozen)))
    path.mkdir(parents=True, exist_ok=True)

    # Set the Random Seed on Everything
    pl.seed_everything(seed)


    # Setup Weights and Biases Logger
    wandb_logger = WandbLogger(name=wandb_name, project=project_name, entity='ccg')
    wandb_logger.experiment.config.update({"labelset": labelset, "model": model, 'frozen': str(frozen)})
    tensorboard_logger = TensorBoardLogger("lightning_logs", name=labelset)

    # Checkpoint based on validation F1 score
    checkpointer = ModelCheckpoint(dirpath=path, filename='best', monitor="val/f1", verbose=True, mode="max")

    # Set up the PyTorch Lightning Trainer
    trainer = pl.Trainer(gpus=-1,
                         max_steps=15000,
                         benchmark=True,
                         logger=[wandb_logger, tensorboard_logger],
                         callbacks=[checkpointer],
                         progress_bar_refresh_rate=20,
                         check_val_every_n_epoch=1)

    # Select which model to run
    if model == 'vgg':
        feature_extractor, classifier = make_vgg11_model(train_dataset_base.num_classes, dropout=dropout)
    elif model == 'resnet':
        feature_extractor, classifier = make_resnet18_model(train_dataset_base.num_classes)

    # Create the model itself, ready for training
    model_ = NeuralRockModel(feature_extractor,
                           classifier, 
                           num_classes=train_dataset_base.num_classes, 
                           freeze_feature_extractor=frozen)

    # Train the model
    trainer.fit(model_, train_dataloader=train_loader, val_dataloaders=val_loader)

    # Clean up Weights and Biases Logging
    wandb.finish()

Global seed set to 0
GPU available: True, used: True
TPU available: False, using: 0 TPU cores
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name              | Type       | Params
-------------------------------------------------
0 | feature_extractor | Sequential | 9.2 M 
1 | classifier        | Sequential | 6.4 M 
2 | train_f1          | F1         | 0     
3 | val_f1            | F1         | 0     
-------------------------------------------------
6.4 M     Trainable params
9.2 M     Non-trainable params
15.7 M    Total params
62.614    Total estimated model params size (MB)


Validation sanity check: 0it [00:00, ?it/s]



Training: 0it [00:00, ?it/s]

Epoch 23, global step 68: val/f1 reached 0.00000 (best 0.00000), saving model to "/home/lmoss/neural_rock_typing/notebooks/data/models/Lucia_class/vgg/True/best-v1.ckpt" as top 1



Run summary:
train/loss,0.739
train/f1,0.63043
epoch,22.0
trainer/global_step,68.0
_runtime,1364.0
_timestamp,1638121773.0
_step,22.0


Run history:
train/loss,█▃▂▂▂▁▂▁▂▂▂▂▁▂▂▁▂▁▁▁▃▂▂
train/f1,▁▅▆▇▇▇▇▇▇▇▇████████████
epoch,▁▁▂▂▂▃▃▃▄▄▄▅▅▅▅▆▆▆▇▇▇██
trainer/global_step,▁▁▂▂▂▃▃▃▄▄▄▅▅▅▅▆▆▆▇▇▇██
_runtime,▁▁▂▂▂▃▃▃▄▄▄▅▅▅▆▆▆▆▇▇▇██
_timestamp,▁▁▂▂▂▃▃▃▄▄▄▅▅▅▆▆▆▆▇▇▇██
_step,▁▁▂▂▂▃▃▃▄▄▄▅▅▅▅▆▆▆▇▇▇██


Global seed set to 1
[34m[1mwandb[0m: wandb version 0.12.7 is available!  To upgrade, please run:
[34m[1mwandb[0m:  $ pip install wandb --upgrade


GPU available: True, used: True
TPU available: False, using: 0 TPU cores
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name              | Type       | Params
-------------------------------------------------
0 | feature_extractor | Sequential | 9.2 M 
1 | classifier        | Sequential | 6.4 M 
2 | train_f1          | F1         | 0     
3 | val_f1            | F1         | 0     
-------------------------------------------------
6.4 M     Trainable params
9.2 M     Non-trainable params
15.7 M    Total params
62.614    Total estimated model params size (MB)


Validation sanity check: 0it [00:00, ?it/s]

Training: 0it [00:00, ?it/s]
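
The training loop writes every seed's checkpoint to the same directory with the filename `best`, and the log shows the file being saved as `best-v1.ckpt`, which suggests that an earlier `best.ckpt` already existed there. One possible alternative, not part of the original notebook, is to give each seed its own subfolder. The helper `checkpoint_dir` and the `seed_<n>` naming are hypothetical:

```python
from pathlib import Path

def checkpoint_dir(root: str, labelset: str, model: str, frozen: bool, seed: int) -> Path:
    """Build a per-seed checkpoint directory (hypothetical layout)."""
    return Path(root) / labelset / model / str(frozen) / f"seed_{seed}"

# Same values as the hyperparameter cell, plus the loop variable
for seed in range(3):
    path = checkpoint_dir("./data/models", "Lucia_class", "vgg", True, seed)
    print(path.as_posix())
```

The resulting `path` could be passed as `dirpath` to `ModelCheckpoint` after calling `path.mkdir(parents=True, exist_ok=True)`, so that each seed's best checkpoint lands in its own folder.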