# Google Colab

You can use the button below to open this notebook in Google Colab. Note that changes made in Colab are not written back to GitHub, and the notebook cannot be saved in Colab unless you first make a copy.

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/nikitalokhmachev-ai/radio-map-estimation-public/blob/main/notebooks/Train_Model.ipynb)

If you open the notebook in Colab, set `using_colab` to `True` in the code block below, then run the second and (optionally) third blocks. The second block clones the GitHub repository into Colab's local storage and installs its requirements, so that the models and helper functions can be loaded. The third block mounts Google Drive (user login required), which lets the notebook read and write data on the drive (e.g. training data or evaluation results).

In [None]:
using_colab = False
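As an alternative to setting the flag by hand, the runtime can be detected automatically. The sketch below relies on the common idiom that Colab kernels have the `google.colab` module loaded at startup; the name `in_colab` is hypothetical and not used elsewhere in this notebook.

```python
import sys

# Assumption: Colab kernels pre-load the google.colab module, so its
# presence in sys.modules indicates that the notebook is running in Colab.
in_colab = "google.colab" in sys.modules
print("Running in Colab:", in_colab)
```

If this check suits your setup, `using_colab = in_colab` could replace the manual assignment above.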

In [None]:
if using_colab:
    %cd /content/
    !rm -rf /content/radio-map-estimation-public
    !git clone https://github.com/nikitalokhmachev-ai/radio-map-estimation-public.git
    !pip install -q -r /content/radio-map-estimation-public/colab_requirements.txt

In [None]:
if using_colab:
    from google.colab import drive
    drive.mount('/content/drive')

# Check GPU

It is recommended to run this notebook with GPU support. If you have an NVIDIA graphics card and its drivers installed, the following block of code should show the details of the installed GPU.

In [None]:
!nvidia-smi
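The same check can be done from Python, which avoids an error message when `nvidia-smi` is not installed. This is a minimal sketch using only the standard library; `gpu_summary` is a hypothetical helper, not part of the repository.

```python
import shutil
import subprocess

def gpu_summary():
    """Return the GPU list reported by `nvidia-smi -L`, or None if unavailable."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return None  # nvidia-smi is not on PATH
    result = subprocess.run([exe, "-L"], capture_output=True, text=True)
    return result.stdout.strip() if result.returncode == 0 else None

print(gpu_summary() or "No NVIDIA GPU detected.")
```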

# Untar Training Data

In the code block below, specify the path to the saved training data (a tar file) after `-xkf` and the directory to extract it into after `-C`. The `-k` flag keeps any files that already exist at the destination instead of overwriting them.

In [2]:
# Train set
!tar -xkf '/path/to/saved/tar/file' -C '/path/to/save/untarred/files'
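On systems without a `tar` command, the extraction can be done with Python's `tarfile` module. The sketch below is similar in spirit to `tar -xkf ... -C ...`, except that it silently skips files that already exist. `extract_keep_existing` is a hypothetical helper, and the placeholder paths still need to be filled in.

```python
import os
import tarfile

def extract_keep_existing(tar_path, dest_dir):
    """Extract tar_path into dest_dir, skipping files that already exist."""
    os.makedirs(dest_dir, exist_ok=True)
    # Use the safer 'data' extraction filter when this Python version has it.
    extra = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}
    extracted = []
    with tarfile.open(tar_path) as tar:
        for member in tar:
            target = os.path.join(dest_dir, member.name)
            if member.isfile() and os.path.exists(target):
                continue  # leave the existing file untouched
            tar.extract(member, path=dest_dir, **extra)
            extracted.append(member.name)
    return extracted

# Usage (placeholder paths, as in the shell command above):
# extract_keep_existing('/path/to/saved/tar/file', '/path/to/save/untarred/files')
```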

# Import Packages

In [3]:
# Import packages

import torch
import numpy as np

import os
import glob
import joblib
import random

In [4]:
# Import model architectures and data structures

os.chdir('path/to/repository')
from data_utils import MapDataset

from models.autoencoders import BaselineAutoencoder
from models.autoencoders import SkipAutoencoder, SkipResidualAutoencoder, SkipMaskAutoencoder, SkipMaskMapAutoencoder
from models.autoencoders import SkipMapAutoencoder, SkipMapMaskAutoencoder, SkipInputAutoencoder
from models.autoencoders import DualMaskAutoencoder, DualMaskMapAutoencoder, DualMapAutoencoder, DualInputAutoencoder

# Set Hyperparameters

In [5]:
# Set random seed, define device

seed = 3
torch.manual_seed(seed)
torch.use_deterministic_algorithms(True)
np.random.seed(seed)
random.seed(seed)

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
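With `torch.use_deterministic_algorithms(True)`, some cuBLAS-backed operations raise a `RuntimeError` on CUDA 10.2 or later unless the `CUBLAS_WORKSPACE_CONFIG` environment variable is set. Whether the models in this notebook hit those operations is not confirmed here, so treat the following as a precaution. It is a sketch that assumes it runs before any CUDA work in the session.

```python
import os

# Assumption: executed before the first CUDA operation in this session.
# ":4096:8" is one of the two values listed in the PyTorch reproducibility
# notes (the other is ":16:8"); setdefault keeps any value already set.
os.environ.setdefault("CUBLAS_WORKSPACE_CONFIG", ":4096:8")
```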

In [6]:
# Set batch size, learning rate, and number of epochs
train_batch_size = 256
num_epochs = 1
lr = 5e-4

# Manually set values for buildings, unsampled locations, and sampled locations in the environment mask. 
# For the models in the PIMRC paper, these are set to "None", meaning they keep the default values of -1, 0, and 1 respectively.
building_value = None
unsampled_value = None
sampled_value = None
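To illustrate what these three settings mean, the sketch below remaps a toy environment mask from the default encoding (-1 for buildings, 0 for unsampled locations, 1 for sampled locations) to custom values, with `None` keeping the default. `remap_mask` is a hypothetical helper written for illustration only; it is not how `MapDataset` is implemented.

```python
# Hypothetical illustration; not part of the repository.
DEFAULTS = {"building": -1, "unsampled": 0, "sampled": 1}

def remap_mask(mask, building_value=None, unsampled_value=None, sampled_value=None):
    """Replace default mask codes with custom values; None keeps the default."""
    overrides = {
        "building": building_value,
        "unsampled": unsampled_value,
        "sampled": sampled_value,
    }
    lookup = {
        DEFAULTS[name]: (DEFAULTS[name] if value is None else value)
        for name, value in overrides.items()
    }
    return [[lookup[cell] for cell in row] for row in mask]

toy_mask = [[-1, 0, 1],
            [0, 1, -1]]
print(remap_mask(toy_mask))                     # [[-1, 0, 1], [0, 1, -1]]
print(remap_mask(toy_mask, building_value=-2))  # [[-2, 0, 1], [0, 1, -2]]
```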

Specify the model architecture by selecting one of the classes imported above from `models.autoencoders`. Different hyperparameters can be set for each model, but the default values match the ones used in our experiments.

In [7]:
# Specify model type. Below we give an example for one of the models from the paper.
model = SkipResidualAutoencoder().to(device)
model_name = 'Skip_Residual.pth'

Enter the path to the folder where the trained models should be saved in the variable `model_folder`. The code block below creates the folder if it does not already exist.

In [8]:
# Set where to save the trained model weights
model_folder = 'path/to/save/trained/model'

# Create the folder (and any missing parents) if it does not already exist
os.makedirs(model_folder, exist_ok=True)

In [9]:
# Identify paths to untarred training data and data scaler
train_data_folder = 'path/to/untarred/training/data'
scaler_path = 'scalers/minmax_scaler_zero_min134.joblib'

assert os.path.isdir(train_data_folder)
assert os.path.exists(scaler_path)

# Load Training Data into DataLoader

In [35]:
# Collect every pickle file in the training data folder
train_pickle_path = os.path.join(train_data_folder, '*.pickle')
train_pickles = glob.glob(train_pickle_path)

# Load the fitted data scaler
with open(scaler_path, 'rb') as f:
    scaler = joblib.load(f)

# Wrap the pickles in a dataset and batch them for training
train_ds = MapDataset(train_pickles, scaler=scaler, building_value=building_value, sampled_value=sampled_value)
train_dl = torch.utils.data.DataLoader(train_ds, batch_size=train_batch_size, shuffle=False)

# Optimizer for the model selected above
optimizer = torch.optim.AdamW(model.parameters(), lr=lr)

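Note that `joblib.load` relies on pickle, so only load scaler files from a source you trust. See the scikit-learn notes on model persistence: https://scikit-learn.org/stable/model_persistence.html#security-maintainability-limitations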
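One way to gain some confidence in a scaler file before loading it is to compare its SHA-256 digest against a value recorded from a trusted copy. The sketch below uses only the standard library; `sha256_of` is a hypothetical helper, and the expected digest is something you would need to record yourself, since none is given here.

```python
import hashlib

def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of the file at path, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

# Usage (expected_digest is a value you recorded from a trusted copy):
# assert sha256_of(scaler_path) == expected_digest
```

A matching digest only shows that the file is the same one you recorded earlier; it does not by itself make an untrusted file safe to load.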


# Train Model

In [None]:
model.fit(train_dl, optimizer, epochs=num_epochs, loss='mse')

# Save Model

In [39]:
model.save_model(os.path.join(model_folder, model_name))