# Deep Learning Framework for Predicting Himalayan Summit Success 


## 0. Setup

The following section of the notebook covers all relevant imports required for the project. 

In [1]:
# The following code is only needed when running the notebook on Google Colab
# !git clone https://github.com/Aaron-Serpilin/DeepSummit.git
# !pip install --upgrade pip
# !pip install -e .

In [5]:
try:
    import torch
    import torchvision
    assert int(torch.__version__.split(".")[0]) >= 2, "torch version should be 2.+"
    assert int(torchvision.__version__.split(".")[1]) >= 15, "torchvision version should be 0.15+"
    print(f"torch version: {torch.__version__}")
    print(f"torchvision version: {torchvision.__version__}")
except (ImportError, AssertionError):
    # Reached when torch/torchvision are missing or older than required
    print("[INFO] torch/torchvision versions not correct. Installing correct versions.")
    !pip3 install -U torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu113
    import torch
    import torchvision
    print(f"torch version: {torch.__version__}")
    print(f"torchvision version: {torchvision.__version__}")

# Imported after the version check so that a missing install is handled above
from torch import nn
from torchvision import transforms

try:
    import matplotlib.pyplot as plt
except ImportError:
    print("[INFO] Couldn't find matplotlib...installing it")
    !pip install -q matplotlib
    import matplotlib.pyplot as plt

try:
    from torchinfo import summary
except ImportError:
    print("[INFO] Couldn't find torchinfo... installing it")
    !pip install -q torchinfo
    from torchinfo import summary

try:
    from tqdm.auto import tqdm
except ImportError:
    print("[INFO] Couldn't find tqdm... installing it")
    !pip install -q tqdm
    from tqdm.auto import tqdm

try:
    from dbfread import DBF
except ImportError:
    print("[INFO] Couldn't find dbfread...installing it")
    !pip install -q dbfread
    from dbfread import DBF


try:
    from torch.utils.tensorboard import SummaryWriter
except ImportError:
    print("[INFO] Couldn't find tensorboard... installing it.")
    !pip install -q tensorboard
    from torch.utils.tensorboard import SummaryWriter

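try:
    import torchmetrics, mlxtend
    print(f"mlextend version: {mlxtend.__version__}")
    # `>= 19` checks the minor version; the original `>- 19` parsed as `> -19` and always passed
    assert int(mlxtend.__version__.split(".")[1]) >= 19, "mlxtend version should be 0.19.0 or higher"
except (ImportError, AssertionError):
    !pip install -q torchmetrics -U mlxtend
    import torchmetrics, mlxtend
    print(f"mlextend version: {mlxtend.__version__}")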

try:
    import cdsapi
except ImportError:
    print("[INFO] Couldn't find cdsapi...installing it.")
    !pip install -q cdsapi
    import cdsapi

try:
    import pandas as pd
except ImportError:
    print("[INFO] Couldn't find pandas... installing it")
    !pip install -q pandas
    import pandas as pd

try:
    from einops import rearrange, repeat
except ImportError:
    print("[INFO] Couldn't find einops... installing it")
    !pip install -q einops
    from einops import rearrange, repeat

try:
    import pygrib
except ImportError:
    print("[INFO] Couldn't find pygrib... installing it")
    !pip install -q pygrib
    import pygrib


torch version: 2.6.0
torchvision version: 0.21.0




mlextend version: 0.23.4


In [6]:
device = "cuda" if torch.cuda.is_available() else "cpu"
device

'cpu'

## 1. Himalayan Data Setup

The following section covers the retrieval and processing of the tabular data from the Himalayan Database, which is later fed to the SAINT architecture for inference. The raw data was obtained through the `get_data.py` file. 

In [5]:
# We adjust the PYTHONPATH to keep absolute imports
import sys
sys.path.append("src")

In [6]:
from pathlib import Path

himalayan_train_dir = Path("data/himalayas_data/train")
himalayan_val_dir = Path("data/himalayas_data/val")
himalayan_test_dir = Path("data/himalayas_data/test")

himalayan_train_file = himalayan_train_dir / "train.csv"
himalayan_val_file = himalayan_val_dir / "val.csv"
himalayan_test_file = himalayan_test_dir / "test.csv"

In [7]:
df_train = pd.read_csv(himalayan_train_file)

print(f"First 10 rows:\n{df_train.head(10)}")
print(f"First training instance:\n{df_train.iloc[0]}")
print(f"Instance shape:\n{df_train.iloc[0].shape}")

First 10 rows:
  SEX      CITIZEN      STATUS  MO2USED  MROUTE1  SEASON  O2USED  CALCAGE  \
0   M        Japan     Climber     True        1       3    True       49   
1   M        Spain     Climber    False        1       3   False       54   
2   F  Switzerland     Climber    False        1       3    True       25   
3   M        Nepal  H-A Worker     True        1       1    True       31   
4   M          USA     Climber     True        1       1    True       60   
5   M      Bahrain     Climber    False        1       3    True       34   
6   M        Chile     Climber     True        1       1    True       42   
7   F        Japan      Leader     True        0       4    True       43   
8   M        India     Climber     True        1       1    True       48   
9   M      S Korea     Climber    False        0       3   False       29   

   HEIGHTM  MDEATHS  HDEATHS  SMTMEMBERS  SMTHIRED  Target  
0     8163        0        0          14        11       1  
1     8163     

The `load_himalayan_data()` function, defined in the `tab_data_setup` module, converts the raw data from .DBF to .csv, keeps only the relevant features and peaks, builds a data frame from the result, and then creates the corresponding train, val, and test splits with the `create_dataloaders()` helper function.
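The splitting step can be pictured with a minimal, standard-library sketch. It is not the implementation inside `create_dataloaders()`, which is not shown in this notebook: the helper name `split_rows`, the 80/10/10 proportions, and the synthetic rows are all assumptions made for illustration.

```python
import random


def split_rows(rows, val_frac=0.1, test_frac=0.1, seed=42):
    """Shuffle rows and cut them into train/val/test lists (hypothetical helper)."""
    if val_frac < 0 or test_frac < 0 or val_frac + test_frac >= 1:
        raise ValueError("val_frac and test_frac must be non-negative and sum to less than 1")
    shuffled = list(rows)
    # A dedicated Random instance keeps the split reproducible without touching global state
    random.Random(seed).shuffle(shuffled)
    n_val = round(len(shuffled) * val_frac)
    n_test = round(len(shuffled) * test_frac)
    val = shuffled[:n_val]
    test = shuffled[n_val:n_val + n_test]
    train = shuffled[n_val + n_test:]
    return train, val, test


# Synthetic rows that only borrow two column names from the table above
rows = [{"CALCAGE": 20 + i, "Target": i % 2} for i in range(100)]
train_rows, val_rows, test_rows = split_rows(rows)
print(len(train_rows), len(val_rows), len(test_rows))
```

Fixing the seed means the same rows land in the same split on every run, which keeps later experiments comparable.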

In [8]:
from src.tab_transformer.tab_data_setup import load_himalayan_data

train_dataloader, val_dataloader, test_dataloader = load_himalayan_data()

[INFO] Training set saved to: data/himalayas_data/train/train.csv
[INFO] Validation set saved to: data/himalayas_data/val/val.csv
[INFO] Test set saved to: data/himalayas_data/test/test.csv
[INFO] Himalaya Data has already been processed


## 2. SAINT Model Instantiation

The following section instantiates the SAINT model architecture based on the hyperparameter selection defined in the relevant paper. The architecture is broken down in the `tab_breakdown.ipynb` file, where its core equations and backbone are explained. 

In [8]:
cat_batch, cont_batch, label_batch, cat_mask_batch, cont_mask_batch = next(iter(train_dataloader))

# First instance
cat_instance = cat_batch[0]
cont_instance = cont_batch[0]
label_instance = label_batch[0]

print(f"Categorical instance: {cat_instance}\nCategorical instance shape: {cat_instance.shape}\n")
print(f"Continuous instance: {cont_instance}\nContinuous instance shape: {cont_instance.shape}\n")
print(f"Label instance: {label_instance}\nLabel instance shape: {label_instance.shape}\n")

Categorical instance: tensor([  0,   1, 147,  60,   0,   1,   2,   0])
Categorical instance shape: torch.Size([8])

Continuous instance: tensor([ 0.2272, -1.1185, -0.2801, -0.1664, -0.8072, -0.6175])
Continuous instance shape: torch.Size([6])

Label instance: 0
Label instance shape: torch.Size([])



In [11]:
from src.tab_transformer.tab_model import SAINT
import numpy as np

categorical_columns = ['SEX', 'CITIZEN', 'STATUS', 'MO2USED', 'MROUTE1', 'SEASON', 'O2USED']
continuous_columns = ['CALCAGE', 'HEIGHTM', 'MDEATHS', 'HDEATHS', 'SMTMEMBERS', 'SMTHIRED']

# Number of unique values per categorical column
cat_dims = [len(np.unique(df_train[col])) for col in categorical_columns]

# Hyperparameter selection based on default original architecture instantiation
saint = SAINT(
    categories = tuple(cat_dims), 
    num_continuous = len(continuous_columns),                
    dim = 32,                           
    dim_out = 1,                       
    depth = 6,                       
    heads = 8,  
    num_special_tokens=1,                      
    attn_dropout = 0.1,             
    ff_dropout = 0.1,                  
    mlp_hidden_mults = (4, 2),       
    cont_embeddings = 'MLP',
    attentiontype = 'colrow',
    final_mlp_style = 'sep',
    y_dim = 2 # Binary classification
)

## 3. SAINT Model Training

The following section trains the SAINT model on the dataloaders built from the `TabularDataset` class and logs each run to the `runs` directory through a `SummaryWriter()`.

In [None]:
from src.tab_transformer.tab_train import train_step, test_step

loss_fn = nn.CrossEntropyLoss().to(device)
optimizer = torch.optim.AdamW(saint.parameters(), lr=0.0001, betas=(0.9, 0.999), weight_decay=0.01)

train_step(model=saint,
           dataloader=train_dataloader,
           loss_fn=loss_fn,
           optimizer=optimizer,
           device=device
)

val_step = test_step(model=saint,
                     dataloader=val_dataloader,
                     loss_fn=loss_fn,
                     device=device)

val_step

(0.0005842532558613513, 0.9128137550200803)

In [12]:
writer = SummaryWriter()

In [13]:
def create_writer(experiment_name: str,
                  model_name: str,
                  extra: str = None) -> SummaryWriter:
    """Create a SummaryWriter that logs to runs/timestamp/experiment_name/model_name[/extra]."""
    from datetime import datetime
    import os

    # Timestamp of the current run, e.g. 2025-04-23--17:33:24
    timestamp = datetime.now().strftime("%Y-%m-%d--%H:%M:%S")

    # Build the log directory path, appending `extra` only when it is given
    path_parts = ["runs", timestamp, experiment_name, model_name]
    if extra:
        path_parts.append(extra)
    log_dir = os.path.join(*path_parts)

    print(f"[INFO] Created SummaryWriter, saving to: {log_dir}")
    return SummaryWriter(log_dir=log_dir)

In [None]:
%%time
from helper_functions import set_seeds
from src.tab_transformer.tab_train import train

set_seeds(42)


results = train(model=saint,
                train_dataloader=train_dataloader,
                val_dataloader=val_dataloader,
                test_dataloader=test_dataloader,
                optimizer=optimizer,
                loss_fn=loss_fn,
                epochs=10,
                writer=create_writer(experiment_name="first_run",
                                     model_name="saint"))

[INFO] Created SummaryWriter, saving to: runs/2025-04-23--17:33:24/first_run/saint


Epoch: 1 | train_loss: 0.1886 | train_acc: 0.9252 | test_loss: 0.0008 | test_acc: 0.9183
Epoch: 2 | train_loss: 0.1860 | train_acc: 0.9254 | test_loss: 0.0007 | test_acc: 0.9170
Epoch: 3 | train_loss: 0.1818 | train_acc: 0.9277 | test_loss: 0.0008 | test_acc: 0.9189
Epoch: 4 | train_loss: 0.1781 | train_acc: 0.9295 | test_loss: 0.0009 | test_acc: 0.9157
Epoch: 5 | train_loss: 0.1766 | train_acc: 0.9283 | test_loss: 0.0009 | test_acc: 0.9202
Epoch: 6 | train_loss: 0.1738 | train_acc: 0.9305 | test_loss: 0.0007 | test_acc: 0.9172
Epoch: 7 | train_loss: 0.1693 | train_acc: 0.9322 | test_loss: 0.0008 | test_acc: 0.9192
Epoch: 8 | train_loss: 0.1666 | train_acc: 0.9328 | test_loss: 0.0005 | test_acc: 0.9168
Epoch: 9 | train_loss: 0.1633 | train_acc: 0.9338 | test_loss: 0.0006 | test_acc: 0.9140
Epoch: 10 | train_loss: 0.1603 | train_acc: 0.9356 | test_loss: 0.0007 | test_acc: 0.9147
CPU times: user 1h 37min 16s, sys: 27min 29s, total: 2h 4min 45s
Wall time: 47min 6s





{'train_loss': [0.18862296600337983,
  0.18602193555800267,
  0.181752410843542,
  0.17810713039873818,
  0.176648545670828,
  0.17383407515961766,
  0.16925857581215045,
  0.16661058767160108,
  0.1633284955933488,
  0.1602522330709273],
 'train_acc': [0.9252192593905032,
  0.9253576807228916,
  0.927734375,
  0.9295310838944011,
  0.9282783708362864,
  0.9304931121545005,
  0.9321583207831325,
  0.9328172063253012,
  0.9338055346385542,
  0.9356174698795181],
 'val_loss': [0.0005662141435117606,
  0.0005597891847053206,
  0.0005514946687652404,
  0.0005323321794170931,
  0.0006148527903729174,
  0.0003084733364093735,
  0.0003268823296908873,
  0.0010838294962802566,
  0.0005407305097723581,
  0.00048101495906531093],
 'val_acc': [0.9141315261044177,
  0.9120732931726908,
  0.9167796184738956,
  0.9060491967871486,
  0.9175326305220883,
  0.9184864457831325,
  0.9179216867469879,
  0.9073544176706827,
  0.9109437751004016,
  0.9147088353413655],
 'test_loss': [0.0007675058511366327,


With a `test_acc` between roughly 91.4% and 92.0% across the ten epochs, the SAINT architecture achieves better results than the benchmark regression model from the Julia project. The model therefore already outperforms the baseline before any fine-tuning and without the incorporation of contextual meteorological data. 
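When comparing against a baseline, it can help to choose the epoch on validation accuracy and report the test accuracy of that epoch only. The sketch below shows one way to locate that epoch in a history dictionary shaped like `results`. The `best_epoch` helper is hypothetical, and the values are the `val_acc` entries from the output above rounded to four decimals.

```python
def best_epoch(history, metric="val_acc"):
    """Return (1-based epoch, value) for the highest entry of history[metric]."""
    values = history[metric]
    if not values:
        raise ValueError(f"history[{metric!r}] is empty")
    best_index = max(range(len(values)), key=values.__getitem__)
    return best_index + 1, values[best_index]


# val_acc values copied from the training output and rounded to four decimals
history = {
    "val_acc": [0.9141, 0.9121, 0.9168, 0.9060, 0.9175,
                0.9185, 0.9179, 0.9074, 0.9109, 0.9147],
}
epoch, value = best_epoch(history)
print(f"Best val_acc {value:.4f} at epoch {epoch}")
```

Selecting on the validation split keeps the test split untouched as a final check.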

In [None]:
from helper_functions import plot_loss_curves

plot_loss_curves(results)

In [None]:
%load_ext tensorboard
%tensorboard --logdir runs

## 4. ERA5 Data Setup

The following section covers the data preparation of the ERA5 dataset. The raw data was obtained through the `get_data.py` file. 

The entire ERA5 data setup is condensed into the `load_era5_data` function, which runs the steps below in order. 

First, it uses `file_to_grib()` to convert all the raw ERA5 files into .grib files, which makes them easier to process afterwards. 

Next, `get_variable_mapping()` obtains the mapping between the internal grib abbreviations and the actual feature names of the ERA5 dataset. This mapping is then used to select the desired features when the .grib file is transformed into a .csv file. 

`get_variable_mapping()` could be run every time, but in this study the mapping does not change between runs. We can therefore store the result of the first run in a variable and simply refer to it afterwards, which saves ~3 minutes of runtime.  
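A variable only survives for the current session. One possible extension, sketched here under assumptions, is to cache the mapping on disk so that later sessions also skip the slow step. The helper `load_or_build_mapping`, the cache file name, and the stand-in `slow_mapping` function are hypothetical; in the notebook the slow step would be `get_variable_mapping()`.

```python
import json
import tempfile
from pathlib import Path


def load_or_build_mapping(cache_file, build_fn):
    """Return the cached mapping if it exists, otherwise build it once and cache it."""
    cache_file = Path(cache_file)
    if cache_file.exists():
        return json.loads(cache_file.read_text())
    mapping = build_fn()  # the slow step runs only on the first call
    cache_file.parent.mkdir(parents=True, exist_ok=True)
    cache_file.write_text(json.dumps(mapping, indent=2))
    return mapping


build_calls = []


def slow_mapping():
    """Stand-in for the real mapping function; the entry below is a placeholder."""
    build_calls.append(1)
    return {"2t": "2m_temperature"}


with tempfile.TemporaryDirectory() as tmp:
    cache = Path(tmp) / "variable_mapping.json"
    first = load_or_build_mapping(cache, slow_mapping)
    second = load_or_build_mapping(cache, slow_mapping)  # served from the cache file

print(first == second, len(build_calls))
```

If the set of requested ERA5 variables ever changes, the cache file has to be deleted so that the mapping is rebuilt.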

This mapping is then passed as a parameter to `process_grib_to_csv()`, which selects the desired features and converts the .grib files to .csv, storing them in `data/era5_data/instances/raw_instances`.

Due to API request limitations, the original requests in `get_data.py` were made in batches of 5 years, leaving each mountain with 17 different weather data files. `merge_daily_instances()` therefore condenses all of these files into a single .csv file in `data/era5_data/instances/merged_instances`. 
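The merge can be illustrated with a small standard-library sketch. It is not the code of `merge_daily_instances()`, which is not shown here: the helper name `merge_csv_files`, the file names, the column names, and the values are all placeholders, and the sketch assumes every batch file shares the same header.

```python
import csv
import tempfile
from pathlib import Path


def merge_csv_files(input_files, output_file):
    """Concatenate CSV files with identical headers; return the number of data rows written."""
    rows_written = 0
    header = None
    with open(output_file, "w", newline="") as out_handle:
        writer = csv.writer(out_handle)
        for path in sorted(input_files):
            with open(path, newline="") as in_handle:
                reader = csv.reader(in_handle)
                file_header = next(reader, None)
                if file_header is None:
                    continue  # skip empty files
                if header is None:
                    header = file_header
                    writer.writerow(header)  # write the shared header once
                elif file_header != header:
                    raise ValueError(f"{path} has a different header")
                for row in reader:
                    writer.writerow(row)
                    rows_written += 1
    return rows_written


with tempfile.TemporaryDirectory() as tmp:
    tmp_dir = Path(tmp)
    # Two synthetic batch files standing in for the per-mountain downloads
    (tmp_dir / "batch_01.csv").write_text("date,peakid,t2m\n2000-01-01,EVER,-30.1\n2000-01-02,EVER,-29.4\n")
    (tmp_dir / "batch_02.csv").write_text("date,peakid,t2m\n2005-01-01,EVER,-31.0\n")
    merged_file = tmp_dir / "merged.csv"
    rows_written = merge_csv_files(tmp_dir.glob("batch_*.csv"), merged_file)
    merged_lines = merged_file.read_text().splitlines()

print(rows_written, len(merged_lines))
```

Sorting the input paths keeps the merged rows in a stable order, provided the batch file names sort chronologically.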

The final step of the function uses `build_event_instances()` to create the final ML table, in which the weather and tabular data are matched on their `date` and `peakid`. It creates the instances with the target variable from the tabular dataset and adds the relevant 7-day context window, so that the weather trends that affect summit probabilities can be analyzed. 
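The matching and windowing idea can be sketched in plain Python. This is an illustration under assumptions, not the body of `build_event_instances()`: the helper `build_context_window` is hypothetical, the window is assumed to end on the event date, events with an incomplete window are assumed to be dropped, and all weather values are synthetic.

```python
from datetime import date, timedelta


def build_context_window(weather, peak_id, event_date, days=7):
    """Return the `days` daily records ending on event_date, oldest first, or None if any is missing."""
    window = []
    for offset in range(days - 1, -1, -1):
        day = event_date - timedelta(days=offset)
        record = weather.get((peak_id, day))
        if record is None:
            return None  # incomplete window: the caller decides what to do
        window.append({"date": day, **record})
    return window


# Synthetic daily weather keyed by (peakid, date): ten days starting on 1 May 2019
weather = {
    ("EVER", date(2019, 5, 1) + timedelta(days=i)): {"t2m": -20.0 + i}
    for i in range(10)
}

# Synthetic events carrying the target variable from the tabular data
events = [
    {"peakid": "EVER", "date": date(2019, 5, 10), "Target": 1},  # window fully covered
    {"peakid": "EVER", "date": date(2019, 5, 3), "Target": 0},   # window starts before the weather data
]

instances = []
for event in events:
    window = build_context_window(weather, event["peakid"], event["date"])
    if window is not None:
        instances.append({**event, "window": window})

print(len(instances), len(instances[0]["window"]))
```

Keying the weather records by `(peakid, date)` turns the join into a dictionary lookup, so each event needs only seven lookups regardless of how many weather rows exist.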

In [None]:
from src.met_transformer.met_data_setup import load_era5_data

ml_table = load_era5_data()
instances_output_path = Path('data/era5_data/instances')
instances_output_path.mkdir(parents=True, exist_ok=True)
ml_table_output_file = instances_output_path / 'ml_table.csv'
ml_table.to_csv(ml_table_output_file, index=False)
print(f"Wrote {len(ml_table)} event instances to {ml_table_output_file}")