# Basics

MLP & Pairwise NIF, Size Analysis

    - MLPs with sizes reflecting those of NeuralFoil:
        - xxsmall
        - xsmall
        - small
        - medium
        - large
        - xlarge
        - xxlarge
        - xxxlarge
        - sub-enormous
        - enormous

    - Pairwise NIF models with the following sizes (paramnet+shapenet):
        - xxsmall + xsmall
        - xsmall + small
        - small + medium
        - medium + large
        - large + xlarge
        - xlarge + xxlarge
        - xxlarge + xxxlarge
        - xxxlarge + sub-enormous
        - sub-enormous + enormous

    - these sizes are:
        - xxsmall = 2 layers, 32 neurons
        - xsmall = 3 layers, 32 neurons
        - small = 3 layers, 48 neurons 
        - medium = 4 layers, 64 neurons
        - large = 4 layers, 128 neurons 
        - xlarge = 4 layers, 256 neurons
        - xxlarge = 5 layers, 256 neurons
        - xxxlarge = 5 layers, 512 neurons     
        - sub-enormous = 5 layers, 1024 neurons
        - enormous = 5 layers, 2048 neurons

    - with the following hyperparams/datasets:

        - AdamW optimizer:
            - constant lr = 1e-4
            - constant wd = 1e-2

        - airfoil_dataset_8bern.csv
            - 9 Bernstein coefficients * 2 + trailing edge * 2
            - set mach number = 0.1
            - reynolds: [3M, 6M, 9M]
            - aoa: [-4, -3, -2, -1, 0, 1, ..., 20]
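
The input layout listed above can be checked with a little arithmetic. The sketch below is only a sanity check under the assumption that the three flow conditions are Mach, Reynolds and angle of attack; the variable names are illustrative and not part of the project code.

```python
# Geometry features: 9 Bernstein coefficients for each of the two surfaces
# plus 2 trailing-edge parameters.
n_bernstein = 9
n_geometry = n_bernstein * 2 + 2      # geometry inputs (shape net)

# Flow conditions (assumed here): Mach number, Reynolds number, angle of attack.
n_conditions = 3
n_inputs = n_geometry + n_conditions  # inputs of the plain MLP

# Condition grid evaluated for every airfoil.
reynolds = [3e6, 6e6, 9e6]
aoa = list(range(-4, 21))             # -4, -3, ..., 20
n_cases_per_airfoil = len(reynolds) * len(aoa)

print(n_geometry, n_inputs, n_cases_per_airfoil)
```

These numbers line up with the `input_dim` values of 20, 3 and 23 used in the model configs later in the notebook.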

# Imports, Important Definitions, Model Definitions

In [1]:
%load_ext autoreload
%autoreload 2

# Enable autoreload so that changes to the helper functions are picked up without restarting the kernel

In [2]:
# Import all the models, training functions, manipulators here

# Define the relative paths, append it to the system path
import sys
from pathlib import Path
project_root = Path.cwd().resolve().parents[2]
github_root = Path.cwd().resolve().parents[1]
sys.path.append(str(project_root))
sys.path.append(str(github_root))

print(project_root)
print(github_root)

# Import shenanigans
from defs.helper_functions.training_functions import *
from defs.helper_functions.data_loaders import *
from defs.models.MLP import *
from defs.models.NIF import *

# Time, for timing the runs
import time

# Garbage collector
import gc

C:\SenkDosya\Projects\AeroML
C:\SenkDosya\Projects\AeroML\initial-project


In [3]:
# Set device

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
#device = torch.device("cpu")  # Force CPU for testing purposes
print(f"Using device: {device}")

Using device: cuda


In [4]:
# For the entire exploration, define the dictionaries to store stuff

exploration_dict = {
    'cfg_models':  {
        'cfg_mlps': {},
        'cfg_pairnifs': {}
    },
    'perf_models': {
        'r2s': {
            'mlp': {},
            'pairnif': {}
        },
        'maes': {
            'mlp': {},
            'pairnif': {}
        },
        'preds': {
            'mlp': {},
            'pairnif': {}
        },
        'flops': {
            'mlp': {},
            'pairnif': {}
        },
        'params': {
            'mlp': {},
            'pairnif': {}
        }
    }
}
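
The same nested layout is written out by hand again when the result dictionaries are merged later on. As a possible simplification, the skeleton could be generated by a small helper. This is only a sketch: `make_exploration_dict` and its parameters are hypothetical names, not part of the project code.

```python
def make_exploration_dict(model_types=("mlp", "pairnif"),
                          metrics=("r2s", "maes", "preds", "flops", "params")):
    """Build the empty nested structure used to collect configs and results."""
    return {
        # 'cfg_mlps', 'cfg_pairnifs'
        "cfg_models": {f"cfg_{m}s": {} for m in model_types},
        # one fresh sub-dict per metric and per model type
        "perf_models": {
            metric: {m: {} for m in model_types} for metric in metrics
        },
    }


exploration_dict_sketch = make_exploration_dict()
print(list(exploration_dict_sketch["perf_models"]))
```

Every leaf is a separate empty dict, so filling one entry does not affect the others.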

In [5]:
# Define the model sizes, append them to the exploration dictionary

model_size_list = [
    [32,32],
    [32,32,32],
    [48,48,48],
    [64,64,64,64],
    [128,128,128,128],
    [256,256,256,256],
    [256,256,256,256,256],
    [512,512,512,512,512],
    [1024,1024,1024,1024,1024],
    [2048,2048,2048,2048,2048]
]

model_name_list = [
    'xxsmall',
    'xsmall',
    'small',
    'medium',
    'large',
    'xlarge',
    'xxlarge',
    'xxxlarge',
    'sub_enormous',
    'enormous'
]

for i in range(len(model_name_list)):
    cfg_name = model_name_list[i]
    exploration_dict['cfg_models']['cfg_mlps'][cfg_name] = {
        'input_dim': 23,
        'output_dim': 3,
        'hidden_units': model_size_list[i],
        'activation': nn.GELU
    }

for i in range(len(model_name_list)-1):
    cfg_name = f'{model_name_list[i]}_{model_name_list[i+1]}'
    exploration_dict['cfg_models']['cfg_pairnifs'][cfg_name] = {
        'cfg_shape_net': {
            'input_dim': 20,
            'output_dim': 3,
            'hidden_units': model_size_list[i+1],
            'shape_activation': nn.GELU
        },
        'cfg_param_net': {
            'input_dim': 3,
            'hidden_units': model_size_list[i],
            'param_activation': nn.GELU
        }
    }
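
To get a feel for how quickly these configs grow, the parameter count of each MLP can be estimated by hand. The sketch below assumes the `MLP` class is a plain stack of fully connected layers with biases and nothing else (no normalization layers); `mlp_param_count` is a hypothetical helper, and the numbers reported by `get_n_params` may differ if the real model is built differently.

```python
def mlp_param_count(input_dim, hidden_units, output_dim):
    """Count weights and biases of a stack of fully connected layers."""
    widths = [input_dim, *hidden_units, output_dim]
    # Each layer has n_in * n_out weights plus n_out biases.
    return sum((n_in + 1) * n_out for n_in, n_out in zip(widths[:-1], widths[1:]))


# A few of the sizes defined in this notebook (input_dim=23, output_dim=3).
sizes = {"xxsmall": [32, 32], "medium": [64] * 4, "enormous": [2048] * 5}
for name, hidden in sizes.items():
    print(name, mlp_param_count(23, hidden, 3))
```

Because the hidden-to-hidden layers dominate, the count scales roughly with the square of the layer width.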


# Data Pre-Processing, Imports for Metrics

In [None]:
# Load and prepare the data
df = pd.read_csv(project_root / "airfoil_data" / "airfoil_dataset_8bern.csv")

df = df.drop(['N1', 'N2'], axis=1) # Remove N1 and N2 since all the airfoils are subsonic

df['Reynolds'] = df['Reynolds'] / 1000000 # Express Reynolds in millions so that its scale does not dominate the other inputs

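print(df.head) # Without parentheses this prints the bound method with the full-frame repr; df.head() would show only the first rows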

geom, cond, perf, names = get_dataset(df, loc_geometry=[1,20], loc_cond=[21,23], loc_perf_coeffs=[24,26], loc_names=0) # Get the necessary stuff for the dataset
print(df.shape); print(geom.shape); print(cond.shape); print(perf.shape); print(len(names))

ds = AirfoilDataset(geom, cond, perf, names)

del df, geom, cond, perf, names # Delete these to preserve memory

cfg_loader = {
    'n_epoch': 100,
    'n_train': 1000,
    'n_test': 17250,
    'train_batch': 1
}

dl_train, dl_val, dl_test = get_dataloaders(ds=ds, cfg_loader=cfg_loader)


<bound method NDFrame.head of         airfoil_name      Bu_0      Bu_1      Bu_2      Bu_3      Bu_4  \
0       airfoil_0001  0.273636 -0.200785  1.229138 -1.445725  1.790261   
1       airfoil_0001  0.273636 -0.200785  1.229138 -1.445725  1.790261   
2       airfoil_0001  0.273636 -0.200785  1.229138 -1.445725  1.790261   
3       airfoil_0001  0.273636 -0.200785  1.229138 -1.445725  1.790261   
4       airfoil_0001  0.273636 -0.200785  1.229138 -1.445725  1.790261   
...              ...       ...       ...       ...       ...       ...   
137245  airfoil_1830  0.327410  1.499799 -0.780311  3.149627 -1.415475   
137246  airfoil_1830  0.327410  1.499799 -0.780311  3.149627 -1.415475   
137247  airfoil_1830  0.327410  1.499799 -0.780311  3.149627 -1.415475   
137248  airfoil_1830  0.327410  1.499799 -0.780311  3.149627 -1.415475   
137249  airfoil_1830  0.327410  1.499799 -0.780311  3.149627 -1.415475   

            Bu_5      Bu_6      Bu_7     Bu_8  ...      Bl_7      Bl_8  \
0      

In [7]:
# Define the training config and bundle the dataloaders
cfg_train = {
    'cfg_loader': cfg_loader,
    'dtype': torch.float32,
    'device': device
}

dataloaders = [dl_train, dl_val, dl_test]

In [8]:
# Imports for analysis

# Metrics import
from sklearn.metrics import r2_score, mean_absolute_error
from torch.utils.flop_counter import FlopCounterMode
from defs.helper_functions.auxiliary import get_n_params

# Import plots and set up indices
from defs.helper_functions.plot import all_plots_ex_analysis
index = np.arange(cfg_train['cfg_loader']['n_epoch'] * cfg_train['cfg_loader']['n_train'])
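
The training loops in this notebook call `r2_score` and `mean_absolute_error` with `multioutput='raw_values'`, which yields one score per output column instead of a single average. The plain-Python sketch below shows what that amounts to on toy data; `r2_and_mae_per_output` is a hypothetical helper and the numbers are made up for illustration, not taken from the dataset.

```python
def r2_and_mae_per_output(y_true, y_pred):
    """Return per-column R^2 and MAE for two equally shaped lists of rows."""
    n_rows, n_cols = len(y_true), len(y_true[0])
    r2s, maes = [], []
    for j in range(n_cols):
        t = [row[j] for row in y_true]
        p = [row[j] for row in y_pred]
        mean_t = sum(t) / n_rows
        ss_res = sum((a - b) ** 2 for a, b in zip(t, p))   # residual sum of squares
        ss_tot = sum((a - mean_t) ** 2 for a in t)         # total sum of squares
        r2s.append(1.0 - ss_res / ss_tot)                  # assumes a non-constant target
        maes.append(sum(abs(a - b) for a, b in zip(t, p)) / n_rows)
    return r2s, maes


# Toy data: column 0 is predicted perfectly, column 1 is a small-magnitude
# output for which the model only predicts the mean.
y_true = [[1.0, 0.010], [2.0, 0.020], [3.0, 0.030]]
y_pred = [[1.0, 0.020], [2.0, 0.020], [3.0, 0.020]]
r2s, maes = r2_and_mae_per_output(y_true, y_pred)
print(r2s, maes)
```

The second column illustrates why the two metrics should be read together: a small-magnitude output can have a small MAE even when its R^2 is close to zero.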


# MLP Exploration

In [11]:
# Train and store all the MLP models

for i in exploration_dict['cfg_models']['cfg_mlps'].keys():
    
    print(f"Model: {i} | Started loop")
    model = MLP(cfg_mlp=exploration_dict['cfg_models']['cfg_mlps'][i])
    optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4, weight_decay=1e-2)
    model.to(device)
    loss_fn = nn.MSELoss()
    print(f"Model: {i} | Initialization done for model and optimizer")

    collector_dict, model = train(cfg_train=cfg_train, model=model, optimizer=optimizer, loss_fn=loss_fn, dataloaders=dataloaders)
    print(f"Model: {i} | Training complete")

    # Calculate and export all the R^2 and MAE results
    r2 = r2_score(collector_dict['actual']['test'].cpu(), collector_dict['preds']['test'].cpu(), multioutput='raw_values')
    mae = mean_absolute_error(collector_dict['actual']['test'].cpu(), collector_dict['preds']['test'].cpu(), multioutput='raw_values')
    exploration_dict['perf_models']['r2s']['mlp'][i] = r2
    exploration_dict['perf_models']['maes']['mlp'][i] = mae

    # Get the FLOPs per sample of the model and save them
    show_tensor_1 = torch.ones(1, 20).to(device)
    show_tensor_2 = torch.ones(1, 3).to(device)
    flop_counter = FlopCounterMode(display=False)
    with flop_counter:
        model(show_tensor_1, show_tensor_2)
    flops_per_sample = flop_counter.get_total_flops()
    exploration_dict['perf_models']['flops']['mlp'][i] = flops_per_sample

    # Save the param amount
    exploration_dict['perf_models']['params']['mlp'][i] = get_n_params(model=model)
    print(f"Model: {i} | Metrics saved: R^2, MAE, Flops, Param#")

    # Export the graphs necessary
    all_plots_ex_analysis(model_name=i, model_type='mlp', collector_dict=collector_dict, index=index)
    print(f"Model: {i} | Plots made and saved")

    # Export the predictions for future reference
    exploration_dict['perf_models']['preds']['mlp'][i] = collector_dict['preds']['test']
    print(f"Model: {i} | Predictions saved")

    # Save the model
    model_save_path = rf"trained_models\mlp\{i}"
    torch.save(model.state_dict(), model_save_path)
    print(f"Model: {i} | Model saved")

    # Delete unnecessary stuff, free up mem
    del collector_dict
    del model
    del optimizer
    gc.collect()
    if torch.cuda.is_available():
        torch.cuda.empty_cache()
    print(f"Model: {i} | Garbage collected, states deleted")

    # Now we are ready for the next model

Model: xxsmall | Started loop
Model: xxsmall | Initialization done for model and optimizer
Epoch 1 | Total train loss: 224.30980947127682
Epoch 1 | Val loss: 0.06863674521446228
Epoch 2 | Total train loss: 56.61697225809621
Epoch 2 | Val loss: 0.061338700354099274
Epoch 3 | Total train loss: 51.131289272918366
Epoch 3 | Val loss: 0.048057183623313904
Epoch 4 | Total train loss: 44.65498405406106
Epoch 4 | Val loss: 0.04417048394680023
Epoch 5 | Total train loss: 40.90106576779726
Epoch 5 | Val loss: 0.03214333578944206
Epoch 6 | Total train loss: 39.71340433509249
Epoch 6 | Val loss: 0.037278931587934494
Epoch 7 | Total train loss: 35.507425093503116
Epoch 7 | Val loss: 0.031549446284770966
Epoch 8 | Total train loss: 29.49805541930982
Epoch 8 | Val loss: 0.036712922155857086
Epoch 9 | Total train loss: 27.06246025377004
Epoch 9 | Val loss: 0.02603023685514927
Epoch 10 | Total train loss: 25.534248096603278
Epoch 10 | Val loss: 0.02814541570842266
Epoch 11 | Total train loss: 23.868175

<Figure size 640x480 with 0 Axes>
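
The per-sample FLOP count stored in the loop can be cross-checked with a rough estimate. The sketch below assumes the MLP is a plain stack of fully connected layers and counts two FLOPs (one multiply, one add) per weight, ignoring biases and activations; `mlp_matmul_flops` is a hypothetical helper, and the value reported by `FlopCounterMode` for the real model may differ.

```python
def mlp_matmul_flops(input_dim, hidden_units, output_dim, batch_size=1):
    """Estimate forward-pass FLOPs from the matrix multiplications only."""
    widths = [input_dim, *hidden_units, output_dim]
    # Multiply-accumulate operations: one per weight and per sample.
    macs = sum(n_in * n_out for n_in, n_out in zip(widths[:-1], widths[1:]))
    return 2 * macs * batch_size


print(mlp_matmul_flops(23, [32, 32], 3))        # xxsmall
print(mlp_matmul_flops(23, [2048] * 5, 3))      # enormous
```

Under these assumptions the FLOP count is close to twice the number of weights, so FLOPs and parameter count grow together for these models.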

In [12]:
import pickle

with open('exploration_dict_mlp.pkl', 'wb') as f:  # MLP results; this file is loaded again in the merge cell
    pickle.dump(exploration_dict, f)


# PairNIF Exploration

In [9]:
# Train and store all the PairNIF models

for i in exploration_dict['cfg_models']['cfg_pairnifs'].keys():
    
    print(f"Model: {i} | Started loop")
    model = NIF_Pointwise(cfg_shape_net=exploration_dict['cfg_models']['cfg_pairnifs'][i]['cfg_shape_net'], cfg_param_net=exploration_dict['cfg_models']['cfg_pairnifs'][i]['cfg_param_net'])
    optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4, weight_decay=1e-2)
    model.to(device)
    loss_fn = nn.MSELoss()
    print(f"Model: {i} | Initialization done for model and optimizer")

    collector_dict, model = train(cfg_train=cfg_train, model=model, optimizer=optimizer, loss_fn=loss_fn, dataloaders=dataloaders)
    print(f"Model: {i} | Training complete")

    # Calculate and export all the R^2 and MAE results
    r2 = r2_score(collector_dict['actual']['test'].cpu(), collector_dict['preds']['test'].cpu(), multioutput='raw_values')
    mae = mean_absolute_error(collector_dict['actual']['test'].cpu(), collector_dict['preds']['test'].cpu(), multioutput='raw_values')
    exploration_dict['perf_models']['r2s']['pairnif'][i] = r2
    exploration_dict['perf_models']['maes']['pairnif'][i] = mae

    # Get the FLOPs per sample of the model and save them
    show_tensor_1 = torch.ones(1, 20).to(device)
    show_tensor_2 = torch.ones(1, 3).to(device)
    flop_counter = FlopCounterMode(display=False)
    with flop_counter:
        model(show_tensor_1, show_tensor_2)
    flops_per_sample = flop_counter.get_total_flops()
    exploration_dict['perf_models']['flops']['pairnif'][i] = flops_per_sample

    # Save the param amount
    exploration_dict['perf_models']['params']['pairnif'][i] = get_n_params(model=model)
    print(f"Model: {i} | Metrics saved: R^2, MAE, Flops, Param#")

    # Export the graphs necessary
    all_plots_ex_analysis(model_name=i, model_type='pairnif', collector_dict=collector_dict, index=index)
    print(f"Model: {i} | Plots made and saved")

    # Export the predictions for future reference
    exploration_dict['perf_models']['preds']['pairnif'][i] = collector_dict['preds']['test']
    print(f"Model: {i} | Predictions saved")

    # Save the model
    model_save_path = rf"trained_models\pairnif\{i}"
    torch.save(model.state_dict(), model_save_path)
    print(f"Model: {i} | Model saved")

    # Delete unnecessary stuff, free up mem
    del collector_dict
    del model
    del optimizer
    gc.collect()
    if torch.cuda.is_available():
        torch.cuda.empty_cache()
    print(f"Model: {i} | Garbage collected, states deleted")

    # Now we are ready for the next model

Model: xxsmall_xsmall | Started loop
Model: xxsmall_xsmall | Initialization done for model and optimizer
Epoch 1 | Total train loss: 166.9293605924886
Epoch 1 | Val loss: 0.08314397931098938
Epoch 2 | Total train loss: 66.71018017265851
Epoch 2 | Val loss: 0.06249658763408661
Epoch 3 | Total train loss: 56.05375169841909
Epoch 3 | Val loss: 0.05712618678808212
Epoch 4 | Total train loss: 43.22379517216905
Epoch 4 | Val loss: 0.04093581438064575
Epoch 5 | Total train loss: 41.222005982142946
Epoch 5 | Val loss: 0.03509506955742836
Epoch 6 | Total train loss: 38.696511925663344
Epoch 6 | Val loss: 0.039484236389398575
Epoch 7 | Total train loss: 40.32471662902208
Epoch 7 | Val loss: 0.047474779188632965
Epoch 8 | Total train loss: 37.72624534962051
Epoch 8 | Val loss: 0.027198340743780136
Epoch 9 | Total train loss: 32.07437386317906
Epoch 9 | Val loss: 0.03052794188261032
Epoch 10 | Total train loss: 32.989973875519354
Epoch 10 | Val loss: 0.02601952850818634
Epoch 11 | Total train loss

<Figure size 640x480 with 0 Axes>

In [11]:
import pickle

with open('exploration_dict_pairnif.pkl', 'wb') as f:
    pickle.dump(exploration_dict, f)

# Merging Dicts, Cleaning Notebook

In [None]:
with open('exploration_dict_pairnif.pkl', 'rb') as f:
    exploration_dict_pairnif = pickle.load(f)

with open('exploration_dict_mlp.pkl', 'rb') as f:
    exploration_dict_mlp = pickle.load(f)

exploration_dict_full = {
    'cfg_models':  {
        'cfg_mlps': {},
        'cfg_pairnifs': {}
    },
    'perf_models': {
        'r2s': {
            'mlp': {},
            'pairnif': {}
        },
        'maes': {
            'mlp': {},
            'pairnif': {}
        },
        'preds': {
            'mlp': {},
            'pairnif': {}
        },
        'flops': {
            'mlp': {},
            'pairnif': {}
        },
        'params': {
            'mlp': {},
            'pairnif': {}
        }
    }
}

exploration_dict_full['cfg_models']['cfg_mlps'] = exploration_dict_mlp['cfg_models']['cfg_mlps']
exploration_dict_full['cfg_models']['cfg_pairnifs'] = exploration_dict_pairnif['cfg_models']['cfg_pairnifs']

exploration_dict_full['perf_models']['r2s']['mlp'] = exploration_dict_mlp['perf_models']['r2s']['mlp']
exploration_dict_full['perf_models']['r2s']['pairnif'] = exploration_dict_pairnif['perf_models']['r2s']['pairnif']

exploration_dict_full['perf_models']['maes']['mlp'] = exploration_dict_mlp['perf_models']['maes']['mlp']
exploration_dict_full['perf_models']['maes']['pairnif'] = exploration_dict_pairnif['perf_models']['maes']['pairnif']

exploration_dict_full['perf_models']['preds']['mlp'] = exploration_dict_mlp['perf_models']['preds']['mlp']
exploration_dict_full['perf_models']['preds']['pairnif'] = exploration_dict_pairnif['perf_models']['preds']['pairnif']

exploration_dict_full['perf_models']['flops']['mlp'] = exploration_dict_mlp['perf_models']['flops']['mlp']
exploration_dict_full['perf_models']['flops']['pairnif'] = exploration_dict_pairnif['perf_models']['flops']['pairnif']

exploration_dict_full['perf_models']['params']['mlp'] = exploration_dict_mlp['perf_models']['params']['mlp']
exploration_dict_full['perf_models']['params']['pairnif'] = exploration_dict_pairnif['perf_models']['params']['pairnif']

exploration_dict = exploration_dict_full

# Fetch the true test targets (they are added to the dict in the next cell)
test_data = next(iter(dl_test))
target = test_data['perf_coeffs']


with open('exploration_dict.pkl', 'wb') as f:
    pickle.dump(exploration_dict, f)

del exploration_dict_full
del exploration_dict_mlp
del exploration_dict_pairnif
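
The hand-written assignments in this cell could also be expressed as a loop over the metric names. The sketch below uses toy dictionaries in place of the unpickled ones; `merge_explorations` is a hypothetical helper, and it assumes every entry under `perf_models` has both an `mlp` and a `pairnif` sub-dict, which holds as long as no extra keys have been added yet.

```python
def merge_explorations(mlp_dict, pairnif_dict):
    """Take the MLP entries from one run and the PairNIF entries from the other."""
    merged = {
        "cfg_models": {
            "cfg_mlps": mlp_dict["cfg_models"]["cfg_mlps"],
            "cfg_pairnifs": pairnif_dict["cfg_models"]["cfg_pairnifs"],
        },
        "perf_models": {},
    }
    for metric in mlp_dict["perf_models"]:
        merged["perf_models"][metric] = {
            "mlp": mlp_dict["perf_models"][metric]["mlp"],
            "pairnif": pairnif_dict["perf_models"][metric]["pairnif"],
        }
    return merged


# Toy stand-ins for the two pickled dictionaries (values are made up).
toy_mlp = {
    "cfg_models": {"cfg_mlps": {"small": "cfg_a"}, "cfg_pairnifs": {}},
    "perf_models": {"r2s": {"mlp": {"small": 0.9}, "pairnif": {}}},
}
toy_pairnif = {
    "cfg_models": {"cfg_mlps": {}, "cfg_pairnifs": {"small_medium": "cfg_b"}},
    "perf_models": {"r2s": {"mlp": {}, "pairnif": {"small_medium": 0.8}}},
}
merged = merge_explorations(toy_mlp, toy_pairnif)
print(merged["perf_models"]["r2s"])
```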




In [25]:
# Add the true test targets to the exploration dict
test_data = next(iter(dl_test))
target = test_data['perf_coeffs']
exploration_dict['perf_models']['target'] = target


with open('exploration_dict.pkl', 'wb') as f:
    pickle.dump(exploration_dict, f)

# Analyses

The analyses of the training are as follows:

- Predicting Cd: Cd is the hardest output to predict; the R2 and MAE values for Cd are the lowest. This is somewhat expected from the NeuralFoil documentation.
    - Weighted loss?
    - SmoothL1 loss (Huber loss) should be tried, either for Cd only or for all outputs
- Numerical Instability: Training is numerically unstable for the models with more than 20M parameters. To fix this (and to improve training overall), the following changes will be made in the next exploration:
    - Gradient clipping using PyTorch
    - Layer normalization
- Model plateau: Many of the models plateau at a loss of around 0.005, and increasing the model size mainly seems to change when this plateau is reached. The plateau level itself does vary with model size (as expected), but only minimally.
    - New models must be tried out, since NeuralFoil makes much better predictions with fewer parameters:
        - Different input data
        - Possibly residual connections since the models are relatively deep
        - Any other architectural improvements
    - Learning rate schedulers must be implemented
- Improvements on Workflow:
    - Instead of logging the results in exploration_dict, everything should be logged to either TensorBoard or Weights & Biases
    - Use tmux + .py scripts for serialized exploration
        - The MLP runs took upwards of 100 min and the PairNIF runs upwards of 200 min on a 3050 Ti laptop; Jupyter + VS Code is probably adding a lot of overhead
        - The script must take its inputs from a .yaml file
    - The plots should not be produced by the training code; that capability must be removed, and users should make the plots themselves
    - Maybe checkpointing?
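
Three of the proposed fixes (SmoothL1/Huber loss, gradient clipping, a learning rate schedule) are simple enough to sketch in plain Python. In PyTorch the corresponding built-ins are `torch.nn.SmoothL1Loss`, `torch.nn.utils.clip_grad_norm_` and `torch.optim.lr_scheduler.CosineAnnealingLR`; the functions below only illustrate the underlying formulas. The values of `beta` and `max_norm` and the choice of a cosine schedule are illustrative assumptions, not settings used in these runs.

```python
import math


def smooth_l1(pred, target, beta=1.0):
    """Quadratic for small errors, linear for large ones (less outlier-sensitive than MSE)."""
    diff = abs(pred - target)
    if diff < beta:
        return 0.5 * diff ** 2 / beta
    return diff - 0.5 * beta


def clip_by_global_norm(grads, max_norm):
    """Rescale a flat list of gradients so that their L2 norm is at most max_norm."""
    total_norm = math.sqrt(sum(g * g for g in grads))
    if total_norm <= max_norm:
        return list(grads)
    scale = max_norm / total_norm
    return [g * scale for g in grads]


def cosine_lr(step, total_steps, lr_max=1e-4, lr_min=0.0):
    """Cosine annealing from lr_max at step 0 down to lr_min at total_steps."""
    cos_term = 1.0 + math.cos(math.pi * step / total_steps)
    return lr_min + 0.5 * (lr_max - lr_min) * cos_term


print(smooth_l1(0.5, 0.0), smooth_l1(3.0, 0.0))
print(clip_by_global_norm([3.0, 4.0], max_norm=1.0))
print([cosine_lr(s, 100) for s in (0, 50, 100)])
```

Clipping by the global norm keeps the direction of the update and only limits its size, which is why it is a common first remedy for unstable training of large models.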