# Compliance regressor training

#### This notebook launches the training of the regressor that predicts compliance. TopoDiff uses the gradient of this regressor to guide the model during sampling.

In [1]:
# import torch as th
# import numpy as np
# import matplotlib.pyplot as plt
import os
# import sys

The environment variable 'TOPODIFF_LOGDIR' defines the directory where the logs and model checkpoints will be saved.

In [2]:
os.environ['TOPODIFF_LOGDIR'] = './reg_logdir'
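This notebook does not say whether the training script creates the log directory itself. As a minimal sketch, assuming it is harmless to create the directory up front, the cell below does so and prints the resolved path:

```python
import os

# Same value as in the cell above; setdefault keeps a value that is already set.
os.environ.setdefault("TOPODIFF_LOGDIR", "./reg_logdir")
logdir = os.environ["TOPODIFF_LOGDIR"]

# exist_ok=True makes this safe to re-run when the directory already exists.
os.makedirs(logdir, exist_ok=True)
print("Logs and checkpoints will be written to:", os.path.abspath(logdir))
```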

'TRAIN_FLAGS', 'REGRESSOR_FLAGS' and 'DATA_FLAGS' respectively set the training parameters, the regressor hyperparameters and the directories where the training and validation data are located.

The default values below correspond to the hyperparameters reported in the Appendix of the paper.

In [3]:
TRAIN_FLAGS="--iterations 400000 --anneal_lr True --batch_size 64 --lr 3e-4 --save_interval 10000 --weight_decay 0.05 --regressor_use_fp16 True"
REGRESSOR_FLAGS="--image_size 64 --regressor_attention_resolutions 32,16,8 --regressor_width 128 --regressor_resblock_updown True --regressor_use_scale_shift_norm True"
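Before launching a long run, it can help to check what the flag strings contain. The sketch below is only an illustration: `flags_to_dict` is a hypothetical helper, not part of TopoDiff, and the sample count assumes the training loop processes one batch per iteration.

```python
import shlex

TRAIN_FLAGS = "--iterations 400000 --anneal_lr True --batch_size 64 --lr 3e-4 --save_interval 10000 --weight_decay 0.05 --regressor_use_fp16 True"
REGRESSOR_FLAGS = "--image_size 64 --regressor_attention_resolutions 32,16,8 --regressor_width 128 --regressor_resblock_updown True --regressor_use_scale_shift_norm True"

def flags_to_dict(flags):
    """Turn a '--name value --name value' string into a {name: value} dict.

    Hypothetical helper: it assumes every flag is followed by exactly one value,
    which holds for the flag strings in this notebook.
    """
    tokens = shlex.split(flags)
    return {name.lstrip("-"): value for name, value in zip(tokens[::2], tokens[1::2])}

train = flags_to_dict(TRAIN_FLAGS)
regressor = flags_to_dict(REGRESSOR_FLAGS)

iterations = int(train["iterations"])
batch_size = int(train["batch_size"])

# Assumption: one batch per iteration, so this is the total number of samples seen.
print("samples over the full run:", iterations * batch_size)
print("learning rate:", float(train["lr"]))
print("attention resolutions:", regressor["regressor_attention_resolutions"].split(","))
```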

To run the training, make sure the data folder is placed at the root of this directory.

All the images, physical fields, load arrays, boundary condition arrays and the compliance array must be together in the same folder (this is already the case in the data directory we provide).

In [4]:
DATA_FLAGS="--data_dir ./data/dataset_2_reg/training_data --val_data_dir ./data/dataset_2_reg/validation_data --noised True"
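A quick check that both directories exist can save a failed launch. This is a sketch only: `missing_data_dirs` is a hypothetical helper, and it checks only that the directories are present, not the files inside them, since the file naming is not described here.

```python
import os
import shlex

DATA_FLAGS = "--data_dir ./data/dataset_2_reg/training_data --val_data_dir ./data/dataset_2_reg/validation_data --noised True"

def missing_data_dirs(flags):
    """Return the directories passed to --data_dir / --val_data_dir that do not exist."""
    tokens = shlex.split(flags)
    # Pair each flag with the value that follows it.
    values = dict(zip(tokens[::2], tokens[1::2]))
    dirs = [values[key] for key in ("--data_dir", "--val_data_dir") if key in values]
    return [d for d in dirs if not os.path.isdir(d)]

missing = missing_data_dirs(DATA_FLAGS)
if missing:
    print("Missing data directories:", missing)
else:
    print("Both data directories found.")
```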

In [None]:
%run scripts/regressor_train.py $TRAIN_FLAGS $REGRESSOR_FLAGS $DATA_FLAGS

Logging to ./reg_logdir
creating model and diffusion...
creating data loader...
creating optimizer...
training regressor model...
Found NaN, decreased lg_loss_scale to 15.0
----------------------------
| lg_loss_scale | 16       |
| samples       | 64       |
| step          | 0        |
| train_loss    | 1.05     |
| train_R2      | -1.99    |
| val_loss      | 0.954    |
| val_R2        | -2.14    |
----------------------------
Found NaN, decreased lg_loss_scale to 14.0
----------------------------
| grad_norm     | 275      |
| lg_loss_scale | 14.1     |
| param_norm    | 185      |
| samples       | 704      |
| step          | 10       |
| train_loss    | 9.91     |
| train_R2      | -30.3    |
| val_loss      | 0.546    |
| val_R2        | -0.0745  |
----------------------------
----------------------------
| grad_norm     | 20.9     |
| lg_loss_scale | 14       |
| param_norm    | 185      |
| samples       | 1.34e+03 |
| step          | 20       |
| train_loss    | 0.467    |
|

By the end of the training, you should find a series of checkpoints in the reg_logdir directory. You can then use the last checkpoint as the regressor when sampling from TopoDiff (see the notebook **4_TopoDiff_sample**).
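To locate the last checkpoint programmatically, something like the sketch below may help. The checkpoint file names are not given in this notebook, so the `*.pt` pattern is an assumption, and `latest_checkpoint` is a hypothetical helper; narrow the pattern if other files in the directory share the extension.

```python
import glob
import os

def latest_checkpoint(logdir, pattern="*.pt"):
    """Return the most recently modified file matching pattern in logdir, or None.

    The default pattern is an assumption about how checkpoints are named.
    """
    candidates = glob.glob(os.path.join(logdir, pattern))
    return max(candidates, key=os.path.getmtime) if candidates else None

logdir = os.environ.get("TOPODIFF_LOGDIR", "./reg_logdir")
print("Latest checkpoint:", latest_checkpoint(logdir))
```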