This notebook benchmarks the model and analyses its performance. <br>

Because the goal is to compare performance across sample sizes, all model hyperparameters (including the configurable ones) are held fixed here.<br>

That is: <br>
- hidden layers: 3, with 64, 32, and 16 neurons respectively
- activation function: ReLU
- optimiser: Adam
- learning rate: 0.001
- max epochs: 200 (5 when profiling memory usage)
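
To give a sense of the model's size, the sketch below counts the trainable parameters of a fully connected network with these hidden layers. The input width of 5 matches the `X` shape printed later in this notebook; the single output unit is an assumption based on the scalar targets, and `count_parameters` is a hypothetical helper, not part of `performance_analysis`.

```python
# Layer widths: 5 input features, the three hidden layers listed above,
# and an ASSUMED single regression output.
layer_sizes = [5, 64, 32, 16, 1]


def count_parameters(sizes):
    """Count the weights and biases of a fully connected network."""
    total = 0
    for fan_in, fan_out in zip(sizes[:-1], sizes[1:]):
        total += fan_in * fan_out + fan_out  # weight matrix + bias vector
    return total


n_params = count_parameters(layer_sizes)
print(f"Trainable parameters: {n_params}")  # 3009 under the assumptions above
```

Under these assumptions the network is small (about 3K parameters), so CPU training is a reasonable setting for the benchmarks.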

In [1]:
import performance_analysis as pa

In [2]:
# Generate data for sample sizes of 1K, 5K, and 10K

X_1k, y_1k, Xtr_1k, Xval_1k, Xte_1k, ytr_1k, yval_1k, yte_1k = pa.generate_samples(1000)
X_5k, y_5k, Xtr_5k, Xval_5k, Xte_5k, ytr_5k, yval_5k, yte_5k = pa.generate_samples(5000)
X_10k, y_10k, Xtr_10k, Xval_10k, Xte_10k, ytr_10k, yval_10k, yte_10k = pa.generate_samples(10000)

In [3]:
# Preview the 1K samples as an example
print(f"  X shape: {X_1k.shape}, y shape: {y_1k.shape}")
print(f"  Train: {Xtr_1k.shape}, Val: {Xval_1k.shape}, Test: {Xte_1k.shape}")
print(f"\nFirst 3 samples of X_1k:\n{X_1k[:3]}")
print(f"\nFirst 3 targets of y_1k:\n{y_1k[:3]}")

  X shape: (1000, 5), y shape: (1000,)
  Train: (600, 5), Val: (200, 5), Test: (200, 5)

First 3 samples of X_1k:
[[ 2.05654356  0.60685059  0.48268789 -1.13088844  0.42009449]
 [-0.79919201 -0.64596418 -0.18289644 -0.48274352  1.37487642]
 [ 1.07600714 -0.79602586 -0.75196933  0.02131165 -0.31905394]]

First 3 targets of y_1k:
[ 74.52976776 -42.55455597 -25.34810088]
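
The shapes above imply a 60/20/20 train/validation/test split. How `pa.generate_samples` produces it is not shown here, so the following is only a minimal sketch of one way to obtain such a split; `split_indices`, its parameters, and the seed are hypothetical.

```python
import random


def split_indices(n_samples, train_frac=0.6, val_frac=0.2, seed=0):
    """Shuffle sample indices and cut them into train/val/test groups."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # seeded for reproducibility
    n_train = round(n_samples * train_frac)
    n_val = round(n_samples * val_frac)
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]  # whatever remains goes to the test set
    return train, val, test


train_idx, val_idx, test_idx = split_indices(1000)
print(len(train_idx), len(val_idx), len(test_idx))  # 600 200 200
```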


## Training Time
This section measures the training time on the full training set, run for all epochs.
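
The internals of `pa.measure_training_time` are not shown in this notebook. Judging from its printed output, one plausible reading is an untimed warm-up run followed by 10 timed runs that are averaged. A minimal sketch of that pattern, with a hypothetical `measure` helper and a dummy workload standing in for the real training call:

```python
import time
from statistics import mean


def measure(train_fn, n_iterations=10):
    """Time train_fn after one untimed warm-up run.

    Returns (first timed run, mean over all timed runs), in seconds.
    """
    train_fn()  # warm-up: absorbs one-off costs such as lazy initialisation
    durations = []
    for _ in range(n_iterations):
        start = time.perf_counter()
        train_fn()
        durations.append(time.perf_counter() - start)
    return durations[0], mean(durations)


def dummy_train():
    # Stand-in workload; the real benchmark would fit the model here.
    sum(i * i for i in range(10_000))


first, average = measure(dummy_train)
print(f"Training time after warm-up: {first:.4f} seconds")
print(f"Average training time over 10 iterations: {average:.4f} seconds")
```

`time.perf_counter` is used because it is a monotonic, high-resolution clock intended for measuring short durations.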

In [4]:
# 1K data
time_1k = pa.measure_training_time(Xtr_1k, ytr_1k)

Training time after warm-up: 1.2916 seconds
Average training time over 10 iterations: 1.3007 seconds


In [5]:
# 5K data
time_5k = pa.measure_training_time(Xtr_5k, ytr_5k)

Training time after warm-up: 6.1176 seconds
Average training time over 10 iterations: 6.2208 seconds


In [6]:
# 10K data
time_10k = pa.measure_training_time(Xtr_10k, ytr_10k)

Training time after warm-up: 12.3131 seconds
Average training time over 10 iterations: 12.3917 seconds


The tests above show that the model trains on CPU in well under a minute for datasets of up to 10K samples: on average about 1.3 s, 6.2 s, and 12.4 s for the 1K, 5K, and 10K datasets respectively.
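
As a quick check of how training time scales, the averages reported above can be divided by the training-set size. The 600-sample figure comes from the preview output; the 3000 and 6000 figures assume the same 60% training split for the larger datasets.

```python
# Average training times in seconds, copied from the outputs above.
avg_time = {"1k": 1.3007, "5k": 6.2208, "10k": 12.3917}

# Training-set sizes: 600 is shown in the preview; 3000 and 6000 are
# ASSUMED from the same 60% split.
n_train = {"1k": 600, "5k": 3000, "10k": 6000}

# Cost of the whole training run, per training sample, in milliseconds.
ms_per_sample = {k: 1000 * avg_time[k] / n_train[k] for k in avg_time}

for name, value in ms_per_sample.items():
    print(f"{name}: {value:.2f} ms per training sample")
```

Under those assumptions the cost stays close to 2 ms per training sample at all three sizes, which is consistent with training time growing roughly linearly with the number of samples.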

## Memory Usage

Only a few epochs (e.g. 5) are profiled here to find bottlenecks in memory usage. Profiling the entire fit method (e.g. with 200 epochs) would produce a file that is too big.

In [7]:
memory_1k = pa.memory_usage(Xtr_1k, ytr_1k, Xte_1k);

Memory usage during training:
-------------------------------------------------------  ------------  ------------  ------------  ------------  ------------  ------------  ------------  ------------  
                                                   Name    Self CPU %      Self CPU   CPU total %     CPU total  CPU time avg       CPU Mem  Self CPU Mem    # of Calls  
-------------------------------------------------------  ------------  ------------  ------------  ------------  ------------  ------------  ------------  ------------  
                                               aten::mm         2.15%       1.697ms         2.56%       2.020ms       5.771us       1.83 MB       1.83 MB           350  
                                            aten::addmm         2.07%       1.635ms         3.24%       2.557ms      12.787us       1.29 MB       1.29 MB           200  
                                        aten::clamp_min         0.46%     366.055us         0.46%     366.055us       2.

In [None]:
memory_5k = pa.memory_usage(Xtr_5k, ytr_5k, Xte_5k)

TypeError: memory_usage() takes 3 positional arguments but 4 were given

In [None]:
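# memory_usage() takes three positional arguments, as in the 1K and 5K calls;
# passing yte_10k as a fourth argument raises a TypeError.
memory_10k = pa.memory_usage(Xtr_10k, ytr_10k, Xte_10k)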