In [1]:
:set -XPackageImports

In [2]:
import "prehsept" Lib
import            Net
import            Run
import            HyperParameters
import            Data.Frame       as DF
import qualified  Torch            as T
import qualified  Torch.Extensions as T
import qualified  Torch.NN         as NN

# Model Training

This notebook walks through the same steps as [run](https://augustunderground.github.io/prehsept/Run.html#v:run),
but interactively.

Here we will train a neural network on operating point data of a 
primitive device to model the following behaviour:

$$
\begin{bmatrix}
    g_{m} / I_{d} \\
    f_{ug} \\
    V_{ds} \\
    V_{bs}
\end{bmatrix}
\mapsto
\begin{bmatrix}
    I_{d} / W \\
    L \\
    g_{ds} / W \\
    V_{gs}
\end{bmatrix}
$$

We start by defining the device type `dev` and PDK `pdk`, along with the split ratio, the batch size and the number of training epochs.

In [3]:
pdk'        = GPDK180   :: PDK
dev'        = NMOS      :: Device
pdk         = show pdk'
dev         = show dev'
testSplit   = 0.8       :: Float
batchSize   = 1000      :: Int
numEpisodes = 10        :: Int

Next we define the columns of interest in our data and which ones are inputs and outputs.

In [4]:
cols      = [ "gmoverid", "idoverw", "gdsoverw", "fug"
            , "Vds", "Vgs", "Vbs", "vth", "id", "W", "L" 
            ] :: [String]
                
paramsX   = ["gmoverid", "fug", "Vds", "Vbs"]   :: [String]
paramsY   = ["idoverw", "L", "gdsoverw", "Vgs"] :: [String]

numX      = length paramsX :: Int
numY      = length paramsY :: Int

We also have to specify which columns should be in log space.

In [5]:
maskX = boolMask' ["fug"]                 paramsX :: T.Tensor
maskY = boolMask' ["idoverw", "gdsoverw"] paramsY :: T.Tensor
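
As a rough illustration of what such a mask does, the sketch below works on plain lists instead of tensors. `maskFor` and `logWhere` are hypothetical stand-ins, not the library's `boolMask'` or `trafo`. The assumption is that flagged columns are mapped through `log10 . abs` while all other columns pass through unchanged.

```haskell
-- Hypothetical list-based stand-ins; the notebook itself uses tensors.

-- | Flag every column name that should be treated in log space.
maskFor :: [String] -> [String] -> [Bool]
maskFor logCols allCols = map (`elem` logCols) allCols

-- | Apply log10 to the flagged entries of one row, leave the rest untouched.
-- Assumption: magnitudes are used, so the sign is dropped before the log.
logWhere :: [Bool] -> [Double] -> [Double]
logWhere = zipWith (\isLog x -> if isLog then logBase 10 (abs x) else x)
```

With the input columns used here, `maskFor ["fug"] ["gmoverid", "fug", "Vds", "Vbs"]` flags only the second column.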

Now we create a path with the current time stamp, `../models/<pdk>/<dev>-YYYYMMDD-HHMMSS`, where the model checkpoints will be stored. We also define the path to the operating point data of the device we would like to train for.

In [6]:
modelPath <- createModelDir' pdk dev
dataPath = "../data/" ++ pdk ++ "-" ++ dev  ++ ".pt"
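
The path layout described above can be sketched with the `time` package. `modelDirFor` is a hypothetical helper that only builds the string; unlike `createModelDir'` it does not read the clock or create a directory, and the PDK and device names in the example are passed as plain strings for illustration.

```haskell
import Data.Time (UTCTime (..), defaultTimeLocale, formatTime, fromGregorian)

-- | Build "<base>/<pdk>/<dev>-YYYYMMDD-HHMMSS" for a given time stamp.
-- Hypothetical helper, not part of the library used in this notebook.
modelDirFor :: FilePath -> String -> String -> UTCTime -> FilePath
modelDirFor base pdkName devName t =
  base ++ "/" ++ pdkName ++ "/" ++ devName ++ "-" ++ stamp
  where
    stamp = formatTime defaultTimeLocale "%Y%m%d-%H%M%S" t

-- | Example with a fixed time stamp: 2023-01-02 01:02:03 UTC
-- (3723 seconds after midnight).
examplePath :: FilePath
examplePath =
  modelDirFor "../models" "GPDK180" "NMOS"
              (UTCTime (fromGregorian 2023 1 2) 3723)
```

In an interactive session one would pass the result of `getCurrentTime` instead of a fixed value.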

## Data Loading and Processing

With this we can load the data into a `DataFrame`.

In [7]:
df' <- DF.fromFile pdk' dataPath



From this we extract the columns of interest and do some slight processing.

In [8]:
vals   = T.cat (T.Dim 1) [ T.abs $  df' ?? "M0.m1:gmoverid"
                         , T.abs $ (df' ?? "M0.m1:id")  / (df' ?? "W")
                         , T.abs $ (df' ?? "M0.m1:gds") / (df' ?? "W")
                         , T.abs $  df' ?? "M0.m1:fug"
                         ,          df' ?? "M0.m1:vds"
                         ,          df' ?? "M0.m1:vgs"
                         ,          df' ?? "M0.m1:vbs"
                         ,          df' ?? "M0.m1:vth"
                         ,          df' ?? "M0.m1:id"
                         ,          df' ?? "W"
                         ,          df' ?? "L"
                         ]

dfRaw' = DF.dropNan $ DataFrame cols vals

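For training we prefer data where the device is in saturation. We keep all rows in saturation and add a random sample of rows outside saturation, a quarter as many as the saturated ones.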

In [9]:
sat    = satMask dev' dfRaw'
sat'   = T.logicalNot sat
nSat'  = (`div` 4) . head . T.shape . T.nonzero $ sat
dfSat  = rowFilter sat dfRaw'

dfSat' <- DF.sampleIO nSat' False $ rowFilter sat' dfRaw'
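
The balancing step can be illustrated on plain lists. `balance` is a hypothetical stand-in: it takes the first rows of the non-saturated part deterministically, whereas the notebook draws them at random with `DF.sampleIO`.

```haskell
import Data.List (partition)

-- | Keep every row that satisfies the predicate and add at most a quarter
-- as many rows that do not. Hypothetical list-based sketch; the selection
-- here is deterministic rather than random.
balance :: (a -> Bool) -> [a] -> [a]
balance inSat rows = sat ++ take nOther other
  where
    (sat, other) = partition inSat rows
    nOther       = length sat `div` 4
```

For 8 rows in saturation this keeps 2 rows outside saturation, so the result still covers the other operating regions without being dominated by them.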



Then we combine both parts and shuffle the rows.

In [10]:
dfShuff <- DF.shuffleIO (DF.concat [dfSat, dfSat'])



Next we transform the data, moving the masked columns into log space, to obtain a better distribution of values.

In [11]:
dfT  = DF.dropNan 
     $ DF.union (trafo maskX <$> DF.lookup paramsX dfShuff)
                (trafo maskY <$> DF.lookup paramsY dfShuff)
dfX' = DF.lookup paramsX dfT
dfY' = DF.lookup paramsY dfT

For scaling we need the column-wise minimum and maximum values of the inputs and outputs. The same values are needed again later to rescale the model's predictions.

In [12]:
minX = fst . T.minDim (T.Dim 0) T.RemoveDim . values $ dfX'
maxX = fst . T.maxDim (T.Dim 0) T.RemoveDim . values $ dfX'
minY = fst . T.minDim (T.Dim 0) T.RemoveDim . values $ dfY'
maxY = fst . T.maxDim (T.Dim 0) T.RemoveDim . values $ dfY'

In [13]:
dfX = scale minX maxX <$> dfX'
dfY = scale minY maxY <$> dfY'
df  = DF.dropNan $ DF.union dfX dfY
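
A minimal list-based sketch of min-max scaling follows, assuming that `scale` maps each column to the range [0, 1] and that `scale'` is its inverse. The names `colMin`, `colMax`, `scaleRow` and `unscaleRow` are hypothetical and not part of the library.

```haskell
import Data.List (transpose)

-- | Column-wise minimum and maximum of a row-major table.
colMin, colMax :: [[Double]] -> [Double]
colMin = map minimum . transpose
colMax = map maximum . transpose

-- | Map each column of one row from [lo, hi] to [0, 1].
scaleRow :: [Double] -> [Double] -> [Double] -> [Double]
scaleRow = zipWith3 (\lo hi x -> (x - lo) / (hi - lo))

-- | Inverse of 'scaleRow': map [0, 1] back to [lo, hi].
unscaleRow :: [Double] -> [Double] -> [Double] -> [Double]
unscaleRow = zipWith3 (\lo hi y -> y * (hi - lo) + lo)
```

A column whose minimum equals its maximum leads to a division by zero in this sketch, which may be one reason for the `DF.dropNan` after scaling.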

## Model Definition

Now we define the neural network and the optimizer.

In [14]:
net <- T.toDevice T.gpu <$> T.sample (OpNetSpec numX numY)

opt = T.mkAdam 0 β1 β2 $ NN.flattenParameters net
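
For reference, `β1` and `β2` are the exponential decay rates of Adam's first and second moment estimates; they are presumably defined in `HyperParameters`, and their values are not shown in this notebook. The standard Adam update for parameters $\theta$ with gradient $g_t$ reads as follows, where the learning rate $\eta$ and the small constant $\epsilon$ are symbols introduced here for illustration:

$$
\begin{aligned}
m_t &= \beta_1 m_{t-1} + (1 - \beta_1)\, g_t \\
v_t &= \beta_2 v_{t-1} + (1 - \beta_2)\, g_t^2 \\
\hat{m}_t &= \frac{m_t}{1 - \beta_1^t}, \qquad
\hat{v}_t = \frac{v_t}{1 - \beta_2^t} \\
\theta_t &= \theta_{t-1} - \eta\, \frac{\hat{m}_t}{\sqrt{\hat{v}_t} + \epsilon}
\end{aligned}
$$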

We split the data into a training set and a validation set, then divide each of them into batches.

In [15]:
(trainX', validX', trainY', validY') = 
        trainTestSplit paramsX paramsY testSplit df

trainX = T.split batchSize (T.Dim 0) . T.toDevice T.gpu $ trainX'
trainY = T.split batchSize (T.Dim 0) . T.toDevice T.gpu $ trainY'
validX = T.split batchSize (T.Dim 0) . T.toDevice T.gpu $ validX'
validY = T.split batchSize (T.Dim 0) . T.toDevice T.gpu $ validY'
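
The two steps can be sketched on plain lists. `splitRatio` and `batches` are hypothetical helpers, and the sketch assumes that the ratio is the fraction of rows kept for training, which this notebook does not state explicitly.

```haskell
-- | Split rows into a leading training part and a trailing validation part.
-- Assumption: the ratio is the fraction of rows used for training.
splitRatio :: Double -> [a] -> ([a], [a])
splitRatio ratio rows = splitAt nTrain rows
  where
    nTrain = floor (ratio * fromIntegral (length rows))

-- | Cut a list into consecutive batches; the last one may be shorter.
batches :: Int -> [a] -> [[a]]
batches n xs
  | n <= 0    = error "batches: batch size must be positive"
  | null xs   = []
  | otherwise = front : batches n rest
  where
    (front, rest) = splitAt n xs
```

Since the rows were shuffled beforehand, taking the leading part for training does not bias either set.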

## Training

Now we run the training for a given number of epochs:

In [16]:
(net', opt') <- runEpochs modelPath numEpisodes trainX validX trainY validY net opt


Afterwards we save a checkpoint, in case we wish to continue training at a later time.

In [19]:
saveCheckPoint modelPath net' opt'

Alternatively, we can trace the model and store it as TorchScript for inference. See `./haskell_inference.ipynb` and `./python_inference.ipynb` for how to load and use such a model in Haskell and Python, respectively.

In [21]:
net'' <- T.toDevice T.cpu <$> noGrad net'

tracePath = modelPath ++ "/trace.pt"
predict   = trafo' maskY
          . scale' minY maxY
          . forward net''
          . scale minX maxX
          . trafo maskX

traceModel dev' pdk' numX predict >>= saveInferenceModel tracePath
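
The `predict` function wraps the network so that callers pass and receive physical values. Because `(.)` composes right to left, the input is first transformed, then scaled, then fed to the network, and the output is unscaled and transformed back. A scalar sketch of this pattern follows; all names and the range `0` to `3` are hypothetical stand-ins, and the log transform is an assumption about what `trafo` does.

```haskell
-- Hypothetical scalar stand-ins for the tensor functions in the notebook.
toLog, fromLog :: Double -> Double
toLog   = logBase 10 . abs
fromLog = (10 **)

scaleTo, scaleFrom :: Double -> Double -> Double -> Double
scaleTo   lo hi x = (x - lo) / (hi - lo)
scaleFrom lo hi y = y * (hi - lo) + lo

-- | Wrap a model that works on scaled values so that it accepts and
-- returns physical values. Read the pipeline from right to left.
wrap :: (Double -> Double) -> Double -> Double
wrap model = fromLog . scaleFrom 0 3 . model . scaleTo 0 3 . toLog
```

With the identity function as the model, `wrap id` returns its input for positive values, up to floating-point rounding, which is a quick way to check that the pre- and post-processing steps are inverses of each other.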