# Train a model

## Installation

In [1]:
%%bash

pip install graph-pes | tail -n 1

Successfully installed graph-pes-0.0.1


We should now have access to the `graph-pes-train` command. We can check this by running:

In [2]:
%%bash

graph-pes-train -h

usage: graph-pes-train [-h] [--config CONFIG] [overrides ...]

Train a GraphPES model from a configuration file using PyTorch Lightning.

positional arguments:
  overrides        Config overrides in the form nested^key=value, separated by
                   spaces, e.g. fitting^loader_kwargs^batch_size=32.

options:
  -h, --help       show this help message and exit
  --config CONFIG  Path to the configuration file. This argument can be used
                   multiple times, with later files taking precedence over
                   earlier ones in the case of conflicts. If no config files
                   are provided, the script will auto-generate.

Copyright 2023-24, John Gardner
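
The `overrides` argument described in the help text can be combined with `--config`. The sketch below assumes a config file named `quickstart-config.yaml` (the one used later on this page) and borrows the `fitting^loader_kwargs^batch_size=32` example from the help output. The second override, `fitting^trainer_kwargs^max_epochs=50`, uses an illustrative value only. Quoting each override is a precaution, since some shells treat `^` specially.

```shell
# Collect the arguments once so they can be printed and then reused.
# Each override is quoted so the shell passes it through unchanged.
set -- --config quickstart-config.yaml \
    'fitting^loader_kwargs^batch_size=32' \
    'fitting^trainer_kwargs^max_epochs=50'

# Show the full command for inspection.
echo "graph-pes-train $*"

# Run it only if the CLI is installed and the config file exists.
if command -v graph-pes-train >/dev/null 2>&1 && [ -f quickstart-config.yaml ]; then
    graph-pes-train "$@"
fi
```

According to the help text, the nested keys are separated by `^` and individual overrides are separated by spaces.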


## Configuration
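
The contents of `quickstart-config.yaml` are not shown on this page. As a rough sketch only, the snippet below writes a config file whose values are copied from the configuration that `graph-pes-train` echoes in the training log further down. The file name `config-sketch.yaml` is hypothetical. The selection of keys is also an assumption: the log shows the full resolved configuration, so the original file may set fewer (or different) entries and rely on defaults for the rest.

```shell
# Hypothetical file name, chosen so that an existing
# quickstart-config.yaml is not overwritten.
CONFIG_SKETCH=config-sketch.yaml

# Values copied from the configuration echoed in the training log.
# Which of these keys the real quickstart file sets is an assumption.
cat > "$CONFIG_SKETCH" <<'EOF'
model:
   graph_pes.models.PaiNN:
      layers: 3
      cutoff: 3.0
data:
   graph_pes.data.load_atoms_dataset:
      id: QM7
      cutoff: 3.0
      n_train: 1000
      n_valid: 100
loss: graph_pes.training.loss.PerAtomEnergyLoss()
fitting:
   trainer_kwargs:
      max_epochs: 25
   loader_kwargs:
      batch_size: 16
   optimizer:
      graph_pes.training.opt.Optimizer:
         name: AdamW
         lr: 0.01
general:
   seed: 42
   run_id: quickstart-run
wandb:
   project: graph-pes
EOF

# Print the file back to confirm it was written as intended.
cat "$CONFIG_SKETCH"
```

A file like this could then be passed to the CLI with `--config`, in the same way as `quickstart-config.yaml` is below.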

## Let's train

In [3]:
%%bash

export LOAD_ATOMS_VERBOSE=0  # disable load-atoms progress bars
graph-pes-train --config quickstart-config.yaml

Seed set to 42


[graph-pes INFO]: Set logging level to INFO
[graph-pes INFO]: Started training at 2024-09-16 17:02:04.007
[graph-pes INFO]: Output directory: graph-pes-results/quickstart-run
[graph-pes INFO]: 
Logging using WandbLogger(
  project="graph-pes",
  id="quickstart-run",
  save_dir="graph-pes-results"
)



GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
wandb: Currently logged in as: jla-gardner. Use `wandb login --relogin` to force relogin
wandb: wandb version 0.18.0 is available!  To upgrade, please run:
wandb:  $ pip install wandb --upgrade
wandb: Tracking run with wandb version 0.17.1
wandb: Run data is saved locally in graph-pes-results/wandb/run-20240916_170205-quickstart-run
wandb: Run `wandb offline` to turn off syncing.
wandb: Resuming run trial-run
wandb: ⭐️ View project at https://wandb.ai/jla-gardner/graph-pes
wandb: 🚀 View run at https://wandb.ai/jla-gardner/graph-pes/runs/quickstart-run


[graph-pes INFO]: Logging to graph-pes-results/quickstart-run/logs/rank-0.log
[graph-pes INFO]: 
model:
   graph_pes.models.PaiNN:
      layers: 3
      cutoff: 3.0
data:
   graph_pes.data.load_atoms_dataset:
      id: QM7
      cutoff: 3.0
      n_train: 1000
      n_valid: 100
loss: graph_pes.training.loss.PerAtomEnergyLoss()
fitting:
   pre_fit_model: true
   max_n_pre_fit: 5000
   early_stopping_patience: null
   trainer_kwargs:
      max_epochs: 25
      accelerator: auto
      enable_model_summary: false
   loader_kwargs:
      num_workers: 0
      persistent_workers: false
      batch_size: 16
      pin_memory: false
   optimizer:
      graph_pes.training.opt.Optimizer:
         name: AdamW
         lr: 0.01
   scheduler: null
   swa: null
general:
   seed: 42
   root_dir: graph-pes-results
   run_id: quickstart-run
   log_level: INFO
   progress: logged
wandb:
   project: graph-pes

[graph-pes INFO]: 
FittingData(
  train=ASEDataset(1,000, labels=['energy']),
  valid=ASEDataset

   epoch   valid/loss/total   valid/loss/per_atom_energy_mae_component   timer/its_per_s/train   timer/its_per_s/valid
       0            0.14216                                    0.14216                83.33334               207.14285
       1            0.14075                                    0.14075                90.90909               221.42857
       2            0.05074                                    0.05074                90.90909               214.28572
       3            0.03484                                    0.03484               100.00000               214.28572
       4            0.05504                                    0.05504                90.90909               214.28572
       5            0.07713                                    0.07713                83.33334               207.14285
       6            0.16388                                    0.16388                83.33334               221.42857
       7            0.07709                     

`Trainer.fit` stopped: `max_epochs=25` reached.


[graph-pes INFO]: Loading best weights from "graph-pes-results/quickstart-run/checkpoints/best.ckpt"
[graph-pes INFO]: Training complete.
[graph-pes INFO]: Model saved to graph-pes-results/quickstart-run/model.pt
[graph-pes INFO]: Deploying model for use with LAMMPS to graph-pes-results/quickstart-run/lammps_model.pt


wandb: \ 0.013 MB of 0.013 MB uploaded
wandb: Run history:
wandb:                                    epoch ▁▁▁▂▂▂▂▂▂▂▃▃▃▃▃▄▄▄▄▄▅▅▅▅▅▅▆▆▆▆▇▇▇▇▇▇▇███
wandb:                           lr-AdamW/model ▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                   n_learnable_parameters ▁
wandb:                             n_parameters ▁
wandb:                    timer/its_per_s/train ▆▆▆▆▆▆▆▆▆▆▆▆▄▆██▆▆█▆▆█▆▄▆▄▄▆▆▁▆
wandb:                    timer/its_per_s/valid ▄▇▅▅▅▄▇▅▇▅▁▇▇▅▇▇▅▇▇▇█▅█▇▇
wandb:             timer/step_duration_ms/train ▃▃▃▃▃▃▃▃▃▃▃▃▅▃▁▁▃▃▁▃▃▁▃▅▃▅▅▃▃█▃
wandb:             timer/step_duration_ms/valid ▅▂▃▃▃▅▂▃▂▃█▂▂▃▂▂▃▂▂▂▁▃▁▂▂
wandb: train/loss/per_atom_energy_mae_component ▆█▂▁▂▃▂▂▄▁▁▁▁▂▂▁▄▁▁▂▁▃▂▂▃▂▃▃▂▄▂
wandb:                         train/loss/total ▆█▂▁▂▃▂▂▄▁▁▁▁▂▂▁▄▁▁▂▁▃▂▂▃▂▃▃▂▄▂
wandb:        train/metrics/per_atom_energy_mae ▆█▂▁▂▃▂▂▄▁▁▁▁▂▂▁▄▁▁▂▁▃▂▂▃▂▃▃▂▄▂
wandb:                      trainer/global_step ▁▁▁▂▂▂▂▂▂▃▃▃▃▃▄▄▄▄▄▄▅▅▅▅▅▅▆▆▆▆▆▇▇▇▇▇▇▇██
wandb: valid/loss/per_atom_energy_mae_com