#TODO add all options to the parameter descriptions, for example isolated_atoms_shift and per_element_regression_shift

- **n_epochs**: int (required)

  >Number of training epochs.



- **seed**: int (default = 1)

  >Seed for initializing random numbers.



- **patience**: int (optional)

  >Number of epochs without improvement before training termination.



- **n_models**: int (default = 1)

  >Number of models trained simultaneously.



- **n_jitted_steps**: int (default = 1)

  >Number of training batches processed in a compiled loop. Can yield significant speedups for small structures or small batch sizes.

- **Data**
  - directory: str (default = "models/")

    >Path to directory where training results and checkpoints are written.


  - experiment: str (default = "apax")

    >Name of the model. Distinguishes it from other models trained in the same `directory`.


  - data_path: str (required if train_data_path and val_data_path are not specified)

    >Path to a single dataset file. Set either this or both `train_data_path` and `val_data_path`.
    

  - train_data_path: str (required if data_path is not specified)
    >Path to training dataset.

  - val_data_path: str (required if data_path is not specified)
    >Path to validation dataset.

  - test_data_path: str (optional)
    >Path to test dataset.

  - n_train: int (default = 1000)
    >Number of training data points.
    
  - n_valid: int (default = 100)
    >Number of validation data points.
    
  - batch_size: int (default = 32)
    >Number of training examples evaluated at once.
    
  - valid_batch_size: int (default = 100)
    >Number of validation examples evaluated at once.
    
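  - shift_method: str (default = "per_element_regression_shift")
    >Method for computing the per-element energy shifts, e.g. `per_element_regression_shift` or `isolated_atoms_shift`.

  - shift_options: dict (default = {"energy_regularisation": 1.0})
    >Options for the chosen shift method. `energy_regularisation` sets the magnitude of the regularization in the per-element energy regression. #TODO fill in the other options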
    
  - shuffle_buffer_size: int (default = 1000)
    >Size of `tf.data` shuffle buffer.
    
  - pos_unit: str (default = "Ang")
    >Positional unit.
    
  - energy_unit: str (default = "eV")
    >Energy unit.
    
  - additional_properties_info: dict (optional)
    >Dictionary of property name and shape (ragged or fixed) pairs.
    


- **Model**
  - n_basis: int (default = 7)
    >Number of Gaussian basis functions.

  - n_radial: int (default = 5)
    >Number of contracted basis functions.

  - nn: list of int (default = [512, 512])
    >Hidden layers and units.

  - r_max: float (default = 6.0)
    >Descriptor cutoff radius in angstrom.

  - r_min: float (default = 0.5)
    >Position of the first uncontracted basis function's mean in angstrom.

  - use_zbl: bool (default = false)
    >Use the empirical Ziegler-Biersack-Littmark (ZBL) potential.

  - b_init: str (default = "normal")
    >Initialization scheme for the neural network biases. Either `normal` or `zeros`.

  - descriptor_dtype: str (default = "fp64")
    >Descriptor data type.

  - readout_dtype: str (default = "fp32")
    >Readout data type.

  - scale_shift_dtype: str (default = "fp32")
    >Scale/Shift data type.
    


- **Loss**
  - loss_type: str (default = "structures")
    >Weighting scheme for atomic contributions.  #TODO fill in the other options

  - name: str (default = "energy")
    >Quantity keyword.

  - weight: float (default = 1.0)
    >Weighting factor in loss function.

  - name: str (default = "forces")
    >Quantity keyword.

  - weight: float (default = 4.0)
    >Weighting factor in loss function.


- **Metrics**
  - name: str (default = "energy")
    >Quantity keyword.
    
  - reductions: list of str (default = [mae])
    >Reductions on target-prediction differences. Can be `mae`, `mse` or `rmse`.
    
  - name: str (default = "forces")
    >Quantity keyword.
    
  - reductions: list of str (default = [mae, mse])
    >Reductions on target-prediction differences. Can be `mae`, `mse` or `rmse`; for forces, `angle` is also available.



- **Optimizer**
  - opt_name: str (default = "adam")
    >Optimizer name. Can be any `optax` optimizer.

  - opt_kwargs: dict (default = {})
    >Optimizer keyword arguments, passed to the `optax` optimizer.
    
  - emb_lr: float (default = 0.03)
    >Learning rate for elemental embedding contraction coefficients.
    
  - nn_lr: float (default = 0.03)
    >Learning rate for neural network parameters.
    
  - scale_lr: float (default = 0.001)
    >Learning rate for elemental output scaling factors.
    
  - shift_lr: float (default = 0.05)
    >Learning rate for elemental output shifts.
    
  - zbl_lr: float (default = 0.001)
    >Learning rate for the Ziegler-Biersack-Littmark (ZBL) potential parameters.
    
  - transition_begin: int (default = 0)
    >Number of training steps (not epochs) before the start of the linear learning rate schedule.



- **Callbacks**
  - name: str (default = "csv")
    >Callback name. Currently `csv` and `tensorboard` are implemented.



- **Progress Bar**
  - disable_epoch_pbar: bool (default = false)
    >Disable epoch progress bar.

  - disable_nl_pbar: bool (default = false)
    >Disable NL precomputation progress bar.


- **Checkpoints**
  - ckpt_interval: int (default = 1)
    >Epochs between checkpoints.
    
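  - base_model_checkpoint: str (optional)
    >Path to the folder containing a pre-trained model checkpoint. Used for transfer learning.

  - reset_layers: list of str (optional)
    >Names of the layers whose parameters are reinitialized. Used for transfer learning.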
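
The dataset path rules above allow two layouts: one file that is split into training and validation sets, or data that comes pre-split. A minimal sketch of both, assuming placeholder file names (`dataset.extxyz`, `train.extxyz`, `val.extxyz` and `test.extxyz` are hypothetical and only illustrate where the paths go):

```yaml
# Option A: a single file; training and validation points are taken from it.
data:
  data_path: dataset.extxyz         # hypothetical file name
  n_train: 1000
  n_valid: 100

# Option B: pre-split data; use these keys instead of `data_path`.
# data:
#   train_data_path: train.extxyz   # hypothetical file name
#   val_data_path: val.extxyz       # hypothetical file name
#   test_data_path: test.extxyz     # optional, hypothetical file name
```

Only one of the two options should be active in a given file, which is why Option B is commented out here.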

| Parameter                  | Default Value                  | Description                                                                 |
|----------------------------|--------------------------------|-----------------------------------------------------------------------------|
| **n_epochs**               | `<NUMBER OF EPOCHS>`           | Number of training epochs.                                                  |
| **seed**                   | 1                              | Seed for initializing random numbers.                                       |
| **patience**               | None                           | Number of epochs without improvement before training termination.          |
| **n_models**               | 1                              | Number of models trained simultaneously.                                    |
| **n_jitted_steps**         | 1                              | Number of training batches in a compiled loop. Can speed up small batches.  |
| **Data**                   |                                |                                                                             |
| directory                  | models/                        | Path to directory where training results and checkpoints are written.      |
| experiment                 | apax                           | Name of the model. Distinguishes it from others in the same directory.      |
| data_path                  | `<PATH>`                       | Path to single dataset file.                                               |
| train_data_path            | `<PATH>`                       | Path to training dataset.                                                   |
| val_data_path              | `<PATH>`                       | Path to validation dataset.                                                 |
| test_data_path             | `<PATH>`                       | Path to test dataset.                                                       |
| n_train                    | 1000                           | Number of training data points.                                             |
| n_valid                    | 100                            | Number of validation data points.                                           |
| batch_size                 | 32                             | Number of training examples evaluated at once.                             |
| valid_batch_size           | 100                            | Number of validation examples evaluated at once.                           |
| shift_method               | "per_element_regression_shift" | Method for computing the per-element energy shifts.                         |
| shift_options              | energy_regularisation: 1.0     | Regularization magnitude for the per-element energy regression.             |
| shuffle_buffer_size        | 1000                           | Size of `tf.data` shuffle buffer.                                           |
| pos_unit                   | Ang                            | Positional unit.                                                            |
| energy_unit                | eV                             | Energy unit.                                                                |
| additional_properties_info |                                | Dictionary of property name, shape pairs.                                   |
| **Model**                  |                                |                                                                             |
| n_basis                    | 7                              | Number of Gaussian basis functions.                                         |
| n_radial                   | 5                              | Number of contracted basis functions.                                      |
| nn                         | [512, 512]                     | Hidden layers and units.                                                    |
| r_max                      | 6.0                            | Descriptor cutoff radius in angstrom.                                       |
| r_min                      | 0.5                            | Position of the first basis function's mean in angstrom.                    |
| use_zbl                    | false                          | Use the empirical Ziegler-Biersack-Littmark (ZBL) potential.                |
| b_init                     | normal                         | Initialization scheme for biases.                                           |
| descriptor_dtype           | fp64                           | Descriptor data type.                                                       |
| readout_dtype              | fp32                           | Readout data type.                                                          |
| scale_shift_dtype          | fp32                           | Scale/Shift data type.                                                      |
| **Loss**                   |                                |                                                                             |
| loss_type                  | structures                     | Weighting scheme for atomic contributions.                                 |
| name                       | energy                         | Quantity keyword.                                                           |
| weight                     | 1.0                            | Weighting factor in loss function.                                          |
| name                       | forces                         | Quantity keyword.                                                           |
| weight                     | 4.0                            | Weighting factor in loss function.                                          |
| **Metrics**                |                                |                                                                             |
| name                       | energy                         | Quantity keyword.                                                           |
| reductions                 | mae                            | Reductions on target-prediction differences.                                |
| name                       | forces                         | Quantity keyword.                                                           |
| reductions                 | mae, mse                       | Reductions on target-prediction differences.                               |
| **Optimizer**              |                                |                                                                             |
| opt_name                   | adam                           | Optimizer name.                                                             |
| opt_kwargs                 | {}                             | Optimizer keyword arguments.                                                |
| emb_lr                     | 0.03                           | Learning rate for elemental embedding contraction coefficients.            |
| nn_lr                      | 0.03                           | Learning rate for neural network parameters.                                |
| scale_lr                   | 0.001                          | Learning rate for elemental output scaling factors.                        |
| shift_lr                   | 0.05                           | Learning rate for elemental output shifts.                                  |
| zbl_lr                     | 0.001                          | Learning rate for the ZBL potential parameters.                             |
| transition_begin           | 0                              | Training steps before linear learning rate schedule.                        |
| **Callbacks**              |                                |                                                                             |
| name                       | csv                            | Callback name.                                                              |
| **Progress Bar**           |                                |                                                                             |
| disable_epoch_pbar         | false                          | Disable epoch progress bar.                                                 |
| disable_nl_pbar            | false                          | Disable NL precomputation progress bar.                                     |
| **Checkpoints**            |                                |                                                                             |
| ckpt_interval              | 1                              | Epochs between checkpoints.                                                 |
| base_model_checkpoint      | null                           | Path to pre-trained model checkpoint.                                       |
| reset_layers               | []                             | List of layers to reinitialize parameters.                                  |
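
The table lists `name`, `weight` and `reductions` more than once because Loss and Metrics are lists with one entry per quantity. A sketch of how those rows map to YAML, using the values from the table:

```yaml
loss:
- name: energy            # first Loss entry
  loss_type: structures
  weight: 1.0
- name: forces            # second Loss entry
  loss_type: structures
  weight: 4.0

metrics:
- name: energy            # first Metrics entry
  reductions:
  - mae
- name: forces            # second Metrics entry
  reductions:
  - mae
  - mse
```

Each `- name:` line starts a new list entry, so a further quantity would be added as another entry of the same shape.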


# Complete Configuration File
 
```yaml
n_epochs: <NUMBER OF EPOCHS>  # Number of training epochs.
seed: 1                       # Seed for initialising random numbers.
patience: null                # Number of epochs without improvement before training is terminated.
n_models: 1                   # Number of models to be trained at once.
n_jitted_steps: 1             # Number of train batches to be processed in a compiled loop.
                              # Can yield significant speedups for small structures or small batch sizes.

data:
  directory: models/          # Path to the directory where the training results and checkpoints will be written.
  experiment: apax            # Name of the model. Distinguishes it from the other models trained in the same `directory`.
  data_path: <PATH>           # Path to a single dataset file. Set either this or `val_data_path` and `train_data_path`.
  train_data_path: <PATH>     # Path to a training dataset. Set this and `val_data_path` if your data comes pre-split.
  val_data_path: <PATH>       # Path to a validation dataset. Set this and `train_data_path` if your data comes pre-split.
  test_data_path: <PATH>      # Path to a test dataset. Set this, `train_data_path` and `val_data_path` if your data comes pre-split.

  n_train: 1000               # Number of training datapoints from `data_path`.
  n_valid: 100                # Number of validation datapoints from `data_path`.

  batch_size: 32              # Number of training examples to be evaluated at once.
  valid_batch_size: 100       # Number of validation examples to be evaluated at once.

  shift_method: "per_element_regression_shift"
  shift_options:
    energy_regularisation: 1.0    # Magnitude of the regularization in the per-element energy regression.
  shuffle_buffer_size: 1000       # Size of the `tf.data` shuffle buffer.

  pos_unit: Ang
  energy_unit: eV

  additional_properties_info:     # Dict of property name, shape (ragged or fixed) pairs

model:
  n_basis: 7                  # Number of uncontracted Gaussian basis functions.
  n_radial: 5                 # Number of contracted basis functions.
  nn: [512, 512]              # Number of hidden layers and units in those layers.

  r_max: 6.0                  # Cutoff radius of the descriptor.
  r_min: 0.5                  # Position of the first uncontracted basis function's mean.

  use_zbl: false              # Whether to use the empirical Ziegler-Biersack-Littmark (ZBL) potential.

  b_init: normal              # Initialization scheme for the neural network biases. Either `normal` or `zeros`.
  descriptor_dtype: fp64
  readout_dtype: fp32
  scale_shift_dtype: fp32

loss:
- loss_type: structures       # Weighting scheme for atomic contributions.
                              # See the MLIP package for reference 10.1088/2632-2153/abc9fe for details
  name: energy                # Keyword of the quantity e.g `energy`.
  weight: 1.0                 # Weighting factor in the overall loss function.
- loss_type: structures
  name: forces
  weight: 4.0

metrics:
- name: energy                # Keyword of the quantity e.g `energy`.
  reductions:                 # List of reductions performed on the difference between target and predictions.
                              # Can be mae, mse, rmse for energies and forces. For forces it is also possible to use `angle`.
  - mae
- name: forces
  reductions:
  - mae
  - mse

optimizer:
  opt_name: adam            # Name of the optimizer. Can be any `optax` optimizer.
  opt_kwargs: {}            # Optimizer keyword arguments. Passed to the `optax` optimizer.
  emb_lr: 0.03              # Learning rate of the elemental embedding contraction coefficients.
  nn_lr: 0.03               # Learning rate of the neural network parameters.
  scale_lr: 0.001           # Learning rate of the elemental output scaling factors.
  shift_lr: 0.05            # Learning rate of the elemental output shifts.
  zbl_lr: 0.001             # Learning rate of the ZBL potential parameters.
  transition_begin: 0       # Number of training steps (not epochs) before the start of the linear learning rate schedule.

callbacks:
- name: csv                 # Keyword of the callback used. Currently we implement "csv" and "tensorboard".

progress_bar:
  disable_epoch_pbar: false   # Set to true to disable the epoch progress bar.
  disable_nl_pbar: false      # Set to true to disable the NL precomputation progress bar.


checkpoints:
  ckpt_interval: 1                # Number of epochs between checkpoints.
  
                                  # The options below are used for transfer learning
  base_model_checkpoint: null     # Path to the folder containing a pre-trained model ckpt.
  reset_layers: []                # List of layer names for which the parameters will be reinitialized.

```
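
Most keys in the listing have a default, so a working file can presumably be much shorter. A minimal sketch, assuming that omitted keys fall back to the defaults documented above; the epoch count, patience value, experiment name and file name are illustrative choices, not recommendations:

```yaml
n_epochs: 200                 # illustrative value; required, has no default
patience: 20                  # illustrative value; stop after 20 epochs without improvement

data:
  experiment: my_first_model  # hypothetical model name
  data_path: dataset.extxyz   # hypothetical file name
  n_train: 1000
  n_valid: 100

callbacks:
- name: tensorboard           # "csv" is the default; "tensorboard" is the other documented option
```

Starting from a short file like this and adding keys only when a default needs to change keeps the configuration easier to review.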