Default parameters

These default parameters control database construction, model training, high-throughput prediction ...

You can import these parameters from the agat module:

from agat.default_parameters import default_elements, default_build_properties, default_data_config, default_train_config, default_high_throughput_config

Or you can read the source code.

default_elements

The list of element symbols used to encode atomic features when building graphs.

['Ac', 'Ag', 'Al', 'Am', 'Ar', 'As', 'At', 'Au', 'B',  'Ba',
'Be', 'Bh', 'Bi', 'Bk', 'Br', 'C',  'Ca', 'Cd', 'Ce', 'Cf',
'Cl', 'Cm', 'Cn', 'Co', 'Cr', 'Cs', 'Cu', 'Db', 'Ds', 'Dy',
'Er', 'Es', 'Eu', 'F',  'Fe', 'Fl', 'Fm', 'Fr', 'Ga', 'Gd',
'Ge', 'H',  'He', 'Hf', 'Hg', 'Ho', 'Hs', 'I',  'In', 'Ir',
'K',  'Kr', 'La', 'Li', 'Lr', 'Lu', 'Lv', 'Mc', 'Md', 'Mg',
'Mn', 'Mo', 'Mt', 'N',  'Na', 'Nb', 'Nd', 'Ne', 'Nh', 'Ni',
'No', 'Np', 'O',  'Og', 'Os', 'P',  'Pa', 'Pb', 'Pd', 'Pm',
'Po', 'Pr', 'Pt', 'Pu', 'Ra', 'Rb', 'Re', 'Rf', 'Rg', 'Rh',
'Rn', 'Ru', 'S',  'Sb', 'Sc', 'Se', 'Sg', 'Si', 'Sm', 'Sn',
'Sr', 'Ta', 'Tb', 'Tc', 'Te', 'Th', 'Ti', 'Tl', 'Tm', 'Ts',
'U',  'V',  'W',  'Xe', 'Y',  'Yb', 'Zn', 'Zr']
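
As a minimal sketch of how such a list can encode atomic features, the snippet below builds a one-hot vector from an element's position in the list. The one-hot scheme is an assumption made for illustration (it is consistent with the first entry of `gat_node_dim_list` being `len(default_elements)`); the helper `one_hot` and the shortened `elements` list are hypothetical and not part of AGAT.

```python
# Shortened stand-in for default_elements (the real list has 118 symbols).
elements = ['Ac', 'Ag', 'Al', 'Am', 'Ar']


def one_hot(symbol, species):
    """Return a one-hot vector marking the position of `symbol` in `species`.

    Hypothetical helper for illustration; AGAT's own encoding may differ.
    """
    if symbol not in species:
        raise ValueError(f"{symbol!r} is not in the species list")
    vector = [0.0] * len(species)
    vector[species.index(symbol)] = 1.0
    return vector


al_feature = one_hot('Al', elements)
print(al_feature)  # [0.0, 0.0, 1.0, 0.0, 0.0]
```

With the full list, every vector would have 118 entries, one per element symbol.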

default_build_properties

A dictionary that defines which properties are built into the graph.

| Parameter | Default value | Alternative(s) | Explanation |
| --- | --- | --- | --- |
| `energy` | `True` | `False` | Include total energy when building graphs. |
| `forces` | `True` | `False` | Include atomic forces when building graphs. |
| `cell` | `True` | `False` | Include the structural cell when building graphs. |
| `cart_coords` | `True` | `False` | Include Cartesian coordinates when building graphs. |
| `frac_coords` | `True` | `False` | Include fractional coordinates when building graphs. |
| `constraints` | `True` | `False` | Include constraint information when building graphs. |
| `stress` | `True` | `False` | Include virial stress when building graphs. |
| `distance` | `True` | `False` | Include the distance between connected atoms when building graphs. |
| `direction` | `True` | `False` | Include the unit vector between connected atoms when building graphs. |
| `path` | `False` | `True` | Include the file path of the DFT calculation corresponding to each graph. |
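
A small sketch of changing a few of these switches without mutating the defaults. The dictionary below is a local copy of the table so that the snippet runs without AGAT installed, and `override` is a hypothetical helper, not an AGAT function.

```python
# Local copy of the defaults listed in the table above.
default_build_properties = {
    'energy': True, 'forces': True, 'cell': True,
    'cart_coords': True, 'frac_coords': True, 'constraints': True,
    'stress': True, 'distance': True, 'direction': True,
    'path': False,
}


def override(defaults, **changes):
    """Return a new dict with `changes` applied; reject unknown keys.

    Hypothetical helper: catching misspelled keys early avoids silently
    ignored settings.
    """
    unknown = set(changes) - set(defaults)
    if unknown:
        raise KeyError(f"Unknown build properties: {sorted(unknown)}")
    return {**defaults, **changes}


# Skip the stress and record the DFT path for each graph instead.
my_build_properties = override(default_build_properties, stress=False, path=True)
```

The original dictionary is left untouched, so other parts of a script can still rely on the defaults.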

default_data_config

A dictionary that defines how to build a database.

| Parameter | Default value | Alternative(s) | Explanation |
| --- | --- | --- | --- |
| `species` | `default_elements` above | A list of element symbols | A list of elements that are used to encode atomic features. |
| `path_file` | `'paths.log'` | `str` | A file of absolute paths where OUTCAR and XDATCAR files exist. |
| `build_properties` | `default_build_properties` above | See `default_build_properties` | Properties to be built into the graph. |
| `topology_only` | `False` | `True` | Build graphs with topological connections only. The energy, forces, cell, and stress will not be included. This setting takes priority over `default_build_properties`. |
| `dataset_path` | `'dataset'` | `str` | A directory that contains the database. |
| `mode_of_NN` | `'ase_natural_cutoffs'` | `'ase_natural_cutoffs'`, `'pymatgen_dist'`, `'ase_dist'`, and `'voronoi'` | How connections between atoms are detected. Note that pymatgen is much faster than ase. |
| `cutoff` | `5.0` | `float` | Cutoff distance for identifying connections between atoms. Not used if `mode_of_NN` is `'ase_natural_cutoffs'`. |
| `load_from_binary` | `False` | `True` | Read graphs from previously constructed binary graphs. If this is `True`, the parameters above are ignored. |
| `num_of_cores` | `2` | `int` | Number of cores used to extract VASP files and build graphs. |
| `super_cell` | `False` | `True` | When building graphs, small cells may have problems finding neighbors. Set this parameter to `True` to repeat the cell and avoid such problems. |
| `has_adsorbate` | `False` | `True` | Include adsorbate information when building graphs. For now, only H and O atoms are considered adsorbate atoms. |
| `keep_readable_structural_files` | `False` | `True` | A large number of structural files (POSCARs) are generated under `dataset_path` when building graphs; you can choose whether to keep them. |
| `mask_similar_frames` | `False` | `True` | In VASP calculations, the energy optimization generates many frames that have similar geometries and total energies. You can extract only some of them by specifying this parameter and `energy_stride` below. |
| `mask_reversed_magnetic_moments` | `False` | `float` | Frames with atomic magnetic moments lower than this value will be masked. |
| `scale_prop` | `False` | `True` | Scale the properties. This function seems to be deprecated; do not use it until the source code has been double-checked. |
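
The precedence of `topology_only` over the build properties can be pictured with the sketch below. It only illustrates the behaviour described in the table (energy, forces, cell, and stress are dropped); the function `effective_build_properties` is hypothetical and is not AGAT's actual implementation.

```python
# Local copy of default_build_properties so the sketch runs without AGAT.
default_build_properties = {
    'energy': True, 'forces': True, 'cell': True,
    'cart_coords': True, 'frac_coords': True, 'constraints': True,
    'stress': True, 'distance': True, 'direction': True,
    'path': False,
}

# Only the two keys relevant to this illustration are shown.
data_config = {
    'build_properties': default_build_properties,
    'topology_only': True,
}


def effective_build_properties(config):
    """Return the build properties after applying `topology_only`.

    Hypothetical helper: when `topology_only` is True, the four properties
    named in the table are switched off regardless of their own settings.
    """
    props = dict(config['build_properties'])  # copy, keep defaults intact
    if config.get('topology_only', False):
        for key in ('energy', 'forces', 'cell', 'stress'):
            props[key] = False
    return props


effective = effective_build_properties(data_config)
```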

default_train_config

A dictionary that determines how to train the AGAT model.

| Parameter | Default value | Alternative(s) | Explanation |
| --- | --- | --- | --- |
| `verbose` | `1` | `0`, `1`, `2` | Output verbosity. `0`: test output; `1`: validation and test output; `2`: train, validation, and test output. |
| `dataset_path` | `'dataset'` | `str` | A directory that contains the database. |
| `model_save_dir` | `'agat_model'` | `str`, a directory name | A directory in which to save the trained model. |
| `epochs` | `1000` | `int` | Number of training epochs. |
| `output_files` | `'out_file'` | `str` | A directory to store outputs of true and predicted properties. |
| `device` | `'cuda:0'` | `'cpu'` | Device on which to train the model. Use GPU cards to accelerate training. |
| `validation_size` | `0.15` | `float`, `0<validation_size<1` | The proportion of the dataset to include in the validation split. |
| `test_size` | `0.15` | `float`, `0<test_size<1` | The proportion of the dataset to include in the test split. |
| `early_stop` | `True` | `False` | Whether to use early stopping. If `True`, training is terminated after a specified number of epochs without model improvement. If `False`, the model weights are saved every epoch. |
| `stop_patience` | `300` | `int` | Active when `early_stop=True`. Training is terminated after `stop_patience` epochs without model improvement. |
| `head_list` | `['mul', 'div', 'free']` | `list` | A list of attention mechanisms. See agat/model/model.py. |
| `gat_node_dim_list` | `[len(default_elements), 100, 100, 100]` | `list` | Node dimensions of the AGAT layers. |
| `energy_readout_node_list` | `[len(head_list)*gat_node_dim_list[-1], 100, 50, 30, 10, 3, 1]` | `list` | A list of node dimensions of the energy readout layers. |
| `force_readout_node_list` | `[len(head_list)*gat_node_dim_list[-1], 100, 50, 30, 10, 3, 3]` | `list` | A list of node dimensions of the force readout layers. |
| `stress_readout_node_list` | `[len(head_list)*gat_node_dim_list[-1], 100, 50, 30, 10, 3, 6]` | `list` | A list of node dimensions of the stress readout layers. |
| `bias` | `True` | `False` | Whether to add bias to the neural networks. |
| `negative_slope` | `0.2` | `float` | The negative slope of the LeakyReLU activation function. |
| `criterion` | `nn.MSELoss()` | `torch.nn` loss functions | The loss criterion. The default measures the mean squared error (squared L2 norm) between each element in the input x and target y. |
| `a` | `1.0` | `float` | The importance of the energy loss in the total loss function. See agat/model/fit.py. |
| `b` | `1.0` | `float` | The importance of the force loss in the total loss function. See agat/model/fit.py. |
| `c` | `0.0` | `float` | The importance of the stress loss in the total loss function. See agat/model/fit.py. |
| `learning_rate` | `0.0001` | `float` | The learning rate of the `torch.optim.Adam` optimizer. |
| `weight_decay` | `0.0` | `float` | The weight decay of the `torch.optim.Adam` optimizer. |
| `batch_size` | `64` | `int` | Training batch size. |
| `val_batch_size` | `400` | `int` | Batch size for validation and testing. |
| `transfer_learning` | `False` | `True` | Turn on transfer learning when `True`. (Deprecated) |
| `trainable_layers` | `-4` | negative `int` | The tail `trainable_layers` layers are trainable; all other layers are frozen. (Deprecated) |
| `mask_fixed` | `False` | `True` | Whether to mask fixed atoms. When `True`, the atomic forces of fixed atoms are not included in the loss function. (Deprecated) |
| `tail_readout_no_act` | `[3,3,3]` | `list` | The tail `tail_readout_no_act` layers have no activation functions. The first, second, and third elements are for the energy, force, and stress readout layers, respectively. |
| `adsorbate_coeff` | `20.0` | `float` | Identifies and specifies the importance of adsorbate atoms with respect to surface atoms. Zero for equal importance. |
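
Two of the entries above are easier to read with numbers filled in. The sketch below evaluates the readout input width from the default `head_list` and `gat_node_dim_list`, and shows one way the coefficients `a`, `b`, and `c` could combine the three losses. The weighted-sum form of `total_loss` is an assumption made for illustration; the authoritative definition is in agat/model/fit.py.

```python
# Defaults copied from the table above; 118 is len(default_elements).
head_list = ['mul', 'div', 'free']
gat_node_dim_list = [118, 100, 100, 100]

# Width of the first readout layer, following the expression in the table:
# len(head_list) * gat_node_dim_list[-1].
readout_input_dim = len(head_list) * gat_node_dim_list[-1]

energy_readout_node_list = [readout_input_dim, 100, 50, 30, 10, 3, 1]
force_readout_node_list = [readout_input_dim, 100, 50, 30, 10, 3, 3]
stress_readout_node_list = [readout_input_dim, 100, 50, 30, 10, 3, 6]


def total_loss(energy_loss, force_loss, stress_loss, a=1.0, b=1.0, c=0.0):
    """Combine the three losses as a weighted sum.

    Assumed form for illustration only; see agat/model/fit.py for the
    actual loss used by AGAT.
    """
    return a * energy_loss + b * force_loss + c * stress_loss


print(readout_input_dim)            # 300
print(total_loss(0.25, 0.5, 2.0))   # 0.75, the stress term is switched off by c=0.0
```

Under this assumption, the default `c = 0.0` means the stress does not contribute to the total loss unless `c` is raised.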

default_ase_calculator_config

See the ASE BFGS optimizer (ase.optimize.BFGS) for more details.

| Parameter | Default value | Alternative(s) | Explanation |
| --- | --- | --- | --- |
| `fmax` | `0.1` | `float` | Convergence criterion for atomic forces. See the ASE optimizer documentation for details. |
| `steps` | `200` | `int` | Maximum number of iteration steps. |
| `maxstep` | `0.05` | `float` | Maximum distance an atom can move per iteration. Unit: Å. |
| `restart` | `None` | `str` | Pickle file used to store the Hessian matrix. |
| `restart_steps` | `0` | `int` | Restart the optimization if it cannot converge. |
| `perturb_steps` | `0` | `int` | Number of perturbation steps. AGAT may have difficulty converging BFGS; perturbing the atomic positions may help convergence. |
| `perturb_amplitude` | `0.05` | `float` | Perturbation amplitude if `perturb_steps` is larger than 1. |
| `out` | `None` | `str` | Base name of the output log and trajectory files. |
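
To make `fmax` and `maxstep` concrete, the sketch below reproduces their usual meaning in ASE-style optimizers in plain Python: a structure counts as converged when the largest per-atom force norm is below `fmax`, and a proposed step is scaled down uniformly when any atom would move farther than `maxstep`. The function names `is_converged` and `cap_step` are hypothetical; this is not the code that ASE or AGAT runs.

```python
import math


def norm(vector):
    """Euclidean length of a 3-component vector."""
    return math.sqrt(sum(component * component for component in vector))


def is_converged(forces, fmax=0.1):
    """True when the largest per-atom force norm is below `fmax`."""
    return max(norm(force) for force in forces) < fmax


def cap_step(displacements, maxstep=0.05):
    """Scale all displacements uniformly so no atom moves more than `maxstep`."""
    longest = max(norm(d) for d in displacements)
    if longest <= maxstep:
        return [list(d) for d in displacements]
    scale = maxstep / longest
    return [[scale * component for component in d] for d in displacements]


forces = [[0.0, 0.0, 0.05], [0.03, 0.04, 0.0]]
proposed = [[0.2, 0.0, 0.0], [0.0, 0.1, 0.0]]

converged = is_converged(forces)   # both force norms are about 0.05
capped = cap_step(proposed)        # longest step is rescaled to maxstep
```

Scaling every displacement by the same factor keeps the direction of the step unchanged and only shortens it.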

default_high_throughput_config

Settings for the high-throughput predictions.

| Parameter | Default value | Alternative(s) | Explanation |
| --- | --- | --- | --- |
| `model_save_dir` | `'agat_model'` | `str`, a directory name | A directory from which to load the trained model. |
| `opt_config` | `default_ase_calculator_config` | `dict` | Settings for the `ase.optimize.BFGS` structural optimizer. |
| `calculation_index` | `0` | `str` | A label for the calculation outputs. |
| `fix_all_surface_atom` | `False` | `True` | Whether to fix all surface atoms. |
| `remove_bottom_atoms` | `False` | `True` | Whether to remove the bottom atoms. |
| `save_trajectory` | `False` | `True` | Keep the optimization trajectory. |
| `partial_fix_adsorbate` | `False` | `True` | Partially fix the adsorbate degrees of freedom. |
| `adsorbates` | `['H']` | Keys of agat/lib/adsorbate_poscar.py | |
| `sites` | `['ontop']` | `list` | A list of adsorption sites. See agat/app/cata/generate_adsorption_sites.py. |
| `dist_from_surf` | `1.7` | `float` | Distance between the adsorbate and the surface. Unit: Å. |
| `using_template_bulk_structure` | `False` | `True` | Use a template to build the surface model. If this is `True`, you need to prepare a POSCAR_temp file in the working directory. |
| `graph_build_scheme_dir` | `'dataset'` | `str`, a directory name | A directory storing the graph_build_scheme.json file. This file is generated when building the database and is normally saved in `default_data_config['dataset_path']`. |
| `device` | `'cuda'` | `str` | The device used for model prediction (forward pass). |
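
Because `opt_config` is itself a dictionary, changing it through a shallow copy of the high-throughput settings would also change the shared defaults. The sketch below uses `copy.deepcopy` to avoid that. The dictionaries are local, abbreviated copies of the tables in this document so that the snippet runs without AGAT; how the resulting dictionary is passed on to AGAT is not shown here.

```python
import copy

# Abbreviated local copies of the defaults documented above.
default_ase_calculator_config = {
    'fmax': 0.1, 'steps': 200, 'maxstep': 0.05,
    'restart': None, 'restart_steps': 0,
    'perturb_steps': 0, 'perturb_amplitude': 0.05, 'out': None,
}

default_high_throughput_config = {
    'model_save_dir': 'agat_model',
    'opt_config': default_ase_calculator_config,
    'adsorbates': ['H'],
    'sites': ['ontop'],
    'dist_from_surf': 1.7,
    'device': 'cuda',
}

# Deep copy first, so the nested opt_config is duplicated as well.
my_config = copy.deepcopy(default_high_throughput_config)
my_config['opt_config']['fmax'] = 0.05   # tighter force criterion
my_config['device'] = 'cpu'              # run predictions without a GPU

# The shared defaults are unchanged.
print(default_ase_calculator_config['fmax'])  # 0.1
```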

default_hp_dft_config

Default settings for the high-throughput DFT calculations.

To be continued...