MLDP

0. Preparation

Create the necessary directories and the initial configuration.

mkdir ml
cd ml
mkdir nn inputs
mkdir 1st 2nd 3rd 4th # nth iterations
cd nn; mkdir scripts # copy the train, compress, and test job-submission scripts (both GPU and CPU versions) here
cd ../inputs
## build input files for all the endmembers; e.g. for a binary system AB:
mkdir A B AB
## build POSCAR files; pos stores all the POSCARs that you may want to include in the training set
mkdir pos
## create a config file (contents described below)
cd ..
vi config

The config file should contain the paths to ml, nn, and inputs. It may also be useful to specify the POTCAR configuration there.

#some variables
#mldp path
mldp=~/script/mldp
#inputs path
inputs=/u/scratch/j/jd848/ml/inputs
#nn path
nn=/u/scratch/j/jd848/ml/nn

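Run source config before proceeding (bash config would set the variables only in a child shell, so they would not be available to the commands below).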
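If a Python script needs the same paths, the config file can be read directly. The following is a minimal sketch, assuming the file holds only `name=value` lines and `#` comments as in the example above; `read_config` is a hypothetical helper, not part of mldp.

```python
import os

def read_config(text):
    """Parse shell-style `name=value` lines into a dict, skipping comments and blanks."""
    paths = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, value = line.split("=", 1)
        # expand a leading ~ the way the shell would
        paths[name.strip()] = os.path.expanduser(value.strip())
    return paths

example = """\
#mldp path
mldp=~/script/mldp
#inputs path
inputs=/u/scratch/j/jd848/ml/inputs
#nn path
nn=/u/scratch/j/jd848/ml/nn
"""
paths = read_config(example)
print(paths["inputs"])
```

Shell features beyond plain assignments (quoting, variable substitution, `export`) are not handled by this sketch.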

1. Recalculate

1.1: recalculate with VASP

Generate Descriptor using ASAP

asap gen_desc -s 10 --fxyz OUTCAR soap -e -c 6 -n 6 -l 6 -g 0.44
asap gen_desc -s 2 --fxyz npt.dump soap -e -c 6 -n 4 -l 4 -g 0.44

Use PCA analysis and farthest point sampling (FPS) to identify the frames to recalculate

python $mldp/asap/select_frames.py -i ASAP-desc.xyz -n 70 -s 10
python $mldp/asap/select_frames.py -i ASAP-desc.xyz -n 100
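To illustrate what farthest point sampling does, here is a generic NumPy sketch. It is not the code in select_frames.py, whose internals are not shown here; the function name and the toy one-dimensional descriptors are assumptions made for the example.

```python
import numpy as np

def farthest_point_sampling(X, n_select, start=0):
    """Greedily pick rows of X that are as far as possible from those already picked."""
    X = np.asarray(X, dtype=float)
    n_select = min(n_select, len(X))
    selected = [start]
    # distance from every point to its nearest already-selected point
    min_dist = np.linalg.norm(X - X[start], axis=1)
    while len(selected) < n_select:
        nxt = int(np.argmax(min_dist))
        selected.append(nxt)
        min_dist = np.minimum(min_dist, np.linalg.norm(X - X[nxt], axis=1))
    return selected

# toy descriptors: one feature per frame
X = np.array([[0.0], [0.1], [1.0], [0.5], [0.9]])
picked = farthest_point_sampling(X, 3)
print(picked)
```

Starting from frame 0, the sketch picks the most distant frame first and then the frame farthest from both, so near-duplicate frames (such as 0.0 and 0.1 here) are not selected together.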

Run extract_deepmd.py with the -id flag and the index file generated in the last step. Consider building a pre folder first.

python $mldp/extract_deepmd.py -f OUTCAR -id index_file -st # OUTCAR contains temperature info
python $mldp/extract_deepmd.py -f ../npt.dump -fmt dump -id ../test-frame-select-fps-n-100.index -st -t 4000

Prepare a folder named inputs containing INCAR, KPOINTS, POTCAR, and sub_vasp.sh. These files must be tested for convergence, and NBANDS and NELM must be sufficient. Use recal_dpdata.py to recalculate the selected frames
```
python $mldp/recal_dpdata.py -d deepmd/ -if $inputs/mgofe/inputs_5000 -rv no
python $mldp/recal_dpdata.py -d deepmd/ -if $inputs/mgofe/inputs_4000 -rv no
```
Inside the recal folder, run

python $mldp/post_recal.py -ss $inputs/sub_vasp.sh

Still inside the recal folder, check NBANDS and NELM, then merge the outputs

python $mldp/check_nbands_nelm.py -ip all -v
python $mldp/merge_out.py -o OUTCAR -r y

Inside the recal folder, remove the old deepmd folder, then run

python $mldp/extract_deepmd.py -d deepmd -ttr 10000

dp test

dp test -m $nn/m7/v1/pv.pb -d m7v1 -n 400

Analyze the NN predictions against the VASP results

python $mldp/model_dev/analysis.py -tf . -mp m5v1 -rf . -euc 10 -fuc 10 -flc 0.4 -elc 0.02
python $mldp/model_dev/analysis.py -tf . -mp m1v2 -rf . -euc 10 -fuc 10 -flc 0.6
python $mldp/model_dev/analysis.py -tf . -mp m2v1 -rf . -euc 10 -fuc 10 -flc 0.6 -elc 0.8
python $mldp/model_dev/analysis.py -tf . -mp m5v1 -rf . -euc 10 -fuc 10 -flc 0.4  -elc 0.02

Build a new deepmd folder based on the generated idx file (remove or keep the old deepmd folder as needed), then run dp train.

1.2: model deviation

extract frames

python $mldp/extract_deepmd.py -f ./dump.0 -fmt dump -ttr 1000000 -t 3000 -st 4500

dp test with different models

dp test -m $nn/m3/m3v1/m3v1.comp.pb -d m3v1 -n 2000
dp test -m $nn/m1/m2/pvh4.comp.pb -d m3v2

analyze model deviation

python $mldp/model_dev/analysis.py -tf . -mp m2-pvh4 -euc 10 
python $mldp/model_dev/analysis.py -rf . -tf . -mp m3v1 m3v2 m3v3 m3v4

The upper and lower limits of the force and energy RMSEs should be benchmarked against VASP runs at least once.

The index file idx_model_deviaton is then generated with these limits applied

python $mldp/model_dev/analysis.py -rf . -tf . -mp m3v1 m3v2 m3v3 m3v4 -elc 0.01 -euc 2 -flc 0.4 -fuc 1
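The role of the lower and upper limits can be sketched as follows. This is an illustration under stated assumptions, not the logic of analysis.py: it assumes that -elc/-euc and -flc/-fuc are lower/upper cutoffs on per-frame energy and force errors, that frames above either upper cutoff are discarded, and that a frame is kept when at least one error reaches its lower cutoff. The function name and the numbers are hypothetical.

```python
import numpy as np

def select_candidates(e_err, f_err, elc, euc, flc, fuc):
    """Return indices of frames worth recalculating.

    Assumed rule: drop frames above either upper cutoff (likely unphysical),
    keep frames where at least one error reaches its lower cutoff.
    """
    e_err = np.asarray(e_err, dtype=float)
    f_err = np.asarray(f_err, dtype=float)
    below_upper = (e_err <= euc) & (f_err <= fuc)
    above_lower = (e_err >= elc) | (f_err >= flc)
    return np.flatnonzero(below_upper & above_lower)

# hypothetical per-frame errors for four frames
e_err = [0.005, 0.05, 0.5, 3.0]
f_err = [0.1, 0.2, 0.6, 0.5]
idx = select_candidates(e_err, f_err, elc=0.01, euc=2, flc=0.4, fuc=1)
print(idx)
```

In this toy case the first frame is already well described by the model and the last one exceeds the energy upper cutoff, so only the two middle frames are selected.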

A new dump file that contains a subset of the frames of the original dump file is built from idx_model_deviaton with lmp/subset_dump.py
```
python $mldp/lmp/subset_dump.py -h
```
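For reference, the idea behind subsetting a dump file can be sketched in a few lines of Python. This is not the code of subset_dump.py; it assumes a text-format LAMMPS dump in which every frame starts with an `ITEM: TIMESTEP` line, and it takes the frame indices as a plain list of 0-based integers, since the format of the index file is not described here. The helper names are hypothetical.

```python
def split_dump_frames(text):
    """Split a text LAMMPS dump into frames, each starting at 'ITEM: TIMESTEP'."""
    frames, current = [], []
    for line in text.splitlines(keepends=True):
        if line.startswith("ITEM: TIMESTEP") and current:
            frames.append("".join(current))
            current = []
        current.append(line)
    if current:
        frames.append("".join(current))
    return frames

def subset_dump(text, indices):
    """Keep only the frames whose 0-based positions are listed in indices."""
    frames = split_dump_frames(text)
    return "".join(frames[i] for i in indices)

def make_frame(step):
    """Build a tiny one-atom frame for demonstration purposes."""
    return (
        "ITEM: TIMESTEP\n"
        f"{step}\n"
        "ITEM: NUMBER OF ATOMS\n1\n"
        "ITEM: BOX BOUNDS pp pp pp\n0 1\n0 1\n0 1\n"
        "ITEM: ATOMS id type x y z\n1 1 0.5 0.5 0.5\n"
    )

dump = "".join(make_frame(s) for s in (0, 10, 20))
subset = subset_dump(dump, [0, 2])
print(subset)
```

For large trajectories it would be preferable to stream the file frame by frame instead of loading it into memory as done here.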
Generate Descriptor using ASAP on the new dump file
```
asap
```
Inspect the selected frames visually!
Follow section 1.1 for the rest of the steps.

2. Perturbation

Perturb systems and run simulations

dependencies

- ase
- MDAnalysis
- dpdata

2.1. Workflow

1. Analyze the RDF with MDAnalysis to find which two pairs to swap so that the short interatomic distance of the corresponding pair can be reached. `velocity.py` calculates the velocity of the atoms given the timestep and temperature, and determines the minimum interatomic distance that should be reached given the target pressure, temperature, and timestep.
2. Run a simulation using a good POSCAR without perturbation. Set the temperature and timestep based on 1) to make the system collapse as quickly as possible.
3. Run a simulation using a perturbed POSCAR. `pert.py` perturbs the POSCAR; the input must be in vasp/poscar format. For now this script is designed for MgSiO3 only and automatically swaps Si-O, Mg-O, and Mg-Si. `post_pert.py` inspects the interatomic distances for pairs in the Mg-Si-O system; the current cutoffs are designed for MgSiO3 up to around 1400 GPa. It supports the dump, vasp/poscar, and lmp formats.
4. Run the simulation with LAMMPS: `lmp -in in.lammps`. A login node can usually handle this. Check the interatomic distances frequently so that you do not waste time on unnecessary runs.
5. Check the interatomic distances: `python ~/script/mldp/pert/post_pert.py -f mgsio3.dump -ft dump` ==WARNING: Dump file may have lost atoms. If so, corresponding frames should be deleted==
6. Recalculate with recal_lmp.py: `python ~/script/mldp/recal_lmp.py -if /u/home/j/jd848/project-lstixrud/pv+hf/dp-train/lmp_run/6k/rp5/160-cpu/pert/4k_mgo_swap_p2/inputs -r 0-7`. Here step 5) reported that frames 0-6 give interatomic distances within the prescribed range.
7. Check whether all VASP runs are done: `python ~/script/mldp/post_recal_lmp.py`. If not, bash out.
8. Check that all runs have sufficient NBANDS and that NELM is adequate: `python ~/script/mldp/check_nbands_nelm.py -ip all`. If not, increase NBANDS and NELM in INCAR.
9. Merge all VASP runs into one single OUTCAR: `python ~/script/mldp/merge_out.py -o OUTCAR -r y` ==Be cautious about the -r (remove everything) flag==
10. Build the deepmd input from the OUTCAR: `python ~/script/mldp/extract_deepmd.py -t -bs 1000` ==1000 is an arbitrary large number so that only one set is generated; -t means no test set==
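The interatomic-distance check in steps 4) and 5) can be illustrated with a short NumPy sketch. It is not the code of post_pert.py: it assumes an orthorhombic periodic cell, Cartesian coordinates, and the minimum-image convention, and the function name, box size, and positions are hypothetical.

```python
import numpy as np

def min_pair_distance(pos_a, pos_b, box, same_species=False):
    """Shortest distance between atoms of two groups in an orthorhombic periodic box."""
    pos_a = np.asarray(pos_a, dtype=float)
    pos_b = np.asarray(pos_b, dtype=float)
    box = np.asarray(box, dtype=float)
    # all pair separation vectors, shape (n_a, n_b, 3)
    delta = pos_a[:, None, :] - pos_b[None, :, :]
    # minimum-image convention: wrap each component into [-box/2, box/2]
    delta -= box * np.round(delta / box)
    dist = np.linalg.norm(delta, axis=-1)
    if same_species:
        # ignore the zero distance of each atom to itself
        np.fill_diagonal(dist, np.inf)
    return float(dist.min())

box = [10.0, 10.0, 10.0]
mg = [[0.5, 0.0, 0.0]]
o = [[9.5, 0.0, 0.0], [5.0, 5.0, 5.0]]
d_mg_o = min_pair_distance(mg, o, box)
print(d_mg_o)
```

The Mg atom and the first O atom sit on opposite sides of the cell boundary, so the shortest Mg-O distance is found through the periodic image. Comparing such a value against a per-pair cutoff is the kind of test described above; the cutoffs themselves are system specific.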

3. Lammps

scripts used for analyzing lammps output

dependencies

- lammps_logfile

3.1. Workflow for thermal conductivity calculation

1. Run the LAMMPS calculation.
2. Use log_lmp.py to extract the heat current components v_Jx, v_Jy, v_Jz: `python ~/script/mldp/lmp/log_lmp.py -i log.lammps -y v_Jx v_Jy v_Jz -s -p`

For a multicomponent liquid system, one should also subtract the partial enthalpy term $h_a$ (Eq. 4 in Deng and Stixrude, 2021), so the correct command is

python ~/script/mldp/lmp/log_lmp.py -i log.lammps -y v_jhx v_jhy v_jhz -s -p

If the ave/correlate output file is stored (NOT recommended for liquids, since many autocorrelations need to be computed), post_corr.py can be used to analyze the results.

If there is no ave/correlate output file, use kappa.py to analyze the output of step 2): `python ~/script/mldp/lmp/kappa.py -s -a 500 1500`. The -a flag specifies that the average is taken between steps 500 and 1500.
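The Green-Kubo analysis behind this workflow can be sketched with NumPy. This is a generic illustration, not the code of kappa.py. It assumes that J is the volume-integrated heat current (energy times velocity) in SI units, so that the conductivity is the time integral of the autocorrelation <J(0)·J(t)> divided by 3 V k_B T^2; if the logged quantities are already divided by the volume, the prefactor changes accordingly. All function names are hypothetical.

```python
import numpy as np

KB = 1.380649e-23  # Boltzmann constant in J/K

def heat_current_acf(J, max_lag):
    """<J(0).J(t)> averaged over time origins; J has shape (nsteps, 3)."""
    J = np.asarray(J, dtype=float)
    n = len(J)
    acf = np.empty(max_lag)
    for lag in range(max_lag):
        acf[lag] = np.mean(np.sum(J[: n - lag] * J[lag:], axis=1))
    return acf

def running_kappa(J, dt, volume, temperature, max_lag):
    """Running Green-Kubo integral as a function of correlation time."""
    acf = heat_current_acf(J, max_lag)
    # cumulative trapezoidal integral of the autocorrelation function
    steps = 0.5 * (acf[1:] + acf[:-1]) * dt
    integral = np.concatenate(([0.0], np.cumsum(steps)))
    return integral / (3.0 * volume * KB * temperature**2)

def window_average(kappa, first, last):
    """Mean of the running integral between two lag indices (inclusive)."""
    return float(np.mean(kappa[first : last + 1]))
```

Averaging the running integral over a window of correlation steps, as `window_average` does, is one reading of what the -a 500 1500 option above selects; the window should lie where the integral has reached a plateau.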

4. ASAP

Fingerprint analysis using ASAP and DScribe

The dpkit environment works for this analysis.

dependencies

- ASAP
- DScribe

5. Model deviation

extract_deepmd.py: extract the configurations.

Run dp test with all models; model_dev/model_dev.py gives an example. Customize the code to your needs.

analysis.py: analyze the test results.

post_model_dev.py: postprocess.
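A common way to quantify the disagreement between several models is the maximum, over atoms, of the standard deviation of the predicted force vector across the models. The sketch below implements that definition with NumPy. Whether analysis.py uses exactly this measure is an assumption, and the function name and array layout are hypothetical.

```python
import numpy as np

def max_force_deviation(forces):
    """Per-frame model deviation of the forces.

    forces has shape (n_models, n_frames, n_atoms, 3).
    Returns an array of shape (n_frames,).
    """
    forces = np.asarray(forces, dtype=float)
    mean = forces.mean(axis=0)
    # per-atom deviation: sqrt of the model-averaged squared distance to the mean force
    dev = np.sqrt(((forces - mean) ** 2).sum(axis=-1).mean(axis=0))
    # the worst atom in each frame defines the frame's deviation
    return dev.max(axis=-1)

# two models, one frame, two atoms: the models disagree on atom 0 only
forces = np.zeros((2, 1, 2, 3))
forces[1, 0, 0] = [2.0, 0.0, 0.0]
dev = max_force_deviation(forces)
print(dev)
```

Frames with a large deviation are the ones the models have not yet learned consistently, which makes them natural candidates for recalculation with VASP.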

6. Scale

scale

7. Util

some useful routines
