Create the necessary directories and the initial configuration
mkdir ml
cd ml
mkdir nn inputs
mkdir 1st 2nd 3rd 4th # nth iterations
cd nn; mkdir scripts # copy the train, compress, and test job submission scripts (both GPU and CPU versions) here
cd ..
cd inputs
## build input files for all the endmembers; e.g., for an AB system:
mkdir A B AB
## build the POSCAR files: pos stores all the POSCARs that you want to include in the training set
mkdir pos
## create a config file
cd ..
vi config
## config should contain the paths to ml, nn, and inputs; it may also specify the POTCAR configurations
#some variables
#mldp path
mldp=~/script/mldp
#inputs path
inputs=/u/scratch/j/jd848/ml/inputs
#nn path
nn=/u/scratch/j/jd848/ml/nn
Run `source config` before proceeding (`bash config` would set the variables only in a child shell, so `$mldp`, `$inputs`, and `$nn` would be empty afterwards).
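Before running the later steps, it can help to confirm that the paths in `config` actually exist. The following is a minimal sketch, not part of mldp: `parse_config` and `missing_paths` are hypothetical helper names, and the code assumes `config` contains only plain `name=value` lines and `#` comments, as in the listing above.

```python
import os

def parse_config(text):
    """Parse `name=value` lines, skipping blank lines and `#` comments."""
    variables = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, value = line.split("=", 1)
        # Expand a leading ~ the way the shell would.
        variables[name.strip()] = os.path.expanduser(value.strip())
    return variables

def missing_paths(variables):
    """Return the variable names whose paths are not existing directories."""
    return sorted(name for name, path in variables.items()
                  if not os.path.isdir(path))

example = """
#mldp path
mldp=~/script/mldp
#inputs path
inputs=/u/scratch/j/jd848/ml/inputs
"""
cfg = parse_config(example)
print(cfg)
print("missing:", missing_paths(cfg))
```

Any name reported as missing points to a typo in `config` or a directory that has not been created yet.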
- Generate descriptors using `ASAP`
asap gen_desc -s 10 --fxyz OUTCAR soap -e -c 6 -n 6 -l 6 -g 0.44
asap gen_desc -s 2 --fxyz npt.dump soap -e -c 6 -n 4 -l 4 -g 0.44
- PCA analysis; `fps` to identify the frames to recalculate
python $mldp/asap/select_frames.py -i ASAP-desc.xyz -n 70 -s 10
python $mldp/asap/select_frames.py -i ASAP-desc.xyz -n 100
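For readers unfamiliar with `fps`, the abbreviation is read here as farthest point sampling: starting from one frame, the frame farthest (in descriptor space) from everything already chosen is added repeatedly. The pure-Python sketch below illustrates the idea only; it is not the code in `select_frames.py`, and the function name, the Euclidean metric, and the toy descriptors are assumptions.

```python
import math

def farthest_point_sampling(points, n_select, start=0):
    """Greedy FPS: repeatedly pick the point farthest from the selected set."""
    if not 0 < n_select <= len(points):
        raise ValueError("n_select must be between 1 and len(points)")
    selected = [start]
    # Distance from every point to its nearest selected point so far.
    min_dist = [math.dist(p, points[start]) for p in points]
    while len(selected) < n_select:
        nxt = max(range(len(points)), key=min_dist.__getitem__)
        selected.append(nxt)
        for i, p in enumerate(points):
            min_dist[i] = min(min_dist[i], math.dist(p, points[nxt]))
    return selected

# Toy 2D "descriptors"; real SOAP descriptors have many more dimensions.
descriptors = [(0.0, 0.0), (0.1, 0.0), (1.0, 0.0), (0.0, 2.0), (0.9, 0.1)]
print(farthest_point_sampling(descriptors, 3))
```

Because near-duplicate frames sit close together in descriptor space, this kind of selection tends to skip them and spread the recalculation budget over distinct configurations.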
- Run `extract_deepmd.py` with the `-id` flag and the index file generated in the last step; consider building a `pre` folder first
python $mldp/extract_deepmd.py -f OUTCAR -id index_file -st # OUTCAR contains temperature info
python $mldp/extract_deepmd.py -f ../npt.dump -fmt dump -id ../test-frame-select-fps-n-100.index -st -t 4000
- Prepare a folder named `inputs` with `INCAR`, `KPOINTS`, `POTCAR`, and `sub_vasp.sh`. The files must be tested for convergence, and NBANDS and NELM should be sufficient. Use `recal_dpdata.py` to recalculate the selected frames
python $mldp/recal_dpdata.py -d deepmd/ -if $inputs/mgofe/inputs_5000 -rv no
python $mldp/recal_dpdata.py -d deepmd/ -if $inputs/mgofe/inputs_4000 -rv no
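A quick check that the `inputs` folder is complete can save a failed batch of submissions. The sketch below is illustrative and not part of mldp; `missing_inputs` is a hypothetical helper, and the list of required files is taken from the step above.

```python
import tempfile
from pathlib import Path

# File names required by the step above.
REQUIRED = ("INCAR", "KPOINTS", "POTCAR", "sub_vasp.sh")

def missing_inputs(folder, required=REQUIRED):
    """Return the required file names that are absent from `folder`."""
    folder = Path(folder)
    return [name for name in required if not (folder / name).is_file()]

# Demonstration on a temporary folder that contains only an INCAR.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "INCAR").write_text("NELM = 100\n")
    print(missing_inputs(tmp))
```

This only checks that the files are present; it says nothing about whether the settings inside them are converged.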
- Inside the `recal` folder
python $mldp/post_recal.py -ss $inputs/sub_vasp.sh
- Inside the `recal` folder, do
python $mldp/check_nbands_nelm.py -ip all -v
- Inside the `recal` folder, do
python $mldp/merge_out.py -o OUTCAR -r y
- Inside the `recal` folder, remove the old deepmd folder, then do
python $mldp/extract_deepmd.py -d deepmd -ttr 10000
- `dp test`
dp test -m $nn/m7/v1/pv.pb -d m7v1 -n 400
- Analyze nn and vasp
python $mldp/model_dev/analysis.py -tf . -mp m5v1 -rf . -euc 10 -fuc 10 -flc 0.4 -elc 0.02
python $mldp/model_dev/analysis.py -tf . -mp m1v2 -rf . -euc 10 -fuc 10 -flc 0.6
python $mldp/model_dev/analysis.py -tf . -mp m2v1 -rf . -euc 10 -fuc 10 -flc 0.6 -elc 0.8
python $mldp/model_dev/analysis.py -tf . -mp m5v1 -rf . -euc 10 -fuc 10 -flc 0.4 -elc 0.02
- Build `deepmd` based on the idx file generated, and remove/keep the old deepmd
- `dp train`
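The `-elc`/`-euc`/`-flc`/`-fuc` flags used in the analysis commands above are read here as lower and upper cutoffs on the energy and force errors. Under that assumption, the selection can be sketched as keeping frames whose error falls inside a window. This is an illustration, not the code in `analysis.py`: the function names are hypothetical, and combining the energy and force windows as a union is a guess.

```python
def select_by_window(errors, lower, upper):
    """Indices of frames whose error lies in the closed window [lower, upper]."""
    return [i for i, err in enumerate(errors) if lower <= err <= upper]

def select_frames(energy_err, force_err, elc, euc, flc, fuc):
    """Union of the frames picked by the energy window and the force window."""
    picked = set(select_by_window(energy_err, elc, euc))
    picked |= set(select_by_window(force_err, flc, fuc))
    return sorted(picked)

# Made-up per-frame errors for four frames.
energy_err = [0.005, 0.05, 0.5, 20.0]
force_err = [0.1, 0.5, 0.2, 30.0]
print(select_frames(energy_err, force_err, elc=0.02, euc=10, flc=0.4, fuc=10))
```

The usual motivation for a window rather than a single threshold is that frames below the lower cutoff are already described well, while frames above the upper cutoff are often unphysical and not worth adding to the training set.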
#### 1.2: model deviation
- Extract frames
python $mldp/extract_deepmd.py -f ./dump.0 -fmt dump -ttr 1000000 -t 3000 -st 4500
- `dp test` with different models
dp test -m $nn/m3/m3v1/m3v1.comp.pb -d m3v1 -n 2000
dp test -m $nn/m1/m2/pvh4.comp.pb -d m3v2
- Analyze model deviation
python $mldp/model_dev/analysis.py -tf . -mp m2-pvh4 -euc 10
python $mldp/model_dev/analysis.py -rf . -tf . -mp m3v1 m3v2 m3v3 m3v4
The upper/lower limits of the force and energy RMSEs should be benchmarked against VASP runs at least once.
The idx file `idx_model_deviaton` is derived
python $mldp/model_dev/analysis.py -rf . -tf . -mp m3v1 m3v2 m3v3 m3v4 -elc 0.01 -euc 2 -flc 0.4 -fuc 1
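To make "model deviation" concrete: one common definition is the largest per-atom standard deviation of the force predicted by an ensemble of models for the same frame. The sketch below implements that definition in pure Python. It is an assumption that `analysis.py` uses a comparable metric; the function name and the data layout are hypothetical.

```python
import math

def max_force_deviation(forces):
    """forces[m][i] is the (fx, fy, fz) force on atom i predicted by model m.

    Returns the maximum over atoms of sqrt(<|f_i - <f_i>|^2>), where the
    averages run over the models.
    """
    n_models = len(forces)
    n_atoms = len(forces[0])
    worst = 0.0
    for i in range(n_atoms):
        mean = [sum(forces[m][i][k] for m in range(n_models)) / n_models
                for k in range(3)]
        var = sum(
            sum((forces[m][i][k] - mean[k]) ** 2 for k in range(3))
            for m in range(n_models)
        ) / n_models
        worst = max(worst, math.sqrt(var))
    return worst

# Two models, two atoms: the models disagree only on atom 0.
model_a = [(1.0, 0.0, 0.0), (0.0, 0.0, 0.0)]
model_b = [(3.0, 0.0, 0.0), (0.0, 0.0, 0.0)]
print(max_force_deviation([model_a, model_b]))
```

Frames where the models agree give a value near zero; large values flag configurations the ensemble has not learned consistently.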
- A new dump file that contains a subset of the frames of the original dump file is built from `idx_model_deviaton` with `dump.py`
python $mldp/lmp/subset_dump.py -h
- Generate descriptors using `ASAP` on the new dump file
asap
Visual inspection!
- Follow section 1.2 for the rest of the steps.
- ase
- MDAnalysis
- dpdata
1. Analyze the RDF with `MDAnalysis`; search for which two pairs to swap so that the short interatomic distance of the corresponding pair can be reached. `velocity.py` calculates the velocity of the atoms given the timestep and temperature, and determines the minimum interatomic distance that should be reached given the target pressure, temperature, and timestep
2. Simulation using a good POSCAR without perturbation; set the temperature and timestep based on 1) to make the system collapse as quickly as possible
3. Simulation using a perturbed POSCAR
`pert.py` perturbs the POSCAR. The input must be in `vasp/poscar` format; for now this file is designed for MgSiO3 only and automatically swaps Si-O, Mg-O, and Mg-Si
`post_pert.py` inspects the interatomic distances for pairs in the Mg-Si-O system; the current cutoffs are designed for MgSiO3 up to around 1400 GPa. Supports the `dump`, `vasp/poscar`, and `lmp` formats
4. Simulation with lammps
lmp -in in.lammps
The login node can usually handle this. Do check the interatomic distance frequently; you do not want to waste time on unnecessary runs
5. Check the interatomic distance
python ~/script/mldp/pert/post_pert.py -f mgsio3.dump -ft dump
==WARNING: Dump file may have lost atoms. If so, corresponding frames should be deleted==
6. Recalculate with `recal_lmp.py`
python ~/script/mldp/recal_lmp.py -if /u/home/j/jd848/project-lstixrud/pv+hf/dp-train/lmp_run/6k/rp5/160-cpu/pert/4k_mgo_swap_p2/inputs -r 0-7
Here, outputs 0-6 of step 5) have interatomic distances within the prescribed range
7. Check whether all VASP runs are done
python ~/script/mldp/post_recal_lmp.py
If not, run `bash out`
8. Check whether all runs have sufficient NBANDS and whether NELM is adequate
python ~/script/mldp/check_nbands_nelm.py -ip all
If not, increase NBANDS and NELM in INCAR
9. Merge all VASP runs into a single `OUTCAR`
python ~/script/mldp/merge_out.py -o OUTCAR -r y
==Be cautious about the -r (remove everything) flag==
10. Build the `deepmd` input file from `OUTCAR`
python ~/script/mldp/extract_deepmd.py -t -bs 1000
==1000 is an arbitrary large number so that only one set is generated; -t means no test set==
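The interatomic-distance check in the steps above comes down to finding the shortest pair distance under periodic boundary conditions. The sketch below shows the minimum-image calculation for an orthorhombic box. It is not the code in `post_pert.py`: the function name is hypothetical, it ignores atom types (so it applies no per-pair cutoffs such as Mg-O or Si-O), and it does not handle triclinic cells.

```python
import itertools
import math

def min_pair_distance(positions, box):
    """Smallest distance between any two atoms in an orthorhombic periodic box.

    positions: list of (x, y, z) tuples; box: (lx, ly, lz) edge lengths.
    """
    best = math.inf
    for a, b in itertools.combinations(positions, 2):
        d2 = 0.0
        for k in range(3):
            delta = a[k] - b[k]
            # Minimum-image convention: wrap the separation into [-L/2, L/2].
            delta -= box[k] * round(delta / box[k])
            d2 += delta * delta
        best = min(best, math.sqrt(d2))
    return best

# The first two atoms are 1.0 apart across the periodic boundary in x.
atoms = [(0.5, 0.0, 0.0), (9.5, 0.0, 0.0), (5.0, 5.0, 5.0)]
print(min_pair_distance(atoms, (10.0, 10.0, 10.0)))
```

The double loop scales quadratically with the number of atoms, which is fine for the cell sizes typical of such runs but would need a neighbor list for very large systems.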
scripts used for analyzing lammps output
- lammps_logfile
1. Run the lammps calculation
2. `log_lmp.py` extracts the v_Jx, v_Jy, v_Jz heat current
python ~/script/mldp/lmp/log_lmp.py -i log.lammps -y v_Jx v_Jy v_Jz -s -p
For a multicomponent liquid system, one should also subtract the partial enthalpy term
python ~/script/mldp/lmp/log_lmp.py -i log.lammps -y v_jhx v_jhy v_jhz -s -p
If the ave/correlate output file is stored, which is NOT recommended for liquids since many autocorrelations need to be computed, `post_corr.py` can be used to analyze the results
3. If there is no ave/correlate output file, use `kappa.py` to analyze the output of step 2)
python ~/script/mldp/lmp/kappa.py -s -a 500 1500
`-a` specifies the averaging window, here between steps 500 and 1500
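The heat-current analysis in the steps above follows the Green-Kubo approach, in which the thermal conductivity is proportional to the time integral of the heat-current autocorrelation function. The sketch below shows the three numerical pieces: the autocorrelation, its running integral, and an average over a plateau window like the one passed with `-a`. It is not the code in `kappa.py`; the function names are hypothetical, and the physical prefactor (volume, temperature, Boltzmann constant, unit conversions) is left out on purpose.

```python
def autocorrelation(series, max_lag):
    """Time-averaged <J(0) J(t)> for lags 0 .. max_lag - 1."""
    n = len(series)
    return [
        sum(series[i] * series[i + lag] for i in range(n - lag)) / (n - lag)
        for lag in range(max_lag)
    ]

def running_integral(acf, dt):
    """Cumulative trapezoidal integral of the autocorrelation function."""
    total, out = 0.0, [0.0]
    for prev, cur in zip(acf, acf[1:]):
        total += 0.5 * (prev + cur) * dt
        out.append(total)
    return out

def plateau_average(values, start, stop):
    """Mean of values[start:stop], i.e. the average over the chosen window."""
    window = values[start:stop]
    return sum(window) / len(window)

# A constant signal keeps the arithmetic easy to follow.
heat_current = [2.0] * 10
acf = autocorrelation(heat_current, 4)
integral = running_integral(acf, dt=0.5)
print(acf, integral, plateau_average(integral, 1, 3))
```

For a real liquid the autocorrelation decays and the running integral levels off; the averaging window should be chosen inside that plateau, and each Cartesian component is normally treated this way and then averaged.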
Fingerprint analysis using ASAP and Dscribe
The dpkit environment works
- ASAP
- Dscribe
`extract_deepmd.py`: extract configurations
`dp test` with all models; `model_dev/model_dev.py` gives an example. Do customize the code to your needs
`analysis.py`: analyze the test results
`post_model_dev.py`: postprocess
scale
some useful routines