benchmarking

This repository benchmarks the force fields generated by my valence-fitting repo using Matt’s ibstore package.

Usage

Environment

Initialize the conda environment with

mamba env create -f env.yaml

Then cd to wherever you cloned ibstore and run

pip install -e .

to add the ibstore package to your environment.
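
Putting these steps together, a full setup might look like the following. The environment name benchmarking and the ibstore path are assumptions; check the name field in env.yaml and use whatever directory you cloned ibstore into.

mamba env create -f env.yaml
mamba activate benchmarking   # assumed environment name; see the name field in env.yaml
cd path/to/ibstore            # wherever you cloned ibstore
pip install -e .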

main.py

The central functionality can be accessed by running main.py directly:

python main.py

This is the same as passing the following values for each of the flags:

python main.py \
    --forcefield force-field.offxml \
    --dataset datasets/industry.json \
    --sqlite-file tmp.sqlite \
    --out-dir . \
    --procs 16

In both cases, the forcefield to benchmark is taken from force-field.offxml in the current directory; the dataset is the charge-filtered version of Lily’s copy of the OpenFF Industry Benchmark Season 1 v1.0 in the datasets directory; the molecule database is stored in a file named tmp.sqlite; and the output CSV and PNG files are written to the current directory.
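
For example, to benchmark one of the stock force fields from the forcefields directory with its own database and output directory, you might run something like this (the paths here are illustrative, not prescribed by main.py):

mkdir -p output/industry/sage-2.1.0
python main.py \
    --forcefield forcefields/sage-2.1.0.offxml \
    --dataset datasets/industry.json \
    --sqlite-file output/industry/sage-2.1.0/tmp.sqlite \
    --out-dir output/industry/sage-2.1.0 \
    --procs 16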

Makefile

The Makefile can automate this process and stitch the resulting images together with ImageMagick, using something like:

make output/industry/out.png

More creatively, you can run the industry benchmarks on a custom forcefield with something like:

make output/industry/sage/out.png TARGET=sage

This looks for a forcefield named sage.offxml in the root directory, then runs main.py and the ImageMagick commands to generate the final output. It looks a bit repetitive, but, for now at least, the subdirectory in output/industry/* and the TARGET variable must match. This also works for any other dataset in the datasets directory, for example:

make output/full-opt/sage/out.png TARGET=sage
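
For reference, a target like output/industry/sage/out.png corresponds roughly to the following commands; the intermediate file names and the exact ImageMagick invocation are assumptions, not taken verbatim from the Makefile:

mkdir -p output/industry/sage
python main.py \
    --forcefield sage.offxml \
    --dataset datasets/industry.json \
    --sqlite-file output/industry/sage/tmp.sqlite \
    --out-dir output/industry/sage \
    --procs 16
# combine the individual plots into one summary image (assumed montage usage)
montage output/industry/sage/*.png -geometry +2+2 output/industry/sage/out.png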

Slurm

Similarly, scripts/industry.sh simply calls the make command above, after activating the conda environment from env.yaml. So if everything is set up, you should be able to run

sbatch scripts/industry.sh

and come back around 24 hours later to a summary image like the one shown in the Results section below.
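
As a rough sketch, assuming typical Slurm directives and the environment name from env.yaml (both assumptions), scripts/industry.sh amounts to something like:

#!/bin/bash
#SBATCH --job-name=industry-bench   # assumed directives; adjust for your cluster
#SBATCH --ntasks=16
#SBATCH --time=24:00:00

source ~/.bashrc                    # or however mamba/conda is initialized on the node
mamba activate benchmarking         # assumed environment name from env.yaml

make output/industry/out.png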

Results

OpenFF Full Optimization Benchmark 1

output/full-opt/out.png

OpenFF Industry Benchmark Season 1 v1.0

output/industry/out.png

Files

| Dir             | File               | Purpose                                                    |
|-----------------|--------------------|------------------------------------------------------------|
| .               | main.py            | Benchmarking script using ibstore                          |
| .               | refilter.py        | Script to refilter the industry dataset for charge issues  |
| .               | env.yaml           | Conda environment to run the script                        |
| forcefields     | tm.offxml          | FB-optimized, really-filtered torsion-multiplicity FF      |
| forcefields     | sage-tm.offxml     | FB-optimized Sage 2.1.0 with torsion-multiplicity data     |
| forcefields     | sage-2.1.0.offxml  | Sage 2.1.0 dumped from the toolkit                         |
| forcefields     | eps-tors-10.offxml | Espaloma torsion values with Δ > 10.0 kcal/mol             |
| forcefields     | sage-sage.offxml   | FB-optimized Sage 2.1.0 with “original” Sage data          |
| sage            | env.yaml           | Conda environment from Sage 2.1.0                          |
| sage            | 01-setup.py        | Setup script from openff-sage                              |
| sage            | 02-b-minimize.py   | Minimize all the structures, also from openff-sage         |
| scripts         | fetch_industry.sh  | Try to download the industry dataset (not working)         |
| scripts         | industry.sh        | Run the benchmarks on the industry dataset                 |
| scripts         | refilter.sh        | Refilter the industry dataset                              |
| scripts         | submit.sh          | Run the benchmarks on the full-opt dataset                 |
| full-opt-output | *                  | Benchmark output on the full-opt dataset                   |

Changelog

  • 2024-05-01 cp /pub/amcisaac/sage-2.2.0/05_benchmark_forcefield/datasets/OpenFF-Industry-Benchmark-Season-1-v1.1-filtered-charge-coverage-cache.json datasets/cache/industry.json
    • copied Lexie’s re-filtered, cached dataset over my previous cache