The ConformerEnsemble class can also read energies from the comment line of each structure in an .xyz file.

In [None]:
from prism_pruner.conformer_ensemble import ConformerEnsemble
from prism_pruner.pruner import prune
from prism_pruner.utils import EH_TO_KCAL  # 627.5096080305927

ensemble = ConformerEnsemble.from_xyz("../tests/crest_conformers.xyz", read_energies=True)
ensemble.coords.shape

(675, 220, 3)
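
To illustrate what reading energies from the comment line involves, here is a minimal pure-Python sketch. It is not the implementation behind ConformerEnsemble.from_xyz: the helper name read_xyz_energies is hypothetical, and it assumes that the first token of the comment line that parses as a float is the energy (CREST-style files typically put the energy in Eh there).

```python
import io


def read_xyz_energies(handle):
    """Parse a multi-frame .xyz stream into (coords, atoms, energies).

    Hypothetical helper for illustration only. Assumption: the energy is the
    first token on the comment line that can be converted to a float.
    """
    coords, energies, atoms = [], [], None
    while True:
        header = handle.readline()
        if not header.strip():
            break  # end of file (or a blank line): stop reading frames
        n_atoms = int(header)
        comment = handle.readline()

        # Look for the first float-like token on the comment line.
        energy = float("nan")
        for token in comment.split():
            try:
                energy = float(token)
                break
            except ValueError:
                continue

        symbols, frame = [], []
        for _ in range(n_atoms):
            symbol, x, y, z = handle.readline().split()[:4]
            symbols.append(symbol)
            frame.append((float(x), float(y), float(z)))

        atoms = symbols  # assumes every frame lists the same atoms
        coords.append(frame)
        energies.append(energy)
    return coords, atoms, energies


# Two made-up H2 frames with the energy (in Eh) on the comment line.
demo = io.StringIO(
    "2\n-1.10000\nH 0.0 0.0 0.0\nH 0.0 0.0 0.74\n"
    "2\n-1.09900\nH 0.0 0.0 0.0\nH 0.0 0.0 0.80\n"
)
coords, atoms, energies = read_xyz_energies(demo)
shape = (len(coords), len(coords[0]), len(coords[0][0]))
print(shape, atoms, energies)
```

The resulting energies are in whatever unit the file uses, so any energy threshold passed to a pruning routine has to be expressed in the same unit.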


Energy-aware pruning can be useful for larger arrays: with a max_dE of 0.5 kcal/mol, the run takes about 28% less time than the normal mode, because fewer calls are made to the similarity evaluation functions. The trade-off is a larger, "less pruned" ensemble (132 vs. 64 structures).
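
The trade-off described above can be reproduced with a toy model. The sketch below is not prism_pruner's algorithm (the debug output shows that the library works in k-partitioned passes with caching). It is a simplified greedy pruner, under the assumption that an energy gate skips the similarity check for any pair whose energies differ by more than max_dE. The function name prune_energy_gated and the 1D descriptors are hypothetical.

```python
def prune_energy_gated(descriptors, energies, is_similar, max_dE=None):
    """Greedy duplicate removal, lowest energy first.

    Hypothetical sketch: pairs further apart than max_dE in energy are
    assumed to be distinct and are never passed to is_similar.
    Returns (sorted kept indices, number of similarity calls).
    """
    order = sorted(range(len(energies)), key=lambda i: energies[i])
    kept, calls = [], 0
    for i in order:
        duplicate = False
        for j in kept:
            if max_dE is not None and abs(energies[i] - energies[j]) > max_dE:
                continue  # energy gate: skip the expensive comparison
            calls += 1
            if is_similar(descriptors[i], descriptors[j]):
                duplicate = True
                break
        if not duplicate:
            kept.append(i)
    return sorted(kept), calls


# Toy data: three near-identical structures, one of which sits far away in energy.
descriptors = [0.00, 0.01, 0.02, 1.00]
energies = [0.0, 0.1, 2.0, 2.1]  # arbitrary but consistent units


def close(a, b):
    return abs(a - b) < 0.05


plain = prune_energy_gated(descriptors, energies, close)
gated = prune_energy_gated(descriptors, energies, close, max_dE=0.5)
print(plain)  # ([0, 3], 3)
print(gated)  # ([0, 2, 3], 2)
```

With the gate, structure 2 is never compared against structure 0, so it survives even though the two are geometrically similar: fewer similarity calls, but a larger final ensemble.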

In [None]:
%%time
pruned, mask = prune(
    ensemble.coords,
    ensemble.atoms,
    energies=ensemble.energies,  # energies were provided in Eh: max_dE should be, too
    max_dE=0.5 / EH_TO_KCAL,  # 0.5 kcal/mol expressed in Eh (~7.968e-4 Eh)
    rot_corr_rmsd_pruning=True,
    debugfunction=print,
)

DEBUG: MOIPrunerConfig - k=20, rejected 455 (keeping 220/675), in 0.1 s
DEBUG: MOIPrunerConfig - k=10, rejected 49 (keeping 171/675), in 0.0 s
DEBUG: MOIPrunerConfig - k=5, rejected 18 (keeping 153/675), in 0.0 s
DEBUG: MOIPrunerConfig - k=2, rejected 13 (keeping 140/675), in 0.0 s
DEBUG: MOIPrunerConfig - k=1, rejected 0 (keeping 140/675), in 0.0 s
DEBUG: MOIPrunerConfig - keeping 140/675 (0.1 s)
DEBUG: MOIPrunerConfig - Used cached data 9902/16071 times, 61.61% of total calls

DEBUG: RMSDPrunerConfig - k=5, rejected 4 (keeping 136/140), in 0.3 s
DEBUG: RMSDPrunerConfig - k=2, rejected 0 (keeping 136/140), in 0.1 s
DEBUG: RMSDPrunerConfig - k=1, rejected 0 (keeping 136/140), in 0.1 s
DEBUG: RMSDPrunerConfig - keeping 136/140 (0.5 s)
DEBUG: RMSDPrunerConfig - Used cached data 6341/7717 times, 82.17% of total calls

DEBUG: prune_by_rmsd_rot_corr - temporarily added edge 92-179 to the graph (will be removed before returning)
DEBUG: RMSDRotCorrPrunerConfig - k=5, rejected 2 (keeping 134/1

In [8]:
%%time
pruned, mask = prune(
    ensemble.coords,
    ensemble.atoms,
    rot_corr_rmsd_pruning=True,
    debugfunction=print,
)

DEBUG: MOIPrunerConfig - k=20, rejected 456 (keeping 219/675), in 0.0 s
DEBUG: MOIPrunerConfig - k=10, rejected 56 (keeping 163/675), in 0.0 s
DEBUG: MOIPrunerConfig - k=5, rejected 38 (keeping 125/675), in 0.0 s
DEBUG: MOIPrunerConfig - k=2, rejected 46 (keeping 79/675), in 0.0 s
DEBUG: MOIPrunerConfig - k=1, rejected 15 (keeping 64/675), in 0.0 s
DEBUG: MOIPrunerConfig - keeping 64/675 (0.1 s)
DEBUG: MOIPrunerConfig - Used cached data 5780/15111 times, 38.25% of total calls

DEBUG: RMSDPrunerConfig - k=2, rejected 0 (keeping 64/64), in 0.2 s
DEBUG: RMSDPrunerConfig - k=1, rejected 0 (keeping 64/64), in 0.2 s
DEBUG: RMSDPrunerConfig - keeping 64/64 (0.4 s)
DEBUG: RMSDPrunerConfig - Used cached data 992/3008 times, 32.98% of total calls

DEBUG: prune_by_rmsd_rot_corr - temporarily added edge 35-182 to the graph (will be removed before returning)
DEBUG: RMSDRotCorrPrunerConfig - k=2, rejected 0 (keeping 64/64), in 23.4 s
DEBUG: RMSDRotCorrPrunerConfig - k=1, rejected 0 (keeping 64/64), 

In [17]:
f"{(1 - (35.8 / 50)) * 100:.1f} % speedup"

'28.4 % speedup'
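
The figure above is the fraction of time saved, which is not the same as a speedup factor. A short sketch of both, assuming that 35.8 and 50 are the timings of the energy-aware and default runs in the same unit (the function names are hypothetical):

```python
def time_saved_percent(t_fast, t_slow):
    """Percentage of time saved by the faster run."""
    return (1 - t_fast / t_slow) * 100


def speedup_factor(t_fast, t_slow):
    """How many times faster the faster run is."""
    return t_slow / t_fast


saved = time_saved_percent(35.8, 50)
factor = speedup_factor(35.8, 50)
print(f"{saved:.1f} % less time, {factor:.2f}x faster")
# 28.4 % less time, 1.40x faster
```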