diff --git a/docs/getting-started/installation.md b/docs/getting-started/installation.md index 15b7ce05..9ae10d8b 100644 --- a/docs/getting-started/installation.md +++ b/docs/getting-started/installation.md @@ -6,13 +6,38 @@ package manager. ## Using conda -The recommended way to install `openff-bespokefit` is via the `conda` package manager: +The recommended way to install `openff-bespokefit` is via the `conda` package manager. +A working installation also requires at least one package from each of the two sections below +("Fragmentation Backends" and "Reference Data Generators") ```shell conda install -c conda-forge openff-bespokefit ``` -### Optional dependencies + + +### Fragmentation Backends + +#### AmberTools Antechamber + +AmberTools is free and open-source, and can generally be used fragment molecules up to 40 heavy atoms in under +10 minutes. + +```shell +conda install -c conda-forge ambertools +```` + +#### OpenEye Toolkits + +If you have access to the OpenEye toolkits (namely `oechem`, `oequacpac` and `oeomega`) we recommend installing +these also as they can speed up certain operations significantly. OpenEye software requires a free-for-academics +license to run. + +```shell +conda install -c openeye openeye-toolkits +``` + +### Reference Data Generators #### Psi4 @@ -23,6 +48,11 @@ recommended to be installed unless you intend to train against data generated us conda install -c conda-forge -c defaults -c psi4 psi4 ``` +:::{warning} +There is an incompatibility between the AmberTools and Psi4 conda packages on Mac, and it is not possible to +create a working conda environment containing both. +::: + #### XTB The xtb package gives access to the XTB semi-empirical models produced by the Grimme group in Bonn which may be used @@ -46,15 +76,6 @@ TorchANI potentials are only suitable for molecules with a net neutral charge an consisting of C, H, N, O, S, F and Cl ::: -#### OpenEye toolkits - -If you have access to the OpenEye toolkits (namely `oechem`, `oequacpac` and `oeomega`) we recommend installing -these also as these can speed up certain operations significantly. - -```shell -conda install -c openeye openeye-toolkits -``` - ## From source To install `openff-bespokefit` from source begin by cloning the repository from diff --git a/docs/getting-started/quick-start.md b/docs/getting-started/quick-start.md index 5ace0315..80f461bf 100644 --- a/docs/getting-started/quick-start.md +++ b/docs/getting-started/quick-start.md @@ -1,6 +1,12 @@ (quick_start_chapter)= # Quick start +:::{warning} +To reduce runtime, this "Quick start" guide uses a fast semiempirical model, "GFN2-xTB", +to generate training data, +rather than the ["default" method](default_qc_method) used to train mainline OpenFF force fields. +::: + BespokeFit aims to provide an automated pipeline that ingests a general molecular force field and a set of molecules of interest, and produce a new bespoke force field that has been augmented with highly specific force field parameters trained to accurately capture the important features and phenomenology of the input set. @@ -9,10 +15,13 @@ Such features may include generating bespoke torsion parameters that have been t to capture as closely as possible the torsion profiles of the rotatable bonds in the target molecule which have a large impact on conformational preferences. -The recommended way to install `openff-bespokefit` is via the `conda` package manager: +The recommended way to install `openff-bespokefit` is via the `conda` package manager. There are several optional +dependencies, and a good starting environment is: ```shell -conda install -c conda-forge openff-bespokefit +conda create -n bespokefit -y -c conda-forge mamba python=3.9 +conda activate bespokefit +mamba install -y -c conda-forge openff-bespokefit xtb-python ambertools ``` although [several other methods are available](installation_chapter). @@ -50,7 +59,8 @@ openff-bespoke executor run --smiles "CC(=O)NC1=CC=C(C=C1)O" \ --output "acetaminophen.json" \ --output-force-field "acetaminophen.offxml" \ --n-qc-compute-workers 2 \ - --qc-compute-n-cores 8 + --qc-compute-n-cores 1 \ + --default-qc-spec xtb gfn2xtb none ``` or the path to an SDF (or similar) file @@ -61,13 +71,20 @@ openff-bespoke executor run --file "acetaminophen.sdf" \ --output "acetaminophen.json" \ --output-force-field "acetaminophen.offxml" \ --n-qc-compute-workers 2 \ - --qc-compute-n-cores 8 + --qc-compute-n-cores 1 \ + --default-qc-spec xtb gfn2xtb none ``` in addition to arguments defining how the bespoke fit should be performed and parallelized. +:::{note} +Sometimes bespoke commands will raise `RuntimeError: The gateway could not be reached`. This can usually be resolved +by rerunning the command a few times. +::: + Here we have specified that we wish to start the fit from the general OpenFF 2.0.0 (Sage) force field, augmenting -it with bespoke parameters generated according to the [default built-in workflow](workflow_chapter). +it with bespoke parameters generated according to the +[default built-in workflow using GFN2-xTB reference data](workflow_chapter). :::{note} Other available workflow can be viewed by running `openff-bespoke executor run --help`, or alternatively, the path to a @@ -85,10 +102,11 @@ extra workers can easily be requested to speed things up: ```shell openff-bespoke executor run --file "acetaminophen.sdf" \ --workflow "default" \ - --n-fragmenter-workers 1 \ - --n-optimizer-workers 1 \ + --n-fragmenter-workers 2 \ + --n-optimizer-workers 2 \ --n-qc-compute-workers 2 \ - --qc-compute-n-cores 8 + --qc-compute-n-cores 1 \ + --default-qc-spec xtb gfn2xtb none ``` See the chapter on the [bespoke executor](executor_chapter) for more information about parallelising fits. @@ -105,9 +123,9 @@ seamlessly coordinates every step of the fitting workflow from molecule fragment ```shell openff-bespoke executor launch --n-fragmenter-workers 1 \ - --n-optimizer-workers 1 \ - --n-qc-compute-workers 2 \ - --qc-compute-n-cores 8 + --n-optimizer-workers 2 \ + --n-qc-compute-workers 4 \ + --qc-compute-n-cores 1 ``` The number of workers dedicated to each bespoke fitting stage can be tweaked here. In general, we recommend devoting @@ -120,14 +138,16 @@ the `submit` command either in the form of a SMILES pattern: ```shell openff-bespoke executor submit --smiles "CC(=O)NC1=CC=C(C=C1)O" \ - --workflow "default" + --workflow "default" \ + --default-qc-spec xtb gfn2xtb none ``` or loading the molecule from an SDF (or similar) file: ```shell openff-bespoke executor submit --file "acetaminophen.sdf" \ - --workflow "default" + --workflow "default" \ + --default-qc-spec xtb gfn2xtb none ``` The `submit` command will also accept a combination of the two input forms as well as multiple occurrences of either. @@ -175,13 +195,32 @@ is used to create the workflows that fully describe how bespoke parameters will ```python from openff.bespokefit.workflows import BespokeWorkflowFactory +from openff.qcsubmit.common_structures import QCSpec -factory = BespokeWorkflowFactory() +factory = BespokeWorkflowFactory( + # Define the starting force field that will be augmented with bespoke + # parameters. + initial_force_field="openff-2.0.0.offxml", + # Change the level of theory that the reference QC data is generated at + default_qc_specs=[ + QCSpec( + method="gfn2xtb", + basis=None, + program="xtb", + spec_name="xtb", + spec_description="gfn2xtb", + ) + ] +) ``` -The default factory will produce [workflows](workflow_chapter) that will augment the OpenFF 2.0.0 force field +Similar to the previous steps, here we override the default +["default" QC specification](default_qc_method) to use GFN2-xTB. If we had Psi4 +installed, we could remove the `default_qc_specs` argument and the factory would instead use our mainline +[fitting QC method](default_qc_method). +The default factory will produce [workflows](workflow_chapter) that augment the OpenFF 2.0.0 force field with bespoke torsion parameters for all non-terminal *rotatable* bonds in the molecule that have been trained -to quantum chemical torsion scan data generated for said molecule using the [Psi4] quantum chemistry package. +to quantum chemical torsion scan data generated for said molecule. :::{note} See the [configuration section](quick_start_config_factory) for more info on customising the workflow factory. @@ -214,7 +253,7 @@ with BespokeExecutor( n_fragmenter_workers = 1, n_optimizer_workers = 1, n_qc_compute_workers = 2, - qc_compute_worker_config=BespokeWorkerConfig(n_cores=8) + qc_compute_worker_config=BespokeWorkerConfig(n_cores=1) ) as executor: # Submit our workflow to the executor task_id = executor.submit(input_schema=workflow_schema) diff --git a/docs/index.md b/docs/index.md index cc23a2db..effeccb9 100644 --- a/docs/index.md +++ b/docs/index.md @@ -18,7 +18,7 @@ It is a Python library in the [Open Force Field ecosystem] that emphasises: directly from the command line without touching a line of Python :::{warning} -Please note that BespokeFit is experimental, pre-production software. It does +Please note that BespokeFit is under continuous development. It does not promise to have a stable API and may in cases produce inaccurate results. We are always looking to improve this framwork so if you do find any undesirable or irritating behaviour, please [file an issue!] diff --git a/docs/users/theory.md b/docs/users/theory.md index 42edba42..21d92ef8 100644 --- a/docs/users/theory.md +++ b/docs/users/theory.md @@ -67,6 +67,8 @@ step. [`openff-fragmenter`]: https://fragmenter.readthedocs.io/en/stable/index.html [may be specified]: workflow_chapter +(default_qc_method)= + ## QC Generation The third stage in the bespoke fitting workflow is generating any reference quantum chemical data that the bespoke