Merge pull request #172 from openforcefield/quick-start-update
Update quick start guide to use psi4-ambertools stack
j-wags committed May 17, 2022
2 parents 5087bb0 + 2ae4ad3 commit cb5aa10
Showing 4 changed files with 91 additions and 29 deletions.
43 changes: 32 additions & 11 deletions docs/getting-started/installation.md
@@ -6,13 +6,38 @@ package manager.

## Using conda

The recommended way to install `openff-bespokefit` is via the `conda` package manager:
The recommended way to install `openff-bespokefit` is via the `conda` package manager.
A working installation also requires at least one package from each of the two sections below
("Fragmentation Backends" and "Reference Data Generators").

```shell
conda install -c conda-forge openff-bespokefit
```

### Optional dependencies


### Fragmentation Backends

#### AmberTools Antechamber

AmberTools is free and open-source, and can generally be used to fragment molecules of up to 40 heavy atoms in under
10 minutes.

```shell
conda install -c conda-forge ambertools
```

#### OpenEye Toolkits

If you have access to the OpenEye toolkits (namely `oechem`, `oequacpac` and `oeomega`) we recommend installing
these also as they can speed up certain operations significantly. OpenEye software requires a free-for-academics
license to run.

```shell
conda install -c openeye openeye-toolkits
```

### Reference Data Generators

#### Psi4

@@ -23,6 +48,11 @@ recommended to be installed unless you intend to train against data generated us
conda install -c conda-forge -c defaults -c psi4 psi4
```

:::{warning}
There is an incompatibility between the AmberTools and Psi4 conda packages on Mac, and it is not possible to
create a working conda environment containing both.
:::

#### XTB

The xtb package gives access to the XTB semi-empirical models produced by the Grimme group in Bonn which may be used
@@ -46,15 +76,6 @@ TorchANI potentials are only suitable for molecules with a net neutral charge an
consisting of C, H, N, O, S, F and Cl
:::

#### OpenEye toolkits

If you have access to the OpenEye toolkits (namely `oechem`, `oequacpac` and `oeomega`) we recommend installing
these also as these can speed up certain operations significantly.

```shell
conda install -c openeye openeye-toolkits
```

## From source

To install `openff-bespokefit` from source begin by cloning the repository from
73 changes: 56 additions & 17 deletions docs/getting-started/quick-start.md
@@ -1,6 +1,12 @@
(quick_start_chapter)=
# Quick start

:::{warning}
To reduce runtime, this "Quick start" guide uses a fast semiempirical model, "GFN2-xTB",
to generate training data,
rather than the ["default" method](default_qc_method) used to train mainline OpenFF force fields.
:::

BespokeFit aims to provide an automated pipeline that ingests a general molecular force field and a set of
molecules of interest, and produces a new bespoke force field that has been augmented with highly specific
force field parameters trained to accurately capture the important features and phenomenology of the input set.
@@ -9,10 +15,13 @@ Such features may include generating bespoke torsion parameters that have been t
to capture as closely as possible the torsion profiles of the rotatable bonds in the target molecule
which have a large impact on conformational preferences.

The recommended way to install `openff-bespokefit` is via the `conda` package manager:
The recommended way to install `openff-bespokefit` is via the `conda` package manager. There are several optional
dependencies, and a good starting environment is:

```shell
conda install -c conda-forge openff-bespokefit
conda create -n bespokefit -y -c conda-forge mamba python=3.9
conda activate bespokefit
mamba install -y -c conda-forge openff-bespokefit xtb-python ambertools
```

although [several other methods are available](installation_chapter).
@@ -50,7 +59,8 @@ openff-bespoke executor run --smiles "CC(=O)NC1=CC=C(C=C1)O" \
--output "acetaminophen.json" \
--output-force-field "acetaminophen.offxml" \
--n-qc-compute-workers 2 \
--qc-compute-n-cores 8
--qc-compute-n-cores 1 \
--default-qc-spec xtb gfn2xtb none
```

or the path to an SDF (or similar) file
@@ -61,13 +71,20 @@ openff-bespoke executor run --file "acetaminophen.sdf" \
--output "acetaminophen.json" \
--output-force-field "acetaminophen.offxml" \
--n-qc-compute-workers 2 \
--qc-compute-n-cores 8
--qc-compute-n-cores 1 \
--default-qc-spec xtb gfn2xtb none
```

in addition to arguments defining how the bespoke fit should be performed and parallelized.

:::{note}
Sometimes bespoke commands will raise `RuntimeError: The gateway could not be reached`. This can usually be resolved
by rerunning the command a few times.
:::
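
The rerun advice in the note above can be automated when driving BespokeFit from Python. The following is a minimal sketch and not part of BespokeFit: `run_with_retries` is a hypothetical helper, and it assumes the failure surfaces in the calling process as a `RuntimeError` whose message contains the text quoted in the note.

```python
import time


def run_with_retries(run_once, attempts=3, delay_seconds=2.0, sleep=time.sleep):
    """Call run_once() until it succeeds or the attempts are used up.

    Hypothetical helper: only a RuntimeError mentioning the unreachable
    gateway is retried. Any other error, or the final gateway failure,
    is re-raised unchanged.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")

    for attempt in range(1, attempts + 1):
        try:
            return run_once()
        except RuntimeError as error:
            is_gateway_error = "gateway could not be reached" in str(error)
            if not is_gateway_error or attempt == attempts:
                raise
            # Give the gateway a moment before trying again.
            sleep(delay_seconds)
```

Here `run_once` would wrap whichever call raised the error, for example a function that submits a molecule to an already running executor. Errors with any other message are deliberately not retried, so genuine problems are still reported immediately.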

Here we have specified that we wish to start the fit from the general OpenFF 2.0.0 (Sage) force field, augmenting
it with bespoke parameters generated according to the [default built-in workflow](workflow_chapter).
it with bespoke parameters generated according to the
[default built-in workflow using GFN2-xTB reference data](workflow_chapter).

:::{note}
Other available workflows can be viewed by running `openff-bespoke executor run --help`, or alternatively, the path to a
@@ -85,10 +102,11 @@ extra workers can easily be requested to speed things up:
```shell
openff-bespoke executor run --file "acetaminophen.sdf" \
--workflow "default" \
--n-fragmenter-workers 1 \
--n-optimizer-workers 1 \
--n-fragmenter-workers 2 \
--n-optimizer-workers 2 \
--n-qc-compute-workers 2 \
--qc-compute-n-cores 8
--qc-compute-n-cores 1 \
--default-qc-spec xtb gfn2xtb none
```

See the chapter on the [bespoke executor](executor_chapter) for more information about parallelising fits.
@@ -105,9 +123,9 @@ seamlessly coordinates every step of the fitting workflow from molecule fragment

```shell
openff-bespoke executor launch --n-fragmenter-workers 1 \
--n-optimizer-workers 1 \
--n-qc-compute-workers 2 \
--qc-compute-n-cores 8
--n-optimizer-workers 2 \
--n-qc-compute-workers 4 \
--qc-compute-n-cores 1
```

The number of workers dedicated to each bespoke fitting stage can be tweaked here. In general, we recommend devoting
@@ -120,14 +138,16 @@ the `submit` command either in the form of a SMILES pattern:

```shell
openff-bespoke executor submit --smiles "CC(=O)NC1=CC=C(C=C1)O" \
--workflow "default"
--workflow "default" \
--default-qc-spec xtb gfn2xtb none
```

or loading the molecule from an SDF (or similar) file:

```shell
openff-bespoke executor submit --file "acetaminophen.sdf" \
--workflow "default"
--workflow "default" \
--default-qc-spec xtb gfn2xtb none
```

The `submit` command will also accept a combination of the two input forms as well as multiple occurrences of either.
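
When many molecules are submitted from a script, the combined form can be assembled programmatically. The sketch below is illustrative only: `build_submit_command` is a hypothetical helper, it uses only the flags shown in this guide, and it builds the argument list without running it.

```python
import shlex


def build_submit_command(smiles=(), files=(), workflow="default",
                         qc_spec=("xtb", "gfn2xtb", "none")):
    """Assemble an `openff-bespoke executor submit` argument list.

    Hypothetical helper: repeats --smiles and --file once per input,
    mirroring the flags used in the examples of this guide.
    """
    if not smiles and not files:
        raise ValueError("provide at least one SMILES pattern or file")

    command = ["openff-bespoke", "executor", "submit"]
    for pattern in smiles:
        command += ["--smiles", pattern]
    for path in files:
        command += ["--file", path]
    command += ["--workflow", workflow]
    command += ["--default-qc-spec", *qc_spec]
    return command


command = build_submit_command(
    smiles=["CC(=O)NC1=CC=C(C=C1)O"],
    files=["acetaminophen.sdf"],
)
# Print a shell-quoted version for inspection before running anything.
print(shlex.join(command))
```

The resulting list can be passed to `subprocess.run(command, check=True)` once an executor has been launched. Keeping the arguments as a list avoids shell-quoting problems with the parentheses in SMILES patterns.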
@@ -175,13 +195,32 @@ is used to create the workflows that fully describe how bespoke parameters will

```python
from openff.bespokefit.workflows import BespokeWorkflowFactory
from openff.qcsubmit.common_structures import QCSpec

factory = BespokeWorkflowFactory()
factory = BespokeWorkflowFactory(
    # Define the starting force field that will be augmented with bespoke
    # parameters.
    initial_force_field="openff-2.0.0.offxml",
    # Change the level of theory that the reference QC data is generated at
    default_qc_specs=[
        QCSpec(
            method="gfn2xtb",
            basis=None,
            program="xtb",
            spec_name="xtb",
            spec_description="gfn2xtb",
        )
    ]
)
```

The default factory will produce [workflows](workflow_chapter) that will augment the OpenFF 2.0.0 force field
As in the previous steps, here we override the
["default" QC specification](default_qc_method) to use GFN2-xTB. If we had Psi4
installed, we could remove the `default_qc_specs` argument and the factory would instead use our mainline
[fitting QC method](default_qc_method).
The default factory will produce [workflows](workflow_chapter) that augment the OpenFF 2.0.0 force field
with bespoke torsion parameters for all non-terminal *rotatable* bonds in the molecule that have been trained
to quantum chemical torsion scan data generated for said molecule using the [Psi4] quantum chemistry package.
to quantum chemical torsion scan data generated for said molecule.

:::{note}
See the [configuration section](quick_start_config_factory) for more info on customising the workflow factory.
@@ -214,7 +253,7 @@ with BespokeExecutor(
n_fragmenter_workers = 1,
n_optimizer_workers = 1,
n_qc_compute_workers = 2,
qc_compute_worker_config=BespokeWorkerConfig(n_cores=8)
qc_compute_worker_config=BespokeWorkerConfig(n_cores=1)
) as executor:
# Submit our workflow to the executor
task_id = executor.submit(input_schema=workflow_schema)
2 changes: 1 addition & 1 deletion docs/index.md
@@ -18,7 +18,7 @@ It is a Python library in the [Open Force Field ecosystem] that emphasises:
directly from the command line without touching a line of Python

:::{warning}
Please note that BespokeFit is experimental, pre-production software. It does
Please note that BespokeFit is under continuous development. It does
not promise to have a stable API and may in some cases produce inaccurate results.
We are always looking to improve this framework so if you do find any undesirable
or irritating behaviour, please [file an issue!]
2 changes: 2 additions & 0 deletions docs/users/theory.md
@@ -67,6 +67,8 @@ step.
[`openff-fragmenter`]: https://fragmenter.readthedocs.io/en/stable/index.html
[may be specified]: workflow_chapter

(default_qc_method)=

## QC Generation

The third stage in the bespoke fitting workflow is generating any reference quantum chemical data that the bespoke