# Dubai — Quantum accurate bond inference and partial charge calculations

This notebook walks through quantum-accurate bond inference and Mulliken partial charge calculation for aspirin, starting from a 3D SDF record retrieved from PubChem by SMILES string.

# 0) Complete example
See the [sample notebook](/Quickstarts/ligand_bond_inference_and_partial_charge_calculation-sample.ipynb) for a complete demonstration.

# 1) Setup
See [Quickstart](../index.ipynb#imports) for more details on the setup.

## 1.0) Imports

In [None]:
import os
import json
from pathlib import Path

import requests
import rush

## 1.1) Configuration


In [None]:
TOKEN = os.getenv("RUSH_TOKEN")
# You might have a custom deployment url, by default it will use
# https://tengu.qdx.ai
RUSH_URL = os.getenv("RUSH_URL")
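Because `os.getenv` returns `None` when a variable is unset, a missing token would only surface later, when we build the client. A minimal sketch of an earlier check follows; `require_env` is a hypothetical helper written for this notebook, not part of rush-py.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or fail with a clear message."""
    value = os.getenv(name)
    if not value:
        raise RuntimeError(f"Environment variable {name} is not set")
    return value
```

With this in place we could write `TOKEN = require_env("RUSH_TOKEN")`. `RUSH_URL` can stay on `os.getenv`, since the comment above says a default is used when it is unset.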

In [None]:
DESCRIPTION = "quantum-bond-inference-notebook"
TAGS = ["rush-py", "dubai", "convert"]
WORK_DIR = Path.home() / "qdx" / "dubai-rush-py-demo"

## 1.2) Build your client

In [None]:
# |hide
if WORK_DIR.exists():
    client = rush.Provider(workspace=WORK_DIR, access_token=TOKEN, url=RUSH_URL)
    await client.nuke(remote=False)

In [None]:
os.makedirs(WORK_DIR, exist_ok=True)

client = await rush.build_provider_with_functions(
    access_token=TOKEN, url=RUSH_URL, workspace=WORK_DIR, batch_tags=TAGS
)

In [None]:
# | hide
client = await rush.build_provider_with_functions(
    access_token=TOKEN,
    url=RUSH_URL,
    workspace=WORK_DIR,
    batch_tags=TAGS,
    restore_by_default=True,
)

# 2) Preparation

## 2.0) Download Aspirin SDF from PubChem

In [None]:
# Download a 3D SDF record for aspirin from PubChem, looked up by SMILES string
SMILES_STRING = "CC(=O)OC1=CC=CC=C1C(=O)O"
SDF_LINK = f"https://pubchem.ncbi.nlm.nih.gov/rest/pug/compound/smiles/{SMILES_STRING}/record/SDF?record_type=3d"

file_path = f"{WORK_DIR}/aspirin.sdf"
response = requests.get(SDF_LINK)
# Fail early instead of writing an error response to disk
response.raise_for_status()
with open(file_path, "wb") as f:
    f.write(response.content)
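The SMILES string contains characters such as `(`, `)` and `=` that are reserved in URLs. A minimal sketch of building the same link with the SMILES percent-encoded follows. `build_sdf_url` and `PUBCHEM_BASE` are hypothetical names introduced here, and we assume PubChem accepts the encoded form in the path.

```python
from urllib.parse import quote

PUBCHEM_BASE = "https://pubchem.ncbi.nlm.nih.gov/rest/pug/compound/smiles"


def build_sdf_url(smiles: str) -> str:
    """Build a PubChem URL for the 3D SDF record of a SMILES string."""
    # safe="" percent-encodes every reserved character, including "(", ")" and "="
    encoded = quote(smiles, safe="")
    return f"{PUBCHEM_BASE}/{encoded}/record/SDF?record_type=3d"


print(build_sdf_url("CC(=O)OC1=CC=CC=C1C(=O)O"))
```

SMILES strings that contain `/` (used for stereochemistry) may still need a different request style, so this sketch is only meant for simple cases like aspirin.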

## 2.1) Convert SDF file to QDXF format
QDXF is the central molecule format of the Rush API, so before we use the Dubai module to infer connectivity (bonds) for our molecule, we must convert our SDF file to QDXF.

In [None]:
# We need to specify storage > file size to ensure that we have allocated
# enough resources for the convert module
(ligand,) = await client.convert(
    "SDF", Path(file_path), resources={"storage": 6000}
)

2024-02-29 09:37:07,541 - rush - INFO - Trying to restore job with tags: ['rush-py', 'dubai', 'convert'] and path: github:talo/tengu-prelude/f506c7ead174cdb7e8d1725139254bb85c6b62f8#convert
2024-02-29 09:37:07,602 - rush - INFO - Restoring job from previous run with id 9048168a-b3f8-4722-ba9e-d3c2b947aedf


## 2.2) Remove connectivity
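We remove the connectivity that came from the SDF file, because Dubai will infer the bonds in the next step. We keep a copy of the original connectivity so that we can compare it with Dubai's output afterwards.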

In [None]:
ligand_path = await ligand.download()

In [None]:
# Use a context manager so the file is closed after reading
with open(ligand_path, "r") as f:
    ligand = json.load(f)

In [None]:
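# Keep the bonds from the SDF so we can compare them with Dubai's output later
EXPECTED_CONNECTIVITY = ligand[0]["topology"]["connectivity"]
# The downloaded JSON is a list; we work with its first entry
ligand = ligand[0]
del ligand["topology"]["connectivity"]

# Dubai requires fragment multiplicities; we treat aspirin as one fragment
ligand["topology"]["fragment_multiplicities"] = [1]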
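The same step can be written as a small function that leaves its input untouched. This is a sketch only: `strip_connectivity` is a hypothetical helper, and the toy conformer below uses made-up values rather than real QDXF output.

```python
import copy


def strip_connectivity(conformer: dict) -> tuple[dict, list]:
    """Return a copy of the conformer without bonds, plus the removed bonds."""
    stripped = copy.deepcopy(conformer)
    bonds = stripped["topology"].pop("connectivity", [])
    return stripped, bonds


# Toy conformer with made-up values, for illustration only
toy = {"topology": {"connectivity": [[0, 1, 2]]}}
stripped, expected = strip_connectivity(toy)
print(expected)
```

Working on a deep copy means the original dictionary still holds its bonds, which is convenient when re-running cells out of order.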

# 3) Infer bonds and calculate partial charges
In this stage, we configure the Dubai module and save our QDXF aspirin to disk, because the Dubai module takes the file itself as input.


## 3.0) Arguments

In [None]:
DUBAI_RESOURCES = {
    "gpus": 1,
    "storage": 1024_000,
    "walltime": 60,
}
LIGAND_FILEPATH = Path(f"{WORK_DIR}/aspirin.qdxf.json")
# Close the file before submitting so the JSON is fully written to disk
with open(LIGAND_FILEPATH, "w") as f:
    json.dump(ligand, f)
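Before submitting the job, it can help to confirm that the saved file is in the state this notebook expects: no `connectivity`, and `fragment_multiplicities` present. The sketch below checks only those two keys, which are the ones used in this notebook. `check_ready_for_dubai` is a hypothetical helper, not part of rush-py, and the toy file is for illustration.

```python
import json
import tempfile
from pathlib import Path


def check_ready_for_dubai(path: Path) -> None:
    """Raise ValueError if the saved conformer is not in the expected state."""
    with open(path) as f:
        topology = json.load(f)["topology"]
    if "connectivity" in topology:
        raise ValueError("connectivity should be removed before bond inference")
    if "fragment_multiplicities" not in topology:
        raise ValueError("fragment_multiplicities is missing")


# Toy conformer written to a temporary directory, for illustration only
toy = {"topology": {"fragment_multiplicities": [1]}}
with tempfile.TemporaryDirectory() as tmp:
    toy_path = Path(tmp) / "toy.qdxf.json"
    toy_path.write_text(json.dumps(toy))
    check_ready_for_dubai(toy_path)  # passes silently
```

In this notebook the call would be `check_ready_for_dubai(LIGAND_FILEPATH)`.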

## 3.1) Run the inference
Finally, we run Dubai to perform quantum-accurate bond inference, as well as the calculation of Mulliken partial charges.

In [None]:
help(client.dubai)

Help on function dubai in module rush.provider:

async dubai(*args: *tuple[RushObject[Record]], target: 'Target | None' = None, resources: 'Resources | None' = {'storage': 1034, 'storage_units': 'MB', 'gpus': 4}, tags: 'list[str] | None' = None, restore: 'bool | None' = None) -> tuple[RushObject[Record]]
    Perform quantum accurate bond inference and partial charge calculation on a Conformer

    Module version:
    `github:talo/Dubai/4a177b6f5711de65abf0c8856adf3c2604ca228d#dubai_tengu`

    QDX Type Description:

        input_conformer: Object[Conformer]
        ->
        output_conformer: Object[Conformer]


    :param input_conformer: A Conformer. The Conformer's Topology requires fragment charges and fragment charge multiplicities
    :return output_conformer: Output Conformer including partial charges and bond recalculation



In [None]:
(ligand_with_bonds,) = await client.dubai(
    LIGAND_FILEPATH, resources=DUBAI_RESOURCES, target="NIX_SSH"
)

2024-02-29 11:11:46,856 - rush - INFO - Trying to restore job with tags: ['rush-py', 'dubai', 'convert'] and path: github:talo/Dubai/4a177b6f5711de65abf0c8856adf3c2604ca228d#dubai_tengu


In [None]:
output_ligand_path = await ligand_with_bonds.download()

In [None]:
# Use a context manager so the file is closed after reading
with open(output_ligand_path, "r") as f:
    output_ligand = json.load(f)

for expected_bond, outputted_bond in zip(
    EXPECTED_CONNECTIVITY, output_ligand["topology"]["connectivity"]
):
    # Check start atoms are the same
    assert expected_bond[0] == outputted_bond[0]
    # Check ending atoms are the same
    assert expected_bond[1] == outputted_bond[1]
    # NB: we don't check the third item of the bond (the bond type). This is
    # because Dubai outputs ring bonds as a specific 'RINGBOND' type,
    # whereas the SDF for aspirin alternates single and double bonds.
print("Bond inference performed correctly!")

Bond inference performed correctly!
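The pairwise loop above relies on both bond lists being in the same order, and `zip` stops at the shorter list, so a missing bond at the end would go unnoticed. A stricter, order-independent comparison can be sketched with sets. `bond_pairs` and `same_bonds` are hypothetical helpers, and the bond lists below are made-up toy data in the `[start, end, type]` layout used above.

```python
def bond_pairs(connectivity):
    """Reduce bonds to unordered atom-index pairs, dropping the bond type."""
    return {frozenset(bond[:2]) for bond in connectivity}


def same_bonds(expected, inferred) -> bool:
    """True if both bond lists connect exactly the same atom pairs."""
    return bond_pairs(expected) == bond_pairs(inferred)


# Toy data: same atom pairs, different order, direction and bond types
toy_expected = [[0, 1, 1], [1, 2, 2]]
toy_inferred = [[2, 1, 5], [0, 1, 1]]
print(same_bonds(toy_expected, toy_inferred))
```

For the notebook's data, the call would be `same_bonds(EXPECTED_CONNECTIVITY, output_ligand["topology"]["connectivity"])`.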
