# Dubai — Quantum accurate bond inference and partial charge calculations

This notebook walks through quantum-accurate bond inference and Mulliken partial charge calculation for aspirin, using a 3D structure retrieved from PubChem by its SMILES string.

## 0) Setup
See [Quickstart](../index.ipynb#imports) for more details on the setup.

## 0.0) Imports

In [None]:
import os
import json
from pathlib import Path

import requests
import rush

## 0.1) Credentials


In [None]:
TOKEN = os.getenv("RUSH_TOKEN")
# Optional: set RUSH_URL for a custom deployment; if unset, the default is https://tengu.qdx.ai
RUSH_URL = os.getenv("RUSH_URL")
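
`os.getenv` returns `None` when a variable is unset, so a missing `RUSH_TOKEN` would likely only surface later as an authentication problem. A minimal sketch of failing early instead; `require_env` is a hypothetical helper written for this illustration, not part of rush-py:

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or raise if it is missing or empty."""
    value = os.getenv(name)
    if not value:
        raise RuntimeError(f"Environment variable {name} is not set")
    return value
```

With this helper, `TOKEN = require_env("RUSH_TOKEN")` would stop the notebook at the credentials cell, while `RUSH_URL` can stay optional via plain `os.getenv`.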

## 0.2) Configuration
Let's set some global variables that define our project.

In [None]:
DESCRIPTION = "quantum-bond-inference-notebook"
TAGS = ["rush-py", "dubai", "convert"]
WORK_DIR = Path.home() / "qdx" / "dubai-rush-py-demo"

## 0.3) Build your rush client

In [None]:
# |hide
if WORK_DIR.exists():
    client = rush.Provider(workspace=WORK_DIR, access_token=TOKEN, url=RUSH_URL)
    await client.nuke()

In [None]:
os.makedirs(WORK_DIR, exist_ok=True)

client = await rush.build_provider_with_functions(
    access_token=TOKEN, url=RUSH_URL, workspace=WORK_DIR, batch_tags=TAGS
)

## 0.4) Download Aspirin SDF from PubChem

In [None]:
# Download a 3D SDF record for aspirin from PubChem, looked up by its SMILES string
SMILES_STRING = "CC(=O)OC1=CC=CC=C1C(=O)O"
SDF_LINK = (
    f"https://pubchem.ncbi.nlm.nih.gov/rest/pug/compound/smiles/{SMILES_STRING}/record/SDF?record_type=3d"
)

file_path = f"{WORK_DIR}/aspirin.sdf"
response = requests.get(SDF_LINK, timeout=60)
# Fail here rather than writing an HTTP error page to disk
response.raise_for_status()
with open(file_path, "wb") as f:
    f.write(response.content)
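
SMILES strings contain characters such as `=`, `(`, `#` and `/` that also have meaning in URLs. The following is a sketch of percent-encoding the SMILES before placing it in the PUG REST path. `pubchem_sdf_url` and `PUBCHEM_BASE` are hypothetical names introduced here, and whether PubChem accepts every encoded character in the path has not been verified for this notebook; for SMILES with stereo markers, passing the structure as a query argument or POST body may be the safer route.

```python
from urllib.parse import quote

# Base of the PUG REST endpoint used in the cell above
PUBCHEM_BASE = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"


def pubchem_sdf_url(smiles: str) -> str:
    """Build the PUG REST URL for a 3D SDF record, percent-encoding the SMILES."""
    # safe="" also encodes "/", which SMILES uses for stereochemistry
    encoded = quote(smiles, safe="")
    return f"{PUBCHEM_BASE}/compound/smiles/{encoded}/record/SDF?record_type=3d"


url = pubchem_sdf_url("CC(=O)OC1=CC=CC=C1C(=O)O")
print(url)
```

For aspirin the plain and encoded URLs refer to the same record; the encoding matters mostly for molecules whose SMILES contain `#`, `/` or `\`.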

## 0.5) Convert SDF file to QDXF format
QDXF is the central molecule format of the Rush API, so before we use the Dubai module to infer connectivity (bonds) for our molecule, we must convert our SDF file to QDXF.

In [None]:
# Request more storage than the input file needs so the convert module has enough room to run
(ligand,) = await client.convert("SDF", Path(file_path), resources={"storage": 5000})

ligand = await ligand.get()

2024-01-24 10:35:42,147 - rush - INFO - Argument 6e2c0a62-43df-452e-84e3-1016992d3320 is now ModuleInstanceStatus.RESOLVING
2024-01-24 10:35:43,286 - rush - INFO - Argument 6e2c0a62-43df-452e-84e3-1016992d3320 is now ModuleInstanceStatus.ADMITTED
2024-01-24 10:35:44,397 - rush - INFO - Argument 6e2c0a62-43df-452e-84e3-1016992d3320 is now ModuleInstanceStatus.RUNNING
2024-01-24 10:35:45,509 - rush - INFO - Argument 6e2c0a62-43df-452e-84e3-1016992d3320 is now ModuleInstanceStatus.AWAITING_UPLOAD


## 0.6) Remove connectivity
The converted ligand already carries the bonds from the SDF file. We save them for later comparison and then delete them, since Dubai will infer the bonds in the next step.

In [None]:
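# Keep the SDF-derived bonds so we can compare them with Dubai's output later
EXPECTED_CONNECTIVITY = ligand[0]["topology"]["connectivity"]
ligand = ligand[0]
del ligand["topology"]["connectivity"]

# Dubai requires fragment multiplicities; aspirin is a single closed-shell fragment (singlet)
ligand["topology"]["fragment_multiplicities"] = [1]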
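
The order of operations in the cell above matters: `EXPECTED_CONNECTIVITY` is bound to the bond list before the key is deleted, and `del` removes only the dictionary entry, not the list itself. A small sketch on a toy dictionary shows this. The layout (a `topology` holding a `connectivity` list of `[start, end, bond_type]` entries) is an assumption inferred from how this notebook indexes the ligand, and the values are invented for illustration.

```python
# Toy stand-in for a QDXF conformer; values are illustrative only.
toy_conformer = {
    "topology": {
        "connectivity": [[0, 1, 1], [1, 2, 1]],
    }
}

# Keep a reference to the bond list before removing the key.
expected = toy_conformer["topology"]["connectivity"]
del toy_conformer["topology"]["connectivity"]

# One fragment in a singlet spin state.
toy_conformer["topology"]["fragment_multiplicities"] = [1]

print("connectivity" in toy_conformer["topology"])  # False
print(expected)  # [[0, 1, 1], [1, 2, 1]]
```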

## 1.0) Set Dubai module-specific configuration
In this stage, we set the resources for the Dubai module and save our QDXF aspirin to disk, since the module takes a file path as input.


In [None]:
DUBAI_RESOURCES = {
    "gpus": 1,
    "storage": 1024_000,
    "walltime": 60,
}
LIGAND_FILEPATH = Path(f"{WORK_DIR}/aspirin.qdxf.json")
# Use a context manager so the file is flushed and closed before Dubai reads it
with open(LIGAND_FILEPATH, "w") as f:
    json.dump(ligand, f)
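
Because Dubai reads the file rather than the in-memory dictionary, it can be worth confirming that what was written round-trips cleanly. A minimal sketch, using a temporary directory and a toy dictionary in place of the real ligand (both are stand-ins chosen for this example):

```python
import json
import tempfile
from pathlib import Path

# Toy stand-in for the ligand after connectivity has been removed
toy_ligand = {"topology": {"fragment_multiplicities": [1]}}

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "toy.qdxf.json"
    # Closing the file guarantees the data is flushed before it is read back
    with open(path, "w") as f:
        json.dump(toy_ligand, f)
    with open(path) as f:
        reloaded = json.load(f)

# The reloaded data matches, and no connectivity has crept back in
assert reloaded == toy_ligand
assert "connectivity" not in reloaded["topology"]
```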

## 1.1) Run Dubai
Finally, we run Dubai to perform quantum-accurate bond inference and to calculate Mulliken partial charges.

In [None]:
help(client.dubai)

Help on function dubai in module rush.provider:

async dubai(*args: [<class 'pathlib.Path'>], target: rush.graphql_client.enums.ModuleInstanceTarget | None = <ModuleInstanceTarget.NIX_SSH: 'NIX_SSH'>, resources: rush.graphql_client.input_types.ModuleInstanceResourcesInput | None = ModuleInstanceResourcesInput(gpus=0, gpu_mem=None, gpu_mem_units=None, cpus=None, nodes=None, mem=None, mem_units=None, storage=10, storage_units=<MemUnits.MB: 'MB'>, walltime=None, storage_mounts=None), tags: list[str] | None = None, restore: bool | None = None) -> [<class 'pathlib.Path'>]
    Perform quantum accurate bond inference and partial charge calculation on a Conformer
    
    Module version: `Dubai/d013073e0774326e903e6abaca4c1871d4950e70`
    
    QDX Type Description:
    
        input: @Conformer
        ->
        output: @Conformer
    
    :param input: A Conformer. The Conformer's Topology requires fragment charges and fragment charge multiplicities
    :return output: Output Conformer inc

In [None]:
(ligand_with_bonds,) = await client.dubai(LIGAND_FILEPATH, resources=DUBAI_RESOURCES)

In [None]:
output_ligand = await ligand_with_bonds.get()

for expected_bond, outputted_bond in zip(EXPECTED_CONNECTIVITY, output_ligand["topology"]["connectivity"]):
    # Check start atoms are the same
    assert expected_bond[0] == outputted_bond[0]
    # Check ending atoms are the same
    assert expected_bond[1] == outputted_bond[1]
    # NB: we don't check the third item of the bond (the bond type). Dubai outputs ring bonds as
    # a specific 'RINGBOND' type, whereas the PubChem SDF represents the aspirin ring with alternating single and double bonds.

2024-01-24 10:35:47,888 - rush - INFO - Argument 71a62101-da66-466d-aa16-5f28f8c425b4 is now ModuleInstanceStatus.RESOLVING
2024-01-24 10:35:58,892 - rush - INFO - Argument 71a62101-da66-466d-aa16-5f28f8c425b4 is now ModuleInstanceStatus.ADMITTED
2024-01-24 10:36:37,365 - rush - INFO - Argument 71a62101-da66-466d-aa16-5f28f8c425b4 is now ModuleInstanceStatus.DISPATCHED
2024-01-24 10:36:46,274 - rush - INFO - Argument 71a62101-da66-466d-aa16-5f28f8c425b4 is now ModuleInstanceStatus.AWAITING_UPLOAD
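
The comparison loop above walks both bond lists in order, so it relies on the two lists having the same length and the same ordering. An order-insensitive variant can be sketched as follows. `bond_pairs` and `diff_bonds` are hypothetical helpers, and the sketch assumes each bond is a `[start, end, bond_type]` entry, as the comments in the loop imply.

```python
def bond_pairs(connectivity):
    """Reduce bonds to a set of unordered atom-index pairs, ignoring bond type."""
    return {frozenset((bond[0], bond[1])) for bond in connectivity}


def diff_bonds(expected, actual):
    """Return (missing, extra) bonds as sorted lists of (low, high) index tuples."""
    exp, act = bond_pairs(expected), bond_pairs(actual)
    missing = sorted(tuple(sorted(pair)) for pair in exp - act)
    extra = sorted(tuple(sorted(pair)) for pair in act - exp)
    return missing, extra


# Toy data: the same bonds in a different order, with different bond-type labels
expected = [[0, 1, 1], [1, 2, 2], [2, 3, 1]]
actual = [[2, 1, 4], [0, 1, 1], [3, 2, 4]]
print(diff_bonds(expected, actual))  # ([], [])
```

Applied to the notebook's data, `diff_bonds(EXPECTED_CONNECTIVITY, output_ligand["topology"]["connectivity"])` would list any bonds present in one source but not the other, which is easier to debug than a bare failing assertion.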
