# Ligand ADMET and Potency (Property Prediction)

The [ADMET](https://polarishub.io/competitions/asap-discovery/antiviral-admet-2025) and [Potency](https://polarishub.io/competitions/asap-discovery/antiviral-potency-2025) challenges of the [ASAP Discovery competition](https://polarishub.io/blog/antiviral-competition) take the shape of a property prediction task. Given the SMILES (or, to be more precise, the CXSMILES) of a molecule, you are asked to predict its numerical properties. This is a relatively straightforward application of ML, and this notebook will quickly get you up and running!

To begin with, choose one of the two challenges! The code will look the same for both. 

In [1]:
import polaris as po
from polaris.hub.client import PolarisHubClient
import csv



In [2]:
client = PolarisHubClient()
client.login(overwrite=True)



In [3]:
CHALLENGE = "antiviral-admet-2025"  # or: "antiviral-potency-2025"

In [4]:
# Alternative to the client login above: use the Polaris CLI.
# `polaris login` is a shell command, so a notebook cell needs the leading "!".
!polaris login --overwrite

## Load the competition

Let's first load the competition from Polaris.

Make sure you are logged in! If not, simply run `polaris login` and follow the instructions. 

In [5]:
import polaris as po

competition = po.load_competition(f"asap-discovery/{CHALLENGE}")

As suggested in the logs, we'll cache the dataset. Note that this is not strictly necessary, but it does speed up later steps.

In [6]:
competition.cache()

'C:\\Users\\Valerij\\AppData\\Local\\polaris\\polaris\\Cache\\datasets\\d2431504-9a17-45e2-9764-5fbda67ed209'

Let's get the train and test set and take a look at the data structure.

In [7]:
train, test = competition.get_train_test_split()

In [8]:
train[0]

('COC1=CC=CC(Cl)=C1NC(=O)N1CCC[C@H](C(N)=O)C1 |a:16|',
 {'LogD': 0.3, 'MDR1-MDCKII': 2.0, 'HLM': nan, 'MLM': nan, 'KSOL': nan})
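Each input is a CXSMILES string: a plain SMILES followed by a space and an extension block between `|` characters (here `|a:16|`, which appears to be a stereochemistry annotation). Some tools expect only the SMILES part. The following is a minimal sketch of separating the two with plain string handling; `split_cxsmiles` is a hypothetical helper name, not part of Polaris, and it assumes the extension starts after the first space.

```python
from typing import Tuple


def split_cxsmiles(cxsmiles: str) -> Tuple[str, str]:
    """Split a CXSMILES string into (smiles, extension).

    Hypothetical helper, not part of Polaris. Assumes that the extension,
    if present, is separated from the SMILES by the first space.
    """
    smiles, _, extension = cxsmiles.strip().partition(" ")
    return smiles, extension.strip()


# The first training molecule shown above.
example = "COC1=CC=CC(Cl)=C1NC(=O)N1CCC[C@H](C(N)=O)C1 |a:16|"
smiles, extension = split_cxsmiles(example)
print(smiles)
print(extension)
```

Dropping the extension discards the information it carries, so it is probably safer to keep the full string whenever the downstream toolkit can read CXSMILES directly.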

In [9]:
print(test)

<polaris.dataset._subset.Subset object at 0x00000254A553BA40>


In [10]:
test[0]

'CC(C)[C@H]1C2=C(CCN1C(=O)CC1=CN=CC3=CC=CC=C13)SC=C2 |o1:3|'

In [11]:
print(len(test))

126


In [12]:
import pandas as pd

df = pd.read_csv("antiviral_admet_predictions.csv")

In [13]:
df

Unnamed: 0,SMILES,in-vitro_HLM_bienta: CLint (Num) (uL/min/mg),in-vitro_KSOL-PBS_bienta: mean_solubility (Num) (uM),in-vitro_LogD_bienta: LogD (Num),in-vitro_MDR1-MDCKII-Papp_bienta: mean_Papp_A_to_B (Num) (10^-6 cm/s),in-vitro_MLM_bienta: CLint (Num) (uL/min/mg)
0,CC(C)[C@H]1C2=C(CCN1C(=O)CC1=CN=CC3=CC=CC=C13)...,220.971471,254.368055,2.903119,12.420424,345.842936
1,CC(C)C1(C)CCN(C(=O)CC2=CN=CC3=CC=CC=C23)CC1,229.144458,251.732889,3.238782,17.729760,347.774639
2,O=C(CC1=CN=CC2=CC=CC=C12)N1CCC2=C(C=CS2)C12CCC2,211.390603,290.660625,2.821505,17.055120,370.460577
3,CC1(CC(F)(F)F)CCN(C(=O)CC2=CN=CC3=CC=CC=C23)CC1,204.494985,193.597737,3.770655,17.758855,304.119910
4,CCC1=CC=C([C@H]2C[C@H](C)CCN2C(=O)CC2=CN=CC3=C...,192.872860,260.140832,2.757594,13.480864,325.786526
...,...,...,...,...,...,...
121,C[C@H]1CN(C2=CN=CC3=CC=CC=C23)C(=O)[C@@]12CN(C...,123.307702,361.952245,1.858892,6.127117,236.509169
122,C[C@H]1CN(C2=CN=CC3=CC=CC=C23)C(=O)[C@@]12CN(C...,121.835544,363.217492,1.839692,5.971129,233.844745
123,COC[C@H]1CN(C2=CN=CC3=CC=CC=C23)C(=O)[C@@]12CN...,83.323647,379.526509,1.555967,4.398426,179.845488
124,C[C@H]1CN(C2=CN=CC3=CC=CC=C23)C(=O)[C@@]12CN(C...,111.726419,407.768852,1.653481,6.855751,218.133738
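Predictions are later passed to the competition as plain arrays, so row `i` of the CSV has to describe `test[i]`. The sketch below shows one way to check that ordering. It assumes that the `SMILES` column holds the test SMILES without the CXSMILES extension, which the truncated output above suggests but does not prove; `find_misaligned_rows` is a hypothetical helper name.

```python
from typing import List, Sequence


def find_misaligned_rows(test_smiles: Sequence[str], csv_smiles: Sequence[str]) -> List[int]:
    """Return the indices where the CSV row does not match the test molecule.

    Hypothetical helper. Only the SMILES part is compared; anything after
    the first space (the CXSMILES extension) is ignored.
    """
    if len(test_smiles) != len(csv_smiles):
        raise ValueError(
            f"length mismatch: {len(test_smiles)} test molecules vs {len(csv_smiles)} CSV rows"
        )
    return [
        i
        for i, (expected, found) in enumerate(zip(test_smiles, csv_smiles))
        if expected.split(" ")[0] != found.split(" ")[0]
    ]


# Toy data standing in for the real test set and CSV column.
toy_test = ["CCO |a:1|", "CCN", "c1ccccc1"]
toy_csv = ["CCO", "CCC", "c1ccccc1"]
print(find_misaligned_rows(toy_test, toy_csv))  # row 1 differs
```

In this notebook the call would look like `find_misaligned_rows([test[i] for i in range(len(test))], df["SMILES"].tolist())`. The comparison is purely textual, so two different spellings of the same molecule would be reported as a mismatch.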


In [14]:
competition.target_cols

{'HLM', 'KSOL', 'LogD', 'MDR1-MDCKII', 'MLM'}

In [15]:
test.target_cols

['LogD', 'MDR1-MDCKII', 'HLM', 'MLM', 'KSOL']


An interesting idea would be to build a multi-task model to leverage shared information across tasks.

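For the sake of simplicity, however, we treat each target separately here. The per-target predictions were generated beforehand and saved to `antiviral_admet_predictions.csv` (loaded above as `df`); the loop below maps each CSV column to its competition target.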
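Handling each target on its own means that a model for a given target can only learn from molecules with a measured value for it, and as `train[0]` shows, many labels are `nan`. The sketch below collects the usable rows for one target. It assumes records shaped like `train[i]`, that is a `(smiles, labels)` tuple; `per_target_data` is a hypothetical helper, and the second toy record is invented for illustration.

```python
import math
from typing import Dict, List, Optional, Sequence, Tuple

Record = Tuple[str, Dict[str, Optional[float]]]


def per_target_data(records: Sequence[Record], target: str) -> Tuple[List[str], List[float]]:
    """Collect the SMILES and values for one target, skipping missing labels."""
    smiles: List[str] = []
    values: List[float] = []
    for smi, labels in records:
        value = labels.get(target)
        # Treat both an absent key and NaN as "not measured".
        if value is None or math.isnan(value):
            continue
        smiles.append(smi)
        values.append(value)
    return smiles, values


nan = math.nan
toy_train = [
    # Same shape as train[0] above.
    ("COC1=CC=CC(Cl)=C1NC(=O)N1CCC[C@H](C(N)=O)C1 |a:16|",
     {"LogD": 0.3, "MDR1-MDCKII": 2.0, "HLM": nan, "MLM": nan, "KSOL": nan}),
    # Invented record, for illustration only.
    ("CCO", {"LogD": nan, "MDR1-MDCKII": nan, "HLM": 10.0, "MLM": nan, "KSOL": nan}),
]

for tgt in ["LogD", "HLM", "MLM"]:
    smi, vals = per_target_data(toy_train, tgt)
    print(tgt, len(smi), vals)
```

On the real data the same function could be called as `per_target_data([train[i] for i in range(len(train))], "LogD")`, and the two returned lists would feed whatever per-target model is being trained.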

In [18]:
# Map each competition target to the matching column in the predictions CSV.
TARGET_TO_COLUMN = {
    "HLM": "in-vitro_HLM_bienta: CLint (Num) (uL/min/mg)",
    "KSOL": "in-vitro_KSOL-PBS_bienta: mean_solubility (Num) (uM)",
    "LogD": "in-vitro_LogD_bienta: LogD (Num)",
    "MDR1-MDCKII": "in-vitro_MDR1-MDCKII-Papp_bienta: mean_Papp_A_to_B (Num) (10^-6 cm/s)",
    "MLM": "in-vitro_MLM_bienta: CLint (Num) (uL/min/mg)",
}

y_pred = {}
for tgt in competition.target_cols:
    # A KeyError here means the competition lists a target with no CSV column.
    y_pred[tgt] = df[TARGET_TO_COLUMN[tgt]].values

# Check the output
print(y_pred)


{'LogD': array([2.90311869, 3.23878179, 2.82150548, 3.77065468, 2.75759425,
       2.44502287, 2.31011526, 3.22961067, 2.63879609, 1.86546174,
       1.38034722, 3.25016698, 1.52752823, 2.77672298, 2.03518505,
       2.68948779, 3.182039  , 2.83074764, 1.61751372, 2.65514812,
       1.52488646, 2.66826722, 3.20429655, 3.32395187, 2.99052032,
       3.2179227 , 2.71055993, 2.72601597, 2.64811014, 3.07920115,
       2.34668639, 2.76325072, 2.12797874, 2.71055993, 2.70768236,
       2.54780705, 2.5611781 , 2.57417965, 3.25543723, 2.56659812,
       2.60731789, 2.87407159, 2.78933192, 2.36728634, 2.60560618,
       2.84286676, 2.80680726, 2.80680726, 2.43537982, 3.10623497,
       3.22278045, 2.63859917, 2.89409214, 2.7862296 , 2.52843503,
       2.90482387, 2.14712671, 2.59763526, 2.82003321, 2.45143712,
       3.1965778 , 2.92393472, 2.86444078, 2.4555389 , 2.81912168,
       1.01228738, 2.69156751, 2.71158931, 1.03106759, 0.96460586,
       1.46036412, 1.24580071, 0.61411928, 1.35960766

In [19]:
for tgt in competition.target_cols:
    print(tgt)

LogD
MDR1-MDCKII
HLM
MLM
KSOL
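Before submitting, it can help to confirm locally that `y_pred` has one entry per target and one value per test molecule. The sketch below is such a check; `check_predictions` is a hypothetical helper, and whether Polaris performs similar validation on its side is not something this notebook shows.

```python
import math
from typing import Dict, Iterable, List, Sequence


def check_predictions(
    y_pred: Dict[str, Sequence[float]],
    target_cols: Iterable[str],
    n_test: int,
) -> List[str]:
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems: List[str] = []
    for tgt in sorted(target_cols):  # sorted for a stable report order
        if tgt not in y_pred:
            problems.append(f"{tgt}: missing")
            continue
        values = list(y_pred[tgt])
        if len(values) != n_test:
            problems.append(f"{tgt}: expected {n_test} values, got {len(values)}")
        if any(math.isnan(float(v)) for v in values):
            problems.append(f"{tgt}: contains NaN")
    return problems


# Toy predictions for a pretend test set of 3 molecules.
toy_pred = {
    "LogD": [2.9, 3.2, 2.8],
    "HLM": [220.9, math.nan, 211.4],
    "KSOL": [254.4, 251.7],
}
for problem in check_predictions(toy_pred, {"LogD", "HLM", "KSOL", "MLM"}, n_test=3):
    print(problem)
```

In this notebook the call would be `check_predictions(y_pred, competition.target_cols, len(test))`, and an empty list indicates that every target has 126 finite values.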


## Submit your predictions
Submitting your predictions to the competition is simple.

In [21]:
competition.submit_predictions(
    predictions=y_pred,
    prediction_name="preliminary_predictions_admet",
    prediction_owner="wolberlab",
    report_url="https://github.com/talagayev/polaris_antiviral_challenge/blob/main/ADMET_MPNN/report.md", 
    # The below metadata is optional, but recommended.
    github_url="https://github.com/talagayev/polaris_antiviral_challenge",
    contributors=["talagayev", "ndoering99", "sijie-liu97"],
    description="Preliminary predictions for admet predictions with adjusted values for MDR",
    tags=["antiviral_admet"],
    user_attributes={"Framework": "Chemprop", "Method": "MPNN"}
)

For the ASAP competition, we will only evaluate your latest submission. 

The results will only be disclosed after the competition ends.

The End.