# Ligand ADMET and Potency (Property Prediction)

The [ADMET](https://polarishub.io/competitions/asap-discovery/antiviral-admet-2025) and [Potency](https://polarishub.io/competitions/asap-discovery/antiviral-potency-2025) Challenges of the [ASAP Discovery competition](https://polarishub.io/blog/antiviral-competition) take the shape of a property prediction task. Given the SMILES (or, to be more precise, the CXSMILES) of a molecule, you are asked to predict numerical properties of that molecule. This is a relatively straightforward application of ML, and this notebook will quickly get you up and running!

To begin with, choose one of the two challenges! The code will look the same for both. 

In [1]:
import polaris as po
from polaris.hub.client import PolarisHubClient
import csv



In [3]:
client = PolarisHubClient()
client.login(overwrite=True)



In [4]:
CHALLENGE = "antiviral-potency-2025"  # or: "antiviral-admet-2025"
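
Since the only difference between the two challenges is this one string, a typo in it is easy to miss. The following is a minimal sketch of a guard, assuming the two slugs taken from the competition URLs linked at the top; `VALID_CHALLENGES` and `competition_id` are hypothetical names and not part of Polaris.

```python
# Hypothetical helper, not part of the Polaris API.
# The two slugs are taken from the competition URLs linked in the introduction.
VALID_CHALLENGES = {"antiviral-admet-2025", "antiviral-potency-2025"}


def competition_id(challenge: str) -> str:
    """Return the 'owner/name' identifier, rejecting unknown challenge slugs."""
    if challenge not in VALID_CHALLENGES:
        raise ValueError(
            f"Unknown challenge {challenge!r}; expected one of {sorted(VALID_CHALLENGES)}"
        )
    return f"asap-discovery/{challenge}"


print(competition_id("antiviral-potency-2025"))
```

The returned string has the same form as the identifier passed to `po.load_competition` in the next section.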

## Load the competition

Let's first load the competition from Polaris.

Make sure you are logged in! If not, simply run `polaris login` and follow the instructions. 

In [5]:
import polaris as po

competition = po.load_competition(f"asap-discovery/{CHALLENGE}")

As suggested in the logs, we'll cache the dataset. Note that this is not strictly necessary, but it does speed up later steps.

In [6]:
competition.cache()

'C:\\Users\\talag\\AppData\\Local\\polaris\\polaris\\Cache\\datasets\\1afc91de-787c-4a11-90a6-1df588f238ee'

Let's get the train and test set and take a look at the data structure.

In [7]:
train, test = competition.get_train_test_split()

In [8]:
train[0]

('COC[C@]1(C)C(=O)N(C2=CN=CC3=CC=CC=C23)C(=O)N1C |&1:3|',
 {'pIC50 (SARS-CoV-2 Mpro)': np.float64(nan),
  'pIC50 (MERS-CoV Mpro)': np.float64(4.19)})
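
The record above has a `nan` for one of its two targets, so not every molecule is labelled for every target. A minimal sketch of how one might keep only the labelled molecules for a given target follows. It uses a small stand-in list instead of the real `train` object; the second record and the name `labelled_subset` are made up for illustration.

```python
import math

# Toy stand-in for the (smiles, targets) pairs obtained by indexing `train`.
# The first record is the one shown above; the second is invented for illustration.
records = [
    (
        "COC[C@]1(C)C(=O)N(C2=CN=CC3=CC=CC=C23)C(=O)N1C |&1:3|",
        {"pIC50 (SARS-CoV-2 Mpro)": float("nan"), "pIC50 (MERS-CoV Mpro)": 4.19},
    ),
    (
        "CCO",
        {"pIC50 (SARS-CoV-2 Mpro)": 5.0, "pIC50 (MERS-CoV Mpro)": float("nan")},
    ),
]


def labelled_subset(records, target):
    """Keep only the (smiles, value) pairs whose value for `target` is not NaN."""
    return [(smi, ys[target]) for smi, ys in records if not math.isnan(ys[target])]


mers = labelled_subset(records, "pIC50 (MERS-CoV Mpro)")
sars = labelled_subset(records, "pIC50 (SARS-CoV-2 Mpro)")
print(len(mers), len(sars))
```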

In [9]:
print(test)

<polaris.dataset._subset.Subset object at 0x000002ABA8D3F170>


In [10]:
test[0]

'C=CC(=O)NC1=CC=CC(N(CC2=CC=CC(Cl)=C2)C(=O)CC2=CN=CC3=CC=CC=C23)=C1'

In [11]:
import pandas as pd


df = pd.read_csv("antiviral_potency_predictions_2.csv")
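
The cells further down read columns named `SARS` and `MERS` from this file. As a rough sketch of that layout, the snippet below parses an in-memory CSV with the standard-library `csv` module instead of pandas. The numeric values and the helper name `read_prediction_columns` are made up; only the two column names come from this notebook.

```python
import csv
import io

# Made-up file contents; only the column names "SARS" and "MERS" come from the notebook.
csv_text = "SARS,MERS\n5.26,4.91\n6.04,5.37\n"


def read_prediction_columns(handle):
    """Read a predictions CSV into {column name: list of floats}."""
    reader = csv.DictReader(handle)
    columns = {name: [] for name in reader.fieldnames}
    for row in reader:
        for name, value in row.items():
            columns[name].append(float(value))
    return columns


columns = read_prediction_columns(io.StringIO(csv_text))
print(columns)
```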

In [12]:
competition.target_cols

{'pIC50 (MERS-CoV Mpro)', 'pIC50 (SARS-CoV-2 Mpro)'}

In [13]:
test.target_cols

['pIC50 (SARS-CoV-2 Mpro)', 'pIC50 (MERS-CoV Mpro)']

An interesting idea would be to build a multi-task model to leverage shared information across tasks.

For the sake of simplicity, however, we'll simply use one model per target here. The cells below take the per-target predictions from the CSV file loaded above.
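
To illustrate the one-model-per-target pattern, here is a minimal sketch that fits a separate placeholder model for each target and collects the predictions in a dictionary keyed by target name. `MeanBaseline` is a hypothetical stand-in that always predicts the training mean; it is not the method used for the submission, and all training values are made up.

```python
from statistics import mean


class MeanBaseline:
    """Placeholder regressor: predicts the training mean for every molecule."""

    def fit(self, smiles, values):
        self.mean_ = mean(values)
        return self

    def predict(self, smiles):
        return [self.mean_ for _ in smiles]


# Made-up training data, already split per target with unlabelled molecules removed.
train_by_target = {
    "pIC50 (SARS-CoV-2 Mpro)": (["CCO", "CCN"], [5.0, 6.0]),
    "pIC50 (MERS-CoV Mpro)": (["CCO", "CCC", "CCN"], [4.0, 4.5, 5.0]),
}
test_smiles = ["c1ccccc1", "CC(=O)O"]

# One independent model per target.
models = {t: MeanBaseline().fit(s, y) for t, (s, y) in train_by_target.items()}
y_pred_sketch = {t: m.predict(test_smiles) for t, m in models.items()}
print(y_pred_sketch)
```

Any regressor with a `fit`/`predict` interface could take the place of `MeanBaseline` in the same loop.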

In [72]:
for tgt in competition.target_cols:
    print(tgt)

pIC50 (SARS-CoV-2 Mpro)
pIC50 (MERS-CoV Mpro)


In [14]:
# Map each competition target to the matching column of the predictions CSV.
column_for_target = {
    "pIC50 (SARS-CoV-2 Mpro)": "SARS",
    "pIC50 (MERS-CoV Mpro)": "MERS",
}

# An unexpected target name raises a KeyError instead of being skipped silently.
y_pred = {tgt: df[column_for_target[tgt]].values for tgt in competition.target_cols}

# Check the output
print(y_pred)


{'pIC50 (SARS-CoV-2 Mpro)': array([5.2606316, 6.042621 , 5.5060997, 5.7655044, 6.3580527, 6.635112 ,
       5.8555408, 6.781377 , 6.5181847, 6.5181847, 6.125652 , 7.2176757,
       6.455889 , 7.156018 , 5.78455  , 6.902244 , 6.653148 , 6.085179 ,
       4.744858 , 6.31968  , 7.1286716, 7.4047213, 6.65365  , 8.3577795,
       5.84428  , 6.5421844, 6.1597114, 7.3859096, 5.9935613, 4.979324 ,
       6.635112 , 6.978636 , 6.6643333, 6.332336 , 6.65365  , 6.332336 ,
       6.615716 , 6.313798 , 6.635112 , 6.514045 , 6.332336 , 6.332336 ,
       7.0407805, 7.0407805, 7.5038323, 7.5038323, 5.9526405, 5.9526405,
       6.927808 , 6.927808 , 6.6376767, 6.635112 , 6.8372865, 6.65365  ,
       5.7378697, 6.5267005, 6.332336 , 6.313798 , 6.8294764, 6.509946 ,
       6.4625707, 6.550733 , 6.550733 , 6.65365  , 7.6429057, 6.176478 ,
       7.365069 , 7.685716 , 6.635112 , 6.987118 , 6.332336 , 7.427582 ,
       7.427582 , 6.332336 , 6.65365  , 6.2001977, 6.2001977, 7.697893 ,
       6.65365  , 7.126
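
Before submitting, it can help to sanity-check the prediction dictionary. The sketch below assumes that a submission needs exactly one finite value per test molecule for every target column; that requirement is an assumption and is not taken from the Polaris documentation. The helper `check_predictions` is hypothetical.

```python
import math


def check_predictions(y_pred, target_cols, n_test):
    """Return a list of problems; an empty list means no issue was found."""
    problems = []
    missing = set(target_cols) - set(y_pred)
    extra = set(y_pred) - set(target_cols)
    if missing:
        problems.append(f"missing targets: {sorted(missing)}")
    if extra:
        problems.append(f"unexpected targets: {sorted(extra)}")
    for tgt, values in y_pred.items():
        values = list(values)
        if len(values) != n_test:
            problems.append(f"{tgt}: expected {n_test} values, got {len(values)}")
        if any(math.isnan(float(v)) for v in values):
            problems.append(f"{tgt}: contains NaN")
    return problems


# Made-up example with two test molecules.
targets = ["pIC50 (SARS-CoV-2 Mpro)", "pIC50 (MERS-CoV Mpro)"]
good = {targets[0]: [5.2, 6.0], targets[1]: [4.9, 5.4]}
bad = {targets[0]: [5.2, float("nan")]}
print(check_predictions(good, targets, n_test=2))
print(check_predictions(bad, targets, n_test=2))
```

In the notebook this could be called with `y_pred`, `competition.target_cols`, and `len(test)`, assuming the test subset supports `len`.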

## Submit your predictions
Submitting your predictions to the competition is simple.

In [15]:
competition.submit_predictions(
    predictions=y_pred,
    prediction_name="preliminary_predictions",
    prediction_owner="wolberlab",
    report_url="https://github.com/talagayev/polaris_antiviral_challenge/blob/main/Preliminary_prediction_protocol_2nd_submission.md", 
    # The below metadata is optional, but recommended.
    github_url="https://github.com/talagayev/polaris_antiviral_challenge",
    contributors=["talagayev", "ndoering99", "sijie-liu97"],
    description="Preliminary predictions for the potency for the 2nd intermediate leaderboard",
    tags=["antiviral_potency"],
    user_attributes={"Framework": "Scikit-learn", "Method": "XGBoost"}
)

For the ASAP competition, we will only evaluate your latest submission. 

The results will only be disclosed after the competition ends.

The End.