## Designing improved inhibitors for the Zika virus NS2B–NS3 protease (PDB: 7I9O) 🦟❌

In this notebook we will act as an early-stage antiviral design team.
Our biological target is the **Zika virus NS2B–NS3 protease**, a serine protease formed by the NS3 catalytic domain together with its NS2B cofactor. This protease is essential for ZIKV replication because it cleaves the viral polyprotein into the individual structural and non-structural proteins the virus needs to assemble and replicate, which makes it a high-value antiviral target. ([Nature][1])
We will use the experimentally solved crystal structure **7I9O**, which captures the ZIKV NS2B–NS3 protease bound to a small-molecule inhibitor. ([rcsb.org][2])

You are given a **core scaffold** derived from a weak hit in that pocket. The atoms in the core are “locked”: we assume they are important for binding. Certain positions on the scaffold are marked with `*`. Those `*` positions are attachment points where we are allowed to grow new **R-groups** to make the molecule bind better and look more like a viable lead.
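
To make the `*` convention concrete, here is a minimal sketch that counts the attachment points in a scaffold SMILES string. The scaffold shown is a made-up placeholder, not the scaffold supplied with this notebook, and `count_attachment_points` is a hypothetical helper. It relies on plain string matching rather than a cheminformatics toolkit, so treat it as an illustration only.

```python
import re

# Hypothetical placeholder scaffold (NOT the real scaffold used for 7I9O):
# a benzamide-like core with two open positions marked by `*`.
EXAMPLE_SCAFFOLD = "*c1ccc(C(=O)N*)cc1"

# Matches a bare `*` as well as bracketed or labelled forms such as [*] or [*:1].
ATTACHMENT_PATTERN = re.compile(r"\[\*(?::\d+)?\]|\*")


def count_attachment_points(smiles: str) -> int:
    """Return the number of attachment points marked in a scaffold SMILES."""
    return len(ATTACHMENT_PATTERN.findall(smiles))


n_sites = count_attachment_points(EXAMPLE_SCAFFOLD)
print(f"{n_sites} attachment point(s) available for R-groups")
```

Each match is one site where the decorator may attach an R-group; every other atom in the string belongs to the locked core.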

We will use **LibINVENT**, a scaffold-decorator model, to explore these R-groups. LibINVENT proposes substituents for the `*` sites, and we will steer it with reinforcement learning. The key lever you control is the **scoring function** in `zika.toml`: you define what “good” means (for example, reasonable molecular weight, acceptable physicochemical properties, no obvious liabilities), and LibINVENT will try to generate molecules that satisfy that profile.
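
The idea of a scoring function can be sketched in plain Python: each property is mapped to a desirability in [0, 1], and the per-property values are combined into a single number. This is only an illustration of the concept, not LibINVENT's code and not the syntax of `zika.toml`. The property names, ranges, weights, and the helper functions `window_score` and `weighted_geometric_mean` are all assumptions chosen for the example.

```python
import math


def window_score(value: float, low: float, high: float, margin: float) -> float:
    """1.0 inside [low, high], falling linearly to 0.0 over `margin` outside it."""
    if low <= value <= high:
        return 1.0
    distance = (low - value) if value < low else (value - high)
    return max(0.0, 1.0 - distance / margin)


def weighted_geometric_mean(scores, weights) -> float:
    """Combine scores in [0, 1]; any zero component zeroes the total."""
    if any(s <= 0.0 for s in scores):
        return 0.0
    log_sum = sum(w * math.log(s) for s, w in zip(scores, weights))
    return math.exp(log_sum / sum(weights))


# Hypothetical property values for one generated molecule (not real data).
molecule = {"mol_weight": 520.0, "logp": 3.1, "h_bond_donors": 2}

# Hypothetical desirability profile: (low, high, margin, weight) per property.
profile = {
    "mol_weight": (250.0, 500.0, 100.0, 1.0),
    "logp": (1.0, 4.0, 2.0, 1.0),
    "h_bond_donors": (0.0, 5.0, 3.0, 0.5),
}

component_scores = []
weights = []
for name, (low, high, margin, weight) in profile.items():
    component_scores.append(window_score(molecule[name], low, high, margin))
    weights.append(weight)

total = weighted_geometric_mean(component_scores, weights)
print(f"total score = {total:.3f}")
```

A geometric mean is used here so that a molecule failing one criterion outright cannot compensate with high scores elsewhere; an arithmetic mean would be more forgiving.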

Your workflow in this notebook follows the same flow as the previous one, with many places where you can tweak the settings:

1. **Define a scoring function**
   Edit the TOML so that high score = “this looks like a plausible NS2B–NS3 protease inhibitor and a drug-like small molecule”.
2. **Generate candidates with LibINVENT**
   Run RL to sample decorated molecules starting from the given scaffold.
3. **Triage / down-select**
   You cannot dock thousands of structures. You should prioritise and filter the generated molecules (chemistry sanity, diversity, properties) and choose at most ~100 molecules that are worth docking into 7I9O.
4. **Nominate synthesis candidates**
   From the docked / prioritised set, choose your final **top 10 compounds**. These 10 are the ones you would hand to medicinal chemistry as proposed “next-step” lead ideas against the Zika NS2B–NS3 protease.
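
The triage step (step 3) can be sketched as a small filter-and-rank routine. The example below assumes each generated molecule already carries a total score and a precomputed molecular weight; the `Candidate` record, the `triage` helper, the 550 Da cut-off, and the sample records are all hypothetical. It deduplicates by exact SMILES string, which is naive (a real pipeline would canonicalise first), and it does not address diversity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    smiles: str
    score: float       # total score from the generative run
    mol_weight: float  # precomputed property, assumed to be available


def triage(candidates, max_mol_weight=550.0, limit=100):
    """Deduplicate, apply a property cut-off, and keep the best `limit`."""
    best_by_smiles = {}
    for cand in candidates:
        if cand.mol_weight > max_mol_weight:
            continue  # fails the property filter
        kept = best_by_smiles.get(cand.smiles)
        if kept is None or cand.score > kept.score:
            best_by_smiles[cand.smiles] = cand
    ranked = sorted(best_by_smiles.values(), key=lambda c: c.score, reverse=True)
    return ranked[:limit]


# Made-up records for illustration only (not output from a real run).
generated = [
    Candidate("CCO", 0.42, 46.1),
    Candidate("CCO", 0.55, 46.1),        # duplicate SMILES, higher score wins
    Candidate("c1ccccc1O", 0.71, 94.1),
    Candidate("C" * 42, 0.90, 591.2),    # high score but too heavy
]

for_docking = triage(generated, limit=100)
print([c.smiles for c in for_docking])
```

The `limit` argument enforces the cap of roughly 100 molecules for docking; the final top 10 would then be chosen from the docked results rather than from this ranking alone.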

[1]: https://www.nature.com/articles/s41467-025-63602-z?utm_source=chatgpt.com "Combined crystallographic fragment screening and deep ..."
[2]: https://www.rcsb.org/structure/7i9o?utm_source=chatgpt.com "7I9O: Group deposition of ZIKV NS2B-NS3 protease in ..."


In [None]:
from pathlib import Path
import os
import shlex
import subprocess

# make sure output dir exists
Path("output").mkdir(exist_ok=True)

CONFIG_PATH = Path("config/zika.toml").resolve()
LOG_PATH    = Path("output/zika.log").resolve()