# 🧠 ADC Conjugation Platform Recommender

Welcome to the ADC (Antibody-Drug Conjugate) Conjugation Platform Recommender!  
This notebook provides a simple, interactive way to predict the **most suitable conjugation platform** for your ADC candidate based on key biophysical and manufacturability parameters.

It uses a pretrained machine learning model to make predictions and is powered by a user-friendly Gradio interface. It also generates a PDF report that records the recommendation and the input parameters behind it.


---

## 📌 How to Use This Notebook

### 🔧 Step 1: Install Required Libraries
The first code cell installs Gradio and fpdf with `pip` and imports all required dependencies.

### 📂 Step 2: Upload the Trained Model
Upload the pretrained model file named `rf_adc_pipeline.pkl` when prompted.  
> 💡 Make sure this file was exported using `joblib` from a compatible Python environment.
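
Colab's `files.upload()` returns a dictionary that maps each uploaded file name to its raw bytes. The sketch below is an illustration and not part of the notebook. It pulls the single model payload out of such a dictionary so the model can be loaded from memory rather than from a fixed file name. The helper name `single_model_payload` is hypothetical, and the upload dictionary is a stand-in.

```python
import io


def single_model_payload(uploaded: dict) -> bytes:
    """Return the bytes of the one uploaded .pkl file (hypothetical helper)."""
    pkl_items = [(name, data) for name, data in uploaded.items()
                 if name.lower().endswith(".pkl")]
    if len(pkl_items) != 1:
        raise ValueError(f"Expected exactly one .pkl upload, got {len(pkl_items)}")
    return pkl_items[0][1]


# Stand-in for the dict returned by files.upload(); the bytes are a placeholder.
fake_upload = {"rf_adc_pipeline.pkl": b"placeholder-bytes"}
payload = single_model_payload(fake_upload)
buffer = io.BytesIO(payload)  # in Colab: model_pipeline = joblib.load(buffer)
```

Loading from the uploaded bytes may help when Colab saves a repeated upload under a different name, as in the log line `Saving rf_adc_pipeline.pkl to rf_adc_pipeline (2).pkl`.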

### 🧪 Step 3: Enter Input Parameters
Once the interface loads, you’ll be able to enter values for the following properties of your ADC:

### 📊 Feature Definitions & Their Role in Platform Selection

Below is a short explanation of each input feature and how it relates to ADC conjugation platform recommendations:

### 🧬 Technology_Category (0–2)
Represents the high-level conjugation technology class (e.g., traditional, site-specific, or novel chemistry). This influences compatibility with payloads, scalability, and clinical maturity.

### ⚖️ DAR_Mean (Drug-to-Antibody Ratio - Mean)
The average number of drug molecules attached per antibody. DAR affects potency and toxicity, and it constrains platform choice because some platforms only tolerate certain DAR ranges.

### 📈 DAR_Std / DAR_CV (Standard Deviation / Coefficient of Variation)
Measures of DAR heterogeneity. Lower variability generally suggests better control and product consistency, which is preferred in regulatory and manufacturing contexts.
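
As a small worked example, the three DAR inputs can be derived from repeated DAR measurements. The sketch below assumes the variability is measured across batches, which the notebook does not state, and uses made-up values. The coefficient of variation is the standard deviation divided by the mean.

```python
import statistics

# Hypothetical DAR measurements from five batches (illustrative values only).
batch_dar = [3.8, 4.0, 4.1, 3.9, 4.2]

dar_mean = statistics.mean(batch_dar)
dar_std = statistics.stdev(batch_dar)   # sample standard deviation
dar_cv = dar_std / dar_mean             # coefficient of variation (unitless)

print(f"DAR_Mean={dar_mean:.2f}  DAR_Std={dar_std:.3f}  DAR_CV={dar_cv:.3f}")
```

Because `DAR_CV` follows from the other two values, entering all three consistently avoids giving the model a combination it is unlikely to have seen.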

### 🔐 Stability_Score (0–10)
Represents the biochemical and storage stability of the ADC. Unstable conjugates may require specific linker chemistries or more robust platforms.

### 🧪 Expression_Ease (0.0–1.0)
Indicates how easily the antibody or engineered site can be expressed and produced. Higher ease favors platforms requiring genetic modifications or engineered sites.

### 🧬 Homogeneity (0.0–1.0)
Assesses structural uniformity of the final ADC product. Homogeneous products are more favorable in clinical and regulatory settings and are often associated with advanced conjugation platforms.

### 💰 Cost_Index (0.0–1.0)
Normalized estimate of platform-associated cost. Lower-cost platforms may be prioritized for early-stage programs or when budget is a key constraint.

### ⚠️ CMC_Risk (0 = Low, 1 = Medium, 2 = High)
Represents the estimated Chemistry, Manufacturing, and Controls (CMC) risk. Higher risk may limit platform selection to more established technologies with proven scalability and quality profiles.

### ✅ Approved_Usage (0 = No, 1 = Yes)
Indicates whether the platform has already been used in approved ADC products. Regulatory precedent increases confidence and may influence platform selection for faster approval timelines.

### 🏢 Vendor (0, 1, 2)
Encodes the platform provider (e.g., CDMO or internal tech stack). Vendor choice can affect technical compatibility, support availability, and strategic alignment.

### 📈 Scalability (0.0–1.0)
Represents how easily the platform can be scaled up for manufacturing. Crucial for late-stage or commercial ADC programs.

### ⏱️ Latency_to_Clinic_yrs
Estimated time (in years) to bring a candidate using this platform into clinical trials. Influences decision-making in fast-track or competitive pipeline strategies.

---

Each of these features plays a role in how suitable or viable a given conjugation platform is for your ADC candidate, considering both scientific and development constraints.
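
One way to picture the full input is as a single record with thirteen fields. The sketch below builds such a record in plain Python. It stores the four coded features (`Technology_Category`, `CMC_Risk`, `Approved_Usage`, `Vendor`) as strings, because the notebook's prediction code casts them with `astype(str)` before calling the model. The helper `build_record`, the label dictionaries, and all values are illustrative assumptions.

```python
# Label-to-code maps taken from the feature headings above.
CMC_RISK = {"Low": 0, "Medium": 1, "High": 2}
APPROVED_USAGE = {"No": 0, "Yes": 1}

CATEGORICAL = ("Technology_Category", "CMC_Risk", "Approved_Usage", "Vendor")


def build_record(**features):
    """Return a copy of `features` with categorical codes stored as strings."""
    record = dict(features)
    for name in CATEGORICAL:
        record[name] = str(int(record[name]))
    return record


# Made-up candidate; every value is for illustration only.
record = build_record(
    Technology_Category=1, DAR_Mean=3.8, DAR_Std=0.4, DAR_CV=0.1,
    Stability_Score=7.5, Expression_Ease=0.6, Homogeneity=0.8,
    Cost_Index=0.4, CMC_Risk=CMC_RISK["Low"],
    Approved_Usage=APPROVED_USAGE["Yes"], Vendor=2,
    Scalability=0.7, Latency_to_Clinic_yrs=3.0,
)
```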


Use the sliders, dropdowns, and radio buttons to set these parameters.

### 🚀 Step 4: View Recommendation
After you enter the values and click **Submit**, the model will output a **recommended conjugation platform** such as:

> ✅ Recommended ADC Conjugation Platform: **Lysine-Based**

### 📄 Step 5: Download the Report

- Each prediction also writes a PDF, `adc_platform_report.pdf`, listing the input parameters and the recommended platform (e.g., “Engineered Cys” or “Lysine-Based”).
- Run the final cell to download the report:
  
```python
from google.colab import files
files.download("adc_platform_report.pdf")
```
---

## 🛠️ Notes
- If you encounter an error, make sure all inputs are filled in and the correct model file is uploaded.
- This model was trained on historical ADC development data and is meant for research and educational purposes only.
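
The first note can be turned into a quick pre-flight check. The sketch below flags inputs that are missing (`None`) or outside the ranges offered by the interface. The `RANGES` table copies the slider and choice bounds from the code cell, and the function name `check_inputs` is hypothetical.

```python
# (min, max) bounds copied from the notebook's slider and choice definitions.
RANGES = {
    "Technology_Category": (0, 2),
    "DAR_Mean": (2.0, 4.5),
    "DAR_Std": (0.1, 1.0),
    "DAR_CV": (0.01, 0.5),
    "Stability_Score": (0.0, 10.0),
    "Expression_Ease": (0.0, 1.0),
    "Homogeneity": (0.0, 1.0),
    "Cost_Index": (0.0, 1.0),
    "CMC_Risk": (0, 2),
    "Approved_Usage": (0, 1),
    "Vendor": (0, 2),
    "Scalability": (0.0, 1.0),
    "Latency_to_Clinic_yrs": (0.0, 10.0),
}


def check_inputs(values: dict) -> list:
    """Return human-readable problems; an empty list means the inputs look fine."""
    problems = []
    for name, (low, high) in RANGES.items():
        value = values.get(name)
        if value is None:
            problems.append(f"{name} is missing")
        elif not low <= float(value) <= high:
            problems.append(f"{name}={value} is outside [{low}, {high}]")
    return problems


# Start from the lower bound of every field, then break two of them.
example = {name: low for name, (low, _) in RANGES.items()}
example["CMC_Risk"] = None   # stands in for a choice left unselected
example["DAR_Mean"] = 6.0    # above the slider maximum
print(check_inputs(example))
```

Running a check like this before calling the model gives a clearer message than the generic error string returned by the prediction function.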

---

Let’s get started! 👇


In [3]:
!pip install gradio fpdf --quiet

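import datetime
import io

import gradio as gr
import joblib
import pandas as pd
from fpdf import FPDF
from google.colab import files

# Upload model. files.upload() returns {file name: bytes}; loading from the
# bytes avoids a stale or renamed copy (e.g. "rf_adc_pipeline (2).pkl") on disk.
print("📤 Please upload 'rf_adc_pipeline.pkl'")
uploaded = files.upload()
model_bytes = next(iter(uploaded.values()))
model_pipeline = joblib.load(io.BytesIO(model_bytes))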

# Prediction + PDF generation
def predict_and_report(tech_cat, dar_mean, dar_std, dar_cv,
                       stability, expression, homogeneity,
                       cost, cmc_risk, usage, vendor,
                       scalability, latency):

    try:
        # Coded features are cast to int so they stringify as "0"/"1"/"2",
        # never "1.0"; continuous features are cast to float.
        input_dict = {
            'Technology_Category': int(tech_cat),
            'DAR_Mean': float(dar_mean),
            'DAR_Std': float(dar_std),
            'DAR_CV': float(dar_cv),
            'Stability_Score': float(stability),
            'Expression_Ease': float(expression),
            'Homogeneity': float(homogeneity),
            'Cost_Index': float(cost),
            'CMC_Risk': int(cmc_risk),
            'Approved_Usage': int(usage),
            'Vendor': int(vendor),
            'Scalability': float(scalability),
            'Latency_to_Clinic_yrs': float(latency)
        }

        X_new = pd.DataFrame([input_dict])
        for col in ['Technology_Category', 'CMC_Risk', 'Approved_Usage', 'Vendor']:
            X_new[col] = X_new[col].astype(str)

        pred = model_pipeline.predict(X_new)[0]

        # PDF report generation
        pdf = FPDF()
        pdf.add_page()
        pdf.set_font("Arial", 'B', 16)
        pdf.cell(200, 10, txt="ADC Conjugation Platform Report", ln=True, align='C')

        pdf.set_font("Arial", size=12)
        pdf.ln(10)
        pdf.cell(200, 10, txt=f"Generated on: {datetime.datetime.now().strftime('%Y-%m-%d %H:%M:%S')}", ln=True)

        pdf.set_font("Arial", 'B', 14)
        pdf.set_text_color(0, 0, 128)
        pdf.ln(10)
        pdf.cell(200, 10, txt="Input Parameters:", ln=True)
        pdf.set_font("Arial", size=12)
        pdf.set_text_color(0, 0, 0)
        for key, val in input_dict.items():
            pdf.cell(200, 8, txt=f"{key}: {val}", ln=True)

        pdf.set_font("Arial", 'B', 14)
        pdf.set_text_color(0, 100, 0)
        pdf.ln(10)
        pdf.cell(200, 10, txt=f"Recommended Platform: {pred}", ln=True)

        pdf.output("adc_platform_report.pdf")

        return f"Recommended Platform: {pred}\n📄 PDF report generated successfully. Run the next cell to download it."

    except Exception as e:
        return f"❌ Error: {str(e)}"

# Gradio UI
gr.Interface(
    fn=predict_and_report,
    inputs=[
        gr.Slider(0, 2, step=1, label="Technology_Category (0–2)"),
        gr.Slider(2.0, 4.5, step=0.1, label="DAR_Mean"),
        gr.Slider(0.1, 1.0, step=0.01, label="DAR_Std"),
        gr.Slider(0.01, 0.5, step=0.01, label="DAR_CV"),
        gr.Slider(0.0, 10.0, step=0.1, label="Stability_Score"),
        gr.Slider(0.0, 1.0, step=0.05, label="Expression_Ease"),
        gr.Slider(0.0, 1.0, step=0.05, label="Homogeneity"),
        gr.Slider(0.0, 1.0, step=0.05, label="Cost_Index"),
        gr.Dropdown(choices=[0, 1, 2], label="CMC_Risk (0=Low, 1=Medium, 2=High)"),
        gr.Radio(choices=[0, 1], label="Approved_Usage (0=No, 1=Yes)"),
        gr.Dropdown(choices=[0, 1, 2], label="Vendor (0–2)"),
        gr.Slider(0.0, 1.0, step=0.05, label="Scalability"),
        gr.Slider(0.0, 10.0, step=0.1, label="Latency_to_Clinic_yrs")
    ],
    outputs="text",
    title="🔬 ADC Platform Recommender",
    description="Enter ADC conjugation parameters to predict the optimal platform and generate a downloadable report."
).launch()

# Download button (run after prediction)
print("\n📎 After prediction, run the following cell to download the report:")
print("files.download('adc_platform_report.pdf')")


📤 Please upload 'rf_adc_pipeline.pkl'


Saving rf_adc_pipeline.pkl to rf_adc_pipeline (2).pkl
It looks like you are running Gradio on a hosted a Jupyter notebook. For the Gradio app to work, sharing must be enabled. Automatically setting `share=True` (you can turn this off by setting `share=False` in `launch()` explicitly).

Colab notebook detected. To show errors in colab notebook, set debug=True in launch()
* Running on public URL: https://ae9b2dd5c9dafa8a64.gradio.live

This share link expires in 1 week. For free permanent hosting and GPU upgrades, run `gradio deploy` from the terminal in the working directory to deploy to Hugging Face Spaces (https://huggingface.co/spaces)



📎 After prediction, run the following cell to download the report:
files.download('adc_platform_report.pdf')


In [4]:
from google.colab import files
files.download("adc_platform_report.pdf")
