# Syk Inhibitor IC50 Predictor: A SMILES-based Tool for Drug Discovery

This notebook provides an interactive tool for predicting the $IC_{50}$ values of potential spleen tyrosine kinase (Syk) inhibitors from their SMILES strings.

Instructions:
1. Run all cells in this notebook before using the prediction tool.
2. Ensure all required libraries are installed.
3. Make sure the model file `stacking_regressor.joblib` is at the location set in `model_path`.
4. Use the input field to enter SMILES strings and predict $IC_{50}$ values.
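Steps 2 and 3 can be checked before running the rest of the notebook. The following is a minimal sketch that uses only the standard library; the helper names `missing_modules` and `model_file_exists` are hypothetical and not part of the notebook, and the module list is read off the import cell (`sklearn` is the import name of scikit-learn).

```python
from importlib.util import find_spec
from pathlib import Path

# Top-level import names used by this notebook.
REQUIRED_MODULES = ["pandas", "numpy", "rdkit", "sklearn", "ipywidgets", "joblib"]


def missing_modules(names):
    """Return the module names that cannot be found in the current environment."""
    return [name for name in names if find_spec(name) is None]


def model_file_exists(path):
    """Return True if the given path points to an existing file."""
    return Path(path).is_file()


missing = missing_modules(REQUIRED_MODULES)
if missing:
    print("Missing libraries:", ", ".join(missing))
else:
    print("All required libraries were found.")
```

Once `model_path` has been defined, `model_file_exists(model_path)` gives a quick check that the model file is where the notebook expects it.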

In [3]:
# Import necessary libraries
import pandas as pd
import numpy as np
from rdkit import Chem
from rdkit.Chem import AllChem
from rdkit.DataStructs.cDataStructs import ConvertToNumpyArray
from rdkit import RDLogger
RDLogger.DisableLog('rdApp.*')

from sklearn.base import BaseEstimator, TransformerMixin

import ipywidgets as widgets
from IPython.display import display, HTML

import joblib

In [4]:
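class pIC_predictor_smiles(BaseEstimator, TransformerMixin):
    """Predict pIC50 for a single SMILES string from its Morgan fingerprint."""

    # get_fingerprint previously hard-coded radius 3 and 2048 bits and ignored
    # these parameters; the defaults below keep that behavior unchanged.
    def __init__(self, predictor_path, radius=3, fpSize=2048):
        self.predictor = joblib.load(predictor_path)
        self.radius = radius
        self.fpSize = fpSize

    def fit(self, X, y=None):
        # Nothing to fit: the regressor is loaded already trained.
        return self

    def transform(self, smiles):
        if not isinstance(smiles, str):
            raise ValueError("Input must be a single SMILES string")

        fingerprint = self.get_fingerprint(smiles)

        fingerprint_df = pd.DataFrame([fingerprint],
                                      columns=[f'fingerprint_{i}' for i in range(self.fpSize)])

        pIC50 = self.predictor.predict(fingerprint_df)

        return pIC50[0]

    def get_fingerprint(self, smiles):
        mol = Chem.MolFromSmiles(smiles)
        if mol is None:
            # RDKit returns None for SMILES it cannot parse.
            raise ValueError(f"Could not parse SMILES: {smiles!r}")
        fp = AllChem.GetMorganFingerprintAsBitVect(mol, self.radius, self.fpSize)
        # ConvertToNumpyArray resizes the target array to the fingerprint length.
        fp_array = np.zeros((0,), dtype=np.int8)
        ConvertToNumpyArray(fp, fp_array)
        return fp_array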

In [5]:
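def predict_IC50(smiles):
    """Return the predicted IC50 in nM for one SMILES string."""
    # model_path is a global set in a later cell; the model is reloaded on every call.
    bioactivity_predictor = pIC_predictor_smiles(model_path)
    pIC50 = bioactivity_predictor.transform(smiles)
    # pIC50 = -log10(IC50 in mol/L), so dividing by 1e-9 gives IC50 in nM.
    IC50 = 10**(-pIC50)/(10**(-9))

    return IC50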
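The unit conversion inside `predict_IC50` can be looked at on its own. Since $pIC_{50} = -\log_{10}(IC_{50}\,[\mathrm{M}])$, dividing by $10^{-9}$ gives $IC_{50}\,[\mathrm{nM}] = 10^{\,9 - pIC_{50}}$. The sketch below assumes the model output is a molar pIC50, which the formula above implies; the function name `pic50_to_ic50_nm` is hypothetical.

```python
import math


def pic50_to_ic50_nm(pic50):
    """Convert a pIC50 (molar scale) to an IC50 in nanomolar."""
    return 10 ** (9.0 - pic50)


# Compare the compact form with the expression used in predict_IC50.
for pic50 in (6.0, 7.5, 9.0):
    original_form = 10 ** (-pic50) / (10 ** (-9))
    assert math.isclose(pic50_to_ic50_nm(pic50), original_form)
    print(f"pIC50 = {pic50:.1f} -> IC50 = {pic50_to_ic50_nm(pic50):.2f} nM")
```

Each unit of pIC50 corresponds to a tenfold change in IC50, so a pIC50 of 9 is 1 nM and a pIC50 of 6 is 1000 nM.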

In [6]:
model_path = '/content/drive/MyDrive/статья/Models/stacking_regressor.joblib'

In [7]:
smiles_input = widgets.Text(
    value='',
    placeholder='Enter SMILES',
    description='SMILES:',
    disabled=False
)

predict_button = widgets.Button(description="Predict IC50")
output = widgets.Output()

In [8]:
def on_button_clicked(b):
    with output:
        output.clear_output()
        smiles = smiles_input.value
        if smiles:
            try:
                IC50 = predict_IC50(smiles)
                print(f"Predicted IC50 value: {IC50:.2f} nM")
            except Exception as e:
                print(f"There was an error: {str(e)}")
        else:
            print("Enter SMILES")

In [9]:
predict_button.on_click(on_button_clicked)

## Use of predictor

In the field below, you can enter the SMILES of any molecule to predict its IC50 against the Syk kinase.

Below are examples of known Syk inhibitors and their experimental IC50 values:


*   Fostamatinib (R788):

    `SMILES = COC1=CC(NC2=NC=C(F)C(NC3=NC4=C(OC(C)(C)C(=O)N4COP(O)(O)=O)C=C3)=N2)=CC(OC)=C1OC`
  
    Experimental IC50 = 41 nM

*   Entospletinib (GS-9973):

    `SMILES = C1CN(CCO1)C1=CC=C(NC2=NC(=CN3C=CN=C23)C2=CC3=C(C=NN3)C=C2)C=C1`
  
    Experimental IC50 = 7.6 nM


Try entering the SMILES of these or other molecules to evaluate their activity.
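One way to judge a prediction against the experimental values listed above is to compare on the logarithmic scale or as a fold difference. The sketch below is only an illustration: the helper names `ic50_nm_to_pic50` and `fold_error` are hypothetical, and the `predicted_nm` value is a placeholder to be replaced with the number the tool reports, not a model result.

```python
import math


def ic50_nm_to_pic50(ic50_nm):
    """Convert an IC50 in nanomolar to pIC50 (molar scale)."""
    if ic50_nm <= 0:
        raise ValueError("IC50 must be positive")
    return 9.0 - math.log10(ic50_nm)


def fold_error(predicted_nm, experimental_nm):
    """Return how many times the prediction is off, in either direction."""
    ratio = predicted_nm / experimental_nm
    return max(ratio, 1.0 / ratio)


# Experimental values quoted in the list above.
experimental_nm = {"Fostamatinib": 41.0, "Entospletinib": 7.6}

for name, value in experimental_nm.items():
    print(f"{name}: IC50 = {value} nM, pIC50 = {ic50_nm_to_pic50(value):.2f}")

# Placeholder only: substitute the value printed by the prediction widget.
predicted_nm = 82.0
print(f"Fold error vs. Fostamatinib: {fold_error(predicted_nm, 41.0):.1f}x")
```

A fold error close to 1 means the prediction and the measurement agree; the measure is symmetric, so a twofold overestimate and a twofold underestimate both give 2.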



In [10]:
display(HTML("<h3>Prediction of IC50 by SMILES</h3>"))
display(smiles_input, predict_button, output)
