# Inference

This notebook demonstrates the models available in fishbAIT and how to use them efficiently.

In [1]:
import torch
import pandas as pd
import numpy as np
from inference_utils.fishbAIT_for_inference import fishbAIT



Specify the model version and load the model

In [2]:
MODEL_VERSION = 'EC50'

In [3]:
fishbait = fishbAIT(model_version=MODEL_VERSION)
fishbait.load_fine_tuned_model()

Load the SMILES strings for which you want predictions

In [4]:
data = pd.read_excel('../data/Inference_example_2.xlsx')
data

Unnamed: 0,SMILES,cmpdname
0,CC(=O)Oc1ccccc1C(O)=O,Aspirin
1,[Cr],Chromium
2,[H+].[Cl-].CNCCC(Oc1ccc(cc1)C(F)(F)F)c2ccccc2,Fluoxetine hydrochloride
3,Clc1ccc(cc1)C(c2ccc(Cl)cc2)C(Cl)(Cl)Cl,Clofenotane
4,[Cu],Copper
...,...,...
995,[Pb++].[O-]c1c(cc(c([O-])c1[N+]([O-])=O)[N+]([...,Lead styphnate
996,CC(C)(C)C(O)(CCc1ccc(Cl)cc1)Cn2cncn2,Tebuconazole
997,[Na+].[Na+].[Na+].[Na+].OCCN(CCO)c1nc(Nc2ccc(c...,OpticalBrightenerBbu220
998,CNC.OC(=O)COc1ccc(Cl)cc1Cl,"2,4-D dimethylamine salt"
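Spreadsheet columns often contain blank cells, stray whitespace, or repeated entries. The following is a minimal sketch of a clean-up step that could be applied to the SMILES column before prediction. The helper name `clean_smiles` is hypothetical and not part of fishbAIT, and the function only tidies strings; it does not check that a SMILES string is chemically valid.

```python
import math


def clean_smiles(values):
    """Return stripped, non-empty, de-duplicated SMILES strings in original order.

    Hypothetical helper for illustration; it is not provided by fishbAIT.
    """
    seen = set()
    cleaned = []
    for value in values:
        # Skip missing cells: None, or the float NaN that pandas uses for empty cells.
        if value is None or (isinstance(value, float) and math.isnan(value)):
            continue
        smiles = str(value).strip()
        # Keep only the first occurrence of each non-empty string.
        if smiles and smiles not in seen:
            seen.add(smiles)
            cleaned.append(smiles)
    return cleaned


example = ["CC(=O)Oc1ccccc1C(O)=O", " [Cr] ", None, "", "[Cr]", float("nan")]
print(clean_smiles(example))
```

With the DataFrame loaded above, this could be used as `clean_smiles(data.SMILES.tolist())`.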


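Specify the endpoint and effect you wish to predict, then make the prediction. The endpoint must be one that the loaded model version supports; otherwise `predict_toxicity` raises a `RuntimeError`, as the `LOEC` request against the `EC50` model below shows.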

In [5]:
PREDICTION_ENDPOINT = 'LOEC'
PREDICTION_EFFECT = 'DVP'
EXPOSURE_DURATION = 96

In [6]:
fishbait.predict_toxicity(
    SMILES=data.SMILES.iloc[0:10].tolist(),
    exposure_duration=EXPOSURE_DURATION,
    endpoint=PREDICTION_ENDPOINT,
    effect=PREDICTION_EFFECT,
)

RuntimeError: You are trying to predict a `LOEC` endpoint with fishbAIT version EC50. 
            This will not work. Reload a correct version of fishbAIT (i.e. `EC50`, `EC10` or `EC50EC10`) or specify correct endpoint.
            For additional information call: __help__

If you run into trouble using this model, call the `__help__` function

In [5]:
fishbait.__help__()


        This is a python class used to load and use the fine-tuned deep-learning model `fishbAIT` for environmental toxicity predictions in fish.
        The models have been trained on a large corpus of SMILES (chemical representations) on data collected from various sources.

        Currently there are three models available for use.
        - `EC50` The EC50 model is trained on EC50 mortality (MOR) data and is thus suitable for the prediction of said endpoints.
        - `EC10` The EC10 model is trained on EC10/NOEC data with various effects (mortality, intoxication, development, reproduction, morphology, growth and population) ab. (MOR, ITX, DVP, REP, MPH, GRO, POP)
        - `EC50EC10` The EC50EC10 model is trained on both EC50, EC10 and NOEC data with various effects (mortality, intoxication, development, reproduction, morphology, growth and population) ab. (MOR, ITX, DVP, REP, MPH, GRO, POP)

        For the most accurate predictions, refer to the combined EC50EC10 model.
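
The help text above suggests which endpoint and effect combinations each model version covers. The sketch below checks a request against that description before calling `predict_toxicity`. The `SUPPORTED` table and `check_request` function are hypothetical and transcribed by hand from the help text; the exact endpoint strings the library accepts (for example `EC10` versus `NOEC`) are an assumption, and fishbAIT's own validation may differ.

```python
# Hypothetical lookup transcribed from the help text; not part of fishbAIT itself.
ALL_EFFECTS = {"MOR", "ITX", "DVP", "REP", "MPH", "GRO", "POP"}
SUPPORTED = {
    "EC50": {"endpoints": {"EC50"}, "effects": {"MOR"}},
    "EC10": {"endpoints": {"EC10", "NOEC"}, "effects": ALL_EFFECTS},
    "EC50EC10": {"endpoints": {"EC50", "EC10", "NOEC"}, "effects": ALL_EFFECTS},
}


def check_request(model_version, endpoint, effect):
    """Return a list of problems; an empty list means the combination looks consistent."""
    if model_version not in SUPPORTED:
        return [f"unknown model version {model_version!r}"]
    spec = SUPPORTED[model_version]
    problems = []
    if endpoint not in spec["endpoints"]:
        problems.append(
            f"endpoint {endpoint!r} not covered by {model_version}; "
            f"expected one of {sorted(spec['endpoints'])}"
        )
    if effect not in spec["effects"]:
        problems.append(
            f"effect {effect!r} not covered by {model_version}; "
            f"expected one of {sorted(spec['effects'])}"
        )
    return problems


# The failing request from the cell above, then a combination the EC50 model is described as covering.
print(check_request("EC50", "LOEC", "DVP"))
print(check_request("EC50", "EC50", "MOR"))
```

Returning a list of problems rather than raising lets a caller report every mismatch at once before deciding whether to reload a different model version.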
    