ur-whitelab/mol.dev

version paper MIT license

Purpose

This repository contains the models and source code for the mol.dev application. Mol.dev is a web application that uses deep ensemble and RNN models to predict the properties of small molecules, together with their uncertainties, quickly and accurately without relying on a server. It uses a JavaScript implementation of the TensorFlow library to run the trained model directly on any endpoint device. By developing mol.dev, we aim to make predicting small-molecule properties easier.

Usage

Our model is available for use at https://mol.dev. Once the page has loaded, two input fields are available: one for a string representation of the molecule of interest in SMILES format, and the other for SELFIES format. Only one input is needed: when a valid string is entered, the application automatically fills the other field with the corresponding representation.

After a string representation is entered, the application shows the molecule's structure in the header and computes the prediction.

The "Expand ▶️" button shows the prediction of each individual model in the ensemble.

Model Card

  • Model Details: Solubility predictor with uncertainty. The model is a bidirectional LSTM that predicts a mean and a standard deviation. An ensemble of 10 models is combined for predictions. Epistemic uncertainty comes from model disagreement.
  • Intended Use: Organic molecules
  • Factors: The model may not generalize to large molecules, very insoluble molecules (< -12.5 logS), highly soluble molecules (> 1 logS), ions, or metals.
  • Metrics: Test correlation 0.79. Test MAE 1.24
  • Evaluation Data: Withheld examples (test data)
  • Training Data: 9982 molecules, augmented to 96625 molecules.
  • Ethical Considerations: None noted
  • Caveats: Check the parity plot to see where your molecule falls relative to the training curve.
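As an illustration of how an ensemble of (mean, standard deviation) predictions can be combined, here is a minimal Python sketch. It assumes the common deep-ensemble convention of treating the members as a uniform Gaussian mixture; the repository may combine them differently. The function names `combine_ensemble` and `outside_stated_range` and the example numbers are hypothetical and are not taken from the mol.dev source.

```python
from math import sqrt
from statistics import fmean


def combine_ensemble(means, stds):
    """Combine per-model (mean, std) predictions as a uniform Gaussian mixture.

    Assumption: this is the usual deep-ensemble formula, not necessarily
    the exact one used by mol.dev.
    Returns (mean, aleatoric_std, epistemic_std, total_std).
    """
    if not means or len(means) != len(stds):
        raise ValueError("means and stds must be non-empty and equally long")
    mu = fmean(means)
    # Aleatoric part: average of the variances predicted by each model.
    aleatoric_var = fmean([s * s for s in stds])
    # Epistemic part: disagreement between models (spread of their means).
    epistemic_var = fmean([(m - mu) ** 2 for m in means])
    return mu, sqrt(aleatoric_var), sqrt(epistemic_var), sqrt(aleatoric_var + epistemic_var)


def outside_stated_range(log_s, low=-12.5, high=1.0):
    """Flag predictions outside the logS range listed under Factors."""
    return log_s < low or log_s > high


if __name__ == "__main__":
    # Made-up predictions from two hypothetical ensemble members (logS units).
    mu, aleatoric, epistemic, total = combine_ensemble([-2.0, -3.0], [1.0, 1.0])
    print(f"logS = {mu:.2f} +/- {total:.2f} (epistemic {epistemic:.2f})")
    print("outside stated range:", outside_stated_range(mu))
```

Separating the two variance terms keeps the model-disagreement (epistemic) contribution visible on its own, which matches the model card's description of where epistemic uncertainty comes from.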

Parity plot, with metrics, for the model implemented in mol.dev
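For readers who want to compute comparable metrics on their own data, the following Python sketch shows one way to calculate a mean absolute error and a correlation coefficient from true and predicted logS values. It assumes the reported correlation is a Pearson correlation, which the model card does not state. The function names and example values are hypothetical.

```python
from math import sqrt


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between true and predicted values."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("inputs must be non-empty and equally long")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def pearson_correlation(y_true, y_pred):
    """Pearson correlation coefficient (assumed metric, see lead-in)."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("inputs must be non-empty and equally long")
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    if var_t == 0 or var_p == 0:
        raise ValueError("correlation is undefined for constant inputs")
    return cov / sqrt(var_t * var_p)


if __name__ == "__main__":
    # Made-up logS values, for illustration only.
    y_true = [-1.0, -2.0, -3.0, -4.0]
    y_pred = [-1.5, -1.8, -3.4, -3.9]
    print("MAE:", mean_absolute_error(y_true, y_pred))
    print("Pearson r:", pearson_correlation(y_true, y_pred))
```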

Citation

Please cite Ramos et al.:

@Article{ramos2023solubility,
author ="Ramos, Mayk Caldas and White, Andrew D.",
title  ="Predicting small molecules solubility on endpoint devices using deep ensemble neural networks",
journal  ="Digital Discovery",
year  ="2024",
pages  ="-",
publisher  ="RSC",
doi  ="10.1039/D3DD00217A",
url  ="http://dx.doi.org/10.1039/D3DD00217A",
}