
nihilisticneuralnet/LigandNet


LigandNet: Small Molecule-Protein Interactions Prediction Model

Abstract: This code is a solution to the NeurIPS 2024 - Predict New Medicines with BELKA competition, where it ranked 73rd out of 1950 teams.

The code implements a deep learning approach to predicting small molecule-protein interactions on the BELKA dataset. The model is a 1D Convolutional Neural Network (1D-CNN) trained on SMILES representations of molecules to classify whether each molecule binds to a given protein target. The dataset consists of molecular structures and their corresponding binary binding labels, obtained through DNA-encoded chemical library (DEL) technology.
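A 1D-CNN needs fixed-length numeric input, so SMILES strings are typically turned into integer token sequences first. The following is a minimal sketch of one common way to do that (character-level tokens with zero padding); the helper names `build_vocab` and `encode`, the padding index, and `max_len` are assumptions for illustration and may not match the encoding used in this repository.

```python
# Illustrative sketch only: hypothetical helpers, not the repository's code.
PAD = 0  # index reserved for padding (assumed convention)


def build_vocab(smiles_list):
    """Map every character seen in the SMILES strings to an integer >= 1."""
    chars = sorted({ch for s in smiles_list for ch in s})
    return {ch: i + 1 for i, ch in enumerate(chars)}


def encode(smiles, vocab, max_len):
    """Turn one SMILES string into a fixed-length list of integer tokens.

    Strings longer than max_len are truncated; shorter ones are padded
    with PAD. Characters missing from vocab raise KeyError.
    """
    ids = [vocab[ch] for ch in smiles[:max_len]]
    return ids + [PAD] * (max_len - len(ids))


smiles_list = ["CCO", "c1ccccc1", "CC(=O)O"]  # ethanol, benzene, acetic acid
vocab = build_vocab(smiles_list)
encoded = [encode(s, vocab, max_len=10) for s in smiles_list]
```

Character-level tokenization splits two-letter atoms such as `Cl` and `Br` into separate tokens; a tokenizer that keeps them together is a common refinement.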

To improve model performance, the code uses 15-fold stratified cross-validation, training a separate model for each fold and storing its weights for ensembling. The preprocessing pipeline encodes the SMILES data, extracts features, and augments the dataset with additional molecular descriptors. The final predictions are generated by aggregating the outputs of the fold models, achieving an average precision score of 0.26109.
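The fold-and-ensemble procedure can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the repository's implementation: the function names are hypothetical, the fold predictions are made-up numbers, and aggregation is assumed to be a simple mean of per-fold probabilities (the README only says the outputs are aggregated).

```python
import random


def stratified_folds(labels, k, seed=0):
    """Assign each sample index to one of k folds, keeping class ratios similar."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for pos, i in enumerate(idx):
            folds[pos % k].append(i)  # deal indices round-robin across folds
    return folds


def ensemble_mean(fold_predictions):
    """Average the per-sample probabilities predicted by each fold's model."""
    n_models = len(fold_predictions)
    return [sum(col) / n_models for col in zip(*fold_predictions)]


def average_precision(y_true, y_score):
    """Mean of the precision values taken at the rank of each positive sample.

    Simplification: tied scores keep their input order.
    """
    order = sorted(range(len(y_score)), key=lambda i: -y_score[i])
    hits, total = 0, 0.0
    for rank, i in enumerate(order, start=1):
        if y_true[i] == 1:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0


# Toy data: 15 binders and 45 non-binders split into 15 validation folds.
labels = [1] * 15 + [0] * 45
folds = stratified_folds(labels, k=15)

# Made-up test-set probabilities from two fold models, averaged and scored.
fold_predictions = [[1.0, 0.25, 0.5], [0.5, 0.25, 0.5]]
blended = ensemble_mean(fold_predictions)
ap = average_precision([1, 1, 0], blended)
```

Each entry of `folds` holds the validation indices for one fold; the model for that fold would be trained on all remaining indices. Stratification matters here because binders are rare, and an unstratified split could leave a fold with almost no positive examples.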

Plots

t-SNE Plot of Dataset

Plot1

Visualizing the first molecule and its building blocks

  • 2D

    Plot2

  • 3D

    • Molecule_Smiles

      Plot3

    • Building block_1

      Plot4

    • Building block_2

      Plot5

    • Building block_3

      Plot6

Flow Chart

