A collection of AI for Drug Design related papers and corresponding code sources (in progress).
- NeurIPS (2021) Hit and Lead Discovery with Explorative RL and Fragment-based Molecule Generation. A novel RL framework that generates pharmacochemically acceptable molecules with large docking scores.
- NeurIPS (2021) A 3D Generative Model for Structure-Based Drug Design. A 3D generative model that generates molecules given a designated 3D protein binding site.
- ICLR oral (2022) Data-Efficient Graph Grammar Learning for Molecular Generation. A data-efficient generative model that can be learned from datasets with orders of magnitude smaller sizes than common benchmarks.
- ICLR spotlight (2022) Spanning Tree-based Graph Generation for Molecules. Formulating molecular graph generation as a construction of a spanning tree and the residual edges.
- ICLR spotlight (2022) Amortized Tree Generation for Bottom-up Synthesis Planning and Synthesizable Molecular Design. Demonstrating the potential to solve both problems of design and synthesis simultaneously.
- ICLR (2022) Learning to Extend Molecular Scaffolds with Structural Motifs. A new fragment-based generative model of molecules that can be constrained to include an arbitrary subgraph (scaffold).
- ICLR (2022) Differentiable Scaffolding Tree for Molecule Optimization. Makes the molecular optimization problem differentiable at the structure level.
- ICLR (2022) Top-N: Equivariant Set and Graph Generation without Exchangeability. Top-n can replace i.i.d. generation in any VAE or GAN -- it is easier to train and better captures complex dependencies in the data.
- ICLR (2022) Multi-objective Optimization by Learning Space Partition. LaMOO substantially outperforms strong baselines on multiple real-world MOO tasks, by up to 225% in sample efficiency for neural architecture search on Nasbench201, and up to 10% for molecular design.
- ICLR (2022) Spatial Graph Attention and Curiosity-driven Policy for Antiviral Drug Discovery. Reviewers agreed on the novelty of the proposed method and appreciated the attempt to address COVID-19 with machine learning.
- ICLR (2022) An Autoregressive Flow Model for 3D Molecular Geometry Generation from Scratch. 3D molecular geometry generation from scratch.
- NeurIPS (2021) GeoMol: Torsional Geometric Generation of Molecular 3D Conformer Ensembles. Prediction of a molecule's 3D conformer ensemble from the molecular graph holds a key role in areas of cheminformatics and drug discovery.
- NeurIPS (2021) Property-Aware Relation Networks for Few-Shot Molecular Property Prediction. Molecular property prediction is essentially a few-shot problem which makes it hard to use regular machine learning models.
- NeurIPS (2021) Functionally Regionalized Knowledge Transfer for Low-resource Drug Discovery. A functionally regionalized meta-learning algorithm (FRML) for transferring knowledge from previous assays, namely in-vivo experiments, run by different laboratories and against various target proteins.
- ICLR oral (2022) Meta-Learning with Fewer Tasks through Task Interpolation. The bottleneck of current meta-learning algorithms is the requirement of a large number of meta-training tasks, which may not be accessible in real-world scenarios.
- ICLR (2022) Constrained Graph Mechanics Networks. Can be used for molecular dynamics prediction.
- NeurIPS (2021) SE(3)-equivariant prediction of molecular wavefunctions and electronic densities. Introduces general SE(3)-equivariant operations and building blocks for constructing deep learning architectures for geometric point cloud data, and applies them to reconstruct wavefunctions of atomistic systems with unprecedented accuracy.
- ICLR spotlight (2022) Equivariant Transformers for Neural Network based Molecular Potentials. A novel equivariant Transformer architecture for predicting molecular potentials, with insights into the molecular representation from an extensive analysis of the model's attention weights. Highlights the importance of datasets that include off-equilibrium conformations for evaluating molecular potentials.
- ICLR spotlight (2022) Ab-Initio Potential Energy Surfaces by Pairing GNNs with Neural Wave Functions. A new network architecture that solves the Schrödinger equation for multiple geometries simultaneously.
- ICLR (2022) Simple GNN Regularisation for 3D Molecular Property Prediction and Beyond. Adding node-label noise to a GNN. There is little technical novelty. The proposed applications of the approach are interesting.
- NeurIPS (2021) Directional Message Passing on Molecular Graphs via Synthetic Coordinates. Propose synthetic coordinates that enable the use of advanced GNNs without requiring the true molecular configuration.
- NeurIPS (2021) GemNet: Universal Directional Graph Neural Networks for Molecules. We show that GNNs with directed edge embeddings and two-hop message passing are indeed universal approximators for predictions that are invariant to translation, and equivariant to permutation and rotation.
- ICLR (2022) Learning 3D Representations of Molecular Chirality with Invariance to Bond Rotations. A method of processing the 3D torsion angles of a molecular conformer to learn tetrahedral chirality while integrating a novel invariance to rotations about internal molecular bonds directly into the model architecture.
- ICLR (2022) A Program to Build E(N)-Equivariant Steerable CNNs. A general method to build G-steerable kernel spaces for equivariant steerable CNNs.
- ICLR spotlight (2022) Geometric and Physical Quantities improve E(3) Equivariant Message Passing. Generalise equivariant graph networks such that node and edge updates are able to leverage covariant information.
- ICLR (2022) Spherical Message Passing for 3D Molecular Graphs. Incorporating torsion information when representing 3D molecules is novel and helpful.
- NeurIPS (2021) Deep Molecular Representation Learning via Fusing Physical and Chemical Information. Two networks specialize in their own tasks and cooperate by providing expertise to each other.
- ICLR (2022) MoReL: Multi-omics Relational Learning. Multi-omics data analysis has the potential to discover hidden molecular interactions, revealing potential regulatory and/or signal transduction pathways for cellular processes of interest when studying life and disease systems.
- ICLR (2022) Graph Neural Networks with Learnable Structural and Positional Representations. This work adds positional encodings (akin to those in transformers, but adapted) to GNNs.
- ICLR (2022) Pre-training Molecular Graph Representation with 3D Geometry. A new SSL framework that makes 3D geometry information useful for 2D representations, targeting downstream tasks where only 2D information is available.
- ICLR (2022) Chemical-Reaction-Aware Molecule Representation Learning. Makes use of chemical reactions to improve the generalization ability of learned molecule embeddings.
- NeurIPS (2021) Motif-based Graph Self-Supervised Learning for Molecular Property Prediction. Most existing self-supervised pretraining frameworks for GNNs only focus on node-level or graph-level tasks. These approaches cannot capture the rich information in subgraphs or graph motifs.
- ICLR (2022) GeneDisco: A Benchmark for Experimental Design in Drug Discovery. GeneDisco contains a curated set of multiple publicly available experimental data sets as well as open-source implementations of state-of-the-art active learning policies for experimental design and exploration.
- ICLR spotlight (2022) Iterative Refinement Graph Neural Network for Antibody Sequence-Structure Co-design. The reviewers are in agreement that the problem is one of importance, and that the technical and empirical contributions are strong. There are concerns over the relevance of evaluating the method by using a predictive model as ground truth. Still, the overall contributions remain.
- ICLR (2022) Maximum n-times Coverage for Vaccine Design. The results are used to produce a pan-strain COVID vaccine.
- NeurIPS (2021) Multi-Scale Representation Learning on Proteins. A multi-scale graph construction of a protein.
- ICLR (2022) OntoProtein: Protein Pretraining With Gene Ontology Embedding. A general framework to integrate knowledge graph (gene ontology) into protein pre-training.
- NeurIPS (2021) Language models enable zero-shot prediction of the effects of mutations on protein function. Modeling the effect of sequence variation on function is a fundamental problem for understanding and designing proteins.
- NeurIPS (2021) Co-evolution Transformer for Protein Contact Prediction. Protein contact prediction (PCP) is an essential building block of many protein structure related applications.
- ICLR (2022) Geometric Transformers for Protein Interface Contact Prediction. A geometry-evolving graph transformer for 3D protein structures.
- ICLR spotlight (2022) Independent SE(3)-Equivariant Models for End-to-End Rigid Protein Docking. Guarantees the same resulting protein complex independent of the initial placement of the two 3D structures.