ljmartin/what_do_mol_prop_look_like

What do molecular properties look like?

view the app: link

what

This is a Streamlit app centred on 500,000 molecules from the ZINC database. Drag the sliders to filter the molecules by their physicochemical properties (so far, only molecular weight and calculated logP), then press a button to draw a small random sample and visualize it for inspection.
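
The filter-then-sample step can be sketched in plain Python. This is a minimal illustration, not the code in stApp.py: the record layout, the function names `filter_by_range` and `sample_for_display`, and the logP values in the toy list are all assumptions made up for this example. In the app, the two ranges would come from the sliders.

```python
import random

# Hypothetical records: (SMILES, molecular weight, calculated logP).
# Molecular weights are rounded; the logP values are made-up placeholders.
MOLECULES = [
    ("CCO", 46.07, -0.1),
    ("c1ccccc1", 78.11, 1.7),
    ("CC(=O)Oc1ccccc1C(=O)O", 180.16, 1.3),
    ("Cn1cnc2c1c(=O)n(C)c(=O)n2C", 194.19, -1.0),
]


def filter_by_range(records, mw_range, logp_range):
    """Keep records whose MW and logP both fall inside the inclusive ranges."""
    mw_lo, mw_hi = mw_range
    logp_lo, logp_hi = logp_range
    return [
        rec for rec in records
        if mw_lo <= rec[1] <= mw_hi and logp_lo <= rec[2] <= logp_hi
    ]


def sample_for_display(records, n, seed=None):
    """Draw up to n records at random, without replacement."""
    rng = random.Random(seed)
    return rng.sample(records, min(n, len(records)))


if __name__ == "__main__":
    hits = filter_by_range(MOLECULES, (50.0, 190.0), (0.0, 2.0))
    for smiles, mw, logp in sample_for_display(hits, 2, seed=0):
        print(smiles, mw, logp)
```

Capping the sample size at the number of surviving molecules keeps a narrow slider range from raising an error when fewer molecules pass the filter than were requested.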

scrnshot

why

Ultra-large molecular docking is in vogue. A typical virtual library can easily reach hundreds of millions of molecules, and docking it requires high-performance computing clusters. One way to cut down on computation time is to pre-filter the library by properties you think are desirable. The ZINC tranche browser is one way to do this. But how do you decide where to set the cut-off? The rule of five from Lipinski, Lombardo, Dominy, and Feeney was the original guide, but more and more people are seeing the rule broken.

So the rules are flexible - but where do you stop? This project attempts to teach an intuitive feel for molecular properties using the old-fashioned trick of looking - if you stare long enough at molecules in a certain property range, you get a feel for what is being included and what is being excluded by a given filtering rule.
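
For reference, the rule-of-five cut-offs can be written as a simple hard filter. This is only a sketch: the `Props` container and the `rule_of_five_violations` helper are hypothetical names, the thresholds are the commonly quoted ones (molecular weight over 500, logP over 5, more than 5 hydrogen-bond donors, more than 10 hydrogen-bond acceptors), and the property values are assumed to have been calculated elsewhere. The app itself exposes only molecular weight and logP so far.

```python
from dataclasses import dataclass


@dataclass
class Props:
    """Precomputed properties for one molecule (hypothetical container)."""
    mol_weight: float
    logp: float
    h_donors: int
    h_acceptors: int


def rule_of_five_violations(p):
    """Return a list describing each rule-of-five criterion that p exceeds."""
    violations = []
    if p.mol_weight > 500:
        violations.append("molecular weight > 500")
    if p.logp > 5:
        violations.append("logP > 5")
    if p.h_donors > 5:
        violations.append("H-bond donors > 5")
    if p.h_acceptors > 10:
        violations.append("H-bond acceptors > 10")
    return violations


if __name__ == "__main__":
    print(rule_of_five_violations(Props(350.0, 2.5, 2, 5)))
    print(rule_of_five_violations(Props(650.0, 6.2, 2, 5)))
```

Returning the list of violated criteria, rather than a single pass/fail flag, makes it easy to see which cut-off a borderline molecule breaks and by how many criteria.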

how

Molecules were downloaded from ZINC using the tranche browser: 2D, standard reactivity, and in-stock. ZINC does all the work here. The .curl file to do this is in the ./smilesfiles dir, but the downloaded files are too big for GitHub, so you'd have to fetch them yourself by running bash ZINC-downloader-2D-smi.curl in a terminal. I used a notebook (setup.ipynb) to downsample this to 500,000 molecules (sample.smifi). The notebook also uses RDKit to calculate molecular weight and logP (using the Crippen algorithm). It's worth noting that some say the Crippen algorithm exaggerates logP magnitude (citation needed).
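
The downsampling and property calculation could look roughly like the sketch below. It is not a copy of setup.ipynb: reservoir sampling is one possible way to draw a fixed-size uniform sample from files too large to hold in memory, and the helper names `reservoir_sample` and `mw_and_logp` are made up for this example. The RDKit calls used (`Chem.MolFromSmiles`, `Descriptors.MolWt`, `Crippen.MolLogP`) are imported inside the function so the sampling helper runs without RDKit installed.

```python
import random


def reservoir_sample(lines, k, seed=None):
    """Return k items chosen uniformly at random from an iterable of any length."""
    rng = random.Random(seed)
    sample = []
    for i, line in enumerate(lines):
        if i < k:
            # Fill the reservoir with the first k items.
            sample.append(line)
        else:
            # Replace a random slot with probability k / (i + 1).
            j = rng.randint(0, i)
            if j < k:
                sample[j] = line
    return sample


def mw_and_logp(smiles):
    """Return (molecular weight, Crippen logP), or None if the SMILES fails to parse."""
    # Lazy import: only needed when properties are actually calculated.
    from rdkit import Chem
    from rdkit.Chem import Crippen, Descriptors

    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        return None
    return Descriptors.MolWt(mol), Crippen.MolLogP(mol)


if __name__ == "__main__":
    # Stand-in for iterating over the lines of the downloaded .smi files.
    fake_lines = (f"molecule_{i}" for i in range(1000))
    print(reservoir_sample(fake_lines, 5, seed=1))
```

Because the sampler only ever holds k items, it can be fed a generator that reads the downloaded files one line at a time.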

The app itself is a Streamlit Python app. See the Streamlit website for instructions on how to make Streamlit apps, or inspect stApp.py. Streamlit currently (Jan 2021) has a service that hosts apps. If that goes down, this app can be run locally using streamlit run stApp.py. One thing to note is that RDKit is installed with conda rather than pip, so the dependencies are listed in a conda.txt (plus a conda_channels.txt) instead of the usual requirements.txt.

thanks to iwatobipen for working out the conda stuff beforehand.

add more properties?

open an issue or drop a line with a property and I'll add it!
