Automatically get MMPs for a given data set #51

Closed
pykao opened this issue Jan 11, 2023 · 4 comments

pykao commented Jan 11, 2023

Hi authors,

I thought this tool could automatically find the MMPs (matched molecular pairs) in a group of molecules.

For example, if mmpdb is given an SDF, CSV, or SMI file, it would generate an output file containing all the MMPs found in that input.

However, when I read the paper, it seems that the user needs to provide user-defined cutting patterns (the "constants" part in the paper).

Is mmpdb an interactive MMP generation tool?

Best,

PK

Contributor

adalke commented Jan 11, 2023

When I hear "interactive" I think of a GUI. mmpdb is a command-line tool.

mmpdb only accepts SMILES as input, not SDF or CSV. See mmpdb help-analysis for documentation on the overall process. (Also in the README.)

There's a default cutting pattern so you don't need to specify one yourself. Use mmpdb fragment --help to see the details:

 The --cut-smarts argument supports the following short-hand aliases:
   'default': Cut all C-[!H] non-ring single bonds except for Amides/Esters/Amidines/Sulfonamides and CH2-CH2 and CH2-CH3 bonds
      smarts: [#6+0;!$(*=,#[!#6])]!@!=!#[!#0;!#1;!$([CH2]);!$([CH3][CH2])]
   'cut_AlkylChains': As default, but also cuts CH2-CH2 and CH2-CH3 bonds
      smarts: [#6+0;!$(*=,#[!#6])]!@!=!#[!#0;!#1]
   'cut_Amides': As default, but also cuts [O,N]=C-[O,N] single bonds
      smarts: [#6+0]!@!=!#[!#0;!#1;!$([CH2]);!$([CH3][CH2])]
   'cut_all': Cuts all Carbon-[!H] single non-ring bonds. Use carefully, this will create a lot of cuts
      smarts: [#6+0]!@!=!#[!#0;!#1]
   'exocyclic': Cuts all exocyclic single bonds
      smarts: [R]!@!=!#[!#0;!#1]
   'exocyclic_NoMethyl': Cuts all exocyclic single bonds apart from those connecting to CH3 groups
      smarts: [R]!@!=!#[!#0;!#1;!$([CH3])]
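
Since mmpdb takes SMILES input, a CSV data set has to be converted first. The following is a minimal sketch, not part of mmpdb: the helper names `csv_to_smi` and `mmpdb_commands` and the CSV column names `smiles` and `id` are assumptions made for illustration. It writes one "SMILES id" line per molecule and builds the two command lines (fragment, then index) using the file names mentioned in this thread. Check `mmpdb fragment --help` and `mmpdb index --help` for the options your version actually accepts.

```python
import csv
import io


def csv_to_smi(csv_text, smiles_col="smiles", id_col="id"):
    """Convert CSV text to SMILES-file text, one '<SMILES> <id>' line per row.

    The column names are assumptions; adjust them to match your CSV header.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    lines = []
    for row in reader:
        smiles = row[smiles_col].strip()
        if smiles:  # skip rows without a structure
            lines.append(f"{smiles} {row[id_col].strip()}")
    return "\n".join(lines) + "\n"


def mmpdb_commands(smi_path, cut_smarts="default"):
    """Build argument lists for the fragment and index steps.

    Output file extensions follow the names used in this thread
    (test_data.fragments, test_data.mmpdb); newer versions may differ.
    """
    stem = smi_path.rsplit(".", 1)[0]
    fragments = stem + ".fragments"
    database = stem + ".mmpdb"
    return [
        ["mmpdb", "fragment", smi_path, "--cut-smarts", cut_smarts, "-o", fragments],
        ["mmpdb", "index", fragments, "-o", database],
    ]


if __name__ == "__main__":
    print(csv_to_smi("id,smiles\nphenol,Oc1ccccc1\ncatechol,Oc1ccccc1O\n"), end="")
    for argv in mmpdb_commands("test_data.smi"):
        # Pass each list to subprocess.run(argv, check=True) to execute it.
        print(" ".join(argv))
```

Keeping the commands as argument lists avoids shell quoting problems if a raw SMARTS pattern is passed instead of an alias such as 'default'.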

Author

pykao commented Jan 11, 2023

Hi @adalke,

Thanks for your reply. Once I have the resulting file, i.e., test_data.fragments or test_data.mmpdb, how can I access the MMPs of the input data set? Here is my test_data.fragments:

["VERSION", "mmpdb-fragment/2"]
["SOFTWARE", "mmpdb-2.1"]
["OPTION", "cut_smarts", "[#6+0;!$(*=,#[!#6])]!@!=!#[!#0;!#1;!$([CH2]);!$([CH3][CH2])]"]
["OPTION", "max_heavies", "100"]
["OPTION", "max_rotatable_bonds", "10"]
["OPTION", "method", "chiral"]
["OPTION", "num_cuts", "3"]
["OPTION", "rotatable_smarts", "[!$([NH]!@C(=O))&!D1&!$(*#*)]-&!@[!$([NH]!@C(=O))&!D1&!$(*#*)]"]
["OPTION", "salt_remover", "<default>"]
["RECORD", "phenol", "Oc1ccccc1", 7, "Oc1ccccc1", [[1, "N", 1, "1", "*O", "0", 6, "1", "*c1ccccc1", "c1ccccc1"], [1, "N", 6, "1", "*c1ccccc1", "0", 1, "1", "*O", "O"]]]
["RECORD", "catechol", "Oc1ccccc1O", 8, "Oc1ccccc1O", [[1, "N", 1, "1", "*O", "0", 7, "1", "*c1ccccc1O", "Oc1ccccc1"], [1, "N", 7, "1", "*c1ccccc1O", "0", 1, "1", "*O", "O"], [2, "N", 6, "11", "*c1ccccc1*", "01", 2, "11", "*O.*O", null]]]
["RECORD", "2-aminophenol", "Oc1ccccc1N", 8, "Nc1ccccc1O", [[1, "N", 1, "1", "*N", "0", 7, "1", "*c1ccccc1O", "Oc1ccccc1"], [1, "N", 1, "1", "*O", "0", 7, "1", "*c1ccccc1N", "Nc1ccccc1"], [1, "N", 7, "1", "*c1ccccc1N", "0", 1, "1", "*O", "O"], [1, "N", 7, "1", "*c1ccccc1O", "0", 1, "1", "*N", "N"], [2, "N", 6, "11", "*c1ccccc1*", "01", 2, "12", "*N.*O", null]]]
["RECORD", "2-chlorophenol", "Oc1ccccc1Cl", 8, "Oc1ccccc1Cl", [[1, "N", 1, "1", "*Cl", "0", 7, "1", "*c1ccccc1O", "Oc1ccccc1"], [1, "N", 1, "1", "*O", "0", 7, "1", "*c1ccccc1Cl", "Clc1ccccc1"], [1, "N", 7, "1", "*c1ccccc1Cl", "0", 1, "1", "*O", "O"], [1, "N", 7, "1", "*c1ccccc1O", "0", 1, "1", "*Cl", "Cl"], [2, "N", 6, "11", "*c1ccccc1*", "01", 2, "12", "*Cl.*O", null]]]
["RECORD", "o-phenylenediamine", "Nc1ccccc1N", 8, "Nc1ccccc1N", [[1, "N", 1, "1", "*N", "0", 7, "1", "*c1ccccc1N", "Nc1ccccc1"], [1, "N", 7, "1", "*c1ccccc1N", "0", 1, "1", "*N", "N"], [2, "N", 6, "11", "*c1ccccc1*", "01", 2, "11", "*N.*N", null]]]
["RECORD", "amidol", "Nc1cc(O)ccc1N", 9, "Nc1ccc(O)cc1N", [[1, "N", 1, "1", "*N", "0", 8, "1", "*c1cc(O)ccc1N", "Nc1ccc(O)cc1"], [1, "N", 1, "1", "*N", "0", 8, "1", "*c1ccc(O)cc1N", "Nc1cccc(O)c1"], [1, "N", 1, "1", "*O", "0", 8, "1", "*c1ccc(N)c(N)c1", "Nc1ccccc1N"], [1, "N", 8, "1", "*c1cc(O)ccc1N", "0", 1, "1", "*N", "N"], [1, "N", 8, "1", "*c1ccc(N)c(N)c1", "0", 1, "1", "*O", "O"], [1, "N", 8, "1", "*c1ccc(O)cc1N", "0", 1, "1", "*N", "N"], [2, "N", 7, "12", "*c1ccc(*)c(N)c1", "10", 2, "12", "*N.*O", null], [2, "N", 7, "12", "*c1ccc(N)c(*)c1", "10", 2, "12", "*N.*O", null], [2, "N", 7, "12", "*c1ccc(O)cc1*", "01", 2, "11", "*N.*N", null], [3, "N", 6, "123", "*c1ccc(*)c(*)c1", "201", 3, "112", "*N.*N.*O", null]]]
["RECORD", "hydroxyquinol", "Oc1cc(O)ccc1O", 9, "Oc1ccc(O)c(O)c1", [[1, "N", 1, "1", "*O", "0", 8, "1", "*c1cc(O)ccc1O", "Oc1ccc(O)cc1"], [1, "N", 1, "1", "*O", "0", 8, "1", "*c1ccc(O)c(O)c1", "Oc1ccccc1O"], [1, "N", 1, "1", "*O", "0", 8, "1", "*c1ccc(O)cc1O", "Oc1cccc(O)c1"], [1, "N", 8, "1", "*c1cc(O)ccc1O", "0", 1, "1", "*O", "O"], [1, "N", 8, "1", "*c1ccc(O)c(O)c1", "0", 1, "1", "*O", "O"], [1, "N", 8, "1", "*c1ccc(O)cc1O", "0", 1, "1", "*O", "O"], [2, "N", 7, "12", "*c1ccc(*)c(O)c1", "01", 2, "11", "*O.*O", null], [2, "N", 7, "12", "*c1ccc(O)c(*)c1", "01", 2, "11", "*O.*O", null], [2, "N", 7, "12", "*c1ccc(O)cc1*", "01", 2, "11", "*O.*O", null], [3, "N", 6, "123", "*c1ccc(*)c(*)c1", "012", 3, "111", "*O.*O.*O", null]]]
["RECORD", "phenylamine", "Nc1ccccc1", 7, "Nc1ccccc1", [[1, "N", 1, "1", "*N", "0", 6, "1", "*c1ccccc1", "c1ccccc1"], [1, "N", 6, "1", "*c1ccccc1", "0", 1, "1", "*N", "N"]]]
["RECORD", "cyclopentanol", "C1CCCC1N", 6, "NC1CCCC1", [[1, "N", 1, "1", "*N", "0", 5, "1", "*C1CCCC1", "C1CCCC1"], [1, "N", 5, "1", "*C1CCCC1", "0", 1, "1", "*N", "N"]]]

Best,
PK
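
For readers who want to inspect a fragments file like the one quoted above, here is a rough Python sketch of the pairing idea: molecules whose fragmentations share the same constant part but differ in the variable part are candidate matched pairs. It is not mmpdb's own indexing code. The field positions (variable SMILES at index 4, constant SMILES at index 8) are inferred from the quoted records and their heavy-atom counts, not taken from documentation, and the function name `find_pairs` is made up for this example.

```python
import json
from collections import defaultdict
from itertools import combinations

# Two RECORD lines copied from the quoted test_data.fragments output.
SAMPLE = '''\
["VERSION", "mmpdb-fragment/2"]
["RECORD", "phenol", "Oc1ccccc1", 7, "Oc1ccccc1", [[1, "N", 1, "1", "*O", "0", 6, "1", "*c1ccccc1", "c1ccccc1"], [1, "N", 6, "1", "*c1ccccc1", "0", 1, "1", "*O", "O"]]]
["RECORD", "phenylamine", "Nc1ccccc1", 7, "Nc1ccccc1", [[1, "N", 1, "1", "*N", "0", 6, "1", "*c1ccccc1", "c1ccccc1"], [1, "N", 6, "1", "*c1ccccc1", "0", 1, "1", "*N", "N"]]]
'''


def find_pairs(lines):
    """Return (id1, id2, 'variable1>>variable2', constant) tuples.

    Assumption: in each fragmentation list, index 4 is the variable
    SMILES and index 8 is the constant SMILES.
    """
    by_constant = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line:
            continue
        fields = json.loads(line)
        if fields[0] != "RECORD":
            continue  # skip VERSION, SOFTWARE and OPTION lines
        mol_id, fragmentations = fields[1], fields[5]
        for frag in fragmentations:
            variable, constant = frag[4], frag[8]
            by_constant[constant].append((mol_id, variable))

    pairs = []
    for constant, entries in by_constant.items():
        for (id1, var1), (id2, var2) in combinations(entries, 2):
            # A pair needs two different molecules and a real change.
            if id1 != id2 and var1 != var2:
                pairs.append((id1, id2, f"{var1}>>{var2}", constant))
    return pairs


if __name__ == "__main__":
    for pair in find_pairs(SAMPLE.splitlines()):
        print(pair)
```

The real index step applies further filters and handles hydrogen substitutions and multi-cut attachment orders, so treat this only as a way to explore the file, not as a replacement for mmpdb index.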

pykao changed the title from "Is mmpdb a fully automated MMPs generation tool?" to "Automatically get MMPs for a given data set" on Jan 11, 2023
Contributor

adalke commented Jan 11, 2023

  1. You should use the most recent development version of mmpdb, at https://github.com/adalke/mmpdb/tree/v3-dev . It's close to being merged back to the main branch. There are a couple of last little bits to take care of, which mostly don't affect you.

     In v3, the JSON lines format you quoted here for the fragmentations has been replaced by a SQLite database.

  2. mmpdb rulecat gives you the rules. More complex custom analysis requires understanding the schema (see mmpdblib/schema.sql) and making your own SQL queries.

  3. There is very little unpaid support for mmpdb. You'll need to try out the programs, read the command-line help (including mmpdb help and the sub-help commands), and do your own experimentation. I am also available for paid support.
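
As an illustration of the "make your own SQL queries" route, the sketch below builds a tiny in-memory SQLite database and lists its pairs with Python's sqlite3 module. The table and column names are a simplified assumption about what an mmpdb database looks like, and the toy rows are invented for the example. Compare against mmpdblib/schema.sql before running a query like this on a real .mmpdb file.

```python
import sqlite3

# ASSUMED, simplified schema for illustration only; the authoritative
# definition is mmpdblib/schema.sql in the mmpdb source tree.
SCHEMA = """
CREATE TABLE compound (id INTEGER PRIMARY KEY, public_id TEXT, clean_smiles TEXT);
CREATE TABLE rule_smiles (id INTEGER PRIMARY KEY, smiles TEXT);
CREATE TABLE constant_smiles (id INTEGER PRIMARY KEY, smiles TEXT);
CREATE TABLE rule (id INTEGER PRIMARY KEY, from_smiles_id INTEGER, to_smiles_id INTEGER);
CREATE TABLE rule_environment (id INTEGER PRIMARY KEY, rule_id INTEGER, radius INTEGER);
CREATE TABLE pair (id INTEGER PRIMARY KEY, rule_environment_id INTEGER,
                   compound1_id INTEGER, compound2_id INTEGER, constant_id INTEGER);
"""

# One row per pair: both compound ids, the transformation, and the constant.
PAIR_QUERY = """
SELECT c1.public_id, c2.public_id, f.smiles, t.smiles, k.smiles
  FROM pair
  JOIN rule_environment AS env ON env.id = pair.rule_environment_id
  JOIN rule ON rule.id = env.rule_id
  JOIN rule_smiles AS f ON f.id = rule.from_smiles_id
  JOIN rule_smiles AS t ON t.id = rule.to_smiles_id
  JOIN compound AS c1 ON c1.id = pair.compound1_id
  JOIN compound AS c2 ON c2.id = pair.compound2_id
  JOIN constant_smiles AS k ON k.id = pair.constant_id
 WHERE env.radius = 0
 ORDER BY pair.id
"""


def demo_pairs():
    """Create a toy database with one phenol/phenylamine pair and query it."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO compound VALUES (?, ?, ?)",
                     [(1, "phenol", "Oc1ccccc1"), (2, "phenylamine", "Nc1ccccc1")])
    conn.executemany("INSERT INTO rule_smiles VALUES (?, ?)",
                     [(1, "[*:1]O"), (2, "[*:1]N")])
    conn.execute("INSERT INTO constant_smiles VALUES (1, '[*:1]c1ccccc1')")
    conn.execute("INSERT INTO rule VALUES (1, 1, 2)")
    conn.execute("INSERT INTO rule_environment VALUES (1, 1, 0)")
    conn.execute("INSERT INTO pair VALUES (1, 1, 1, 2, 1)")
    rows = conn.execute(PAIR_QUERY).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in demo_pairs():
        print(row)
```

To try the query on a real database, replace the in-memory connection with sqlite3.connect on your database file and adjust the table and column names to whatever schema.sql defines.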

Author

pykao commented Jan 12, 2023

@adalke Thank you for your help :)

pykao closed this as completed on Jan 12, 2023