GitHub - mungpeter/LigandMMPA: Prepare and fragmentate ligands for matched-molecular pair analysis

AnalogGenerator

Generate a combinatorial library of diverse analogs on core scaffolds by decorating them with a library of R-groups

====================================================================================

>  0_canonical_smiles_convert.py
       [input: original SMILES file]
       [output: RDKit canonical SMILES file]

e.g)  > x.py original.smi rdkit_canonical.smi

Goal: A quick way to convert any SMILES string into RDKit canonical SMILES.

================================================

>  1_mmpdb_frag_gen.csh 
      [list of SMILES files]

Goal: Parse a chemical library (SMILES) with mmpDB to generate fragments in JSON format (a formatted list of lists).

================================================

>   2_parse_mmpdb_frag.py
       -list [list of SMILES files for mmpDB fragmentation]
       -size [heavy atom count of fragment to be saved]
       -out  [mmpDB output prefix]
       -regex [Optional: Regular Expression of atomtypes to be excluded] 
                         (def: "c|n|s|N|S|O|P|F|Cl|Br|I|Se|Te|B")

e.g)  > x.py -list smi.list -size 10 -out mmpdb_frag \
             -regex "c|n|s"      (don't collect any aromatic fragments)

Goal: Parse the JSON results from mmpDB's fragmentation step, which was originally designed for Matched Molecular Pair Analysis. Here the results are collected to generate a list of SMILES strings with the attachment point designated as [*], written out as a CSV file.
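The -regex exclusion can be pictured with a short Python sketch. It assumes the pattern is applied as a plain regular-expression search over each fragment SMILES string; the function names below are hypothetical, and the actual logic in 2_parse_mmpdb_frag.py may differ.

```python
import re

# Default exclusion pattern quoted from the -regex option above.
DEFAULT_EXCLUDE = "c|n|s|N|S|O|P|F|Cl|Br|I|Se|Te|B"


def keep_fragment(frag_smiles, exclude_regex=DEFAULT_EXCLUDE):
    """Return True if no excluded atom type occurs in the fragment SMILES.

    Assumption: the pattern is matched against the raw SMILES text.
    """
    return re.search(exclude_regex, frag_smiles) is None


def filter_fragments(fragments, exclude_regex=DEFAULT_EXCLUDE):
    """Keep fragments that pass the filter, preserving order and dropping repeats."""
    seen = set()
    kept = []
    for smi in fragments:
        if smi not in seen and keep_fragment(smi, exclude_regex):
            seen.add(smi)
            kept.append(smi)
    return kept


if __name__ == "__main__":
    frags = ["[*]CCCC", "[*]c1ccccc1", "[*]CCO", "[*]C1CC1"]
    # Only aromatic atoms excluded, as in the "c|n|s" example above.
    print(filter_fragments(frags, "c|n|s"))
    # Default pattern: only aliphatic-carbon fragments survive.
    print(filter_fragments(frags))
```

With the default pattern, any fragment containing a heteroatom or an aromatic atom is dropped, so only purely aliphatic carbon fragments remain.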

Note: mmpDB places [*] at the beginning of the string, which may create problems for the Chem.CanonSmiles() function when combining R-groups with the core molecule. Move the [*] flag behind the first atom:

      [*]CCCC          -->  C([*])CCC
      [*]C1CC1         -->  C1([*])CC1
      [*][C@H]12CC1C2  -->  [C@H]12([*])CC1C2
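A minimal text-level sketch of this rewrite is given here. It assumes the fragment starts with [*] followed directly by an atom (no bond symbol in between); move_attachment_point is a hypothetical helper, not a function from this repository. It reproduces the three examples above and, like them, leaves chirality tags untouched.

```python
import re

# First atom after the leading [*]: a bracket atom, a two-letter halogen,
# or a single-letter organic-subset atom (aliphatic or aromatic),
# followed by any ring-closure labels (single digits or %nn).
_LEADING_STAR = re.compile(
    r"^\[\*\]"
    r"(\[[^\]]+\]|Cl|Br|[BCNOPSFI]|[bcnops])"
    r"((?:%\d{2}|\d)*)"
    r"(.*)$"
)


def move_attachment_point(smiles):
    """Move a leading [*] into a branch behind the first atom.

    Strings that do not match the assumed pattern are returned unchanged.
    This is a purely textual rewrite; it does not re-evaluate stereo tags.
    """
    m = _LEADING_STAR.match(smiles)
    if m is None:
        return smiles
    atom, ring_closures, rest = m.groups()
    return f"{atom}{ring_closures}([*]){rest}"


if __name__ == "__main__":
    for smi in ["[*]CCCC", "[*]C1CC1", "[*][C@H]12CC1C2"]:
        print(smi, "-->", move_attachment_point(smi))
```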

================================================

>   3_weld_r_groups.py
       -templ [Core Scaffold SMILES with attachment points marked by "x" (e.g. CxxCx=C)]
       -r     [CSV file of R-groups with attachment point marked by "[*]", generated by 2_parse_mmpdb_frag.py]
       -out   [Output prefix]
       -raw        [Optional: Read in pre-generated analog intermediate file (pickle file)]
       -unsat_min  [Optional: Remove molecules with degree of unsaturation less than this (def: None)]
       -unsat_max  [Optional: Remove molecules with degree of unsaturation larger than this (def: None)]

e.g) >   x.py -templ core_template.smi -r mmpdb_frag.csv -out combinatorial_analogs \
              -raw analog_intermediate.pickle.bz2 \
              -unsat_min 2 -unsat_max 5

Goal: Create a combinatorial analog library of a core scaffold using a fragment library. During generation, intermediates from the "Combine Core/R-group to generate molecule" step are saved to a pickle.bz2 file in case a later step fails.
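The combinatorial part can be sketched in plain Python. This is an illustration under stated assumptions, not the code in 3_weld_r_groups.py: every "x" in the template is treated as one attachment point, R-groups may repeat across points, and the carbon budget (max_carbons, a hypothetical parameter name) is checked with a crude text-based carbon count.

```python
import re
from itertools import product


def count_carbons(smiles):
    """Crude carbon count from SMILES text.

    Counts aliphatic 'C' not followed by 'l' (to skip chlorine) plus
    aromatic 'c'. Assumption: no other element symbols starting with C.
    """
    return len(re.findall(r"C(?!l)|c", smiles))


def enumerate_r_group_sets(template, r_groups, max_carbons):
    """Yield one tuple of R-groups per way of filling the template's "x" marks.

    Tuples whose combined carbon count exceeds max_carbons are skipped.
    """
    n_points = template.count("x")
    for combo in product(r_groups, repeat=n_points):
        if sum(count_carbons(r) for r in combo) <= max_carbons:
            yield combo


if __name__ == "__main__":
    # Template example taken from the -templ option above.
    for combo in enumerate_r_group_sets("CxxCx=C", ["[*]C", "[*]CC"], max_carbons=4):
        print(combo)
```

Capping the total R-group carbon count keeps the enumeration from growing with every possible large substituent combination. Welding each tuple onto the core would still need a cheminformatics toolkit such as RDKit.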

Create all permutations of cores with different numbers of branch points, limited by a predefined number of allowed R-group carbon atoms
Create all permutations of different numbers of R-groups, limited by a predefined number of allowed R-group carbon atoms
Combine core scaffolds (with designated branch points) and R-groups (with branch points) into one molecule
Remove molecules whose degree of unsaturation does not fit the criteria
Remove duplicated SMILES (strings are tautomerized and canonicalized)
Generate all possible stereoisomers (E/Z, diastereomers) and remove meso-isomers (SMILES are tautomerized and canonicalized)
Report results
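The unsaturation filter in the steps above can be illustrated with the standard degree-of-unsaturation formula, DoU = (2C + 2 + N - H - X) / 2, where X is the halogen count. How the script obtains the atom counts is not stated here, so this sketch takes them as plain arguments; both function names are hypothetical.

```python
def degree_of_unsaturation(c, h, n=0, x=0):
    """Rings plus pi bonds, from the counts of C, H, N and halogen atoms."""
    return (2 * c + 2 + n - h - x) / 2


def passes_unsat_filter(dou, unsat_min=None, unsat_max=None):
    """Mirror -unsat_min / -unsat_max: a limit of None means no limit."""
    if unsat_min is not None and dou < unsat_min:
        return False
    if unsat_max is not None and dou > unsat_max:
        return False
    return True


if __name__ == "__main__":
    benzene = degree_of_unsaturation(c=6, h=6)        # 3 pi bonds + 1 ring
    cyclohexane = degree_of_unsaturation(c=6, h=12)   # 1 ring
    print(benzene, passes_unsat_filter(benzene, 2, 5))
    print(cyclohexane, passes_unsat_filter(cyclohexane, 2, 5))
```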
