Welcome to LAST-Utils.

These are LAST utilities, available from http://github.com/amaunz/last-utils/tree/master.

Requirements: ruby 1.8 with OpenBabel bindings (see http://openbabel.org/wiki/Ruby).

= EXAMPLES:

Two modes are available: conversion LAST->SMARTS and instantiation SMARTS/SMILES.

1) Mine LAST descriptors and convert the output to SMARTS (see the LAST README file for how to use the fminer frontend binary for LAST), using cpdbdata (see http://github.com/amaunz/cpdbdata):

   /path/to/fminer -f14 /path/to/cpdbdata/salmonella_mutagenicity/salmonella_mutagenicity_alt.smi /path/to/cpdbdata/salmonella_mutagenicity/salmonella_mutagenicity_alt.class | ./last-utils.rb 1 "nls" > salm-last.smarts

   Note: This should be called from the last-utils directory, since last-utils.rb is invoked as ./last-utils.rb.
   Note: Variants 'msa' and 'nls' produce LAST-SMARTS with optional parts of the structures (using recursive SMARTS), while 'nop' disallows optional parts of the structures (ambiguities only on the atom / edge level).

2) Find instantiations of molecules in a .smi file using the LAST descriptors we just mined:

   /path/to/last-utils/last-utils.rb 2 /path/to/cpdbdata/salmonella_mutagenicity/salmonella_mutagenicity_alt.smi < salm-last.smarts > salm-last.inst

= NOTE:

For a precise synopsis and further information, run last-utils.rb without arguments.

= TRANSFORMATION TO SMARTS:

SMARTS are regular expressions for chemical fragments. The implementation used here (LAST-SMARTS) is generated recursively by a depth-first traversal of each LAST graph, starting at node 0. Atoms are represented by their atomic number, e.g. '#6' for carbon. Bonds are represented by their order (1-3). For an introduction to SMARTS, see e.g. http://www.daylight.com/meetings/summerschool01/course/basics/smarts.html.

For every node visited, we demand an explicit but arbitrary branch with '(~*)' IF AND ONLY IF there are n optional branches with n>1 (*).
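
The atom and bond encoding described above can be illustrated with a minimal Ruby sketch. It is not the code in last-utils.rb: the graph representation (a hash of atomic numbers plus an adjacency list), the names atom_smarts and to_smarts, and the choice to write every bond symbol explicitly are assumptions made for illustration. The sketch also assumes an acyclic graph without optional branches, so it covers only the plain depth-first part of the transformation.

```ruby
# Bond order => SMARTS bond symbol. Only orders 1-3 are covered here;
# aromatic bonds (':') are left out of this sketch.
BOND_SYMBOLS = { 1 => '-', 2 => '=', 3 => '#' }

# An atom is written by its atomic number, e.g. 6 -> "[#6]".
def atom_smarts(atomic_number)
  "[#" + atomic_number.to_s + "]"
end

# Depth-first traversal starting at node 0 (hypothetical data layout):
#   atoms: { node id => atomic number }
#   edges: { node id => [[neighbor id, bond order], ...] }
# Assumes the graph is a tree (no ring closures).
def to_smarts(atoms, edges, node = 0, parent = nil)
  out = atom_smarts(atoms[node])
  children = edges.fetch(node, []).reject { |nbr, _| nbr == parent }
  children.each_with_index do |(nbr, order), i|
    piece = BOND_SYMBOLS.fetch(order) + to_smarts(atoms, edges, nbr, node)
    # All children but the last are written as parenthesised branches.
    out << (i < children.size - 1 ? "(" + piece + ")" : piece)
  end
  out
end

# Nitrogen single-bonded to a carbon that is double-bonded to a carbon.
atoms = { 0 => 7, 1 => 6, 2 => 6 }
edges = { 0 => [[1, 1]], 1 => [[0, 1], [2, 2]], 2 => [[1, 2]] }
puts to_smarts(atoms, edges)   # prints [#7]-[#6]=[#6]
```

Keeping the parent out of the child list is what stops the traversal from walking back along the edge it arrived on. Optional branches and the '(~*)' rule are not handled in this sketch.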
In case of (*), for each branch bi, i in [1,n], we describe the 1-step ('local') neighborhood of the node by a recursive SMARTS pattern that includes the node itself, its predecessor, bi, and its successor. The pattern bi is itself a LAST-SMARTS, and the local neighborhoods are combined via disjunction (there must be at least two due to (*)).

Formally, LAST-SMARTS are defined as follows (EBNF):

   AN := '17' | '35' | '5' | '6' | '7' | '8' | '15' | '16' | '9' | '53'
   A  := (AN ',' A) | AN
   SB := '-' | '=' | '#' | ':'
   E  := (SB ',' E) | SB
   N  := '[#' A ']'
   LR := (L ',' LR) | L
   L  := N '(' E LS ')' ('(' E N ')')+
   BN := '[#' A ';$' '(' LR ')' ']' '(~*)'+
   LS := (N | BN) | LS E (N | BN)

Example:

   [#7] [#6;$([#6]([#7])([#7 ])=[#6 ]),$([#6]([#7][#6])([#7 ])=[#6 ])](~*) =[#6]

This denotes a nitrogen connected to a carbon that is double-bonded to another carbon. The middle carbon's local environment is described recursively. It consists of the backward and forward links, but additionally specifies either a nitrogen or a nitrogen/carbon branch. Since the SMARTS standard is "truly recursive", i.e. nothing inside the $(...) is identified with the outside, we need (~*) to enforce that at least one additional branch is actually attached.

Andreas Maunz, 2010
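
As an illustration of how a branch node is assembled from its local neighborhoods, here is a minimal Ruby sketch. The function name branch_node and its arguments are hypothetical and do not come from last-utils.rb. The output follows the form of the example above (one $(...) per neighborhood, joined by commas, then a single '(~*)'), written without the spaces that appear there, rather than a literal reading of the EBNF.

```ruby
# Build a branch node of the form
#   [#<atom>;$(<env 1>),$(<env 2>),...](~*)
# where each environment is the recursive SMARTS for one local neighborhood
# (node itself, predecessor, optional branch, successor).
def branch_node(atomic_number, environments)
  # Rule (*): a branch node is only emitted for n > 1 optional branches.
  if environments.size < 2
    raise ArgumentError, "need at least two optional branches"
  end
  # Inside the brackets, ',' is the SMARTS disjunction of the alternatives.
  alternatives = environments.map { |env| "$(" + env + ")" }.join(",")
  # The trailing (~*) demands one additional, arbitrary branch.
  "[#" + atomic_number.to_s + ";" + alternatives + "](~*)"
end

# The two local neighborhoods of the middle carbon in the example:
# a nitrogen branch, or a nitrogen/carbon branch.
envs = ["[#6]([#7])([#7])=[#6]", "[#6]([#7][#6])([#7])=[#6]"]
puts "[#7]" + branch_node(6, envs) + "=[#6]"
```

The '(~*)' suffix is appended outside the brackets because the environments inside $(...) only constrain the atom and do not bind any atoms of the surrounding pattern.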