NCI Yeast Anticancer Drug Screen: the large scale data used in "Large-Scale Graph Mining using Backbone Refinement Classes"
amaunz/data-yeast-ac
http://dtp.nci.nih.gov/yacds/download.html and
http://dtpsearch.ncifcrf.gov/FTP/OPEN_BABEL_SMILES.TXT

Created two activity files. A structure was classified as active if
  - at least one strain was >= 0.7 (ac_onestrain.class)
  - all strains were >= 0.7 (ac_allstrains.class)

|-- README                        this file
|-- ac_alt.smi                    compounds in SMILES format
|-- ac_alt_allstrains.active      sfgm compatible input format (actives)
|-- ac_alt_allstrains.class       lazar and libfminer compatible input format (all)
|-- ac_alt_allstrains.inactive    sfgm compatible input format (inactives)
|-- ac_alt_onestrain.active
|-- ac_alt_onestrain.class
|-- ac_alt_onestrain.inactive
|-- remove.rb                     remove trees without edges in gSpan file
|-- replace.rb                    restore correct numbering from ac.smi
|-- report.rb                     report trees without edges in gSpan file
|-- sdf2gsp.pl                    convert SDF Molfile to gSpan input format

How to create the gSpan input file format (perl and ruby required):

1) Remove numbers from the ac.smi file to obtain 1 SMILES per row in file ac.no_nr.smi
2) do 'babel -d -ismi ac.no_nr.smi -osdf ac.sdf' to convert to MOL format
3) do 'perl sdf2gsp.pl < ac.sdf > ac.no_nr.gsp' to convert to gSpan input format
4) do 'ruby replace.rb ac.no_nr.gsp ac.smi > ac.f.gsp' to restore correct numbering from the .smi file
   [ optional: do 'ruby report.rb ac.f.gsp' to report trees without edges ]
5) do 'ruby remove.rb ac.f.gsp > ac.gsp' to remove trees without edges
   [ optional: do 'ruby report.rb ac.gsp' to report trees without edges ]

POSTPROCESSED THE DATA (IMPORTANT)!

COMMENTS:
  - Alternative Database (therefore the 'alt' in filenames).
  - See file 'bad.txt' for removed molecules and reasons for removal.
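The activity rule stated near the top of this README (active if at least one strain, or all strains, reach 0.7) can be illustrated with a minimal Ruby sketch. This is not the script used to build the .class files. The method names, the in-memory hash layout, and the sample values are hypothetical and chosen only for illustration. Only the 0.7 threshold and the two ">=" rules come from the text above.

```ruby
# Threshold taken from the README; everything else here is an assumption.
THRESHOLD = 0.7

# "onestrain" rule: active if at least one strain value is >= threshold.
def active_onestrain?(values, threshold = THRESHOLD)
  raise ArgumentError, "no strain values" if values.empty?
  values.any? { |v| v >= threshold }
end

# "allstrains" rule: active only if every strain value is >= threshold.
def active_allstrains?(values, threshold = THRESHOLD)
  raise ArgumentError, "no strain values" if values.empty?
  values.all? { |v| v >= threshold }
end

# Hypothetical per-compound strain values (made-up numbers, not screen data).
example = {
  "c1" => [0.9, 0.2, 0.1],
  "c2" => [0.8, 0.7, 0.95],
  "c3" => [0.1, 0.3, 0.69],
}

# Print one tab-separated line per compound with both labels.
example.each do |id, values|
  puts "#{id}\tonestrain=#{active_onestrain?(values)}\tallstrains=#{active_allstrains?(values)}"
end
```

The two rules are kept as separate predicates because the repository ships them as separate label sets (onestrain and allstrains). An empty list of strain values raises an error here rather than being silently labelled, since the README does not say how such cases were handled.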