Merge branch 'release/v3.2.0'
rautenberg committed Jul 13, 2012
2 parents 6dc3e74 + bc73e23 commit 9d0d964
Showing 1,225 changed files with 77,734 additions and 494 deletions.
5 changes: 4 additions & 1 deletion .gitmodules
@@ -3,4 +3,7 @@
url = git://
[submodule "last-utils"]
path = last-utils
url = git://
url = git://
[submodule "bbrc-sample"]
path = bbrc-sample
url = git://
13 changes: 13 additions & 0 deletions ChangeLog
@@ -1,3 +1,16 @@
v4.0.0 2012-07-12
* bbrc-sample as submodule
* fminer feature datasets carry metadata (minfreq, nr_hits, ...)
* matching service now uses last-utils, calculates p-values
* fminer support for percentage and per-mil frequencies
* switch to opentox-ruby version 4.0.0

* Dichotomy between nominal and numeric features removed, which allows for uniform handling of all descriptors
* Uniform interface /pc for feature generation (for PC descriptors)
* Uniform interface /fs for feature selection (using recursive feature elimination)
* min_sim for cosine similarity corrected

v3.1.0 2012-02-24
* lazar.rb: pc type parameter in model, cleaned all parameters, propositionalized learning only for SVM, switch for minimal training performance, removed conf_stdev
* fminer.rb: feature match service for datasets, also with number of hits
188 changes: 126 additions & 62 deletions
@@ -3,66 +3,104 @@ OpenTox Algorithm

- An [OpenTox]( REST Webservice
- Implements the OpenTox algorithm API for
- fminer
- lazar
- subgraph descriptor calculation (fminer)
- physico-chemical descriptor calculation (pc) for more than 300 descriptors
- feature selection (fs) using recursive feature elimination (rfe)
- See [opentox-ruby on]( for high-level workflow documentation

REST operations

Get a list of all algorithms GET / - URIs of algorithms 200
Get a representation of the GET /fminer/ - fminer representation 200,404
fminer algorithms
Get a representation of the GET /fminer/bbrc - bbrc representation 200,404
Get a representation of the GET /lazar - lazar representation 200,404
lazar algorithm
Get a list of all algorithms GET / - URIs of algorithms 200
Get a representation of the GET /fminer/ - fminer representation 200,404
fminer algorithms
Get a representation of the GET /fminer/bbrc - bbrc representation 200,404
bbrc algorithm
Get a representation of the GET /fminer/last - last representation 200,404
last algorithm
Get a representation of the GET /lazar - lazar representation 200,404
lazar algorithm
Get a representation of the GET /feature_selection - feature selection representation 200,404
feature selection algorithms
Get a representation of the GET /feature_selection/rfe - rfe representation 200,404
rfe algorithm

Create bbrc features POST /fminer/bbrc dataset_uri, URI for feature dataset 200,400,404,500
[min_frequency=5 per-mil],
Create last features POST /fminer/last dataset_uri, URI for feature dataset 200,400,404,500
[min_frequency=8 %],
Create lazar model POST /lazar dataset_uri, URI for lazar model 200,400,404,500
[nr_hits=false (class. using wt. maj. vote), true (else)],
[min_sim=0.3 (nominal), 0.4 (numeric features)]

Create selected features POST /feature_selection/rfe dataset_uri, URI for dataset 200,400,404,500

Get a representation of the GET /fminer/last - last representation 200,404
last algorithm
Get a representation of the GET /pc - URIs of algorithms 200,404
pc algorithms
Get a representation of the GET /pc/<name> - descriptor representation 200,404
pc algorithm <name>
Get a representation of the GET /fs - URIs of algorithms 200,404
fs algorithms
Get a representation of the GET /fs/rfe - rfe representation 200,404
rfe algorithm
Create lazar model POST /lazar dataset_uri, URI for lazar model 200,400,404,500
[nr_hits=false (cl+wmv),
true (else)],
[min_sim=0.3 (nominal), 0.4
(numeric features)],
Create bbrc features POST /fminer/bbrc dataset_uri, URI for feature dataset 200,400,404,500
[min_frequency=5 per-mil],
Create last features POST /fminer/last dataset_uri, URI for feature dataset 200,400,404,500
[min_frequency=8 %],
Create features POST /pc/AllDescriptors dataset_uri, URI for dataset 200,400,404,500
Create feature POST /pc/<name> dataset_uri URI for dataset 200,400,404,500
Select features POST /fs/rfe dataset_uri, URI for dataset 200,400,404,500
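
The `min_frequency` defaults in the table are relative values (5 per-mil for bbrc, 8 % for last). To illustrate what such a threshold means for a concrete dataset, the following Ruby sketch converts a relative frequency into an absolute compound count. The helper name `absolute_min_frequency`, the unit symbols, and the round-up rule are assumptions made for this example; they are not taken from the service's implementation.

```ruby
# Hypothetical helper (not part of the service): converts a relative minimum
# frequency into an absolute number of compounds.
DIVISORS = { percent: 100, per_mil: 1000 }.freeze

def absolute_min_frequency(value, unit, nr_compounds)
  divisor = DIVISORS.fetch(unit) { raise ArgumentError, "unknown unit: #{unit.inspect}" }
  # Rational arithmetic avoids floating-point surprises before rounding up.
  count = (value.to_r * nr_compounds / divisor).ceil
  # Assumption: a pattern has to occur in at least one compound.
  [count, 1].max
end
```

Under these assumptions, `absolute_min_frequency(5, :per_mil, 1000)` yields 5 compounds and `absolute_min_frequency(8, :percent, 50)` yields 4.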


- prediction\_algorithm: One of "weighted\_majority\_vote" (default for classification), "local\_svm\_classification", "local\_svm\_regression" (default for regression). "weighted\_majority\_vote" is not applicable for regression.
- pc_type: Mandatory for feature dataset, one of [geometrical, topological, electronic, constitutional, hybrid, cpsa].
- nr_hits: Whether nominal features should be instantiated with their occurrence counts in the instances. One of "true", "false".
- min_sim: The minimum similarity threshold for neighbors. Numeric value in [0,1].
- min_train_performance: The minimum training performance for "local\_svm\_classification" (Accuracy) and "local\_svm\_regression" (R-squared). Numeric value in [0,1].
- del_missing: one of true, false
- *del_missing*: one of
- *true*
- *false*

- *feature\_type*: Type of subgraphs when no feature dataset is supplied, one of
- *trees*
- *paths*

- *lib*: Mandatory for feature datasets that do not contain appropriate feature metadata, one of
- *cdk*
- *openbabel*
- *joelib*

- *min_sim*: The minimum similarity threshold for neighbors. Numeric value in [0,1].

- *min_train_performance*: The minimum training performance for *local\_svm\_classification* (Accuracy) and *local\_svm\_regression* (R-squared). Numeric value in [0,1].

See for a graphical overview.
- *nr_hits*: Whether nominal features should be instantiated with their occurrence counts in the instances. One of
- *true*
- *false*

- *pc_type*: Mandatory for feature datasets that do not contain appropriate feature metadata, one of
- *geometrical*
- *topological*
- *electronic*
- *constitutional*
- *hybrid*
- *cpsa*

- *prediction\_algorithm*: One of
- *weighted\_majority\_vote* (default for classification, n.a. for regression)
- *local\_svm\_classification*
- *local\_svm\_regression* (default for regression).
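
A client may want to check parameters against the values listed above before sending a request. The Ruby sketch below shows one way to do that. The constant and method names (`ENUM_PARAMS`, `UNIT_INTERVAL_PARAMS`, `validate_params`) are hypothetical, and the service itself may validate differently or accept further values.

```ruby
# Allowed values, copied from the parameter list in this README.
ENUM_PARAMS = {
  "del_missing"          => %w[true false],
  "feature_type"         => %w[trees paths],
  "lib"                  => %w[cdk openbabel joelib],
  "nr_hits"              => %w[true false],
  "pc_type"              => %w[geometrical topological electronic constitutional hybrid cpsa],
  "prediction_algorithm" => %w[weighted_majority_vote local_svm_classification local_svm_regression]
}.freeze

# Parameters documented as numeric values in [0,1].
UNIT_INTERVAL_PARAMS = %w[min_sim min_train_performance].freeze

# Returns a list of error messages; an empty list means no problem was found.
# Parameters that are not listed above are passed through unchecked.
def validate_params(params)
  errors = []
  params.each do |name, value|
    name = name.to_s
    value = value.to_s
    if ENUM_PARAMS.key?(name)
      allowed = ENUM_PARAMS[name]
      errors << "#{name} must be one of #{allowed.join(', ')}" unless allowed.include?(value)
    elsif UNIT_INTERVAL_PARAMS.include?(name)
      number = begin
        Float(value)
      rescue ArgumentError
        nil
      end
      errors << "#{name} must be a number in [0,1]" unless number && number.between?(0.0, 1.0)
    end
  end
  errors
end
```

For example, `validate_params("lib" => "cdk", "min_sim" => "0.3")` returns an empty list, while `validate_params("min_sim" => "1.5")` returns one error message.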

Supported MIME formats
@@ -76,17 +114,39 @@ Examples

NOTE: hosts the stable version that might not have complete functionality yet. **Please try** for latest versions.

### Get the OWL-DL representation of lazar


### Get the OWL-DL representation of fminer


### Get the OWL-DL representation of lazar
### Get the OWL-DL representation of pc


### Get the OWL-DL representation of fs


* * *

The following creates datasets with backbone refinement class representatives or latent structure patterns, using supervised graph mining, see These features can be used e.g. as structural alerts, as descriptors (fingerprints) for prediction models or for similarity calculations.
### Create lazar model

Creates a standard Lazar model with subgraph descriptors.

curl -X POST -d dataset_uri={dataset_uri} -d prediction_feature={feature_uri} -d feature_generation_uri=

Creates a Lazar model with physico-chemical descriptors.

curl -X POST -d dataset_uri={dataset_uri} -d prediction_feature={feature_uri} -d feature_dataset_uri={feature_dataset_uri}

feature_uri specifies the dependent variable from the dataset.
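
The same kind of request can be assembled from Ruby with the standard `net/http` library. The sketch below only builds the POST request for the `/lazar` route with the form fields used in the curl call for physico-chemical descriptors; it does not send anything. The service root `http://localhost:8080`, the method name `build_lazar_request`, and the example URIs are placeholders assumed for illustration, since this README excerpt does not show the real host.

```ruby
require "net/http"
require "uri"

# Placeholder service root (assumption); replace with the actual host.
SERVICE_ROOT = "http://localhost:8080"

# Builds, but does not send, a form-encoded POST request for POST /lazar.
def build_lazar_request(dataset_uri, prediction_feature, feature_dataset_uri = nil)
  request = Net::HTTP::Post.new(URI.join(SERVICE_ROOT, "/lazar"))
  form = {
    "dataset_uri"        => dataset_uri,
    "prediction_feature" => prediction_feature
  }
  # Optional: reuse an existing feature dataset (e.g. physico-chemical descriptors).
  form["feature_dataset_uri"] = feature_dataset_uri if feature_dataset_uri
  request.set_form_data(form)
  request
end

request = build_lazar_request(
  "http://example.org/dataset/1",           # hypothetical dataset URI
  "http://example.org/dataset/1/feature/1", # hypothetical dependent variable
  "http://example.org/dataset/2"            # hypothetical feature dataset URI
)
puts request.path
```

To send the request, it can be passed to `Net::HTTP.start(host, port) { |http| http.request(request) }`. According to the table of REST operations, the response contains the URI of the lazar model.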

* * *

Creates subgraph descriptors with backbone refinement class representatives or latent structure patterns, using supervised graph mining, see These features can be used e.g. as structural alerts, as descriptors (fingerprints) for prediction models or for similarity calculations.

### Create the full set of frequent and significant subtrees

@@ -101,30 +161,34 @@ backbone=false reduces BBRC mining to frequent and correlated subtree mining (mu

feature_uri specifies the dependent variable from the dataset.
Adding -d nr_hits=true produces frequency counts per pattern and molecule.
Please click [here]( for more guidance on usage.
Click [here]( for more guidance on usage.

### Create [LAST-PM]( descriptors, recommended for small to medium-sized datasets.

curl -X POST -d dataset_uri={dataset_uri} -d prediction_feature={feature_uri} -d min_frequency={min_frequency}

feature_uri specifies the dependent variable from the dataset.
Adding -d nr_hits=true produces frequency counts per pattern and molecule.
Please click [here]( for more guidance on usage.
Click [here]( for more guidance on usage.

* * *

### Create lazar model
* * *

Creates a standard Lazar model.
### Create a feature dataset of physico-chemical descriptors with CDK

curl -X POST -d dataset_uri={dataset_uri} -d prediction_feature={feature_uri} -d feature_generation_uri=
curl -X POST -d dataset_uri={dataset_uri} -d lib=cdk

[API documentation](
lib specifies the library to use.

* * *

### Create a feature dataset of selected features
curl -X POST -d dataset_uri={dataset_uri} -d prediction_feature_uri={prediction_feature_uri} -d feature_dataset_uri={feature_dataset_uri} -d del_missing=true
### Select features from a feature dataset

curl -X POST -d dataset_uri={dataset_uri} -d prediction_feature={feature_uri} -d feature_dataset_uri={feature_dataset_uri}

feature_uri specifies the dependent variable from the dataset.

* * *

Copyright (c) 2009-2011 Christoph Helma, Martin Guetlein, Micha Rautenberg, Andreas Maunz, David Vorgrimmler, Denis Gebele. See LICENSE for details.

36 changes: 26 additions & 10 deletions application.rb
@@ -1,17 +1,33 @@
# Java odds and ends (environment and classpath setup)
ENV["JAVA_HOME"] = "/usr/lib/jvm/java-6-sun" unless ENV["JAVA_HOME"]
ENV["JOELIB2"] = File.join File.expand_path(File.dirname(__FILE__)),"java"
deps = []
deps << "#{ENV["JAVA_HOME"]}/lib/tools.jar"
deps << "#{ENV["JAVA_HOME"]}/lib/classes.jar"
deps << "#{ENV["JOELIB2"]}"
jars = Dir[ENV["JOELIB2"]+"/*.jar"].collect {|f| File.expand_path(f) }
deps = deps + jars
ENV["CLASSPATH"] = deps.join(":")

require 'rubygems'
# AM LAST: can include both libs, no problems
require File.join(File.expand_path(File.dirname(__FILE__)), 'libfminer/libbbrc/bbrc') # has to be included before openbabel, otherwise we have strange SWIG overloading problems
require File.join(File.expand_path(File.dirname(__FILE__)), 'libfminer/liblast/last') # has to be included before openbabel, otherwise we have strange SWIG overloading problems
require File.join(File.expand_path(File.dirname(__FILE__)), 'last-utils/lu.rb') # AM LAST
gem "opentox-ruby", "~> 3"

# fminer libs to be included before openbabel, otherwise strange SWIG overloading problems
require File.join(File.expand_path(File.dirname(__FILE__)), 'libfminer/libbbrc/bbrc')
require File.join(File.expand_path(File.dirname(__FILE__)), 'libfminer/liblast/last')
require File.join(File.expand_path(File.dirname(__FILE__)), 'last-utils/lu.rb')

gem "opentox-ruby", "~> 4"
require 'opentox-ruby'
require 'rjb'
require 'rinruby'

#require 'smarts.rb'
#require 'similarity.rb'
require 'openbabel.rb'
# main
require 'fminer.rb'
require 'lazar.rb'
require 'feature_selection.rb'
require 'fs.rb'
require 'pc.rb'

set :lock, true

@@ -23,7 +39,7 @@
# @return [text/uri-list] algorithm URIs
get '/?' do
list = [ url_for('/lazar', :full), url_for('/fminer/bbrc', :full), url_for('/fminer/last', :full), url_for('/feature_selection/rfe', :full) ].join("\n") + "\n"
list = [ url_for('/lazar', :full), url_for('/fminer/bbrc', :full), url_for('/fminer/bbrc/sample', :full), url_for('/fminer/last', :full), url_for('/fminer/bbrc/match', :full), url_for('/fminer/last/match', :full), url_for('/feature_selection/rfe', :full), url_for('/pc', :full) ].join("\n") + "\n"
case request.env['HTTP_ACCEPT']
when /text\/html/
content_type "text/html"