Supplementary materials for team Dmlab's Tox21 Data Challenge solution. Companion to the Frontiers in Environmental Science special issue article.

themrbarti/tox21-challenge-publication

Tox21 Data Challenge publication supplementary material

Supplementary materials for the Tox21 Data Challenge solution of team Dmlab. Companion to the article "Identifying biological pathway interrupting toxins using multi-tree ensembles", published in a special issue of the journal Frontiers in Environmental Science.

The publication is currently under review.

Materials included here:

  • /configuration/padel-descriptors.conf: Configuration file to generate descriptors & fingerprints in PaDEL-Descriptor
  • /misc/feature_importances_nr-aromatase.csv: top 200 most important features according to the random forest model for the winning track NR-aromatase
  • /misc/feature_importances_nr-ar.csv: top 200 most important features according to the random forest model for the winning track NR-AR
  • /misc/feature_importances_sr-p53.csv: top 200 most important features according to the random forest model for the winning track SR-p53
  • /misc/weights-relief-03.csv: Attribute weights generated by the attribute weighting scheme in RapidMiner
  • /processes/python/nr-aromatase.py: sample data preparation & modeling for winning track NR-aromatase
  • /processes/python/nr-ar.py: sample data preparation & modeling for winning track NR-AR
  • /processes/python/sr-p53.py: sample data preparation & modeling for winning track SR-p53
  • /processes/rapidminer/4-preparing-final-evaluation-set.rmp: RapidMiner process to generate final evaluation data set
  • /processes/rapidminer/3-selecting-attributes-by-weights.rmp: RapidMiner process for attribute weighting scheme
  • /processes/rapidminer/2-remove-missing-values.rmp: RapidMiner process to handle missing data
  • /processes/rapidminer/1-prepare-descriptions-fingerprints.rmp: RapidMiner process to prepare data set
  • /processes/knime/2-fingerprint-generation.zip: KNIME process to generate fingerprints
  • /processes/knime/1-descriptor-generation.zip: KNIME process to generate descriptors

The solution was developed using the following software versions:

PaDEL-Descriptor

KNIME Analytics Platform 2.10.1

RDKit KNIME Extension 2.4.0

RapidMiner 5.3.15

RapidMiner Feature Selection Extension 1.1.4

Python 2.7.5

Pandas library 0.14.1

Scikit-learn library 0.15.0
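
The feature importance files and Python scripts listed above are built around random forest models. As a rough illustration of how a top-200 ranking of that kind can be produced, here is a minimal sketch. It is not the code from /processes/python/, and it assumes a current scikit-learn release rather than the 0.15.0 release listed above. The function name `top_feature_importances`, the synthetic data, the feature names, and all model parameters are assumptions made for the example.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier


def top_feature_importances(X, y, feature_names, top_n=200, seed=0):
    """Fit a random forest and return the top_n (name, importance) pairs."""
    # n_estimators and class_weight are illustrative choices, not the
    # settings used in the repository's scripts.
    model = RandomForestClassifier(
        n_estimators=100, class_weight="balanced", random_state=seed
    )
    model.fit(X, y)
    importances = model.feature_importances_
    # Sort feature indices by descending importance and keep the first top_n.
    order = np.argsort(importances)[::-1][:top_n]
    return [(feature_names[i], float(importances[i])) for i in order]


# Synthetic, imbalanced stand-in for a descriptor/fingerprint matrix
# (hypothetical data, not the Tox21 set).
X, y = make_classification(
    n_samples=300, n_features=250, n_informative=20,
    weights=[0.9, 0.1], random_state=0,
)
names = ["desc_%d" % i for i in range(X.shape[1])]

top = top_feature_importances(X, y, names, top_n=200)
for name, score in top[:5]:
    print("%s\t%.4f" % (name, score))
```

With real data, `X` would hold the generated descriptors and fingerprints for one assay and `y` the active/inactive labels. The impurity-based `feature_importances_` attribute is only one way to rank features, so the resulting order may differ from the CSV files in /misc.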
