This repository has been archived by the owner on Jan 7, 2020. It is now read-only.

minor changes
jannahastings committed Nov 25, 2011
1 parent d6e0cc2 commit 9e3a9bb
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion strucontjcheminf.tex
100755 → 100644
@@ -461,7 +461,7 @@ \subsection*{Algorithmic and statistical approaches to automatic hierarchy const
% But note that it doesn't give you hierarchies because IT IS NOT COMPOSITIONAL. So C cannot be derived from X.
% Markush notation was invented to baffle and bullshit the enemy.

-All the above is basically rule-based but machine learning approaches to QSAR have become fashionable recently. Supervised methods, such as Bayesian classifiers, decision trees and support vector machines, are employed to classify compounds for a particular functional activity class. However, these approaches, resulting as they do in binary output, do not immediately lend themselves to hierarchical classifications. Supervised machine learning for prediction of chemical class membership based on an existing hierarchy is an interesting option but have hitherto only been shown to work (HELLO?) on small numbers of classes, and furthermore requires a reasonably large training set of chemicals which are already classified. Although existing databases like ChEBI and MeSH~\cite{meshUrl} could act as training sets for such an endeavor, the size of these data is still a tiny fraction of the enormous chemical space, and the problem is further complicated by the fact that the leaf nodes of such classification trees normally contain few structures. Also the ChEBI classification, at least, is far from complete. An arbitrary compounds belongs to loads, literally loads of classes and will only have been classified under one or two.
+All the above is basically rule-based but machine learning approaches to QSAR have become prominent in recent research. Supervised methods, such as Bayesian classifiers, decision trees and support vector machines, are employed to classify compounds for a particular functional activity class. However, these approaches, resulting as they do in binary output, do not immediately lend themselves to hierarchical classifications. Supervised machine learning for prediction of chemical class membership based on an existing hierarchy is an interesting option but have hitherto only been shown to work (HELLO?) on small numbers of classes, and furthermore requires a reasonably large training set of chemicals which are already classified. Although existing databases like ChEBI and MeSH~\cite{meshUrl} could act as training sets for such an endeavor, the size of these data is still a tiny fraction of the enormous chemical space, and the problem is further complicated by the fact that the leaf nodes of such classification trees normally contain few structures. Also the ChEBI classification, at least, is far from complete. An arbitrary compounds belongs to a vast number of classes and will only have been classified under one or two.

%Machine learning has to be further discussed. In particular the way that machine learning allows application of certain types of hierarchy construction on a larger scale than exhaustive pairwise alignment?
