leADS: improved metabolic pathway inference based on active dataset subsampling

hallamlab/leADS

Workflow

Basic Description

This repo contains an implementation of leADS (multi-label learning based on Active Dataset Subsampling), which subsamples examples from the data to reduce the negative impact of training loss. Specifically, leADS iterates over three steps: (a) construct an acquisition model in an ensemble framework; (b) select informative examples using an acquisition function (entropy, mutual information, variation ratios, or normalized propensity scored precision at k); and (c) train on the reduced set of selected examples. The ensemble approach is intended to improve the generalization ability of multi-label learning systems by concurrently building and running a group of multi-label base learners, each assigned a portion of the samples, to ensure proper learning of class labels (e.g. pathways). leADS was evaluated on the pathway prediction task using 10 multi-organism pathway datasets, where it achieved performance competitive with state-of-the-art pathway inference algorithms.
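To make step (b) concrete, here is a minimal sketch of how three of the acquisition functions named above (entropy, mutual information, variation ratios) could score examples from ensemble predictions. It is an illustration only, not the leADS implementation: the function names, the `member_probs[m][j]` layout (probability from ensemble member `m` for label `j`), the 0.5 voting threshold, and the toy `pool` are all assumptions, and the propensity scored precision at k criterion is omitted.

```python
import math


def binary_entropy(p, eps=1e-12):
    """Entropy (in nats) of a Bernoulli variable with success probability p."""
    p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
    return -(p * math.log(p) + (1.0 - p) * math.log(1.0 - p))


def entropy_score(member_probs):
    """Entropy of the ensemble-averaged prediction, summed over labels."""
    n_members = len(member_probs)
    n_labels = len(member_probs[0])
    total = 0.0
    for j in range(n_labels):
        mean_p = sum(member_probs[m][j] for m in range(n_members)) / n_members
        total += binary_entropy(mean_p)
    return total


def mutual_information_score(member_probs):
    """Entropy of the mean prediction minus the mean per-member entropy.

    High values indicate disagreement between ensemble members.
    """
    n_members = len(member_probs)
    expected = sum(binary_entropy(p) for row in member_probs for p in row) / n_members
    return entropy_score(member_probs) - expected


def variation_ratio_score(member_probs, threshold=0.5):
    """Fraction of members voting against the majority, averaged over labels."""
    n_members = len(member_probs)
    n_labels = len(member_probs[0])
    total = 0.0
    for j in range(n_labels):
        votes = sum(1 for m in range(n_members) if member_probs[m][j] >= threshold)
        modal = max(votes, n_members - votes)
        total += 1.0 - modal / n_members
    return total / n_labels


def subsample(pool, score_fn, k):
    """Return the indices of the k highest-scoring examples in the pool."""
    scores = [score_fn(example) for example in pool]
    ranked = sorted(range(len(pool)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]


# Hypothetical pool: 3 examples, each scored by 3 ensemble members on 2 labels.
pool = [
    [[0.9, 0.1], [0.9, 0.1], [0.9, 0.1]],  # members agree and are confident
    [[0.5, 0.5], [0.5, 0.5], [0.5, 0.5]],  # members agree but are uncertain
    [[0.9, 0.1], [0.1, 0.9], [0.9, 0.1]],  # members disagree
]

print(subsample(pool, entropy_score, 1))
print(subsample(pool, mutual_information_score, 1))
print(subsample(pool, variation_ratio_score, 1))
```

On this toy pool, entropy ranks the uniformly uncertain example highest, while mutual information and variation ratios rank the example on which members disagree highest. In an iterative procedure like the one described above, the selected indices would then define the reduced training set for step (c).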

See tutorials on the GitHub wiki page for more information and guidelines.

Citing

If you find leADS useful in your research, please consider citing the following paper:

Contact

For any inquiries, please contact: arbasher@student.ubc.ca