Multi-label classification using the Classifier Chain framework #176

@andrewdalpino


Unlike binary or multiclass classification, multi-label classification aims to predict the presence of more than one class per sample. One approach to learning a model for this type of problem is to train a separate binary classifier for each unique class label, with each classifier detecting the presence of its own label. This is sometimes called the "one vs. rest" approach. The simplest case of multi-label classification via this method assumes that all labels are independent (the presence of one says nothing about the others). This case is referred to as Binary Relevance. A more complex approach is to model the correlations between labels using a directed acyclic graph (DAG) in which binary classifiers feed their predictions as features to downstream classifiers.

This ticket is to research and develop an ensemble classifier that is capable of multi-label classification and covers both the case where each downstream node has an indegree of at least 1 and the case with no edges between classifiers (Binary Relevance). If we can figure out a way to greedily construct a (semi-)optimal DAG using some heuristic approach, then even better!
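To make the DAG formulation above concrete, here is a minimal sketch in plain Python. Everything in it is an assumption for illustration rather than a proposed API: `Stump` is a hypothetical stand-in for any binary base learner, `LabelDagClassifier` and its `parents` mapping (label index to the label indices it depends on) are made-up names, and training each classifier on the true parent labels is just one common choice. An empty `parents` mapping reduces to Binary Relevance.

```python
from typing import Dict, List


class Stump:
    """Hypothetical base learner: a one-feature threshold rule."""

    def fit(self, X, y):
        best_err = len(y) + 1
        # Exhaustively try every feature, observed threshold, and polarity.
        for j in range(len(X[0])):
            for t in sorted({row[j] for row in X}):
                for flip in (False, True):
                    err = sum(self._rule(row[j], t, flip) != yi
                              for row, yi in zip(X, y))
                    if err < best_err:
                        best_err, self.rule = err, (j, t, flip)
        return self

    @staticmethod
    def _rule(value, threshold, flip):
        return int((value >= threshold) != flip)

    def predict(self, X):
        j, t, flip = self.rule
        return [self._rule(row[j], t, flip) for row in X]


class LabelDagClassifier:
    """One binary classifier per label, wired together by a label DAG."""

    def __init__(self, parents: Dict[int, List[int]], n_labels: int):
        self.parents = parents
        self.n_labels = n_labels
        self.order = self._topological_order()

    def _topological_order(self) -> List[int]:
        order, placed = [], set()
        while len(order) < self.n_labels:
            ready = [k for k in range(self.n_labels)
                     if k not in placed
                     and all(p in placed for p in self.parents.get(k, []))]
            if not ready:
                raise ValueError("parents must describe an acyclic graph")
            order.extend(ready)
            placed.update(ready)
        return order

    def fit(self, X, Y):
        self.models = {}
        for k in self.order:
            # Assumption: train on the original features plus TRUE parent labels.
            Xk = [list(x) + [y[p] for p in self.parents.get(k, [])]
                  for x, y in zip(X, Y)]
            self.models[k] = Stump().fit(Xk, [y[k] for y in Y])
        return self

    def predict(self, X):
        out = [[0] * self.n_labels for _ in X]
        for k in self.order:
            # At inference, parents are replaced by upstream PREDICTIONS.
            Xk = [list(x) + [o[p] for p in self.parents.get(k, [])]
                  for x, o in zip(X, out)]
            for o, pred in zip(out, self.models[k].predict(Xk)):
                o[k] = pred
        return out


# Toy data: label 0 fires when the first feature is >= 2; label 1 is its negation.
X = [[0.0, 1.0], [1.0, 0.0], [2.0, 1.0], [3.0, 0.0]]
Y = [[0, 1], [0, 1], [1, 0], [1, 0]]

binary_relevance = LabelDagClassifier(parents={}, n_labels=2).fit(X, Y)
chain = LabelDagClassifier(parents={1: [0]}, n_labels=2).fit(X, Y)
```

The toy data only exercises the mechanics; it is not meant to show that the chained variant is more accurate than Binary Relevance.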

Related literature:

https://arxiv.org/pdf/1912.13405.pdf

scikit-learn implementation (Binary Relevance only):

https://scikit-learn.org/dev/modules/generated/sklearn.multiclass.OneVsRestClassifier.html

Other things to consider ...

  • How expensive is it to construct a fully connected DAG in a real-world setting?
  • Could cross-validation methods be used to prune a fully connected DAG for better inference performance?
  • What changes to the Dataset API need to be made?
  • Should we use probabilities, class predictions, or some other output as derived features for downstream classifiers?
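As a rough aid for the cost and derived-feature questions above, the sketch below counts how large a fully connected DAG gets and shows the two most obvious feature encodings. It assumes "fully connected" means that, under some fixed label ordering, every classifier receives the outputs of all earlier ones. The function names, the `mode` parameter, and the 0.5 threshold are hypothetical and not part of any existing API.

```python
from typing import Dict, List


def fully_connected_cost(n_labels: int, n_features: int) -> Dict[str, object]:
    """Size of a DAG where label k receives the outputs of labels 0..k-1."""
    edges = n_labels * (n_labels - 1) // 2
    # The k-th classifier sees the original features plus k upstream outputs.
    input_widths = [n_features + k for k in range(n_labels)]
    return {
        "classifiers": n_labels,
        "edges": edges,
        "input_widths": input_widths,
        "total_inputs": sum(input_widths),
    }


def derived_features(probabilities: List[float], mode: str = "probability",
                     threshold: float = 0.5) -> List[float]:
    """Turn upstream outputs into features for a downstream classifier."""
    if mode == "probability":
        # Keeps the upstream classifier's confidence.
        return list(probabilities)
    if mode == "class":
        # Hard 0/1 decisions; discards confidence information.
        return [1.0 if p >= threshold else 0.0 for p in probabilities]
    raise ValueError(f"unknown mode: {mode}")


cost = fully_connected_cost(n_labels=4, n_features=10)
hard = derived_features([0.2, 0.5, 0.9], mode="class")
soft = derived_features([0.2, 0.5, 0.9], mode="probability")
```

Under this assumption the number of edges grows quadratically with the number of labels, and inference has to run the classifiers sequentially along the ordering, which is where pruning edges could pay off.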

Labels: Research (active area of research), enhancement (new feature or request)