# Classification

This notebook discusses multi-label classification methods for the [academia.stackexchange.com](https://academia.stackexchange.com/) data dump.

Multi-label classification methods fall into three categories: problem transformation, algorithm adaption and ensembles. The classifiers presented here are applied to the data preprocessed with [BoW](4.1-me-classification-bow.ipynb), [Word2Vec](4.2-me-classification-word2vec.ipynb), [Doc2Vec](4.2-me-classification-doc2vec.ipynb) and [Fasttext](4.3-me-classification-fasttext.ipynb).

## Table of Contents
* [Problem Transformation](#problem_transformation)
* [Algorithm Adaption](#algorithm_adaption)
* [Ensembles](#ensembles)

<a id='problem_transformation'/>

## Problem Transformation

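Problem transformation methods convert the multi-label classification task into one or more single-label tasks. Binary Relevance, for instance, divides it into n binary classification tasks (n = |Labels|), one per label.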

**DecisionTreeClassifier**

~~**ExtraTreesClassifier**~~

**KNeighborsClassifier**

~~**RadiusNeighborsClassifier**~~

**MLPClassifier**

**MultiOutputClassifier**
MultiOutputClassifier wraps a scikit-learn classifier and fits one copy of it per label, which corresponds to the Binary Relevance transformation.
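
As a rough illustration of Binary Relevance, the sketch below wraps a `DecisionTreeClassifier` in scikit-learn's `MultiOutputClassifier`. The synthetic data from `make_multilabel_classification` and all parameter values are assumptions standing in for the vectorised posts and their tags; they are not taken from the notebooks linked above.

```python
from sklearn.datasets import make_multilabel_classification
from sklearn.model_selection import train_test_split
from sklearn.multioutput import MultiOutputClassifier
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in (assumption) for the feature matrix X and the
# binary label indicator matrix Y.
X, Y = make_multilabel_classification(
    n_samples=300, n_features=20, n_classes=5, random_state=0
)
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, random_state=0)

# One independent copy of the base classifier is fitted per label column.
binary_relevance = MultiOutputClassifier(DecisionTreeClassifier(random_state=0))
binary_relevance.fit(X_train, Y_train)

# The prediction has one 0/1 column per label.
Y_pred = binary_relevance.predict(X_test)
```

After fitting, `binary_relevance.estimators_` holds the per-label classifiers, so any of them can be inspected separately.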

**Classifier Chain**
<cite>[Read et al., 2011][1]</cite>
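
A classifier chain also fits one binary classifier per label, but each classifier additionally receives the labels earlier in the chain as input features. A minimal sketch using scikit-learn's `ClassifierChain` could look as follows; the synthetic data, the base classifier and the random chain order are illustrative assumptions.

```python
from sklearn.datasets import make_multilabel_classification
from sklearn.model_selection import train_test_split
from sklearn.multioutput import ClassifierChain
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in (assumption) for the real features and labels.
X, Y = make_multilabel_classification(
    n_samples=300, n_features=20, n_classes=5, random_state=0
)
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, random_state=0)

# The labels are chained in a random order; each link sees the original
# features plus the labels of all earlier links.
chain = ClassifierChain(
    DecisionTreeClassifier(random_state=0), order="random", random_state=0
)
chain.fit(X_train, Y_train)

# Columns of the prediction are returned in the original label order.
Y_pred = chain.predict(X_test)
```

Because the result depends on the chain order, several chains with different orders are often trained and combined.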

**LabelPowerset**
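
Label Powerset treats every distinct label combination seen during training as one class of a single multi-class problem. The sketch below implements that idea by hand with NumPy and a scikit-learn classifier; it is not the skmultilearn implementation, and the data and base classifier are assumptions.

```python
import numpy as np
from sklearn.datasets import make_multilabel_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in (assumption) for the real features and labels.
X, Y = make_multilabel_classification(
    n_samples=300, n_features=20, n_classes=5, random_state=0
)
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, random_state=0)

# Each distinct row of Y_train (a label combination) becomes one class id.
labelsets, class_ids = np.unique(Y_train, axis=0, return_inverse=True)
class_ids = class_ids.ravel()  # guard: some NumPy versions return a 2D inverse

# A single multi-class classifier predicts the class id of the combination.
clf = DecisionTreeClassifier(random_state=0).fit(X_train, class_ids)

# Class ids are mapped back to the label combinations they stand for.
Y_pred = labelsets[clf.predict(X_test)]
```

A consequence of this construction is that only label combinations present in the training data can ever be predicted.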

**ClasswiseClassifier**

[1]: https://doi.org/10.1007/s10994-011-5256-5

<a id='algorithm_adaption'/>

## Algorithm Adaption

Classifiers in the algorithm adaption category were specifically designed for multi-label classification tasks. They are usually adaptions of classifiers used for binary classification.

**MLkNN**

> Firstly, for each test instance, its k nearest neighbors in the training set are identified. Then, according to statistical information gained from the label sets of these neighboring instances, i.e. the number of neighboring instances belonging to each possible class, maximum a posteriori (MAP) principle is utilized to determine the label set for the test instance.
<cite>[Zhang & Zhou, 2007][1]</cite>
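
The two steps in the quote can be sketched with NumPy and scikit-learn's `NearestNeighbors`. This is a simplified reading of the procedure, not the skmultilearn implementation; the function name `mlknn_fit_predict`, the smoothing parameter `s`, the value of `k` and the synthetic data are assumptions made for illustration.

```python
import numpy as np
from sklearn.datasets import make_multilabel_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import NearestNeighbors


def mlknn_fit_predict(X_train, Y_train, X_test, k=5, s=1.0):
    """Hypothetical helper: neighbour label counts plus a per-label MAP rule."""
    m, n_labels = Y_train.shape
    knn = NearestNeighbors(n_neighbors=k + 1).fit(X_train)

    # Smoothed prior probability that each label is present or absent.
    prior1 = (s + Y_train.sum(axis=0)) / (2 * s + m)
    prior0 = 1.0 - prior1

    # For every training point, count how many neighbours carry each label.
    # The first neighbour is dropped, assuming it is the point itself
    # (i.e. no duplicate points in the training set).
    idx = knn.kneighbors(X_train, return_distance=False)[:, 1:]
    counts = Y_train[idx].sum(axis=1)  # shape (m, n_labels)

    # c1[l, j]: training points WITH label l whose neighbours carry l j times.
    # c0[l, j]: the same for training points WITHOUT label l.
    c1 = np.zeros((n_labels, k + 1))
    c0 = np.zeros((n_labels, k + 1))
    for l in range(n_labels):
        has_label = Y_train[:, l] == 1
        c1[l] = np.bincount(counts[has_label, l], minlength=k + 1)
        c0[l] = np.bincount(counts[~has_label, l], minlength=k + 1)

    # Smoothed likelihood of seeing j labelled neighbours given presence/absence.
    like1 = (s + c1) / (s * (k + 1) + c1.sum(axis=1, keepdims=True))
    like0 = (s + c0) / (s * (k + 1) + c0.sum(axis=1, keepdims=True))

    # Test time: count labels among the k nearest training points, apply MAP.
    test_idx = knn.kneighbors(X_test, n_neighbors=k, return_distance=False)
    test_counts = Y_train[test_idx].sum(axis=1)  # shape (n_test, n_labels)
    labels = np.arange(n_labels)
    post1 = prior1 * like1[labels, test_counts]
    post0 = prior0 * like0[labels, test_counts]
    return (post1 > post0).astype(int)


# Synthetic stand-in (assumption) for the real features and labels.
X, Y = make_multilabel_classification(
    n_samples=300, n_features=20, n_classes=5, random_state=0
)
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, random_state=0)
Y_pred = mlknn_fit_predict(X_train, Y_train, X_test, k=5)
```

Each label is decided independently here: the neighbour count for a label is the only evidence used for that label.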

~~**BRkNN**~~

~~> BRkNN is an adaptation of the kNN algorithm for multilabel classification that is conceptually equivalent to using the popular Binary Relevance problem transformation method in conjunction with the kNN algorithm, but |L| times faster. [L = labels]
<cite>[Spyromitros et al., 2008][2]</cite>~~

~~*BRkNNa*~~

~~>  This version of the classifier assigns the labels that are assigned to at least half of the neighbors.
<cite>[skmultilearn][3]</cite>~~

~~*BRkNNb*~~

~~>  This version of the classifier assigns the most popular m labels of the neighbors, where m is the average number of labels assigned to the object’s neighbors.
<cite>[skmultilearn][4]</cite>~~

**MLARAM**

> an extension of fuzzy Adaptive Resonance Associative Map (ARAM) – an Adaptive Resonance Theory (ART) based neural network. It aims at speeding up the classification process in the presence of very large data.
<cite>[F. Benites & E. Sapozhnikova, 2015][5]</cite>

[1]: https://doi.org/10.1016/j.patcog.2006.12.019
[2]: https://doi.org/10.1007/978-3-540-87881-0_40
[3]: http://scikit.ml/api/skmultilearn.adapt.brknn.html#skmultilearn.adapt.BRkNNaClassifier
[4]: http://scikit.ml/api/skmultilearn.adapt.brknn.html#skmultilearn.adapt.BRkNNbClassifier
[5]: https://doi.org/10.1109/ICDMW.2015.14

<a id='ensembles'/>

## Ensembles

Classifier ensembles usually train several classifiers whose predictions are combined to decide which labels are assigned to a sample.

~~**ExtraTreesClassifier**~~

~~**RandomForestClassifier**~~

**RAkEL**

> Rakel: randomly breaking the initial set of labels into a number of small-sized labelsets, and employing [Label powerset] to train a corresponding multilabel classifier.
<cite>[Tsoumakas et al., 2011][1]</cite>

*RAkELo*

> Divides the label space in to m subsets of size k, trains a Label Powerset classifier for each subset and assign a label to an instance if more than half of all classifiers (majority) from clusters that contain the label assigned the label to the instance.
<cite>[skmultilearn][2]</cite>
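
The overlapping variant can be sketched by hand as follows. This is an illustrative approximation rather than skmultilearn's `RakelO`: the values of `k` and `m`, the base classifier and the synthetic data are assumptions, and unlike the original method the random draws below may produce the same labelset twice.

```python
import numpy as np
from sklearn.datasets import make_multilabel_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in (assumption) for the real features and labels.
X, Y = make_multilabel_classification(
    n_samples=300, n_features=20, n_classes=6, random_state=0
)
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, random_state=0)

rng = np.random.default_rng(0)
n_labels, k, m = Y.shape[1], 3, 8  # m labelsets of size k (illustrative values)

votes = np.zeros((X_test.shape[0], n_labels))  # positive votes per label
appearances = np.zeros(n_labels)               # models that cover each label
for _ in range(m):
    subset = rng.choice(n_labels, size=k, replace=False)

    # Label Powerset restricted to the labels in this subset.
    labelsets, class_ids = np.unique(
        Y_train[:, subset], axis=0, return_inverse=True
    )
    clf = DecisionTreeClassifier(random_state=0).fit(X_train, class_ids.ravel())

    votes[:, subset] += labelsets[clf.predict(X_test)]
    appearances[subset] += 1

# A label is assigned if more than half of the models covering it voted for it.
# Labels that were never drawn receive no votes and stay unassigned.
Y_pred = (votes > appearances / 2).astype(int)
```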

*RAkELd*

> Divides the label space in to equal partitions of size k, trains a Label Powerset classifier per partition and predicts by summing the result of all trained classifiers.
<cite>[skmultilearn][3]</cite>
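
For the disjoint variant, every label belongs to exactly one partition, so combining the per-partition predictions amounts to writing each result into its own columns. The sketch below is a hand-written approximation, not skmultilearn's `RakelD`; the partition size `k`, the base classifier and the synthetic data are assumptions.

```python
import numpy as np
from sklearn.datasets import make_multilabel_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in (assumption) for the real features and labels.
X, Y = make_multilabel_classification(
    n_samples=300, n_features=20, n_classes=6, random_state=0
)
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, random_state=0)

rng = np.random.default_rng(0)
n_labels, k = Y.shape[1], 2  # partition size k is an illustrative value

# Shuffle the label indices and cut them into disjoint groups of size k.
shuffled = rng.permutation(n_labels)
partitions = [shuffled[i:i + k] for i in range(0, n_labels, k)]

Y_pred = np.zeros((X_test.shape[0], n_labels), dtype=int)
for part in partitions:
    # Label Powerset restricted to the labels of this partition.
    labelsets, class_ids = np.unique(
        Y_train[:, part], axis=0, return_inverse=True
    )
    clf = DecisionTreeClassifier(random_state=0).fit(X_train, class_ids.ravel())

    # Each partition fills in only its own label columns.
    Y_pred[:, part] = labelsets[clf.predict(X_test)]
```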

**MajorityVotingClassifier**

**LabelSpacePartitioningClassifier**

<cite>[Szymański et al., 2016][4]</cite>



[1]: https://doi.org/10.1109/TKDE.2010.164
[2]: http://scikit.ml/api/skmultilearn.ensemble.rakelo.html#skmultilearn.ensemble.RakelO
[3]: http://scikit.ml/api/skmultilearn.ensemble.rakeld.html#skmultilearn.ensemble.RakelD
[4]: https://doi.org/10.3390/e18080282