Multilabel Classification

Problem analysis, metrics and techniques

by Francisco Herrera, Francisco Charte, Antonio J. Rivera, María J. del Jesus

Springer, 2016.


This repository provides the multilabel datasets used throughout the chapters of the book Multilabel Classification - Problem analysis, metrics and techniques, as well as some code and links. Click the folder corresponding to a chapter (in the list above) to download the files you are interested in. You can also clone the entire repository or download it as a ZIP file.

About this book

This book offers a comprehensive review of multilabel techniques widely used to classify and label texts, pictures, videos and music on the Internet. A deep review of the specialized literature in the field includes the available software needed to work with this kind of data. It provides the user with the software tools needed to deal with multilabel data, as well as step-by-step instructions on how to use them. The main topics covered are:

  • The special characteristics of multi-labeled data and the metrics available to measure them.
  • The importance of taking advantage of label correlations to improve the results.
  • The different approaches followed to face multi-label classification.
  • The preprocessing techniques applicable to multi-label datasets.
  • The available software tools to work with multi-label data.
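As a small illustration of the first topic, two commonly used characterization metrics are label cardinality (the average number of active labels per instance) and label density (cardinality divided by the total number of labels). The sketch below is not taken from the book or this repository; the function names and the toy label matrix are made up for illustration only.

```python
from typing import List


def label_cardinality(labels: List[List[int]]) -> float:
    """Average number of active labels per instance."""
    return sum(sum(row) for row in labels) / len(labels)


def label_density(labels: List[List[int]]) -> float:
    """Label cardinality normalized by the total number of labels."""
    return label_cardinality(labels) / len(labels[0])


# Hypothetical binary label matrix: 4 instances (rows), 3 labels (columns).
toy_labels = [
    [1, 0, 1],
    [0, 1, 0],
    [1, 1, 1],
    [0, 0, 1],
]

print(label_cardinality(toy_labels))  # 7 active labels over 4 instances
print(label_density(toy_labels))      # cardinality divided by 3 labels
```

For real datasets, the software tools listed further down in this document are the better starting point; this snippet only shows what the two numbers mean.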

This book is beneficial for professionals and researchers in a variety of fields because of the wide range of potential applications for multilabel classification. Besides its many applications in classifying different types of online information, it is also useful in other areas, such as genomics and biology. No previous knowledge of the subject is required. The book introduces all the concepts needed to understand multilabel data characterization, treatment and evaluation.


The following is a summary of the links provided throughout the book. Each chapter's folder also contains the links relevant to the topic it covers.

Data Repositories

Most of the available multilabel datasets can be obtained from the following data repositories:

R Ultimate Multilabel Dataset Repository - RUMDR

MULAN Data Repository

MEKA Data Repository

KEEL Multilabel Datasets

LibSVM Multilabel Datasets

Extreme Classification Repository


The following are links to multilabel software tools:

mldr R package at CRAN - GitHub

mldr.datasets R package at CRAN - GitHub

MEKA Java software

MULAN Java software

Synthetic Dataset Generator for Multi-label Learning - Mldatagen

Code / Programs

ML-TREE - Source code with the reference implementation of the ML-TREE algorithm

Rank-SVM - Source code with the reference implementation of the Rank-SVM algorithm

MLSMOTE - Java implementation of the MLSMOTE algorithm

MLeNN - Java implementation of the MLeNN algorithm

REMEDIAL - R source code for the REMEDIAL algorithm


This content is licensed under LGPLv3. What does this mean?

