Merged
54 commits
fc4e5eb
DOC Modifying the index
Mar 26, 2017
cec4f59
DOC Update the index
Mar 29, 2017
8c2d559
DOC add datasets module
Apr 1, 2017
11cc6e3
DOC add barebone
glemaitre May 19, 2017
e9f7f55
itr
glemaitre May 21, 2017
9148e96
IMG add image linear SVC
glemaitre May 21, 2017
afdfe07
IMG RUS
glemaitre May 21, 2017
a7c8526
DOC add practical guide over-sampling
glemaitre May 24, 2017
1377af9
DOC finish over-sampling
glemaitre May 27, 2017
9516928
add missing image in over-sampling
glemaitre Jun 15, 2017
d83021e
DOC change color figure
glemaitre Jun 15, 2017
31a10b0
FIX ADASYN generate from minority class only
glemaitre Jun 23, 2017
0a3d5e5
Merge remote-tracking branch 'origin/master' into is/253
glemaitre Aug 3, 2017
70f5e93
iter
glemaitre Aug 3, 2017
e81dfba
EXA add couples of examples
glemaitre Aug 4, 2017
fdde270
EXA add plots
glemaitre Aug 4, 2017
abf9707
DOC fix whats new and install
glemaitre Aug 4, 2017
f38a7f8
DOC linked to examples
glemaitre Aug 4, 2017
ebb42b3
DOC linked to examples
glemaitre Aug 4, 2017
492e672
Advance the dataset
glemaitre Aug 4, 2017
5f44cd4
CI upgrade sphinx and sphinx-gallery
glemaitre Aug 4, 2017
10ac077
CI remove warning numpydoc
glemaitre Aug 4, 2017
dbd7909
EXA update example
glemaitre Aug 4, 2017
6d57ef1
DOC cross-referencing
glemaitre Aug 4, 2017
bf7dc05
DOC fix cross-referencing of 2 examples
glemaitre Aug 4, 2017
c7e2686
DOC add warning regarding cleaning algorithm in ratio docstring
glemaitre Aug 7, 2017
a8139fc
DOC add nearmiss
glemaitre Aug 7, 2017
9748945
iter
glemaitre Aug 7, 2017
d8ea81a
ite
glemaitre Aug 8, 2017
882f80a
Merge remote-tracking branch 'origin/master' into is/253
glemaitre Aug 8, 2017
0ba859f
more permissive allknn
glemaitre Aug 8, 2017
c7477c9
DOC add tomek links
glemaitre Aug 8, 2017
ed51835
DOC finish under-sampling
glemaitre Aug 8, 2017
fc06d7c
DOC fix end-string warning
glemaitre Aug 8, 2017
210c12f
DOC fix place tomek paragraph
glemaitre Aug 8, 2017
b158caf
DOC fix typo sphx reference
glemaitre Aug 8, 2017
2407db8
DOC add examples cross-reference
glemaitre Aug 8, 2017
5898baf
DOC add combine user guide
glemaitre Aug 8, 2017
224b14e
DOC add ensemble description in user guide.
glemaitre Aug 8, 2017
9b102c4
iter
glemaitre Aug 8, 2017
6712fc5
DOC add text examples
glemaitre Aug 9, 2017
0488e6e
DOC add small description data sets
glemaitre Aug 9, 2017
10c17a8
Merge remote-tracking branch 'origin/master' into is/253
glemaitre Aug 9, 2017
881f6f2
DOC fix spelling
glemaitre Aug 9, 2017
a75ad3d
iter
glemaitre Aug 9, 2017
8eaccc7
iter
glemaitre Aug 9, 2017
123ddb1
DOC add backreference examples
glemaitre Aug 9, 2017
25163b0
DOC add next previous buttons back
glemaitre Aug 9, 2017
38b8cd9
DOC update the install from master using pip
glemaitre Aug 9, 2017
a008f03
Merge remote-tracking branch 'origin/master' into is/253
glemaitre Aug 11, 2017
5114a98
Merge remote-tracking branch 'origin/master' into is/253
glemaitre Aug 11, 2017
7348c4c
EXA improve make_imbalance example
glemaitre Aug 11, 2017
a07390f
Merge branch 'master' into is/253
glemaitre Aug 11, 2017
98e920e
Merge branch 'master' into is/253
glemaitre Aug 11, 2017
2 changes: 1 addition & 1 deletion Makefile
@@ -11,6 +11,7 @@ clean:
rm -rf coverage
rm -rf dist
rm -rf build
rm -rf doc/_build
rm -rf doc/auto_examples
rm -rf doc/generated
rm -rf doc/modules
@@ -35,7 +36,6 @@ test-coverage:
test: test-coverage test-doc

html:
conda install -y sphinx sphinx_rtd_theme numpydoc
export SPHINXOPTS=-W; make -C doc html

conda:
4 changes: 4 additions & 0 deletions README.rst
@@ -83,6 +83,10 @@ commands to get a copy from GitHub and install all dependencies::
cd imbalanced-learn
pip install .

Or install using pip and GitHub::

pip install -U git+https://github.com/scikit-learn-contrib/imbalanced-learn.git

Testing
~~~~~~~

63 changes: 63 additions & 0 deletions doc/_static/js/copybutton.js
@@ -0,0 +1,63 @@
$(document).ready(function() {
/* Add a [>>>] button on the top-right corner of code samples to hide
* the >>> and ... prompts and the output and thus make the code
* copyable. */
var div = $('.highlight-python .highlight,' +
'.highlight-python3 .highlight,' +
'.highlight-pycon .highlight,' +
'.highlight-default .highlight')
var pre = div.find('pre');

// get the styles from the current theme
pre.parent().parent().css('position', 'relative');
var hide_text = 'Hide the prompts and output';
var show_text = 'Show the prompts and output';
var border_width = pre.css('border-top-width');
var border_style = pre.css('border-top-style');
var border_color = pre.css('border-top-color');
var button_styles = {
'cursor':'pointer', 'position': 'absolute', 'top': '0', 'right': '0',
'border-color': border_color, 'border-style': border_style,
'border-width': border_width, 'color': border_color, 'text-size': '75%',
'font-family': 'monospace', 'padding-left': '0.2em', 'padding-right': '0.2em',
'border-radius': '0 3px 0 0'
}

// create and add the button to all the code blocks that contain >>>
div.each(function(index) {
var jthis = $(this);
if (jthis.find('.gp').length > 0) {
var button = $('<span class="copybutton">&gt;&gt;&gt;</span>');
button.css(button_styles)
button.attr('title', hide_text);
button.data('hidden', 'false');
jthis.prepend(button);
}
// tracebacks (.gt) contain bare text elements that need to be
// wrapped in a span to work with .nextUntil() (see later)
jthis.find('pre:has(.gt)').contents().filter(function() {
return ((this.nodeType == 3) && (this.data.trim().length > 0));
}).wrap('<span>');
});

// define the behavior of the button when it's clicked
$('.copybutton').click(function(e){
e.preventDefault();
var button = $(this);
if (button.data('hidden') === 'false') {
// hide the code output
button.parent().find('.go, .gp, .gt').hide();
button.next('pre').find('.gt').nextUntil('.gp, .go').css('visibility', 'hidden');
button.css('text-decoration', 'line-through');
button.attr('title', show_text);
button.data('hidden', 'true');
} else {
// show the code output
button.parent().find('.go, .gp, .gt').show();
button.next('pre').find('.gt').nextUntil('.gp, .go').css('visibility', 'visible');
button.css('text-decoration', 'none');
button.attr('title', hide_text);
button.data('hidden', 'false');
}
});
});
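The script above toggles the visibility of the prompt, output and traceback spans (`.gp`, `.go`, `.gt`) so that only the code remains copyable. As a rough text-based analogue, and not part of this pull request, the same idea can be sketched in Python. The function name `strip_prompts` is hypothetical, and the sketch assumes that any line not starting with a prompt is output, so an output line that itself begins with `... ` would be misclassified.

```python
def strip_prompts(snippet: str) -> str:
    """Keep only the code from a doctest-style snippet.

    Lines starting with '>>> ' or '... ' are treated as code; every other
    line is treated as output and dropped.
    """
    code_lines = []
    for line in snippet.splitlines():
        if line.startswith(">>> ") or line.startswith("... "):
            # Drop the 4-character prompt and keep the code that follows.
            code_lines.append(line[4:])
        elif line in (">>>", "..."):
            # A bare prompt stands for an empty line of code.
            code_lines.append("")
    return "\n".join(code_lines)


example = """\
>>> from collections import Counter
>>> counts = Counter([0, 0,
...                   1])
>>> print(counts)
Counter({0: 2, 1: 1})
"""
print(strip_prompts(example))
```

Unlike the JavaScript, which relies on the CSS classes emitted by the highlighter, this sketch works on plain text only.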
16 changes: 16 additions & 0 deletions doc/_templates/class.rst
@@ -0,0 +1,16 @@
:mod:`{{module}}`.{{objname}}
{{ underline }}==============

.. currentmodule:: {{ module }}

.. autoclass:: {{ objname }}

{% block methods %}
.. automethod:: __init__
{% endblock %}

.. include:: {{module}}.{{objname}}.examples

.. raw:: html

<div style='clear:both'></div>
12 changes: 12 additions & 0 deletions doc/_templates/function.rst
@@ -0,0 +1,12 @@
:mod:`{{module}}`.{{objname}}
{{ underline }}====================

.. currentmodule:: {{ module }}

.. autofunction:: {{ objname }}

.. include:: {{module}}.{{objname}}.examples

.. raw:: html

<div style='clear:both'></div>
16 changes: 16 additions & 0 deletions doc/_templates/numpydoc_docstring.py
@@ -0,0 +1,16 @@
{{index}}
{{summary}}
{{extended_summary}}
{{parameters}}
{{returns}}
{{yields}}
{{other_parameters}}
{{attributes}}
{{raises}}
{{warns}}
{{warnings}}
{{see_also}}
{{notes}}
{{references}}
{{examples}}
{{methods}}
46 changes: 30 additions & 16 deletions doc/api.rst
@@ -6,8 +6,8 @@ This is the full API documentation of the `imbalanced-learn` toolbox.

.. _under_sampling_ref:

Under-sampling methods
======================
:mod:`imblearn.under_sampling`: Under-sampling methods
======================================================

.. automodule:: imblearn.under_sampling
:no-members:
@@ -26,6 +26,7 @@ Prototype generation

.. autosummary::
:toctree: generated/
:template: class.rst

under_sampling.ClusterCentroids

@@ -40,6 +41,7 @@ Prototype selection

.. autosummary::
:toctree: generated/
:template: class.rst

under_sampling.CondensedNearestNeighbour
under_sampling.EditedNearestNeighbours
@@ -54,8 +56,8 @@ Prototype selection

.. _over_sampling_ref:

Over-sampling methods
=====================
:mod:`imblearn.over_sampling`: Over-sampling methods
====================================================

.. automodule:: imblearn.over_sampling
:no-members:
@@ -65,6 +67,7 @@ Over-sampling methods

.. autosummary::
:toctree: generated/
:template: class.rst

over_sampling.ADASYN
over_sampling.RandomOverSampler
@@ -73,8 +76,8 @@ Over-sampling methods

.. _combine_ref:

Combination of over- and under-sampling methods
===============================================
:mod:`imblearn.combine`: Combination of over- and under-sampling methods
========================================================================

.. automodule:: imblearn.combine
:no-members:
@@ -84,15 +87,16 @@ Combination of over- and under-sampling methods

.. autosummary::
:toctree: generated/
:template: class.rst

combine.SMOTEENN
combine.SMOTETomek


.. _ensemble_ref:

Ensemble methods
================
:mod:`imblearn.ensemble`: Ensemble methods
==========================================

.. automodule:: imblearn.ensemble
:no-members:
@@ -102,15 +106,16 @@ Ensemble methods

.. autosummary::
:toctree: generated/
:template: class.rst

ensemble.BalanceCascade
ensemble.EasyEnsemble


.. _pipeline_ref:

Pipeline
========
:mod:`imblearn.pipeline`: Pipeline
==================================

.. automodule:: imblearn.pipeline
:no-members:
@@ -120,14 +125,20 @@ Pipeline

.. autosummary::
:toctree: generated/
:template: class.rst

pipeline.Pipeline

.. autosummary::
:toctree: generated/
:template: function.rst

pipeline.make_pipeline

.. _metrics_ref:

Metrics
=======
:mod:`imblearn.metrics`: Metrics
================================

.. automodule:: imblearn.metrics
:no-members:
@@ -137,6 +148,7 @@ Metrics

.. autosummary::
:toctree: generated/
:template: function.rst

metrics.classification_report_imbalanced
metrics.sensitivity_specificity_support
@@ -147,8 +159,8 @@ Metrics

.. _datasets_ref:

Datasets
========
:mod:`imblearn.datasets`: Datasets
==================================

.. automodule:: imblearn.datasets
:no-members:
@@ -158,12 +170,13 @@ Datasets

.. autosummary::
:toctree: generated/
:template: function.rst

datasets.make_imbalance
datasets.fetch_datasets

Utilities
=========
:mod:`imblearn.utils`: Utilities
================================

.. automodule:: imblearn.utils
:no-members:
@@ -173,6 +186,7 @@ Utilities

.. autosummary::
:toctree: generated/
:template: function.rst

utils.estimator_checks.check_estimator
utils.check_neighbors_object
56 changes: 56 additions & 0 deletions doc/combine.rst
@@ -0,0 +1,56 @@
.. _combine:

=======================================
Combination of over- and under-sampling
=======================================

.. currentmodule:: imblearn.over_sampling

We previously presented :class:`SMOTE` and showed that this method can generate
noisy samples by interpolating new points between marginal outliers and
inliers. This issue can be solved by cleaning the space resulting from
over-sampling.

.. currentmodule:: imblearn.combine

In this regard, Tomek's links and edited nearest-neighbours are the two
cleaning methods that have been pipelined after SMOTE over-sampling to obtain a
cleaner space. Accordingly, imbalanced-learn implements two ready-to-use
classes which pipeline both over- and under-sampling methods: (i)
:class:`SMOTETomek` and (ii) :class:`SMOTEENN`.
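
To make the notion of a Tomek link concrete, here is a minimal standard-library sketch. It is an illustration only, not the imbalanced-learn implementation: the helper names `nearest_neighbour` and `tomek_links` and the toy data are hypothetical. Two samples form a Tomek link when they belong to different classes and each is the nearest neighbour of the other.

```python
import math


def nearest_neighbour(i, X):
    """Return the index of the sample closest to X[i], excluding i itself."""
    return min(
        (j for j in range(len(X)) if j != i),
        key=lambda j: math.dist(X[i], X[j]),
    )


def tomek_links(X, y):
    """Return index pairs (i, j), i < j, that form Tomek links.

    A pair qualifies when the two samples have different labels and are
    mutual nearest neighbours.
    """
    links = []
    for i in range(len(X)):
        j = nearest_neighbour(i, X)
        if i < j and y[i] != y[j] and nearest_neighbour(j, X) == i:
            links.append((i, j))
    return links


# Toy 1-D data (hypothetical): class 0 on the left, class 1 on the right,
# with one class-1 sample (index 3) sitting right next to the class-0 cluster.
X = [(0.0,), (1.0,), (2.0,), (2.4,), (6.0,), (7.0,)]
y = [0, 0, 0, 1, 1, 1]

links = tomek_links(X, y)
print(links)
```

A cleaning step would then drop one or both members of each detected pair; which of the two is removed is a design choice that this sketch leaves open.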

These two classes can be used like any other sampler, with parameters identical
to those of their underlying samplers::

>>> from collections import Counter
>>> from sklearn.datasets import make_classification
>>> X, y = make_classification(n_samples=5000, n_features=2, n_informative=2,
... n_redundant=0, n_repeated=0, n_classes=3,
... n_clusters_per_class=1,
... weights=[0.01, 0.05, 0.94],
... class_sep=0.8, random_state=0)
>>> print(Counter(y))
Counter({2: 4674, 1: 262, 0: 64})
>>> from imblearn.combine import SMOTEENN
>>> smote_enn = SMOTEENN(random_state=0)
>>> X_resampled, y_resampled = smote_enn.fit_sample(X, y)
>>> print(Counter(y_resampled))
Counter({1: 4381, 0: 4060, 2: 3502})
>>> from imblearn.combine import SMOTETomek
>>> smote_tomek = SMOTETomek(random_state=0)
>>> X_resampled, y_resampled = smote_tomek.fit_sample(X, y)
>>> print(Counter(y_resampled))
Counter({1: 4566, 0: 4499, 2: 4413})

We can also see in the example below that :class:`SMOTEENN` tends to clean more
noisy samples than :class:`SMOTETomek`.

.. image:: ./auto_examples/combine/images/sphx_glr_plot_comparison_combine_001.png
:target: ./auto_examples/combine/plot_comparison_combine.html
:scale: 60
:align: center

See :ref:`sphx_glr_auto_examples_combine_plot_smote_enn.py`,
:ref:`sphx_glr_auto_examples_combine_plot_smote_tomek.py`,
and
:ref:`sphx_glr_auto_examples_combine_plot_comparison_combine.py`.
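
The edited nearest-neighbours rule mentioned above can be sketched in the same spirit. This is a simplified illustration under stated assumptions, not the library's code: the function name `edited_nearest_neighbours`, the default `k=3` and the toy data are hypothetical, and the rule is applied to every class in a single pass. A sample is kept only if its label agrees with the most common label among its `k` nearest neighbours.

```python
import math
from collections import Counter


def edited_nearest_neighbours(X, y, k=3):
    """Return the indices kept after one pass of a simplified ENN rule.

    A sample is kept only if its own label matches the most common label
    among its k nearest neighbours (the sample itself is excluded).
    """
    kept = []
    for i in range(len(X)):
        # Sort all other samples by their distance to sample i.
        others = sorted(
            (j for j in range(len(X)) if j != i),
            key=lambda j: math.dist(X[i], X[j]),
        )
        votes = Counter(y[j] for j in others[:k])
        majority_label = votes.most_common(1)[0][0]
        if y[i] == majority_label:
            kept.append(i)
    return kept


# Toy 1-D data (hypothetical): index 3 is a class-1 sample whose three
# nearest neighbours all belong to class 0.
X = [(0.0,), (1.0,), (2.0,), (2.4,), (6.0,), (7.0,), (8.0,)]
y = [0, 0, 0, 1, 1, 1, 1]

kept = edited_nearest_neighbours(X, y)
print(kept)
```

Because this rule inspects several neighbours rather than a single mutual nearest-neighbour pair, it can flag samples that are not part of any Tomek link.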