Closed
188 commits
44f86f5
Wrote the test for clustering centroids under-sampling
May 24, 2016
69ef76f
Remove the verbose from the coverage
May 24, 2016
fc5f893
Modify the sampler for full numpy support
May 24, 2016
bcde1cb
Wrote the test for cnn and nearmiss
May 26, 2016
73836e3
Finish under-sampling tests
May 26, 2016
6282518
Starting to write testing for easy ensemble
May 27, 2016
d841c67
Make the testing for the easy ensemble method
May 27, 2016
843bc85
Finish the testing for the ensemble method
Jun 7, 2016
01ba4c1
Address the problem of ratio
May 30, 2016
4fd6e8f
Switch to fully numpy random packages
Jun 10, 2016
f603847
Add the test for the over-sampling methods
Jun 10, 2016
63801d2
Add test for combine method
Jun 10, 2016
9b44e05
Merge pull request #62 from glemaitre/test_combine
glemaitre Jun 10, 2016
303ecde
PEP8
Jun 10, 2016
94170b2
Refactor the setup files
Jun 14, 2016
fda865a
Remove accentuation
Jun 14, 2016
b80cf03
Update the documentation
Jun 15, 2016
beb188a
Change the directory where to compile the doc
Jun 15, 2016
4610ce8
Add automatically the .nojekyll
Jun 15, 2016
545b781
Update the README.md
glemaitre Jun 17, 2016
08cae89
Update the LICENSE.md
glemaitre Jun 17, 2016
1dc5fef
Merge remote-tracking branch 'main/master' into instance_hardness
dvro Jun 17, 2016
ee51225
instance hardness updated
dvro Jun 17, 2016
542c76e
under-sampling instance hardness threshold pep8
dvro Jun 17, 2016
48d66a2
Change gitignore to avoid committing backup emacs file
Jun 18, 2016
d085bc3
PEP8
Jun 18, 2016
7b6ba9e
PEP8 and examples for IHT
Jun 18, 2016
d78887d
Correct the error with PCA in the example
Jun 18, 2016
27d7e0e
Merge pull request #1 from glemaitre/instance_hardness
dvro Jun 18, 2016
aca7dfb
Merge pull request #68 from dvro/instance_hardness
glemaitre Jun 18, 2016
bddc4a4
Update the notebook
Jun 18, 2016
7117abd
Update the README.md
glemaitre Jun 18, 2016
9643df2
Update the README.md
glemaitre Jun 18, 2016
3b0d322
Update the README.md
glemaitre Jun 18, 2016
46c3147
Minor documentation edits
proinsias Jun 14, 2016
44955a6
Merge pull request #64 from proinsias/proinsias-rus-docs
glemaitre Jun 19, 2016
bd25c8b
Update the documentation
Jun 20, 2016
e2115d6
Raise an error at fitting time if the ratio does not make sense.
Jun 21, 2016
49cf84e
RENN added
dvro Jun 24, 2016
6f3c6fa
example RENN added
dvro Jun 24, 2016
c70ae30
RepeatedEditedNearestNeighbors pep8
dvro Jun 24, 2016
e9b5a81
Merge pull request #73 from dvro/renn
glemaitre Jun 24, 2016
7ac94b8
Update the RENN with test and doc
Jun 24, 2016
399f4a7
Advance the compatibility with scikit-learn
Jun 22, 2016
fbeb485
change smote initialisation
Jun 22, 2016
ef11cc8
Finish to update the doc
Jun 22, 2016
e6e9a28
Improve testing of instance hardness threshold
Jun 24, 2016
8303f1a
Add data for testing
Jun 24, 2016
eede039
Update the version and the README file
Jun 24, 2016
d73f287
Update Readme
Jun 24, 2016
edf4747
Update the README
Jun 24, 2016
4762158
Update readme
Jun 24, 2016
28e1116
Update README.md
Jun 24, 2016
8a0f010
Change RENN for scikit-learn compatibility
Jun 24, 2016
5c0d42f
Update the readme
Jun 24, 2016
c02db3d
Remove unnecessary import
Jun 24, 2016
0fcd502
Renaming the base class such as in sklearn
Jun 25, 2016
13f647d
Forgot to add the base class
Jun 25, 2016
20b2458
Update the README.md
Jun 25, 2016
bca099d
Replace fit_transform method with the new fit_sample API.
Jun 25, 2016
52ea6fd
Merge pull request #79 from chkoar/fix-fit_sample-in-examples
glemaitre Jun 25, 2016
567ef55
Modify Pipeline object to conform the current API of samplers
Jun 25, 2016
d2b6102
Inherit from sklearn.pipeline.Pipeline instead of copy.
Jun 26, 2016
76a9148
Enforce to get same data at fitting and sampling
Jun 26, 2016
49753dd
Clean more thing using the makefile
Jun 26, 2016
72a16a7
[WIP] Adding testing for pipeline (#1)
glemaitre Jun 26, 2016
47e3507
Merge pull request #80 from chkoar/pipeline
glemaitre Jun 26, 2016
562a746
Doc fix
Jun 26, 2016
3497ccf
Update the doc for pipeline
Jun 26, 2016
a833fac
Update doc
Jun 26, 2016
77d57ea
Merge pull request #81 from chkoar/pipeline
glemaitre Jun 26, 2016
1c30fbe
DOC solve issue sphinx
Jun 26, 2016
b6e015e
Change the package name
Jun 26, 2016
ed14eb2
Remove any mentions of unbalanced_dataset
Jun 26, 2016
2c4a363
Merge pull request #83 from chkoar/remove_unbalanced_mentions
glemaitre Jun 26, 2016
c687bc3
Add adasyn
Jun 27, 2016
d60c411
Merge pull request #86 from glemaitre/adasyn
glemaitre Jun 27, 2016
fd90d17
Add circle ci for the documentation
Jun 27, 2016
2568315
Solve the problem with yaml circle
Jun 27, 2016
ae01164
Install seaborn in circleci
Jun 27, 2016
d2f7e98
Remove unused package
Jun 27, 2016
daf94ee
Add circleci badge
Jun 27, 2016
62f9f19
Appveyor first attempt
Jun 27, 2016
627c78c
Update the appveyor
Jun 27, 2016
da7ec38
use codec for encoding issue when opening file - issue #87
Jun 27, 2016
e97ea2f
Correct the error in ADASYN
Jun 27, 2016
5d3af5e
Add requirements.txt
Jun 30, 2016
b8904ed
Add additional file for pypi release
Jun 30, 2016
0e88bad
Change the doc
Jun 30, 2016
4fcca54
Update the notebook
Jul 5, 2016
ceee5d7
Refactoring Init (OverSamplers)
chkoar Jun 28, 2016
be6fadc
Complete test and small error for ROS
Jun 28, 2016
7eb10b9
Update SMOTE
Jun 28, 2016
af4fbda
Update ADASYN and SMOTE help
Jun 28, 2016
fdbc6e4
Update API under-sampling
Jun 28, 2016
6a3c5de
Update the ensemble method
Jun 29, 2016
fa91ee1
Finish the combine method
Jun 29, 2016
5685f15
Remove unnecessary package
Jun 29, 2016
1a9a1b8
PEP8
Jun 29, 2016
642a5d4
add logger in base class
Jul 4, 2016
445c7c6
Prevent logger from being pickled
chkoar Jul 4, 2016
6482830
move the logger at init and use a copy of the dictionary
Jul 4, 2016
ed88516
Modify verbose for logging messages
Jul 4, 2016
19be425
Get logger in fit and sample using private method
chkoar Jul 5, 2016
12ac7d8
implement setstate for the pickle
Jul 5, 2016
e5ea26d
Change maintainers
Jul 5, 2016
721bc2a
Change the printing style in logging
Jul 5, 2016
7bad028
[DOC] Fix minor typo
proinsias Jun 30, 2016
ea6afe0
Merge pull request #90 from proinsias/patch-1
glemaitre Jul 6, 2016
342c36c
Change the md to rst
Jul 6, 2016
47dd145
Update the setup and README
Jul 6, 2016
b1f92ee
Rename the license file
Jul 6, 2016
11f3865
Update the contributors page for rst style
Jul 6, 2016
7506185
Address issue #93
Jul 7, 2016
dccb47e
Add a todo list
Jul 7, 2016
5f20c3d
Update the opening of the README in setup.py - address issue #94
Jul 8, 2016
45e6457
Merge branch 'refactor'
Jul 9, 2016
3978694
Update the doc
Jul 9, 2016
53a4485
Update the install for conda and pypi release
Jul 9, 2016
aa5e428
Update the version for pypi
Jul 9, 2016
6db5116
Add conda recipe in the repo directly
Jul 10, 2016
071c13c
Bump version: 0.1.1 → 0.1.2.dev0
Jul 10, 2016
b0391fe
Add support with bumpversion
Jul 10, 2016
038daee
Rename UnbalancedDataset to imbalanced-learn
chkoar Jul 14, 2016
c10bfdb
Update all the badges
Jul 19, 2016
b48e384
Change repository name
Jul 19, 2016
c81d5a0
Update webhook
Jul 19, 2016
80bf2d5
Avoid committing with bumpversion
Jul 19, 2016
9546677
Update gitter webhook
Jul 19, 2016
bd3566a
bumpversion 0.1.1 -> 0.1.2
Jul 19, 2016
34143ba
Modify conda recipe
Jul 19, 2016
a900985
Add the methods which have been implemented in the 0.1.X release
Jul 19, 2016
d6fc822
Address issue #100 - Add exception when no NN in majority class are found
Jul 19, 2016
28bc2db
bumpversion 0.1.2 -> 0.1.3
Jul 19, 2016
85acb8b
Avoid to recopy the data in RENN
Jul 21, 2016
0f6a5aa
Bump version: 0.1.1 → 0.2.0.dev0
Jul 10, 2016
6bc9e96
Add the api changes in the todo list
Jul 10, 2016
4e52c71
Add doctest
Jul 10, 2016
54c12a6
Avoid testing CNN for doctest
Jul 11, 2016
da02bbc
Update the docstring
Jul 19, 2016
2a03bf6
added AllKNN under-sampling method (#97)
dvro Jul 21, 2016
3655756
Remove fetch doctest in pipeline
Jul 21, 2016
8cdfaad
added RENN and AllKNN to plot_unbalanced_dataset.ipynb and removed ve…
dvro Jul 23, 2016
cfc73bc
Address issue #107 - ADASYN docstring (#108)
glemaitre Jul 25, 2016
0c46346
Remove collections import from SMOTEENN
glemaitre Jul 26, 2016
b26be15
Resolve #111 - Handle multiclass/binary targets
glemaitre Jul 29, 2016
a7c6158
adding make_imbalance function
dvro Jul 30, 2016
84e6af4
Merge pull request #115 from dvro/datasets
dvro Jul 31, 2016
cca1d42
Add visual studio project files in .gitignore (#120)
chkoar Jul 31, 2016
c027b25
Solve issue #116 - Create proper RandomState in EasyEnsemble (#117)
glemaitre Jul 31, 2016
4266580
Update the testing file
Jul 31, 2016
19969f6
[MRG] Make imbalance (#119)
glemaitre Jul 31, 2016
23f1ffc
Fix issue #124
glemaitre Aug 9, 2016
2b8643f
Update the doc and the notebook
Aug 9, 2016
6b5c9d9
Remove UnbalancedDataset references (#127)
chkoar Aug 17, 2016
4c24526
Address #131 - Replace nonzero by flatnonzero whenever possible (#132)
glemaitre Aug 30, 2016
f905274
Close #133 - Change assert by assert_true (#134)
glemaitre Aug 30, 2016
43c6a6a
Solving the issue of the stopping criterion of the RENN
Aug 30, 2016
4ebe7df
Update the history
Aug 30, 2016
75e886e
Add stopping criteria
Aug 30, 2016
197c119
Fix the bug about the indices of CNN
Aug 30, 2016
db4a0e5
Update the history
Aug 30, 2016
ae403a0
Fix the warning in Nearmiss to inform the user about the number of sa…
Aug 30, 2016
5619c72
Solve the issue when having only one subset
Aug 31, 2016
dd5d0c0
Update the history
Aug 31, 2016
7721f07
[MRG] Address issue #129 - Add specific stopping criteria for the REN…
glemaitre Aug 31, 2016
a5379ab
Merge branch 'issue_130'
Aug 31, 2016
80f8f6b
Merge branch 'issue_137'
Aug 31, 2016
f052175
Address issue #140 - Add condition to raise warning in NearMiss
glemaitre Aug 31, 2016
1660c23
Merge branch 'issue_142'
Aug 31, 2016
49a5692
[MRG] Address issue #113 - Create toy example for testing (#118)
glemaitre Aug 31, 2016
13d59a9
Add badge for appveyor
Aug 31, 2016
ae05226
Add a fetch_benchmark
Sep 3, 2016
e37df40
Add the test for the benchmark
Sep 4, 2016
2480c01
Update the fetching and test for benchmark
Sep 4, 2016
2bca0d3
Create the benchmark for undersampling methods
Sep 4, 2016
90736e9
Solve the last issue
Sep 4, 2016
61fa204
Add benchmark for over-sampling and combine methods
Sep 4, 2016
1d139ea
Add time complexity
Sep 4, 2016
733ca95
Fix the plot for benchmark
Sep 4, 2016
9f3f276
Solve problem of variable
Sep 4, 2016
3b7087a
Fix variable name
Sep 4, 2016
8349e83
Update the documentation
Sep 4, 2016
67f48c7
Add a readme in benchmark
Sep 4, 2016
4d4cec1
Change extension README
Sep 4, 2016
f73d1c0
Add README
Sep 4, 2016
2bbce49
Change the plot
Sep 5, 2016
264b04d
Add current benchmark results
Sep 8, 2016
9 changes: 6 additions & 3 deletions .coveragerc
@@ -2,10 +2,11 @@

[run]
branch = True
source = unbalanced_dataset
include = */unbalanced_dataset/*
source = imblearn
include = */imblearn/*
omit =
*/setup.py
*/benchmarks/*

[report]
exclude_lines =
@@ -16,4 +17,6 @@ exclude_lines =
raise AssertionError
raise NotImplementedError
if 0:
if __name__ == .__main__.:
if __name__ == .__main__.:
if self.verbose:
show_missing = True
12 changes: 11 additions & 1 deletion .gitignore
@@ -56,4 +56,14 @@ docs/_build/
# PyBuilder
target/

*~
# vim
*.swp

# emacs
*~

# Visual Studio
*.sln
*.pyproj
*.suo
*.vs
10 changes: 9 additions & 1 deletion .travis.yml
@@ -23,7 +23,7 @@ env:
global:
# Directory where tests are run from
- TEST_DIR=/tmp/test_dir
- MODULE=unbalanced_dataset
- MODULE=imblearn
- OMP_NUM_THREADS=4
- OPENBLAS_NUM_THREADS=4
matrix:
@@ -36,3 +36,11 @@ env:
install: source build_tools/travis/install.sh
script: bash build_tools/travis/test_script.sh
after_success: source build_tools/travis/after_success.sh

notifications:
webhooks:
urls:
- https://webhooks.gitter.im/e/188e3c7a5180fd4f2120
on_success: always # options: [always|never|change] default: always
on_failure: always # options: [always|never|change] default: always
on_start: never # options: [always|never|change] default: always
16 changes: 16 additions & 0 deletions AUTHORS.rst
@@ -0,0 +1,16 @@
History
-------

Development lead
~~~~~~~~~~~~~~~~

The project was started in August 2014 by Fernando Nogueira and focused on a SMOTE implementation.
Together with Guillaume Lemaitre, Dayvid Victor, and Christos Aridas, additional under-sampling and over-sampling methods have been implemented, along with major API changes to make the package fully compatible with scikit-learn_.

Contributors
------------

Refer to the GitHub contributors page_.

.. _scikit-learn: http://scikit-learn.org
.. _page: https://github.com/scikit-learn-contrib/imbalanced-learn/graphs/contributors
179 changes: 179 additions & 0 deletions CONTRIBUTING.md
@@ -0,0 +1,179 @@

Contributing code
=================

This guide is adapted from [scikit-learn](https://github.com/scikit-learn/scikit-learn/blob/master/CONTRIBUTING.md).

How to contribute
-----------------

The preferred way to contribute to imbalanced-learn is to fork the
[main repository](https://github.com/scikit-learn-contrib/imbalanced-learn) on
GitHub:

1. Fork the [project repository](https://github.com/scikit-learn-contrib/imbalanced-learn):
click on the 'Fork' button near the top of the page. This creates
a copy of the code under your account on the GitHub server.

2. Clone this copy to your local disk:

$ git clone git@github.com:YourLogin/imbalanced-learn.git
$ cd imbalanced-learn

3. Create a branch to hold your changes:

$ git checkout -b my-feature

and start making changes. Never work in the ``master`` branch!

4. Work on this copy on your computer using Git to do the version
control. When you're done editing, do:

$ git add modified_files
$ git commit

to record your changes in Git, then push them to GitHub with:

$ git push -u origin my-feature

Finally, go to the web page of your fork of the imbalanced-learn repo,
and click 'Pull request' to send your changes to the maintainers for
review. This will send an email to the committers.

(If any of the above seems like magic to you, then look up the
[Git documentation](http://git-scm.com/documentation) on the web.)

Contributing Pull Requests
--------------------------

It is recommended to check that your contribution complies with the
following rules before submitting a pull request:

- Follow the
[coding-guidelines](http://scikit-learn.org/dev/developers/contributing.html#coding-guidelines)
as for scikit-learn.

- When applicable, use the validation tools and other code in the
`sklearn.utils` submodule. A list of utility routines available
for developers can be found in the
[Utilities for Developers](http://scikit-learn.org/dev/developers/utilities.html#developers-utils)
page.

- If your pull request addresses an issue, please use the title to describe
the issue and mention the issue number in the pull request description to
ensure a link is created to the original issue.

- All public methods should have informative docstrings with sample
usage presented as doctests when appropriate.

- Please prefix the title of your pull request with `[MRG]` if the
contribution is complete and should be subjected to a detailed review.
Incomplete contributions should be prefixed `[WIP]` to indicate a work
in progress (and changed to `[MRG]` when it matures). WIPs may be useful
to: indicate you are working on something to avoid duplicated work,
request broad review of functionality or API, or seek collaborators.
WIPs often benefit from the inclusion of a
[task list](https://github.com/blog/1375-task-lists-in-gfm-issues-pulls-comments)
in the PR description.

- All other tests pass when everything is rebuilt from scratch. On
Unix-like systems, check with (from the toplevel source folder):

$ make

- When adding additional functionality, provide at least one
example script in the ``examples/`` folder. Have a look at other
examples for reference. Examples should demonstrate why the new
functionality is useful in practice and, if possible, compare it
to other methods available in scikit-learn.

- Documentation and high-coverage tests are necessary for enhancements
to be accepted.

- At least one paragraph of narrative documentation with links to
references in the literature (with PDF links when possible) and
the example.
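
As an illustration of the docstring rule above, here is a minimal sketch of a public function whose docstring carries a doctest. The function `count_per_class` is hypothetical and not part of imbalanced-learn; the numpydoc-style section layout is assumed from the scikit-learn conventions this guide refers to.

```python
from collections import Counter


def count_per_class(y):
    """Count the number of samples in each class.

    Hypothetical helper used only to illustrate the docstring style;
    it is not part of imbalanced-learn.

    Parameters
    ----------
    y : iterable
        Class label of each sample.

    Returns
    -------
    counts : dict
        Mapping from class label to number of samples.

    Examples
    --------
    >>> count_per_class([0, 0, 0, 1])
    {0: 3, 1: 1}
    """
    return dict(Counter(y))


if __name__ == "__main__":
    import doctest

    # Run every example embedded in the docstrings of this module.
    doctest.testmod()
```

Running the file directly executes the embedded example and reports any mismatch between the documented and the actual output.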

You can also check for common programming errors with the following
tools:

- Code with good unittest coverage (at least 80%), check with:

$ pip install nose coverage
$ nosetests --with-coverage path/to/tests_for_package

- No pyflakes warnings, check with:

$ pip install pyflakes
$ pyflakes path/to/module.py

- No PEP8 warnings, check with:

$ pip install pep8
$ pep8 path/to/module.py

- AutoPEP8 can help you fix some of the easy redundant errors:

$ pip install autopep8
$ autopep8 path/to/pep8.py

Filing bugs
-----------
We use GitHub issues to track all bugs and feature requests; feel free to
open an issue if you have found a bug or wish to see a feature implemented.

It is recommended to check that your issue complies with the
following rules before submitting:

- Verify that your issue is not being currently addressed by other
[issues](https://github.com/scikit-learn-contrib/imbalanced-learn/issues?q=)
or [pull requests](https://github.com/scikit-learn-contrib/imbalanced-learn/pulls?q=).

- Please ensure all code snippets and error messages are formatted in
appropriate code blocks.
See [Creating and highlighting code blocks](https://help.github.com/articles/creating-and-highlighting-code-blocks).

- Please include your operating system type and version number, as well
as your Python, scikit-learn, numpy, and scipy versions. This information
can be found by running the following code snippet:

```python
import platform; print(platform.platform())
import sys; print("Python", sys.version)
import numpy; print("NumPy", numpy.__version__)
import scipy; print("SciPy", scipy.__version__)
import sklearn; print("Scikit-Learn", sklearn.__version__)
```

- Please be specific about what estimators and/or functions are involved
and the shape of the data, as appropriate; please include a
[reproducible](http://stackoverflow.com/help/mcve) code snippet
or link to a [gist](https://gist.github.com). If an exception is raised,
please provide the traceback.

Documentation
-------------

We are glad to accept any sort of documentation: function docstrings,
reStructuredText documents (like this one), tutorials, etc.
reStructuredText documents live in the source code repository under the
doc/ directory.

You can edit the documentation using any text editor and then generate
the HTML output by typing ``make html`` from the doc/ directory.
Alternatively, ``make`` can be used to quickly generate the
documentation without the example gallery. The resulting HTML files will
be placed in _build/html/ and are viewable in a web browser. See the
README file in the doc/ directory for more information.

For building the documentation, you will need
[sphinx](http://sphinx.pocoo.org/),
[matplotlib](http://matplotlib.sourceforge.net/), and
[pillow](http://pillow.readthedocs.org/en/latest/).

When you are writing documentation, it is important to keep a good
compromise between mathematical and algorithmic details, and give
intuition to the reader on what the algorithm does. It is best to always
start with a small paragraph with a hand-waving explanation of what the
method does to the data and a figure (coming from an example)
illustrating it.
4 changes: 3 additions & 1 deletion LICENSE.md → LICENSE
@@ -1,6 +1,8 @@
The MIT License (MIT)

Copyright (c) 2014 Fernando M. F. Nogueira
Copyright (c) 2014 Fernando M. F. Nogueira,
Guillaume Lemaitre,
Dayvid Victor

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
7 changes: 7 additions & 0 deletions MANIFEST.in
@@ -0,0 +1,7 @@

recursive-include doc *
recursive-include examples *
include AUTHORS.rst
include CONTRIBUTING.md
include LICENSE
include README.rst
14 changes: 9 additions & 5 deletions Makefile
@@ -11,16 +11,20 @@ clean:
rm -rf coverage
rm -rf dist
rm -rf build
rm -rf doc/auto_examples
rm -rf doc/generated
rm -rf doc/modules
rm -rf examples/.ipynb_checkpoints

test:
$(NOSETESTS) -s -v unbalanced_dataset

# doctest:
# $(PYTHON) -c "import unbalanced_dataset, sys, io; sys.exit(unbalanced_dataset.doctest_verbose())"
$(NOSETESTS) -s -v imblearn

coverage:
$(NOSETESTS) unbalanced_dataset -s -v --with-coverage --cover-package=unbalanced_dataset
$(NOSETESTS) imblearn -s -v --with-coverage --cover-package=imblearn

html:
conda install -y sphinx sphinx_rtd_theme numpydoc
export SPHINXOPTS=-W; make -C doc html

conda:
conda-build conda-recipe
84 changes: 0 additions & 84 deletions README.md

This file was deleted.
