[MRG] Weights parameter of datasets.make_classification changed to array-like from list only - Issue 14760 #14764
Merged
Changes from 33 commits
37 commits
5cb4bda CatChenal: Finalizes fix for #12202 from abandonned PR by @parul-l
a39baed CatChenal: Completes 12202 fix abandoned by @parul-l
a5b73bb CatChenal: Completes 12202 fix abandoned by @parul-l
e9f9594 CatChenal: Extends 12202 fix over feature_extraction/image.py
35da14a CatChenal: Closes #12202; white space removal
0730c7a CatChenal: Closes #12202; white space removal2
f81beda CatChenal: Closes #12202; 3.5 compliance; added >>> in docstring code.
294ce61 CatChenal: Closes #12202; indentation discrep.
4f12330 CatChenal: Closes #12202; indentation discrep.2
2a8d782 CatChenal: Example output formating; @jnotham
d14b3b6 CatChenal: Example output formating; forgot flake8
3f8b0b1 CatChenal: Closes #12202; Removed excessive indentation in docstring (#wimlds)
5fc5496 CatChenal: conflict resolution?
236dafb CatChenal: Closes #12202; Fixed inconsistent indentation in docstring (#wimlds)
0bbbea4 CatChenal: Closes #12202 (#wimlds); intentation, v3.5 compliance
f173bf6 CatChenal: Closes #12202 (#wimlds); Output format issue solved with addition of …
aeae3b0 CatChenal: Closes #12202 (#wimlds); Output format issue solved with addition of …
e38bca9 CatChenal: Closes #12202 (#wimlds); Testing doctest direc.: removed DONT_ACCEPT_…
f9352ab CatChenal: Closes #12202 (#wimlds); Removed blank lines in doctest example.
de75f0b CatChenal: weigts in make_classification as sequence not list (#wimlds)
106916f CatChenal: resolved merge
ec833ed CatChenal: fix conflicts with upstream/master
8e3cb1b CatChenal: split if-statement
776e74e CatChenal: added test `test_make_classification_weights_type` in test_samples_ge…
4b700b8 CatChenal: fixed flake8 & pylint errors
0c7fec4 CatChenal: flake8 err in test file
cac527a CatChenal: Added parametrized tests for weights type.
6110038 CatChenal: Added untrapped TypeError in samples_generator.py and tests.
7946476 CatChenal: Added untrapped TypeError in samples_generator.py and tests.
fd2eae6 CatChenal: Minor changes as per @NicolasHug
a6dd8d9 CatChenal: Corrected `assert` statement in `test_make_multilabel_classification_…
e9f89bf CatChenal: Added coverage for weiths resizing in `test_make_classification_weigh…
0c2a124 CatChenal: Corrected `test_make_classification_weights_array_or_list_ok` as per …
b185474 CatChenal: Prettified docstr + updated whats_new/v0.22.rst.
9e7c60b CatChenal: Fixed rst problem in whats_new/v0.22.rst. :func:package.module.method…
d0bf043 CatChenal: Fixed rst :func: ref as per @thomasjpfan.
11a179d CatChenal: Removed organization link in release notes. #WiMLDS`
@@ -91,7 +91,8 @@ def make_classification(n_samples=100, n_features=20, n_informative=2,
     n_clusters_per_class : int, optional (default=2)
         The number of clusters per class.
 
-    weights : list of floats or None (default=None)
+    weights : array-like of shape (n_classes,) or (n_classes - 1,),
+              (default=None)
         The proportions of samples assigned to each class. If None, then
         classes are balanced. Note that if ``len(weights) == n_classes - 1``,
         then the last class weight is automatically inferred.
@@ -160,22 +161,27 @@ def make_classification(n_samples=100, n_features=20, n_informative=2,
                          " features")
     # Use log2 to avoid overflow errors
     if n_informative < np.log2(n_classes * n_clusters_per_class):
-        raise ValueError("n_classes * n_clusters_per_class must"
-                         " be smaller or equal 2 ** n_informative")
-    if weights and len(weights) not in [n_classes, n_classes - 1]:
-        raise ValueError("Weights specified but incompatible with number "
-                         "of classes.")
+        msg = "n_classes({}) * n_clusters_per_class({}) must be"
+        msg += " smaller or equal 2**n_informative({})={}"
+        raise ValueError(msg.format(n_classes, n_clusters_per_class,
+                                    n_informative, 2**n_informative))
+
+    if weights is not None:
+        if len(weights) not in [n_classes, n_classes - 1]:
+            raise ValueError("Weights specified but incompatible with number "
+                             "of classes.")
+        if len(weights) == n_classes - 1:
+            if isinstance(weights, list):
+                weights = weights + [1.0 - sum(weights)]
+            else:
+                weights = np.resize(weights, n_classes)

Review comment: That part isn't covered by the tests. I think you can cover it easily by setting n_classes=3 in …

+                weights[-1] = 1.0 - sum(weights[:-1])
+    else:
+        weights = [1.0 / n_classes] * n_classes
 
     n_useless = n_features - n_informative - n_redundant - n_repeated
     n_clusters = n_classes * n_clusters_per_class
 
-    if weights and len(weights) == (n_classes - 1):
-        weights = weights + [1.0 - sum(weights)]
-
-    if weights is None:
-        weights = [1.0 / n_classes] * n_classes
-        weights[-1] = 1.0 - sum(weights[:-1])
-
     # Distribute samples among clusters by weight
     n_samples_per_cluster = [
         int(n_samples * weights[k % n_classes] / n_clusters_per_class)
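To make the changed branch easier to follow, here is a minimal sketch that pulls the new weights handling out into a standalone function. The name `normalize_weights` is hypothetical and is not part of scikit-learn; the body mirrors the added lines of the diff, and the surrounding checks are assumptions made only for illustration. It also shows why the old `if weights and ...` test presumably had to go: NumPy refuses to evaluate the truth value of an array with more than one element.

```python
import numpy as np


def normalize_weights(weights, n_classes):
    """Hypothetical helper mirroring the weights handling added in the diff."""
    if weights is None:
        # No weights given: classes are balanced.
        return [1.0 / n_classes] * n_classes
    if len(weights) not in [n_classes, n_classes - 1]:
        raise ValueError("Weights specified but incompatible with number "
                         "of classes.")
    if len(weights) == n_classes - 1:
        if isinstance(weights, list):
            # Lists can simply be extended with the inferred last weight.
            weights = weights + [1.0 - sum(weights)]
        else:
            # np.resize returns a new, longer array padded by repeating the
            # input; the padded last slot is then overwritten.
            weights = np.resize(weights, n_classes)
            weights[-1] = 1.0 - sum(weights[:-1])
    return weights


# The old check `if weights and ...` needs the truth value of `weights`,
# which raises ValueError for a multi-element NumPy array.
try:
    bool(np.array([0.2, 0.3]))
    truthiness_ok = True
except ValueError:
    truthiness_ok = False

# n_classes=3 with two weights exercises both inference branches.
from_list = normalize_weights([0.2, 0.3], n_classes=3)
from_array = normalize_weights(np.array([0.2, 0.3]), n_classes=3)
```

Calling the helper with `n_classes=3` and only two weights, as the review comment suggests, reaches the `np.resize` branch for array input and the list-concatenation branch for list input.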
Currently this is not rendered nicely.
To render nicely:
Thanks, @thomasjpfan.
Would you please document how you reached that end-point to verify the rendering? My doc tree does not have a `/modules/generated/` path.
When you build the HTML documentation using these instructions, there will be a new folder, `doc/_build`, which contains `doc/_build/html/stable/index.html`, the landing page of the scikit-learn documentation. From there you can navigate to the `make_classification` docs by going to the API page.