ENH Add parameter return_X_y to make_classification#30196
Conversation
|
@adrinjalali could you review this? |
adrinjalali
left a comment
There was a problem hiding this comment.
Please have a look at fetch_openml on how to document return_X_y and the documentation of the return values.
Co-authored-by: Adrin Jalali <adrin.jalali@gmail.com>
OmarManzoor
left a comment
There was a problem hiding this comment.
Thanks for the PR @SuccessMoses
I added a few comment
|
@OmarManzoor Thanks for the review. I am working on it |
OmarManzoor
left a comment
There was a problem hiding this comment.
LGTM. Thanks @SuccessMoses
|
CC: @adrinjalali |
| ) | ||
| if len(weights) == n_classes - 1: | ||
| if isinstance(weights, list): | ||
| weights = weights + [1.0 - sum(weights)] |
There was a problem hiding this comment.
this isn't modifying the existing variable, it's allocating a new chunk of memory, and the nameweights refers to that new chunk. So the original data passed to this function is never changed. Therefore this change in this PR is unnecessary.
There was a problem hiding this comment.
I did not intend to modify the variable weights passed by the user this is why I created new variable weights_
There was a problem hiding this comment.
You won't be changing the original variable, that's now how python works 😉
Investigate this example:
import numpy as np
a = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
def f(a):
a = a + 1
return a
print(f(a))
print(a)There was a problem hiding this comment.
this isn't modifying the existing variable, it's allocating a new chunk of memory,
Thank you @adrinjalali for the correction. make_classification returns a bunch object which also contains a dictionary of the original value of parameters like n_samples, n_features and weights that was used to generate the data.
reassigning the variable weights will change its content. Is there a way to work around this?
Reference Issues/PRs
Fixes #16532
What does this implement/fix? Explain your changes.
make_classificationAny other comments?
The dataset returned by
load_irisis a Bunch, which is more descriptive. #16532 proposes same.