New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sklearn.datasets.make_classification modifies its weights input. #9865

Closed
stuz5000 opened this Issue Oct 2, 2017 · 6 comments

Comments

Projects
None yet
5 participants
@stuz5000

stuz5000 commented Oct 2, 2017

Description

sklearn.datasets.make_classification modifies its weights parameters. Rude!

Steps/Code to Reproduce


init_weights = [0.4]
weights = list(init_weights)

X,y = sklearn.datasets.make_classification(n_samples=100,
                                           n_features=4, n_informative=2, n_redundant=2, n_repeated=0,
                                           n_classes=2, n_clusters_per_class=2, weights=weights,
                                           flip_y=0.01, class_sep=1.0, hypercube=True,
                                           shift=0.0, scale=1.0, shuffle=True, random_state=123)

assert weights == init_weights

Expected Results

weights is modified. Should be copied

Actual Results

Above assert is hit.

Versions

Current:
https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/datasets/samples_generator.py

165: weights.append(1.0 - sum(weights))

Should be:

  weights = weights + [1.0 - sum(weights)]
@stuz5000

This comment has been minimized.

Show comment
Hide comment
@stuz5000

stuz5000 Oct 2, 2017

Actually, because also:

        weights[-1] = 1.0 - sum(weights[:-1])

... need to add:

  weights=tuple(weights)  # Or
  weights=list(weights)

... or similar at the start. (Should copy to ndarray is weights is ndarray or Series?)

stuz5000 commented Oct 2, 2017

Actually, because also:

        weights[-1] = 1.0 - sum(weights[:-1])

... need to add:

  weights=tuple(weights)  # Or
  weights=list(weights)

... or similar at the start. (Should copy to ndarray is weights is ndarray or Series?)

@jnothman

This comment has been minimized.

Show comment
Hide comment
@jnothman

jnothman Oct 3, 2017

Member
Member

jnothman commented Oct 3, 2017

@infinite-Joy

This comment has been minimized.

Show comment
Hide comment
@infinite-Joy

infinite-Joy Oct 5, 2017

Sorry if this is a silly question, but I am not able to understand why is this an issue.

infinite-Joy commented Oct 5, 2017

Sorry if this is a silly question, but I am not able to understand why is this an issue.

@stuz5000

This comment has been minimized.

Show comment
Hide comment
@stuz5000

stuz5000 Oct 5, 2017

stuz5000 commented Oct 5, 2017

@kaichogami

This comment has been minimized.

Show comment
Hide comment
@kaichogami

kaichogami Oct 6, 2017

Contributor

Hi, I would like to work on this issue.

Contributor

kaichogami commented Oct 6, 2017

Hi, I would like to work on this issue.

@lesteve

This comment has been minimized.

Show comment
Hide comment
@lesteve

lesteve Oct 6, 2017

Member

Go for it! It would be great to add a test as well.

Member

lesteve commented Oct 6, 2017

Go for it! It would be great to add a test as well.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment