Renaming of label to classes #1591

Closed
amueller opened this Issue Jan 17, 2013 · 32 comments

Comments

Projects
None yet
6 participants
Owner

amueller commented Jan 17, 2013

I don't like saying this but to me it seems that the parameter labels of confusion_matrix should be called classes.
In general, we should do a git grep labels and git grep classes.
I think we use classes in most places now (in particular classifier.classes_) and we should probably not use the name labels anywhere.

Update

We use labels_ in clustering. I guess that should not be renamed to classes_ as that would be very confusing.
So do we keep labels_ there? And use classes everywhere in the supervised setting?

Contributor

hrishikeshio commented Feb 12, 2013

Hi I'm new here. I'm 250th on kaggle :)
I really want to contribute to this project. Maybe I can get started with this issue. Can you guide me If its not too much trouble?

Owner

amueller commented Feb 12, 2013

Hi. Thanks for wanting to contribute. I'd really like some other core dev to give their 👍 on this issue before you start working on it. Give me a minute and maybe I'll find something nice for you :)

Owner

GaelVaroquaux commented Feb 12, 2013

I am 👍 as long as there is a good deprecation path.

Contributor

hrishikeshio commented Feb 12, 2013

@amueller Great! Thanks!

Owner

amueller commented Feb 12, 2013

I did the numbers and I think I also convinced @arjoly that classes is probably the way to go. So if @GaelVaroquaux agrees I guess you can get started if you like. The idea would be to do a git grep labels and deprecate all parameters with that name and add a new parameter that does the same and is called classes.

You can also first have a go at #1652 as @nAk123 didn't seem to be working on it and it is pretty straight-forward - it also needs a deprecation, though.

Owner

amueller commented Feb 12, 2013

Another nice addition would be #1521 which requires adding tests to the metrics module.

Contributor

hrishikeshio commented Feb 12, 2013

I think i will start with #1652 so I can settle in.
What do you mean by "it needs a deprecation"?

Owner

amueller commented Feb 12, 2013

I commented there.

Owner

amueller commented Feb 12, 2013

@GaelVaroquaux we should add a comment to the contributor docs about establishing backward compatibility and deprecations.

Owner

arjoly commented Feb 12, 2013

I did the numbers and I think I also convinced @arjoly that classes is probably the way to go. So if @GaelVaroquaux agrees I guess you can get started if you like. The idea would be to do a git grep labels and deprecate all parameters with that name and add a new parameter that does the same and is called classes.

From the user point of view, consistency wins.

Owner

GaelVaroquaux commented Feb 12, 2013

@GaelVaroquaux we should add a comment to the contributor docs about
establishing backward compatibility and deprecations.

Yes, that would be great!

Owner

amueller commented Feb 18, 2013

@hrishikeshio you could now try this one maybe?

Contributor

hrishikeshio commented Feb 18, 2013

Sounds interesting. I will work on it tomorrow :) Thanks @amueller

Contributor

hrishikeshio commented Feb 19, 2013

What deprecation warning shoud I write? When will we completely remove "labels"?

    if labels is not None:
        warnings.warn("``labels`` is deprecated and will be removed in"
                      "version 0.15. Use ``classes`` instead",
                      DeprecationWarning)
        classes = labels
Owner

amueller commented Feb 19, 2013

Two version ahead, so 0.15.

Contributor

hrishikeshio commented Feb 19, 2013

How to give deprecation warning when someone calls attribute?
like this:

print alg.labels_
Owner

amueller commented Feb 19, 2013

@property
@deprecated("it will be removed in 0.15. Use 'classes_' instead")
def labels_(self):
    return self.classes_
Owner

amueller commented Feb 19, 2013

If you want, a pull request putting this in the developer docs would also be welcome ;)

Contributor

hrishikeshio commented Feb 19, 2013

Thanks. Maybe I will do that.

Contributor

hrishikeshio commented Feb 19, 2013

Should I update the tests too?

Owner

amueller commented Feb 19, 2013

Ideally I think you would leave the old tests in, catch the deprecation warnings and add new tests. That way we know we are backward compatible and also have tests for the new behavior.

Contributor

hrishikeshio commented Feb 19, 2013

Done for clusters/
except for kmeans. I need advice.
The _kmeans.py has lots of functions/attributs/variables with "label" in it.
Should those all be deprecated?
Passes all the tests but I also I need to update the tests to make sure it is giving deprecation warnings.
Please review and advice if I am on the right path.
Also, what about .pyx files? I have never seen those before. Seem like cython files.

Owner

amueller commented Mar 9, 2013

I didn't think about the clustering algorithms :-/ I don't think we should rename labels_ to classes_.

Owner

ogrisel commented Mar 11, 2013

I didn't think about the clustering algorithms :-/ I don't think we should rename labels_ to classes_.

I was not sure my-self but in retrospect why wouldn't using classes_ everywhere (even for clustering algorithms) be a bad idea (despite introducing yet another API change)?

Owner

amueller commented Mar 11, 2013

Because classes_ in supervised learning means the unique classes present in y, i.e. the string-names of the labels (as now supported) (shape n_classes)

The labels_ attribute on the other hand contains the cluster-indices assigned to the training set. (shape n_samples)

Owner

ogrisel commented Mar 11, 2013

Argl indeed, I forgot myself.

Owner

amueller commented Mar 11, 2013

Yeah also totally overlooked that :-/

Contributor

hrishikeshio commented Mar 12, 2013

Ok. Can you merge only the files you need? Or do I need to undo changes using git?

Owner

amueller commented Mar 12, 2013

I will give it a try when I find time.

Owner

amueller commented Mar 12, 2013

Any one else feel welcome to do it before ;)

Contributor

rajatkhanduja commented May 16, 2014

After reading the comments here and on the PR, I am still not sure what needs to be done further to resolve this. Can someone help me out here?

Owner

amueller commented Jul 16, 2014

closing this. I think the codebase is pretty consistent now.

amueller closed this Jul 16, 2014

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment