Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[MRG + 1] load_iris dataset: added return_X_y option #7049

Merged
merged 12 commits into from
Jul 29, 2016

Conversation

manu-chroma
Copy link
Contributor

Reference Issue

Add return_tuple option to data loaders that return a Bunch #6670

What does this implement/fix? Explain your changes.

  1. added return_X_y option in load_iris
  2. Wrote tests for the same

Any other comments?

  1. Majority of the code is taken from [MRG] Added load_iris return_X_y option #6704
  2. Some improvements based on the discussion on the same PR
  3. First contribution to scikit-learn

return_X_y : boolean, default=False.
If True, returns (data, target) instead of a Bunch object.
See below for more information about the `data` and `target` object

Returns
-------
data : Bunch
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you should change this line to reflect how the function can return a Bunch or a tuple now.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's precisely what this line defines. Don't you think ?

  If True, returns (data, target) instead of a Bunch object.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well on line 266 in the diff it still says that it returns data : Bunch. Would it be apt to include information about it possibly returning (data, target) as well?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it thanks.

@amueller
Copy link
Member

lgtm apart from comments

@amueller
Copy link
Member

I restarted CI

@manu-chroma
Copy link
Contributor Author

@amueller I did the fixes. Do I need to squash commits or do anything else here ?

@@ -264,6 +270,8 @@ def load_iris():
meaning of the features, and 'DESCR', the
full description of the dataset.

(data,target) : tuple if ``return_X_y`` is True
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pep8?

@amueller
Copy link
Member

lgtm apart from style comments. No, nothing else to do (except fix these)

@amueller amueller changed the title [MRG] load_iris dataset: added return_X_y option [MRG + 1] load_iris dataset: added return_X_y option Jul 28, 2016
@amueller
Copy link
Member

@nelson-liu +1 ?

@nelson-liu
Copy link
Contributor

still looking, give me a second

@nelson-liu
Copy link
Contributor

yup LGTM thanks @manu-chroma

@nelson-liu
Copy link
Contributor

nelson-liu commented Jul 28, 2016

does this warrant a what's new? or should there be one after this is implemented in all the loaders.

@nelson-liu
Copy link
Contributor

Also perhaps a versionadded?

@amueller
Copy link
Member

good catch. definitely a versionadded for the keyword. We can also add a whatsnew, and edit that in the future when we add this functionality to the other functions.

@nelson-liu
Copy link
Contributor

@manu-chroma does that sound doable? You would have to add an entry in whats_new.rst, and the appropriate "versionadded" tags below your new parameter (see this as an example)

@manu-chroma
Copy link
Contributor Author

@nelson-liu Sure, I can do that.

I was thinking of updating the whats_new.rst after I have implemented this for all loaders, which I suppose wouldn't take more than a few days. Thoughts ?

@nelson-liu
Copy link
Contributor

yeah, that's one possibility but i think it would be a bit better to create an entry now, and then edit it as the functionality for more loaders is added with their associated PR's.

@manu-chroma
Copy link
Contributor Author

Okay, I'll update the PR in a few hours. I suppose this change should come under Enhancements in whats_new.rst ?

@nelson-liu
Copy link
Contributor

yup, that's a good assumption.

@manu-chroma
Copy link
Contributor Author

Hey, I updated my branch before editing whats_new.rst

After that I added the version tag (f4d6168) and tried running sudo make html-noplot
It failed. Here's the error log. Error is because of a different file.

manu@hp:~/github/scikit-learn/doc$ sudo make html-noplot 
sphinx-build -D plot_gallery=0 -b html -d _build/doctrees   . _build/html/stable
Running Sphinx v1.4.5
WARNING: sphinx.ext.pngmath has been deprecated. Please use sphinx.ext.imgmath instead.
loading pickled environment... not yet created
/home/manu/github/scikit-learn/sklearn/cross_validation.py:43: DeprecationWarning: This module was deprecated in version 0.18 in favor of the model_selection module into which all the refactored classes and functions are moved. Also note that the interface of the new CV iterators are different from that of this module. This module will be removed in 0.20.
  "This module will be removed in 0.20.", DeprecationWarning)
/home/manu/github/scikit-learn/sklearn/learning_curve.py:23: DeprecationWarning: This module was deprecated in version 0.18 in favor of the model_selection module into which all the functions are moved. This module will be removed in 0.20
  DeprecationWarning)

Exception occurred:
  File "/home/manu/github/scikit-learn/sklearn/tree/tree.py", line 61, in <module>
    "mae": _criterion.MAE}
AttributeError: 'module' object has no attribute 'MAE'
The full traceback has been saved in /tmp/sphinx-err-0pnBuo.log, if you want to report the issue to the developers.
Please also report this if it was a user error, so that a better error message can be provided next time.
A bug report can be filed in the tracker at <https://github.com/sphinx-doc/sphinx/issues>. Thanks!
Makefile:47: recipe for target 'html-noplot' failed
make: *** [html-noplot] Error 1

@nelson-liu
Copy link
Contributor

Did you rerun python setup.py build_ext --inplace?

@manu-chroma
Copy link
Contributor Author

Yes, that was the case. Really sorry for forgetting the standard procedure.

I've updated whats_new and added version tag. Thoughts ?

@@ -258,6 +258,15 @@ def load_iris():

Read more in the :ref:`User Guide <datasets>`.

Parameters
----------
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is an extra space here. Sphinx will complain.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed. 👍

@amueller amueller merged commit b8be019 into scikit-learn:master Jul 29, 2016
@amueller
Copy link
Member

thanks :)

Parameters
----------

.. versionadded:: 0.18
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry i'm late to the party, shouldn't the versionadded be under return_X_y? and do we need a versionadded under the new return type (say around line 280)?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes. Do you want to do a PR to fix it? (sorry I'm swamped)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure ill do it / fix the other comments i had on this PR @amueller @manu-chroma

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for replying late.

I intend on sending the next PR this weekend (adding return_X_y option to load_breast_cancer dataset)
Should I take care of this ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that's fine, i made a fix at #7138 . Just make sure to do it for next time :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry and I'll keep that in mind.

olologin pushed a commit to olologin/scikit-learn that referenced this pull request Aug 24, 2016
* load_iris dataset:added return_X_y option

* Updated return type description

* improved return type description

* Removed extra line

* Added extra line

* Added test sentence.

* Remove sample text

* Fixes

* pep8

* Added version tag

* added entry in whats_new

* fixed extra space
TomDLT pushed a commit to TomDLT/scikit-learn that referenced this pull request Oct 3, 2016
* load_iris dataset:added return_X_y option

* Updated return type description

* improved return type description

* Removed extra line

* Added extra line

* Added test sentence.

* Remove sample text

* Fixes

* pep8

* Added version tag

* added entry in whats_new

* fixed extra space
@manu-chroma manu-chroma deleted the load_iris branch June 20, 2017 13:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants