Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[MRG+1] added return_X_y option to toy datasets in sklearn.datasets #7154

Merged
merged 8 commits into from Aug 7, 2016

Conversation

Projects
None yet
4 participants
@manu-chroma
Copy link
Contributor

manu-chroma commented Aug 6, 2016

Reference Issue

Fixes #6670

What does this implement/fix? Explain your changes.

added return_X_y option to following datasets

  1. load_digits
  2. load_diabetes
  3. load_linnerud
  4. load_boston

Any other comments?

Continuation of #7049

@@ -440,6 +446,10 @@ def load_digits(n_class=10):
sample, 'target_names', the meaning of the labels, and 'DESCR',
the full description of the dataset.
(data, target) : tuple if ``return_X_y`` is True
.. versionadded:: 0.18

This comment has been minimized.

Copy link
@jnothman

jnothman Aug 6, 2016

Member

I assume this doesn't get formatted under the description of the return type...

This comment has been minimized.

Copy link
@jnothman

jnothman Aug 6, 2016

Member

I.e. it would not be clear what this applies to

This comment has been minimized.

Copy link
@manu-chroma

manu-chroma Aug 6, 2016

Author Contributor

This is what it currently looks like
screenshot from 2016-08-06 16-59-14

This comment has been minimized.

Copy link
@jnothman

jnothman Aug 6, 2016

Member

It should be indented.

This comment has been minimized.

Copy link
@jnothman

jnothman Aug 6, 2016

Member

It's not rendering correctly at all: it's being parsed as a return value name.

This comment has been minimized.

Copy link
@manu-chroma

manu-chroma Aug 6, 2016

Author Contributor

You're talking about this line ?
(data, target) : tuple if return_X_y is True

This comment has been minimized.

Copy link
@manu-chroma

manu-chroma Aug 6, 2016

Author Contributor

I understood the mistake. Fixing.

This comment has been minimized.

Copy link
@manu-chroma

manu-chroma Aug 6, 2016

Author Contributor

screenshot from 2016-08-06 17-21-53

This comment has been minimized.

Copy link
@nelson-liu

nelson-liu Aug 6, 2016

Contributor

oops, that's my bad @jnothman thanks for pointing it out.

If True, returns ``(data, target)`` instead of a Bunch object.
See below for more information about the `data` and `target` object.
.. versionadded:: 0.18

This comment has been minimized.

Copy link
@jnothman

jnothman Aug 6, 2016

Member

Similarly, I think this needs indenting to render in the correct place.

This comment has been minimized.

Copy link
@manu-chroma

manu-chroma Aug 6, 2016

Author Contributor

Fixed.

@@ -468,14 +478,17 @@ def load_digits(n_class=10):
flat_data, target = flat_data[idx], target[idx]
images = images[idx]

if return_X_y:
return flat_data, target.astype(np.int)

This comment has been minimized.

Copy link
@jnothman

jnothman Aug 6, 2016

Member

I think the astype should precede this branching. Even do it right after the file reading block.

@@ -468,7 +468,7 @@ def load_digits(n_class=10, return_X_y=False):
delimiter=',')
with open(join(module_path, 'descr', 'digits.rst')) as f:
descr = f.read()
target = data[:, -1]
target = data[:, -1].astype(np.int)

This comment has been minimized.

Copy link
@manu-chroma
@@ -233,8 +233,13 @@ Enhancements
- Added new return type ``(data, target)`` : tuple option to
:func:`load_iris` dataset,
(`#7049 <https://github.com/scikit-learn/scikit-learn/pull/7049>`_)

This comment has been minimized.

Copy link
@jnothman

jnothman Aug 6, 2016

Member

can you just put the PR numbers together at the end of the dataset listing?

This comment has been minimized.

Copy link
@manu-chroma

manu-chroma Aug 6, 2016

Author Contributor

Is this better ?
screenshot from 2016-08-06 17-34-34

This comment has been minimized.

Copy link
@jnothman

jnothman Aug 6, 2016

Member

I just meant commas between them... But whatever.

@manu-chroma

This comment has been minimized.

Copy link
Contributor Author

manu-chroma commented Aug 6, 2016

whats_new.rst

screenshot from 2016-08-06 17-30-36

@@ -277,7 +277,7 @@ def load_iris(return_X_y=False):
(data, target) : tuple if ``return_X_y`` is True
.. versionadded:: 0.18
.. versionadded:: 0.18

This comment has been minimized.

Copy link
@jnothman

jnothman Aug 6, 2016

Member

Thanks for fixing these up

@jnothman

This comment has been minimized.

Copy link
Member

jnothman commented Aug 6, 2016

LGTM

@jnothman jnothman changed the title [MRG] added return_X_y option to toy datasets in sklearn.datasets [MRG+1] added return_X_y option to toy datasets in sklearn.datasets Aug 6, 2016

manu-chroma added some commits Aug 6, 2016

@@ -231,10 +231,15 @@ Enhancements
By `Sebastian Säger`_ and `YenChen Lin`_.

- Added new return type ``(data, target)`` : tuple option to

This comment has been minimized.

Copy link
@nelson-liu

nelson-liu Aug 6, 2016

Contributor

should the double quotes extend to after "tuple"...?

This comment has been minimized.

Copy link
@jnothman

jnothman Aug 6, 2016

Member

actually, I'd rather see mention of the parameter name here...

Returns
-------
data : Bunch
Dictionary-like object, the interesting attributes are: 'data' and
'targets', the two multivariate datasets, with 'data' corresponding to
the exercise and 'targets' corresponding to the physiological
measurements, as well as 'feature_names' and 'target_names'.
(data, target) : tuple if ``return_X_y`` is True

This comment has been minimized.

Copy link
@agramfort

agramfort Aug 7, 2016

Member

does this render fine?

This comment has been minimized.

Copy link
@manu-chroma

manu-chroma Aug 7, 2016

Author Contributor

yeah, think so.

image

@agramfort

This comment has been minimized.

Copy link
Member

agramfort commented Aug 7, 2016

looks like it's ok on circle.

merging.

thx @manu-chroma

@agramfort agramfort merged commit 8994d0e into scikit-learn:master Aug 7, 2016

3 checks passed

ci/circleci Your tests passed on CircleCI!
Details
continuous-integration/appveyor/pr AppVeyor build succeeded
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details

TomDLT added a commit to TomDLT/scikit-learn that referenced this pull request Oct 3, 2016

[MRG+1] added return_X_y option to toy datasets in sklearn.datasets (s…
…cikit-learn#7154)

* added return_X_y support to more dataset loaders

* fix typo

* updated whats_new.rst

* fix indentation for version added tag

* call astype before the branching

* better formatting in whats_new.rst

* better formatting

* updated what's new
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.