Describe doc #4762

Merged
merged 10 commits into from May 7, 2019

@martindurant
Member

commented Apr 30, 2019

  • Tests added / passed
  • Passes flake8 dask

Written to add extra information to the derived docstring for dataframe.describe (fixes #4401)

partially fixes #3265 (simplest solution)

@@ -1537,7 +1537,7 @@ def quantile(self, q=0.5, axis=0, method='default'):

     @derived_from(pd.DataFrame)
     def describe(self, split_every=False, percentiles=None, percentiles_method='default'):
-        # currently, only numeric describe is supported
+        """Currently, only numeric describe is supported """

@mrocklin

mrocklin Apr 30, 2019

Member

We might use this function as a test for the derived_from functionality change in this PR.

As an example, I think that there is already a test in test_utils.py that imports dask.dataframe and checks the state of one of the docstrings.

@martindurant

martindurant Apr 30, 2019

Author Member

good idea
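
For illustration, a test along those lines might look roughly like the sketch below. It is written for Python 3 and uses a stand-in decorator, `copy_doc_from`, which is hypothetical and is not dask's `derived_from`, so that it runs without dask or pandas installed. The real test would presumably import dask.dataframe and inspect the docstring of `describe` directly.

```python
def copy_doc_from(original_cls):
    """Hypothetical stand-in for a derived_from-style decorator."""
    def wrapper(method):
        original = getattr(original_cls, method.__name__)
        copied = original.__doc__ or ""
        own = (method.__doc__ or "").strip()
        # Keep the method's own docstring text alongside the copied one.
        method.__doc__ = (copied + "\n\n" + own) if own else copied
        return method
    return wrapper


class Original:
    def describe(self):
        """Generate descriptive statistics."""


class Derived:
    @copy_doc_from(Original)
    def describe(self):
        """Currently, only numeric describe is supported"""


def test_describe_docstring():
    doc = Derived.describe.__doc__
    # The copied text and the extra note should both survive.
    assert "Generate descriptive statistics" in doc
    assert "only numeric describe is supported" in doc
```

A test runner such as pytest would collect `test_describe_docstring` by name; it can also be called directly.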

martindurant added some commits Apr 30, 2019

    ]
    if extra:
        bits.insert(1, extra.rstrip('\n') + '\n\n')
        bits.insert(1, indent)

@mrocklin

mrocklin Apr 30, 2019

Member

Would it be possible to move extra below the "this is copied" disclaimer?

Signature: dd.DataFrame.describe(self, split_every=False, percentiles=None, percentiles_method='default')
Docstring:
Generate descriptive statistics that summarize the central tendency,
dispersion and shape of a dataset's distribution, excluding
``NaN`` values.

Currently, only numeric describe is supported

This docstring was copied from pandas.core.frame.DataFrame.describe.

Some inconsistencies with the Dask version may exist.

Analyzes both numeric and object series, as well
as ``DataFrame`` column sets of mixed data types. The output
will vary depending on what is provided. Refer to the notes
below for more detail.

@martindurant

martindurant Apr 30, 2019

Author Member

done
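
The ordering requested above (summary paragraph, then the "copied from" disclaimer, then the extra text, then the rest of the pandas docstring) can be sketched as follows. This is a minimal illustration, not the code merged in this PR; the function name `add_disclaimer` and its signature are assumptions made for the example.

```python
import re


def add_disclaimer(doc, source, extra=""):
    """Insert a 'copied from' disclaimer, then optional extra text,
    after the first paragraph of ``doc``.

    Hypothetical helper for illustration only.
    """
    if not doc:
        return doc
    i = doc.find("\n\n")
    if i == -1:
        # No paragraph break found: leave the docstring unchanged.
        return doc
    head, tail = doc[:i + 2], doc[i + 2:]
    # Reuse the indentation of the line that follows the first paragraph.
    indent = re.match(r"[ \t]*", tail).group(0)
    bits = [
        head,
        indent, "This docstring was copied from %s.\n\n" % source,
        indent, "Some inconsistencies with the Dask version may exist.\n\n",
    ]
    if extra:
        # The extra text goes below the disclaimer, not above it.
        bits += [indent, extra.rstrip("\n") + "\n\n"]
    bits.append(tail)
    return "".join(bits)


doc = (
    "Generate descriptive statistics.\n\n"
    "    Analyzes both numeric and object series.\n"
)
result = add_disclaimer(
    doc,
    "pandas.core.frame.DataFrame.describe",
    extra="Currently, only numeric describe is supported",
)
print(result)
```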

@martindurant

Member Author

commented May 1, 2019

The errors are py2-only, and I really don't understand them or know if they are caused by this code (seem to be import errors of "methods"). Help!

@mrocklin

Member

commented May 4, 2019

The errors are py2-only, and I really don't understand them or know if they are caused by this code (seem to be import errors of "methods"). Help!

The docstring munging code runs at import time (we have @derived_from decorators in the main file). In your situation I would create a Python 2 environment, try importing the module, and see what happens. Maybe I would also use the Python debugger (if possible at this stage) or maybe put in a pdb statement?

I'm not sure if that helps or not. Hopefully those steps move towards a good solution.

martindurant added some commits May 6, 2019

Do not add lines to empty docstring
(this is only relevant to -OO optimized bytecode, which strips docstrings)
@martindurant

Member Author

commented May 6, 2019

There we go :) It was to do with docstring stripping during bytecode optimisation.
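
To illustrate the cause, the sketch below compiles the same function at optimization level 0 and level 2 (level 2 corresponds to running with `-OO`) and shows that `__doc__` becomes `None` in the second case. It is a Python 3 illustration rather than the py2 setup from the failing CI run, and `append_note` is a hypothetical helper showing the kind of guard that avoids adding lines to a stripped docstring.

```python
SOURCE = '''
def f():
    "a docstring"
'''


def doc_at_level(optimize):
    """Compile SOURCE at the given optimization level and return f.__doc__."""
    namespace = {}
    code = compile(SOURCE, "<example>", "exec", optimize=optimize)
    exec(code, namespace)
    return namespace["f"].__doc__


def append_note(doc, note):
    """Append ``note`` to a docstring; leave a stripped (None) docstring alone."""
    if doc is None:
        return None
    return doc + "\n\n" + note


normal = doc_at_level(0)    # docstring kept
stripped = doc_at_level(2)  # docstring removed, as under -OO

print(repr(append_note(normal, "Extra note.")))
print(repr(append_note(stripped, "Extra note.")))
```

Without the `None` check, string operations on the stripped docstring would raise at import time, since the decorators run when the module is imported.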

@martindurant

Member Author

commented May 6, 2019

I think this should be merged now, can press the button if no one objects by tomorrow morning

@martindurant martindurant merged commit cae23c2 into dask:master May 7, 2019

2 checks passed

continuous-integration/appveyor/pr AppVeyor build succeeded
continuous-integration/travis-ci/pr The Travis CI build passed

almaleksia added a commit to almaleksia/dask that referenced this pull request May 10, 2019

jorge-pessoa pushed a commit to jorge-pessoa/dask that referenced this pull request May 14, 2019

Describe doc (dask#4762)
* Add disclaimer to dataframe.describe() [skip ci]

Fixes dask#4401

* Simplest revamp of dataframe doc generation

* add tests

* extra docstring after disclaimer

* add a little to the relevant docstring [skip ci]

* py2fixes

* Do not add lines to empty docstring

(this is only relevant to -OO optimized bytecode, which strips docstrings)

* skip on py2

Thomas-Z added a commit to Thomas-Z/dask that referenced this pull request May 17, 2019
