DOC: Validate the Returns section in the docstrings #23138

Closed
datapythonista opened this issue Oct 13, 2018 · 5 comments
Labels
CI Continuous Integration Docs good first issue

@datapythonista
Member

scripts/validate_docstrings.py validates that the content of a docstring follows our standards. The script still does not check some of those standards, which gives users the wrong impression that a docstring is all right when it is not. All the missing validations are listed in #20298.

The Returns section consists of a first line with just the type, followed by the indented description. If a tuple is returned, the format is name : type (or types) for each returned value, with an indented description after each one. The descriptions need to start with a capital letter and finish with a period. See: https://pandas.pydata.org/pandas-docs/stable/contributing_docstring.html#section-4-returns-or-yields

This issue requires:

  • Change scripts/validate_docstrings.py to give errors if any of the described formats is not satisfied.
  • Add tests in scripts/tests/test_validate_docstrings.py
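
As a rough sketch of the description rule above (capital letter first, period last), a check could look something like this. The function name `check_return_descriptions`, the error wording, and the `(name, type, description lines)` tuple layout are all assumptions made for illustration; none of them is taken from scripts/validate_docstrings.py.

```python
def check_return_descriptions(entries):
    """Return a list of error messages for Returns descriptions.

    `entries` is a hypothetical list of (name, type, desc_lines) tuples,
    one per documented return value. `name` is "" for an unnamed value.
    """
    errors = []
    for name, type_, desc_lines in entries:
        label = name or type_
        # Join the indented description lines into one string.
        desc = " ".join(line.strip() for line in desc_lines).strip()
        if not desc:
            errors.append(f"Return value '{label}' has no description")
            continue
        # Only flag a lowercase letter, so a description that starts with
        # a backtick or a digit is not reported.
        if desc[0].isalpha() and desc[0].islower():
            errors.append(
                f"Description of '{label}' should start with a capital letter"
            )
        if not desc.endswith("."):
            errors.append(
                f"Description of '{label}' should finish with a period"
            )
    return errors
```

A test in the style requested above could then feed in one well-formed entry and one malformed entry and compare the returned messages.
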
@hongshaoyang
Contributor

would like to try taking this on!

few questions:

  1. how can i use test_validate_docstrings.py to create bad docstrings and use it in the validate_docstrings script?
  2. how can i know the structure of doc.returns or doc.yields?

@datapythonista
Member Author

To run the tests you should simply call (in a pandas dev environment, and in the project root): python -m pytest scripts if I'm not wrong.

If you want to know the value of variables in the validate_docstrings.py script, you can add print(whatever) on a line where the variable holds the value you are interested in, and then run the script. You can run it like ./scripts/validate_docstrings.py pandas.DataFrame.head to validate just one docstring, which makes this easier. Is this what you're asking?
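
For a quick look at what a Returns section contains without touching the script, something like the following may help. It is a minimal standard-library sketch and does not reproduce how validate_docstrings.py parses docstrings; `returns_section` and `example` are hypothetical names, and the docstring is modelled on the pandas.DataFrame.head one discussed in this thread.

```python
import inspect


def returns_section(obj):
    """Return the lines under the 'Returns' heading of obj's docstring."""
    lines = (inspect.getdoc(obj) or "").splitlines()
    section = []
    in_returns = False
    for i, line in enumerate(lines):
        next_line = lines[i + 1] if i + 1 < len(lines) else ""
        is_heading = bool(next_line) and set(next_line.strip()) == {"-"}
        if is_heading:
            # A new section starts here; keep lines only if it is Returns.
            in_returns = line.strip() == "Returns"
            continue
        if set(line.strip()) == {"-"}:
            continue  # the dashed underline itself
        if in_returns:
            section.append(line)
    # Drop blank lines left over before the next section heading.
    while section and not section[-1].strip():
        section.pop()
    return section


def example(n):
    """
    Return the first rows.

    Returns
    -------
    obj_head : same type as caller
        The first `n` rows of the caller object.

    See Also
    --------
    tail : Returns the last rows.
    """


for line in returns_section(example):
    print(repr(line))
```

Printing with `repr` keeps the leading indentation visible, which is what separates a header line from its description.
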

@hongshaoyang
Contributor

thanks for the quick response! so technically, the docstring for pandas.DataFrame.head should fail? here's the Returns section:

Returns
-------
obj_head : same type as caller
    The first `n` rows of the caller object.

And according to https://pandas.pydata.org/pandas-docs/stable/contributing_docstring.html#section-4-returns-or-yields:

But in this case, no name will be provided, unless the method returns or yields more than one value (a tuple of values).

So the Returns section should not provide a name for its return value?

@datapythonista
Member Author

In this case it should fail because there is a single value being returned, and it contains the name obj_head.

The correct cases are:

Returns
-------
int
    The single integer value that the function returns.

Returns
-------
id : int
    The function returns a tuple, and the first value is an integer with the id.
name : str
    The second value of the tuple is a string with the name.
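
The distinction between these cases and the obj_head one could be checked along the following lines. This is only a sketch under stated assumptions: `parse_returns`, `check_single_return_has_no_name`, and the error text are hypothetical, and the parsing is a simplified stand-in rather than what validate_docstrings.py actually does.

```python
def parse_returns(section_lines):
    """Split Returns-section lines into (name, type, desc_lines) tuples."""
    entries = []
    for line in section_lines:
        if not line.strip():
            continue
        if line[0].isspace():
            # Indented line: description text for the latest entry.
            if entries:
                entries[-1][2].append(line.strip())
            continue
        # Header line: either "type" or "name : type".
        name, sep, type_ = line.partition(" : ")
        if sep:
            entries.append((name.strip(), type_.strip(), []))
        else:
            entries.append(("", line.strip(), []))
    return entries


def check_single_return_has_no_name(section_lines):
    """Flag a Returns section that names a single return value."""
    entries = parse_returns(section_lines)
    if len(entries) == 1 and entries[0][0]:
        return ["Single return value should list only the type, not a name"]
    return []


single = ["int", "    The single integer value that the function returns."]
pair = [
    "id : int",
    "    The function returns a tuple, and the first value is an integer with the id.",
    "name : str",
    "    The second value of the tuple is a string with the name.",
]
head = [
    "obj_head : same type as caller",
    "    The first `n` rows of the caller object.",
]

print(check_single_return_has_no_name(single))  # no errors expected
print(check_single_return_has_no_name(pair))    # no errors expected
print(check_single_return_has_no_name(head))    # one error expected
```
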

@igorfassen
Contributor

I've added some checks in validate_docstrings.py, and some corresponding tests.

i've created the following pull request : #23432

(sorry, i'm not very familiar with github, i hope I've done this the right way...)

@jreback jreback added this to the 0.24.0 milestone Dec 30, 2018
thoo added a commit to thoo/pandas that referenced this issue Dec 30, 2018
  DOC: add checks on the returns section in the docstrings (pandas-dev#23138) (pandas-dev#23432)