Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DOC: SciPy extensions for code style and docstring guidelines. #13955

Merged
merged 15 commits into from
Mar 24, 2022

Conversation

WarrenWeckesser
Copy link
Member

@WarrenWeckesser WarrenWeckesser commented Apr 28, 2021

I've been keeping notes on various requests for changes that come up now and then in PRs in NumPy and SciPy. In this PR, I've documented some of the common requests that I've seen in a new section of the docs called "Code and Documentation Style Guide - The Missing Bits". These are issues that are not covered in the usual coding and documentation PEPs and guides, and that can be expressed as a fairly simple rule.

The motivation for these guidelines is this: if a core developer makes a request to change a PR that is a blocking request, and if the change can be expressed as a fairly simple rule, and if there is consensus among the core devs that the requested change should block the PR, then the rule should be documented. That way it is clear that the requested change is not just the reviewer's personal preference, but in fact is the preferred style approved by the SciPy devs.

It is unlikely that everyone will agree that we need all these new guidelines. In fact, I expect that for some of these guidelines, a reaction will be "it is not important, we don't need it". That's fine! But then we have to be sure that in the future, if a reviewer requests a such a change in a PR, the request is just a suggestion (i.e. an expression of personal preference), and not a request that will block the PR.

For example, I personally don't care about the relatively trivial guidelines about where to put the space when a long string is broken at a word boundary, or about the suggestion to always include a blank line as the last line of a multiline docstring. If everyone else feels the same, then I'll take those out. But that means reviewers who do happen to have a preference should not block a PR that doesn't follow the guidelines. (There is nothing wrong with stating the preference in a review comment, but it shouldn't block a PR.)

@WarrenWeckesser WarrenWeckesser marked this pull request as draft April 28, 2021 22:10
@tylerjereddy tylerjereddy added the Documentation Issues related to the SciPy documentation. Also check https://github.com/scipy/scipy.org label Apr 29, 2021
doc/source/dev/missing-bits.rst Show resolved Hide resolved
doc/source/dev/missing-bits.rst Outdated Show resolved Hide resolved
doc/source/dev/missing-bits.rst Outdated Show resolved Hide resolved
doc/source/dev/missing-bits.rst Outdated Show resolved Hide resolved
Copy link
Member

@rgommers rgommers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for writing this up Warren. Most/all of this looks like good guidance, and useful to document. My one caution is that this should be documentation primarily for us (maintainers), you cannot expect new contributors to read and adhere to all that. And if they don't, keep in mind https://numpy.org/devdocs/dev/reviewer_guidelines.html#communication-guidelines. These are important to not make contributing to SciPy feel painful.

Related: pre-commit is now much more user-friendly than manual pre-commit hooks were in the past, so integrating that and making it run tools/lint_diff.py (same as in CI) would also be useful I think. I haven't used it much, but quite a few people have told me they like it.

doc/source/dev/missing-bits.rst Outdated Show resolved Hide resolved
@tupui
Copy link
Member

tupui commented Apr 29, 2021

Related: pre-commit is now much more user-friendly than manual pre-commit hooks were in the past, so integrating that and making it run tools/lint_diff.py (same as in CI) would also be useful I think. I haven't used it much, but quite a few people have told me they like it.

Also these tools now offer a way to not kill git blame and such (see here). So having something like Black would remove lots of discussions. I set these up (black, pre-commit, isort) on a few projects at work and it's really convenient.

Copy link
Member

@tupui tupui left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, thanks for proposing that! (Do you have the feedback on skipping the CI?)

doc/source/dev/missing-bits.rst Show resolved Hide resolved
doc/source/dev/missing-bits.rst Show resolved Hide resolved
doc/source/dev/missing-bits.rst Show resolved Hide resolved
doc/source/dev/missing-bits.rst Show resolved Hide resolved
* For a new argument added to an existing function, two locations have been
used for the the 'versionadded' markup, [TBD: which is preferred?]:

* At the end of the description of the argument in the "Parameters" section
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 on my side for this option.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It has been roughly 11 months and no one has indicated a different preference, so let's go with the first option.

statement makes the import *optional*; this guideline says explicitly
that the import statement must not be included.


Copy link
Member

@tupui tupui Apr 29, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
How to use Random Number Generator (RNG) and seed
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In case random numbers are needed, `np.random.Generator` API must be used.
Use the helper function to validate the parameter `seed`, `random_state` [TBD: which is preferred?]::
from scipy._lib._util import check_random_state
def foo(seed=None):
rng = check_random_state(seed)
sample = rng.uniform(...) # for instance
If you want to write an example using random numbers in the documentation::
>>> rng = np.random.default_rng()
>>> sample = rng.uniform(...) # for instance
.. warning:: Do **not** specify the seed. This line will get overwritten and the same seed
will be used for conveniency for all examples. The value that you can use locally is:
`1638083107694713882823079058616272161`

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tupui, do you think you could split this into two parts, one for generating random data in the "Examples" section of a docstring, and one for using random data in unit tests? I hadn't followed very closely the discussions in #13863, but in #14182, @rkern pointed me to the important conclusions, including the part about a predefined seed being used when Sphinx builds the docs. This means a developer can test the code that will be used in an example, and be sure that the seed will match the one used in Sphinx. We need to get some version of this comment into these docs.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Back end of June. I will get to this as soon as I'm back if this is still a thing.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This has been updated and ready to include on my side.

Copy link
Contributor

@mdhaber mdhaber Mar 22, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Other than

In case random numbers are needed, np.random.Generator API must be used.

seeming to apply to test code and not just docstrings, this looks good to me. Pick any one of seed/random_state/rng for now. We can use the new decorator for backward compatible keyword renaming to change the name once we settle on something. We can use the new decorator for backward compatible keyword renaming to change the name once we settle on something.

That comment linked above looks like a good idea to me. It is good to include the seed here in the meantime, but it would also be nice to have something closer to English that we can import rather than having to refer to the docs every time. It would also be good if that thing could be automatically replaced in the user-facing docstring so that we don't need to remember to remove the seed before pushing (and inevitably add it back in and remove it again as a PR develops).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I opened #15852 for that.

@ev-br
Copy link
Member

ev-br commented May 1, 2021

One other thing to add here might be not using asserts outside of test code. Errors need to be raised explicitly.

@tupui
Copy link
Member

tupui commented May 2, 2021

I am also seeing this in some PR as we add seeding, we should also decide about how to name this: seed, random_sate, something else?

@tupui
Copy link
Member

tupui commented May 10, 2021

Another point, when raising an exception: how to document parameters? There are many format used here. Sometimes we do ValueError("`param` is not good"), ValueError("``param`` is not good") or ValueError("param is not good").

Copy link
Member

@tupui tupui left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should prioritize this PR. Possibly backport it, but not necessary as we push people to look at the devdoc anyway. All these are quite important things that we always have to talk about during reviews (and we could add a link to this in the PR template).

Parameters
----------
x : float
x must be nonnegative.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be work talking about how to format the params in the description vs others things with double backticks.

Suggested change
x must be nonnegative.
`x` must be nonnegative.

@WarrenWeckesser
Copy link
Member Author

WarrenWeckesser commented Jun 6, 2021

Hmmm... why are the Azure tests running? I thought [skip azp] was working these days.

@WarrenWeckesser
Copy link
Member Author

WarrenWeckesser commented Jun 7, 2021

Hmmm... why are the Azure tests running? I thought [skip azp] was working these days.

Proposed fix for the handling of [skip azp] is in #14197.

@WarrenWeckesser WarrenWeckesser marked this pull request as ready for review March 22, 2022 09:22
[skip azp] [skip actions]
@WarrenWeckesser
Copy link
Member Author

WarrenWeckesser commented Mar 22, 2022

I have updated the PR and moved it out of "draft" status.

These are unresolved items where conversations have been started here but would benefit from being split out as separate follow-up issues or PRs:

  • Use of backticks around variable names: reconciling the NumPy docstring guidelines with (1) desired
    appearance of the rendered web pages, and (2) occasional incorrect links when single backticks are used.
  • Use of delimiters around parameter names in error and warning messages.
  • Using (and seeding) random numbers in docstrings and in unit tests. (@tupui has provided a starting point for these guidelines in the comments.)

I don't know what is happening with CircleCI; that's the one test that I wanted to run!

@ilayn
Copy link
Member

ilayn commented Mar 22, 2022

We are using git+ssh instead of https and things changed on git side. https://stackoverflow.com/questions/70663523/the-unauthenticated-git-protocol-on-port-9418-is-no-longer-supported

@tupui
Copy link
Member

tupui commented Mar 22, 2022

I don't know what is happening with CircleCI; that's the one test that I wanted to run!

@WarrenWeckesser You need to merge main to this PR. Since you branched there has been a few changes which require that.

@tupui
Copy link
Member

tupui commented Mar 22, 2022

These are unresolved items where conversations have been started here but would benefit from being split out as separate follow-up issues or PRs:

  • Use of backticks around variable names: reconciling the NumPy docstring guidelines with (1) desired
    appearance of the rendered web pages, and (2) occasional incorrect links when single backticks are used.

We already are enforcing something. We should describe the current state at least. It does not prevent us for discussing future changes.

  • Using (and seeding) random numbers in docstrings and in unit tests. (@tupui has provided a starting point for these

I would not delay this. A lot of PR are wrong with respect to this and I always struggle to point out official doc for it.

Copy link
Member

@rgommers rgommers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Everything that's here LGTM. +1 for merging as is, and leaving other topics for follow-ups.

Copy link
Contributor

@mdhaber mdhaber left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The rest LGTM, if we're holding other discussions for followups.

doc/source/dev/missing-bits.rst Outdated Show resolved Hide resolved
There is ongoing discussion in scipygh-13049.  We can add the appropriate guideline once that issue is officially resolved.

Co-authored-by: Matt Haberland <mhaberla@calpoly.edu>
@WarrenWeckesser
Copy link
Member Author

I forgot to add the skip notation in the last commit. The failed test is the Azure timeout that has been happening lately, and is not related to this PR.

@ilayn
Copy link
Member

ilayn commented Mar 24, 2022

This is already a very nice set of guidelines and we can fix the rough edges later. Already approved by maintainers and +1 from me too. In it goes thanks @WarrenWeckesser and all reviewers

@ilayn ilayn merged commit 1045161 into scipy:main Mar 24, 2022
@ilayn
Copy link
Member

ilayn commented Mar 24, 2022

@melissawm @tylerjereddy there is already a review request for you but I think working on separate PRs would be better to avoid the review bloat on this, hence the merge.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Documentation Issues related to the SciPy documentation. Also check https://github.com/scipy/scipy.org
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

8 participants