Docstring style/format is not consistent #2974

Closed
IsaacG opened this issue Mar 10, 2022 · 9 comments

Comments

@IsaacG
Member

IsaacG commented Mar 10, 2022

The docstring param and return blocks vary from file to file. See:

» grep -hr -e ':return: [^ ]\+ ' -e ':param [^ ]\+:' | sed 's/^ *//' | sed 's/\(\([^ ]* \)\{5\}\).*/\1/'| sort -u
:param amount: Amount of seats
:param appetizers: list of appetizer
:param azara_record: tuple - a
:param budget: float - amount
:param budget: float - the
:param card: str - given
:param combined_record_group: tuple of tuples
:param coordinate: str - a
[...]
:return: int amount of prep
:return: integer count of student
:return: int - index at
:return: int - maximum value
:return: int - non-exchangeable value.
:return: int - number of
:return: int - number raised
:return: int remaining bake time
:return: int - the value
[...]

(Full output)

The predominant style appears to be:

:param <name>: <type> - <description>
:return: <type> - <description>

where the description begins with a lowercase letter and ends with a period.

Happy to send a PR if this is acceptable.


@BethanyG
Member

BethanyG commented Mar 10, 2022

@IsaacG -- I think these are largely fine as they stand. As long as they generally follow this format:

def some_function(some_parameter):
    """Function purpose in the form of a command ending in a period.
    
    :param <name>: <type> - <description>
    :return: <type> - <description>
    
    Additional notes, as needed.  Especially clarifications for error messaging or edge cases.
    """
    ....

    return <some function return value>

TL;DR: I think it's too much to be rigid about the fine details, as long as we stay within what PEP257 and PEP8 describe and additionally have some consistency with how we mark out the :params: and :returns:.

As the intro to PEP257 says:

The aim of this PEP is to standardize the high-level structure of docstrings: what they should contain, and how to say it (without touching on any markup syntax within docstrings). The PEP contains conventions, not laws or syntax.

“A universal convention supplies all of maintainability, clarity, consistency, and a foundation for good programming habits too. What it doesn’t do is insist that you follow it against your will. That’s Python!”
—Tim Peters on comp.lang.python, 2001-06-16

I consider the - and the <description> optional (less optional in the docstrings of early concept exercises).
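
To make the template concrete, here is a minimal sketch of two docstrings that would both fit it: one with the full `<type> - <description>` form and one with the dash and description left out. The function names and bodies are hypothetical, loosely based on phrases in the grep output above; they are not taken from the track's exercise files.

```python
def remaining_bake_time(expected_minutes, elapsed_minutes):
    """Calculate the bake time remaining.

    :param expected_minutes: int - total minutes the dish should bake.
    :param elapsed_minutes: int - minutes the dish has already baked.
    :return: int - remaining bake time in minutes.

    A negative result means the dish has baked for too long.
    """
    return expected_minutes - elapsed_minutes


def preparation_time(number_of_layers):
    """Calculate the preparation time.

    :param number_of_layers: int
    :return: int
    """
    # Hypothetical rule for illustration: two minutes per layer.
    return number_of_layers * 2
```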

I do think that having an additional colon after the :param: or :return: <type> is hard to read, makes things confusing, and is not mentioned in any of the docstring formats I took a quick look at. I did see both single and double dashes used in both PEP8 & 257.

Not every concept exercise is going to have these docstring stubs. We'll probably not have them after the classes exercise - or they will be less verbose. Eventually, we'd like students to make ones of their own. But until/unless we write specific exercises around a specific style or format of docstrings, I think we shouldn't be enforcing more than the general PEP257 format (with the demarcations for params and returns noted previously).

So I don't think a PR from you for concept exercises is needed at this time. @Metallifax has taken on reviewing docstrings for concept exercises as time allows (adding in summary sentences where needed), and can clean up the small set of cases where <type> has been omitted, or some other issue arises.

@Metallifax
Contributor

Hey @IsaacG,
I'll keep an eye out for missing types in the param/return lines, as @BethanyG said, when I have the time to devote to the repository again, which should be pretty soon (recent midterms ate up a lot of my time). If there's a way to extend pydocstyle or any other docstring linter to check for missing types, that'd be our best bet for uniformity from exercise to exercise (in my opinion). I just haven't found anything like that yet, and pydocstyle doesn't seem to care whether types are there or not in its current configuration. Thanks for pointing this out though; I'll make a note of this for future PRs. Cheers.
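
As a rough illustration of what such a check could look like, here is a standalone sketch that uses only the standard library (`ast` and `re`). It is not a pydocstyle plugin or an existing tool. The names `FIELD`, `KNOWN_TYPES`, `has_type`, and `find_untyped_fields` are made up for this example, and the list of recognised type names is an assumption that would need tuning for the real exercises.

```python
import ast
import re

# Matches ":param <name>: <rest>" or ":return: <rest>" on a single docstring line.
FIELD = re.compile(r"^:(?:param\s+(?P<name>\w+)|return):\s*(?P<rest>.*)$")

# Assumption: the set of type names we expect to see first in a field.
KNOWN_TYPES = {"int", "float", "str", "bool", "list", "tuple", "dict", "set", "None"}


def has_type(rest):
    """Return True if the field text starts with a recognised type name."""
    # Everything before an optional " - <description>" is treated as the type part.
    type_part = rest.split(" - ", 1)[0].strip()
    if not type_part:
        return False
    first_word = re.split(r"[\s\[(,]", type_part, maxsplit=1)[0]
    return first_word in KNOWN_TYPES


def find_untyped_fields(source):
    """Return (function name, docstring line) pairs that lack a recognised type."""
    problems = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        docstring = ast.get_docstring(node) or ""
        for line in docstring.splitlines():
            match = FIELD.match(line.strip())
            if match and not has_type(match.group("rest")):
                problems.append((node.name, line.strip()))
    return problems


# Hypothetical stub file content, modelled on lines from the grep output.
SAMPLE = '''
def seats(amount):
    """Generate seat letters.

    :param amount: Amount of seats
    :return: list - seat letters.
    """

def value(card):
    """Determine the value of a card.

    :param card: str - given card.
    :return: int - the value.
    """
'''

print(find_untyped_fields(SAMPLE))
# prints [('seats', ':param amount: Amount of seats')]
```

The check only looks at the first word of each field, so it treats the dash and description as optional and would still accept a line such as `:return: int amount of prep`.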

@BethanyG
Member

BethanyG commented Mar 11, 2022

@Metallifax - if/when either you or I have time, maybe we can take a look at some of the programs in the documentation generation space for Python. In particular:

Sphinx
pydoctor
pdoc
doxygen

Also: What the Python Tutorial Says

But also consumers of the generators:
docusaurus
ReadtheDocs
MkDocs

Sphinx-RTD style comes the closest (ish) right now to what we are doing - but that doesn't necessarily mean we want to follow it. For one, separating the type into its own line is quite hard to read, and the whole format suffers from an extreme excess of periods and colons. 😱

  • This post by Thomas Cokelaer comes the closest I've found yet to showing how a Sphinx docstring format then gets processed by Sphinx RST.

  • This post lays out how the PSF/Python core team uses Sphinx and reST to produce Python's documentation.
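
For comparison, this is roughly what the Sphinx-style field list looks like when the type is split onto its own `:type:` / `:rtype:` line, as described above. The function is a hypothetical example (it only handles number cards and J/Q/K), not code from the track.

```python
def card_value(card):
    """Determine the scoring value of a card.

    :param card: given card.
    :type card: str
    :return: the value of the card.
    :rtype: int
    """
    # Hypothetical rule for illustration: face cards count as 10.
    return 10 if card in ("J", "Q", "K") else int(card)
```

Each parameter takes two lines here instead of one, which is the readability cost mentioned above.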

The intricacies of machine processing and auto-generation are fairly burdensome to someone who's learning to code in Python, and there is a high likelihood that the very next project or team they are on will require something different. Cases in point:

So it does feel like a sort of "losing battle" beyond enforcing a few points. That being said, I can update this issue with any tooling I find that may or may not be helpful.

@IsaacG
Member Author

IsaacG commented Mar 11, 2022

Would recording an official preferred style be worth having?

@BethanyG
Member

Not at the moment. I think we're not quite there yet. Maybe after this last pass by @Metallifax, and some noodling on what we might want to cover in a docstring and doctest series of concept exercises.

But even then, it would be for track exemplar code and (selected) stub files. I am not about to go and require them on all submissions or mentor notes. And we will NOT be putting any in test files - that would really screw up the code that gets displayed on the website currently.

The overarching message I'd like to convey to students is that having docstrings that are useful to those reading the code later is a really good thing. And that generally, it is a really good idea to follow the conventions in PEP257. The earlier you have the habit, the easier it is. Like having unit tests, it shows you are a good developer.

Having them written so that documentation is automated is 🦄 ✨, and the hallmark of a stellar dev team -- but also (like working with unicode) fraught with complication the further you dig.

@IsaacG
Member Author

IsaacG commented Mar 11, 2022

Sorry if I wasn't clear. I definitely didn't mean to imply that we should be pushing a style for student submissions! I was thinking it'd be helpful for stubs, examples and exemplars to keep them consistent. Thanks!

@BethanyG
Member

Same answer. Not there yet.

@Metallifax
Contributor

@BethanyG Sorry if I left ya hanging for a minute, I just wanted to research the subjects here and devote some time to the topic.

if/when either you or I have time, maybe we can take a look at some of the programs in the documentation generation space for Python.

Will this be a part of a future exercise, or is this just so we can home in on our style eventually? Or, more exciting, a documentation site for the repository to make use of our newfound documentation skills 🤔?

The intricacies of machine processing and auto-generation are fairly burdensome to someone who's learning to code in Python, and there is a high likelihood that the very next project or team they are on will require something different.

Then if you'd like my nooby advice, I'd go with the fewest steps possible for the student, which PDoc does well. I was able to generate some nice-looking HTML files with a simple one-liner inside just a regular old exercise that a student would receive via the Exercism CLI:

pdoc --html . --output-dir=./docs

The problem is that PDoc only works with Google and Numpy format, and while they support reST directives, they seem to have some issues displaying reST docstring format at the moment and have an outstanding issue from 2020 where they're implementing them.
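
For reference, a docstring in the Google format mentioned above would look roughly like the sketch below. The function is a hypothetical example, not one of the track's stubs, and nothing here has been run through PDoc to confirm how it renders.

```python
def remaining_bake_time(expected_minutes, elapsed_minutes):
    """Calculate the bake time remaining.

    Args:
        expected_minutes (int): Total minutes the dish should bake.
        elapsed_minutes (int): Minutes the dish has already baked.

    Returns:
        int: Remaining bake time in minutes.
    """
    return expected_minutes - elapsed_minutes
```

The types and descriptions carry the same information as the `:param <name>: <type> - <description>` lines; only the layout differs.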

In the meantime, I'll just keep trucking along as you say and keep an eye on the issue while we figure this out and follow your format suggestion in your first reply. I'll also keep an eye out for tooling as well and update the thread with some good contenders.
