-
-
Notifications
You must be signed in to change notification settings - Fork 402
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Consistent functions and method docstring style #2322
Comments
I support the goal of standardizing docstrings to make them consistent (and describing the arguments) though I will note:
I agree this is a discussion item we should return to after 1.10 is released. |
Yes, I'd suggest we do it incrementally starting with some of the core classes like Dimensioned, LabelledData, NdMapping, HoloMap, DynamicMap, Element and Dataset. |
Could we decide on a format now so I can make sure any new docstrings that get written conform to the standard? @jlstevens You say you don't like the verbosity of NumPy style docstrings, could you make a suggestion on how to shorten it, otherwise that is my strong vote. @jbednar Any opinions? |
Sorry, I wasn't aware of this issue when I made
holoviz-dev/nbsite#67. I'm mentioning that issue now
just to say that using an existing format supported by existing tooling
(e.g. numpydoc) will have benefits for things like api documentation
generation. I realize there are other concerns though.
…On Wed, Sep 12, 2018 at 5:26 PM Philipp Rudiger ***@***.***> wrote:
Could we decide on a format now so I can make sure any new docstrings that
get written conform to the standard? @jlstevens
<https://github.com/jlstevens> You say you don't like the verbosity of
NumPy style docstrings, could you make a suggestion on how to shorten it,
otherwise that is my strong vote.
@jbednar <https://github.com/jbednar> Any opinions?
—
You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub
<#2322 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AAAHiaNG36333KWozKL8VqOBgYFo52dLks5uaTWlgaJpZM4R_RCx>
.
|
Thanks @ceball, somehow missed your issue even though I was pinged in it. I strongly agree that supporting existing tooling is a strong argument to go with a standard format and I have a strong preference for numpydoc. |
How will we handle the fact that huge groups of all HoloViews Elements accept the same argument, without duplicating that info across every Element's docstring (and thus making it impossible to change the wording over time)? |
My primary concern is method and function level docstrings. Class-level docstrings need improvements too but not as urgently as methods and functions, which are currently, imo, unusable. The class level docstrings are already covered reasonably well because parameters have docstrings and they can also refer back to the docs for a baseclass, e.g. to cover the supported data formats and explain interfaces in detail. |
How will we handle the fact that huge groups of all HoloViews Elements
accept the same argument, without duplicating that info across every
Element's docstring (and thus making it impossible to change the wording
over time)?
Do inheritance for docstrings ;) If using a style like numpydoc, it should
be straightforward...
…On Wed, Sep 12, 2018 at 6:04 PM Philipp Rudiger ***@***.***> wrote:
My primary concern is method and function level docstrings. Class-level
docstrings need improvements too but not as urgently as methods and
functions, which are currently, imo, unusable. The class level docstrings
are already covered reasonably well because parameters have docstrings and
can refer back to the docs for a baseclass.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#2322 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AAAHiQVirg8Hm3wNdZd_-ybiCskW3a9Gks5uaT6EgaJpZM4R_RCx>
.
|
I'm probably not thinking very clearly about this specific situation, but what I'm concerned about is that inheritance is all or nothing. For Parameters that's handled by associating the docstring with each parameter, so that as parameters are gradually built up via inheritance, all of that can be provided for hv.help. But for positional arguments, e.g. of the constructor, we can't handle that by class-level docstring inheritance, because each class needs its own documentation, even though all the classes accept the same first positional argument, and large groups of them treat that data the same. Can that be handled by docstring inheritance of the constructors (not the class)? Yes, if we don't use those constructor docstrings already for subclass-specific stuff, in which case we can use numpydoc docstrings in a few key base class constructors? |
Not quite following, constructors (i.e. |
I was joking about parsing the numpy docstring for parameters, return
values, etc, and doing inheritance.
…On Wed, Sep 12, 2018 at 7:00 PM Philipp Rudiger ***@***.***> wrote:
Not quite following, constructors (i.e. __init__) generally don't have a
docstring, the class level docstring is usually used instead.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#2322 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AAAHiZ1kyGz213GJisQV6USWzD2tLr9Aks5uaUusgaJpZM4R_RCx>
.
|
That's my point. The constructor arguments are shared across many classes, but each class has its own class docstring, and so I don't see a way to inherit the documentation of positional parameters of constructors. Chris was joking about inheritance for docstrings, but I was just talking about inheritance of docstrings, which can work for methods but not classes, and if we don't generally document the constructor, then it won't work. So I'm still not sure what's proposed be solved by just using numpydoc. |
As I mentioned above I'm primarily concerned about method and function docstrings which are currently huge walls of text which make it incredibly difficult to figure out what each argument means. Class level docstrings also need improvement but unlike methods they generally have only one argument that needs explaining (i.e. |
I can't really read it on my phone, but seems like someone already had th e
idea: https://github.com/meowklaski/custom_inherit
Anyway I don't mean to distract. I'm with Philipp (i.e. seems like a good
idea to agree on numpy style and start using it).
…On Wed, Sep 12, 2018 at 7:32 PM Philipp Rudiger ***@***.***> wrote:
So I'm still not sure what's proposed be solved by just using numpydoc.
As I mentioned above I'm primarily concerned about method and function
docstrings which are huge walls of text which make it incredibly difficult
to figure out what each argument means. Class level docstrings also need
improvement but unlike methods they generally have only one argument that
needs explaining (i.e. data), because everything else is a parameter.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#2322 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AAAHiR-F5YzTT0e2aF-meeuXnnjNRyXEks5uaVNFgaJpZM4R_RCx>
.
|
Ah; sorry! I didn't read far enough back to see that indeed, this issue is just about improving existing walls of text. That said, I don't think we should make any decision about docstring formats without also addressing the issues about positional constructor arguments that users have raised (e.g. #2925 and #2947), because it would be very confusing to have multiple competing conventions for our docstrings, each addressing specific issues. So if we use numpydoc format to avoid wall-of-text function and method docstrings, then we should also presumably be using numpydoc format to document positional parameters for our constructors, right? And then I don't know how to avoid duplicating all that numpydoc-format documentation for the positional arguments across all our classes. |
Sure I'm not necessarily against that, it's just that in the majority of cases it'd be something like this: data: array or dataframe or dict
...
**params
Dictionary of parameters for the class where the parameters aren't documented a second time. |
I don't have much to say other than that the numpy style looks fairly readable to me, at least from a user perspective. From a developer perspective, I'll need to find a way to get emacs to fold the docstrings away: applying this format consistently will hugely increase the number of lines taken up by docstrings which will be distracting when trying to read and write code. |
Personally, I'm ok with docstrings dominating code files; that's what people should be reading first (what is this code meant to do?) anyway. :-)
This seems like it will cause untold confusion. For some reason, numpydoc calls function arguments "Parameters" and not arguments. Won't this make users think it's about our Param-based Parameters that are listed in
It doesn't seem to me that
Given that one of Jean-Luc's next tasks is to add an output decorator to Param, should we not see if that will also let us declare the return type of a method or function using Param? Putting that information in the docstring allows docs to be made but doesn't allow any other automated processing of it like would be needed for boxflow. |
I think it's not a numpy thing, but a python thing:
(https://docs.python.org/2.7/faq/programming.html#faq-argument-vs-parameter) I've noticed people in general gradually changing how they speak about stuff, but it's been happening slowly. Anyway, however it might have been in the past, only the situation now matters really. I.e. when we say things like "the parameters of a parameterized function", we are going to cause confusion :( |
Not just a Python thing: https://codeburst.io/parameters-arguments-in-javascript-eb1d8bd0ef04 So maybe the confusion is inescapable. |
My example was not meant to be exhaustive it was just meant to highlight that class level descriptions of the constructor would generally only describe the |
I see this as a very specific mechanism to declare pipelines and data flow not for general return type annotations and documentation. Param is too limited to describe the return types in enough detail and I definitely do not want to litter all our code with these annotations. |
I'm still not hearing any answer to the underlying problem: how will we document the data argument, given that it is shared across many classes, not identical for every class, and very complicated to describe? |
I don't mean that to get documentation we'd always have to be able to express the return types fully in Param. I mean that, (a), if return types are specified, then that information can be used (where present), (b) a similar mechanism can be used to make human-readable docstrings describing the return type, where the annotation is not present and/or not sufficient to describe it, and (c) it would be odd to have a completely separate, independent convention for documenting return types that is not aware of the annotations for return types. It seems just like a Parameter -- typically we try to add a very specific type programmatically, e.g. Similarly, for return values, one could add annotations when it's very simple what's returned ( I'm not arguing strongly that it has to be done this way, but I think it's something more in keeping with how Param works than adopting a convention with specially formatted big docstrings. And I'm not seeing how the code would get littered with the annotations any more than the code would get littered with the same information put in a docstring. |
Every class will have to describe exactly what it supports and refer back to the |
I just meant that everyone (including the official docs) in the python world used to talk only about "arguments", but that's changed. At least, that's how I remember it (I might be wrong). |
I agree with Jim on these two points:
This would bother me as well.
I also agree with this in principle though I don't yet see how this would work in practice. Specially formatted strings in code are a nightmare to manage. That said, I don't want to be arguing that we should have betters docs/docstrings. As a result, I'm not objecting to this proposal even though I am not particularly enthusiastic about it (as a developer, I would find that having docstrings dominate a file just makes it harder to work with the code). |
I also agree that we should call arguments |
Just my two cents as a relatively new user who's unfamiliar with the codebase: I've definitely suffered because the documentation is really tough to use. At several points I considered abandoning Holoviews completely, despite all the amazing functionality it offers, just because of this one weakness. Also I just wanted to throw Google style docstrings out there as an alternative to NumPy style. These can be parsed by Sphinx with an extension. |
I really sympathize, and kudos for sticking with it. This needs to be the main priority as soon as we push 1.11.0 out the door and if possible we should at least tackle some of the core classes and methods before then. |
Dynamically generated docstrings sound good in theory but I see no reasonable way to make them interleave sensible with actual text written by a developer and to ensure that they are presented in a readable format. I'd prefer to refer the user to some more exhaustive reference than automatically interpolate some potentially complex (list of) type(s) that may have been declared as part of the output. The output should be readable and so I'd much prefer on coming up with conventions to refer users to actual docs when discussing accepted I also do not want to annotate return types using some special param only syntax and litter all or even a significant fraction of our code with it and if we only do it in some cases it is as good as useless for consistent documentation. We would then also have to have a separate decorator to annotate the input types. Basically I'm pretty strongly convinced that annotating input and return types with param is a horrible way to auto-generate docs. It sounds like you're envisioning as the param decorators as some general type system that we will apply to all our methods and functions, which I also think is a really bad idea. We're not going to put param decorators around every public function and method in holoviews, which is what would be needed to generate a consistent set of documentation. My understanding was that param output declaration would be very useful for declaring pipelines on things like operations but not that they would become our own hand-rolled typing system, which we're going to use throughout the codebase. So I would strongly argue we adopt either a NumPy or Google docstring style very soon, rather than waiting on param decorators, which imo are not well suited to the task anyway. |
Here you're talking about methods? Seems like there is a continuum where the tradeoffs differ. If you have an arbitrary method with several complex positional arguments, there's no point to having some separate way to annotate the return type, because you're going to have a long docstring already and it's annoying to go to some special place and syntax to add one more bit of information. Plus such a method can never be automated in any meaningful way; it's for a human to call. But the other end of the extreme is what I would have thought is a common case, with a method that only at most accepts parameters, with no positional or special keyword arguments. Such a method is worth annotating the return type with Param, because that one last bit of information will let it be handled automatically by things like boxflow -- on top of the parameter declarations, the output type is just one tiny bit more information needed, at which point it can be used to chain pipelines while doing full validation of outputs to inputs. So for So you and I are probably thinking of different scenarios when you're imagining using these tools; I'm not imagining hand-rolling a complete typing system to cover arbitrary methods, because there would be no use for such a thing, but I do expect that simple parameterized pipeline stages can be captured well and have important usages. For HoloViews you are the one who has the set of methods in mind and knows whether most methods that need documentation fall in one or the other of those categories (i.e. lots of positional arguments vs none or only a standard widely shared positional argument), so it's your call to make. |
Okay, I'm glad we were mostly talking past each other here then. Yes, I'm only talking about methods and functions here, I was hoping that was made clear by the opening paragraph of this issue.
This really only applies to operations, of which there are only a handful and where I'd be very happy to do this. What I've been talking about in this issue is every single other public method and function in holoviews, of which there are 100x more than there are operations. |
The point I do agree with is that the constructors for all the core datastructures will need to be given some thought, but that should be fairly straightforward as in most cases the positional arguments are simply: |
Sounds good.
And yet operations are mainly what I use, perhaps because the rest aren't well documented. :-) |
Since we've not made a decision and started the conversion I don't think we need this issue anymore. |
HoloViews is built on param which has the great advantage of providing a neat way of documenting the parameters of classes. While this is a great way to structure class level parameters, it does not provide anything that documents methods and functions. The current state of the docstrings on methods and functions in HoloViews is therefore what I'd call highly inconsistent and therefore quite frustrating for users.
While a large paragraph explaining what the method does is often sensible, docstrings are far more commonly used to quickly figure out what that one particular argument was called, what options it supports and what type it expects. I therefore think there is a pretty desperate need for us to standardize on some standard docstring format at least for functions and methods.
I'd be strongly in favor of moving to simple NumPy-style docstrings with sections for Parameters and Return values:
Readability of this is infinitely better than having to scan a large paragraph for the argument that I care about. After 1.10 is released I'd like to start a systematic review of all docstrings (many of which are hopelessly outdated) and upgrade them to a consistent style. I have no strong preference for any specific format but think that explicitly listing the arguments and return values will result in a much improved user experience and help us consider and address current API inconsistencies.
The text was updated successfully, but these errors were encountered: