Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Google docstring formatting for multi-line class attributes not recognized/converted properly for use in intellisense popup #3347

Closed
macintacos opened this issue Sep 15, 2022 · 12 comments
Assignees
Labels
bug Something isn't working docstrings fixed in next version (main) A fix has been implemented and will appear in an upcoming version P2

Comments

@macintacos
Copy link

macintacos commented Sep 15, 2022

Type: Bug

Behaviour

Expected vs. Actual

I apologise if this should be opened in a different repository. I am running this code with a Python 3.10.7 interpreter.

Take the following code:

from dataclasses import dataclass

@dataclass
class Foo:
    """Does the thing.

    Attributes:
        bar (int): A short description
        buzz (str): A longer description that needs to be broken up over multiple lines,
            because there is additional context that I want to provide, and I don't
            want to go past the 88 line length limit for whatever reason.
    """
    bar: int
    buzz: str

Note: the above docstring split over multiple lines is valid as far as Google's own style-guide is concerned. The docs for the class Attributes section points to the function Args section and says that it follows the same ruleset. Quoting below (emphasis mine):

List each parameter by name. A description should follow the name, and be separated by a colon followed by either a space or newline. If the description is too long to fit on a single 80-character line, use a hanging indent of 2 or 4 spaces more than the parameter name (be consistent with the rest of the docstrings in the file). The description should include required type(s) if the code does not contain a corresponding type annotation. If a function accepts *foo (variable length argument lists) and/or **bar (arbitrary keyword arguments), they should be listed as *foo and **bar.

With that in mind, when I go to type out a variable that initializes an instance of that class, I see that the bar variable definition is "hoisted" to the top of the intellisense pop-up, as expected. However, the multi-line definition for the second parameter has an awkward line break:

CleanShot 2022-09-15 at 16 37 30@2x

I could probably get over the mangling of the multi-line string in the pop-up, but the behavior is a bit more disappointing when you start typing the variable that has that multi-line docstring:

CleanShot 2022-09-15 at 16 38 42@2x

In the above image, we can see that the pop-up does not show the full docstring for the parameter in question.

Ideally what would happen would be that:

  • if you're writing out a docstring for a parameter and you've purposefully added a newline, VSCode would recognize it as a continuation of the previous line as long as the current line is indented in some uniform way (2/4 spaces).
  • If you are using intellisense and filling out a parameter that has this newline added (at the correct indentation level), you should be able to see the full documentation for that parameter (again, as long as the further lines are indented from the original parameter's "level").
  • Additionally, it would be great if one could create multiple paragraphs for a single parameter, so that documentation for specific parameters could be a bit more rich. This could be recognized if there are multiple line breaks, but the new paragraph would still need to be at the same indentation level to be considered part of the parameter it is documenting (meaning, 2/4 spaces indented)

Steps to reproduce:

  1. Create the class as shown above, with pydocstyle turned on with --convention=google passed.
  2. Review the Intellisense popup as you autocomplete the class initialization.

Diagnostic data

  • Python version (& distribution if applicable, e.g. Anaconda): 3.10.7
  • Type of virtual environment used (e.g. conda, venv, virtualenv, etc.): Venv
  • Value of the python.languageServer setting: Pylance
Output for Python in the Output panel (ViewOutput, change the drop-down the upper-right of the Output panel to Python)

XXX

User Settings


languageServer: "Pylance"

linting
• pydocstyleArgs: "--convention=google"
• pydocstyleEnabled: true

formatting
• provider: "black"

Extension version: 2022.14.0
VS Code version: Code 1.71.1 (e7f30e38c5a4efafeec8ad52861eb772a9ee4dfb, 2022-09-08T19:41:58.611Z)
OS version: Darwin arm64 21.6.0
Modes: Unsupported
Sandboxed: No

System Info
Item Value
CPUs Apple M1 Pro (10 x 24)
GPU Status 2d_canvas: enabled
canvas_oop_rasterization: disabled_off
direct_rendering_display_compositor: disabled_off_ok
gpu_compositing: enabled
metal: disabled_off
multiple_raster_threads: enabled_on
opengl: enabled_on
rasterization: enabled
raw_draw: disabled_off_ok
skia_renderer: enabled_on
video_decode: enabled
video_encode: enabled
vulkan: disabled_off
webgl: enabled
webgl2: enabled
webgpu: disabled_off
Load (avg) 5, 8, 10
Memory (System) 32.00GB (0.04GB free)
Process Argv --crash-reporter-id 07e484e5-ada3-4296-adea-39d6c26a0c7c --crash-reporter-id 07e484e5-ada3-4296-adea-39d6c26a0c7c
Screen Reader yes
VM 0%
@karthiknadig karthiknadig transferred this issue from microsoft/vscode-python Sep 15, 2022
@macintacos macintacos changed the title Google formatting for multi-line attributes not recognized/converted properly for use in intellisense popup Google docstring formatting for multi-line class attributes not recognized/converted properly for use in intellisense popup Sep 16, 2022
@judej judej added the needs investigation Could be an issue - needs investigation label Sep 19, 2022
@r-moret
Copy link

r-moret commented Dec 5, 2022

Any update on this? I'm having the same problem

@judej judej added bug Something isn't working P2 and removed needs investigation Could be an issue - needs investigation labels Apr 11, 2023
@pdc1
Copy link

pdc1 commented Jul 9, 2023

As an aside, if you put \ at the end of the lines that are continued, you get the expected results. This does not seem appropriate, but it might give some insight into this bug.

For example:

        buzz (str): A longer description that needs to be broken up over multiple lines, \
            because there is additional context that I want to provide, and I don't \
            want to go past the 88 line length limit for whatever reason.

@simlal
Copy link

simlal commented Aug 4, 2023

I've had the same issue for method parameters, only on the parameter intellisense display not the method itself (there the multiline params display even though the formatting is not perfect).

As an aside, if you put \ at the end of the lines that are continued, you get the expected results. This does not seem appropriate, but it might give some insight into this bug.

For example:

        buzz (str): A longer description that needs to be broken up over multiple lines, \
            because there is additional context that I want to provide, and I don't \
            want to go past the 88 line length limit for whatever reason.

I agree it does fix the problem but it's not part of the official Sphinx docs so yeah might not be appropriate agreed.

@bschnurr bschnurr removed their assignment Jan 4, 2024
@rchiodo rchiodo self-assigned this May 24, 2024
@rchiodo rchiodo added the fixed in next version (main) A fix has been implemented and will appear in an upcoming version label Jun 12, 2024
@rchiodo
Copy link
Contributor

rchiodo commented Jun 14, 2024

This issue has been fixed in prerelease version 2024.6.100, which we've just released. You can find the changelog here: CHANGELOG.md

@rchiodo rchiodo closed this as completed Jun 14, 2024
@rchiodo
Copy link
Contributor

rchiodo commented Jun 14, 2024

The fix for this issue is behind an experimental feature flag:

"python.analysis.supportRestructuredText": true

It looks like we missed something though. The parameter raising only happens if the parameters are in a Parameters section and not an Attributes section. We've logged a separate bug to address that problem, but otherwise parameters should work across multiple lines.

@macintacos
Copy link
Author

@rchiodo Why is did we choose "Parameters" as a section? That's not even in the documentation for Google's docstring format from what I can tell 🤔

I'll give it a test today to see if it handles things the way I expect though.

@rchiodo
Copy link
Contributor

rchiodo commented Jun 14, 2024

We chose Parameters because that's a common section header in a lot of docstrings. Like Numpy docstrings. The logic for the parameter hoisting was looking for 'parameters' to hoist and that's how it finds them. Args (like google docs) would have worked too. Attributes doesn't.

The example you posted is actually the docstring we apply to an __init__ function if there's no docstring on the __init__, so it needs to special case other sections too.

@rchiodo
Copy link
Contributor

rchiodo commented Jun 14, 2024

Here's it also working with Args as the section header:

image

@rchiodo
Copy link
Contributor

rchiodo commented Jun 14, 2024

I'd like to also point out that most of the work for this was done using a parser for restructured text. Although close in many regards, restructured text is not Google doc string format. We're using this parser here:

https://stsewd.dev/tree-sitter-rst/

If you paste your docstring into that online parser, you'll see the AST we generate. We're trying to convert google docstrings into restructured text before parsing, but there's likely things that don't work well (and it's not always easy to tell google docstrings from just personal preferences for how to write docstrings).

This is an example:

Returns:
        A dict mapping keys to the corresponding table row data
        fetched. Each row is represented as a tuple of strings. For
        example:

        {b'Serak': ('Rigel VII', 'Preparer'),
         b'Zim': ('Irk', 'Invader'),
         b'Lrrr': ('Omicron Persei 8', 'Emperor')}

        Returned keys are always bytes.  If a key from the keys argument is
        missing from the dictionary, then that row was not found in the
        table (and require_all_keys must have been False).

That generates the following AST:

image

That AST gives us the following markdown:

image

Which is obviously off.

However if we modify the docstring to be what I think it would be intended in RST:

Returns
--------

A dict mapping keys to the corresponding table row data
fetched. Each row is represented as a tuple of strings. For
example::

            {b'Serak': ('Rigel VII', 'Preparer'),
             b'Zim': ('Irk', 'Invader'),
             b'Lrrr': ('Omicron Persei 8', 'Emperor')}

Returned keys are always bytes.  If a key from the keys argument is
missing from the dictionary, then that row was not found in the
table (and require_all_keys must have been False).

We generate correct markdown:

image

Given the number of ways people write docstrings, RST seemed like a more flexible endpoint, so we chose it as the basis for our parsing. We're hoping we can make it work for most other formats too, but as you can see it's not there yet.

@macintacos
Copy link
Author

I appreciate the context @rchiodo, much appreciated!

We chose Parameters because that's a common section header in a lot of docstrings. Like Numpy docstrings. The logic for the parameter hoisting was looking for 'parameters' to hoist and that's how it finds them. Args (like google docs) would have worked too. Attributes doesn't.

Does numpy say that it follows Googles docstring format? Or is it doing its own thing?

The documentation that Google provides does not say that "Args" is a valid section for a class docstring, it's intended to just be used in method/function docstrings. "Attributes" is called out as what is used for class properties/attributes in class docstrings. "Parameters" is not mentioned as a valid section. I just want to make sure that this handles Google's docstring format, which is what I opened this issue about (not Numpy's format).

I totally get that there might be some implementation issues related to dealing with Google docstrings, but unless the changes that have been made properly handle them, I don't think this issue should be closed.

That being said, I will wait for that bug to be fixed and this feature is no longer "experimental" to see if all of my concerns are addressed. Thank you for taking the time to make this better!

@rchiodo
Copy link
Contributor

rchiodo commented Jun 14, 2024

Does numpy say that it follows Googles docstring format? Or is it doing its own thing?

Numpy follows sphinx formatting which uses restructured text.

Edit: It seems Sphinx supports restructured text and google. Their default is restructured but their docs also mention support for google too. It would be interesting to see how they detect the difference, although I suppose we could just offer a mode or something.

@rchiodo
Copy link
Contributor

rchiodo commented Jun 14, 2024

Here's the issue about attributes hoisting:
#6012

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working docstrings fixed in next version (main) A fix has been implemented and will appear in an upcoming version P2
Projects
None yet
Development

No branches or pull requests

9 participants