Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Docstring formatting does not always respect line length in dynamic mode #9126

Closed
stinodego opened this issue Dec 14, 2023 · 3 comments · Fixed by #9129
Closed

Docstring formatting does not always respect line length in dynamic mode #9126

stinodego opened this issue Dec 14, 2023 · 3 comments · Fixed by #9129
Assignees
Labels
bug Something isn't working docstring Related to docstring linting or formatting formatter Related to the formatter

Comments

@stinodego
Copy link
Contributor

First off, thanks so much for the great work on the docstring formatter! I'm very much looking forward to using it for the Polars code base.

When running the new docstring formatter on our code base, I found a number of examples where dynamic line length mode did not produce the expected result. See the three examples below:

def example1():
    """
    Docstring example containing a class.

    Examples
    --------
    >>> @pl.api.register_dataframe_namespace("split")
    ... class SplitFrame:
    ...     def __init__(self, df: pl.DataFrame):
    ...         self._df = df
    ...
    ...     def by_first_letter_of_column_values(self, col: str) -> list[pl.DataFrame]:
    ...         return [
    ...             self._df.filter(pl.col(col).str.starts_with(c))
    ...             for c in sorted(
    ...                 set(df.select(pl.col(col).str.slice(0, 1)).to_series())
    ...             )
    ...         ]
    """


class Nested:
    def example2():
        """
        Regular docstring of class method.

        Examples
        --------
        >>> df = pl.DataFrame(
        ...     {"foo": [1, 2, 3], "bar": [6, 7, 8], "ham": ["a", "b", "c"]}
        ... )
        """


def example3():
    """
    Pragma comment.

    Examples
    --------
    >>> af1, af2, af3 = pl.align_frames(
    ...     df1, df2, df3, on="dt"
    ... )  # doctest: +IGNORE_RESULT
    """

In ruff 0.1.8, running the ruff docstring formatter on these with line-length = 88 will produce line lengths over 88.

I think examples 1 and 2 are clear violations. Example 3 is debatable whether the result is desired or not.

@MichaReiser MichaReiser added docstring Related to docstring linting or formatting formatter Related to the formatter labels Dec 14, 2023
@MichaReiser
Copy link
Member

MichaReiser commented Dec 14, 2023

Thanks for the feedback!

I wonder if this is happening because we only subtract $indent.level * indent.width$ but not necessarily the space required by >>>?

@BurntSushi would you mind taking a look

@MichaReiser MichaReiser added the bug Something isn't working label Dec 14, 2023
@MichaReiser MichaReiser added this to the Formatter: Stable milestone Dec 14, 2023
@stinodego
Copy link
Contributor Author

I wonder if this is happening because we only subtract but not necessarily the space required by >>> ?

That might be it actually. I figured it might be because of edge cases with classes or something, but indeed a simple example like this also exceeds the line length, but only by 4 characters:

def example3():
    """
    ...

    Examples
    --------
    >>> af1, af2, af3 = pl.align_frames(
    ...     df1, df2, df3, on="dt", foo="looooooooooong_string"
    ... )
    """

BurntSushi added a commit that referenced this issue Dec 14, 2023
This fixes a bug where the current indent level was not calculated
correctly for doctests. Namely, it didn't account for the extra indent
level (in terms of ASCII spaces) used by by the PS1 (`>>> `) and PS2
(`... `) prompts. As a result, lines could extend up to 4 spaces beyond
the configured line length limit.

We fix that by passing the `CodeExampleKind` to the `format` routine
instead of just the code itself. In this way, `format` can query whether
there will be any extra indent added _after_ formatting the code and
take that into account for its line length setting.

Fixes #9126
@BurntSushi
Copy link
Member

Nice find! I put up a fix in #9129.

@MichaReiser You were exactly correct! At least for this bug, it was indeed that the line length setting on the nested call to the formatter wasn't taking the extra spaces from the doctest prompt into account.

BurntSushi added a commit that referenced this issue Dec 14, 2023
This fixes a bug where the current indent level was not calculated
correctly for doctests. Namely, it didn't account for the extra indent
level (in terms of ASCII spaces) used by by the PS1 (`>>> `) and PS2
(`... `) prompts. As a result, lines could extend up to 4 spaces beyond
the configured line length limit.

We fix that by passing the `CodeExampleKind` to the `format` routine
instead of just the code itself. In this way, `format` can query whether
there will be any extra indent added _after_ formatting the code and
take that into account for its line length setting.

We add a few regression tests, taken directly from @stinodego's
examples.

Fixes #9126
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working docstring Related to docstring linting or formatting formatter Related to the formatter
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants