Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Option to sort re-exports in __all__ (sort_exports from isort) #1198

Closed
Jackenmen opened this issue Dec 11, 2022 · 22 comments
Closed

Option to sort re-exports in __all__ (sort_exports from isort) #1198

Jackenmen opened this issue Dec 11, 2022 · 22 comments
Assignees
Labels
rule Implementing or modifying a lint rule

Comments

@Jackenmen
Copy link
Contributor

Isort allows sorting __all__ when sort_reexports is used:
https://github.com/PyCQA/isort/blob/f547290178f86b6e5f0007cf1cb31d4736dcdaee/isort/core.py#L235-L241

@charliermarsh charliermarsh added enhancement isort Related to import sorting labels Dec 11, 2022
@charliermarsh
Copy link
Member

Is this undocumented behavior?

@Jackenmen
Copy link
Contributor Author

isort has not had a release for over a year so it's one of the features that never ended up getting released. Actually, that reminds me of another (more important) feature that also has not been released yet, and seems like ruff isn't handling it properly either, will make an issue in a moment.

@Jackenmen
Copy link
Contributor Author

Note that there's another functionality that is available in a released version of isort - sorting literals that are prepended with # isort: list or # isort: tuple:
PyCQA/isort#1358
sort_reexports option is actually using the same code for sorting.

One thing that both of these don't support is __all__ with multiple sections in it that should be sorted separately, e.g. something like what typing's code does (well, sort of, it doesn't seem like they're that good at keeping it alphabetical manually 😄):
https://github.com/python/cpython/blob/606adb4b891b52c8b9a53d29d594e996f117c0b3/Lib/typing.py#L42-L152
I'm actually not sure if isort's literal sorting even supports other profiles such as Black's, it seems to just sort it all in a pre-defined way.

@charliermarsh
Copy link
Member

Oh yeah, this was brought to my attention on Twitter. I think we could support this! There are some other things I need to get done before I spend time on it though.

@Jackenmen
Copy link
Contributor Author

Jackenmen commented Apr 15, 2023

I have a suggestion to extend this feature request a bit and allow sorting by the definition order of names (functions/classes/variables) that are put in __all__.

So for example:

from .submodule import reexported_name

__all__ = (
    "reexported_name",
    "function_b",
    "X",
    "function_a",
    "function_c",
    "Y",
)


def function_b():
    ...


class X:
   ...


def function_a():
    ...


def function_c():
    ...


class Y:
   ...

The reason for such a suggestion is that I think it makes it easier to find out what names of public APIs are missing in __all__ and what names shouldn't be in it (especially the latter one) since it allows one to just go through the file and look through the names in __all__ in the same order.

@kdeldycke
Copy link

For reference, there's another tool available to sort __all__: https://github.com/cpendery/asort

I guess it could be used to get rid of isort while we wait to have this feature implemented in ruff.

@kdeldycke
Copy link

For reference, there's another tool available to sort __all__: https://github.com/cpendery/asort

Hmm, forget about asort, I just stumble upon an issue: asort breaks docstring indention (asort#8)

That being said, regarding the sorting order, I also proposed to add support for another one that would be a case-insensitive lexicographical sorting. Not sure how popular or common this is.

@charliermarsh
Copy link
Member

@cpendery - Yeah I'm open to including this. We'll need to decide how it should be categorized (i.e., does it go in the isort rule set? Or the Ruff-specific rules?) but it seems like a reasonable thing to include.

@cpendery
Copy link

@cpendery - Yeah I'm open to including this. We'll need to decide how it should be categorized (i.e., does it go in the isort rule set? Or the Ruff-specific rules?) but it seems like a reasonable thing to include.

@charliermarsh I think it probably should be categorized as Ruff specific rule since it was never an officially released isort feature, especially if we extend it with the additional functionality of multiple sorting options

@kkirsche
Copy link

I have a suggestion to extend this feature request a bit and allow sorting by the definition order of names (functions/classes/variables) that are put in __all__.

This existed with isort's # isort: unique-list code styling comment as described in #7929.

@harshil21
Copy link

We've tried using isort's sort_exports in our project and it has several problems (from python-telegram-bot/python-telegram-bot#4052 (comment)):

We found a new package https://pypi.org/project/sort-all/ which fixes these issues, however that package has one drawback of removing the comments in the __all__.

@Bibo-Joshi
Copy link

Adding to @harshil21 s comment, I'd like to mention that we're also interested in sorting __slots__ in addition to __all__.

@AlexWaygood
Copy link
Member

I'm working on a PR to implement this feature.

@AlexWaygood AlexWaygood self-assigned this Jan 9, 2024
@AlexWaygood
Copy link
Member

AlexWaygood commented Jan 10, 2024

One thing that both of these don't support is __all__ with multiple sections in it that should be sorted separately, e.g. something like what typing's code does (well, sort of, it doesn't seem like they're that good at keeping it alphabetical manually 😄):
https://github.com/python/cpython/blob/606adb4b891b52c8b9a53d29d594e996f117c0b3/Lib/typing.py#L42-L152

So, a question on this: what should ruff do if it encounters something like this?

__all__ = [
    "d",
    "c",
    # a comment introducing a new section for 'b' and 'a':
    "b",
    "a",
]

Should it:

  1. Emit the violation on this snippet (since __all__ is not sorted), but refuse to autofix it, since the comment on its own line introduces an ambiguity regarding the user's intent

  2. Sort __all__ like this, assuming that the user wanted to use the comment to create a new section:

    __all__ = [
        "c",
        "d",
        # a comment introducing a new section for 'b' and 'a':
        "a",
        "b",
    ]
  3. Same as (2), but only if the user enabled a specific configuration flag on the command line or in their config file to enable "sorting per-section"?

I think the thing we probably shouldn't do in this kind of situation is this:

__all__ = [
    "a",
    "b",
    # a comment introducing a new section for 'b' and 'a':
    "c",
    "d",
]

@Jackenmen
Copy link
Contributor Author

For alphabetical definitely option 2. For sort by definition order I'm not really sure what would work best, I guess anything is fine though I would definitely prefer if whatever it does didn't require a manual fix.

@MichaReiser
Copy link
Member

This is an interesting question. I would expand the example by one that uses a non-section defining comment:

__all__ = [
	"c",
	# TODO: Remove once #0000 is complete
	"a"
]

In this case, I would expect the comment to move with a.

From what I understand is that isort requires the use of isort suppression comments to suppress import sorting. What if we would do the same for __all__?

__all__ = [
    "c",
    "d",
    # a comment introducing a new section for 'b' and 'a':
    "a",  #  noqa: RUFXXX Don't move 
    "b",
]

or

__all__ = [
    "c",
    "d",
    # a comment introducing a new section for 'b' and 'a':  #  noqa: RUFXXX Don't move
    "a",  
    "b",
]

@AlexWaygood
Copy link
Member

This is an interesting question. I would expand the example by one that uses a non-section defining comment [...] In this case, I would expect the comment to move with a.

Yes, that's exactly the kind of thing I was worrying about w.r.t. Option (2) -- thanks!

From what I understand is that isort requires the use of isort suppression comments to suppress import sorting. What if we would do the same for __all__?

That's an interesting idea! Let's call that Option (4). I worry with Option (4) that the precise semantics of the noqa directive might be hard for users to get their head around, though. You'd be applying a noqa directive to a specific line in __all__, but the addition of the noqa directive could have an impact on ruff's treatment of all items in __all__.

Here's three possible things the noqa directive could mean if it were applied to a comment at "index 3" in a multiline __all__ (I don't find (2) very plausible, but it would at least be simpler to implement than (3) 😅)

  1. The comment should remain at "index 3", but all items in __all__ are free to "move around" it.
  2. The comment should remain at "index 3". All items above it should be sorted, but all items below it should be untouched.
  3. The comment should remain at "index 3", and should be treated as a section delimiter. All items in the range of (previous_delimiting_comment_index + 1)..3 should be sorted within their section, and all items in the range of 4..index_of_previous_delimiting_comment should be sorted within their section.

I guess I'm leaning towards Option (3) in my original selection of options:

  • By default, comments move with the items below them with multiline __all__ definitions.
  • But, if a specific configuration flag is enabled, the behaviour changes so that ruff treats all comments on their own line as being "section delimiters" -- it will sort imports according to each section it spots, but won't disturb the sections.

If neither of the above is what the user wants, they can just # noqa the whole definition of __all__ (or just not opt into the rule at all!), so Option (1) in my original selection isn't very appealing here.

@AlexWaygood
Copy link
Member

For sort by definition order I'm not really sure what would work best

FWIW, my initial implementation of this rule is just going to do "sort by alphabetical order", so that I can get a MVP out there :)

We can possibly implement "sort by definition order" as a followup.

@MichaReiser
Copy link
Member

MichaReiser commented Jan 10, 2024

That's an interesting idea! Let's call that Option (4). I worry with Option (4) that the precise semantics of the noqa directive might be hard for users to get their head around, though. You'd be applying a noqa directive to a specific line in all, but the addition of the noqa directive could have an impact on ruff's treatment of all items in all.

That's fair. Do you know how common the use of sections is? If they're uncommon, then disabling sorting for the entire __all__ might be a reasonable option.

Is there some sort of a community standard when it comes to formatting section comments, that we could use to identify them? If not, could we allow users to configure a regex to identify the section comments?

I guess I'm leaning towards Option (3) in my #1198 (comment):

I fear that a project-level configuration might not be granular enough, but maybe it is. I really don't know.

FWIW, my initial implementation of this rule is just going to do "sort by alphabetical order", so that I can get a MVP out there :)

That's an excellent plan. Let's get sorting working and figure out how to disable sorting at a later stage :)

@AlexWaygood
Copy link
Member

Do you know how common the use of sections is? If they're uncommon, then disabling sorting for the entire __all__ might be a reasonable option.

I think they're reasonably common, but I don't really know -- I might just think this because I spend a fair amount of time staring at typing.py over at CPython, and, as pointed out earlier, we use them there :)

Is there some sort of a community standard when it comes to formatting section comments, that we could use to identify them?

I don't think so, not really :( Historically, there hasn't been a well-established tool for people to check the ordering of their __all__ definitions with, really, so I don't think it's really mattered in the past.

I fear that a project-level configuration might not be granular enough, but maybe it is. I really don't know.

Here's an idea: as well as simple noqa comments to switch isort on and off, isort also supports slightly more sophisticated "action comments": https://docs.astral.sh/ruff/linter/#action-comments. What if we had a version of that for this rule, as well as a global config setting? So, if ruff saw this:

__all__ = [  # ruff: sort-by-section
    ...
]

...then that would override the global config setting, and ruff would treat all comments taking up their own line in that __all__ definition as being section delimiters.

@MichaReiser
Copy link
Member

Here's an idea: as well as simple noqa comments to switch isort on and off, isort also supports slightly more sophisticated "action comments": docs.astral.sh/ruff/linter/#action-comments. What if we had a version of that for this rule, as well as a global config setting? So, if ruff saw this

I'm a bit hesitant of adding more "magic comments" because they're hard to remember and not easily understood. You can't click on them and get an explanation of what they do; you need to know in which places they're allowed, and they don't work well with auto formatters (formatter needs to understand them as well). That's why I would favor a solution that doesn't require a new pragma comment.

AlexWaygood added a commit that referenced this issue Jan 16, 2024
## Summary

This implements the rule proposed in #1198 (though it doesn't close the
issue, as there are some open questions about configuration that might
merit some further discussion).

## Test Plan

`cargo test` / `cargo insta review`. I also ran this PR branch on the CPython
codebase with `--fix --select=RUF022 --preview `, and the results looked
pretty good to me.

---------

Co-authored-by: Micha Reiser <micha@reiser.io>
Co-authored-by: Andrew Gallant <andrew@astral.sh>
@AlexWaygood AlexWaygood added rule Implementing or modifying a lint rule and removed isort Related to import sorting labels Jan 16, 2024
@AlexWaygood
Copy link
Member

Okay, #9474 has now been merged! The next release of Ruff should include a rule and autofix to keep your __all__ definitions sorted. The sort order you get is an "isort-style" sort:

  • SCREAMING_SNAKE_CASE members come first
  • After that come CamelCase members
  • Anything else comes last
  • Within each casing category, members are sorted according to a natural sort.

Currently the rule is not configurable. We might consider adding some configuration options in the future, but for now we've decided to wait to see how much demand there is for that in the community.

The rule has been added as RUF022 rather than as an isort rule, since it seems like it doesn't really have much to do with imports. While isort does have this functionality, it's never been documented.

Enjoy tidying up your __all__ definitions! 😄

AlexWaygood added a commit that referenced this issue Jan 22, 2024
## Summary

This PR introduces a new rule to sort `__slots__` and `__match_args__`
according to a [natural sort](https://en.wikipedia.org/wiki/Natural_sort_order), as was
requested in #1198 (comment).

The implementation here generalises some of the machinery introduced in
3aae16f
so that different kinds of sorts can be applied to lists of string
literals. (We use an "isort-style" sort for `__all__`, but that isn't
really appropriate for `__slots__` and `__match_args__`, where nearly
all items will be snake_case.) Several sections of code have been moved
from `sort_dunder_all.rs` to a new module, `sorting_helpers.rs`, which
`sort_dunder_all.rs` and `sort_dunder_slots.rs` both make use of.

`__match_args__` is very similar to `__all__`, in that it can only be a
tuple or a list. `__slots__` differs from the other two, however, in
that it can be any iterable of strings. If slots is a dictionary, the
values are used by the builtin `help()` function as per-attribute
docstrings that show up in the output of `help()`. (There's no
particular use-case for making `__slots__` a set, but it's perfectly
legal at runtime, so there's no reason for us not to handle it in this
rule.)

Note that we don't do an autofix for multiline `__slots__` if `__slots__` is a dictionary: that's out of scope. Everything else, we can nearly always fix, however.

## Test Plan

`cargo test` / `cargo insta review`.

I also ran this rule on CPython, and the diff looked pretty good

---

Co-authored-by: Micha Reiser <micha@reiser.io>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
rule Implementing or modifying a lint rule
Projects
None yet
Development

No branches or pull requests

9 participants