Navigation Menu

Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Namespaced packages in a repo with a shared .isort.cfg are misidentified as third-party #1704

Open
stpierre opened this issue Apr 13, 2021 · 12 comments
Labels
question Further information is requested

Comments

@stpierre
Copy link
Contributor

Consider a git repo that contains two namespaced packages, my.first.library and my.second.library, with a single shared .isort.cfg:

root
  .git/
  .isort.cfg
  my_first_library/
    my/
      first/
        library/
          __init__.py
  my_second_library/
    my/
      second/
        library/
          __init__.py

The fact of containing two libraries is actually incidental; AFAICT the issue is the extra directory in there. But probably the most common use case for this would be a monorepo containing multiple libraries.

In this case, running isort . inside my_first_library/ will erroneously identify imports from my.first.library as third-party. The same is true in my_second_library/ with imports from my.second.library. Copying .isort.cfg into my_first_library/ is sufficient to solve this, but in our case we have a monorepo with dozens and dozens of python libraries and would rather not duplicate our .isort.cfg if we don't have to. A symlink also works, but we're hoping for a built-in solution.

I know this is a serious edge case, but is there any way to give isort a hint about where the root directory of the package is to help it identify first-party packages?

Note that this works fine for non-namespaced packages, so it has the feel of a bug, but it's also probably low priority.

@timothycrosley
Copy link
Member

timothycrosley commented Apr 17, 2021

@stpierre the most direct fix for this would be to tell isort inside that one config that both of these folders are root src directories: see the option here: https://pycqa.github.io/isort/docs/configuration/options/#src-paths

inside the config this would look something like:

[settings]
src_paths = my_first_library,my_second_library

Is this an acceptable solution for your case, or do you need an approach that would dynamically add to that list as new sub folders get added?

@timothycrosley timothycrosley added the question Further information is requested label Apr 17, 2021
@stpierre
Copy link
Contributor Author

We'd want something dynamic -- we're frequently creating or retiring libraries, and adding them to .isort.cfg every time isn't awesome.

For now we're just symlinking .isort.cfg into the directories containing namespaced packages so we have a workaround. Just wish we didn't need one. :)

@timothycrosley
Copy link
Member

As a more permanent solution, would support for glob expansion in source paths be a good solution for your code layout?

[settings]
src_paths = *

@stpierre
Copy link
Contributor Author

I think that would work, but it would have another (possibly unintended) side-effect: everything in the repo would be considered first-party. As a workaround I think it's probably inferior in most cases to creating a symlink to .isort.cfg, but I imagine that might be a feature some people would appreciate. By my way of thinking, within a library only that library should be considered "first party," but I could see an argument made that everything in a particular repo should be "first party."

@timothycrosley
Copy link
Member

I see your point! I don't think there is a single correct answer, the way I generally see it is that in the common mono repo case, usually independent to the number of nested packages, they all belong to you so they can be seen as first party - as that nicely groups them apart from packages taken from the internet. I suppose, for some, an even better solution would be an additional grouping:

  • Stdlib
  • thirdparty
  • company / repository
  • firstparty

That isn't yet supported cleanly, but both uses cases are things I can put more thought into and accept additional feedback on.

@stpierre
Copy link
Contributor Author

Yeah, that would be a good solution for us.

@con-cso
Copy link

con-cso commented May 11, 2021

* Stdlib

* thirdparty

* company / repository

* firstparty

That is also something I'd like to see. I am not using a monorepo, but several in-house packages as external, pre-built dependencies. They all share the same vendor namespace:

from webob.exc import HTTPBadRequest
from foo.platform import BaseThing  # external package, same namespace
from foo.app import LocalThing  # local package

isort 5.7.0 then sorts webob smack in the middle, making the linters in my pre-commit chain complain. Pylint example:
[C0412(ungrouped-imports), ] Imports from package foo are not grouped

For my use case, a naive "always sort the local namespace last" would be sufficient. Or did I miss some arcane edge case?

Edit: I was not being clear, I meant without explicitly using known_first_party as we use a very modular approach spread across multiple teams.

@glumia
Copy link
Contributor

glumia commented Jun 17, 2021

As a more permanent solution, would support for glob expansion in source paths be a good solution for your code layout?

[settings]
src_paths = *

Hey there! First and foremost thanks for your work on this awesome tool ♥️

I have a very similar problem with a monorepo and this would be a perfect solution (I actually tried to use a wildcard hoping it was supported ahah). All our internal libraries live inside a lib folder and a functions/* path, and even if I can add them one by one to src_paths I'm pretty sure that sooner or later we'll miss some of them.

@timothycrosley I can actually try to work on it by myself in the weekend, if you're willing to guide me a bit and think it would be a good addition to isort :)

@timothycrosley
Copy link
Member

@glumia

Hey there! First and foremost thanks for your work on this awesome tool

Thanks! That means a lot!

I can actually try to work on it by myself in the weekend, if you're willing to guide me a bit and think it would be a good addition to isort :)

Are you still interested in working on this? If so, I think the easiest path would be checking for a "*" character here, and if found using the path.glob functionality and adding all directories found:

https://github.com/PyCQA/isort/blob/main/isort/settings.py#L455

@glumia
Copy link
Contributor

glumia commented Jun 22, 2021

Sure! Perfect, thanks :)

@Grillpfanne
Copy link

As a more permanent solution, would support for glob expansion in source paths be a good solution for your code layout?

[settings]
src_paths = *

You write that glob expansion is not yet supported, while the docs say it is (https://pycqa.github.io/isort/docs/configuration/options.html#src-paths).
I couldn't get * or ** to work, neither when passing as cli argument nor when using a .isort.cfg file.

Did I do something wrong or is the documentation wrong?

Appreciate the help and the work you do on isort!

@doublethefish
Copy link

According to PEP 420 this issue is about Nested Namespace Packages. It seems isort doesn't yet have any explicit handling for that type of package (auto_identify_namespace_packages doesn't seem to cover it).

There are two ways of thinking about this problem:

  1. Submodules of Nested Namespace Packages should be treated as being at the same level no matter where they are imported (for example the interpreter effectively merges them for us)
  2. Nested Namespace Packages should differentiate between local-submodules (local-sibling submdoules) and remote-submodules (non-local sibling submodules) in the same Namespace (which might be more logical and readable to developers)

Either option means the output is consistent irrespective of where isort is run from. I think this is a key goal.

For me Option 2 is more intuitive when reading and reasoning about code in monorepos and across repos, and should behave like:

# Nested Namespace Package:
project1/
   src/
        parent/
            child/
                a.py  # should treat parent.child.a_sibling as first-party, but parent.child.b_sibling as third-party (because of the project1 vs project2 root dirs)
                a_sibling.py
project2/
    src/
        parent/
             child/
                b.py # should similarly treat parent.child.b_sibling as first-party, but parent.child.a_sibling as third-party
                b_sibling.py

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
question Further information is requested
Projects
None yet
Development

No branches or pull requests

6 participants