Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Address misleading import behavior when given files #51

Closed
asmeurer opened this issue Feb 1, 2022 · 10 comments · Fixed by #54
Closed

Address misleading import behavior when given files #51

asmeurer opened this issue Feb 1, 2022 · 10 comments · Fixed by #54
Assignees
Labels
documentation Improvements or additions to documentation enhancement New feature or request

Comments

@asmeurer
Copy link

asmeurer commented Feb 1, 2022

I found it somewhat misleading when I ran slotscheck mymod/ and it instead tested the installed version of mymod. This is especially since the command line options seem to imply that the argument is a file unless the -m option is provided.

I also tried this in an environment that doesn't have my module installed and it couldn't import it, which I also found surprising given that the module is there in the directory I ran slotscheck from.

@ariebovenberg
Copy link
Owner

ariebovenberg commented Feb 1, 2022

Hi @asmeurer, thanks for posting!

The cause here is the difference between running python -m slotscheck [FILES] and slotscheck [FILES]. This is why the 'files' case is documented with python -m in the readme. I do mention the details deeper into the docs. The approach is similar to pyanalyze.

I agree though that this could be made more explicit, as the behavior of running with python -m is not that well known. I can think of three ways to improve the situation:

  1. Add a link to 'how does slotscheck resolve imports' up front in the README
  2. Print a warning if slotscheck is given files, but imports them from elsewhere. Note that an error is not appropriate, because it can be valid behavior if developing a library in editable mode, and checking through pre-commit.
  3. Add an --auto-add-pypath option which automatically adjusts the PYTHONPATH based on the files it it given. I did consider this, but my feeling is that this would be too 'magical' behavior. It'd be more explicit to have users set their own PYTHONPATH.

What are your thoughts on the preferred solution?

@ariebovenberg ariebovenberg added documentation Improvements or additions to documentation question Further information is requested labels Feb 1, 2022
@asmeurer
Copy link
Author

asmeurer commented Feb 1, 2022

Is it not possible to make it "just work", e.g., by making sure the current directory is at the front of the sys.path? If I pass it a file or path, I would expect it to operate on that path, especially given that there is a separate -m option to pass a module. I'm unclear why you would ever want the current behavior. I must be misunderstanding the workflow you are describing in your point number 2.

@ariebovenberg
Copy link
Owner

ariebovenberg commented Feb 1, 2022

I do agree that slotscheck should do away with the 'suprise' of importing from elsewhere. My remark was that it could be useful in case the module is installed in editable mode. But I suppose people should just explicitly use python -m slotscheck or PYTHONPATH to handle that case.

I'm hesitant to automatically adjust sys.paththough. This is because automatically changing something so global and fundamental is bound to have unintended consequences. I think it's telling that tools like mypy or pyanalyze don't automagically do this either when passed files.

I'd therefore suggest this change:

  1. When passed files, slotscheck will refuse to run if it would import them from somewhere else.
  2. Better document the use of python -m slotscheck and PYTHONPATH for scanning files.

@ariebovenberg ariebovenberg added this to the v0.11 milestone Feb 1, 2022
@asmeurer
Copy link
Author

asmeurer commented Feb 1, 2022

I'm a little unclear why it even works this way to begin with. Isn't . at the front of sys.path by default?

@ariebovenberg
Copy link
Owner

@asmeurer that was my impression as well, but I discovered recently that . is not in sys.path when running entrypoints/scripts like pip, pyanalyze or slotscheck.

python -m <module> does ensure the current working directory is in sys.path. The docs say this:

If this option is given, the first element of sys.argv will be the full path to the module file (while the module file is being located, the first element will be set to "-m"). As with the -c option, the current directory will be added to the start of sys.path.

Why does this even work in the first place then? slotscheck determines the module name from the files given, but it must use the import logic to import it. If the current directory is not in sys.path but there is a module with the same name installed, it will assume the import is fine and continue.

That behavior can result in surprising behavior, and should be changed.

@asmeurer
Copy link
Author

asmeurer commented Feb 1, 2022

It sounds to me like slotscheck should always add the parent directory of the given module to the path before importing. Based on what you are saying, even if you did python -m slotscheck a/b/module it wouldn't work.

Actually it would be more complicated than that because you might get passed a submodule or a specific .py file, so you'd have to check for __init__.pys (and hope that somehow namespace packages aren't relevant). I'm sure someone's solved this problem before, but I wouldn't know where to look.

@ariebovenberg
Copy link
Owner

ariebovenberg commented Feb 1, 2022

What happens when given a/b/module.py is that slotscheck:

  1. uses __init__.py files to determine the full path of the module (i.e. a.b.module).
    The issue with implicit/native namespace packages is documented. You can have a look in test_discovery.py what cases are tested. Thankfully implicit/native namespace packages are not widely used.
  2. uses the standard import mechanism to import a.b.module.
    If the a directory is in sys.path (like when running with python -m) everything goes OK.

The implicit namespace package discovery seems to have been solved by mypy (since they support it), but I don't expect the logic to be reusable.

@asmeurer
Copy link
Author

asmeurer commented Feb 1, 2022

I meant the case when a is not actually a module itself, like when you have src/module.

Regarding implicit namespace packages, IMO I wouldn't care about those unless someone requests it (I still frankly don't understand when you would ever want to use one).

@ariebovenberg
Copy link
Owner

ariebovenberg commented Feb 2, 2022

  • if a module like src/a/b/module (with src not having __init__.py) is installed in editable mode, slotscheck will import a.b.module through site-packages, which is linked to the actual files, so it will scan the up-to-date code. If the package is not installed, you'll get an import error. You'd have to add PYTHONPATH=src to the slotscheck invocation.
  • on namespace packages: I find they're useful to prefix internal packages at companies (acme.foo, acme.bar), but I always use the pkgutil-style ones that are maximally compatible.

@ariebovenberg ariebovenberg changed the title Cannot check package without installing it Address misleading import behavior when given files Feb 2, 2022
@ariebovenberg ariebovenberg self-assigned this Feb 2, 2022
@ariebovenberg ariebovenberg added enhancement New feature or request and removed question Further information is requested labels Feb 2, 2022
@ariebovenberg ariebovenberg mentioned this issue Feb 4, 2022
8 tasks
@ariebovenberg
Copy link
Owner

This explicit check is implemented in v0.11.

Example of the message:

$ slotscheck mylib
Cannot check due to import ambiguity.
The given files do not correspond with what would be imported:

  'import mylib' would load from:
  /Users/arie/.pyenv/versions/3.10.0/envs/mylib/lib/python3.10/site-packages/mylib/__init__.py
  instead of:
  /Users/arie/code/mylib-repo/mylib/__init__.py

You may need to define $PYTHONPATH or run as 'python -m slotscheck'
to ensure the correct files can be imported.

See slotscheck.rtfd.io/en/latest/discovery.html
for more information on why this happens and how to resolve it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
documentation Improvements or additions to documentation enhancement New feature or request
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants