Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for detecting whether an import is only made during type checking #64

Closed
antdking opened this issue Aug 5, 2019 · 21 comments · Fixed by #130
Closed

Add support for detecting whether an import is only made during type checking #64

antdking opened this issue Aug 5, 2019 · 21 comments · Fixed by #130

Comments

@antdking
Copy link

antdking commented Aug 5, 2019

Specifically, be able to detect when imports are used under typing.TYPE_CHECKING.

Raising this in Grimp, as it appears the functionality isn't in place, and so can't be implemented with a custom contract.

While I understand that this is solved by introducing an interface, this results in a lot of boilerplate code when all you're seeking is basic static analysis + hints in an editor.

Given a Contract:

name = Prevent cross-reliance on API Clients
type = independence
modules =
    ulama.api_client_1
    ulama.api_client_2
    ulama.remote_transfer

and modules of:

# ulama/api_client_1.py
class ApiClient1: ...

# ulama/api_client_2.py
class ApiClient2: ...

# ulama/remote_transfer.py
from typing import TYPE_CHECKING
  from .api_client_1 import ApiClient1
  from .api_client_2 import ApiClient2

class RemoteTransfer:
  from_client: ApiClient1
  to_client: ApiClient2

I would expect this to pass the contract.

to reiterate, understand it'd be better to introduce a stop gap, but feel there's a use-case for keeping things simple in the code, while still being able to offer an external contract to prevent mistakes.

@seddonym
Copy link
Owner

Hi Anthony,

Thanks for this - and sorry for the delay in responding.

Am I right in thinking the code sample should read:

...
from typing import TYPE_CHECKING

if TYPE_CHECKING:
  from .api_client_1 import ApiClient1
  from .api_client_2 import ApiClient2
...

I think potentially it's a good idea, though I would lean towards it not being the default behaviour, as arguably if something knows about a type, then it's a dependency.

Happy to look at a PR if you're eager for this functionality to get included, if not I'll just leave it as an open issue for now.

@blokje
Copy link

blokje commented Aug 9, 2021

@seddonym This issue seems stale, any idea if you will ever import this or can you give a pointer on where to start?

We are looking to implement import-linter but use if TYPE_CHECKING: all over the place. We have (at this moment) no way to implement import-linter in a strict enough manner.

@seddonym
Copy link
Owner

Hi @blokje, I'm not actively planning to work on this but I would be happy to consider a PR (or two).

Perhaps the best way to approach it would be as follows.

First - as a general principle this should not be default behaviour for Grimp nor for Import Linter.

Grimp

Perhaps something like this:

>>> graph.get_import_details(
    importer='myproject.foo',
    imported='myproject.bar',
    include_type_checking=True,
)

[
    {
        'importer': 'myproject.foo',
        'imported': 'myproject.bar',
        'line_number': 5,
        'line_contents': 'from . import baz',
        'type_checking_only': True,
    },
]

Import Linter

[importlinter:contract:1]
name=Foo doesn't import bar or baz
type=forbidden
source_modules=
    myproject.foo
forbidden_modules=
    myproject.bar
    myproject.baz
ignore_type_checked_only = True

Under the hood, the contract could remove all the type_checking_only=True imports before beginning analysis.

What do you think?

@yakMM
Copy link
Contributor

yakMM commented Dec 21, 2022

Okay I had fun in grimp and tried to understand how AST works, I think I should be able to work something out and maybe open a PR at some point.

However I have a few questions regarding design decisions:

  1. I feel like it is quite hard to see if the TYPE_CHECKING attribute comes from the typing module or not. Is it okay to trigger the "filter" on all if TYPE_CHECKING calls? In other words, if the name TYPE_CHECKING is used as a variable, we will have false positives:
TYPE_CHECKING = None

if TYPE_CHECKING:
    import foo
  1. Do I need to filter all kind of if statements or just the "most used / pythonic" ones:
import typing as t

if t.TYPE_CHECKING:
    import foo  # This is flagged in my proof of concept

if t.TYPE_CHECKING is True
    import foo  # This is not

if t.TYPE_CHECKING == True
    import foo  # This is not

if t.TYPE_CHECKING != False
    import foo  # This is not

@seddonym
Copy link
Owner

I feel like it is quite hard to see if the TYPE_CHECKING attribute comes from the typing module or not.

Yeah, I have a feeling if it might actually be computationally impossible to know for sure without actually running the program. So, if we are to include this, we should probably just keep it very simple and make its limitations clear in the docs.

I'd be happy, in the first instance, just to support the narrow case of something being enclosed in the if TYPE_CHECKING form documented in the Python docs. I'm comfortable with the false positive of point 1. and for none of the items in point 2.

Honestly, if it adds too much complexity to Grimp I'm likely to push back, as I'd like to reduce the burden of a potential rewrite in a faster language such as Rust. 😄

@yakMM
Copy link
Contributor

yakMM commented Dec 22, 2022

All right thanks for the input.

It isn't too complicated I think, neither a big rewrite.
Would it be too much breaking to always add the information in the import details dict, without adding a new parameter to get_import_details?

>>> graph.get_import_details(
    importer='myproject.foo',
    imported='myproject.bar'
)

[
    {
        'importer': 'myproject.foo',
        'imported': 'myproject.bar',
        'line_number': 5,
        'line_contents': 'from . import baz',
        'type_checking_only': False,
    },
]

@yakMM yakMM mentioned this issue Dec 22, 2022
4 tasks
@seddonym
Copy link
Owner

Would it be too much breaking to always add the information in the import details dict, without adding a new parameter to get_import_details?

I think so. I want to minimise the overhead for the static analysis, so this particular check should only run if it's explicitly enabled.

@yakMM
Copy link
Contributor

yakMM commented Dec 22, 2022

There should be no significant overhead, all the AST tree is parsed anyways.

As for backward compatibility, it's about adding one entry to the returned dictionnary. Do upstream projects check all the items of the dict or only the one they need?

@seddonym
Copy link
Owner

There should be no significant overhead, all the AST tree is parsed anyways.

I tested your current implementation on a large code base and I'm pleased to say it looks like you're right - in fact your change to use the NodeVisitor actually speeds things up from 26s to 22s! Removing the check for the If statement doesn't seem to affect it that much but I would like to do more testing to confirm it.

As for backward compatibility, it's about adding one entry to the returned dictionnary. Do upstream projects check all the items of the dict or only the one they need?

I'm not concerned about adding an entry to the dictionary, that's fine. To be honest my main concern is that I personally don't see this kind of import as any different from any others, so am reluctant to distinguish them by default, but I admit that might just be prejudice. Let me have a think about it. In the meantime the PR is shaping up nicely, very elegant solution to the problem 👏🏻

@seddonym seddonym changed the title Add support for detecting Conditional Imports Add support for detecting whether an import is only made during type checking Dec 27, 2022
@yakMM
Copy link
Contributor

yakMM commented Jan 12, 2023

I'm not concerned about adding an entry to the dictionary, that's fine. To be honest my main concern is that I personally don't see this kind of import as any different from any others, so am reluctant to distinguish them by default, but I admit that might just be prejudice. Let me have a think about it.

So I was testing adding a parameter to the function to suppress the type_checking entry, but I couldn't come up with something I'm satisfied with. Either we have to manually remove the entry in the ImportGraph class, which has a performance impact, either we have to pass the parameter down to the parsing logic so the parser doesn't add the information in the first place.

Either way, it also makes type checking annotations more complex because we have to duplicate DetailedImport and DirectImport depending on whether the type_checking info is asked.

All of this is making me lean more toward the initial solution of just adding the type_checking entry, letting the users eventually ignoring it.

@seddonym
Copy link
Owner

Apologies for the delay in responding.

Ok, I think I can get behind always including it in the return value of get_import_details, if it simplifies things. I would like add_import to remain backwards compatible though, so if type_checking isn't passed then it should default to False.

@seddonym
Copy link
Owner

It just occurred to me that there is another approach we could take. Rather than include the type_checking information in Grimp's import details, we could instead pass an argument to build_graph which excludes the imports from being added to the graph in the first place.

I appreciate this comes after you've already done a fair amount of work, but we should probably at least consider it, as it could impact the eventual speed of Grimp/Import Linter if and when I come to reimplement in a faster language.

Pros:

  • DetailedImport data structure stays as it is.
  • The graph is smaller.
  • Potential for a faster graph building algorithm in the future (speculative).
  • Import Linter would need to manipulate the graph less; it would just delegate the logic to Grimp. That would probably make it faster as it wouldn't have to call get_import_details on every single import and then remove it if it includes the type_checking flag.

Cons:

  • We've already done the work for the other approach.
  • There would be more complexity if Import Linter had to check multiple contracts with different requirements for type checking, as currently it just builds the graph once and then copies it for each contract. This might lead us in the direction of - at least to begin with - specifying ignore_type_checking at the top-level configuration rather than at the contract level.
  • Possibly there are other use cases that we haven't thought of that would want to know which imports are type checking imports.

Not sure where I stand on this yet but interested to know your thoughts.

@sobolevn
Copy link

My 2c

I would be really interested in this feature, because I use a lot of typing-only imports for better type-checking, however, these imports sometimes do not respect the rules I have for regular code. I would love to see ignore_typing_imports global settings for all types of contracts.

Next, technical problems.
I think that running the program is too slow, complex, and dangerous. But, we can always define what TYPE_CHECKING is. It might be only three things:

  • TYPE_CHECKING
  • typing.TYPE_CHECKING
  • t.TYPE_CHECKING (quite popular alias)

Others can be ignored. It is a users' duty to name it properly, and this is not a problem at all - to refactor this name.

@seddonym
Copy link
Owner

Thanks for the comment. So @sobolevn your vote would be for the global approach (pass it to build_graph) rather than including it in the metadata of every import? That's certainly how I feel right now. The reason is that I don't think differentiating between these kinds of import in a granular way is important enough to potentially impact speed. If someone really wants it for some use case they could build the graph twice to tell the difference. Open to other views though. @yakMM do you have a view?

I'm not intending to implement this myself, but the original PR is in pretty good shape (notwithstanding the adjustments that might be needed regarding the previous point). @yakMM where are you at with this - would be you interested in picking this one up at some point?

It might be only three things:

TYPE_CHECKING
typing.TYPE_CHECKING
t.TYPE_CHECKING (quite popular alias)

I'm happy with that.

@QasimK
Copy link

QasimK commented Sep 25, 2023

A global approach would also be sufficient for my use-cases

@yakMM
Copy link
Contributor

yakMM commented Sep 25, 2023

Hey, sorry I was MIA, this feature was not at the top of my priority list.

If everyone is happy with a global approach we can go with option 1, ie passing an argument to build_graph which excludes the imports from being added to the graph in the first place.

If I remember correctly. the approach of inheriting NodeVisitor from the original PR was alright from a design standpoint. I'll try to rebase and update the PR (or open a new one) with up to date code and option 1 this week.

@seddonym
Copy link
Owner

Great news!

If I remember correctly. the approach of inheriting NodeVisitor from the original PR was alright from a design standpoint.

Yes absolutely, in fact it slightly speeds up the building of the graph.

@yakMM
Copy link
Contributor

yakMM commented Sep 29, 2023

Draft PR here #130

If anyone wants to help with test/doc, feel free to PR against my fork https://github.com/yakMM/grimp/tree/64-type-checking

@seddonym
Copy link
Owner

Great, looking forward to reviewing - I'll try to take a look early next week.

@seddonym
Copy link
Owner

This is now released in Grimp 3.1. Nice work everyone!

@seddonym
Copy link
Owner

I've made an issue in Import Linter to make use of the flag. @yakMM are you interested in taking this one on? Should be much more straightforward!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants