Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RemoveUnusedImports] Support string type annotations #353

Merged
merged 7 commits into from Jul 31, 2020

Conversation

zsol
Copy link
Member

@zsol zsol commented Jul 28, 2020

Summary

This PR adds support for detecting imports being used by string type
annotations, as well as imports suppressed by comments.

It breaks up the existing visitor into multiple smaller, single-purpose
visitors, and composes them together.

TODO:

  • add docstrings
  • document behavioral changes to RemoveImportsVisitor

Test Plan

existing and new tests

This PR adds support for detecting imports being used by string type
annotations, as well as imports suppressed by comments.

It breaks up the existing visitor into multiple smaller, single-purpose
visitors, and composes them together.
@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Jul 28, 2020


DEFAULT_SUPPRESS_COMMENT_REGEX = (
r".*\W(lint-ignore: ?unused-import|noqa|lint-ignore: ?F401)(\W.*)?$"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest this one instead:

r".*\W(noqa[^:]|noqa: ?F401|lint-ignore: ?unused-import|lint-ignore: ?F401)(\W.*)?$"

So it ignores things like # noqa, # noqa: F401, but not # noqa: IG61

Comment on lines 53 to 55
self.context.scratch.get(
self.SUPPRESS_COMMENT_REGEX_CONTEXT_KEY, DEFAULT_SUPPRESS_COMMENT_REGEX,
),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The thing with suppressing comments at GatherUnusedImportsVisitor level is that in some use cases (linter rules in frameworks that already silent lines with such comments) this ends up in doing more work than needed.

Maybe the suppression of comments could be moved to some other visitor which adds lines to be excluded, so one could choose whether to have them collected or not by using that one.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose you can set the regex to something that will never match (like ^$), but we could add an explicit bool option to disable it. Would that help?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, faster code is code never executed. Best thing would be if we could put these in a whole different, also reusable, place so they're never executed when they're not needed. Something like GatherIgnoredLines or something like that. But that's only a suggestion for those use cases where this isn't really needed.

zsol added 3 commits July 29, 2020 12:18
- move lint suppression out from visitor into codemod
- change suppression regex
)

METADATA_DEPENDENCIES = (
PositionProvider,
*GatherCommentsVisitor.METADATA_DEPENDENCIES,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should only need to list the direct metadata dependency of RemoveUnusedImportsCommand here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jimmylai, so the line *GatherCommentsVisitor.METADATA_DEPENDENCIES, should be removed, right?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes.

@jimmylai
Copy link
Contributor

Overall looks good and the test coverage is high. I'll let @Kronuz to do another review and accept to see if he need more changes for reusability.

Comment on lines +16 to +18
DEFAULT_SUPPRESS_COMMENT_REGEX = (
r".*\W(noqa|lint-ignore: ?unused-import|lint-ignore: ?F401)(\W.*)?$"
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, don't forget about this one as well, I think it's better than the current one:

DEFAULT_SUPPRESS_COMMENT_REGEX = (
    r".*\W(noqa[^:]|noqa: ?F401|lint-ignore: ?unused-import|lint-ignore: ?F401)(\W.*)?$"
)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't match # noqa though, which is the most common case

Copy link
Contributor

@Kronuz Kronuz Jul 30, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right!

What do you think of:

r".*\W(noqa|lint-ignore)(: ?(F401|unused-import)|(?= )|$)(\W.*)?$"

)

METADATA_DEPENDENCIES = (
PositionProvider,
*GatherCommentsVisitor.METADATA_DEPENDENCIES,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jimmylai, so the line *GatherCommentsVisitor.METADATA_DEPENDENCIES, should be removed, right?


def visit_Module(self, node: cst.Module) -> bool:
export_collector = GatherExportsVisitor(self.context)
node.visit(export_collector)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jimmylai, just wondering... won't calling node.visit(export_collector) and node.visit(annotation_visitor) here traverses the tree multiple times? wouldn't this be inefficient?

I was thinking maybe using multiple inheritance instead of this, so we make it that it only traverses the tree once.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes it will traverse the tree multiple times. I don't think that's a big deal for a codemod like this. Running the codemod on a bunch of large files is still not visibly slower than before. I don't have benchmarks for you though :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we use this in a linter, which will run over and over, while the user types, we want this as fast as possible, otherwise linting times will hit the experience. Although I’m not sure how expensive traversing the tree is, processing time would be exponential, potentially noticeable on big files. What do you think @jimmylai ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Performance is a concern for lint but not for codemod. In lint, we try to avoid creating a visitor and calling node.visit(visitor).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The @m.visitused in this codemod is not supported in the lint framework since we worried about the the pattern can introduce some expensive checks. So we only use m.matches() in lint.

return True

@m.visit(
m.Import()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How could we add other exceptions to this reusable class, like if we wanted to ignore import __strict__, for example?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's best to apply filtering on the results of this visitor (just like how you pointed out that the suppression filtering belongs outside of this class). So I would filter for special module names after running GatherUnusedImportsVisitor, and keep the exceptions here to a minimum (i.e. the ones implied by Python itself)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I agree :)

Copy link
Contributor

@jimmylai jimmylai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To make GatherCommentsVisitor, GatherNamesFromStringAnnotationsVisitor and GatherUnusedImportsVisitor reusable in lint framework and address the concerns on performance.
My proposal is to not use @m.visit and @m.call_if_inside in the implementation (just like GatherImportsVisitor and GatherExportsVisitor). Then we can configure to run those helper visitors in lint framework before running lint rules. So the computed data will be accessible in each lint rule. What do you think? @Kronuz @zsol

@zsol
Copy link
Member Author

zsol commented Jul 30, 2020

I'm ok with that but won't have time to implement it this week. How about we land this PR and open a new one for these proposed refactors?

@Kronuz
Copy link
Contributor

Kronuz commented Jul 30, 2020

@jimmylai, sounds good!

@zsol, thank you for working on this! It'll be super helpful!! 😀

@zsol zsol merged commit 6a5e739 into Instagram:master Jul 31, 2020
@zsol zsol deleted the better-unused-imports branch July 31, 2020 14:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants