Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Vulture : use pyflakes as an initial parse basis #605

Open
jayvdb opened this issue Aug 1, 2018 · 4 comments
Open

Vulture : use pyflakes as an initial parse basis #605

jayvdb opened this issue Aug 1, 2018 · 4 comments
Assignees

Comments

@jayvdb
Copy link
Member

jayvdb commented Aug 1, 2018

Pyflakes does a lot of analysis which can be used by vulture.

pyflakes has long ago decided that it will not report dead code.

However it does need to report other problems caused by dead code, and this can be done by tracking dead code.

There is a POC at

https://github.com/jayvdb/pyflakes/tree/dead-code

vulture can use this base on which to do more sophisticated analysis which is beyond scope of vulture.

One of the key benefits of using pyflakes is errors in the code can be detected by pyflakes base and vulture can fail when those errors would make vulture results unreliable.

@jayvdb
Copy link
Member Author

jayvdb commented Aug 1, 2018

Maybe @RJ722 & @MacBox7 can help with mentoring this project next year, if @jendrikseipp agrees with the concept. cc @myint

@jendrikseipp
Copy link
Contributor

Interesting. Is this project about changing Pyflakes, Vulture, or both? I suppose the added functionality for pyflakes can be seen in the test cases you added in the POC. Could you please say how Vulture's functionality could be extended by this, maybe with a concrete example?

@jayvdb
Copy link
Member Author

jayvdb commented Aug 2, 2018

Hi @jendrikseipp , I would view this as a potential pyflakes + vulture project.

The POC above would be merged to pyflakes before this GSoC project (perhaps a little bit revised, and maybe reduced).

That POC adds only limited use of the dead scopes it detects; limited intentially by the purpose of pyflakes.

One part of the project is enhancing vulture to use pyflakes to do the initial parse of files.
This would remove a lot of basic analysis from vulture, where that is already being done by pyflakes.

There are two ways this could be done.

  1. The Vulture node visitor would take the result of pyflakes as its basis, and then it revisits everything to find all the extra things needed by vulture.

  2. It subclasses the pyflakes.checker.Checker visitor like we are doing at https://gitlab.com/coala/bears/coala-pyflakes/blob/master/pyflakes_bears/PyFlakesASTBear.py#L17 to collect the extra bits of info that vulture needs, in the same visit.

how Vulture's functionality could be extended by this

pyflakes handles a lot of corner cases which I very strongly doubt vulture handles yet, having looked at the vulture visitor.

By using the pyflakes ast parser, vulture will benefit from those corner cases being solved. If it helps, I can spend a bit of time finding some of the bigger ones which impact vulture more (I have one below). But really the problem is that there are lots of tiny little ones which wouldnt be considered very important for vulture to handle correctly, and fixing them all one by one would result in lots of code churn, possibly bugs created, and a lot more unnecessary complexity in vulture. By deferring the initial parse to pyflakes, vulture could have reduced complexity, as its codebase would focus only on its primary function which is determining dead code.

There is still one very large 'corner' case which pyflakes doesnt handle well, and that is del. I notice that vulture doesnt handle this specifically at all, so maybe that is an area where this project might extend vulture beyond its current capabilities. However I must note that vultures general algorithm catches basic dead code created by del. Simple examples follow, but to really bypass vultures algorithm requires more complicated examples.

e.g. vulture has standard confidence that variable b here is unused:

def a():
    b = 100
    del b


a()

However it doesnt see that the following is an error, while pyflakes does detect this problem:

def a():
    b = 100
    del b
    return b


a()

But more importantly it cant see that the first b was unused in the following:

def a():
    b = 100
    del b
    b = 200
    return b


a()

pyflakes also doesnt catch that b was unused there. It does know it was deleted, but the del support is a bit of a hack so its error reporting system doesnt get a chance to see what happened here.

So this is one example where we might enhance pyflakes and vulture at the same time, if vulture was layered on top of pyflakes.

I had a crack at enhancing this in pyflakes a few years ago (PyCQA/pyflakes#37), but it isnt a priority for me, and it is not a lazy-day project, so it keeps falling off the to-do list. I have been incrementally solving problems with that patch, and @MacBox7 this year solved two of structural problems which complicated better del support, i.e. node<->scope links (PyCQA/pyflakes#339) and builtins (PyCQA/pyflakes#351).
There is one outstanding structural problem, which is the deferred functions and assignments logic inside pyflakes.

If we decided to go ahead with a del based project, the milestones of it might look like

  1. Unravel deferred functions in pyflakes and add better del support

  2. vulture to use pyflakes for initial parse

  3. vulture to use pyflakes del support to detect more cases of dead code &
    further improve better del support in pyflakes

@jendrikseipp
Copy link
Contributor

I am quite reluctant about making Vulture depend on Pyflakes since external dependencies usually incur a lot of extra maintenance cost. This is even worse when depending on the internal Pyflakes code instead of its command line interface. I'd only consider depending on Pyflakes if we see that it would add a lot of value to Vulture. The del issues you mention above don't really convince me, since the problem in your second example stems from erroneous code, not from unused code. The unused code in the third example is not reported by Vulture since it ignores variable scopes by design.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

No branches or pull requests

4 participants