Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Stack overflow when analyzing complex function #3138

Closed
erictraut opened this issue Mar 2, 2022 · 29 comments
Closed

Stack overflow when analyzing complex function #3138

erictraut opened this issue Mar 2, 2022 · 29 comments
Labels
addressed in next version Issue is fixed and will appear in next published version bug Something isn't working

Comments

@erictraut
Copy link
Collaborator

This is a tracking bug. For the original, see microsoft/pylance-release#2427 (comment).

Thanks to @excubo-jg, I was able to get my hands on the code and repro the problem. It involves a method that is over 4000 lines long and has many dozens of local variables (all of them unannotated and therefore require inference), and about a thousand code flow statements (if, else, etc.). This is likely beyond our ability to analyze in a reasonable amount of time with reasonable resources (including stack space). I'll need to look for ways to abandon an analysis attempt for a function of this complexity rather than crashing.

@erictraut erictraut added the bug Something isn't working label Mar 2, 2022
@excubo-jg
Copy link

Reacting to your comment about missing annotations I have added those to hopefully almost all of the local variables in the method, renamed the old version to _notannotated and updated the repo. From a user perspective we did not expect the code complexity to pose such challenges.
As you can see from the log we are also getting plenty of "long operation" warnings but not much insight into what causes those. More verbose feedback from pylance would be appreciated to mitigate the code complexity - if possible.

@erictraut
Copy link
Collaborator Author

Refactoring code into subroutines will be the most effective way to reduce complexity.

@erictraut
Copy link
Collaborator Author

I've added logic to compute cyclical complexity of a function and abandon attempts to perform code flow analysis if the complexity is above a certain threshold. This prevents the type analyzer from crashing. If type checking is enabled, a diagnostic is emitted that indicates that the complexity limit has been reached. This diagnostic is suppressed when type checking is disabled because code flow analysis results are not as important when type checking diagnostics are disabled.

Even when skipping code flow analysis of the run method, it still took several minutes to evaluate the types of some variables used within this method because of the lack of type annotations and the large number of assignments to these variables. To address this, I've added some additional threshold limits and improved the scalability of a few other algorithms. These changes reduced the analysis time to about 4 seconds for the file. That's still very long, but it's much better than before.

You can further reduce this time by breaking the functionality into smaller, less-complex subroutines.

Thanks again for providing access to the code. I wouldn't have been able to diagnose this issue without access to the code.

@erictraut erictraut added the addressed in next version Issue is fixed and will appear in next published version label Mar 4, 2022
@excubo-jg
Copy link

Many thanks for the swift turnaround and the performance improvement. We will do further rework on the code as per your suggestions.

@erictraut
Copy link
Collaborator Author

This is addressed in pyright 1.1.227, which I just published. It will also be included in the next release of pylance.

@excubo-jg
Copy link

I ran VS Version 17.2.0 Preview 4.0 with pylance 2022.3.3 which should inlcude the fix. Alas, I get this as output:
Info: (13672) Pylance language server 2022.3.3 (pyright dd7420c) starting
Info: (13672) Server root directory: c:\program files\microsoft visual studio\2022\preview\common7\ide\extensions\microsoft\python\core\pylance\dist
Info: (13672) No configuration file found.
Info: (13672) No pyproject.toml file found.
Info: (13672) Setting pythonPath for service "DominionSolver": "C:\Users\Admin\anaconda3\envs\dominion\python.exe"
Warning: (13672) stubPath C:\Users\Admin\Documents\Visual Studio 2019\Projects\DominionSolver\typings is not a valid directory.
Info: (13672) Assuming Python version 3.9
Info: (13672) Assuming Python platform Windows
Info: (13672) Searching for source files
Info: (13672) Found 27 source files
Info: (13672) Background analysis(1) root directory: c:\program files\microsoft visual studio\2022\preview\common7\ide\extensions\microsoft\python\core\pylance\dist
Info: (13672) Background analysis(1) started
Info: (13672) [BG(1)] Long operation: checking: C:\Users\Admin\Documents\Visual Studio 2019\Projects\DominionSolver\ai\quick_dominion.py (644912ms)
Info: (13672) [BG(1)] Long operation: analyzing: C:\Users\Admin\Documents\Visual Studio 2019\Projects\DominionSolver\ai\quick_dominion.py (646178ms)
Info: (13672) [BG(1)] Long operation: getSemanticTokens range 3650:0 - 3751:0 at C:\Users\Admin\Documents\Visual Studio 2019\Projects\DominionSolver\ai\quick_dominion.py (3816ms)
Info: (13672) [BG(1)] Long operation: getSemanticTokens range 3650:0 - 3751:0 at C:\Users\Admin\Documents\Visual Studio 2019\Projects\DominionSolver\ai\quick_dominion.py (3714ms)
Info: (13672) [BG(1)] Long operation: getSemanticTokens range 3650:0 - 3751:0 at C:\Users\Admin\Documents\Visual Studio 2019\Projects\DominionSolver\ai\quick_dominion.py (3177ms)
Info: (13672) [BG(1)] Long operation: getSemanticTokens range 3650:0 - 3751:0 at C:\Users\Admin\Documents\Visual Studio 2019\Projects\DominionSolver\ai\quick_dominion.py (3068ms)
Info: (13672) [BG(1)] Long operation: getSemanticTokens range 3650:0 - 3751:0 at C:\Users\Admin\Documents\Visual Studio 2019\Projects\DominionSolver\ai\quick_dominion.py (3118ms)
Info: (13672) [BG(1)] Long operation: getSemanticTokens range 3650:0 - 3751:0 at C:\Users\Admin\Documents\Visual Studio 2019\Projects\DominionSolver\ai\quick_dominion.py (3069ms)
etc.

I have updated the repo with the current version of the file which causes the problems

@erictraut
Copy link
Collaborator Author

@excubo-jg, have you tried with the latest version of pylance (2022.4.2) which was released this week? Version 2022.3.3 is several versions ago, and I've fixed multiple bugs since then that were causing stack overflows.

I just checked out the latest version of your repo, and I'm not able to repro a crash. Analysis does take a long time, which is something I would like to investigate further.

Let me know if you're still able to repro the crash with the latest version.

@excubo-jg
Copy link

No, I did not experience a crash alas the analysis does not come to a conclusion - at least in reasonable time.

microsoft/pylance-release#2427 (comment) asked to reopen if problems are present after 2022.3.1

@erictraut
Copy link
Collaborator Author

Thanks for confirming that it no longer crashes. I've added some additional code to detect functions that are too complex to analyze in a reasonable time. Rather than attempting to analyze (taking tens of seconds or longer), it now emits an error indicating that the function is too complex for analysis. The DominionSimulation.run method in quick_dominion.py is one such method. It is over 4800 lines long and has a cyclomatic complexity of over 1600!!! With this mitigation in place, analysis of the module quick_dominion.py now takes just 2.5 sec rather than 30 sec.

@excubo-jg
Copy link

Many thank for coming back so quickly. Will these changes lead to pyright not being continuously executed? I did run some more test and the software never stops analyzing the code although VS is in the background without any user interaction. Also, especially the initial analysis times I am seeing are far longer than the 30sec you mention.

Not sure if this is relevant but the logfile differs between quick_dominion.py standalone (as in the share repo) and it being part of the whole project. The standalone analysis fails with "getSemanticTokens full" whereas the other does not.

Stand alone:
Info: (11232) Background analysis(1) started
Info: (11232) [BG(1)] Long operation: getSemanticTokens range 0:0 - 76:0 at C:\Users\Admin\Documents\Visual Studio 2019\Projects\pylancedemo\ai\quick_dominion.py (3059ms)
Info: (11232) [BG(1)] Long operation: checking: C:\Users\Admin\Documents\Visual Studio 2019\Projects\pylancedemo\ai\quick_dominion.py (436818ms)
Info: (11232) [BG(1)] Long operation: analyzing: C:\Users\Admin\Documents\Visual Studio 2019\Projects\pylancedemo\ai\quick_dominion.py (436818ms)
Info: (11232) [BG(1)] Long operation: getSemanticTokens full at C:\Users\Admin\Documents\Visual Studio 2019\Projects\pylancedemo\ai\quick_dominion.py (63560ms)
Info: (11232) [BG(1)] Long operation: getSemanticTokens delta previousResultId:1650614507184 at C:\Users\Admin\Documents\Visual Studio 2019\Projects\pylancedemo\ai\quick_dominion.py (63713ms)
Info: (11232) [BG(1)] Long operation: getSemanticTokens delta previousResultId:1650614575738 at C:\Users\Admin\Documents\Visual Studio 2019\Projects\pylancedemo\ai\quick_dominion.py (62530ms)
Info: (11232) [BG(1)] Long operation: getSemanticTokens delta previousResultId:1650614639763 at C:\Users\Admin\Documents\Visual Studio 2019\Projects\pylancedemo\ai\quick_dominion.py (60049ms)
Info: (11232) [BG(1)] Long operation: getSemanticTokens delta previousResultId:1650614702564 at C:\Users\Admin\Documents\Visual Studio 2019\Projects\pylancedemo\ai\quick_dominion.py (60127ms)

Full project:
Info: (11084) Background analysis(1) started
Info: (11084) [BG(1)] Long operation: getSemanticTokens range 3600:0 - 3701:0 at C:\Users\Admin\Documents\Visual Studio 2019\Projects\DominionSolver\ai\quick_dominion.py (23570ms)
Info: (11084) [BG(1)] Long operation: checking: C:\Users\Admin\Documents\Visual Studio 2019\Projects\DominionSolver\ai\quick_dominion.py (448684ms)
Info: (11084) [BG(1)] Long operation: analyzing: C:\Users\Admin\Documents\Visual Studio 2019\Projects\DominionSolver\ai\quick_dominion.py (448685ms)
Info: (11084) [BG(1)] Long operation: getSemanticTokens range 3600:0 - 3701:0 at C:\Users\Admin\Documents\Visual Studio 2019\Projects\DominionSolver\ai\quick_dominion.py (12586ms)
Info: (11084) [BG(1)] Long operation: getSemanticTokens range 3600:0 - 3701:0 at C:\Users\Admin\Documents\Visual Studio 2019\Projects\DominionSolver\ai\quick_dominion.py (12723ms)
Info: (11084) [BG(1)] Long operation: getSemanticTokens range 3600:0 - 3701:0 at C:\Users\Admin\Documents\Visual Studio 2019\Projects\DominionSolver\ai\quick_dominion.py (13589ms)
Info: (11084) [BG(1)] Long operation: getSemanticTokens range 3600:0 - 3701:0 at C:\Users\Admin\Documents\Visual Studio 2019\Projects\DominionSolver\ai\quick_dominion.py (13265ms)

@erictraut
Copy link
Collaborator Author

erictraut commented Apr 22, 2022

I presume that you are modifying the file in the editor? This will cause pylance to request new semantic tokens, which explains what we see in the log.

Here's the log result I see with the latest version of pylance (2022.4.2) on my machine.

[Info  - 7:58:20 AM] (77301) [BG(1)] Long operation: checking: <redacted>/ai/quick_dominion.py (37655ms)
[Info  - 7:58:20 AM] (77301) [BG(1)] Long operation: analyzing:<redacted>/pylancedemo/ai/quick_dominion.py (37872ms)
[Info  - 7:58:26 AM] (77301) [BG(1)] Long operation: getSemanticTokens full at <redacted>/pylancedemo/ai/quick_dominion.py (6315ms)

Note that the times are much lower than what you're seeing. If it were a factor of 2x off, that could be explained by differences in processor or memory speeds. But your times are 12x longer than what I'm seeing. I don't have a good explanation for that. I'm also seeing only one (full) semantic tokens request come from VS Code rather than multiple range requests, which you are seeing. I don't understand why VS Code is issuing the range request in your case — and why it's doing so repeatedly.

In any event, the improvements I've made in pyright should largely mitigate these issues in next week's release of pylance.

Here is the pyright log after my recent change:

[Info  - 8:03:31 AM] [FG] Long operation: checking: <redacted>/pylancedemo/ai/quick_dominion.py (2314ms)
[Info  - 8:03:31 AM] [FG] Long operation: analyzing: <redacted>/pylancedemo/ai/quick_dominion.py (2314ms)

(Note that pyright doesn't support semantic tokens — a feature that's implemented in pylance, so you don't see the getSemanticTokens call.)

Let's wait until next week's pylance release and see if that addresses the problem you're seeing on your computer.

@excubo-jg
Copy link

Many thanks. Would it be possible to always provide the user with some information where the problem appears? "Long operation: getSemanticTokens range 3600:0 - 3701" narrows it down and gives an indication about a digestible length.
To clarify: I am using VS (preview), not VSC. I am reporting times after opening the project without any editing being done. So it is VS issuing these requests - on its own it seems -, not VSC.

@erictraut
Copy link
Collaborator Author

Logging is the best mechanism we have to provide feedback to the user when a problem occurs. There's a balance we need to strike between too much logging and not enough.

Ah, I didn't catch the fact that you're using VS rather than VS Code. That explains the difference in semantic token requests.

@bschnurr
Copy link
Member

I'm actually working with VS language server client team. they are requesting too often for semantic tokens. And next week's Pylance will have some updates to improve returning semantic tokens when nothing changes.

@excubo-jg
Copy link

I have upgraded to Version 17.2.0 Preview 5.0 which still uses pylance 2022.3.3 so no change re infinite looping and run times. There is some improvement - "go to definition" finally working - but VS puts warnings under imported constants and function calls yielding hundreds of warnings of the "XX is not defined" type. This clutters the interface quite badly with erroneous highlighting.

@erictraut
Copy link
Collaborator Author

@excubo-jg, please update to the latest version of pylance. Version 2022.4.3 will be released within the next 24 hours, and it contains the latest round of optimizations. Version 2022.3.3 is a month old at this point, and I don't want to be chasing down problems that have already been fixed.

And if you see problems with the latest version, please open a new issue so it is tracked appropriately. This issue has been closed.

@excubo-jg
Copy link

Sure, will do. Last question here: Could you please advise what I have to do to make VS (preview) use pylance 2022.4.3? The program installed 2022.3.3 and I do not see an option to change that.

@bschnurr bschnurr added addressed in next version Issue is fixed and will appear in next published version and removed addressed in next version Issue is fixed and will appear in next published version labels Apr 27, 2022
@bschnurr
Copy link
Member

bschnurr commented Apr 27, 2022

in VS there is no simple way of updating. You'll have to manually copy over the latest to your server root directory.

Info: (13672) Server root directory: c:\program files\microsoft visual studio\2022\preview\common7\ide\extensions\microsoft\python\core\pylance\dist

@excubo-jg
Copy link

ok, that sounds feasible. I am stuck now because I don't know where I can download the latest set of files for VS. The pylance-release-repo only refers to VSC as far as I can tell and the files in the VSC extension are in a different format.

@excubo-jg
Copy link

I don't see how I can contribute in a meaningful manner if I am expected to use the latest pylance version but neither does VS include it nor can I patch it in. Makes reporting bugs a moot point as long as VS does not ship with latest pylance version.

@debonte
Copy link
Collaborator

debonte commented May 26, 2022

Makes reporting bugs a moot point as long as VS does not ship with latest pylance version.

@excubo-jg, VS does ship with a recent Pylance version, however VS doesn't ship very often and it also has a significant stabilization period prior to shipping during which only critical changes are accepted. If we attempted to insert the latest Pylance build just before VS released, just to have the latest, that change would be rejected.

The best option at the moment is to use Visual Studio preview releases since they come out more frequently. For example, VS 17.3 Preview 1.x (released on May 10) includes Pylance 2022.4.2 (from mid-April), and looking at the VS release schedule, that's pretty much the newest version of Pylance that we could possibly have included there.

VS's relatively slow release cycle (compared with VS Code) is a fact of life for all Visual Studio components. If using the latest Pyright/Pylance builds is a priority to you, VS Code would be the answer. On the other hand, if you need features that are only available in VS, then use VS. Both choices have pros and cons.

We do value the input of our Visual Studio users, and hope that you will continue to provide feedback!

@excubo-jg
Copy link

Many thanks for your explanations about the process. As the issues I am seeing are not fully fixed I did provide updated feedback and a repo here microsoft/pylance-release#3012 but I do not see a reaction to it.

@excubo-jg
Copy link

I have documented the performance deterioration with the latest VS preview version: microsoft/pylance-release#3012 . I am puzzled why this is not taken up as I did provide access to the repo so any further in-depth analysis can be done. I am not seeing that the pertinent issues are being solved, more to the contrary.

@erictraut
Copy link
Collaborator Author

@excubo-jg, the issue that was causing the major performance issue has been fixed in pyright. Based on your other observations in the pylance-release issue tracker, it appears that the latest version of pyright has not been included in the most recent versions of VS. You may want to try reproducing with VS Code, where pylance is updated more rapidly, to see if your issue still exists there.

@excubo-jg
Copy link

I have opened the project in VS Code (pylance v2022.7.20) which does start well-behaved (detects the warning) but after changing "while" to "whle" in line 2247 goes bonkers, e.g. does not update the problems window after the bug is fixed although CPU is running for now minutes. In spite of having Trace enabled I am not seeing any updates in output-Python Language Server so I cannot provide more details. I did provide you with access to the repo and the link should be valid,

@erictraut
Copy link
Collaborator Author

The "fix" that I previously implemented involves a heuristic — a threshold of "complexity" beyond which pyright will refuse to analyze the code. The complexity measurement takes into account both the cyclomatic complexity and the total number of nodes in the control flow graph. My previous "fix" limited the total complexity to 1024. When you modified the "while" keyword to "whle", you dropped this value to slightly below 1024, and the code analyzer once again attempted to analyze that function. I've lowered the threshold to 512 to give it more buffer room. Lowering this threshold may negatively impact other users, so I may need to make further adjustments to fine tune it.

I'll note that your source file is an extreme case — one that static code analysis will never be able to handle well. It's also a case that frankly we do not test or optimize for. I encourage you to consider refactoring your code into subroutines. This will not only make it easier to understand and maintain, but it will make it possible for tools like code formatters and static analysis tools to handle the code complexity.

@excubo-jg
Copy link

Many thanks for the quick fix. Much ado about heuristically changing one line almost three months later.

Re handling the code complexity: The code structure has to accommodate for external restrictions so that limits the flexibility... The high cyclomatic complexity is driven by a significant part of the code being hundreds of elif-statements (switch-case equivalents) of limited local complexity. Any chance of handling switch-case elements more efficiently? They all are subroutines albeit only implicit ones.

The behavior I find the most annoying is not so much about the reaction time but about error/warning-messages that are not removed although the line in question has been fixed. If the analyzer stops its analysis before evaluating a changed error line due to time constraints the code still seems to get flagged as checked. Were it possible to track if code analysis is cut short before all active messages have been revisited and if so to later initiate separate checks on those lines to verify the messages e.g. remove them?

@erictraut
Copy link
Collaborator Author

As I said, this code is an extreme case, and the type analyzer will never handle it well. The best we can do in this case is avoid a crash. I don't plan to spend any more time on this issue.

@excubo-jg
Copy link

Many thanks for sharing your explanations.
Alas, from the user's perspective it stays unclear where the line to "too extreme" is being drawn: I have no visibility on when a code base is deemed tobe too complex. I would expect a clear warning or an option to cancel pylance. Currently I am experiencing haphazard performance jumps between new versions of pylance/vs. Consistent is getting >4k meritless warnings. Which relate to perceived code complexity as you describe it but that is not deducible from the messages themselves. One would be enough.
Also, from my experience there are general problems with vs-stability and deficiencies like git scrambling breakpoints, peek definition going into infinite loops., etc. Are these all caused by the code being is too extreme? This bug recently got fixed: microsoft/debugpy#969 I am a bit amiss on what to report and what not as being too extreme,
Re "not test or optimizing for": To get another performance point I put the code into pycharm. Which is more responsive while generating more relevant warning messages and no noise. I noticed some updating glitches so I will have to enquire about their definition of extremeness

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
addressed in next version Issue is fixed and will appear in next published version bug Something isn't working
Projects
None yet
Development

No branches or pull requests

4 participants