-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
IntelliSense is not reactive with sympy #7159
Comments
Sympy perf was recently mentioned in #7130. Eric was looking for someone to provide a concrete example, so I'll transfer this issue over to Pyright. |
The You would normally be able to disable this behavior in pyright by setting the I ran
I'm not sure why sympy maintainers decided to add a "py.typed" marker file when the library is not typed. Perhaps they didn't understand the implications of doing so. For most libraries, pyright can infer return types relatively quickly. Sympy, however, contains many functions and methods that have complex code flow graphs and large numbers of variables, which makes type inference very costly. If this library is important to you, I recommend reaching out to the maintainers to ask them to 1) remove the "py.typed" marker, since the library is clearly not typed, or 2) add more type annotations to the library's public interface. In the meantime, you could manually delete the "py.typed" marker and then set You might also be able to find type stub packages for the sympy library. If installed, they would override the sympy library itself. If you can't find an existing type stub package, you could attempt to create a partial set of stubs yourself. I'm going to close this issue because I don't see anything that we can do in pyright to address this issue. |
Fair enough. I think we should remove the py.typed file from sympy then. I merged the PR (see sympy/sympy#22337 (comment)) to add the py.typed file to sympy without really understanding what the implications would be. The effect on checking downstream codebases was what worried me although I was more worried about the type checkers misreporting types rather than editors locking up the CPU and consuming many gigabytes of RAM. I do think that pyright has performance problems highlighted by this. I usually have the pyright language server disabled in my editor because of this but I have just tested what happens if I reenable it and open a file from the sympy codebase. I can see that 1 core is at 100% for many minutes. I haven't waited long enough to see how long it would take for pyright to finish what it is doing. The memory used by node seems to climb up to 2GB, then drop down and climb up again repeatedly. I'm testing this with the py.typed file removed which doesn't seem to make any difference in this case. I don't see how I could enable a language server that can perform like this just because I opened a file in the editor. |
I opened a SymPy PR about a year ago (sympy/sympy#25103) and at the time I said:
Something seems to have become less efficient since then though. I just tested with the latest version of pyright (1.1.349) on the same computer and it not only takes longer than 5 minutes but ultimately crashes: $ time pyright sympy
<--- Last few GCs --->
[21851:0x58bbd80] 814593 ms: Scavenge 1987.2 (2078.7) -> 1979.9 (2078.9) MB, 7.5 / 0.0 ms (average mu = 0.795, current mu = 0.560) allocation failure;
[21851:0x58bbd80] 814673 ms: Scavenge 1988.4 (2079.6) -> 1981.1 (2080.9) MB, 4.7 / 0.0 ms (average mu = 0.795, current mu = 0.560) allocation failure;
[21851:0x58bbd80] 814712 ms: Scavenge 1988.8 (2080.9) -> 1981.5 (2096.9) MB, 5.4 / 0.0 ms (average mu = 0.795, current mu = 0.560) allocation failure;
<--- JS stacktrace --->
FATAL ERROR: Reached heap limit Allocation failed - JavaScript heap out of memory
1: 0xb7a940 node::Abort() [/home/oscar/.nvm/versions/node/v18.16.0/bin/node]
2: 0xa8e823 [/home/oscar/.nvm/versions/node/v18.16.0/bin/node]
3: 0xd5c940 v8::Utils::ReportOOMFailure(v8::internal::Isolate*, char const*, bool) [/home/oscar/.nvm/versions/node/v18.16.0/bin/node]
4: 0xd5cce7 v8::internal::V8::FatalProcessOutOfMemory(v8::internal::Isolate*, char const*, bool) [/home/oscar/.nvm/versions/node/v18.16.0/bin/node]
5: 0xf3a3e5 [/home/oscar/.nvm/versions/node/v18.16.0/bin/node]
6: 0xf4c8cd v8::internal::Heap::CollectGarbage(v8::internal::AllocationSpace, v8::internal::GarbageCollectionReason, v8::GCCallbackFlags) [/home/oscar/.nvm/versions/node/v18.16.0/bin/node]
7: 0xf26fce v8::internal::HeapAllocator::AllocateRawWithLightRetrySlowPath(int, v8::internal::AllocationType, v8::internal::AllocationOrigin, v8::internal::AllocationAlignment) [/home/oscar/.nvm/versions/node/v18.16.0/bin/node]
8: 0xf28397 v8::internal::HeapAllocator::AllocateRawWithRetryOrFailSlowPath(int, v8::internal::AllocationType, v8::internal::AllocationOrigin, v8::internal::AllocationAlignment) [/home/oscar/.nvm/versions/node/v18.16.0/bin/node]
9: 0xf0956a v8::internal::Factory::NewFillerObject(int, v8::internal::AllocationAlignment, v8::internal::AllocationType, v8::internal::AllocationOrigin) [/home/oscar/.nvm/versions/node/v18.16.0/bin/node]
10: 0x12ce7af v8::internal::Runtime_AllocateInYoungGeneration(int, unsigned long*, v8::internal::Isolate*) [/home/oscar/.nvm/versions/node/v18.16.0/bin/node]
11: 0x16fb6b9 [/home/oscar/.nvm/versions/node/v18.16.0/bin/node]
real 13m35.492s
user 19m44.680s
sys 0m33.144s |
With pyright 1.1.300 I can run the full check in 7 minutes: $ time PYRIGHT_PYTHON_FORCE_VERSION=1.1.300 pyright sympy | tee pyright.txt
No configuration file found.
pyproject.toml file found at /home/oscar/current/active/sympy.
Loading pyproject.toml file at /home/oscar/current/active/sympy/pyproject.toml
Assuming Python version 3.11
Assuming Python platform Linux
Auto-excluding **/node_modules
Auto-excluding **/__pycache__
Auto-excluding **/.*
Searching for source files
Found 1517 source files
Emptying type cache to avoid heap overflow. Used 1888MB out of 2096MB.
Emptying type cache to avoid heap overflow. Used 1887MB out of 2096MB.
pyright 1.1.300
...
28040 errors, 48 warnings, 0 informations
Completed in 439.575sec
WARNING: there is a new pyright version available (v1.1.300 -> v1.1.349).
Please install the new version or set PYRIGHT_PYTHON_FORCE_VERSION to `latest` I suppose that the messages
explain why memory usage goes up to 2GB and drops down again. Maybe this cache emptying is not being handled as well in pyright 1.1.349 which might explain why it crashes after running out of memory. |
I did some bisecting to see where sympy became slow. It looks like a change between 1.1.307 and 1.1.308 caused a large perf regression: I tested with this code (and py.typed removed from the sympy site-packages folder) import sympy
def f():
sympy.acos(42) 1.1.307 Results:
1.1.308 Results:
|
@rchido, were you able to narrow the problem to a specific commit? |
Sorry but I didn't check individual commits. Just the difference between the two releases so far. |
It's this one: Previous commit timing:
6753a5d's timing:
|
@erictraut, I've been playing around with this to see what the difference was between those two changes. It seems like the newer change is just a lot more thorough in its evaluation of things. I added some instrumentation and found for the example, we:
I think this is because we go farther down the chain of subtypes to compute things. For instance, I put a filter in this function here to skip finding subtypes when the type comes from something in That completely changes the amount of time we spend computing types, actually making it faster than the previous change. Would that be a viable option? Have some way of making certain packages only explore subtypes to a certain level? (I actually implemented this in Pylance to see how it might work in practice). I don't think there's any obvious ways to fix the perf for every package though. It's really just computing the types for all of the untyped information in sympy that takes so long. The alternative, I think, would be to generate type stubs for sympy and ship them with Pylance. |
I'll submit a PR with the proposed change to see if that's an option or not. |
I added a bunch of logging around These are number of times the function is called: Post change
Pre change:
So then I did some investigation into when they start to differ, At Post change
Pre change
So I started debugging the difference and it comes to this function call:
That's where's a code difference (I mean that makes sense given the diff in output). The older code checks to see if a TypeVar is live or not. The newer code doesn't. That's where the difference comes in. But given my limited knowledge of what this is doing, that seemed on purpose to me, so not sure if I'm even going down the correct path to figure out why one was slower. |
I guess there's a little bit more information that I have. When I added logging for The majority of the output there is for an argument named Here's how I logged that output: function validateArgType(
argParam: ValidateArgTypeParams,
typeVarContext: TypeVarContext,
signatureTracker: UniqueSignatureTracker,
typeResult: TypeResult<FunctionType> | undefined,
options: ValidateArgTypeOptions
): ArgResult {
const result = validateArgTypeImpl(argParam, typeVarContext, signatureTracker, typeResult, options);
timingStats.addCallstack(argParam.paramName, argParam.paramType.category, argParam.argument.node?.id, result); // Function I added to console log callstacks
return result;
} I was wondering if it would be possible to cache the ArgResult? Maybe that doesn't work or isn't computable. |
If you see it being evaluated over and over again, that's because there's a loop present, and some types depend on others. We need to evaluate types until all of them "settle" to their final values. Until then, they are marked as "incomplete". Sympy's code contains many such loops (some of them doubly or triply nested) and many tricky dependencies between many variables whose types are not annotated. Pyright already aggressively caches the already-computed types for all parse nodes, but if the cached value is marked "incomplete", its value must be invalidated when some other node's type is updated within the loop. These are tricky issues to diagnose. I've spent an hour on this one already, but in the past, similar issues have required many hours of focused time to really understand the root cause and then formulate potential mitigations (or conclude that no such mitigation is required). |
There's a workaround we're going to ship with Pylance. I added stubs to the https://github.com/microsoft/python-type-stubs repository. If you download the https://github.com/microsoft/python-type-stubs/tree/main/stubs/sympy and put that tree into your Without stubs:
With stubs:
|
Thank you all for looking at this. I don't fully understand the suggestion from @rchiodo. Is it that pyright would add those stub files or that sympy could add them to achieve the attended effect? There has been some friction around adding type hints within sympy. It is possible that using stub files would be a more workable. I don't understand the implications of there being stub files in sympy as well as type hints in the code or stub files for sympy in pyright as well as type hints in the sympy codebase. |
You would add those stub files to your workspace typings folder. That's essentially what Pylance is doing (we have a folder we add to the list of places to search for stubs in). I believe sympy is working on adding types, but they haven't released in more than a year. These stubs should be a temporary workaround until sympy is fully typed. |
To be clear I am a sympy maintainer. I am asking if you are suggesting that we (sympy) should do something and/or to understand what the implications are of whatever pyright is planning to do in relation to what we are doing or might do in future. |
To be clear, Pyright isn't doing anything with relation to the stubs. The stubs won't every be as The stubs are really for Pylance. Pyright users can install them locally too (if perf of sympy is a concern), but the stubs are mostly there to make Pylance go faster. Longer term, I hope sympy becomes fully typed and we can just delete these extra type stubs. As a maintainer of sympy, it would be great if you could get sympy typed enough that Pyright is fast to analyze sympy. That'd probably be the point where we don't need these stubs anymore. |
Sorry I didn't answer this question:
Sympy can also take ownership of the stubs too. I believe it works if you simply add these pyi files to your source tree. The ones I generated were from the 1.12 release of sympy though. They don't match the dev branch. |
The main thing that makes Pyright/Pylance slow is unannotated return values. That's mostly what's in the stubs. Annotating all of the return values in sympy would make Pyright/Pylance a lot faster. |
You don't need to annotate every single symbol. You can
If needed, we could provide a way for you to find (3), so you can strategically add annotations to symbols that matter most. |
I agree but it is really hard to do. Symbolic computing is to some extent inherently weakly typed but also there are many historic design decisions in SymPy that make this harder. The big one is that in SymPy every different kind of symbolic expression node is a different Python class: In [1]: e1 = sin(x)
In [2]: e2 = e1 + 1
In [3]: e1
Out[3]: sin(x)
In [4]: e2
Out[4]: sin(x) + 1
In [5]: type(e1)
Out[5]: sin
In [6]: type(e2)
Out[6]: sympy.core.add.Add These types can also return different types from their constructors: In [7]: e3 = Add(x, x) # We try to construct an Add but we get a Mul
In [8]: e3
Out[8]: 2⋅x
In [9]: type(e3)
Out[9]: sympy.core.mul.Mul More discussion of this is in sympy/sympy#25103 and python/mypy#15182. |
I had a chance to look more deeply at this issue. I've tracked down the problem to a particular expression that spans about 750 lines in the |
Just in case everyone is not aware there was a SymPy PR (sympy/sympy#26528) related to this from @heejaechang that got merged. That probably means that there are differences between SymPy 1.12 vs current master relating to this issue. |
…ntry types if the tuple expression is embedded within another tuple, set, list, or dictionary expression. This addresses #7159.
This is included in pyright 1.1.365. |
Environment data
Code Snippet
Expected behavior
Hovering over the the variable x should show that x is a Symbol in no-time, or auto-completion should appear instantly when typing anywhere in the code above.
Actual behavior
It takes about 5 seconds to show the type information when hover with the cursor on x, and about 12 seconds to show any auto-completion.
The problem is using Symbol from sympy as all the other features seems to not run any delay. For example, the code below does not halt IntelliSense:
Logs
pylance-sympy-trace.log
The text was updated successfully, but these errors were encountered: