Support lazy loading individual virtualenvs in a multi-root workspace #6009

Closed
pcasdf opened this issue Jun 13, 2024 · 16 comments
Labels: enhancement, fixed in next version (main), needs repro

Comments

@pcasdf

pcasdf commented Jun 13, 2024

My company uses a monorepo and multi-root VS Code workspace with over 40 Python projects. On initial startup of VS Code, the Python extension takes a ton of CPU and memory because it starts indexing every virtualenv. I really wish there were a config that allows us to choose to only begin indexing a virtualenv when a file within that project / sub-root is opened.

@karthiknadig karthiknadig transferred this issue from microsoft/vscode-python Jun 13, 2024
@github-actions github-actions bot added the needs repro Issue has not been reproduced yet label Jun 13, 2024
@heejaechang
Contributor

Good idea. This is something I will do soon.

@heejaechang heejaechang added the enhancement New feature or request label Jun 13, 2024
@bschnurr bschnurr added the fixed in next version (main) A fix has been implemented and will appear in an upcoming version label Jun 20, 2024
@rchiodo
Contributor

rchiodo commented Jun 25, 2024

This issue has been fixed in prerelease version 2024.6.102, which we've just released. You can find the changelog here: CHANGELOG.md

@rchiodo rchiodo closed this as completed Jun 25, 2024
@pcasdf
Author

pcasdf commented Jun 25, 2024

🙏 Y'all rock! Thank you so much!

@rchiodo
Contributor

rchiodo commented Jun 25, 2024

I should explain how to use it. Well actually @heejaechang would be better at doing that. He implemented it :)

@pcasdf
Author

pcasdf commented Jun 25, 2024

Was just about to ask, since it doesn't seem to be the default -- I don't see the setting under python.* either.

@heejaechang
Contributor

Hi pcasdf. There is no setting; it is the new default behavior. Previously, we postponed indexing third-party libraries until a file from the workspace was opened, but we didn't do that for user files. Now we postpone indexing of all files (third-party or user) until a file is opened in that workspace.

@pcasdf
Author

pcasdf commented Jun 25, 2024

@heejaechang is it intended that given this structure:

root
- project a
- - main.py
- project b
- project c

if I open main.py in project a, I should see indexing begin for project b and c as well? That was the behavior that I described originally, which seems to still occur.

@pcasdf
Author

pcasdf commented Jun 25, 2024

To clarify, my hope is that I can open a file in project a, and only then will project a begin to index, and project b and c should not try to index until I open a file under their tree.
Either way, the change that you made is still a huge improvement for us, because not all of our devs even touch Python files, but we share a single workspace among everyone. So this new default behavior is appreciated 🙏

@heejaechang
Contributor

Are you using a multi-root workspace in the sense of VS Code's multi-root workspace support, or do you mean something different?

If you have a multi-root workspace where the roots are

project a
    main.py
project b
project c

then opening main.py won't cause project b or c to start indexing.

If you have one workspace folder as the root, with project a/b/c as plain folders under it, and call that a multi-root workspace, the current change won't do anything.
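
For anyone unsure which of the two cases applies, here is a minimal sketch of the difference. It assumes the three folder names from the example above; the helper names `build_workspace` and `is_multi_root` are hypothetical and only for illustration. In a VS Code multi-root workspace, each project is its own entry under `folders` in the `.code-workspace` file, whereas a single root lists only the parent folder.

```python
import json


def build_workspace(project_dirs):
    """Return the dict form of a VS Code .code-workspace file.

    Each entry in "folders" is a separate workspace root.
    """
    return {
        "folders": [{"path": p} for p in project_dirs],
        "settings": {},
    }


def is_multi_root(workspace):
    """True when the workspace file declares more than one root folder."""
    return len(workspace.get("folders", [])) > 1


# Multi-root: every project is listed as its own root.
multi_root = build_workspace(["project a", "project b", "project c"])

# Single root: one parent folder; the projects are just subdirectories of it.
single_root = build_workspace(["."])

if __name__ == "__main__":
    print(json.dumps(multi_root, indent=2))
    print("multi-root?", is_multi_root(multi_root))
    print("multi-root?", is_multi_root(single_root))
```

Going by the explanation above, only the first layout would get per-project deferred indexing.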

@pcasdf
Author

pcasdf commented Jun 25, 2024

We do use a multi-root project with a structure similar to:

root
project a
project b
project c

But I also removed root, leaving it as:

project a
project b
project c

And I see output similar to:

2024-06-25 22:48:39.277 [info] [Info  - 10:48:39 PM] (202134) Starting service instance "a"
2024-06-25 22:48:39.277 [info] [Info  - 10:48:39 PM] (202134) Starting service instance "b"
2024-06-25 22:48:39.286 [info] [Info  - 10:48:39 PM] (202134) Starting service instance "c"
2024-06-25 22:48:40.554 [info] [Info  - 10:48:40 PM] (202134) Background analysis(24) root directory: file:///home/discord/.vscode-server/extensions/ms-python.vscode-pylance-2024.6.102%2Bdiscord.3/dist
2024-06-25 22:48:40.555 [info] [Info  - 10:48:40 PM] (202134) Background analysis(24) started
2024-06-25 22:48:40.567 [info] [Info  - 10:48:40 PM] (202134) Background analysis(1) root directory: file:///home/discord/.vscode-server/extensions/ms-python.vscode-pylance-2024.6.102%2Bdiscord.3/dist
2024-06-25 22:48:40.571 [info] [Info  - 10:48:40 PM] (202134) Background analysis(1) started
2024-06-25 22:48:40.575 [info] [Info  - 10:48:40 PM] (202134) Background analysis(3) root directory: file:///home/discord/.vscode-server/extensions/ms-python.vscode-pylance-2024.6.102%2Bdiscord.3/dist
2024-06-25 22:48:40.575 [info] [Info  - 10:48:40 PM] (202134) Background analysis(23) root directory: file:///home/discord/.vscode-server/extensions/ms-python.vscode-pylance-2024.6.102%2Bdiscord.3/dist
2024-06-25 22:48:40.577 [info] [Info  - 10:48:40 PM] (202134) Background analysis(23) started
2024-06-25 22:48:40.577 [info] [Info  - 10:48:40 PM] (202134) Background analysis(3) started
2024-06-25 22:48:40.579 [info] [Info  - 10:48:40 PM] (202134) Background analysis(16) root directory: file:///home/discord/.vscode-server/extensions/ms-python.vscode-pylance-2024.6.102%2Bdiscord.3/dist
2024-06-25 22:48:40.580 [info] [Info  - 10:48:40 PM] (202134) Background analysis(20) root directory: file:///home/discord/.vscode-server/extensions/ms-python.vscode-pylance-2024.6.102%2Bdiscord.3/dist
2024-06-25 22:48:42.932 [info] [Info  - 10:48:42 PM] (202134) Setting pythonPath for service "a": "/home/discord/.virtualenvs/a/bin/python"
2024-06-25 22:48:42.932 [info] [Info  - 10:48:42 PM] (202134) Setting environmentName for service "a": "3.7.17 (a venv)"
2024-06-25 22:48:42.932 [info] [Info  - 10:48:42 PM] (202134) No include entries specified; assuming /home/discord/discord/a
2024-06-25 22:48:42.932 [info] [Info  - 10:48:42 PM] (202134) Auto-excluding **/node_modules
2024-06-25 22:48:42.932 [info] [Info  - 10:48:42 PM] (202134) Auto-excluding **/__pycache__
2024-06-25 22:48:42.932 [info] [Info  - 10:48:42 PM] (202134) Auto-excluding **/.*
2024-06-25 22:48:42.932 [info] [Info  - 10:48:42 PM] (202134) Assuming Python version 3.7.17.final.0
2024-06-25 22:48:42.932 [info] [Info  - 10:48:42 PM] (202134) Found 441 source files
2024-06-25 22:48:44.091 [info] [Info  - 10:48:44 PM] (202134) Setting pythonPath for service "b": "/home/discord/.virtualenvs/b/bin/python"
2024-06-25 22:48:44.093 [info] [Info  - 10:48:44 PM] (202134) Setting environmentName for service "b": "3.11.9 (b venv)"
2024-06-25 22:48:44.093 [info] [Info  - 10:48:44 PM] (202134) No include entries specified; assuming /home/discord/discord/b
2024-06-25 22:48:44.094 [info] [Info  - 10:48:44 PM] (202134) Auto-excluding **/node_modules
2024-06-25 22:48:44.094 [info] [Info  - 10:48:44 PM] (202134) Auto-excluding **/__pycache__
2024-06-25 22:48:44.094 [info] [Info  - 10:48:44 PM] (202134) Auto-excluding **/.*
2024-06-25 22:48:44.094 [info] [Warn  - 10:48:44 PM] (202134) stubPath file:///home/discord/discord/b/stubs is not a valid directory.
2024-06-25 22:48:44.094 [info] [Info  - 10:48:44 PM] (202134) Assuming Python version 3.11.9.final.0
2024-06-25 22:48:44.094 [info] [Info  - 10:48:44 PM] (202134) Found 6443 source files
2024-06-25 22:48:44.620 [info] [Info  - 10:48:44 PM] (202134) Setting pythonPath for service "c": "/home/discord/.virtualenvs/c/bin/python"
2024-06-25 22:48:44.620 [info] [Info  - 10:48:44 PM] (202134) Setting environmentName for service "c": "3.11.9 (c-env venv)"
2024-06-25 22:48:44.621 [info] [Info  - 10:48:44 PM] (202134) No include entries specified; assuming /home/discord/discord/discord_clyde
2024-06-25 22:48:44.621 [info] [Info  - 10:48:44 PM] (202134) Auto-excluding **/node_modules
2024-06-25 22:48:44.621 [info] [Info  - 10:48:44 PM] (202134) Auto-excluding **/__pycache__
2024-06-25 22:48:44.621 [info] [Info  - 10:48:44 PM] (202134) Auto-excluding **/.*
2024-06-25 22:48:44.621 [info] [Info  - 10:48:44 PM] (202134) Assuming Python version 3.11.9.final.0
2024-06-25 22:48:44.621 [info] [Info  - 10:48:44 PM] (202134) Found 876 source files

Maybe I misunderstand how the indexing works, but this step usually seems to take a lot of CPU.

I do see now a separate step when I open a file under a separate project:

2024-06-25 22:51:15.762 [info] [Info  - 10:51:15 PM] (202134) [IDX(53)] Long operation: index execution environment file:///home/discord/repo/a (3975ms)
2024-06-25 22:51:15.798 [info] [Info  - 10:51:15 PM] (202134) [IDX(53)] Long operation: index packages file:///home/discord/repo/a (4058ms)
2024-06-25 22:51:15.798 [info] [Info  - 10:51:15 PM] (202134) indexed(53) 258 files over 1 exec env
2024-06-25 22:51:15.887 [info] [Info  - 10:51:15 PM] (202134) Indexing finished(53).
2024-06-25 22:51:17.082 [info] [Info  - 10:51:17 PM] (202134) [BG(5)] Long operation: checking: file:///home/discord/repo/a/a/a/user.py (6362ms)
2024-06-25 22:51:17.083 [info] [Info  - 10:51:17 PM] (202134) [BG(5)] Long operation: analyzing: file:///home/discord/repo/a/a/a/user.py (6881ms)

So is this where the actual indexing occurs?

@pcasdf
Author

pcasdf commented Jun 25, 2024

I think I understand now that that's where the indexing occurs. Sorry for my confusion!

@heejaechang
Contributor

Yes, the second one is where indexing started. So if you are saying the first one takes a lot of CPU, something else could be taking the time. Also, you can set python.analysis.indexing: false to see whether the CPU issue still persists.
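
To tell the two phases apart in a longer log, here is a rough sketch that sorts Pylance output lines into "setup" and "indexing". The markers are taken only from the log excerpts posted in this thread, so treat them as an assumption rather than a documented log format. The function name `classify` is hypothetical.

```python
import re

# Markers seen on the indexing lines quoted in this thread (an assumption,
# not an official log format).
IDX_PATTERN = re.compile(r"\[IDX\(\d+\)\]|indexed\(\d+\)|Indexing finished\(\d+\)")

# Markers seen on the per-service startup lines quoted in this thread.
SETUP_PATTERN = re.compile(
    r"Starting service instance|Setting pythonPath|Found \d+ source files"
)


def classify(line):
    """Label a log line as 'indexing', 'setup', or 'other'."""
    if IDX_PATTERN.search(line):
        return "indexing"
    if SETUP_PATTERN.search(line):
        return "setup"
    return "other"


sample = [
    '(202134) Starting service instance "a"',
    "(202134) Found 441 source files",
    "(202134) [IDX(53)] Long operation: index packages file:///home/discord/repo/a (4058ms)",
    "(202134) Indexing finished(53).",
    "(202134) [BG(5)] Long operation: checking: file:///home/discord/repo/a/a/a/user.py (6362ms)",
]

if __name__ == "__main__":
    for line in sample:
        print(classify(line), "|", line)
```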

@heejaechang
Contributor

> I think I understand now that that's where the indexing occurs. Sorry for my confusion!

Don't worry about it. If there is another perf (CPU) issue outside of indexing, can you provide us with some logs so we can take a look at what is going on?

https://github.com/microsoft/pylance-release/wiki/Collecting-data-for-an-investigation.#collecting-cpuprofiles

Basically, the steps are:

  1. Start VS Code and open the multi-root workspace as you normally do.
  2. Wait until VS Code goes idle (make sure Pylance is loaded; you can create an untitled Python file to make that happen).
  3. Open the workspace settings JSON file.
  4. Invoke the Pylance start profiling command.
  5. Change something in the settings file, such as adding python.analysis.indexing: false, and save.
  6. Wait until VS Code goes idle.
  7. Invoke the Pylance stop profiling command.
  8. Provide us with the *.cpuprofile files created by Pylance (a message box should tell you where those files are).

It would help us a lot in finding out where CPU is being used in the part you mentioned.
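
As an illustration of the settings change in step 5, here is a small sketch that flips `python.analysis.indexing` in the `settings` block of a workspace file. It assumes the workspace file is plain JSON without comments, since `json.loads` would reject a file containing them. The helper names `set_indexing` and `update_workspace_file` are hypothetical. Editing the file by hand in VS Code works just as well.

```python
import json
from pathlib import Path


def set_indexing(workspace, enabled):
    """Return a copy of the workspace dict with python.analysis.indexing set."""
    updated = dict(workspace)
    settings = dict(updated.get("settings", {}))
    settings["python.analysis.indexing"] = enabled
    updated["settings"] = settings
    return updated


def update_workspace_file(path, enabled):
    """Rewrite a comment-free .code-workspace file with the new setting."""
    path = Path(path)
    workspace = json.loads(path.read_text(encoding="utf-8"))
    updated = set_indexing(workspace, enabled)
    path.write_text(json.dumps(updated, indent=2) + "\n", encoding="utf-8")


if __name__ == "__main__":
    example = {"folders": [{"path": "project a"}, {"path": "project b"}]}
    print(json.dumps(set_indexing(example, False), indent=2))
```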

@pcasdf
Author

pcasdf commented Jun 26, 2024

I see now that indexing appears to complete fairly quickly, even in a large 7k Python file sub-project, if VS Code has already loaded all extensions and Pylance has fully loaded and reached the idle state. But when opening a file after the Python extension is just beginning to load, it appears that the background analysis tasks may be blocking before allowing indexing to begin. This is only a problem for us since we have so many projects, or "service instances" as they're named in the output. When we start VS Code and open our first Python file of that session, we usually have to wait about 1-2 minutes before Intellisense begins to work. Afterwards, opening files in other projects is fairly fast, even for those that hadn't been indexed yet. This isn't a terrible experience, but I would love to see a speed up in that initial time from start up to working Intellisense.

Here are two profiles. The first one was after disabling indexing, and the second is after enabling indexing.
pyright-cpuprofile.tar.gz
pyright-cpuprofile-2.tar.gz

Edit:
After actually timing it, it appears to only take ~30 seconds from initial load to completing indexing of the opened file. Sorry, I think I might have been wasting your time 😓 thank you for all the help!

@heejaechang
Contributor

No worries. Thank you for providing the data!

@heejaechang
Contributor

Found the issue. It is a duplicate of #6046.
