Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

AutoImport.generate_modules_cache can be speeded up by 2x #722

Closed
tkrabel opened this issue Nov 1, 2023 · 1 comment · Fixed by #723
Closed

AutoImport.generate_modules_cache can be speeded up by 2x #722

tkrabel opened this issue Nov 1, 2023 · 1 comment · Fixed by #723
Labels
bug Unexpected or incorrect user-visible behavior
Milestone

Comments

@tkrabel
Copy link
Contributor

tkrabel commented Nov 1, 2023

Describe the bug

I played around with AutoImport.generate_modules_cache, as it is perceived to be slow. I logged what packages get imported from a conda env I created and linked to the project, and I seems there are many unnecessary duplicate entries added to the database.

You can see the imports in this file: sorted_packages_with_site_packages.txt.

I see the following pattern of duplicate entries in that file.

Name(name=<import_name>, modname=<mod_name>, package=<package>, ...)
Name(name=<import_name>, modname="site-packages."<mod_name>, package='site-packages', ...)

One example of a duplicate:

Name(name='BaseName', modname='jedi.api.classes', package='jedi', source=<Source.SITE_PACKAGE: 4>, name_type=<NameType.Class: 7>)
Name(name='BaseName', modname='site-packages.jedi.api.classes', package='site-packages', source=<Source.SITE_PACKAGE: 4>, name_type=<NameType.Class: 7>)

From that, it seems the issue is that we don't exclude the top level site-packages directory itself from our search tree, which treats it as its own package and hence every package inside of it is double counted.

To Reproduce

  1. Change code in sqlite.py
diff --git a/rope/contrib/autoimport/sqlite.py b/rope/contrib/autoimport/sqlite.py
index eb7c27de..42447edc 100644
--- a/rope/contrib/autoimport/sqlite.py
+++ b/rope/contrib/autoimport/sqlite.py
@@ -371,6 +371,7 @@ class AutoImport:
             return
         self._add_packages(packages)
         job_set = task_handle.create_jobset("Generating autoimport cache", 0)
+        end_names = []
         if single_thread:
             for package in packages:
                 for module in get_files(package, underlined):
@@ -383,9 +384,11 @@ class AutoImport:
                 get_future_names(packages, underlined, job_set)
             ):
                 self._add_names(future_name.result())
+                end_names.append(future_name.result())
                 job_set.finished_job()
 
         self.connection.commit()
+        return end_names
 
     def _get_packages_from_modules(self, modules: List[str]) -> Iterator[Package]:
         for modname in modules:
  1. Run the code from an env that has the changes from (1) applied
from rope.base.project import Project
from rope.contrib.autoimport.sqlite import AutoImport

project = Project(".")
autoimport = AutoImport(project, memory=True)
autoimport.generate_modules_cache()
  1. Look at the result
@tkrabel tkrabel added the bug Unexpected or incorrect user-visible behavior label Nov 1, 2023
@tkrabel-db
Copy link

We need to also address this comment: #723 (comment)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Unexpected or incorrect user-visible behavior
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants