
Archive: Respect filter_size in query for existing nodes #6404

Merged: 1 commit into aiidateam:main from fix/6402/archive-import-filter-size on May 21, 2024

Conversation

@sphuber (Contributor) commented on May 20, 2024

Fixes #6402

The `QueryParams` dataclass defines the `filter_size` attribute, which is used in all queries to limit the number of parameters bound in a single query. Without it, large archives would produce queries with so many parameters that some database backends raise an exception; SQLite, for example, defines a limit of 1000 by default.
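To illustrate the failure mode, the standalone sketch below (not code from the PR; the table name is made up) binds one parameter per UUID in a single `IN` clause, which exceeds SQLite's variable limit and raises the error reported in #6402:

```python
import sqlite3

# In-memory database with a made-up table standing in for the node table.
connection = sqlite3.connect(':memory:')
connection.execute('CREATE TABLE node (uuid TEXT)')

# One bound parameter per UUID: far more than SQLite's default limit allows.
uuids = [str(index) for index in range(100_000)]
placeholders = ', '.join('?' for _ in uuids)

try:
    connection.execute(f'SELECT uuid FROM node WHERE uuid IN ({placeholders})', uuids)
except sqlite3.OperationalError as exception:
    print(exception)  # too many SQL variables
```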

The `aiida.tools.archive._import_nodes` function was not respecting this setting when determining which nodes from the archive already exist in the target storage. This would result in an exception when importing a large archive into an SQLite-backed storage. The problem is fixed by using the `batch_iter` utility to retrieve the existing UUIDs in batches of size `filter_size`.
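The sketch below shows the batching pattern in simplified form; the actual fix uses the `batch_iter` utility and the `QueryBuilder` inside aiida-core, so the helper and the `query_existing` callable here are illustrative stand-ins, not the real signatures:

```python
from collections.abc import Callable, Iterable, Iterator
from itertools import islice


def batch_iter(iterable: Iterable[str], size: int) -> Iterator[list[str]]:
    """Yield successive lists of at most ``size`` items (simplified stand-in)."""
    iterator = iter(iterable)
    while batch := list(islice(iterator, size)):
        yield batch


def find_existing_uuids(
    query_existing: Callable[[list[str]], set[str]],
    archive_uuids: Iterable[str],
    filter_size: int = 999,
) -> set[str]:
    """Return the archive UUIDs already present in the target storage.

    ``query_existing`` stands in for a single query filtering on a batch of
    UUIDs; it is called once per batch, so no query binds more than
    ``filter_size`` parameters.
    """
    existing: set[str] = set()
    for batch in batch_iter(archive_uuids, filter_size):
        existing.update(query_existing(batch))
    return existing
```

Each batch issues one query, so the number of queries grows with the archive size, but every individual query stays under the backend's parameter limit.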

@sphuber requested a review from @GeigerJ2 on May 20, 2024 14:27
@sphuber (Contributor, Author) commented on May 20, 2024

With this fix, I have successfully imported the MC3D archive into a core.sqlite_dos profile.

@sphuber force-pushed the fix/6402/archive-import-filter-size branch from 26c4c2a to b81ed20 on May 20, 2024 18:29
@GeigerJ2 (Contributor) left a comment


I also just tried it, and it worked for me as well, so this seems ready to go.

@sphuber merged commit ef60b66 into aiidateam:main on May 21, 2024
18 of 19 checks passed
@sphuber deleted the fix/6402/archive-import-filter-size branch on May 21, 2024 12:19
mikibonacci pushed a commit to mikibonacci/aiida-core that referenced this pull request Sep 3, 2024
…#6404)


Successfully merging this pull request may close these issues.

(sqlite3.OperationalError) too many SQL variables exception