Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: Query options were not respecting use_cache #873

Merged
merged 1 commit into from
Feb 27, 2023

Conversation

rwhogg
Copy link
Contributor

@rwhogg rwhogg commented Feb 22, 2023

In certain circumstances, we were not respecting use_cache for queries, unlike legacy NDB, which is quite emphatic about supporting them.
(See https://github.com/GoogleCloudPlatform/datastore-ndb-python/blob/59cb209ed95480025d26531fc91397575438d2fe/ndb/query.py#L186-L187)

In #613 we tried to match legacy NDB behavior by updating the cache using the results of queries. We still do that, but now we respect use_cache, which was a valid keyword argument for Query.fetch() and friends, but was not passed down to the context cache when needed.

As a result, the cache could mysteriously accumulate lots of memory usage and perhaps even cause you to hit memory limits, even if it was seemingly disabled and it didn't look like there were any objects holding references to your query results.
This is a problem for certain batch-style workloads where you know you're only interested in processing a certain entity once.

Fixes #752

@rwhogg rwhogg requested review from a team as code owners February 22, 2023 21:35
@product-auto-label product-auto-label bot added the api: datastore Issues related to the googleapis/python-ndb API. label Feb 22, 2023
In certain circumstances, we were not respecting use_cache
for queries, unlike legacy NDB, which is quite emphatic
about supporting them.
(See https://github.com/GoogleCloudPlatform/datastore-ndb-python/blob/59cb209ed95480025d26531fc91397575438d2fe/ndb/query.py#L186-L187)

In googleapis#613 we tried to match legacy NDB behavior by updating the cache
using the results of queries. We still do that, but now we respect
use_cache, which was a valid keyword argument for Query.fetch()
and friends, but was not passed down to the context cache when
needed.

As a result, the cache could mysteriously accumulate lots of memory
usage and perhaps even cause you to hit memory limits, even if it was
seemingly disabled and it didn't look like there were any objects
holding references to your query results.
This is a problem for certain batch-style workloads where you know
you're only interested in processing a certain entity once.

Fixes googleapis#752
@rwhogg rwhogg added the owlbot:run Add this label to trigger the Owlbot post processor. label Feb 24, 2023
@gcf-owl-bot gcf-owl-bot bot removed the owlbot:run Add this label to trigger the Owlbot post processor. label Feb 24, 2023
@rwhogg rwhogg added the automerge Merge the pull request once unit tests and other checks pass. label Feb 27, 2023
@rwhogg rwhogg merged commit 802d88d into googleapis:main Feb 27, 2023
@gcf-merge-on-green gcf-merge-on-green bot removed the automerge Merge the pull request once unit tests and other checks pass. label Feb 27, 2023
@rwhogg rwhogg deleted the query-options branch February 27, 2023 16:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api: datastore Issues related to the googleapis/python-ndb API.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Datastore: (using NDB) Linear memory usage consumption when using fetch_page
3 participants