Count limits #86

Merged
dimaqq merged 6 commits into HENNGE:master on Aug 29, 2021

@Tinche Tinche commented Aug 24, 2021

This was motivated by our real-life use case. We sometimes need to limit counting to preserve latency and Dynamo resources. I tried using query_single_page but it doesn't support count queries, so I just added limit to count.
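For readers following along, here is a rough sketch of what a limited count might look like at the DynamoDB request level. It assumes the count maps onto a `Query` call with `Select` set to `COUNT` and the limit passed as `Limit`; whether aiodynamo builds its request this way has not been checked against this PR. The helper name `build_count_payload` and its parameters are hypothetical.

```python
from typing import Any, Dict, Optional


def build_count_payload(
    table: str,
    key_name: str,
    key_value: str,
    limit: Optional[int] = None,
    start_key: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: build a DynamoDB Query payload that only counts.

    Not aiodynamo's code; it only illustrates the request fields involved.
    """
    payload: Dict[str, Any] = {
        "TableName": table,
        # Match a single hash key; "#k" and ":v" are placeholders.
        "KeyConditionExpression": "#k = :v",
        "ExpressionAttributeNames": {"#k": key_name},
        "ExpressionAttributeValues": {":v": {"S": key_value}},
        # Ask DynamoDB for a count instead of the items themselves.
        "Select": "COUNT",
    }
    if limit is not None:
        # Limit caps the number of items evaluated by this request.
        payload["Limit"] = limit
    if start_key is not None:
        # Resume from where a previous page stopped.
        payload["ExclusiveStartKey"] = start_key
    return payload
```

Note that DynamoDB applies `Limit` to the items it evaluates, before any filter expression, so with a filter the returned count can be lower than the limit.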

Tinche commented Aug 24, 2021

The mypy failure is at src/aiodynamo/http/httpx.py:45: error: Argument "data" to "post" of "AsyncClient" has incompatible type "bytes"; expected "Dict[Any, Any]" which I haven't touched. Should probably be fixed separately.

dimaqq commented Aug 25, 2021

> The mypy failure is at src/aiodynamo/http/httpx.py:45: error: Argument "data" to "post" of "AsyncClient" has incompatible type "bytes"; expected "Dict[Any, Any]" which I haven't touched. Should probably be fixed separately.

Will be done in #87

@dimaqq dimaqq left a comment

Code-wise LGTM from me.

@ojii if the new functionality is reasonable.

dimaqq commented Aug 25, 2021

I've merged #87; if you rebase, the tests should all be green ❇️ :)

hk = HashKey("h", "k")
assert await client.count(table, hk, limit=1) == 0
await client.put_item(table, {"h": "k", "r": "0"})
for i in range(1, 20):
Contributor

what's the reason for repeating this 20 times?

Contributor Author

I wanted to call it more than a few times to ensure the limit parameter was being correctly applied. The number 20 is arbitrary.

@ojii ojii Aug 26, 2021

I like that idea, but I think the loop adds little value. Instead, how about testing these scenarios:

  • 1-item table, limit 1 (tests a full count despite the limit)
  • 2-item table, limit 1 (tests a single-page partial count)
  • table larger than the page size, limit larger than the page size but smaller than the item count (tests a multi-page partial count)

There are lots of other potential scenarios, but I feel like these are the most important. The main point I'm suggesting is to test specific scenarios, rather than the same one 20 times.
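As a rough illustration of those three scenarios, here is a self-contained sketch that runs them against an in-memory stand-in rather than DynamoDB. Everything in it is hypothetical: `make_fake_table`, `count_with_limit`, and the page size of 10 are made up for the example and are not the test or the implementation from this PR.

```python
from typing import Callable, Optional, Tuple

# One fake "count a page" call: (start, page_limit) -> (count, next_start).
# next_start is None once the table is exhausted.
Page = Tuple[int, Optional[int]]
CountPage = Callable[[int, Optional[int]], Page]


def make_fake_table(total_items: int, page_size: int) -> CountPage:
    """Hypothetical stand-in for a table that counts at most page_size items per call."""

    def count_page(start: int, page_limit: Optional[int]) -> Page:
        size = page_size if page_limit is None else min(page_size, page_limit)
        end = min(start + size, total_items)
        next_start = end if end < total_items else None
        return end - start, next_start

    return count_page


def count_with_limit(count_page: CountPage, limit: Optional[int] = None) -> Tuple[int, int]:
    """Sum page counts until the table is exhausted or the limit is reached.

    Returns (count, pages_fetched) so a test can check how many pages were read.
    """
    total, pages = 0, 0
    start: Optional[int] = 0
    while start is not None:
        remaining = None if limit is None else limit - total
        counted, start = count_page(start, remaining)
        total += counted
        pages += 1
        if limit is not None and total >= limit:
            break
    return total, pages


PAGE_SIZE = 10  # arbitrary page size for the fake table

# 1-item table, limit 1: full count despite the limit.
assert count_with_limit(make_fake_table(1, PAGE_SIZE), limit=1) == (1, 1)
# 2-item table, limit 1: single-page partial count.
assert count_with_limit(make_fake_table(2, PAGE_SIZE), limit=1) == (1, 1)
# 25 items, limit 15: partial count that needs two pages.
assert count_with_limit(make_fake_table(25, PAGE_SIZE), limit=15) == (15, 2)
```

Returning the number of pages fetched alongside the count is a design choice for the sketch only; it makes the single-page and multi-page cases distinguishable in the assertions.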

Contributor Author

Sounds good, but what is the page size? I know from experience on our production DynamoDB instances some tables would return around 10k items using a single query. I think it depends on the sizes of the items?

Contributor Author

Hm, I see

async def test_query_with_limit(client: Client, high_throughput_table: TableName):

is already dealing with pagination; maybe I can use this approach for the test.
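On the page-size question above: as far as I know, DynamoDB caps the data read by a single Query call at 1 MB, so the number of items per page does depend on item size. A back-of-the-envelope sketch follows, assuming the cap is 1024 * 1024 bytes and that every item has the same size; real item-size accounting also includes attribute names, so treat the numbers as rough. The function names are made up for the example.

```python
PAGE_BYTES = 1024 * 1024  # assumed 1 MB cap on data read per Query call


def items_per_page(avg_item_bytes: int, page_bytes: int = PAGE_BYTES) -> int:
    """Rough number of equally sized items that fit in one page."""
    if avg_item_bytes <= 0:
        raise ValueError("avg_item_bytes must be positive")
    # An oversized item still occupies a page of its own in this model.
    return max(1, page_bytes // avg_item_bytes)


def items_to_span_pages(avg_item_bytes: int, pages: int) -> int:
    """Smallest item count that forces a full read onto at least `pages` pages."""
    if pages < 1:
        raise ValueError("pages must be at least 1")
    return items_per_page(avg_item_bytes) * (pages - 1) + 1


print(items_per_page(100))          # 10485
print(items_to_span_pages(100, 2))  # 10486
```

With items of roughly 100 bytes this gives about 10k items per page, which matches the figure mentioned above; a multi-page test therefore needs either many small items or fewer large ones.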

Contributor Author

@ojii Test reworked!

ojii commented Aug 26, 2021

> This was motivated by our real-life use case. We sometimes need to limit counting to preserve latency and Dynamo resources. I tried using query_single_page but it doesn't support count queries, so I just added limit to count.

Would a count_single_page be useful for your use case?

Tinche commented Aug 26, 2021

> > This was motivated by our real-life use case. We sometimes need to limit counting to preserve latency and Dynamo resources. I tried using query_single_page but it doesn't support count queries, so I just added limit to count.
>
> Would a count_single_page be useful for your use case?

Yes, had this been available I would have used it. On a high level though, what I really want is to count with a limit so that's what I implemented.

@Tinche Tinche requested a review from ojii August 27, 2021 22:54

@dimaqq dimaqq left a comment

Maybe I'll slap limit onto the docs after merge :clever:

@@ -745,6 +747,7 @@ async def count(
         start_key: Optional[Dict[str, Any]] = None,
         filter_expression: Optional[Condition] = None,
         index: Optional[str] = None,
+        limit: Optional[int] = None,
Contributor

Documentation needs to be updated :)

@dimaqq dimaqq merged commit 6b5c0b2 into HENNGE:master Aug 29, 2021
@dimaqq dimaqq mentioned this pull request Aug 29, 2021