The response from DescribeTable is missing the ItemCount field. According to the DynamoDB documentation, ItemCount should contain:
The number of items in the specified table. DynamoDB updates this value approximately every six hours. Recent changes might not be reflected in this value.
In addition to base tables, indexes should also have this ItemCount estimate.
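To make the gap concrete, here is a minimal sketch of a check over a DescribeTable response. It assumes the response has the usual DynamoDB shape (a `Table` dict with optional `GlobalSecondaryIndexes` and `LocalSecondaryIndexes` lists); the helper name `missing_item_counts` is made up for illustration and is not part of any existing test.

```python
from typing import Any, Dict, List


def missing_item_counts(response: Dict[str, Any]) -> List[str]:
    """Return the names of the table and indexes that lack an ItemCount field.

    `response` is assumed to be the dict returned by a DescribeTable call.
    """
    table = response["Table"]
    missing = []
    # The base table description itself should carry ItemCount.
    if "ItemCount" not in table:
        missing.append(table.get("TableName", "<table>"))
    # Each GSI and LSI description should carry its own ItemCount too.
    for key in ("GlobalSecondaryIndexes", "LocalSecondaryIndexes"):
        for index in table.get(key, []):
            if "ItemCount" not in index:
                missing.append(index["IndexName"])
    return missing
```

An empty list would mean that the table and all of its indexes report ItemCount, regardless of how accurate the value is.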
Unfortunately, there is no obvious way to implement this feature efficiently in Scylla. We have a probabilistic mechanism for estimating the number of partitions in a set of sstables, but ItemCount counts individual rows, not partitions. Moreover, whatever estimate we get covers only the data on one node, so we will probably need to collect estimates from all nodes. It is unclear when to do that: periodically, or during the request? And what happens when one of the nodes is down?
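One possible shape for combining per-node numbers, purely as a sketch and not a proposal for the actual implementation: it assumes each node can report some row estimate (which, as noted, we do not have today), that every row is stored on exactly `replication_factor` nodes, and that a node that is down can be covered by its last known estimate. All names here are hypothetical.

```python
from typing import Dict, Optional


def estimate_item_count(
    per_node: Dict[str, Optional[int]],
    last_known: Dict[str, int],
    replication_factor: int,
) -> int:
    """Combine per-node row estimates into one cluster-wide estimate.

    per_node maps a node name to its current estimate, or None if the node
    did not answer. last_known holds the most recent estimate previously
    collected from each node (assumption: stale data is acceptable, since
    ItemCount is itself only updated periodically).
    """
    total = 0
    for node, estimate in per_node.items():
        if estimate is None:
            # Node is down: fall back to its last known value, or 0 if none.
            estimate = last_known.get(node, 0)
        total += estimate
    # Each row is counted once per replica, so divide by the replication factor.
    return total // replication_factor
```

For example, with three nodes reporting 300, 310 and "down" (last known 290) and a replication factor of 3, this yields 300. Whether such an approximation is acceptable is a separate question.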
It is also not clear to me (we'll need to find an answer to this) whether a probabilistic, approximate count is good enough, or whether this must be an accurate count of items (albeit one that reflects the table as of up to six hours ago).
We have xfailing tests for this feature: test_describe_table.py::test_describe_table_item_count, test_gsi.py::test_gsi_describe and test_lsi.py::test_lsi_describe_fields.