Skip to content

Comments

HDDS-6520. Refactor RDBTable#getRangeKVs()#3339

Closed
kaijchen wants to merge 3 commits intoapache:masterfrom
kaijchen:HDDS-6520
Closed

HDDS-6520. Refactor RDBTable#getRangeKVs()#3339
kaijchen wants to merge 3 commits intoapache:masterfrom
kaijchen:HDDS-6520

Conversation

@kaijchen
Copy link
Member

@kaijchen kaijchen commented Apr 24, 2022

What changes were proposed in this pull request?

Refactor RDBTable#getRangeKVs(), cleanup code and make getting prevKey more efficient.

Question

Why RDBTable#getRangeKVs() should return empty list if startKey is not found?
It makes getting all key-value pairs awkward.

Current

For example, if the keys in table are [10, 20, 30, 40, 50, 60, 70, 80], and we use count = 3 in each iteration.

  1. startKey = None, result = [10, 20, 30]
  2. startKey = 30, result = [30, 40, 50], (can't use 31 as startKey)
  3. startKey = 50, result = [50, 60, 70]
  4. startKey = 70, result = [70, 80]

And the total results is [10, 20, 30, 30, 40, 50, 50, 60, 70, 70, 80], note the startKey in each iteration is duplicated.

Approach 1

If we can seek for the next key if startKey does not exist, we can use 31 in the 2nd iteration and get [40, 50, 60].

  1. startKey = None, result = [10, 20, 30]
  2. startKey = 31, result = [40, 50, 60]
  3. startKey = 61, result = [70, 80]

Verified CI can pass if we remove the startKey exist check. #3339 (comment)

Approach 2

Or if we can change startKey to prevKey, the iterations will look more straightforward.

  1. prevKey = None, result = [10, 20, 30]
  2. prevKey = 30, result = [40, 50, 60]
  3. prevKey = 60, result = [70, 80]

However, this approach may require additional changes in the the codebase.

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-6520

How was this patch tested?

CI

@kaijchen
Copy link
Member Author

Cc @errose28 for review.

@kaijchen
Copy link
Member Author

kaijchen commented May 5, 2022

One question here: if startKey does not exist in table, can we start from the key after it?

Although it may break compatibility, current CI can be passed: https://github.com/kaijchen/ozone/actions/runs/2273344827

@kaijchen
Copy link
Member Author

kaijchen commented Jun 8, 2022

Refactor RDBTable#getRangeKVs(), cleanup code and make getting prevKey more efficient.

This refactor is no longer needed, since the original implementation is changed.

The question section should be addressed in another PR.

@kaijchen kaijchen closed this Jun 8, 2022
@kaijchen kaijchen deleted the HDDS-6520 branch March 14, 2023 05:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants