
Add option preserve_internal_time_seconds to preserve the time info #10747

Closed · wants to merge 14 commits

Conversation

jay-zhuang (Contributor) commented Sep 27, 2022:

Add option preserve_internal_time_seconds to preserve the internal
time information.
It's mostly for migrating existing data to tiered storage (
preclude_last_level_data_seconds). When the tiering feature is first
enabled, the existing data won't have the time information needed to
decide if it's hot or cold. Enabling this feature starts collecting and
preserving the time information for new data.
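The two-phase migration described above can be sketched as follows. This is an illustrative, self-contained sketch: `MigrationOptions`, `MigrationPhase1`, and `MigrationPhase2` are made-up names standing in for the relevant RocksDB column family options; only the two option field names mirror this PR.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for the relevant fields of RocksDB's column family options.
// The field names mirror the options discussed in this PR.
struct MigrationOptions {
  uint64_t preserve_internal_time_seconds = 0;
  uint64_t preclude_last_level_data_seconds = 0;
};

constexpr uint64_t kSevenDays = 7ULL * 24 * 60 * 60;

// Phase 1: start collecting/preserving time info; tiering stays off.
MigrationOptions MigrationPhase1() {
  MigrationOptions opts;
  opts.preserve_internal_time_seconds = kSevenDays;
  opts.preclude_last_level_data_seconds = 0;  // tiering not enabled yet
  return opts;
}

// Phase 2: once new data carries time info, enable tiering.
MigrationOptions MigrationPhase2() {
  MigrationOptions opts = MigrationPhase1();
  opts.preclude_last_level_data_seconds = kSevenDays;
  return opts;
}
```

The point of phase 1 is that by the time phase 2 turns tiering on, recently written data already carries the time information needed to judge its temperature.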

jay-zhuang force-pushed the timetracking branch 2 times, most recently from 47e0f50 to 4fceba5 (September 28, 2022 22:25)
jay-zhuang changed the title from "Add option track_internal_time_seconds to enable internal time tracking" to "Add option preserve_internal_time_seconds to preserve the time info" on Sep 28, 2022
jay-zhuang marked this pull request as ready for review September 28, 2022 23:08
facebook-github-bot commented: @jay-zhuang has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

jay-zhuang requested a review from siying October 4, 2022 18:26
HISTORY.md Outdated
@@ -6,6 +6,9 @@
* Fixed an optimistic transaction validation bug caused by DBImpl::GetLatestSequenceForKey() returning non-latest seq for merge (#10724).
* Fixed a bug in iterator refresh which could segfault for DeleteRange users (#10739).

### New Features
* Add option `preserve_internal_time_seconds` for tiered storage migration, which starts collecting and preserving the time information for new data but doesn't enable the tiering feature (`preclude_last_level_data_seconds`) yet. The time information can be used to decide the temperature of the existing data when tiering is enabled later.
Contributor:

preserve_internal_time_seconds can potentially be used for more accurate oldest ancestor time estimation, right? So maybe explain it in a broader way.

Contributor Author:

The feature hasn't been implemented yet (we abandoned #8185 because, based on tests, there's limited improvement with a random dataset).

Contributor:

I know. I mean just be vague about what the feature does, and not limit it to tiered storage. Even now, the feature can be used to manually check the age of keys.

uint64_t preserve_time_duration =
    c->immutable_options()->preclude_last_level_data_seconds == 0
        ? c->immutable_options()->preserve_internal_time_seconds
        : c->immutable_options()->preclude_last_level_data_seconds;
Contributor:

Should it be max(preserve_internal_time_seconds, preclude_last_level_data_seconds)?

Contributor Author:

Yeah, I was thinking about that too, but I'm only slightly leaning toward having "preclude_last_level" override "preserve_time":

  1. For simplification, so only one number is in effect.
  2. Implementation-wise it's also simpler: only one number needs to be stored in compaction_job.

But the main question is: is there a use case where the user wants to set these 2 numbers differently?
Currently I think the feature is still part of "preclude_last_level".

Enabling time tracking may improve the oldest_ancestor_time estimation, but that feature hasn't been implemented, and for a normal dataset it doesn't improve the estimation much based on some simulator tests (that's one of the reasons we abandoned the original PR #8185; currently we hope RoundRobin can fundamentally fix the TTL compaction issue).

Contributor:

Even if we only limit the feature to tiered storage (which I would argue we shouldn't), I assume we would leave room for preserve_internal_time_seconds > preclude_last_level_data_seconds, in case users want to preserve the option of increasing preclude_last_level_data_seconds later and hope it takes effect sooner.
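The two alternatives being debated can be sketched side by side. This is a self-contained illustration with plain integers standing in for the immutable options; the function names are made up for the sketch, not RocksDB code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Policy as posted in the PR: a non-zero preclude_last_level_data_seconds
// overrides preserve_internal_time_seconds entirely.
uint64_t OverridePolicy(uint64_t preserve, uint64_t preclude) {
  return preclude == 0 ? preserve : preclude;
}

// Alternative raised in review: take the max, so a larger preserve
// setting still wins even after tiering is enabled.
uint64_t MaxPolicy(uint64_t preserve, uint64_t preclude) {
  return std::max(preserve, preclude);
}
```

The two only disagree when both options are set and preserve > preclude, which is exactly the scenario the reviewer raises.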

uint64_t track_time_duration =
    cfd->ioptions()->preclude_last_level_data_seconds == 0
        ? cfd->ioptions()->preserve_internal_time_seconds
        : cfd->ioptions()->preclude_last_level_data_seconds;
Contributor:

Same, should it be max() if preclude_last_level_data_seconds != 0?

// It's typically used for the migration of existing data to tiered storage.
// The user can enable this feature first, then the existing data will have
// time information for `preclude_last_level_data_seconds` feature which will
// be enabled later.
Contributor:

Comments usually explain the default value and its behavior. We should also say whether it is dynamically changeable or not. I think it should be changeable if possible.

Contributor Author (jay-zhuang, Oct 5, 2022):

I'll add a comment to specify that it's not dynamically changeable for now (the same as "preclude_last_level"), and give more information about what it's doing.

facebook-github-bot commented: @jay-zhuang has updated the pull request. You must reimport the pull request before landing.

SeqnoTimeTablePropTest() : SeqnoTimeTest() {}

void SetTrackTimeDurationOptions(Options& options,
                                 uint64_t track_time_duration) const {
Contributor:

Nit: Google C++ Style suggests "put all input-only parameters before any output parameters".


ASSERT_OK(db_->CompactRange(cro, nullptr, nullptr));

ASSERT_GT(num_seqno_zeroing, 0);
Contributor:

It would be nice to cover the case where, in the last level, some keys are zeroed out and some aren't, based on time of insertion.

It is also better to validate by reading internal keys too. Perhaps GetAllKeyVersions() can be used for the validation.
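The mixed-zeroing check the reviewer suggests could look roughly like this. The struct below is a stand-in for the per-key records a debug dump such as GetAllKeyVersions() (rocksdb/utilities/debug.h) would provide; `FakeKeyVersion` and `HasMixedZeroing` are illustrative names, not RocksDB APIs.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Stand-in for one internal key version as dumped by a debug API; only
// the sequence number matters for this check.
struct FakeKeyVersion {
  uint64_t sequence;  // 0 once compaction has zeroed the seqno
};

// True when the dump shows a mix: some keys old enough to have their
// seqno zeroed, and some recent enough to keep it.
bool HasMixedZeroing(const std::vector<FakeKeyVersion>& keys) {
  bool any_zero = false;
  bool any_nonzero = false;
  for (const auto& k : keys) {
    if (k.sequence == 0) {
      any_zero = true;
    } else {
      any_nonzero = true;
    }
  }
  return any_zero && any_nonzero;
}
```

A test could dump the last level's internal keys after the CompactRange and assert this predicate holds, rather than only counting zeroings globally.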

        ? _current_time - preserve_time_duration
        : 0;
preserve_time_min_seqno_ =
    seqno_time_mapping_.GetOldestSequenceNum(preserve_time);
Contributor:

I didn't completely get why the oldest entry isn't the seqno to preserve anymore, but at least it's still correct.

Contributor Author:

Yeah, you're right, it is the oldest entry for this version of the code. (My first version treated preserve_time and preclude_time differently, then I changed my mind to have preclude_time override preserve_time if both are set, but forgot to revert this code. Now I think we're going back to treating them differently.)
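For readers following the seqno/time lookup being discussed, a minimal stand-in for the mapping query might look like this. The sampling structure, function name, and the UINT64_MAX sentinel are assumptions made for the sketch, not RocksDB's actual SeqnoToTimeMapping API.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// Samples of (seqno, approximate wall-clock write time), sorted by seqno.
using SeqnoTimeSamples = std::vector<std::pair<uint64_t, uint64_t>>;

// Smallest sampled seqno written at or after `cutoff_time`; keys at or
// above it fall inside the preserve window. Returns UINT64_MAX (a
// convention for this sketch) when every sample is older than the cutoff.
uint64_t OldestSeqnoAtOrAfter(const SeqnoTimeSamples& samples,
                              uint64_t cutoff_time) {
  for (const auto& [seqno, time] : samples) {
    if (time >= cutoff_time) {
      return seqno;
    }
  }
  return UINT64_MAX;
}
```

With cutoff_time = current_time - preserve_time_duration, everything at or above the returned seqno is recent enough to keep its time information.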

facebook-github-bot posted several more update/import notices as the pull request was revised.

jay-zhuang added a commit to jay-zhuang/rocksdb that referenced this pull request Oct 11, 2022
Lock the penultimate level for the whole compaction inputs range, so any
key in that compaction is safe to move up from the last level to
penultimate level.
It depends on facebook#10747 (for the migration test)