Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor schema for the durable storage #12562

Merged
merged 10 commits into from Mar 4, 2024
Merged

Conversation

ieQu1
Copy link
Member

@ieQu1 ieQu1 commented Feb 21, 2024

Fixes https://emqx.atlassian.net/browse/EMQX-11862

Release version: v/e5.?

Summary

  • Reduce configuration nesting
  • Allow to change storage layout
  • Add dynamic dispatching of the DS configuration handler

PR Checklist

Please convert it to a draft if any of the following conditions are not met. Reviewers may skip over until all the items are checked:

  • Added tests for the changes
  • Added property-based tests for code which performs user input validation
  • Changed lines covered in coverage report
  • Change log has been added to changes/(ce|ee)/(feat|perf|fix|breaking)-<PR-id>.en.md files
  • For internal contributor: there is a jira ticket to track this change
  • Created PR to emqx-docs if documentation update is required, or link to a follow-up jira ticket
  • Schema changes are backward compatible

Checklist for CI (.github/workflows) changes

  • If changed package build workflow, pass this action (manual trigger)
  • Change log has been added to changes/ dir for user-facing artifacts update

@ieQu1 ieQu1 force-pushed the dev/new-ds-schema branch 4 times, most recently from c2f3f28 to 3deaceb Compare February 22, 2024 16:26
@ieQu1 ieQu1 marked this pull request as ready for review February 22, 2024 17:39
@ieQu1 ieQu1 requested review from lafirest and a team as code owners February 22, 2024 17:39
thalesmg
thalesmg previously approved these changes Feb 22, 2024
rel/i18n/emqx_ds_schema.hocon Outdated Show resolved Hide resolved
apps/emqx/src/emqx_ds_schema.erl Show resolved Hide resolved
thalesmg
thalesmg previously approved these changes Feb 22, 2024
rel/i18n/emqx_ds_schema.hocon Outdated Show resolved Hide resolved
rel/i18n/emqx_ds_schema.hocon Outdated Show resolved Hide resolved
rel/i18n/emqx_ds_schema.hocon Outdated Show resolved Hide resolved
rel/i18n/emqx_ds_schema.hocon Outdated Show resolved Hide resolved
rel/i18n/emqx_ds_schema.hocon Outdated Show resolved Hide resolved
apps/emqx/src/emqx_ds_schema.erl Show resolved Hide resolved
apps/emqx/src/emqx_ds_schema.erl Outdated Show resolved Hide resolved
apps/emqx/src/emqx_ds_schema.erl Outdated Show resolved Hide resolved
@ieQu1 ieQu1 force-pushed the dev/new-ds-schema branch 2 times, most recently from 6a58c1d to 06bdc2d Compare February 23, 2024 12:42
apps/emqx/src/emqx_ds_schema.erl Outdated Show resolved Hide resolved
rel/i18n/emqx_schema.hocon Outdated Show resolved Hide resolved
@ieQu1 ieQu1 force-pushed the dev/new-ds-schema branch 2 times, most recently from 7de9111 to 95a8439 Compare February 23, 2024 14:16
apps/emqx/src/emqx_persistent_message.erl Show resolved Hide resolved
apps/emqx/src/emqx_ds_schema.erl Outdated Show resolved Hide resolved
Comment on lines 200 to 208
{epoch_bits,
sc(
range(0, 64),
#{
default => 10,
importance => ?IMPORTANCE_MEDIUM,
desc => ?DESC(wildcard_optimized_epoch_bits)
}
)},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we could turn that option into a time interval instead. Work on the consistent replication has so far shown that one of the most practical ways to deal with ordering guarantees is to just use microsecond-presicion timestamps for message keys, but changing precision will suddenly make this option backward-incompatible. One significant downside is that we won't be able to respect specified value exactly and need to tell the user what's the actual epoch size somehow.

Copy link
Member Author

@ieQu1 ieQu1 Feb 23, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I mentioned in the other PR that using microsecond-precision timestamp for serialization is fine, but it must be implemented in a way that does NOT affect neither EMQX application, NOR the storage layer. Otherwise we leak abstractions from the replication layer.

EMQX application owns the #message record, and assumes that the timestamp is in milliseconds. DS must store and restore message as is (at least type- and unit-wise), so changing replication layer in the future doesn't result in breakage.

There is no reason why the storage layer must be desined otherwise. I guess proper way to address this is to pass unique message id (which could be derived from the microsecond TS), from replication layer to the storage layer, instead of generating it in the storage layer.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

...Made it hidden for now.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

EMQX application owns the #message record, and assumes that the timestamp is in milliseconds.

That's why I mentioned message key and not emqx-level message timestamp.

NOR the storage layer.

Arguable. I do not honestly see why storage layer could not accommodate replication layer needs and handle timestamps with variable precision for the sake of overall simplicity and compactness.

Copy link
Member Author

@ieQu1 ieQu1 Feb 24, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Arguable. I do not honestly see why storage layer could not accommodate replication layer needs and handle timestamps with variable precision for the sake of overall simplicity and compactness.

Well, I would like to avoid it if possible, since it will, to some degree, break the layered model of the DS. It should be possible to replace the replication layer (if we come up with a new & faster replication protocol, for example), and reuse the storage layer modules. Storage layer may in theory implement some clever way of serializing messages that depends on the timestamp's fixed size.
It's all hypothetical, of course, but it illustrates why we should strive to keep the layers untangled.

I don't think it will lead to any complixity, to be honest. bitfield_lts module makes it quite trivial to map the serialization key next to the timestamp, so, in effect, one gets a microsecond timestamp in the RocksDB key. It's simply a matter of the API.

Sure, treating microseconds as a separate entity will make it impossible to configure bitfield_lts module to seek to an epoch with a microsecond precision, but I don't see it as a problem, because the top-level API (emqx_ds:make_iterator) won't be able to make use of it anyway.

@ieQu1 ieQu1 merged commit 2e792da into emqx:master Mar 4, 2024
166 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants