feat(ds): add message GC #12338

thalesmg · 2024-01-16T18:00:21Z

Fixes https://emqx.atlassian.net/browse/EMQX-9748 (rocksdb part only)

Release version: v/e5.6

Summary

PR Checklist

Please convert it to a draft if any of the following conditions are not met. Reviewers may skip over until all the items are checked:

Added tests for the changes
Added property-based tests for code which performs user input validation
Changed lines covered in coverage report
Change log has been added to changes/(ce|ee)/(feat|perf|fix|breaking)-<PR-id>.en.md files
For internal contributor: there is a jira ticket to track this change
Created PR to emqx-docs if documentation update is required, or link to a follow-up jira ticket
Schema changes are backward compatible

Checklist for CI (.github/workflows) changes

If changed package build workflow, pass this action (manual trigger)
Change log has been added to changes/ dir for user-facing artifacts update

apps/emqx/integration_test/emqx_persistent_session_ds_SUITE.erl

apps/emqx/src/emqx_persistent_message_ds_gc_worker.erl

ieQu1 · 2024-01-17T16:00:39Z

apps/emqx/src/emqx_persistent_message_ds_gc_worker.erl

+
+start_gc() ->
+    ok = emqx_ds:add_generation(?PERSISTENT_MESSAGE_DB),
+    AllGens = emqx_ds:list_generations_with_lifetimes(?PERSISTENT_MESSAGE_DB),


Not directly related to the GC, but adjacent: we want to have the ability to preserve the LTS between the generations. Let's brainstorm how it can be done.

Quick and dirty solution: the storage layer backend can query the metadata CF, and copy the data from its predecessor, if it has the same module. But maybe someone has a less hacky solution.

Preferably, it should be decided before the list_generations API is committed, since it might require some passage of the data.

It feels like this inheritance should be part of add_generation, indeed. 🤔

Since the need for inheritance is storage layer dependent (i.e.: LTS only), and since add_generation already simply creates a new gen with whatever is configured at the moment, it seems like this check for "if it's same module, inherit this" should be done there and not in the GC (by itself being responsible for this "baton passing").

Another dirty solution: we may keep LTS trie out of generation data and never GC it. Just apply thresholds from the current gen when inserting and clean it up on demand.

That'll still require some coordination, because in that case the next generation should use the same column family to preserve the metadata. Also, it might introduce a bit of complexity when the impl module changes: if we go from LTS to some other module, then the old trie column family should be dropped if we keep it around.

Another dirty solution: we may keep LTS trie out of generation data and never GC it. Just apply thresholds from the current gen when inserting and clean it up on demand.

This is an interesting idea, actually... However, this means LTS stops being part of the storage layer and becomes an entity of its own.

On one side, it makes sense since since topics are global.

On the other hand, not all storage backends need LTS trie.

If LTS trie becomes a separate entity, we'd need some management interfaces (and a replication path?) for it.

apps/emqx_durable_storage/src/emqx_ds.erl

savonarola · 2024-01-18T13:17:22Z

apps/emqx_durable_storage/src/proto/emqx_ds_proto_v3.erl

+%% See the License for the specific language governing permissions and
+%% limitations under the License.
+%%--------------------------------------------------------------------
+-module(emqx_ds_proto_v3).


Nit: Wouldn't one expect that emqx_ds_proto_v3 proxies RPC calls not to emqx_ds_replication_layer but to emqx_ds?

I haven't initiate this series of modules myself, but I think the reasoning here is that this is the proto module for the emqx_ds application. See emqx_bridge_proto_v1 for example: it calls 2 modules other than emqx_bridge.

I guess for emqx application we made a finer proto module breakdown due to the increased complexity of that application.

I think it was my mistake not to call this module emqx_ds_replication_layer_proto_... from the very beginning, because technically this BPAPI is specific to the builtin replication layer indeed. Not sure if we can change this easily now, though.

apps/emqx/src/emqx_persistent_message_ds_gc_worker.erl

ieQu1 · 2024-01-18T13:22:15Z

apps/emqx_durable_storage/src/emqx_ds.erl

+%% An opaque term identifying a generation.  Each implementation will possibly add
+%% information to this term to match its inner structure (e.g.: by embedding the shard id,
+%% in the case of `emqx_ds_replication_layer').
+-opaque generation_id() :: term().


Generation is a concept from the storage layer that becomes RankY at the emqx_ds layer. It's best not to redefine this name.
Could this type simply instead be a pair of {RankX, RankY}? (I called it stream_rank() on my branch)
For the builtin backend, at least, stream_rank() is guaranteed to be unique.

No problem in perhaps using the existing emqx_ds:stream_rank() type for it.

But then it might introduce the requirement on other implementation that values of this type should also be unique, besides the ordering properties, right?

We need the value to directly map to an unique generation so that we may drop it later, so can we say stream_rank() is indeed an unique identifier for the generation? I thought that a generation could hold several streams inside it.

For example: let's say we have a single generation using LTS, with an empty trie. After the first few messages are stored/iterated over, one or more streams already appear, and after it learns new structure, further streams appear, all in the same generation.

Also, we already expose the concept of generation in emqx_ds:add_generation API.

We need the value to directly map to an unique generation so that we may drop it later, so can we say stream_rank() is indeed an unique identifier for the generation? I thought that a generation could hold several streams inside it.

True. When I was writing the remark, I forgot to mention that stream_rank() is not a good name, but semantically it's the entity that most closely matches the definition of the unit that the garbage collector operates on.

I renamed this type to generation_rank()... How about that?

Yes, let's stick with it for now. Renaming types is possible without breaking BPAPI, so we don't have to commit to any particular name.

apps/emqx/test/emqx_persistent_messages_SUITE.erl

ieQu1 · 2024-01-22T16:26:34Z

apps/emqx_durable_storage/src/emqx_ds_lts.erl

+    trie_restore_existing(Trie, Dump).
+
+-spec trie_restore_existing(trie(), [{_Key, _Val}]) -> trie().
+trie_restore_existing(Trie, Dump) ->
    lists:foreach(
        fun({{StateFrom, Token}, StateTo}) ->


I think this function must ignore all but wildcard routes. Otherwise it will copy the trie elements created during the learning; we want to drop them to optimize the number of streams.

Makes sense. But is there a way to know if a given trie dump entry is part of a larger wildcard route without reconstructing the whole trie? i.e.: just looking at the Token?

e.g.: For the filter t/+/a, we still need to insert the static t and a.

True. I mean the algorithm should, of course, copy the full paths, but only as long as there is at least one wildcard in it. I think we can postpone the implementation for later, though.

apps/emqx_durable_storage/src/emqx_ds_replication_layer.erl

ieQu1 · 2024-01-22T16:36:47Z

Please add a changelog entry.

ieQu1 · 2024-01-22T17:21:07Z

apps/emqx_durable_storage/src/emqx_ds_replication_layer.erl

+    Nodes = lists:usort(lists:map(fun(Shard) -> node_of_shard(DB, Shard) end, Shards)),
+    Results = emqx_ds_proto_v3:list_generations_with_lifetimes(Nodes, DB),
+    %% TODO: Handle errors?
+    Gens = [Gen || {ok, Gen} <- Results],


I think we need to print a warning here.

ieQu1 · 2024-01-22T17:25:37Z

apps/emqx_durable_storage/src/emqx_ds_replication_layer.erl

    lists:foreach(
        fun(ShardId) ->
            emqx_ds_storage_layer:add_generation({DB, ShardId})
        end,
        MyShards
    ).

+-spec do_list_generations_with_lifetimes_v3(emqx_ds:db()) -> ok | {error, _}.


Q: Can we pass the shard as an argument and do aggregation on the caller side? It feels like a more fine-grained API.

ieQu1 · 2024-01-22T17:30:01Z

apps/emqx_durable_storage/src/emqx_ds_replication_layer.erl

+
+-spec drop_generation(emqx_ds:db(), generation_rank()) -> ok | {error, _}.
+drop_generation(DB, {Shard, GenId}) ->
+    Node = node_of_shard(DB, Shard),


This is good for now, but maybe add a TODO that explains that this operation must be propagated, in one way or another, to the entire replica set.

…ew generation Fixes emqx#12338 (comment)

thalesmg force-pushed the ds-message-gc-20240115 branch 4 times, most recently from 5388966 to 384eab8 Compare January 16, 2024 21:03

thalesmg marked this pull request as ready for review January 17, 2024 11:08

thalesmg requested review from ieQu1, keynslug, a team and lafirest as code owners January 17, 2024 11:08

ieQu1 reviewed Jan 17, 2024

View reviewed changes

thalesmg marked this pull request as draft January 17, 2024 17:45

savonarola reviewed Jan 18, 2024

View reviewed changes

apps/emqx_durable_storage/src/emqx_ds.erl Outdated Show resolved Hide resolved

savonarola reviewed Jan 18, 2024

View reviewed changes

apps/emqx/src/emqx_persistent_message_ds_gc_worker.erl Outdated Show resolved Hide resolved

ieQu1 reviewed Jan 18, 2024

View reviewed changes

thalesmg force-pushed the ds-message-gc-20240115 branch 5 times, most recently from e39c4aa to ba691e5 Compare January 18, 2024 19:55

ieQu1 reviewed Jan 18, 2024

View reviewed changes

apps/emqx/test/emqx_persistent_messages_SUITE.erl Outdated Show resolved Hide resolved

thalesmg force-pushed the ds-message-gc-20240115 branch 3 times, most recently from bbb699e to 7ee6b79 Compare January 18, 2024 20:52

thalesmg marked this pull request as ready for review January 19, 2024 16:25

ieQu1 reviewed Jan 22, 2024

View reviewed changes

apps/emqx/test/emqx_persistent_messages_SUITE.erl Show resolved Hide resolved

ieQu1 reviewed Jan 22, 2024

View reviewed changes

apps/emqx_durable_storage/src/emqx_ds_replication_layer.erl Outdated Show resolved Hide resolved

ieQu1 reviewed Jan 22, 2024

View reviewed changes

apps/emqx_durable_storage/src/emqx_ds_replication_layer.erl Outdated Show resolved Hide resolved

thalesmg force-pushed the ds-message-gc-20240115 branch 2 times, most recently from 92f85c7 to 528dfb6 Compare January 22, 2024 16:57

ieQu1 reviewed Jan 22, 2024

View reviewed changes

thalesmg added 3 commits January 22, 2024 14:53

feat(ds): add list_generations and drop_generation APIs

75b08b5

feat(lts): inherit previous generation's lts when possible

db710c4

feat(ps): add message gc

7035b4c

thalesmg force-pushed the ds-message-gc-20240115 branch from 528dfb6 to 7035b4c Compare January 22, 2024 17:53

ieQu1 approved these changes Jan 22, 2024

View reviewed changes

thalesmg merged commit d122340 into emqx:master Jan 22, 2024
165 checks passed

thalesmg deleted the ds-message-gc-20240115 branch January 22, 2024 19:57

thalesmg mentioned this pull request Jan 22, 2024

perf(ds): inherit only LTS paths containing wildcards when adding a new generation #12368

Merged

9 tasks

thalesmg added a commit to thalesmg/emqx that referenced this pull request Jan 23, 2024

perf(ds): inherit only LTS paths containing wildcards when adding a n…

1eb47d0

…ew generation Fixes emqx#12338 (comment)

zhongwencool pushed a commit to thalesmg/emqx that referenced this pull request Jan 25, 2024

perf(ds): inherit only LTS paths containing wildcards when adding a n…

9003bc5

…ew generation Fixes emqx#12338 (comment)

BrewTestBot mentioned this pull request Mar 28, 2024

emqx 5.6.0 Homebrew/homebrew-core#167392

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(ds): add message GC #12338

feat(ds): add message GC #12338

thalesmg commented Jan 16, 2024

ieQu1 Jan 17, 2024

thalesmg Jan 17, 2024 •

edited

savonarola Jan 18, 2024 •

edited

thalesmg Jan 18, 2024

ieQu1 Jan 19, 2024 •

edited

savonarola Jan 18, 2024

thalesmg Jan 18, 2024

ieQu1 Jan 22, 2024

ieQu1 Jan 18, 2024

thalesmg Jan 18, 2024 •

edited

thalesmg Jan 18, 2024

thalesmg Jan 18, 2024

ieQu1 Jan 18, 2024 •

edited

thalesmg Jan 18, 2024

ieQu1 Jan 22, 2024

ieQu1 Jan 22, 2024

thalesmg Jan 22, 2024

ieQu1 Jan 22, 2024

ieQu1 commented Jan 22, 2024

ieQu1 Jan 22, 2024

ieQu1 Jan 22, 2024

ieQu1 Jan 22, 2024

feat(ds): add message GC #12338

feat(ds): add message GC #12338

Conversation

thalesmg commented Jan 16, 2024

Summary

PR Checklist

Checklist for CI (.github/workflows) changes

Choose a reason for hiding this comment

thalesmg Jan 17, 2024 • edited

Choose a reason for hiding this comment

savonarola Jan 18, 2024 • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

ieQu1 Jan 19, 2024 • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

thalesmg Jan 18, 2024 • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

ieQu1 Jan 18, 2024 • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

ieQu1 commented Jan 22, 2024

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

thalesmg Jan 17, 2024 •

edited

savonarola Jan 18, 2024 •

edited

ieQu1 Jan 19, 2024 •

edited

thalesmg Jan 18, 2024 •

edited

ieQu1 Jan 18, 2024 •

edited