
box: persist gc consumers of non-anonymous replicas #9960

Open — wants to merge 9 commits into base: master from drewdzzz:persistent_gc
Conversation

@drewdzzz drewdzzz (Contributor) commented Apr 16, 2024

The patch makes Tarantool persist gc consumers of replicas: almost every creation and update of a gc_consumer is now a replace in space _gc_consumers. The patch also deprecates the wal_cleanup_delay option, since it was a workaround for a problem that this patch solves.

Part of tarantool/tarantool-ee#742

@drewdzzz drewdzzz self-assigned this Apr 16, 2024
@drewdzzz drewdzzz force-pushed the persistent_gc branch 27 times, most recently from a2c1b44 to 2caa493 Compare April 22, 2024 16:44
@coveralls commented Apr 22, 2024

Coverage Status: 87.115% (+0.01%) from 87.102% when pulling 8770993 on drewdzzz:persistent_gc into d2240bf on tarantool:master.

@drewdzzz drewdzzz force-pushed the persistent_gc branch 2 times, most recently from 6a2245b to ecbc542 Compare June 13, 2024 08:59
@Serpentian Serpentian (Contributor) left a comment:

Thank you for the patch! I haven't reviewed the whole `gc: persist gc consumers` commit yet, I'll come back to it tonight)

src/box/replication.cc (outdated)
va_list ap;
va_start(ap, format);
vsnprintf(consumer->name, GC_NAME_MAX, format, ap);
va_end(ap);
/* A va_list consumed by vsnprintf() cannot be reused; restart it
 * (or take a va_copy() beforehand) before passing it further. */
va_start(ap, format);
gc_consumer_register_impl(uuid, vclock, format, ap);
va_end(ap);
Contributor:

Nit: IMHO the code would be easier to read if gc_consumer_register called gc_consumer_register_anon and then inserted the consumer into the hash tree itself. Up to you, feel free to resolve

Contributor Author:

Let's see what others will say.

if prune_logs then
-- Trigger snapshot to prune xlogs
cg.master:exec(function() box.snapshot() end)
check_xlog(cg.master.workdir, vclock_sum(cg.master:get_vclock()))
Contributor:

[001] replication-luatest/persistent_gc_test.lua                      [ fail ]
[001] Test failed! Output from reject file /tmp/t/rejects/replication-luatest/persistent_gc.reject:
[001] Tarantool version is 3.2.0-entrypoint-75-gecbc542e2a
[001] TAP version 13
[001] 1..7
[001] # Started on Thu Jun 13 14:19:33 2024
[001] # Starting group: Indirect replication of _gc_consumers
[001] ok     1	Indirect replication of _gc_consumers.test_indirect_replication_of_consumers
[001] # Starting group: Persistent gc
[001] ok     2	Persistent gc.test_gc_consumer_is_updated_by_relay
[001] ok     3	Persistent gc.test_gc_consumer_update_retry
[001] ok     4	Persistent gc.test_gc_consumer_is_not_updated_without_replica
[001] ok     5	Persistent gc.test_reconnect_inactive_consumer
[001] not ok 6	Persistent gc.test_reconnect_inactive_consumer_and_no_logs
[001] #   ...arantool/test/replication-luatest/persistent_gc_test.lua:43: Assertion failed: 147 > 208
[001] #   stack traceback:
[001] #   	...arantool/test/replication-luatest/persistent_gc_test.lua:43: in function 'check_xlog'
[001] #   	...arantool/test/replication-luatest/persistent_gc_test.lua:269: in function 'persistent_gc_deactivate_consumer_test'
[001] #   	...arantool/test/replication-luatest/persistent_gc_test.lua:283: in function 'Persistent gc.test_reconnect_inactive_consumer_and_no_logs'
[001] #   	...
[001] #   	[C]: in function 'xpcall'
[001] #   artifacts:
[001] #   	master -> /tmp/t/001_replication-luatest/artifacts/master-tG2bbTldx5od
[001] #   	replica -> /tmp/t/001_replication-luatest/artifacts/replica-zwT-oRt_to11
[001] ok     7	Persistent gc.test_deactivate_gc_consumer_for_connected_replica
[001] # Ran 7 tests in 14.327 seconds, 6 succeeded, 1 failed
[001] 

Contributor:

[005] Test failed! Output from reject file /tmp/t/rejects/replication-luatest/persistent_gc.reject:
[005] Tarantool version is 3.2.0-entrypoint-75-gecbc542e2a
[005] TAP version 13
[005] 1..7
[005] # Started on Thu Jun 13 14:19:33 2024
[005] # Starting group: Indirect replication of _gc_consumers
[005] ok     1	Indirect replication of _gc_consumers.test_indirect_replication_of_consumers
[005] # Starting group: Persistent gc
[005] ok     2	Persistent gc.test_gc_consumer_is_updated_by_relay
[005] ok     3	Persistent gc.test_gc_consumer_update_retry
[005] ok     4	Persistent gc.test_gc_consumer_is_not_updated_without_replica
[005] ok     5	Persistent gc.test_reconnect_inactive_consumer
[005] not ok 6	Persistent gc.test_reconnect_inactive_consumer_and_no_logs
[005] #   ...arantool/test/replication-luatest/persistent_gc_test.lua:20: expected: nil, actual: "00000000000000000207.snap"
[005] #   stack traceback:
[005] #   	...arantool/test/replication-luatest/persistent_gc_test.lua:20: in function 'xdir_to_xfiles'
[005] #   	...arantool/test/replication-luatest/persistent_gc_test.lua:29: in function 'check_xlog'
[005] #   	...arantool/test/replication-luatest/persistent_gc_test.lua:269: in function 'persistent_gc_deactivate_consumer_test'
[005] #   	...arantool/test/replication-luatest/persistent_gc_test.lua:283: in function 'Persistent gc.test_reconnect_inactive_consumer_and_no_logs'
[005] #   	...
[005] #   	[C]: in function 'xpcall'
[005] #   artifacts:
[005] #   	master -> Failed to copy artifacts for server (alias: master, workdir: master-nPPw81T5b-GR)
[005] #   	replica -> /tmp/t/005_replication-luatest/artifacts/replica-xERIV_Lg5hqu
[005] ok     7	Persistent gc.test_deactivate_gc_consumer_for_connected_replica
[005] # Ran 7 tests in 14.714 seconds, 6 succeeded, 1 failed
[005] 

Contributor:

Both these cases flake a lot on my machine. I build tnt with `cmake -DCMAKE_BUILD_TYPE=Debug -DTEST_BUILD=ON .; make -j`

Contributor Author:

Thanks!
I guess it's because a simple replace with a dummy consumer does not help - it gets updated back by the persist fiber because it is still marked as asynchronously advanced. I'm working on it now.

Contributor Author:

I guess I've fixed it, but I cannot reproduce your failure. Could you please check it again?

Contributor Author:

I managed to reproduce these fails on my Linux PC, debugging now.

@drewdzzz drewdzzz (Contributor Author) Jun 14, 2024:

Thanks for pointing that out, I've found the reason!
box.snapshot() runs cleanup asynchronously - it waits until cleanup is scheduled in the WAL thread, but not until the old files are actually deleted. That's why there were problems with several snaps and outdated xlogs - they just weren't deleted yet, only scheduled for deletion. Retries help here.

It seems that Apple M2 was simply too fast for these failures - the files were always deleted before the check.

Contributor:

These errors are fixed, thank you

Contributor:

Nah, one more place:

[007] Tarantool version is 3.2.0-entrypoint-75-g576b00f93a
[007] TAP version 13
[007] 1..7
[007] # Started on Fri Jun 14 18:34:38 2024
[007] # Starting group: Indirect replication of _gc_consumers
[007] ok     1	Indirect replication of _gc_consumers.test_indirect_replication_of_consumers
[007] # Starting group: Persistent gc
[007] ok     2	Persistent gc.test_gc_consumer_is_updated_by_relay
[007] ok     3	Persistent gc.test_gc_consumer_update_retry
[007] ok     4	Persistent gc.test_gc_consumer_is_not_updated_without_replica
[007] ok     5	Persistent gc.test_reconnect_inactive_consumer
[007] not ok 6	Persistent gc.test_reconnect_inactive_consumer_and_no_logs
[007] #   ...arantool/test/replication-luatest/persistent_gc_test.lua:20: expected: nil, actual: "00000000000000000198.snap"
[007] #   stack traceback:
[007] #   	...arantool/test/replication-luatest/persistent_gc_test.lua:20: in function 'xdir_to_xfiles'
[007] #   	...arantool/test/replication-luatest/persistent_gc_test.lua:29: in function 'check_xlog'
[007] #   	...arantool/test/replication-luatest/persistent_gc_test.lua:259: in function 'persistent_gc_deactivate_consumer_test'
[007] #   	...arantool/test/replication-luatest/persistent_gc_test.lua:292: in function 'Persistent gc.test_reconnect_inactive_consumer_and_no_logs'
[007] #   	...
[007] #   	[C]: in function 'xpcall'
[007] #   artifacts:
[007] #   	replica -> /tmp/t/007_replication-luatest/artifacts/replica-1tDYxEdMe2kS
[007] #   	master -> /tmp/t/007_replication-luatest/artifacts/master-ThI_alQpxMA7
[007] ok     7	Persistent gc.test_deactivate_gc_consumer_for_connected_replica
[007] # Ran 7 tests in 13.929 seconds, 6 succeeded, 1 failed

Maybe retrying inside all check_xlog_*?

@drewdzzz drewdzzz (Contributor Author) Jun 20, 2024:

Yep, that seems to be the right approach, did so. Also revised some parts of the test to make it less flaky.

local t = require('luatest')
local server = require('luatest.server')
local cluster = require('luatest.replica_set')
local fio = require('fio')
Contributor:

Also, there's a really strange error. It reproduces very rarely:

[009] replication-luatest/persistent_gc_test.lua                      [ fail ]
[009] Test failed! Output from reject file /tmp/t/rejects/replication-luatest/persistent_gc.reject:
[009] Tarantool version is 3.2.0-entrypoint-75-gecbc542e2a
[009] TAP version 13
[009] 1..7
[009] # Started on Thu Jun 13 14:22:17 2024
[009] # Starting group: Indirect replication of _gc_consumers
[009] ok     1	Indirect replication of _gc_consumers.test_indirect_replication_of_consumers
[009] # Starting group: Persistent gc
[009] ok     2	Persistent gc.test_gc_consumer_is_updated_by_relay
[009] ok     3	Persistent gc.test_gc_consumer_update_retry
[009] ok     4	Persistent gc.test_gc_consumer_is_not_updated_without_replica
[009] ok     5	Persistent gc.test_reconnect_inactive_consumer
[009] 
[009] [test-run server "luatest_server"] Last 15 lines of the log file /tmp/t/009_replication-luatest/persistent_gc_test.log:
[009] builtin/fio.lua:438: attempt to index local 'st' (a nil value)
[009] stack traceback:
[009] 	...g/tnt/tarantool/test-run/lib/luatest/luatest/capture.lua:144: in function <...g/tnt/tarantool/test-run/lib/luatest/luatest/capture.lua:138>
[009] 	[builtin#19]: at 0x55555590a2ac
[009] 	[C]: in function 'xpcall'
[009] 	...ing/tnt/tarantool/test-run/lib/luatest/luatest/utils.lua:38: in function 'run_tests'
[009] 	...ng/tnt/tarantool/test-run/lib/luatest/luatest/runner.lua:353: in function <...ng/tnt/tarantool/test-run/lib/luatest/luatest/runner.lua:348>
[009] 	[C]: in function 'xpcall'
[009] 	...tnt/tarantool/test-run/lib/luatest/luatest/capturing.lua:74: in function <...tnt/tarantool/test-run/lib/luatest/luatest/capturing.lua:72>
[009] 	[C]: in function 'xpcall'
[009] 	...ng/tnt/tarantool/test-run/lib/luatest/luatest/runner.lua:55: in function 'fn'
[009] 	...antool/test-run/lib/luatest/luatest/sandboxed_runner.lua:14: in function 'run'
[009] 	...arantool/test-run/lib/luatest/luatest/cli_entrypoint.lua:4: in function <...arantool/test-run/lib/luatest/luatest/cli_entrypoint.lua:3>
[009] 	...ogramming/tnt/tarantool/test-run/lib/luatest/bin/luatest:5: in main chunk
[009] Worker "009_replication-luatest" got failed test; restarted the server

@drewdzzz drewdzzz (Contributor Author) Jun 14, 2024:

The reason is the same: during the traversal the underlying xlogs were asynchronously deleted, and fio didn't cope with it. Fixed.

*/
struct trigger *trigger =
(struct trigger *)xmalloc(sizeof(*trigger));
trigger_create(trigger, on_replace_dd_gc_consumers, NULL,
Contributor:

I'd rather not try to encapsulate the triggers in the gc submodule and would put them right into alter instead. Let's see what @Gerold103 says

@drewdzzz drewdzzz (Contributor Author) Jun 13, 2024:

By the way, it was Vova's idea to hide the triggers in gc.c.

Thanks to this approach, all methods operating on in-memory consumers are private to gc.c, except for the anonymous ones. Having both persistent and in-memory methods in the gc.h header does not seem convenient.

src/box/gc.c (outdated)
@drewdzzz drewdzzz force-pushed the persistent_gc branch 10 times, most recently from 784db53 to b215d5f Compare June 20, 2024 09:15
@drewdzzz drewdzzz requested a review from locker June 20, 2024 09:26
@drewdzzz drewdzzz requested a review from Serpentian June 20, 2024 09:54
When the schema version is outdated, we are not allowed to modify system spaces, which is what func:drop() does, so the test fails after updating the schema version - the commit fixes the mistake.

NO_CHANGELOG=test
NO_DOC=test
When implementing persistent gc consumers, we will need to serialize vclocks. We already have such helpers, but they are private to `xrow.c`. So the commit simply puts their declarations into the appropriate header and marks them as non-static.

Also, the commit adds a check for correct lsn and replica id - it is
needed to safely decode user tuples. This part will be tested by the
tests of space _gc_consumers.

NO_TEST=trivial
NO_CHANGELOG=internal
NO_DOC=internal
The commit introduces a trivial helper that checks whether a tuple field is nil. It can be useful for system spaces with nullable fields.

NO_TEST=trivial
NO_CHANGELOG=internal
NO_DOC=internal
This space will be used to store gc consumers persistently. Since the feature will be backported to 2.11.4 and we have a linear version history, this space is created on upgrade to 2.11.4, dropped on upgrade to 3.0.0, and created again on upgrade to 3.2.0. With this approach the space has only two states: either it is absent, or it is created and consistent with space _cluster. We could leave the space on downgrade from 3.2.0 to a version from [3.0.0, 3.2.0), but then we would have to handle both the case when the space is created but is not supposed to be used and the case when the space is created but has stale consumers - that approach is more complicated, so we decided to stick with the simple one.

The commit is dedicated to the schema update; the actual logic (the on_replace trigger) is implemented in the next commit.

Part of tarantool/tarantool-ee#742

NO_CHANGELOG=later
NO_DOC=later
The commit encapsulates WAL GC consumers - it moves consumers from replicas into their own index (`gc.consumers_hash`); replicas no longer have direct access to their consumers, and their lifetimes become independent. It doesn't make much sense now, but it will make our life a lot easier when implementing persistent consumers.

Since we sometimes need a temporary gc consumer, for example when joining a new replica, the commit introduces the concept of anonymous gc consumers. The caller is responsible for their deletion, and they are not registered in the hash by uuid.

Along the way, make registration methods non-failing (panic on OOM).

Part of tarantool/tarantool-ee#742

NO_TEST=refactoring
NO_CHANGELOG=refactoring
NO_DOC=refactoring
Even now we re-create the gc consumer on subscribe because its vclock can go backwards, and gc_consumer_advance would fail with an assertion in that case. When the persistent gc state is introduced, the user will be able to update consumers manually (and decrease their vclocks as well), so let's start using update instead of advance.

Part of tarantool/tarantool-ee#742

NO_TEST=refactoring
NO_CHANGELOG=refactoring
NO_DOC=refactoring
Currently, every registered consumer is active by default (even if its vclock is far behind the WAL GC one); a consumer becomes inactive only on ENOSPC and never becomes active again. That is not great: a replica could fill the xlog gap from another replica and continue to fetch our xlogs, but we wouldn't pin them for it. In addition, once gc consumers become persistent, inactive consumers of hanging replicas would become active again after every restart. To relieve these problems, let's deactivate a consumer on registration if it is outdated, and activate it back when it is advanced and becomes up to date.

Part of tarantool/tarantool-ee#742

NO_TEST=see next commits
NO_CHANGELOG=internal
NO_DOC=internal
The commit turns in-memory gc consumers into persistent ones. Replicas' descriptors (`struct replica`) do not manage consumers anymore - consumers are registered and unregistered on replace in space `_cluster` instead. Since modification of a persistent consumer is a heavier operation (it can yield and block the current fiber), the commit introduces two types of update - synchronous and asynchronous. The first one is needed, for example, on subscribe - we want to update the consumer and only then send xlogs. The second one is needed so that the relay doesn't litter the tx fiber pool.

Note that the new system space is treated in a special way - we don't set our usual system-space trigger on it because it shouldn't acquire ddl or limit the depth of on_replace triggers. If we acquired ddl on replace in _gc_consumers, users' transactions would be aborted for no apparent reason whenever consumers are advanced (that's what "acquire ddl" actually does). And since a replace in `_gc_consumers` can be done from the on_replace trigger of `_cluster`, limiting the depth of its triggers would lead to errors in such cases. Also, we cannot use a surrogate space for `_gc_consumers` to set its on_replace trigger, because the space can be deleted on upgrade to `3.0.0` and then the trigger would be dropped along with it. Fortunately, we don't need a surrogate space anyway.

Also, the triggers for space `_gc_consumers` are placed not in `alter.cc` but in `gc.c`, to encapsulate internal methods.

About the implementation of asynchronous updates: there were two approaches - start a fiber for each relay that wakes up and advances the consumer, or start one fiber in gc that traverses all consumers and persists the "dirty" ones. The first solution seems easier - you don't need to think about synchronizing synchronous and asynchronous updates of consumers - but the second one feels more appropriate: even after a relay is stopped, the WAL GC fiber will advance the consumer of the disconnected replica if it has any pending updates, and having one fiber seems more convenient too. So the second solution was chosen. To easily implement synchronous modifications we could then simply reuse the asynchronous approach with a `fiber_cond`, but it doesn't work in our case: we register and unregister consumers inside the on_replace trigger of `_cluster`, and therefore inside an active transaction, so we cannot yield there to wait for an asynchronous update to complete.

Since there is no need to wait for the limbo to update the local space, let's use the TXN_FORCE_ASYNC flag when possible. Registration and unregistration functions are called only from the on_replace trigger of `_cluster`, hence inside active transactions, so they never use this flag - we mustn't set it on synchronous transactions.

Also, a separate method is needed to successfully deactivate an existing consumer. When a consumer is advanced it is marked, and when the persist fiber wakes up it persists all such marked consumers, so a simple replace with a dummy consumer would be overwritten by the persist fiber, since the consumer is still marked as asynchronously advanced - the special method resets this flag.

Part of tarantool/tarantool-ee#742

@TarantoolBot document
Title: Persistent WAL GC state

Now all replicas have WAL GC consumers persisted in space _gc_consumers. When a replica is deleted from space _cluster, its consumer is deleted as well. This ensures that xlogs needed by a replica won't be deleted after a restart of the instance. Note that they can still be deleted when we have no space for newer xlogs.
The commit deprecates the wal_cleanup_delay option - it was introduced as a workaround for a problem that is solved by the persistent gc state.

However, we automatically set wal_cleanup_delay to a non-zero value if the current schema does not support the _gc_consumers space and the instance is part of a replicaset. This prevents xlogs from being cleaned up during an upgrade to a version with the deprecated but still supported wal_cleanup_delay option.

Part of tarantool/tarantool-ee#742

@TarantoolBot document
Title: Deprecate wal_cleanup_delay

The option `wal_cleanup_delay` is deprecated starting from Tarantool 3.2.0.
The new compat module option `box_cfg_wal_cleanup_delay` was
introduced to control whether the option is still available.
With the old behavior, which is the default in Tarantool 3.2.0,
the option is still available and has the same default value, but a
deprecation warning is logged when a non-zero or non-default value is used.
With the new behavior, its default value is zero and an error is raised
when a non-zero value is used.

(Please create https://tarantool.io/compat/box_cfg_wal_cleanup_delay)

Note that the compat option can be set only before initial `box.cfg` call.

We are planning to switch the compat option to the new behavior
starting from Tarantool 4.0 with the ability to revert to the
old behavior. Starting from Tarantool 5.0 we are planning to
drop `wal_cleanup_delay` completely.
-- We should create space and its index atomically so that we can
-- rely only on the fact that the space is created
log.info("open transaction to atomically create persistent WAL GC state")
box.begin()
Member:

Better use box.atomic so that if the code below fails, the transaction will be rolled back. Also, no need to log "begin" and "commit".

log.info("create space _gc_consumers")
local format = {{name = 'uuid', type = 'string'},
{name = 'vclock', type = 'map', is_nullable = true},
{name = 'opts', type = 'map', is_nullable = true}}
Member:

We never set null for options - better just insert an empty map. As for vclock, I don't see a point in making it nullable either - can't we use vclock_is_set instead? Also, why don't you use the uuid field type for the uuid field?

@@ -1397,6 +1473,7 @@ end
local function upgrade_to_3_0_0()
store_replicaset_uuid_in_new_way()
add_instance_names()
drop_persistent_gc_state()
Member:

So if I run Tarantool 2.11.4 and it has some gc consumers, they will be dropped if I upgrade to 3.2.0, potentially along with xlog files required by replicas? That's unexpected - better leave the space be.

Also, adding a feature in a minor release contradicts the release policy AFAIR. @Totktonada is that okay?

"3.0.0",
"3.1.0",
"3.2.0",
Member:

AFAIU you can't downgrade to 3.2.0 because it hasn't been released yet.

@nshy is that okay? Also, please review the patch and the test concerning the schema upgrade/downgrade.

@@ -0,0 +1,3 @@
## feature/box

* Now all replicas have WAL GC consumers persisted in the `_gc_consumers` space.
Member:

Ticket reference? I don't understand why the commit description references an EE ticket while the feature seems to be implemented completely in CE.

Collaborator:

That's my bad, I accidentally moved the relevant ticket into the EE repo :)
Here's a new one: #10154

@@ -1,3 +1,4 @@
## feature/box

* Now all replicas have WAL GC consumers persisted in the `_gc_consumers` space.
The `wal_cleanup_delay` option is no longer needed, so it is deprecated.
Member:

  • Isn't it still useful for anonymous replicas?
  • I don't think that deprecating it is a good idea because this option may be useful for other purposes, for example, the user may consume xlogs from another application.
  • AFAIU this is an EE feature, so it shouldn't be mentioned in the CE changelog.

src/box/gc.h Outdated
CFORMAT(printf, 3, 4)
void
gc_consumer_register(const struct tt_uuid *uuid, const struct vclock *vclock,
const char *format, ...);
Member:

This change makes the API confusing and inconsistent.

Can't we leave the old API as is and instead add a couple of new functions? Here's how I see it:

  1. GC fiber writes all registered consumers to the system space in the background.
  2. On recovery, all consumers are loaded and marked as "orphan".
  3. gc_consumer_register adopts the consumer from the orphan list if it finds it there.
  4. gc_consumer_unregister leaves the consumer in the orphan list unless it's "forgotten" (see p.6).
  5. gc_sync (new function) waits until GC fiber writes all consumers to the system space.
  6. gc_forget (new function; may need a better name) synchronously deletes the given consumer (by id) from the system space and marks the in-memory consumer (if it exists) so that it's deleted immediately on gc_consumer_unregister, or deletes it right away if it's unused.

Then we'd just need to put gc_sync after join/subscribe and gc_forget to _cluster.on_replace trigger. No global refactoring would be required - all the changes would be concentrated in src/box/gc.c.

t.assert_equals(box.info.vclock,
-- Ignore local lsn
local vclock = table.deepcopy(box.info.vclock)
vclock[0] = nil
Member:

Why so many changes in the test? This feature shouldn't result in any incompatible user-visible changes.

@locker locker assigned drewdzzz and unassigned locker Jun 21, 2024
@Gerold103 Gerold103 (Collaborator) left a comment:

Thanks for the fixes, cool stuff 🔥, really better than the previous version 💪! Let's see how we could make it even simpler. I wrote some comments, then noticed what Vova said, and I like those ideas. It might improve the patchset further.

Comment on lines 316 to 324
/*
* Drop consumer only if another replica does not own it.
* Another replica can own it, for example, when we created a new
* replica object on `replicaset_update`, and now we drop it because
* we can reuse the old one.
*/
if (replica_by_uuid(&replica->uuid) == NULL)
gc_consumer_unregister(&replica->uuid);
TRASH(replica);
Collaborator:

It is a bit sad that the consumer ownership is still here, but now it is implicit. Could you keep the gc_consumer pointer explicitly stored in the replica, like before? I mean, nothing really changed in this commit except that now we don't see who owns what and have to do hacks like this. You said "their lifetime became independent", but it didn't really.

Having the pointers didn't seem to block the anon consumers anyhow.

I would really appreciate it if we could find a way to keep the "ownership" concept. It is too easy to break things otherwise.

Also, if we allow multiple consumers for one UUID at once, we should mark one of them as the main one, and the others would be orphaned, with their changes not affecting the persisted state. Otherwise there needs to be explicit protection against having more than one consumer object for the same UUID (assertions for the impossible cases, runtime checks for client errors - currently, for example, I see that none of the rb_tree_insert calls are checked for failure).

* In this case background fiber will persist `volatile_vclock`
* when it isn't consistent with `vclock`.
*/
bool is_async_updated;
Collaborator:

Why is it needed? Isn't it always equal to vclock != volatile_vclock?

8 participants