Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: support to force send non-idempotent write when doing duplication #1908

Open
wants to merge 28 commits into
base: master
Choose a base branch
from

Conversation

ninsmiracle
Copy link
Contributor

What problem does this PR solve?

#1907

What is changed and how does it work?

  1. Change the judgment logic in the writing process, and configure it to allow non-idempotent operations in the master cluster.
  2. Modify the logic of the specified mutation committer in prepare_list
  3. Add configuration items and add them to the replica side to modify them through the server-config of admin-cli.
  4. Add the parsing logic after the backup cluster receives the packaged RPC of duplication.
  5. Add the logic for the master cluster to obtain relevant non-idempotent RPC from the plog
  6. Add the metric index of dup non-idempotent write QPS
Tests
  • Unit test
  • Manual test (add detailed scripts or steps below)

when user want to force dup non-idempotent mutations. Type following command in admin-cli:

server-config replica force_send_no_idempotent_when_duplication set true

@github-actions github-actions bot added the cpp label Feb 18, 2024
@ninsmiracle ninsmiracle changed the title Support no idempotent dupFeature: support force no idempotent wirte when doing duplication Feature: support force no idempotent wirte when doing duplication Feb 18, 2024
@ninsmiracle ninsmiracle changed the title Feature: support force no idempotent wirte when doing duplication Feature: support force send non-idempotent wirte when doing duplication Feb 18, 2024
@empiredan empiredan changed the title Feature: support force send non-idempotent wirte when doing duplication feat: support to force send non-idempotent write when doing duplication Feb 20, 2024
@ninsmiracle
Copy link
Contributor Author

This PR's base function is in line with expectations , however I was requested by business department to add more matrix for surveillance system about this feature. I will update this PR in recently days.

@ninsmiracle
Copy link
Contributor Author

Now , when user config force_send_no_idempotent_when_duplication to true . We can dup non-idempotent write to backup cluster , and we can verify the data consistency by check the number rows of pegasus app.
app in master-cluster
image

app in backup-cluster
image

And because the retries for non-idempotent writes will make influence of the consistency between two clusters. So we should show the metric like following picture:
How many non-idempotent write have been retried when doing duplication

show in backup cluster:
How many non-idempotent write have been received by backup cluster

src/common/duplication_common.cpp Outdated Show resolved Hide resolved
src/common/duplication_common.cpp Outdated Show resolved Hide resolved
@@ -31,6 +31,7 @@
#include "utils/fmt_utils.h"

DSN_DECLARE_uint32(duplicate_log_batch_bytes);
DSN_DECLARE_bool(force_send_no_idempotent_when_duplication);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it necessary to add it here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's necessary . Cause once I remove this line , compilation can not be passed.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The DSN_DECLARE_xxx macros are recommended to move to cpp files where actually use it, don't make the header files to be too large.

The same to DSN_DECLARE_uint32(duplicate_log_batch_bytes);

src/server/pegasus_write_service.cpp Outdated Show resolved Hide resolved
src/server/pegasus_mutation_duplicator.cpp Outdated Show resolved Hide resolved
src/server/pegasus_mutation_duplicator.cpp Outdated Show resolved Hide resolved
src/server/pegasus_mutation_duplicator.cpp Outdated Show resolved Hide resolved

if (request.task_code == dsn::apps::RPC_RRDB_RRDB_INCR) {
incr_rpc rpc(write);
resp.__set_error(_impl->incr(ctx.decree, rpc.request(), rpc.response()));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add some comments to describe what's you aim here.

src/common/duplication_common.cpp Outdated Show resolved Hide resolved
src/server/pegasus_mutation_duplicator.cpp Outdated Show resolved Hide resolved
src/common/duplication_common.cpp Outdated Show resolved Hide resolved
@@ -31,6 +31,7 @@
#include "utils/fmt_utils.h"

DSN_DECLARE_uint32(duplicate_log_batch_bytes);
DSN_DECLARE_bool(duplication_unsafe_allow_non_idempotent);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Move these declarations to the places where actually use them, don't put them in header files.

@@ -89,8 +94,14 @@ class pegasus_mutation_duplicator : public dsn::replication::mutation_duplicator

size_t _total_shipped_size{0};

const std::set<dsn::task_code> _non_idempotent_code = {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
const std::set<dsn::task_code> _non_idempotent_code = {
const std::set<dsn::task_code> _non_idempotent_codes = {

@@ -89,8 +94,14 @@ class pegasus_mutation_duplicator : public dsn::replication::mutation_duplicator

size_t _total_shipped_size{0};

const std::set<dsn::task_code> _non_idempotent_code = {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Try to reuse it in some other places, e.g. src/server/pegasus_write_service.cpp

Comment on lines 429 to 432
// receive non-idempotent request from master cluster via duplication when
// FLAG_duplication_unsafe_allow_non_idempotent set as true.
// This metric greater than zero means that there is already the possibility of
// inconsistency between clusters
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Move these comments to the dup_unsafe_received_non_idempotent_duplicate_request metric description, then users know what does it mean from http API, rather than finding out the comments in code.

src/server/pegasus_mutation_duplicator.cpp Outdated Show resolved Hide resolved
src/server/pegasus_mutation_duplicator.cpp Outdated Show resolved Hide resolved
src/server/pegasus_mutation_duplicator.cpp Outdated Show resolved Hide resolved
@@ -189,6 +211,9 @@ void pegasus_mutation_duplicator::on_duplicate_reply(uint64_t hash,
// retry this rpc
_inflights[hash].push_front(rpc);
_env.schedule([hash, cb, this]() { send(hash, cb); }, 1_s);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will the non-idempotent request be retried twice here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes,it will. So I record specific raw_key into log.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants