New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix renaming data streams with CCR replication #88875
Conversation
840d570
to
32e82d7
Compare
@elasticsearchmachine test this please |
This commit fixes the situation where a user wants to use CCR to replicate indices that are part of a data stream while renaming the data stream. For example, assume a user has an auto-follow request that looks like this: ``` PUT /_ccr/auto_follow/my-auto-follow-pattern { "remote_cluster" : "other-cluster", "leader_index_patterns" : ["logs-*"], "follow_index_pattern" : "{{leader_index}}_copy" } ``` And then the data stream `logs-mysql-error` was created, creating the backing index `.ds-logs-mysql-error-2022-07-29-000001`. Prior to this commit, replicating this data stream means that the backing index would be renamed to `.ds-logs-mysql-error-2022-07-29-000001_copy` and the data stream would *not* be renamed. This caused a check to trip in `TransportPutLifecycleAction` asserting that a backing index was not renamed for a data stream during following. After this commit, there are a couple of changes: First, the data stream will also be renamed. This means that the `logs-mysql-error` becomes `logs-mysql-error_copy` when created on the follower cluster. Because of the way that CCR works, this means we need to support renaming a data stream for a regular "create follower" request, so a new parameter has been added: `data_stream_name`. It works like this: ``` PUT /mynewindex/_ccr/follow { "remote_cluster": "other-cluster", "leader_index": "myotherindex", "data_stream_name": "new_ds" } ``` Second, the backing index for a data stream must be renamed in a way that does not break the parsing of a data stream backing pattern, whereas previously the index `.ds-logs-mysql-error-2022-07-29-000001` would be renamed to `.ds-logs-mysql-error-2022-07-29-000001_copy` (an illegal name since it doesn't end with the rollover digit), after this commit it will be renamed to `.ds-logs-mysql-error_copy-2022-07-29-000001` to match the renamed data stream. This means that for the given `follow_index_pattern` of `{{leader_index}}_copy` the index changes look like: | Leader Cluster | Follower Cluster | |--------------|-----------| | `logs-mysql-error` (data stream) | `logs-mysql-error_copy` (data stream) | | `.ds-logs-mysql-error-2022-07-29-000001` | `.ds-logs-mysql-error_copy-2022-07-29-000001` | Which internally means the auto-follow request turned into the create follower request of: ``` PUT /.ds-logs-mysql-error_copy-2022-07-29-000001/_ccr/follow { "remote_cluster": "other-cluster", "leader_index": ".ds-logs-mysql-error-2022-07-29-000001", "data_stream_name": "logs-mysql-error_copy" } ``` Relates to elastic#84940 (cherry-picked the commit for a test) Relates to elastic#61993 (where data stream support was first introduced for CCR) Resolves elastic#81751
15b9fb4
to
e1d83ca
Compare
Pinging @elastic/es-distributed (Team:Distributed) |
Hi @dakrone, I've created a changelog YAML for you. |
Pinging @elastic/es-data-management (Team:Data Management) |
@elasticmachine run elasticsearch-ci/packaging-tests-unix-sample |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, I had some ideas/concerns shared, but I see that I am probably my understanding of the data stream is not complete :) . I would be happy to discuss those but I do not want to block this PR for this.
@@ -76,6 +76,26 @@ referenced leader index. When this API returns, the follower index exists, and | |||
(Required, string) The <<remote-clusters,remote cluster>> containing | |||
the leader index. | |||
|
|||
[[ccr-put-follow-request-body-data_stream_name]]`data_stream_name`:: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, if I understand correctly, CCR works on index level. With this API what we do is to follow a single leader index that belongs to a data stream and we provide also a data stream name. I do not see from this doc, any guidance on naming the follower index or the data stream of the follower index so it could be anything right?
Furthermore, I was wondering, if CCR is defined on an index level, this means that after the leader index will be rolled over the leader data stream would not be followed anymore. Right?
So, I was wondering. Is this API maybe too "free" and error prone, users might put invalid names and potentially create weird bugs?
Should we instead, verify the name of the follower index that it follows the data stream pattern and extract the data stream name from it?
So, if a user tries to follow an data stream backing index, for example, .ds-logs-mysql-default_copy-2022-01-01-000001
we will infer logs-mysql-default_copy
as the data stream name. While in the case they try to create a follower index like .ds-logs-mysql-default-2022-01-01-000001_copy
, we will fail and let them now that this is an invalid name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we instead, verify the name of the follower index that it follows the data stream pattern and extract the data stream name from it?
It's worse than that, a data stream can have any arbitrary backing index name, because we have the migrate to data stream API that converts an alias to a data stream. So a data stream named logs-mysql-info
could have backing indices named myfirstindex
and other-index
following no pattern. Rather than enforce that there is a pattern with the backing index names, I opted to go with a check for something that looks like our regular pattern and if that doesn't work, then we treat it as an "I don't know" pattern and just do the regular name replacement.
If for some reason we have something that looks like a good pattern name, but is not, then we throw an exception and stop the auto-following. I'll add a test for this specific scenario (in e3001d5).
Furthermore, I was wondering, if CCR is defined on an index level, this means that after the leader index will be rolled over the leader data stream would not be followed anymore. Right?
If the leader is rolled over then the auto-follow pattern would pick up the next index in the data stream (presumably <thing>-000002
), the data stream is not really "followed" because CCR doesn't really follow data streams, but the next index would be replicated, only the data stream would already exist because the first index would have created it already.
Things are a little different in ILM, because ILM auto-injects some unfollow
actions around rollover, but regardless the naming thing would be the same.
...ck/plugin/ccr/src/main/java/org/elasticsearch/xpack/ccr/action/TransportPutFollowAction.java
Outdated
Show resolved
Hide resolved
assertThat(request.getRemoteCluster(), equalTo("remote")); | ||
assertThat(request.getFollowerIndex(), equalTo("my-backing-index_copy")); | ||
assertThat(request.getLeaderIndex(), equalTo("my-backing-index")); | ||
assertThat(request.getDataStreamName(), equalTo("logs-foo-bar_copy")); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test is nicely representing the concern I raised earlier. I didn't know that it was possible and it's hard for me to assess the consequences.
Co-authored-by: Mary Gouseti <mgouseti@gmail.com>
💚 All backports created successfully
Questions ?Please refer to the Backport tool documentation |
This commit fixes the situation where a user wants to use CCR to replicate indices that are part of a data stream while renaming the data stream. For example, assume a user has an auto-follow request that looks like this: ``` PUT /_ccr/auto_follow/my-auto-follow-pattern { "remote_cluster" : "other-cluster", "leader_index_patterns" : ["logs-*"], "follow_index_pattern" : "{{leader_index}}_copy" } ``` And then the data stream `logs-mysql-error` was created, creating the backing index `.ds-logs-mysql-error-2022-07-29-000001`. Prior to this commit, replicating this data stream means that the backing index would be renamed to `.ds-logs-mysql-error-2022-07-29-000001_copy` and the data stream would *not* be renamed. This caused a check to trip in `TransportPutLifecycleAction` asserting that a backing index was not renamed for a data stream during following. After this commit, there are a couple of changes: First, the data stream will also be renamed. This means that the `logs-mysql-error` becomes `logs-mysql-error_copy` when created on the follower cluster. Because of the way that CCR works, this means we need to support renaming a data stream for a regular "create follower" request, so a new parameter has been added: `data_stream_name`. It works like this: ``` PUT /mynewindex/_ccr/follow { "remote_cluster": "other-cluster", "leader_index": "myotherindex", "data_stream_name": "new_ds" } ``` Second, the backing index for a data stream must be renamed in a way that does not break the parsing of a data stream backing pattern, whereas previously the index `.ds-logs-mysql-error-2022-07-29-000001` would be renamed to `.ds-logs-mysql-error-2022-07-29-000001_copy` (an illegal name since it doesn't end with the rollover digit), after this commit it will be renamed to `.ds-logs-mysql-error_copy-2022-07-29-000001` to match the renamed data stream. This means that for the given `follow_index_pattern` of `{{leader_index}}_copy` the index changes look like: | Leader Cluster | Follower Cluster | |--------------|-----------| | `logs-mysql-error` (data stream) | `logs-mysql-error_copy` (data stream) | | `.ds-logs-mysql-error-2022-07-29-000001` | `.ds-logs-mysql-error_copy-2022-07-29-000001` | Which internally means the auto-follow request turned into the create follower request of: ``` PUT /.ds-logs-mysql-error_copy-2022-07-29-000001/_ccr/follow { "remote_cluster": "other-cluster", "leader_index": ".ds-logs-mysql-error-2022-07-29-000001", "data_stream_name": "logs-mysql-error_copy" } ``` Relates to elastic#84940 (cherry-picked the commit for a test) Relates to elastic#61993 (where data stream support was first introduced for CCR) Resolves elastic#81751 (cherry picked from commit 3420be0)
This disabled backwards compatibility tests so the CCR bugfix can be backported, since it includes some version-specific serialization.
Updates versions for backport (elastic#88995) of original fix (elastic#88875) Relates to elastic#88997
This commit fixes the situation where a user wants to use CCR to replicate indices that are part of a data stream while renaming the data stream. For example, assume a user has an auto-follow request that looks like this: ``` PUT /_ccr/auto_follow/my-auto-follow-pattern { "remote_cluster" : "other-cluster", "leader_index_patterns" : ["logs-*"], "follow_index_pattern" : "{{leader_index}}_copy" } ``` And then the data stream `logs-mysql-error` was created, creating the backing index `.ds-logs-mysql-error-2022-07-29-000001`. Prior to this commit, replicating this data stream means that the backing index would be renamed to `.ds-logs-mysql-error-2022-07-29-000001_copy` and the data stream would *not* be renamed. This caused a check to trip in `TransportPutLifecycleAction` asserting that a backing index was not renamed for a data stream during following. After this commit, there are a couple of changes: First, the data stream will also be renamed. This means that the `logs-mysql-error` becomes `logs-mysql-error_copy` when created on the follower cluster. Because of the way that CCR works, this means we need to support renaming a data stream for a regular "create follower" request, so a new parameter has been added: `data_stream_name`. It works like this: ``` PUT /mynewindex/_ccr/follow { "remote_cluster": "other-cluster", "leader_index": "myotherindex", "data_stream_name": "new_ds" } ``` Second, the backing index for a data stream must be renamed in a way that does not break the parsing of a data stream backing pattern, whereas previously the index `.ds-logs-mysql-error-2022-07-29-000001` would be renamed to `.ds-logs-mysql-error-2022-07-29-000001_copy` (an illegal name since it doesn't end with the rollover digit), after this commit it will be renamed to `.ds-logs-mysql-error_copy-2022-07-29-000001` to match the renamed data stream. This means that for the given `follow_index_pattern` of `{{leader_index}}_copy` the index changes look like: | Leader Cluster | Follower Cluster | |--------------|-----------| | `logs-mysql-error` (data stream) | `logs-mysql-error_copy` (data stream) | | `.ds-logs-mysql-error-2022-07-29-000001` | `.ds-logs-mysql-error_copy-2022-07-29-000001` | Which internally means the auto-follow request turned into the create follower request of: ``` PUT /.ds-logs-mysql-error_copy-2022-07-29-000001/_ccr/follow { "remote_cluster": "other-cluster", "leader_index": ".ds-logs-mysql-error-2022-07-29-000001", "data_stream_name": "logs-mysql-error_copy" } ``` Relates to #84940 (cherry-picked the commit for a test) Relates to #61993 (where data stream support was first introduced for CCR) Resolves #81751
This commit fixes the situation where a user wants to use CCR to replicate indices that are part of
a data stream while renaming the data stream. For example, assume a user has an auto-follow request
that looks like this:
And then the data stream
logs-mysql-error
was created, creating the backing index.ds-logs-mysql-error-2022-07-29-000001
.Prior to this commit, replicating this data stream means that the backing index would be renamed to
.ds-logs-mysql-error-2022-07-29-000001_copy
and the data stream would not be renamed. Thiscaused a check to trip in
TransportPutLifecycleAction
asserting that a backing index was notrenamed for a data stream during following.
After this commit, there are a couple of changes:
First, the data stream will also be renamed. This means that the
logs-mysql-error
becomeslogs-mysql-error_copy
when created on the follower cluster. Because of the way that CCR works,this means we need to support renaming a data stream for a regular "create follower" request, so a
new parameter has been added:
data_stream_name
. It works like this:Second, the backing index for a data stream must be renamed in a way that does not break the parsing
of a data stream backing pattern, whereas previously the index
.ds-logs-mysql-error-2022-07-29-000001
would be renamed to.ds-logs-mysql-error-2022-07-29-000001_copy
(an illegal name since it doesn't end with therollover digit), after this commit it will be renamed to
.ds-logs-mysql-error_copy-2022-07-29-000001
to match the renamed data stream. This means that forthe given
follow_index_pattern
of{{leader_index}}_copy
the index changes look like:logs-mysql-error
(data stream)logs-mysql-error_copy
(data stream).ds-logs-mysql-error-2022-07-29-000001
.ds-logs-mysql-error_copy-2022-07-29-000001
Which internally means the auto-follow request turned into the create follower request of:
Relates to #84940 (cherry-picked the commit for a test)
Relates to #61993 (where data stream support was first introduced for CCR)
Resolves #81751