[feature request] copy_index API (reindex with file copy, not doc insert) #44128

Closed
riverbuilding opened this issue Jul 9, 2019 · 9 comments
Labels
:Data Management/Indices APIs APIs to create and manage indices and templates

Comments

@riverbuilding
Contributor

Use case: copy an existing index as a backup to support rollback, then update the original index and move forward, all within the same cluster.

ES provides the reindex API, which can copy an index with a lot of flexibility (schema changes, document filtering, etc.).
It has many limitations, though. On the performance side, a file-copy approach is much faster than reindexing, especially when the index is large.

Proposed API:

POST _copyindex
{
  "source": {
    "index": "twitter"
  },
  "dest": {
    "index": "new_twitter",
    "replica": <number>
  }
}

Here "replica" must be equal to or less than the original index's replica count; otherwise the request fails.

It's really like the snapshot-restore process, except that the repository is ES itself.
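
The replica rule in the proposed request could be checked roughly as follows. This is only a sketch of the proposal: `_copyindex` is not an existing Elasticsearch endpoint, and the function name `validate_copy_request`, the `source_replicas` parameter, and the choice to default to the source's replica count when "replica" is omitted are all assumptions made for illustration.

```python
def validate_copy_request(request: dict, source_replicas: int) -> int:
    """Return the replica count to use for the hypothetical _copyindex request.

    Raises ValueError if the request asks for more replicas than the
    source index has, mirroring the rule stated in the proposal.
    """
    dest = request["dest"]
    # Assumption: if "replica" is omitted, reuse the source's replica count.
    replicas = dest.get("replica", source_replicas)
    if not isinstance(replicas, int) or replicas < 0:
        raise ValueError("replica must be a non-negative integer")
    if replicas > source_replicas:
        raise ValueError(
            f"replica ({replicas}) must be <= source replica count ({source_replicas})"
        )
    return replicas


request = {
    "source": {"index": "twitter"},
    "dest": {"index": "new_twitter", "replica": 1},
}
replicas_to_use = validate_copy_request(request, source_replicas=1)
```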

@mark-vieira mark-vieira added the :Distributed/Snapshot/Restore Anything directly related to the `_snapshot/*` APIs label Jul 9, 2019
@elasticmachine
Collaborator

Pinging @elastic/es-distributed

@original-brownbear
Member

@riverbuilding just to understand this feature request better:

Your motivation is to effectively have ES be the repository instead of using another repository implementation to save the complexity of setting up a repository?

@ywelsch
Contributor

ywelsch commented Jul 10, 2019

As far as I understand, this could be the index split functionality with the additional option of keeping the number of primary shards the same in the target index.

@riverbuilding
Contributor Author

@ywelsch split is really what I need in this situation. thanks a lot.

@riverbuilding
Contributor Author

@ywelsch I'd like to ask: when splitting an index, can the "can't select recover from shards if both indices have the same number of shards" check be omitted? In other words, can the target index be given the same number of shards as the original index?

@ywelsch
Contributor

ywelsch commented Jul 11, 2019

Right now, it can't. If we were to allow this, I would prefer to expose it via a new endpoint, _clone, rather than enable it on _split.

@ywelsch ywelsch reopened this Jul 11, 2019
@ywelsch ywelsch added team-discuss :Data Management/Indices APIs APIs to create and manage indices and templates and removed :Distributed/Snapshot/Restore Anything directly related to the `_snapshot/*` APIs labels Jul 11, 2019
@elasticmachine
Collaborator

Pinging @elastic/es-core-features

@riverbuilding
Contributor Author

@ywelsch _clone is really a good name. thanks

@jakelandis
Contributor

We discussed this today and think this is a good enhancement. We also like the _clone name, implemented on top of the split functionality.
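
To make the relationship between the three resize operations concrete, here is a small sketch that picks an endpoint from the source and target primary shard counts. The function name `resize_endpoint` is hypothetical. The divisibility checks reflect the commonly documented shrink and split rules (the target count is a factor or a multiple of the source count); other constraints, such as split's dependence on `index.number_of_routing_shards`, are deliberately left out.

```python
def resize_endpoint(source_shards: int, target_shards: int) -> str:
    """Pick the resize endpoint for a given change in primary shard count.

    Simplified sketch: only the shard-count relationship is checked.
    """
    if source_shards < 1 or target_shards < 1:
        raise ValueError("shard counts must be positive")
    if target_shards == source_shards:
        # Same number of primaries: the case discussed in this issue.
        return "_clone"
    if target_shards > source_shards:
        if target_shards % source_shards != 0:
            raise ValueError("split target must be a multiple of the source shard count")
        return "_split"
    if source_shards % target_shards != 0:
        raise ValueError("shrink target must be a factor of the source shard count")
    return "_shrink"


same_count = resize_endpoint(5, 5)
more_shards = resize_endpoint(5, 10)
fewer_shards = resize_endpoint(8, 2)
```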

ywelsch added a commit that referenced this issue Jul 25, 2019
Adds an API to clone an index. This is similar to the index split and shrink APIs, with the
difference that the number of primary shards is kept the same. In cases where the filesystem
provides hard-linking capabilities, this is a very cheap operation.

Index cloning can be done by running `POST my_source_index/_clone/my_target_index`, and it
supports the same options as the split and shrink APIs.

Closes #44128
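
As a rough illustration of the request described in the commit message, the sketch below only assembles the HTTP method, path, and JSON body for a clone call. It does not send anything. The helper name `build_clone_request` is hypothetical, and the `index.number_of_replicas` setting is just an example of an option that the split and shrink APIs also accept in their `settings` body.

```python
import json
from typing import Optional, Tuple


def build_clone_request(
    source: str, target: str, settings: Optional[dict] = None
) -> Tuple[str, str, str]:
    """Return (method, path, body) for cloning `source` into `target`."""
    path = f"/{source}/_clone/{target}"
    # The body is optional; include a "settings" object only when overrides are given.
    body = {"settings": settings} if settings else {}
    return "POST", path, json.dumps(body)


method, path, body = build_clone_request(
    "my_source_index",
    "my_target_index",
    {"index.number_of_replicas": 1},
)
```

The resulting triple can be handed to any HTTP client. As with split and shrink, the source index is expected to be blocked for writes before the operation is run.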
polyfractal pushed a commit to polyfractal/elasticsearch that referenced this issue Jul 29, 2019
jkakavas pushed a commit that referenced this issue Jul 31, 2019