Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Deprecate sorting in reindex #49458

Merged

Conversation

henningandersen
Copy link
Contributor

@henningandersen henningandersen commented Nov 21, 2019

Reindex sort never gave a guarantee about the order of documents being
indexed into the destination, though it could give a sense of locality
of source data.

It prevents us from doing resilient reindex and other optimizations and
it has therefore been deprecated.

Related to #47567

The deprecation notice in ILM documentations use of sorted reindex may need further refinement once we move to removing sort.

Reindex sort never gave a guarantee about the order of documents being
indexed into the destination, though it could give a sense of locality
of source data.

It prevents us from doing resilient reindex and other optimizations and
it has therefore been deprecated.

Related to elastic#47567
@henningandersen henningandersen added >deprecation v8.0.0 :Distributed/Reindex Issues relating to reindex that are not caused by issues further down v7.6.0 labels Nov 21, 2019
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-distributed (:Distributed/Reindex)

Copy link
Contributor

@Tim-Brooks Tim-Brooks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Comment on lines 478 to 484
`sort`:::
(Optional, list) A comma-separated list of `<field>:<direction>` pairs to sort by before indexing.
Use in conjunction with `max_docs` to control what documents are reindexed.
deprecated[7.6, sort in reindex is deprecated] *NOTE: Sorting in reindex is deprecated.*
Sorting in reindex was never guaranteed to index
documents in order and prevents future evolution of reindex such as resilience and performance
improvements. If used in combination with `max_docs`, consider using a query filter instead.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like the deprecated macro and the bold note are a bit redundant.

I'd place this content into a block deprecated macro like below.

`sort`:::
+
--
(Optional, list) A comma-separated list of `<field>:<direction>` pairs to sort by before indexing.
Use in conjunction with `max_docs` to control what documents are reindexed.

deprecated::[7.6,Sort in reindex is deprecated. Sorting in reindex was never guaranteed to index documents in order and prevents reindex such as resilience and performance improvements. If used in combination with `max_docs`, consider using a query filter instead.]
--

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried this now, but I get an error saying:

element listitem: validity error : Element listitem content does not follow the DTD

which is why I originally went for the simpler solution. My asciidoc skills are not super-advanced, so maybe there is a trick to make this work?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was able to get this working on my end without that error.

Screen Shot 2019-11-27 at 3 43 43 PM

Here's the commit on my fork if that helps:
https://github.com/jrodewig/elasticsearch/commit/7111829c9d08cba24af5dc776c53e958a7490a58

Feel free to cherry-pick it or copy/paste the contents.

There are a few tricks:

  • The delimiter setup is needed.
  • The deprecated text must be on a single line.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What you have is workable though. If you're not able to get the streamlined deprecated note working, I wouldn't hold up this PR. We can reformat it as a separate issue.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks so much @jrodewig , worked out fine on one line (had to also escape the comma), see: aa2a7e0

@@ -478,6 +478,10 @@ which defaults to a maximum size of 100 MB.
`sort`:::
(Optional, list) A comma-separated list of `<field>:<direction>` pairs to sort by before indexing.
Use in conjunction with `max_docs` to control what documents are reindexed.
deprecated[7.6, sort in reindex is deprecated] *NOTE: Sorting in reindex is deprecated.*
Sorting in reindex was never guaranteed to index
documents in order and prevents future evolution of reindex such as resilience and performance
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd go with "further" rather than "future"

Suggested change
documents in order and prevents future evolution of reindex such as resilience and performance
documents in order and prevents further development of reindex, such as resilience and performance

Copy link
Contributor

@jrodewig jrodewig left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left a suggestion re: streamlining the deprecated note. Also left a minor editorial comment.

Otherwise LGTM.

@@ -837,8 +818,8 @@ POST _reindex
----------------------------------------------------------------
// TEST[setup:big_twitter]

<1> `_reindex` defaults to sorting by `_doc` so `random_score` will not have any
effect unless you override the sort to `_score`.
<1> you may need to adjust the `min_score` depending on the relative amount of
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
<1> you may need to adjust the `min_score` depending on the relative amount of
<1> You may need to adjust the `min_score` depending on the relative amount of

Copy link
Contributor

@jrodewig jrodewig left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. I can address reformatting the deprecated note as a separate issue.

Streamline deprecation notice and better text.

Add deprecation notice to ILM reindex doc.
@henningandersen
Copy link
Contributor Author

Thanks for reviewing @jrodewig and @tbrooks8.

@henningandersen
Copy link
Contributor Author

@elasticmachine run elasticsearch-ci/bwc
@elasticmachine run elasticsearch-ci/packaging-sample-matrix

@henningandersen henningandersen merged commit 5b56a99 into elastic:master Nov 29, 2019
henningandersen added a commit that referenced this pull request Nov 29, 2019
Reindex sort never gave a guarantee about the order of documents being
indexed into the destination, though it could give a sense of locality
of source data.

It prevents us from doing resilient reindex and other optimizations and
it has therefore been deprecated.

Related to #47567
henningandersen added a commit that referenced this pull request Nov 29, 2019
henningandersen added a commit to henningandersen/elasticsearch that referenced this pull request Nov 29, 2019
Reindex sort never gave a guarantee about the order of documents being
indexed into the destination, though it could give a sense of locality
of source data.

It prevents us from doing resilient reindex and other optimizations and
it has therefore been deprecated.

Related to elastic#47567
henningandersen added a commit that referenced this pull request Dec 1, 2019
Reindex sort never gave a guarantee about the order of documents being
indexed into the destination, though it could give a sense of locality
of source data.

It prevents us from doing resilient reindex and other optimizations and
it has therefore been deprecated.

Related to #47567
henningandersen added a commit to henningandersen/elasticsearch that referenced this pull request Dec 5, 2019
Moved the deprecation warning to ReindexValidator to ensure it runs
early and works with resilient reindex. Also check that the warning
is reported back also for wait_for_completion=false.

Follow-up to elastic#49458
henningandersen added a commit that referenced this pull request Dec 6, 2019
Moved the deprecation warning to ReindexValidator to ensure it runs
early and works with resilient reindex. Also check that the warning
is reported back for wait_for_completion=false.

Follow-up to #49458
henningandersen added a commit to henningandersen/elasticsearch that referenced this pull request Dec 6, 2019
Moved the deprecation warning to ReindexValidator to ensure it runs
early and works with resilient reindex. Also check that the warning
is reported back for wait_for_completion=false.

Follow-up to elastic#49458
henningandersen added a commit that referenced this pull request Dec 6, 2019
Moved the deprecation warning to ReindexValidator to ensure it runs
early and works with resilient reindex. Also check that the warning
is reported back for wait_for_completion=false.

Follow-up to #49458
SivagurunathanV pushed a commit to SivagurunathanV/elasticsearch that referenced this pull request Jan 23, 2020
Reindex sort never gave a guarantee about the order of documents being
indexed into the destination, though it could give a sense of locality
of source data.

It prevents us from doing resilient reindex and other optimizations and
it has therefore been deprecated.

Related to elastic#47567
SivagurunathanV pushed a commit to SivagurunathanV/elasticsearch that referenced this pull request Jan 23, 2020
Moved the deprecation warning to ReindexValidator to ensure it runs
early and works with resilient reindex. Also check that the warning
is reported back for wait_for_completion=false.

Follow-up to elastic#49458
russcam added a commit to elastic/elasticsearch-net that referenced this pull request Feb 4, 2020
russcam added a commit to elastic/elasticsearch-net that referenced this pull request Feb 9, 2020
russcam added a commit to elastic/elasticsearch-net that referenced this pull request Feb 9, 2020
Relates: #4341

Deprecate sorting in reindex elastic/elasticsearch#49458 (issue: elastic/elasticsearch#47567)

Closes #4356

(cherry picked from commit 20a2133)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
>deprecation :Distributed/Reindex Issues relating to reindex that are not caused by issues further down v7.6.0 v8.0.0-alpha1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants