Merge branch 'master' into replicated-closed-indices
tlrx committed Feb 20, 2019
2 parents b756f6c + 12006ea commit 538cdcd
Showing 141 changed files with 2,315 additions and 1,289 deletions.
@@ -129,6 +129,7 @@ public void setDistribution(Distribution distribution) {
public void freeze() {
requireNonNull(distribution, "null distribution passed when configuring test cluster `" + this + "`");
requireNonNull(version, "null version passed when configuring test cluster `" + this + "`");
requireNonNull(javaHome, "null javaHome passed when configuring test cluster `" + this + "`");
logger.info("Locking configuration of `{}`", this);
configurationFrozen.set(true);
}
@@ -204,16 +205,7 @@ private void startElasticsearchProcess(Path distroArtifact) {
Map<String, String> environment = processBuilder.environment();
// Don't inherit anything from the environment for as that would lack reproductability
environment.clear();
if (javaHome != null) {
environment.put("JAVA_HOME", getJavaHome().getAbsolutePath());
} else if (System.getenv().get("JAVA_HOME") != null) {
logger.warn("{}: No java home configured will use it from environment: {}",
this, System.getenv().get("JAVA_HOME")
);
environment.put("JAVA_HOME", System.getenv().get("JAVA_HOME"));
} else {
logger.warn("{}: No javaHome configured, will rely on default java detection", this);
}
environment.put("JAVA_HOME", getJavaHome().getAbsolutePath());
environment.put("ES_PATH_CONF", configFile.getParent().toAbsolutePath().toString());
environment.put("ES_JAVA_OPTIONS", "-Xms512m -Xmx512m");
// don't buffer all in memory, make sure we don't block on the default pipes
@@ -1261,7 +1261,8 @@ public void testGetAlias() throws IOException {
GetAliasesResponse getAliasesResponse = execute(getAliasesRequest, highLevelClient().indices()::getAlias,
highLevelClient().indices()::getAliasAsync);

assertThat(getAliasesResponse.getAliases().size(), equalTo(3));
assertThat("Unexpected number of aliases, got: " + getAliasesResponse.getAliases().toString(),
getAliasesResponse.getAliases().size(), equalTo(3));
assertThat(getAliasesResponse.getAliases().get("index1").size(), equalTo(1));
AliasMetaData aliasMetaData1 = getAliasesResponse.getAliases().get("index1").iterator().next();
assertThat(aliasMetaData1, notNullValue());
4 changes: 1 addition & 3 deletions distribution/docker/src/docker/Dockerfile
@@ -23,8 +23,6 @@ RUN curl --retry 8 -s ${jdkUrl} | tar -C /opt -zxf -
# REF: https://github.com/elastic/elasticsearch-docker/issues/171
RUN ln -sf /etc/pki/ca-trust/extracted/java/cacerts /opt/jdk-${jdkVersion}/lib/security/cacerts

RUN yum install -y unzip which

RUN groupadd -g 1000 elasticsearch && \
adduser -u 1000 -g 1000 -d /usr/share/elasticsearch elasticsearch

@@ -51,7 +49,7 @@ ENV JAVA_HOME /opt/jdk-${jdkVersion}
COPY --from=builder /opt/jdk-${jdkVersion} /opt/jdk-${jdkVersion}

RUN yum update -y && \
yum install -y nc unzip wget which && \
yum install -y nc && \
yum clean all

RUN groupadd -g 1000 elasticsearch && \
218 changes: 133 additions & 85 deletions docs/plugins/repository-s3.asciidoc

Large diffs are not rendered by default.

@@ -1,8 +1,6 @@
[[analysis-synonym-graph-tokenfilter]]
=== Synonym Graph Token Filter

beta[]

The `synonym_graph` token filter allows to easily handle synonyms,
including multi-word synonyms correctly during the analysis process.

@@ -187,3 +185,8 @@ multiple versions of a token may choose which version of the token to emit when
parsing synonyms, e.g. `asciifolding` will only produce the folded version of the
token. Others, e.g. `multiplexer`, `word_delimiter_graph` or `ngram` will throw an
error.

WARNING: The synonym rules should not contain words that are removed by
a filter that appears later in the chain (a `stop` filter, for instance).
Removing a term from a synonym rule breaks the matching at query time.
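
As an illustration of the pitfall described in the warning (the index, filter names, and terms below are made up for this sketch), the following analyzer places a `stop` filter after a `synonym_graph` filter whose rule produces a term that the `stop` filter removes:

[source,js]
--------------------------------------------------
PUT /synonym_pitfall_example
{
  "settings": {
    "analysis": {
      "filter": {
        "my_synonyms": {
          "type": "synonym_graph",
          "synonyms": [ "laptop, notebook" ]
        },
        "my_stop": {
          "type": "stop",
          "stopwords": [ "notebook" ]
        }
      },
      "analyzer": {
        "broken_synonym_analyzer": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": [ "my_synonyms", "my_stop" ]
        }
      }
    }
  }
}
--------------------------------------------------
// CONSOLE

Here `notebook` is expanded by the synonym rule and then stripped by `my_stop`, which is exactly the situation that breaks matching at query time.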

2 changes: 1 addition & 1 deletion docs/reference/cat/shards.asciidoc
@@ -56,7 +56,7 @@ twitter 0 p STARTED 3014 31.1mb 192.168.56.10 H5dfFeA
[[relocation]]
=== Relocation

Let's say you've checked your health and you see a relocating
Let's say you've checked your health and you see relocating
shards. Where are they from and where are they going?

[source,js]
16 changes: 8 additions & 8 deletions docs/reference/docs/bulk.asciidoc
@@ -36,11 +36,11 @@ optional_source\n
--------------------------------------------------
// NOTCONSOLE

*NOTE*: the final line of data must end with a newline character `\n`. Each newline character
*NOTE*: The final line of data must end with a newline character `\n`. Each newline character
may be preceded by a carriage return `\r`. When sending requests to this endpoint the
`Content-Type` header should be set to `application/x-ndjson`.

The possible actions are `index`, `create`, `delete` and `update`.
The possible actions are `index`, `create`, `delete`, and `update`.
`index` and `create` expect a source on the next
line, and have the same semantics as the `op_type` parameter to the
standard index API (i.e. create will fail if a document with the same
@@ -214,7 +214,7 @@ documents. See <<optimistic-concurrency-control>> for more details.
Each bulk item can include the version value using the
`version` field. It automatically follows the behavior of the
index / delete operation based on the `_version` mapping. It also
support the `version_type` (see <<index-versioning, versioning>>)
support the `version_type` (see <<index-versioning, versioning>>).
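
As a sketch (index, id, and version values are placeholders), a bulk item that carries a version might look like:

[source,js]
--------------------------------------------------
POST _bulk
{ "index" : { "_index" : "test", "_id" : "1", "version" : 5, "version_type" : "external" } }
{ "field1" : "value1" }
--------------------------------------------------
// CONSOLE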

[float]
[[bulk-routing]]
@@ -245,20 +245,20 @@ NOTE: Only the shards that receive the bulk request will be affected by
`refresh`. Imagine a `_bulk?refresh=wait_for` request with three
documents in it that happen to be routed to different shards in an index
with five shards. The request will only wait for those three shards to
refresh. The other two shards of that make up the index do not
refresh. The other two shards that make up the index do not
participate in the `_bulk` request at all.
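
For instance (index and documents are placeholders), a bulk request that waits for the affected shards to refresh before returning could be issued as:

[source,js]
--------------------------------------------------
POST _bulk?refresh=wait_for
{ "index" : { "_index" : "test", "_id" : "1" } }
{ "field1" : "value1" }
{ "index" : { "_index" : "test", "_id" : "2" } }
{ "field1" : "value2" }
--------------------------------------------------
// CONSOLE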

[float]
[[bulk-update]]
=== Update

When using `update` action `retry_on_conflict` can be used as field in
When using the `update` action, `retry_on_conflict` can be used as a field in
the action itself (not in the extra payload line), to specify how many
times an update should be retried in the case of a version conflict.
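
A sketch of such an update action (index, id, and document are placeholders), retrying up to three times on a version conflict:

[source,js]
--------------------------------------------------
POST _bulk
{ "update" : { "_index" : "index1", "_id" : "1", "retry_on_conflict" : 3 } }
{ "doc" : { "field" : "value" } }
--------------------------------------------------
// CONSOLE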

The `update` action payload, supports the following options: `doc`
The `update` action payload supports the following options: `doc`
(partial document), `upsert`, `doc_as_upsert`, `script`, `params` (for
script), `lang` (for script) and `_source`. See update documentation for details on
script), `lang` (for script), and `_source`. See update documentation for details on
the options. Example with update actions:

[source,js]
@@ -282,4 +282,4 @@ POST _bulk
[[bulk-security]]
=== Security

See <<url-access-control>>
See <<url-access-control>>.
54 changes: 27 additions & 27 deletions docs/reference/docs/delete-by-query.asciidoc
@@ -2,7 +2,7 @@
== Delete By Query API

The simplest usage of `_delete_by_query` just performs a deletion on every
document that match a query. Here is the API:
document that matches a query. Here is the API:

[source,js]
--------------------------------------------------
@@ -20,7 +20,7 @@ POST twitter/_delete_by_query

<1> The query must be passed as a value to the `query` key, in the same
way as the <<search-search,Search API>>. You can also use the `q`
parameter in the same way as the search api.
parameter in the same way as the search API.

That will return something like this:

@@ -68,7 +68,7 @@ these documents. In case a search or bulk request got rejected, `_delete_by_quer
failures that are returned by the failing bulk request are returned in the `failures`
element; therefore it's possible for there to be quite a few failed entities.

If you'd like to count version conflicts rather than cause them to abort then
If you'd like to count version conflicts rather than cause them to abort, then
set `conflicts=proceed` on the url or `"conflicts": "proceed"` in the request body.
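
For example (index and query are placeholders), counting conflicts instead of aborting on them looks like:

[source,js]
--------------------------------------------------
POST twitter/_delete_by_query?conflicts=proceed
{
  "query": {
    "match": {
      "message": "some message"
    }
  }
}
--------------------------------------------------
// CONSOLE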

Back to the API format, this will delete tweets from the `twitter` index:
@@ -140,14 +140,14 @@ POST twitter/_delete_by_query?scroll_size=5000
[float]
=== URL Parameters

In addition to the standard parameters like `pretty`, the Delete By Query API
also supports `refresh`, `wait_for_completion`, `wait_for_active_shards`, `timeout`
In addition to the standard parameters like `pretty`, the delete by query API
also supports `refresh`, `wait_for_completion`, `wait_for_active_shards`, `timeout`,
and `scroll`.

Sending the `refresh` will refresh all shards involved in the delete by query
once the request completes. This is different than the Delete API's `refresh`
once the request completes. This is different than the delete API's `refresh`
parameter which causes just the shard that received the delete request
to be refreshed. Also unlike the Delete API it does not support `wait_for`.
to be refreshed. Also unlike the delete API it does not support `wait_for`.

If the request contains `wait_for_completion=false` then Elasticsearch will
perform some preflight checks, launch the request, and then return a `task`
@@ -163,10 +163,10 @@ for details. `timeout` controls how long each write request waits for unavailabl
shards to become available. Both work exactly how they work in the
<<docs-bulk,Bulk API>>. As `_delete_by_query` uses scroll search, you can also specify
the `scroll` parameter to control how long it keeps the "search context" alive,
eg `?scroll=10m`, by default it's 5 minutes.
e.g. `?scroll=10m`. By default it's 5 minutes.
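
Combining a couple of these (values are arbitrary), a fire-and-forget delete by query with a longer search context could be issued as:

[source,js]
--------------------------------------------------
POST twitter/_delete_by_query?wait_for_completion=false&scroll=10m
{
  "query": {
    "match": {
      "user": "kimchy"
    }
  }
}
--------------------------------------------------
// CONSOLE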

`requests_per_second` can be set to any positive decimal number (`1.4`, `6`,
`1000`, etc) and throttles rate at which `_delete_by_query` issues batches of
`1000`, etc.) and throttles the rate at which delete by query issues batches of
delete operations by padding each batch with a wait time. The throttling can be
disabled by setting `requests_per_second` to `-1`.

@@ -182,7 +182,7 @@ target_time = 1000 / 500 per second = 2 seconds
wait_time = target_time - write_time = 2 seconds - .5 seconds = 1.5 seconds
--------------------------------------------------

Since the batch is issued as a single `_bulk` request large batch sizes will
Since the batch is issued as a single `_bulk` request, large batch sizes will
cause Elasticsearch to create many requests and then wait for a while before
starting the next set. This is "bursty" instead of "smooth". The default is `-1`.
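
As an illustration (values are arbitrary), a throttled request with a smaller batch size looks like:

[source,js]
--------------------------------------------------
POST twitter/_delete_by_query?requests_per_second=500&scroll_size=500
{
  "query": {
    "range": {
      "likes": {
        "lt": 10
      }
    }
  }
}
--------------------------------------------------
// CONSOLE

With these values each batch of 500 documents targets roughly one second (500 / 500 per second), per the formula above.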

@@ -259,13 +259,13 @@ The number of version conflicts that the delete by query hit.
`noops`::

This field is always equal to zero for delete by query. It only exists
so that delete by query, update by query and reindex APIs return responses
so that delete by query, update by query, and reindex APIs return responses
with the same structure.

`retries`::

The number of retries attempted by delete by query. `bulk` is the number
of bulk actions retried and `search` is the number of search actions retried.
of bulk actions retried, and `search` is the number of search actions retried.

`throttled_millis`::

@@ -286,7 +286,7 @@ executed again in order to conform to `requests_per_second`.

Array of failures if there were any unrecoverable errors during the process. If
this is non-empty then the request aborted because of those failures.
Delete-by-query is implemented using batches and any failure causes the entire
Delete by query is implemented using batches, and any failure causes the entire
process to abort but all failures in the current batch are collected into the
array. You can use the `conflicts` option to prevent reindex from aborting on
version conflicts.
@@ -296,7 +296,7 @@ version conflicts.
[[docs-delete-by-query-task-api]]
=== Works with the Task API

You can fetch the status of any running delete-by-query requests with the
You can fetch the status of any running delete by query requests with the
<<tasks,Task API>>:

[source,js]
Expand All @@ -306,7 +306,7 @@ GET _tasks?detailed=true&actions=*/delete/byquery
// CONSOLE
// TEST[skip:No tasks to retrieve]

The responses looks like:
The response looks like:

[source,js]
--------------------------------------------------
@@ -346,7 +346,7 @@ The responses looks like:
}
--------------------------------------------------
// TESTRESPONSE
<1> this object contains the actual status. It is just like the response json
<1> This object contains the actual status. It is just like the response JSON
with the important addition of the `total` field. `total` is the total number
of operations that the reindex expects to perform. You can estimate the
progress by adding the `updated`, `created`, and `deleted` fields. The request
@@ -373,7 +373,7 @@ you to delete that document.
[[docs-delete-by-query-cancel-task-api]]
=== Works with the Cancel Task API

Any Delete By Query can be canceled using the <<tasks,task cancel API>>:
Any delete by query can be canceled using the <<tasks,task cancel API>>:

[source,js]
--------------------------------------------------
@@ -403,26 +403,26 @@ POST _delete_by_query/r1A2WoRbTwKZ516z6NEs5A:36619/_rethrottle?requests_per_seco

The task ID can be found using the <<tasks,tasks API>>.

Just like when setting it on the `_delete_by_query` API `requests_per_second`
Just like when setting it on the delete by query API, `requests_per_second`
can be either `-1` to disable throttling or any decimal number
like `1.7` or `12` to throttle to that level. Rethrottling that speeds up the
query takes effect immediately but rethrotting that slows down the query will
take effect on after completing the current batch. This prevents scroll
take effect after completing the current batch. This prevents scroll
timeouts.

[float]
[[docs-delete-by-query-slice]]
=== Slicing

Delete-by-query supports <<sliced-scroll>> to parallelize the deleting process.
Delete by query supports <<sliced-scroll, sliced scroll>> to parallelize the deleting process.
This parallelization can improve efficiency and provide a convenient way to
break the request down into smaller parts.

[float]
[[docs-delete-by-query-manual-slice]]
==== Manually slicing
==== Manual slicing

Slice a delete-by-query manually by providing a slice id and total number of
Slice a delete by query manually by providing a slice id and total number of
slices to each request:
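
A sketch of two such requests, splitting the work into two slices (index and query are placeholders):

[source,js]
--------------------------------------------------
POST twitter/_delete_by_query
{
  "slice": {
    "id": 0,
    "max": 2
  },
  "query": {
    "match": {
      "user": "kimchy"
    }
  }
}
POST twitter/_delete_by_query
{
  "slice": {
    "id": 1,
    "max": 2
  },
  "query": {
    "match": {
      "user": "kimchy"
    }
  }
}
--------------------------------------------------
// CONSOLE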

[source,js]
Expand Down Expand Up @@ -498,7 +498,7 @@ Which results in a sensible `total` like this one:
==== Automatic slicing

You can also let delete-by-query automatically parallelize using
<<sliced-scroll>> to slice on `_id`. Use `slices` to specify the number of
<<sliced-scroll, sliced scroll>> to slice on `_id`. Use `slices` to specify the number of
slices to use:

[source,js]
@@ -575,8 +575,8 @@ be larger than others. Expect larger slices to have a more even distribution.
are distributed proportionally to each sub-request. Combine that with the point
above about distribution being uneven and you should conclude that the using
`size` with `slices` might not result in exactly `size` documents being
`_delete_by_query`ed.
* Each sub-requests gets a slightly different snapshot of the source index
deleted.
* Each sub-request gets a slightly different snapshot of the source index
though these are all taken at approximately the same time.

[float]
Expand All @@ -588,8 +588,8 @@ number for most indices. If you're slicing manually or otherwise tuning
automatic slicing, use these guidelines.

Query performance is most efficient when the number of `slices` is equal to the
number of shards in the index. If that number is large, (for example,
500) choose a lower number as too many `slices` will hurt performance. Setting
number of shards in the index. If that number is large (for example,
500), choose a lower number as too many `slices` will hurt performance. Setting
`slices` higher than the number of shards generally does not improve efficiency
and adds overhead.
