Closed
56 commits
1358f95
Author: Sean Gallagher
Apr 1, 2014
e13d759
Author: Sean Gallagher
Apr 1, 2014
31f7696
Add upgrade instructions
Apr 4, 2014
124d370
[TEST] cleanup secondary cluster properly in Tribe tests.
s1monw Apr 5, 2014
a5aafbb
[TEST] Prevent RelocationTests from going crazy when relocations take…
s1monw Apr 5, 2014
d26a956
releasable bytes output + use in transport / translog
kimchy Apr 4, 2014
7b9df39
[Test] Added better control over the number of documents indexed by B…
bleskes Apr 5, 2014
ade1d0e
Added global ordinals (unique incremental numbering for terms) to fie…
martijnvg Feb 17, 2014
000c33a
fix typo
gabriel-tessier Apr 1, 2014
866c520
Add doc value for binary field.
kzwang Apr 2, 2014
9df655a
Remove AtomicFieldData.isValuesOrdered.
jpountz Apr 4, 2014
ecab74f
add lucene language model similarities (Dirichlet & JelinekMercer)
kzwang Apr 7, 2014
d64d0d6
Remove clear on mock page/array
kimchy Apr 6, 2014
37c07ef
disable args tests that cause page/array leak
kimchy Apr 7, 2014
40a1ac8
Renamed XContentParser.Token named "t" to "token".
mallamanis Feb 10, 2014
dc15dee
Renamed ClusterBlocks variable named "block" to "blocks".
mallamanis Feb 10, 2014
043d785
Removing EOL client rubberband and adding official php client
Nov 20, 2013
d8364e8
Fix typo and add more clients
Dec 11, 2013
c6caeea
Update link to puppet module and remove link to other RPM repo as we …
Feb 20, 2014
c9b0b04
Aggregation cleanup
uboness Apr 3, 2014
94278d8
Update advanced-scripting.asciidoc
wittyameta Apr 3, 2014
705c7e2
Recycled bytes in http + rest layer refactoring phase 2
kimchy Apr 6, 2014
1ec4f8f
[TEST] Replaced RestTestSuiteRunner with parametrized test that uses …
javanna Mar 10, 2014
f1a8aad
[TEST] null out static resources in base test classes
s1monw Apr 7, 2014
9a2dba2
Fixed upgrade.asciidoc typo and incorrect usage.
Apr 7, 2014
f283a9f
[TEST] specified number_of_shards 5 to make sure the two docs end up …
javanna Apr 7, 2014
fb338ef
Make writePrimitive*() and readPrimitive*() methods public.
Apr 7, 2014
e3187d5
Update LongHash to work like BytesRefHash.
mattweber Apr 5, 2014
033e46f
Rename readPrimitive*Array()/writePrimitive*Array() methods
Apr 7, 2014
5138083
Author: Sean Gallagher
Apr 1, 2014
f2181d5
[TEST] Be more verbose if ClusterStatsTests fails
s1monw Apr 6, 2014
befa833
Make sure successful operations are correct if second phase is fast
s1monw Apr 7, 2014
49c74e0
Rename successulOps to successfulOps in TransportSearchTypeAction
s1monw Apr 7, 2014
a1d0eee
[TEST] return the correct transport instance in mock transport
kimchy Apr 7, 2014
48031b6
Fixes typo in "Scan" search type documention
AndrewO Apr 7, 2014
dcc6a6e
[BUILD] Remove site dependencies generation
mrsolo Apr 7, 2014
b4c506b
[TEST] use no replicas in MLT tests - doc mappers need to be present …
s1monw Apr 8, 2014
0cebbf1
Text Query has been replaced by Match Query
dadoonet Apr 8, 2014
fd8a6ac
[TEST] make BulkTest more robust if test infra is slow
s1monw Apr 8, 2014
c58b823
[TEST] Add more randomization to bulk tests
s1monw Apr 8, 2014
e8467f0
Failed shards could be re-assigned to the same nodes if multiple repl…
bleskes Apr 8, 2014
de13d70
[TEST] Wait for LANGUID events to be processed before pulling stats
s1monw Apr 8, 2014
7eb8b0d
[TEST] Log where locks are created from if they are still open on clo…
s1monw Apr 8, 2014
a98b3fa
Revert "[TEST] Log where locks are created from if they are still ope…
s1monw Apr 8, 2014
5b6fd6d
[TEST] Fix testRandomDirectoryIOExceptions to wait for green on reope…
s1monw Apr 8, 2014
adc9a25
Fix P/C assertions for rewrite reader
s1monw Apr 8, 2014
6d96683
[Test] recoverWhileRelocating: Increase timeout while waiting for sha…
bleskes Apr 8, 2014
cd0c0de
[TEST] RecoveryWhileunderLoadTests sometimes need higher timeouts
s1monw Apr 9, 2014
a9c8624
[TEST] MLT Rest test needs a mapping since we randomized number of no…
s1monw Apr 9, 2014
960d353
Remove plugin isolation feature for a future version
costin Mar 17, 2014
9aa1cb4
Fix format string for DiskThresholdDecider reroute explanation
dakrone Apr 8, 2014
a2fb480
[DOCS] Improved the upgrade doc, added upgrade table
Apr 9, 2014
beeecc2
Merge branch '5651' of github.com:seang-es/elasticsearch into 5651
Apr 9, 2014
af0278b
[Docs] Allocation setting explanation
nik9000 Apr 9, 2014
55fc606
Merge branch '5651' of github.com:seang-es/elasticsearch into 5651
Apr 9, 2014
b51817c
[DOCS] Fixed version number in README.textfile
Apr 9, 2014
6 changes: 5 additions & 1 deletion README.textile
@@ -86,7 +86,7 @@ We can also use the JSON query language Elasticsearch provides instead of a quer
curl -XGET 'http://localhost:9200/twitter/tweet/_search?pretty=true' -d '
{
"query" : {
"text" : { "user": "kimchy" }
"match" : { "user": "kimchy" }
}
}'
</pre>
@@ -206,6 +206,10 @@ The distribution will be created under @target/releases@.
See the "TESTING":TESTING.asciidoc file for more information about
running the Elasticsearch test suite.

h3. Upgrading to Elasticsearch 1.x?

To ensure a smooth upgrade from earlier versions of Elasticsearch (< 1.0.0), perform a full cluster restart. Please see the "Upgrading" section of the "setup reference":http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/setup.html.

h1. License

<pre>
20 changes: 8 additions & 12 deletions TESTING.asciidoc
@@ -186,22 +186,18 @@ mvn test -Dtests.class=org.elasticsearch.test.rest.ElasticsearchRestTests
`ElasticsearchRestTests` is the executable test class that runs all the
yaml suites available within the `rest-api-spec` folder.

The following are the options supported by the REST tests runner:
The REST tests support all the options provided by the randomized runner, plus the following:

* `tests.rest[true|false|host:port]`: determines whether the REST tests need
to be run and if so whether to rely on an external cluster (providing host
and port) or fire a test cluster (default). It's possible to provide a
comma separated list of addresses to send requests in a round-robin fashion.
* `tests.rest[true|false]`: determines whether the REST tests need to be run (default) or not.
* `tests.rest.suite`: comma separated paths of the test suites to be run
(by default loaded from /rest-api-spec/test). It is possible to run only a subset
of the tests providing a sub-folder or even a single yaml file (the default
/rest-api-spec/test prefix is optional when files are loaded from classpath)
e.g. -Dtests.rest.suite=index,get,create/10_with_id
* `tests.rest.section`: regex that allows to filter the test sections that
are going to be run. If provided, only the section names that match (case
insensitive) against it will be executed
* `tests.rest.spec`: REST spec path (default /rest-api-spec/api)
* `tests.iters`: runs multiple iterations
* `tests.seed`: seed to base the random behaviours on
* `tests.appendseed[true|false]`: enables adding the seed to each test
section's description (default false)

Note that the REST tests, like all the integration tests, can be run against an external
cluster by specifying the `tests.cluster` property, which if present needs to contain a
comma separated list of nodes to connect to (e.g. localhost:9300). A transport client will
be created based on that and used for all the before|after test operations, and to extract
the http addresses of the nodes so that REST requests can be sent to them.
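As a rough illustration of the options above, the following Python sketch assembles the `mvn test` arguments for a REST test run against an external cluster. The helper name `rest_test_command` is hypothetical and not part of the build; only the `tests.cluster`, `tests.rest.suite` and `tests.class` properties come from the text and diff here.

```python
def rest_test_command(cluster_nodes=None, suites=None):
    """Build the argument string passed to mvn for a REST test run.

    cluster_nodes: optional list of transport addresses (host:port) of an
                   external cluster; omitted means a test cluster is started.
    suites:        optional list of suite paths for tests.rest.suite.
    """
    args = ["test", "-Dtests.class=*.*RestTests"]
    if cluster_nodes:
        # tests.cluster takes a comma separated list of nodes to connect to
        args.append("-Dtests.cluster=" + ",".join(cluster_nodes))
    if suites:
        # tests.rest.suite takes comma separated suite paths
        args.append("-Dtests.rest.suite=" + ",".join(suites))
    return " ".join(args)


if __name__ == "__main__":
    print(rest_test_command(["127.0.0.1:9300"], ["index", "get", "create/10_with_id"]))
```

Note that the address is a transport address (port 9300 in the example from the text), not the HTTP port, since the HTTP addresses are discovered through the transport client.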
2 changes: 1 addition & 1 deletion dev-tools/build_release.py
@@ -388,7 +388,7 @@ def smoke_test_release(release, files, expected_hash, plugins):
if version['build_hash'].strip() != expected_hash:
raise RuntimeError('HEAD hash does not match expected [%s] but got [%s]' % (expected_hash, version['build_hash']))
print(' Running REST Spec tests against package [%s]' % release_file)
run_mvn('test -Dtests.rest=%s -Dtests.class=*.*RestTests' % ("127.0.0.1:9200"))
run_mvn('test -Dtests.cluster=%s -Dtests.class=*.*RestTests' % ("127.0.0.1:9300"))
print(' Verify if plugins are listed in _nodes')
conn.request('GET', '/_nodes?plugin=true&pretty=true')
res = conn.getresponse()
12 changes: 9 additions & 3 deletions docs/community/clients.asciidoc
@@ -39,15 +39,15 @@ See the {client}/ruby-api/current/index.html[official Elasticsearch Ruby client]
* http://github.com/karmi/tire[Tire]:
Ruby API & DSL, with ActiveRecord/ActiveModel integration.

* http://github.com/grantr/rubberband[rubberband]:
Ruby client.

* https://github.com/PoseBiz/stretcher[stretcher]:
Ruby client.

* https://github.com/wireframe/elastic_searchable/[elastic_searchable]:
Ruby client + Rails integration.

* https://github.com/ddnexus/flex[Flex]:
Ruby Client.


[[community-php]]
=== PHP
@@ -62,6 +62,8 @@ See the {client}/php-api/current/index.html[official Elasticsearch PHP client].
* http://github.com/polyfractal/Sherlock[Sherlock]:
PHP client, one-to-one mapping with query DSL, fluid interface.

* https://github.com/nervetattoo/elasticsearch[elasticsearch]
PHP 5.3 client

[[community-java]]
=== Java
@@ -184,3 +186,7 @@ See the {client}/javascript-api/current/index.html[official Elasticsearch JavaSc
* https://github.com/jasonfill/ColdFusion-ElasticSearch-Client[ColdFusion-Elasticsearch-Client]
Cold Fusion client for Elasticsearch

[[community-nodejs]]
=== NodeJS
* https://github.com/phillro/node-elasticsearch-client[Node-Elasticsearch-Client]
A node.js client for elasticsearch
5 changes: 1 addition & 4 deletions docs/community/misc.asciidoc
@@ -1,15 +1,12 @@
[[misc]]
== Misc

* https://github.com/electrical/puppet-elasticsearch[Puppet]:
* https://github.com/elasticsearch/puppet-elasticsearch[Puppet]:
Elasticsearch puppet module.

* http://github.com/elasticsearch/cookbook-elasticsearch[Chef]:
Chef cookbook for Elasticsearch

* https://github.com/tavisto/elasticsearch-rpms[elasticsearch-rpms]:
RPMs for elasticsearch.

* http://www.github.com/neogenix/daikon[daikon]:
Daikon Elasticsearch CLI

1 change: 0 additions & 1 deletion docs/reference/cluster/nodes-info.asciidoc
@@ -40,7 +40,6 @@ plugins per node:
* `site`: `true` if the plugin is a site plugin
* `jvm`: `true` if the plugin is a plugin running in the JVM
* `url`: URL if the plugin is a site plugin
* `isolation`: whether the plugin is loaded in isolation (`true`) or not (`false`)

The result will look similar to:

26 changes: 18 additions & 8 deletions docs/reference/cluster/update-settings.asciidoc
@@ -65,22 +65,32 @@ There is a specific list of settings that can be updated, those include:

[float]
===== Balanced Shards
All these values are relative to one another. The first three are used to
compose three separate weighting functions into one. The cluster is balanced
when no allowed action can bring the weights of each node closer together by
more than the fourth setting. Actions might not be allowed, for instance,
due to forced awareness or allocation filtering.

`cluster.routing.allocation.balance.shard`::
Defines the weight factor for shards allocated on a node
(float). Defaults to `0.45f`.
Defines the weight factor for shards allocated on a node
(float). Defaults to `0.45f`. Raising this raises the tendency to
equalize the number of shards across all nodes in the cluster.

`cluster.routing.allocation.balance.index`::
Defines a factor to the number of shards per index allocated
on a specific node (float). Defaults to `0.5f`.
Defines a factor to the number of shards per index allocated
on a specific node (float). Defaults to `0.5f`. Raising this raises the
tendency to equalize the number of shards per index across all nodes in
the cluster.

`cluster.routing.allocation.balance.primary`::
defines a weight factor for the number of primaries of a specific index
allocated on a node (float). `0.05f`.
Defines a weight factor for the number of primaries of a specific index
allocated on a node (float). Defaults to `0.05f`. Raising this raises the
tendency to equalize the number of primary shards across all nodes in the cluster.

`cluster.routing.allocation.balance.threshold`::
minimal optimization value of operations that should be performed (non
negative float). Defaults to `1.0f`.
Minimal optimization value of operations that should be performed (non
negative float). Defaults to `1.0f`. Raising this will cause the cluster
to be less aggressive about optimizing the shard balance.
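To make the four settings concrete, here is a small Python sketch that builds a request body for the cluster update settings API. The helper `balance_settings_body` and its short keyword names are hypothetical; the setting keys and defaults are the ones documented above, and the `transient`/`persistent` scopes are assumed from the cluster update settings API rather than shown in this diff.

```python
import json

# Full setting keys and documented defaults (from the text above).
BALANCE_DEFAULTS = {
    "shard": ("cluster.routing.allocation.balance.shard", 0.45),
    "index": ("cluster.routing.allocation.balance.index", 0.5),
    "primary": ("cluster.routing.allocation.balance.primary", 0.05),
    "threshold": ("cluster.routing.allocation.balance.threshold", 1.0),
}


def balance_settings_body(scope="transient", **overrides):
    """Return a settings body containing only the overridden balance factors."""
    if scope not in ("transient", "persistent"):
        raise ValueError("scope must be 'transient' or 'persistent'")
    settings = {}
    for short_name, value in overrides.items():
        if short_name not in BALANCE_DEFAULTS:
            raise ValueError("unknown balance setting: %s" % short_name)
        if short_name == "threshold" and value < 0:
            # the threshold is documented as a non negative float
            raise ValueError("threshold must be non negative")
        full_key, _default = BALANCE_DEFAULTS[short_name]
        settings[full_key] = float(value)
    return {scope: settings}


if __name__ == "__main__":
    # Make the cluster less aggressive about rebalancing.
    print(json.dumps(balance_settings_body(threshold=2.0), indent=2))
```

The resulting JSON could be sent to the cluster settings endpoint with any HTTP client; the sketch deliberately stops at building the body so it can be checked without a running cluster.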

[float]
===== Concurrent Rebalance
52 changes: 52 additions & 0 deletions docs/reference/index-modules/fielddata.asciidoc
@@ -124,6 +124,41 @@ field data format.
`doc_values`::
Computes and stores field data data-structures on disk at indexing time.

[float]
==== Global ordinals

coming[1.2.0]

Global ordinals are a data structure built on top of field data that maintains an
incremental numbering for all the terms in field data, in lexicographic order.
Each term has a unique number, and the number of term 'A' is lower than the number
of term 'B'. Global ordinals are only supported on string fields.

Field data on string fields also has ordinals, which are a unique numbering for all terms
in a particular segment and field. Global ordinals build on top of this
by providing a mapping between the segment ordinals and the global ordinals,
the latter being unique across the entire shard.

Global ordinals can improve the execution time of search features that already use
segment ordinals, such as the terms aggregator. These features often need to merge
per-segment ordinal results into a cross-segment terms result. With global ordinals
this mapping happens once, at field data load time, instead of during each query
execution. Search features then only need to resolve the actual term when building
the (shard) response; during execution the unique numbering that global ordinals
provide is sufficient, and the actual terms are not needed.

Global ordinals for a specified field are tied to all the segments of a shard (Lucene index),
whereas field data for a specific field is tied to a single segment.
For this reason global ordinals need to be rebuilt in their entirety once new segments
become visible. This one-time cost would be paid anyway without global ordinals, but
then it would be paid on each search execution instead!

The loading time of global ordinals depends on the number of terms in a field, but in general
it is low, since the source field data has already been loaded. The memory overhead of global
ordinals is small because they are very efficiently compressed. Eager loading of global ordinals
can move the loading time from the first search request to the refresh itself.

[float]
=== Fielddata loading

@@ -147,6 +182,23 @@ It is possible to force field data to be loaded and cached eagerly through the
}
--------------------------------------------------

Global ordinals can also be eagerly loaded:

[source,js]
--------------------------------------------------
{
category: {
type: "string",
fielddata: {
loading: "eager_global_ordinals"
}
}
}
--------------------------------------------------

With the above setting both field data and global ordinals for a specific field
are eagerly loaded.

[float]
==== Disabling field data loading

25 changes: 25 additions & 0 deletions docs/reference/index-modules/similarity.asciidoc
@@ -121,6 +121,31 @@ based model] . This similarity has the following options:

Type name: `IB`

[float]
[[lm_dirichlet]]
==== LM Dirichlet similarity.

http://lucene.apache.org/core/4_7_1/core/org/apache/lucene/search/similarities/LMDirichletSimilarity.html[LM
Dirichlet similarity] . This similarity has the following options:

[horizontal]
`mu`:: Defaults to `2000`.

Type name: `LMDirichlet`

[float]
[[lm_jelinek_mercer]]
==== LM Jelinek Mercer similarity.

http://lucene.apache.org/core/4_7_1/core/org/apache/lucene/search/similarities/LMJelinekMercerSimilarity.html[LM
Jelinek Mercer similarity] . This similarity has the following options:

[horizontal]
`lambda`:: The optimal value depends on both the collection and the query. The optimal value is around `0.1`
for title queries and `0.7` for long queries. Defaults to `0.1`.

Type name: `LMJelinekMercer`
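For readers unfamiliar with what `mu` and `lambda` control, the following Python sketch shows the textbook smoothing formulas behind the two language models. It is an illustration of the general technique only: the function names are hypothetical, and Lucene's similarity classes compute their scores differently in detail.

```python
def dirichlet_prob(tf, doc_len, collection_prob, mu=2000.0):
    """Dirichlet-smoothed probability of a term given a document.

    tf:              term frequency in the document
    doc_len:         number of terms in the document
    collection_prob: probability of the term in the whole collection
    mu:              smoothing parameter; larger values trust the collection more
    """
    return (tf + mu * collection_prob) / (doc_len + mu)


def jelinek_mercer_prob(tf, doc_len, collection_prob, lam=0.1):
    """Jelinek-Mercer smoothing: linear interpolation between the document
    model (weight 1 - lam) and the collection model (weight lam)."""
    return (1.0 - lam) * (tf / doc_len) + lam * collection_prob


if __name__ == "__main__":
    # A term seen twice in a 100-term document, with collection probability 0.001.
    print(dirichlet_prob(2, 100, 0.001))
    print(jelinek_mercer_prob(2, 100, 0.001))
    print(jelinek_mercer_prob(2, 100, 0.001, lam=0.7))
```

A larger `lambda` gives more weight to the collection model, which is one way to read the guidance above that long queries tend to benefit from a higher value than title queries.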

[float]
[[default-base]]
==== Default and Base Similarities
1 change: 1 addition & 0 deletions docs/reference/mapping/types/core-types.asciidoc
@@ -446,6 +446,7 @@ Defaults to the property/field name.
|`store` |Set to `true` to store actual field in the index, `false` to not
store it. Defaults to `false` (note, the JSON document itself is stored,
and it can be retrieved from it).
|`doc_values` |Set to `true` to store field values in a column-stride fashion.
|=======================================================================

[float]
2 changes: 1 addition & 1 deletion docs/reference/modules/advanced-scripting.asciidoc
@@ -177,7 +177,7 @@ return score;
=== Term vectors:

The `_index` variable can only be used to gather statistics for single terms. If you want to use information on all terms in a field, you must store the term vectors (set `term_vector` in the mapping as described in the <<mapping-core-types,mapping documentation>>). To access them, call
`_index.getTermVectors()` to get a
`_index.termVectors()` to get a
https://lucene.apache.org/core/4_0_0/core/org/apache/lucene/index/Fields.html[Fields]
instance. This object can then be used as described in https://lucene.apache.org/core/4_0_0/core/org/apache/lucene/index/Fields.html[lucene doc] to iterate over fields and then for each field iterate over each term in the field.
The method will return null if the term vectors were not stored.
14 changes: 0 additions & 14 deletions docs/reference/modules/plugins.asciidoc
@@ -142,20 +142,6 @@ bin/plugin --install mobz/elasticsearch-head --timeout 1m
bin/plugin --install mobz/elasticsearch-head --timeout 0
-----------------------------------

added[1.1.0]
[float]
==== Plugins isolation

Since Elasticsearch 1.1, by default, each plugin is loaded in _isolation_ (in its dedicated `ClassLoader`) to avoid class clashes between the various plugins and their associated libraries. The default can be changed through the `plugins.isolation` property in `elasticsearch.yml`, by setting it to `false`:

[source,js]
--------------------------------------------------
plugins.isolation: false
--------------------------------------------------

Do note that each plugin can specify its _mandatory_ isolation through the `isolation` property in its `es-plugin.properties` configuration. In this (rare) case, the plugin setting is used, overwriting whatever default used by Elasticsearch.


[float]
[[known-plugins]]
=== Known Plugins
Original file line number Diff line number Diff line change
@@ -28,7 +28,7 @@ Example:
<1> The `global` aggregation has an empty body
<2> The sub-aggregations that are registered for this `global` aggregation

The above aggregation demonstrates how one would compute aggregations (`avg_price` in this example) on all the documents in the search context, regardless of the query (in our example, it will compute the the average price over all products in our catalog, not just on the "shirts").
The above aggregation demonstrates how one would compute aggregations (`avg_price` in this example) on all the documents in the search context, regardless of the query (in our example, it will compute the average price over all products in our catalog, not just on the "shirts").

The response for the above aggregation:

@@ -48,4 +48,4 @@ The response for the above aggreation:
}
--------------------------------------------------

<1> The number of documents that were aggregated (in our case, all documents within the search context)
<1> The number of documents that were aggregated (in our case, all documents within the search context)
Original file line number Diff line number Diff line change
@@ -310,12 +310,15 @@ http://docs.oracle.com/javase/7/docs/api/java/util/regex/Pattern.html#UNIX_LINES

==== Execution hint

There are two mechanisms by which terms aggregations can be executed: either by using field values directly in order to aggregate
data per-bucket (`map`), or by using ordinals of the field values instead of the values themselves (`ordinals`). Although the
latter execution mode can be expected to be slightly faster, it is only available for use when the underlying data source exposes
those terms ordinals. Moreover, it may actually be slower if most field values are unique. Elasticsearch tries to have sensible
defaults when it comes to the execution mode that should be used, but in case you know that one execution mode may perform better
than the other one, you have the ability to "hint" it to Elasticsearch:
coming[1.2.0] The `global_ordinals` execution mode

There are three mechanisms by which terms aggregations can be executed: by using field values directly in order to aggregate
data per-bucket (`map`), by using ordinals of the field values instead of the values themselves (`ordinals`), or by using global
ordinals of the field (`global_ordinals`). The `global_ordinals` mode is faster, especially for fields with many unique
values. However, both the `ordinals` and `global_ordinals` execution modes can be slower if only a few documents match, for
example when a terms aggregator is nested in another aggregator. Elasticsearch tries to have sensible
defaults when it comes to the execution mode that should be used, but if you know that one execution mode may
perform better than the others, you have the ability to "hint" it to Elasticsearch:

[source,js]
--------------------------------------------------
@@ -331,6 +334,6 @@ than the other one, you have the ability to "hint" it to Elasticsearch:
}
--------------------------------------------------

<1> the possible values are `map` and `ordinals`
<1> the possible values are `map`, `ordinals` and `global_ordinals`

Please note that Elasticsearch will ignore this execution hint if it is not applicable.
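The hint described above can be sketched as a small Python helper that builds a terms aggregation body. The helper `terms_aggregation` and the example field name `tags` are hypothetical; the three allowed values come from the callout above, and the `execution_hint` parameter name is assumed from the terms aggregation documentation, since the example body is elided in this diff.

```python
import json

# Allowed values, per the callout in the documentation above.
EXECUTION_HINTS = ("map", "ordinals", "global_ordinals")


def terms_aggregation(name, field, execution_hint=None):
    """Return a search body with a single terms aggregation on `field`."""
    terms = {"field": field}
    if execution_hint is not None:
        if execution_hint not in EXECUTION_HINTS:
            raise ValueError("unsupported execution hint: %s" % execution_hint)
        terms["execution_hint"] = execution_hint
    return {"aggs": {name: {"terms": terms}}}


if __name__ == "__main__":
    print(json.dumps(terms_aggregation("tags", "tags", "global_ordinals"), indent=2))
```

Validating the value on the client side is a design choice of this sketch only; as the text notes, Elasticsearch itself ignores a hint that is not applicable.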
2 changes: 1 addition & 1 deletion docs/reference/search/request/search-type.asciidoc
@@ -109,7 +109,7 @@ curl -XGET 'localhost:9200/_search?search_type=scan&scroll=10m&size=50' -d '
'
--------------------------------------------------

The `scroll` parameter control the keep alive time of the scrolling
The `scroll` parameter controls the keep alive time of the scrolling
request and initiates the scrolling process. The timeout applies per
round trip (i.e. between the previous scan scroll request, to the next).

2 changes: 2 additions & 0 deletions docs/reference/setup.asciidoc
@@ -61,3 +61,5 @@ include::setup/as-a-service-win.asciidoc[]
include::setup/dir-layout.asciidoc[]

include::setup/repositories.asciidoc[]

include::setup/upgrade.asciidoc[]