Upgrade to Elasticsearch 2 #325
Elasticsearch 2 brings a lot of new features we want, such as faster Geo tools, an improved FST, and better tools for monitoring performance. We have a branch for experimental support, but there's actually quite a bit to the process. There are also some backwards incompatible changes in Elasticsearch 2 that we'll have to work around.
The process for migration might look something like this:
1.) Make whatever backwards-compatible changes are possible to support ES 2.0 before doing anything else. This ensures the number of changes in play during the actual migration is minimal. 324f1fd and 0b35e6a from the experimental branch mostly cover this part already.
2.) Modify the API to support querying against multiple, configurable indices. pelias/api#334 describes part of this work. We will need this for...
3.) Modify the importers and schema to store admin regions and addresses/venues in two separate indices, with different n-gram settings (1-gram for admin regions, 2-gram for addresses/venues). Elasticsearch doesn't support our current setup, where different types in the same index have different analysis settings. We want to make this change before, not alongside, the upgrade, so that we can see exactly what effects upgrading has on performance and result quality. This change by itself should lead to equivalent results.
4.) Actually upgrade to ES 2.0
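To make steps 2 and 3 a bit more concrete, here is a rough sketch of what the two index definitions and a multi-index search path could look like. The index names (`pelias_admin`, `pelias_address`), the analyzer and filter names, and `max_gram=18` are placeholders invented for illustration, and it assumes "1-gram" and "2-gram" refer to the minimum edge n-gram length. The one thing it relies on from Elasticsearch itself is that a search can target several indices by joining their names with commas in the URL.

```python
import json

# Hypothetical index names; the issue does not name the new indices.
ADMIN_INDEX = "pelias_admin"
ADDRESS_INDEX = "pelias_address"


def index_settings(min_gram, max_gram=18):
    """Build an index settings body whose analyzer emits edge n-grams
    of at least `min_gram` characters (max_gram is an arbitrary placeholder)."""
    return {
        "settings": {
            "analysis": {
                "filter": {
                    "prefix_grams": {
                        "type": "edge_ngram",
                        "min_gram": min_gram,
                        "max_gram": max_gram,
                    }
                },
                "analyzer": {
                    "autocomplete": {
                        "type": "custom",
                        "tokenizer": "standard",
                        "filter": ["lowercase", "prefix_grams"],
                    }
                },
            }
        }
    }


# One settings body per index, so each index gets its own analysis settings.
INDICES = {
    ADMIN_INDEX: index_settings(min_gram=1),
    ADDRESS_INDEX: index_settings(min_gram=2),
}


def search_path(indices):
    """Build a search path that targets every configured index at once."""
    return "/" + ",".join(indices) + "/_search"


if __name__ == "__main__":
    for name, body in INDICES.items():
        print("PUT /" + name)
        print(json.dumps(body, indent=2))
    print("POST " + search_path(INDICES))
```

Keeping the list of indices in configuration means the API only has to change how it builds the search path, not each individual query.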
Update: we are moving along with the upgrade. Here's a list of tasks we still have to tackle:
The dev and prod_build builds from Thursday just finished. They both failed after successfully ingesting all the data. I didn't look into the dev failure, but the prod_build one hit an error when running the acceptance tests, which I'm sure we can fix without too much trouble.
I also realized that the Elasticsearch APIs we use to optimize the index before rotating have changed, so we'll have to take a look at that as well.
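The comment above doesn't say which API change is meant. One change in the 2.x line that could affect this step is that the `_optimize` endpoint was renamed to `_forcemerge` in Elasticsearch 2.1. Assuming that is at least part of it, a build script might pick the endpoint from the cluster version roughly like this. The function name, the version tuple, and the index name in the usage note are made up for illustration.

```python
def merge_endpoint(index, es_version, max_num_segments=1):
    """Return the path used to merge an index down to `max_num_segments`.

    `es_version` is a (major, minor) tuple. Assumes `_forcemerge` is
    preferred from 2.1 onwards and `_optimize` before that.
    """
    major, minor = es_version[0], es_version[1]
    action = "_forcemerge" if (major, minor) >= (2, 1) else "_optimize"
    return "/{}/{}?max_num_segments={}".format(index, action, max_num_segments)
```

For example, `merge_endpoint("pelias", (1, 7))` gives a path ending in `_optimize`, while `merge_endpoint("pelias", (2, 3))` gives one ending in `_forcemerge`.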
the major behavioral changes:
improved handling of numeric values
prior to this release, numerals were treated the same as letters when it came to creating prefix-grams, so a token like
improved local focus
the TF/IDF scoring has been disabled for partial token matching (e.g. only the last word typed when using /v1/autocomplete).
as a result, we get better local biasing without having to change the scoring weights
search using single tokens
prior to this release we ignored single character tokens (except the very first keypress) for performance reasons; after some refactoring and performance testing we are pleased to re-enable this functionality.
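For anyone unfamiliar with prefix-grams, here is a minimal illustration of the earlier behavior described above, where numerals go through the same expansion as letters. It is a plain Python stand-in, not the actual Elasticsearch analyzer, and the function name and `min_gram` parameter are hypothetical.

```python
def prefix_grams(token, min_gram=1):
    """Return every prefix of `token` that is at least `min_gram` characters long.

    Digits and letters are handled identically, as in the earlier behavior.
    """
    return [token[:n] for n in range(min_gram, len(token) + 1)]


if __name__ == "__main__":
    print(prefix_grams("100"))      # a numeric token, expanded like any word
    print(prefix_grams("main", 2))  # a word with a 2-character minimum
```

With this expansion a numeric token such as "100" is indexed under "1", "10", and "100", the same way a word would be.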
from the heff: