Elasticsearch 7 Support #831

Open
orangejulius opened this issue Oct 28, 2019 · 3 comments

@orangejulius orangejulius commented Oct 28, 2019

This issue will track support for Elasticsearch 7 in Pelias.

Most Elasticsearch upgrades require two sets of changes:

  • Base compatibility changes, often dropping use of functionality no longer supported by Elasticsearch. These changes are generally required before the new version of Elasticsearch works at all.
  • Tweaks and changes to ensure that queries return the correct results and with adequate performance. These are usually a bit more subjective and can come after initial support has been completed.

Pelias Tasks

Here's the list of breaking changes we'll need to adapt to (this list will be updated over time):

Reference links

orangejulius added a commit to pelias/docker that referenced this issue Oct 28, 2019
This is the first step in supporting Elasticsearch 7.

At this time, Pelias does not work out of the box on ES7, but with a
Docker image ready to go, we can begin testing changes for
compatibility.

This Dockerfile and config are identical to the ES6 Docker image, except
for changing the version and making one update to the
`elasticsearch.yml`:

In ES7, the bulk thread pool is removed, and both bulk and non-bulk
operations go through a single
[write](https://www.elastic.co/guide/en/elasticsearch/reference/current/modules-threadpool.html#modules-threadpool)
thread pool.

For Pelias we have found increasing the queue size of this thread pool
is useful to ensure imports can succeed without errors, so the
configuration file has been updated accordingly.
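
As a rough illustration of the thread pool change described above, here is a minimal sketch that renames `thread_pool.bulk.*` settings to `thread_pool.write.*`. The helper name `migrate_thread_pool_settings`, the flat-dictionary representation of `elasticsearch.yml`, and the queue size of 1000 are all assumptions for illustration; the actual values in the Pelias Docker image may differ.

```python
# Sketch only: models elasticsearch.yml as a flat dict of setting names to values.
# The helper name and the value 1000 are hypothetical, not taken from pelias/docker.

def migrate_thread_pool_settings(settings: dict) -> dict:
    """Return a copy of the settings with `thread_pool.bulk.*` keys
    renamed to `thread_pool.write.*`, since ES7 has no bulk thread pool."""
    old_prefix = "thread_pool.bulk."
    new_prefix = "thread_pool.write."
    migrated = {}
    for key, value in settings.items():
        if key.startswith(old_prefix):
            # Keep the setting suffix (e.g. queue_size), swap only the pool name.
            key = new_prefix + key[len(old_prefix):]
        migrated[key] = value
    return migrated


es6_settings = {"thread_pool.bulk.queue_size": 1000}  # assumed ES6-era config
es7_settings = migrate_thread_pool_settings(es6_settings)

# Print in elasticsearch.yml style, one "name: value" pair per line.
for key, value in es7_settings.items():
    print(f"{key}: {value}")
```

Settings that do not belong to the bulk pool pass through unchanged, so the same helper could be applied to a whole config file.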

Connects pelias/pelias#831
orangejulius added a commit to pelias/model that referenced this issue Nov 7, 2019
This is necessary for Elasticsearch 7

Connects pelias/pelias#831

@orangejulius orangejulius commented Nov 7, 2019

With the list of changes above as of this writing, an ES7 build and an import of a few million records for the Portland Metro area works well, and querying with the latest API causes no errors.

I'm sure there's more work to do. In particular, I think at least one geo-query-related change will be required. But it looks like the core part of the ES7 upgrade is now fairly well understood! 🎉

orangejulius added a commit to pelias/schema that referenced this issue Nov 7, 2019
The first error seen when trying to use our current schema with
Elasticsearch 7 is:

```
[illegal_argument_exception] Token filter [word_delimiter] cannot be
used to parse synonyms
```

The [word delimiter](https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-word-delimiter-tokenfilter.html)
token filter is only used in one place: the `peliasAdmin` analyzer.

Looking at the documentation for `word_delimiter`, it does _a lot_:
splitting words, handling punctuation, and even some basic stemming.

It really feels like an extremely broad tool and at this point feels
like something that Elasticsearch would deprecate in the future.

Furthermore, looking at our integration tests, it seems one of the key
reasons we used it was to tokenize on hyphens, which we have done using
the `peliasNameTokenizer` since
#375.

Considering how complicated this token filter is, and how it's now being
used with relatively little effect, it seems like something we can
remove.
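
To make the hyphen point concrete, here is a small sketch of what splitting on hyphens at the tokenizer level looks like. It only simulates the idea in Python: the separator pattern below (whitespace, commas, slashes, hyphens) is an assumption, and the real `peliasNameTokenizer` definition in pelias/schema may differ.

```python
import re

# Assumed separator pattern: runs of whitespace, commas, slashes,
# backslashes, or hyphens. This is illustrative, not copied from pelias/schema.
NAME_TOKEN_SEPARATORS = re.compile(r"[\s,/\\-]+")


def tokenize_name(text: str) -> list:
    """Split text on the assumed separators and drop empty tokens."""
    return [token for token in NAME_TOKEN_SEPARATORS.split(text) if token]


tokens = tokenize_name("Wilkes-Barre, Pennsylvania")
print(tokens)  # ['Wilkes', 'Barre', 'Pennsylvania']
```

If the tokenizer already separates hyphenated names like this, a later `word_delimiter` filter has little left to split, which is consistent with the reasoning above.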

Connects pelias/pelias#831
@missinglink missinglink pinned this issue Nov 22, 2019

@missinglink missinglink commented Nov 26, 2019

After merging pelias/schema#403, it's now possible to create indices on ES 6.8.5 that will be compatible with 7.4.2.

@missinglink missinglink commented Nov 27, 2019

For the adventurous among you, we have a prerelease pelias/schema branch here.
You'll find the corresponding docker images here.

At a minimum you should ensure that you've made the following configuration changes for ES7:

  • Update pelias.json to set the correct esclient.apiVersion (7.4 at time of writing)
  • Set the schema.typeName property to _doc in pelias.json (note the underscore!)
  • Update docker-compose.yml to set the correct services.elasticsearch.image (pelias/elasticsearch:7.4.2 at time of writing)
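
The two `pelias.json` changes above could be applied along these lines. This is a hedged sketch: the helper name `apply_es7_settings` and the starting config are hypothetical, and only the `esclient.apiVersion` and `schema.typeName` keys come from the list above. The `docker-compose.yml` image change is separate and not covered here.

```python
import json


def apply_es7_settings(config: dict, api_version: str = "7.4") -> dict:
    """Return a copy of a pelias.json-style config with the ES7 settings applied."""
    updated = json.loads(json.dumps(config))  # deep copy via a JSON round trip
    updated.setdefault("esclient", {})["apiVersion"] = api_version
    # Note the leading underscore in the type name.
    updated.setdefault("schema", {})["typeName"] = "_doc"
    return updated


# Hypothetical existing config; a real pelias.json will contain more keys.
existing = {"esclient": {"apiVersion": "6.8"}, "schema": {"indexName": "pelias"}}
updated = apply_es7_settings(existing)
print(json.dumps(updated, indent=2))
```

Other keys in the config are left as they are, so the result can be written back to `pelias.json` once reviewed.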