Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Elasticsearch 8 compatiblity #1722

Merged
merged 9 commits into from Feb 23, 2022
Merged

feat: Elasticsearch 8 compatiblity #1722

merged 9 commits into from Feb 23, 2022

Conversation

mgorsk1
Copy link
Contributor

@mgorsk1 mgorsk1 commented Feb 21, 2022

Summary of Changes

Document types were already deprecated in ES 7 but are completely removed in ES 8. This PR proposes minor refactor which makes our search proxy and databuilder elasticsearch publisher job & task compatible with Elasticsearch 8 and also backwards compatible with Elasticsearch 7.

The change proposes:

  • stop relying on _type meta field (not available in ES 8, deprecated in ES 7)
  • add mandatory resource_type keyword field in SearchableResource
  • populate mandatory resource_type keyword field in databuilder jobs/tasks relying on elasticsearch_type property
  • rely on resource_type in search proxy to properly select which resource is being returned from elasticsearch
  • discontinuation of Elasticsearch 6.x support as it's reached it's EOL https://www.elastic.co/support/eol

The proposal builds on best-practice suggestion from Elasticsearch docs: https://www.elastic.co/guide/en/elasticsearch/reference/7.17/removal-of-types.html

Tests

  • adjusted tests

Documentation

  • updated docs to highlight that both ES 7 and 8 are compatible with Amundsen (removed ES 6 from list)

CheckList

Make sure you have checked all steps below to ensure a timely review.

  • PR title addresses the issue accurately and concisely. Example: "Updates the version of Flask to v1.0.2"
  • PR includes a summary of changes.
  • PR adds unit tests, updates existing unit tests, OR documents why no test additions or modifications are needed.
  • In case of new functionality, my PR adds documentation that describes how to use it.
    • All the public functions and the classes in the PR contain docstrings that explain what it does

@boring-cyborg boring-cyborg bot added area:databuilder From databuilder folder area:search From the search folder category:proxy labels Feb 21, 2022
@mgorsk1 mgorsk1 changed the title Draft: Elasticsearch 8 compatiblity Draft: feat: Elasticsearch 8 compatiblity Feb 21, 2022
@boring-cyborg boring-cyborg bot added area:all Related to all the project area:docs labels Feb 21, 2022
@mgorsk1
Copy link
Contributor Author

mgorsk1 commented Feb 21, 2022

cc @allisonsuarez & @dkunitsk

@mgorsk1 mgorsk1 marked this pull request as ready for review February 21, 2022 16:50
@boring-cyborg boring-cyborg bot added area:common From common folder category:models labels Feb 21, 2022
@mgorsk1 mgorsk1 changed the title Draft: feat: Elasticsearch 8 compatiblity feat: Elasticsearch 8 compatiblity Feb 21, 2022
Signed-off-by: Mariusz Górski <gorskimariusz13@gmail.com>
mgorsk1 and others added 5 commits February 22, 2022 15:18
Signed-off-by: Mariusz Górski <gorskimariusz13@gmail.com>
Signed-off-by: Mariusz Górski <gorskimariusz13@gmail.com>
Signed-off-by: Mariusz Górski <gorskimariusz13@gmail.com>
Signed-off-by: Mariusz Górski <gorskimariusz13@gmail.com>
Signed-off-by: Mariusz Górski <gorskimariusz13@gmail.com>
Signed-off-by: Mariusz Górski <gorskimariusz13@gmail.com>
Signed-off-by: Mariusz Górski <gorskimariusz13@gmail.com>
Signed-off-by: Mariusz Górski <gorskimariusz13@gmail.com>
@mgorsk1 mgorsk1 merged commit cdcf738 into main Feb 23, 2022
@mgorsk1 mgorsk1 deleted the feat/es8_compatibility branch February 23, 2022 08:06
@feng-tao
Copy link
Member

@mgorsk1 given all the licenses issues of ES, are we continue using ES or should we switch to Opensearch (AWS version of ES)?

also has this pr been tested with the existing ES7 ? cc @allisonsuarez @dkunitsk

@mgorsk1
Copy link
Contributor Author

mgorsk1 commented Feb 23, 2022

I've tested it with both ES 7 (locally) and ES 8 (on our acc environment).

Not sure what licensing issues you refer to? The default (always free) license of Elasticsearch makes it possible to use it for use-cases like Amundsen.

@dkunitsk
Copy link

@mgorsk1
The license issues: Elastic has moved to limit the compatibility of their tools (elasticsearch and elasticsearch-dsl libs, for example) to work with only official Elastic-made backends. The high-level client version we're using is the highest available before that breaking change was introduced. Reference

@feng-tao
At the moment, there is still no opensearch high-level python client, although they promise it's in the works. When that comes out, we'll have a dilemma since it's not clear that opensearch will be fully ES-compatible so we'll be forced to choose sides or to support both.

@feng-tao
Copy link
Member

@mgorsk1 like @dkunitsk mentioned I assume you don't spin up your own ES cluster with docker container but instead relying on cloud vendor solution(e.g AWS ES service etc)

@mgorsk1
Copy link
Contributor Author

mgorsk1 commented Feb 23, 2022

Ah ok so Python client shared the faith of beats plugins, interesting.

Indeed that's a dilemma cause it's a discussion whether we stay in elastic ecosystem or opensearch one and it's not clear to me that switching is something we really need, with time it might be that opensearch and elastic won't be greatly compatible just as @dkunitsk mentioned, so the question is which realm will give us more feature wise.

In ING we self host elasticsearch on k8s using official kuberentes operator.

Latest news on AWS <> Elastic battle are that both solutions will be available in cloud as SaaS after trademark feud was resolved in favor of Elastic claims: https://www.zdnet.com/article/aws-scrubs-elasticsearch-from-site-to-settle-trademark-dispute-with-elastic/.

ozandogrultan pushed a commit to deliveryhero/amundsen that referenced this pull request Apr 28, 2022
Signed-off-by: Mariusz Górski <gorskimariusz13@gmail.com>
Signed-off-by: Ozan Dogrultan <ozan.dogrultan@deliveryhero.com>
zacr pushed a commit to SaltIO/amundsen that referenced this pull request May 3, 2022
Signed-off-by: Mariusz Górski <gorskimariusz13@gmail.com>
Signed-off-by: Zachary Ruiz <zacharyruiz@Zacharys-MacBook-Pro-2.local>
zacr pushed a commit to SaltIO/amundsen that referenced this pull request May 13, 2022
Signed-off-by: Mariusz Górski <gorskimariusz13@gmail.com>
hansadriaans pushed a commit to DataChefHQ/amundsen that referenced this pull request Jun 30, 2022
Signed-off-by: Mariusz Górski <gorskimariusz13@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:all Related to all the project area:common From common folder area:databuilder From databuilder folder area:search From the search folder
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants