Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New Elasticsearch query runner #4293

Open
arikfr opened this issue Oct 27, 2019 · 3 comments · May be fixed by #4391
Open

New Elasticsearch query runner #4293

arikfr opened this issue Oct 27, 2019 · 3 comments · May be fixed by #4391
Assignees
Milestone

Comments

@arikfr
Copy link
Member

@arikfr arikfr commented Oct 27, 2019

Our current Elasticsearch support is in a very bad state, it has issues with aggregated queries, nested objects, and new versions of Elasticsearch. Also the codebase is in a very poor state not allowing for easy fixes.

We are going to re-implement the connector, addressing these issues and making room for future improvements.

Implementation notes:

  • Unless there will be request for it we will only support Elasticsearch's API and not the Kibana flavor.
  • This will be a new query runner. The current ones will be marked as deprecated, but existing queries using them will keep using them.
  • We will have support for SQL queries on top of Elasticsearch. There seem to be two flavors of it: community version and the X-Pack version. We should support both. Worth considering having the different SQL connectors as separate types from the ones that use the API directly (but reuse code, of course).

Previous issues:

Previous attempts at addressing these issues:

  • Support for elasticsearch-sql plugin: #2582
  • Another SQL support: #3393
  • Schema fetching errors: #3692
  • Aggregated results support: #3845
  • Nested fields support: #3456
  • Nested aggregations support: #3809
@arikfr

This comment has been minimized.

Copy link
Member Author

@arikfr arikfr commented Oct 27, 2019

Worth noting that the new implementation will at least have tests for the query parsing logic to make future changes easier.

@NicolasLM

This comment has been minimized.

Copy link
Contributor

@NicolasLM NicolasLM commented Nov 6, 2019

  1. Since the ES API is quite easily accessible with HTTP requests, I guess we don't want to add the dependency to a more specific Python ES client, keeping it the same way as the current ES query runner. Am I right?
  2. Do we want to support nested fields and nested aggregations the same way it's done for the MongoDB query runner? Nested objects get flattened and the name of the column uses a delimiter like . to indicate that it originally comes from a nested object. Note that this is the approach taken out of the box by the X-Pack SQL wrapper.
  3. Same questions regarding documents with different fields in the same collection. It seems that the MongoDB runner looks at the first and last documents and takes fields from there while X-Pack SQL shows all fields possible defined in the collection for each document (with the value null if the document does not have a specific field).
@arikfr

This comment has been minimized.

Copy link
Member Author

@arikfr arikfr commented Nov 6, 2019

  1. Yes, I had the same thoughts. From what I've seen for our needs the Elasticsearch clients don't offer much.

  2. Considering this is what X-Pack SQL does and it's consistent with what we do with other connectors, it makes sense to keep it. I would also consider reusing the code between Mongo and ES.

  3. What you described happens when we query MongoDB for schema. But in the query result we pick up columns from all the documents (see parse_results). We should follow similar pattern.

NicolasLM added a commit that referenced this issue Nov 23, 2019
The new elasticsearch query runner is split into three:

- A runner supporting the newest versions of ES,
  aggregation, nested aggregations and nested fields.
- A runner for the SQL OpenDistro flavor
- A runner for the SQL X-Pack flavor

Fixes #4293
@NicolasLM NicolasLM linked a pull request that will close this issue Nov 23, 2019
1 of 1 task complete
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

2 participants
You can’t perform that action at this time.