Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ZEPPELIN-1618] [WIP] Support ElasticSearch 5.0 #1598

Closed
wants to merge 18 commits into from

Conversation

1ambda
Copy link
Member

@1ambda 1ambda commented Nov 4, 2016

What is this PR for?

ElasticSearch was released in Oct 26 and It requires 5.x client driver but zeppelin currently use 2.x only.

  • we need 2 different profiles for 2.x and 5.x (there are no 3.x, 4.x)
  • 5.x client require different codebase. In this case we should keep 2 version of code at the same time like the spark code.
  • 5.x client requires JAVA 8 at least

What type of PR is it?

Improvement

Todos

  • - create profiles for 2.x and 5.x
  • - setup JAVA 8 validation for 5.x
  • - update default elasticsearch.version to 2.4.x (the most recent for 2.x)
  • - update license
  • - refactor
  • - create the 5.x interpreter
  • - loading dynamically
  • - hide unused classes using maven profile
  • - write tests for 5.x
  • - update docs (build, usage, upgrade)

What is the Jira issue?

https://issues.apache.org/jira/browse/ZEPPELIN-1618

How should this be tested?

Some tests will be included.

  • mvn clean test -pl 'elasticsearch' -Pelasticsearch-2
  • mvn clean test -pl 'elasticsearch' -Pelasticsearch-5

Screenshots (if appropriate)

screen shot 2016-11-04 at 9 46 23 pm

elasticsearch

Questions:

  • Does the licenses files need update? - YES
  • Is there breaking changes for older versions? - NO
  • Does this needs documentation? - YES

@1ambda 1ambda force-pushed the feat/support-es-5.0 branch 2 times, most recently from 16ae06b to 73375b9 Compare November 7, 2016 06:19
@1ambda
Copy link
Member Author

1ambda commented Nov 7, 2016

Ref - https://www.elastic.co/guide/en/elasticsearch/reference/current/breaking_50_java_api_changes.html

2 APIs make little difficult to finish this PR. They are incompatible at all :(
We have to use reflection to support both versions.

But 2.x is broadly used at this point, I think we have to support 2.x in zeppelin 0.7 at least

public boolean isFound() { ... } 
// should be
RestStatus.NOT_FOUND != response.status())

public WriteRequestBuilder setRefresh(true) { ... }
// should be
setRefreshPolicy(WriteRequest.RefreshPolicy.IMMEDIATE)
Caused by: java.lang.ClassNotFoundException: org.apache.logging.log4j.Logger
    at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    ... 25 more

@1ambda
Copy link
Member Author

1ambda commented Nov 7, 2016

2+ Solutions.

1. Use just reflection

  • Pros: We can have only 1 interpreter ElasticsearchInterpreter
  • Cons: but too much reflection code :(

2. Create Elasticsearch5Interpreter for version 5.x

  • Pros: We don't need reflection by selecting the proper interpreter based on the maven profile
  • Cons: We have 2 interpreters :(
    Also, Code might be duplicated but we can extract them as an interface. For example, ElasticsearchInterpeterBase

Final Thought

  • Refactor the whole ElasticsearchInterpreter to extract abstract class which can be shared by 2.x, 5.x
  • Write 2.x, 5.x interpreters separately and load them at runtime (Class.forName)
  • Hide unused classes usign maven profiles

That would work

@1ambda
Copy link
Member Author

1ambda commented Nov 9, 2016

Blocked until 5.1 release due to too many undocumented breaking changes

@bzz
Copy link
Member

bzz commented Nov 9, 2016

Great work @1ambda ! Do you know if there are any ETA for Elasticsearch 5.1?

\cc @bbonnin who contributed ES2.x interpreter support for a review

@bbonnin
Copy link
Contributor

bbonnin commented Nov 9, 2016

I have just one comment: it's about some duplicate lines of codes. I think some lines that we can find in both ElasticSearchXConnectors could be put in the base class, in order to keep only the specific part ES v2 and V5 in the sub-classes. An example is the method executeDeleteQuery that is almost the same in both classes.
Otherwise, looks good to me.

@1ambda
Copy link
Member Author

1ambda commented Nov 10, 2016

@bbonnin Thanks for the quick review :)

1. Duplicated code in executeDeleteQuery is due to incompatible API between 2.x and 5.x

// 2.x
public boolean isFound() { ... } 
// 5.x
RestStatus.NOT_FOUND != response.status())

But we can extract duplicated validation logic to the base class as you mentioned.

2. @bzz, @bbonnin Regarding to update docs,

Are there more files I should update? I am asking because

  • new maven profiles are added elasticsearch-2 (activated by default), elasticsearch-5
  • elasticsearch-5 profile requires java 8 +

3. The final hurdle is,

Hi @1ambda, wrapper query should work for the query part of the search request. Can you post more info on how it doesn't work for you please? Unfortunately I don't think we have the same for aggregations and the rest of a search request.
In general, I'd recommend to use the java api builders to create search requests that are already in the binary format, so no parsing is needed. Otherwise, I would suggest to have a look at the newly released Java REST client that doesn't have builders yet but allows to provide strings.

We need REST API client

  • Enhance test cases for agg, search. They only checks InterpreterResult.Success not the response currently

@stevenmccord
Copy link

@1ambda were you still working through this one? I was going to give it a test run if this is something that might be getting in. Noticed it hasn't been updated with master in a while, so just wanted to check to see if it might be worthwhile to give it a try. Thanks!

@1ambda
Copy link
Member Author

1ambda commented Jan 6, 2017

@stevenmccord Hi :)

I am trying to fix other urgent issue. So it's ok to close this PR if you want to implement.
The reason I am keeping this issue is sharing info about ES 5.0

@jaspinderdineout
Copy link

Is it live? Does zeppelin support elasricsearch 5.0 ???

@zjffdu
Copy link
Contributor

zjffdu commented Jan 27, 2017

\cc @1ambda

@1ambda
Copy link
Member Author

1ambda commented Jan 27, 2017

@jaspinderdineout Hi :)

Zeppelin doesn't support es 5.0 currently as i know.
I will close this PR. This issue can be easily handled after #1902.

@1ambda 1ambda closed this Jan 27, 2017
@brbrown25
Copy link

@1ambda recently elastico deprecated http and requires https now, is there any plan to upgraded this connector?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
7 participants