Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FLINK-26088][Connectors/ElasticSearch] Add Elasticsearch 8.0 support #53

Merged
merged 1 commit into from
Mar 21, 2024

Conversation

mtfelisb
Copy link
Contributor

Description

The eighth version was released in February last year and has a lot of internal changes, such as releasing a whole new Java API Client. Therefore, we can create a full new integration based on the latest Java API, and rely on the compatibility mode to previous versions. That is the goal of this pull request.

Jira

FLINK-26088

@boring-cyborg
Copy link

boring-cyborg bot commented Mar 25, 2023

Thanks for opening this pull request! Please check out our contributing guidelines. (https://flink.apache.org/contributing/how-to-contribute.html)

@MartijnVisser
Copy link
Contributor

@mtfelisb What's the status of this PR?

@mtfelisb
Copy link
Contributor Author

@MartijnVisser I'm wrapping up, and just need to write a few more tests. Even though I'm concerned about the implementation using Kryo, I'll mark it as ready to review and get your feedback.

@MartijnVisser
Copy link
Contributor

Even though I'm concerned about the implementation using Kryo, I'll mark it as ready to review and get your feedback.

@reta Could you help with a review?

@reta
Copy link
Member

reta commented May 31, 2023

The top level pom.xml misses the new module:

		<module>flink-connector-elasticsearch8</module>

@StefanXiepj
Copy link

I try to compile your code, but throwing follow errors:

[INFO] There are 69 errors reported by Checkstyle 8.14 with /tools/maven/checkstyle.xml ruleset.
[ERROR] src/main/java/org/apache/flink/connector/elasticsearch/sink/OperationSerializer.java:[33] (javadoc) JavadocType: 缺少 Javadoc 。
[ERROR] src/main/java/org/apache/flink/connector/elasticsearch/sink/Operation.java:[26] (imports) AvoidStarImport: 不应使用 '.*' 形式的导入 - java.io.* 。
[ERROR] src/main/java/org/apache/flink/connector/elasticsearch/sink/Operation.java:[29] (javadoc) JavadocType: 缺少 Javadoc 。
[ERROR] src/main/java/org/apache/flink/connector/elasticsearch/sink/Operation.java:[53] (blocks) NeedBraces: 'if' 结构必须使用大括号 '{}'。
[ERROR] src/main/java/org/apache/flink/connector/elasticsearch/sink/Operation.java:[54] (blocks) NeedBraces: 'if' 结构必须使用大括号 '{}'。
[ERROR] src/main/java/org/apache/flink/connector/elasticsearch/sink/INetworkConfigFactory.java:[21] (whitespace) EmptyLineSeparator: 'package 前应有空行。
[ERROR] src/main/java/org/apache/flink/connector/elasticsearch/sink/INetworkConfigFactory.java:[27] (javadoc) JavadocType: 缺少 Javadoc 。
[ERROR] src/main/java/org/apache/flink/connector/elasticsearch/sink/Elasticsearch8Writer.java:[32] (imports) ImportOrder: Import org.apache.flink.api.connector.sink2.Sink appears after other imports that it should precede
[ERROR] src/main/java/org/apache/flink/connector/elasticsearch/sink/Elasticsearch8Writer.java:[35] (imports) ImportOrder: 导入组之前的额外空行 'org.apache.flink.connector.base.sink.writer.BufferedRequestState'。

@mtfelisb
Copy link
Contributor Author

mtfelisb commented Jun 6, 2023

Even though I'm concerned about the implementation using Kryo, I'll mark it as ready to review and get your feedback.

@reta Could you help with a review?

@reta thank you for the review. I appreciate that. I'm going to work on what you pointed out.

@mtfelisb
Copy link
Contributor Author

mtfelisb commented Jun 6, 2023

I try to compile your code, but throwing follow errors:

[INFO] There are 69 errors reported by Checkstyle 8.14 with /tools/maven/checkstyle.xml ruleset.
[ERROR] src/main/java/org/apache/flink/connector/elasticsearch/sink/OperationSerializer.java:[33] (javadoc) JavadocType: 缺少 Javadoc 。
[ERROR] src/main/java/org/apache/flink/connector/elasticsearch/sink/Operation.java:[26] (imports) AvoidStarImport: 不应使用 '.*' 形式的导入 - java.io.* 。
[ERROR] src/main/java/org/apache/flink/connector/elasticsearch/sink/Operation.java:[29] (javadoc) JavadocType: 缺少 Javadoc 。
[ERROR] src/main/java/org/apache/flink/connector/elasticsearch/sink/Operation.java:[53] (blocks) NeedBraces: 'if' 结构必须使用大括号 '{}'。
[ERROR] src/main/java/org/apache/flink/connector/elasticsearch/sink/Operation.java:[54] (blocks) NeedBraces: 'if' 结构必须使用大括号 '{}'。
[ERROR] src/main/java/org/apache/flink/connector/elasticsearch/sink/INetworkConfigFactory.java:[21] (whitespace) EmptyLineSeparator: 'package 前应有空行。
[ERROR] src/main/java/org/apache/flink/connector/elasticsearch/sink/INetworkConfigFactory.java:[27] (javadoc) JavadocType: 缺少 Javadoc 。
[ERROR] src/main/java/org/apache/flink/connector/elasticsearch/sink/Elasticsearch8Writer.java:[32] (imports) ImportOrder: Import org.apache.flink.api.connector.sink2.Sink appears after other imports that it should precede
[ERROR] src/main/java/org/apache/flink/connector/elasticsearch/sink/Elasticsearch8Writer.java:[35] (imports) ImportOrder: 导入组之前的额外空行 'org.apache.flink.connector.base.sink.writer.BufferedRequestState'。

@StefanXiepj I'll fix these Checkstyle errors, and once I do, I'll let you know.

@reswqa
Copy link
Member

reswqa commented Aug 22, 2023

What's the progress of this?

@mtfelisb
Copy link
Contributor Author

What's the progress of this?

Hi @reswqa, thanks for asking.
I'm writing the retry mechanism inspired by the PR that @reta shared here. After that, I'll write tests and mark the PR as ready for review.

@reswqa
Copy link
Member

reswqa commented Aug 24, 2023

Thanks @mtfelisb! I have seen many people's urgent need for this, so feel free to ping me for reviewing to accelerate the progress. And I can provide any possible help :)

@StefanXiepj
Copy link

Thanks @mtfelisb! I have seen many people's urgent need for this, so feel free to ping me for reviewing to accelerate the progress. And I can provide any possible help :)

@reswqa can you help me to review my another pr for this issue ? thanks

@mtfelisb
Copy link
Contributor Author

Guys, I wrote a bunch of tests heavily inspired by apache/flink-connector-opensearch#5 since they have very similar implementations. Thanks @reta for sharing, it helped a lot!
I need to create the documentation and would appreciate some guidance on how to generate that.

I'm looking forward to your feedback :)
cc @MartijnVisser @reswqa

@mtfelisb mtfelisb marked this pull request as ready for review September 16, 2023 20:40
@mtfelisb mtfelisb requested a review from reta September 16, 2023 20:40
@rinkako
Copy link
Contributor

rinkako commented Oct 7, 2023

hi @mtfelisb , I read the code and found that it has no implementation for DynamicTableSink, hence this es8 sink cannot be used in flink-sql. Would you have any plan to work on this point? 😄

@mtfelisb
Copy link
Contributor Author

mtfelisb commented Oct 9, 2023

hi @mtfelisb , I read the code and found that it has no implementation for DynamicTableSink, hence this es8 sink cannot be used in flink-sql. Would you have any plan to work on this point? 😄

Hi, @rinkako! I wasn't planning to do it. To be honest with you I'm not quite familiar with it. However, it could be a significant increment, indeed. I'm open to collaborations 😃

@snuyanzin
Copy link
Contributor

snuyanzin commented Feb 3, 2024

I have a question about license
it looks like we don't have any info about Elastic 8.x in NOTICE, should we add it or what is the best approach here?
Based on the info [1] it is no longer produced under Apache 2.0 license after 7.10

[1] https://www.elastic.co/pricing/faq/licensing

@reta
Copy link
Member

reta commented Feb 3, 2024

Based on the info [1] it is no longer produced under Apache 2.0 license after 7.10

The ES Java Client license is still Apache [1] (it is an exemption)

[1] https://github.com/elastic/elasticsearch-java

@snuyanzin
Copy link
Contributor

good to know, thanks

@mtfelisb
Copy link
Contributor Author

mtfelisb commented Feb 8, 2024

Hi, @MartijnVisser. I have a question regarding compatibility. This pull request is working on every Flink version but 1.19-SNAPSHOT. I've checked this issue, however, I'm already using TestSinkInitContext here. It's unclear if I should consider 1.19-SNAPSHOT and upward only. Am I missing something here?

@MartijnVisser
Copy link
Contributor

This pull request is working on every Flink version but 1.19-SNAPSHOT

I'm suspecting that's because of https://issues.apache.org/jira/browse/FLINK-25857 and https://issues.apache.org/jira/browse/FLINK-33972 which have introduced new changes to the Sink API.

It's unclear if I should consider 1.19-SNAPSHOT and upward only.

There's only thing that the community agreed on and that is that the last two minor version of Flink need to be supported. At this moment of writing, that means that Flink 1.17 and Flink 1.18 need to be supported. At the moment Flink 1.19 gets officially released, we need to have a connector that supports both Flink 1.18 and Flink 1.19.

However, that doesn't mean that connector support needs to come from one connector version. It's perfectly fine to say "There's the Flink Elastichsearch connector v3.0 that supports Flink 1.17 and 1.18, and we'll have a new Flink Elasticsearch connector v3.1 / v4.0 that supports only Flink 1.19 and upwards". It just means that there's more work involved in backporting any bugfixes (and potentially new features) to two branches. It's explained in more depth at https://cwiki.apache.org/confluence/display/FLINK/Externalized+Connector+development#ExternalizedConnectordevelopment-Support

In this case, I do think that it would be possible to add support for Flink 1.17/1.18 and the upcoming Flink 1.19 in one branch. At least, that's also what's being tried in the Flink Kafka connector, see apache/flink-connector-kafka#84

@Jiabao-Sun
Copy link
Contributor

Hi, @MartijnVisser. I have a question regarding compatibility. This pull request is working on every Flink version but 1.19-SNAPSHOT. I've checked this issue, however, I'm already using TestSinkInitContext here. It's unclear if I should consider 1.19-SNAPSHOT and upward only. Am I missing something here?

TestSinkInitContext is ok if we use Sink.createWriter method to create a writer.
https://github.com/apache/flink/pull/24180/files#r1467200868

@mtfelisb
Copy link
Contributor Author

mtfelisb commented Feb 9, 2024

This pull request is working on every Flink version but 1.19-SNAPSHOT

I'm suspecting that's because of https://issues.apache.org/jira/browse/FLINK-25857 and https://issues.apache.org/jira/browse/FLINK-33972 which have introduced new changes to the Sink API.

It's unclear if I should consider 1.19-SNAPSHOT and upward only.

There's only thing that the community agreed on and that is that the last two minor version of Flink need to be supported. At this moment of writing, that means that Flink 1.17 and Flink 1.18 need to be supported. At the moment Flink 1.19 gets officially released, we need to have a connector that supports both Flink 1.18 and Flink 1.19.

However, that doesn't mean that connector support needs to come from one connector version. It's perfectly fine to say "There's the Flink Elastichsearch connector v3.0 that supports Flink 1.17 and 1.18, and we'll have a new Flink Elasticsearch connector v3.1 / v4.0 that supports only Flink 1.19 and upwards". It just means that there's more work involved in backporting any bugfixes (and potentially new features) to two branches. It's explained in more depth at https://cwiki.apache.org/confluence/display/FLINK/Externalized+Connector+development#ExternalizedConnectordevelopment-Support

In this case, I do think that it would be possible to add support for Flink 1.17/1.18 and the upcoming Flink 1.19 in one branch. At least, that's also what's being tried in the Flink Kafka connector, see apache/flink-connector-kafka#84

I understand now, it makes total sense. I appreciate the detailed information you provided, @MartijnVisser 😀

@drorventura
Copy link

drorventura commented Mar 19, 2024

Hi
@snuyanzin , @Jiabao-Sun
Is this planned to be merged soon?

@mtfelisb mtfelisb requested a review from reswqa March 19, 2024 10:23
@drorventura
Copy link

Thank you @mtfelisb
@reswqa could you please approve? 🙏

Copy link
Member

@reswqa reswqa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @mtfelisb, looks good to me. Could you please rebase and squash all commits?

@mtfelisb
Copy link
Contributor Author

Thanks @mtfelisb, looks good to me. Could you please rebase and squash all commits?

Thank you for the review, @reswqa! Just did it :)

Copy link
Member

@reswqa reswqa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @mtfelisb, merging...

@reswqa reswqa merged commit 963c505 into apache:main Mar 21, 2024
12 checks passed
Copy link

boring-cyborg bot commented Mar 21, 2024

Awesome work, congrats on your first merged pull request!

@drorventura
Copy link

when is the next release planned?

@reswqa
Copy link
Member

reswqa commented Mar 21, 2024

when is the next release planned?

TBH, I'm not really sure. But I think we will probably release a series of connectors that supporting flink-1.19 in the near future. You can make a release request on the flink mailing list.

@ebdxflr
Copy link

ebdxflr commented Mar 25, 2024

when is the next release planned?

TBH, I'm not really sure. But I think we will probably release a series of connectors that supporting flink-1.19 in the near future. You can make a release request on the flink mailing list.

the flink mailing list is what was missing, since requests were made all over other mediums, from git to jira... can we speed things up @Flink

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
10 participants