Support for Elasticsearch 8.x #125

Closed
sylvainOL opened this issue Jun 28, 2022 · 1 comment · Fixed by #126
@sylvainOL
Contributor

Describe the bug

Currently, spark-dependencies only works with Elasticsearch 7.x.

To Reproduce
Steps to reproduce the behavior:

  1. Install Elasticsearch 8.x
  2. Launch spark-dependencies

The logs then show:

Exception in thread "main" org.elasticsearch.hadoop.EsHadoopIllegalArgumentException: Cannot detect ES version - typically this happens if the network/Elasticsearch cluster is not accessible or when targeting a WAN/Cloud instance without the proper setting 'es.nodes.wan.only'
        at org.elasticsearch.hadoop.rest.InitializationUtils.discoverClusterInfo(InitializationUtils.java:348)
        at org.elasticsearch.hadoop.rest.RestService.findPartitions(RestService.java:220)
        at org.elasticsearch.spark.rdd.AbstractEsRDD.esPartitions$lzycompute(AbstractEsRDD.scala:79)
        at org.elasticsearch.spark.rdd.AbstractEsRDD.esPartitions(AbstractEsRDD.scala:78)
        at org.elasticsearch.spark.rdd.AbstractEsRDD.getPartitions(AbstractEsRDD.scala:48)
        at org.apache.spark.rdd.RDD.$anonfun$partitions$2(RDD.scala:273)
        at scala.Option.getOrElse(Option.scala:189)
        at org.apache.spark.rdd.RDD.partitions(RDD.scala:269)
        at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:49)
        at org.apache.spark.rdd.RDD.$anonfun$partitions$2(RDD.scala:273)
        at scala.Option.getOrElse(Option.scala:189)
        at org.apache.spark.rdd.RDD.partitions(RDD.scala:269)
        at org.apache.spark.Partitioner$.$anonfun$defaultPartitioner$4(Partitioner.scala:78)
        at org.apache.spark.Partitioner$.$anonfun$defaultPartitioner$4$adapted(Partitioner.scala:78)
        at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:238)
        at scala.collection.immutable.List.foreach(List.scala:392)
        at scala.collection.TraversableLike.map(TraversableLike.scala:238)
        at scala.collection.TraversableLike.map$(TraversableLike.scala:231)
        at scala.collection.immutable.List.map(List.scala:298)
        at org.apache.spark.Partitioner$.defaultPartitioner(Partitioner.scala:78)
        at org.apache.spark.rdd.RDD.$anonfun$groupBy$1(RDD.scala:714)
        at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
        at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
        at org.apache.spark.rdd.RDD.withScope(RDD.scala:385)
        at org.apache.spark.rdd.RDD.groupBy(RDD.scala:714)
        at org.apache.spark.api.java.JavaRDDLike.groupBy(JavaRDDLike.scala:243)
        at org.apache.spark.api.java.JavaRDDLike.groupBy$(JavaRDDLike.scala:239)
        at org.apache.spark.api.java.AbstractJavaRDDLike.groupBy(JavaRDDLike.scala:45)
        at io.jaegertracing.spark.dependencies.elastic.ElasticsearchDependenciesJob.run(ElasticsearchDependenciesJob.java:273)
        at io.jaegertracing.spark.dependencies.elastic.ElasticsearchDependenciesJob.run(ElasticsearchDependenciesJob.java:249)
        at io.jaegertracing.spark.dependencies.DependenciesSparkJob.run(DependenciesSparkJob.java:54)
        at io.jaegertracing.spark.dependencies.DependenciesSparkJob.main(DependenciesSparkJob.java:40)
Caused by: org.elasticsearch.hadoop.EsHadoopIllegalArgumentException: Unsupported/Unknown Elasticsearch version [8.2.3].Highest supported version is [7.x]. You may need to upgrade ES-Hadoop.
        at org.elasticsearch.hadoop.util.EsMajorVersion.parse(EsMajorVersion.java:91)
        at org.elasticsearch.hadoop.rest.RestClient.mainInfo(RestClient.java:756)
        at org.elasticsearch.hadoop.rest.InitializationUtils.discoverClusterInfo(InitializationUtils.java:338)
        ... 31 more
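
The "Caused by" line in the trace suggests the connector refuses the cluster purely on its reported major version. A minimal sketch of that kind of gate is below; it is not ES-Hadoop's actual code. The class `EsVersionCheck`, its methods, and the constant `HIGHEST_SUPPORTED_MAJOR` are hypothetical names chosen for illustration, and the value 7 is taken from the error message above.

```java
// Illustrative sketch only: mimics a major-version gate like the one the
// stack trace hints at. Names here are hypothetical, not ES-Hadoop's API.
final class EsVersionCheck {
    // Assumption: the bundled connector accepts major versions up to 7,
    // as stated in the error message ("Highest supported version is [7.x]").
    static final int HIGHEST_SUPPORTED_MAJOR = 7;

    /** Returns the major component of a version string such as "8.2.3". */
    public static int majorOf(String version) {
        int dot = version.indexOf('.');
        String major = dot < 0 ? version : version.substring(0, dot);
        return Integer.parseInt(major.trim());
    }

    /** True when the reported version is within the supported major range. */
    public static boolean isSupported(String version) {
        return majorOf(version) <= HIGHEST_SUPPORTED_MAJOR;
    }

    public static void main(String[] args) {
        // Defaults to the version from the report when no argument is given.
        String reported = args.length > 0 ? args[0] : "8.2.3";
        if (isSupported(reported)) {
            System.out.println("Elasticsearch " + reported + " is within the supported range");
        } else {
            System.out.println("Unsupported Elasticsearch version [" + reported
                    + "]; highest supported major is " + HIGHEST_SUPPORTED_MAJOR);
        }
    }
}
```

Under these assumptions, a 7.x cluster passes and the 8.2.3 cluster from this report is rejected, which is consistent with the fix needing a newer connector rather than a network setting such as 'es.nodes.wan.only'.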

Expected behavior
It would be nice to make it work with ES 8.x (Jaeger is "supporting" it)

Version (please complete the following information):

  • OS: Linux
  • Jaeger version: 1.31
  • Deployment: Kubernetes

Additional context

This might be fixed by using the following dependency in pom.xml:

<dependency>
    <groupId>org.elasticsearch</groupId>
    <artifactId>elasticsearch-spark-20_${version.scala.binary}</artifactId>
    <version>8.2.3</version>
</dependency>
@vmaleze

vmaleze commented Feb 14, 2024

For those of you who want to use this with Elasticsearch 8, here is the repo with the artefact built from @sylvainOL's PR: https://github.com/vmaleze/spark-dependencies-es8/pkgs/container/spark-dependencies-es8
Unfortunately, the tests fail on the latest-jaeger stage and I have trouble testing this on arm64, so I cannot fix them and submit a PR to merge this into the official repo.
However, the forked repo works like a charm with Elasticsearch 8.
