
[Bug]: Spark no longer works with new charts that use elasticsearch 8+ #574

Open · Stevenpc3 opened this issue on May 22, 2024 · 5 comments
Labels: bug (Something isn't working)

@Stevenpc3 (Contributor)

What happened?

As a user of the "system architecture" tab in Jaeger, I would like to use Spark to generate the diagrams.

However, Spark no longer works with the new charts that use Elasticsearch 8+.

The Spark job fails with logs stating that its bundled ES-Hadoop client supports Elasticsearch 7 at most.

Steps to reproduce

  1. Deploy using the new charts with Elasticsearch 8+
  2. Produce traces that are written to Elasticsearch
  3. Run a Spark job (see the sketch after this list)
  4. Check the logs of the Spark job for errors
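
For quick verification, the Spark job can be triggered manually instead of waiting for the nightly schedule. A minimal sketch, assuming the chart created a CronJob named jaeger-spark (the actual name depends on your Helm release, so adjust accordingly):

# Create a one-off Job from the existing CronJob (jaeger-spark is an assumed name)
kubectl create job spark-manual --from=cronjob/jaeger-spark
# Follow its logs and look for the ES-Hadoop version error
kubectl logs -f job/spark-manual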

Expected behavior

The Spark job completes as it did before.

Relevant log output

24/05/22 15:41:20 INFO ElasticsearchDependenciesJob: Running Dependencies job for 2024-05-22T00:00Z, reading from jaeger-span-2024-05-22 index, result storing to jaeger-dependencies-2024-05-22
Exception in thread "main" org.elasticsearch.hadoop.EsHadoopIllegalArgumentException: Cannot detect ES version - typically this happens if the network/Elasticsearch cluster is not accessible or when targeting a WAN/Cloud instance without the proper setting 'es.nodes.wan.only'
        at org.elasticsearch.hadoop.rest.InitializationUtils.discoverClusterInfo(InitializationUtils.java:340)
        at org.elasticsearch.hadoop.rest.RestService.findPartitions(RestService.java:220)
        at org.elasticsearch.spark.rdd.AbstractEsRDD.esPartitions$lzycompute(AbstractEsRDD.scala:79)
        at org.elasticsearch.spark.rdd.AbstractEsRDD.esPartitions(AbstractEsRDD.scala:78)
        at org.elasticsearch.spark.rdd.AbstractEsRDD.getPartitions(AbstractEsRDD.scala:48)
        at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:253)
        at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:251)
        at scala.Option.getOrElse(Option.scala:121)
        at org.apache.spark.rdd.RDD.partitions(RDD.scala:251)
        at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
        at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:253)
        at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:251)
        at scala.Option.getOrElse(Option.scala:121)
        at org.apache.spark.rdd.RDD.partitions(RDD.scala:251)
        at org.apache.spark.Partitioner$$anonfun$4.apply(Partitioner.scala:75)
        at org.apache.spark.Partitioner$$anonfun$4.apply(Partitioner.scala:75)
        at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
        at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
        at scala.collection.immutable.List.foreach(List.scala:381)
        at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
        at scala.collection.immutable.List.map(List.scala:285)
        at org.apache.spark.Partitioner$.defaultPartitioner(Partitioner.scala:75)
        at org.apache.spark.rdd.RDD$$anonfun$groupBy$1.apply(RDD.scala:691)
        at org.apache.spark.rdd.RDD$$anonfun$groupBy$1.apply(RDD.scala:691)
        at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
        at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
        at org.apache.spark.rdd.RDD.withScope(RDD.scala:363)
        at org.apache.spark.rdd.RDD.groupBy(RDD.scala:690)
        at org.apache.spark.api.java.JavaRDDLike$class.groupBy(JavaRDDLike.scala:243)
        at org.apache.spark.api.java.AbstractJavaRDDLike.groupBy(JavaRDDLike.scala:45)
        at io.jaegertracing.spark.dependencies.elastic.ElasticsearchDependenciesJob.run(ElasticsearchDependenciesJob.java:236)
        at io.jaegertracing.spark.dependencies.elastic.ElasticsearchDependenciesJob.run(ElasticsearchDependenciesJob.java:212)
        at io.jaegertracing.spark.dependencies.DependenciesSparkJob.run(DependenciesSparkJob.java:54)
        at io.jaegertracing.spark.dependencies.DependenciesSparkJob.main(DependenciesSparkJob.java:40)
Caused by: org.elasticsearch.hadoop.EsHadoopIllegalArgumentException: Unsupported/Unknown Elasticsearch version [8.13.2].Highest supported version is [7.x]. You may need to upgrade ES-Hadoop.
        at org.elasticsearch.hadoop.util.EsMajorVersion.parse(EsMajorVersion.java:91)
        at org.elasticsearch.hadoop.rest.RestClient.mainInfo(RestClient.java:746)
        at org.elasticsearch.hadoop.rest.InitializationUtils.discoverClusterInfo(InitializationUtils.java:330)
        ... 33 more
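
The root cause is in the last frames: the ES-Hadoop client bundled with the Spark job reads the cluster version from Elasticsearch's root endpoint and rejects anything newer than 7.x. One way to confirm which version the job sees, as a sketch run from a pod inside the cluster (the service name jaeger-elasticsearch comes from the fullnameOverride in the config below):

# The root endpoint reports the cluster version, e.g. "number" : "8.13.2"
curl -s http://jaeger-elasticsearch:9200 | grep '"number"'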

Screenshot

No response

Additional context

No response

Jaeger backend version

3.0.7

SDK

No response

Pipeline

No response

Storage backend

Elasticsearch 8+

Operating system

Linux

Deployment model

Kubernetes

Deployment configs

# -- enable or disable Jaeger
enabled: true

storage:
  type: elasticsearch
  elasticsearch:
    # make this a template that decides based on devMode and can configure properly
    host: "jaeger-elasticsearch"
    usePassword: false
    antiAffinity: "soft"

# -- Preferred long term backend storage
elasticsearch:
  master:
    masterOnly: false
    replicaCount: 1
    lifecycleHooks:
      postStart:
        exec:
          command:
            - bash
            - -c
            - |
              #!/bin/bash
              # Add a template to adjust number of shards/replicas
              TEMPLATE_NAME=no_replicas
              # INDEX_PATTERN1="jaeger-span-*"
              # INDEX_PATTERN2="jaeger-service-*"
              INDEX_PATTERN1="jaeger-dependencies-*"
              ES_URL=http://localhost:9200
              while [[ "$(curl -s -o /dev/null -w '%{http_code}\n' $ES_URL)" != "200" ]]; do sleep 1; done
              curl -XPUT "$ES_URL/_index_template/$TEMPLATE_NAME" -H 'Content-Type: application/json' -d'{"index_patterns":['\""$INDEX_PATTERN1"\"'],"template":{"settings":{"number_of_replicas":"0"}}}'
  data:
    replicaCount: 0
  coordinating:
    replicaCount: 0
  ingest:
    replicaCount: 0
  fullnameOverride: "jaeger-elasticsearch"
  volumeClaimTemplate:
    accessModes: ["ReadWriteOnce"]
    resources:
      requests:
        storage: 3Gi

# -- For support with older Trace formats
agent:
    enabled: false

# -- The backend storage type to use
provisionDataStore:
  cassandra: false
  elasticsearch: true
  kafka: false

# -- The service that collects and serves trace information
collector:
  service:
    otlp:
      grpc:
        port: 4317
        name: oltp-grpc
      http:
        port: 4318
        name: oltp-http
  cmdlineParams:
    es.num-replicas: "0"

# -- The Jaeger UI service
query:
  agentSidecar:
    enabled: false
  # -- This should start with a /
  basePath: /jaeger

# Jaeger Spark job to generate the system architecture
spark:
  enabled: true
  schedule: "00 21 * * *"
Stevenpc3 added the bug (Something isn't working) label on May 22, 2024
@Stevenpc3 (Contributor, Author) commented on May 22, 2024

The registry in the chart needs to be updated based on this comment: #532 (comment).

The correct registry should be made part of the chart.

@Stevenpc3 (Contributor, Author)

@dpericaxon @yurishkuro Why is spark-dependencies hosted on GitHub's container registry while the rest are hosted on Docker Hub? https://github.com/orgs/jaegertracing/packages

That is a bit confusing, especially since there is one on Docker Hub that is claimed to be outdated via jaegertracing/spark-dependencies#137 (comment).

@yurishkuro (Member)

I don't know how/why that decision was made. I agree it would've been better to use the same Docker and Quay hosting we use for other images.

@yurishkuro (Member)

I updated the README for spark-dependencies. I think this Helm chart should be pointing to a different location too:

repository: jaegertracing/spark-dependencies

@Stevenpc3 (Contributor, Author) commented on May 28, 2024

Yeah, that link to the values is what I meant in #574 (comment).

I can make a PR. I think just setting the registry and repo to default to ghcr.io will be fine. I did this locally, since we use global.imageRegistry; then it will work out of the box for others.
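
A minimal values override along those lines, as a sketch (the spark.image keys and the tag here are assumptions; verify the exact key names against the chart's values.yaml):

spark:
  enabled: true
  image:
    # Point at the GitHub Container Registry location referenced above
    registry: ghcr.io
    repository: jaegertracing/spark-dependencies
    tag: latest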
