Thousands of SearchContextMissingException: No search context found for id #576
Did this happen before 2.1.1 or not? It looks like the scan is finished/closed and the id is unavailable. Anything suspicious? If you can replicate this on a small log job, you can enable … What version of ES are you using? Does this occur on all Spark tasks or just on one?
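To make the "scan is finished/closed and the id is unavailable" idea concrete, here is a toy model of a server-side scroll-context registry. This is purely illustrative, with hypothetical names; it is not Elasticsearch's internal implementation. The point is that a scroll id refers to server-side state that is freed once the scan is drained, so any request arriving after that can only fail:

```scala
import scala.collection.mutable

// Toy model (hypothetical names, not ES internals): contexts are registered
// per scroll id and freed when the scan is fully drained, after which any
// lookup with that id fails -- the toy analogue of SearchContextMissingException.
object ScrollContextToy {
  private val contexts = mutable.Map[Long, List[String]]()
  private var nextId = 0L

  def open(hits: List[String]): Long = {
    nextId += 1
    contexts(nextId) = hits
    nextId
  }

  // Returns the next batch, freeing the context once it is exhausted.
  def fetch(id: Long, size: Int): List[String] = {
    val hits = contexts.getOrElse(
      id,
      throw new NoSuchElementException(s"No search context found for id $id"))
    val (batch, rest) = hits.splitAt(size)
    if (rest.isEmpty) contexts.remove(id) else contexts(id) = rest
    batch
  }
}
```

Opening a scan over three hits and fetching in batches of two drains it on the second fetch; a third fetch with the same id then throws, which mirrors what the ES log shows when a client keeps scrolling past the end of a scan (or past the scroll timeout).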
Elasticsearch version is 1.7.2; it also happens with 2.1.0. It does not happen on all Spark jobs, but I have isolated the scenario 😄

```scala
import com.sksamuel.elastic4s.ElasticClient
import com.sksamuel.elastic4s.ElasticDsl._
import org.apache.spark.sql.SQLContext
import org.apache.spark.{SparkConf, SparkContext}
import org.elasticsearch.common.settings.ImmutableSettings
import org.elasticsearch.spark._

object EsScanLogApp {
  def main(args: Array[String]): Unit = {
    val sparkConf: SparkConf = new SparkConf()
      .setMaster("local[*]")
      .setAppName("EsScanLogApp")

    val settings = ImmutableSettings.builder.put("cluster.name", "test").build()
    val client = ElasticClient.remote(settings, "localhost", 9300)

    client.execute {
      deleteIndex("events-*")
    }.await

    // Depending on the number of documents you get more (or fewer) errors
    client.execute {
      bulk(
        (0 to 20000).map(_ => index into "events-2015.01.23" / "details" fields "foo" -> "bar")
          ++ (0 to 10).map(_ => index into "events-2015.09.23" / "details" fields "foo" -> "bar")
      )
    }.await

    val sc = new SparkContext(sparkConf)
    val sqlContext = new SQLContext(sc)
    val indexName = "events-*/details"
    sc.esRDD(indexName).collect()
  }
}
```

This causes the following logs in Spark; maybe it's related to the fact that it discovered the mapping of the second index (09.23) instead of the first (01.23)?
The error looks to be internal to ES, likely indicating there's an issue with the state (whether that's caused by the client or not is unclear). Since you are deleting and creating new docs, you could trigger a refresh afterwards and see whether it makes any difference before making the … Can you post your logs as a gist? Also consider turning the logging level to … Cheers,
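For raising the logging level as suggested above, es-hadoop logging is driven by the job's log4j configuration. The snippet below is a sketch of a `log4j.properties` fragment; the logger category names are the ones documented for es-hadoop's 2.x era, but verify them against the version in use:

```properties
# Sketch: raise es-hadoop verbosity in the Spark job's log4j.properties.
# TRACE on the rest package also dumps the HTTP traffic to/from ES,
# which is useful for seeing the failing scroll requests.
log4j.category.org.elasticsearch.hadoop=DEBUG
log4j.category.org.elasticsearch.hadoop.rest=TRACE
```

Note this only affects the client-side (Spark) logs; the `SearchContextMissingException` entries themselves appear in the Elasticsearch server logs.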
Hi @costin The deleting / indexing is only done to reproduce the issue; it doesn't need to be in the job. Since this is a pretty simple test case you could easily get the logs yourself; I currently don't have time to post them.
@sjoerdmulder I managed to replicate the issue and track it down (thanks to @clintongormley) to 2.1.x not picking up the new …
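The mapping hypothesis in the reproduction above can be sketched abstractly. This is a toy model with hypothetical names, not es-hadoop's actual resolution logic: when a wildcard pattern matches several indices whose mappings differ, a reader that keeps only one index's mapping will misread documents from the others, whereas merging all matching mappings avoids that:

```scala
// Toy sketch (hypothetical names): resolving a wildcard pattern against
// several indices, and why keeping a single index's mapping is lossy.
object MappingToy {
  type Mapping = Map[String, String] // field name -> field type

  val indices: Map[String, Mapping] = Map(
    "events-2015.01.23" -> Map("foo" -> "string"),
    "events-2015.09.23" -> Map("foo" -> "string", "extra" -> "long")
  )

  // Expand "events-*" into the concrete index names it matches.
  def resolve(pattern: String): Seq[String] = {
    val prefix = pattern.stripSuffix("*")
    indices.keys.toSeq.filter(_.startsWith(prefix)).sorted
  }

  // Naive: keep only the first matching index's mapping.
  def naiveMapping(pattern: String): Mapping =
    indices(resolve(pattern).head)

  // Safer: merge the mappings of every matching index.
  def mergedMapping(pattern: String): Mapping =
    resolve(pattern).flatMap(indices(_)).toMap
}
```

With the naive approach the `extra` field of the second index is invisible to the reader; the merged mapping sees both fields. Real mapping resolution also has to reconcile conflicting types, which this sketch ignores.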
@sjoerdmulder Since the issue was detected (thanks for the test) and fixed, I'm closing it. Feel free to open another one if it keeps occurring. Cheers,
Version 2.1.1
We are using es-hadoop with Spark, but each time our job runs the Elasticsearch logs are flooded with errors like the one below:
This is causing the log files to grow by many gigabytes each day.