Description
I just ran into a real-world situation where an array field lives inside the objects of another array.
import org.apache.spark.SparkContext
import org.apache.spark.sql.SQLContext
import org.elasticsearch.spark._   // saveJsonToEs

val sc = new SparkContext(conf)

// Declare every array field, including the scores array nested inside the array of objects.
val cfg = collection.mutable.Map("es.field.read.as.array.include" -> "nested.bar,foo,nested.bar.scores")
val json = """{"foo" : [5,6], "nested": { "bar" : [{"date":"2015-01-01", "scores":[1,2]},{"date":"2015-01-01", "scores":[3,4]}], "what": "now" } }"""
sc.makeRDD(Seq(json)).saveJsonToEs("spark/mappingtest")

val df = new SQLContext(sc).read
  .options(cfg)
  .format("org.elasticsearch.spark.sql")
  .load("spark/mappingtest")
println(df.collect().toList)
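For reference, this is the shape I expect to get back; the commented layout below is only my expectation based on the JSON above, not actual output (and it assumes the load itself succeeds):

// Expectation sketch only: nested.bar should come back as an array of structs,
// each struct carrying its own scores array (an array inside an array of objects).
df.printSchema()
// root
//  |-- foo: array (element: long)
//  |-- nested: struct
//  |    |-- bar: array (element: struct)
//  |    |    |-- date: string
//  |    |    |-- scores: array (element: long)
//  |    |-- what: string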
Adding nested.bar.scores to es.field.read.as.array.include does not seem to give ES-Hadoop the hint that there is an array at this level.
It throws:
EsHadoopIllegalStateException: Field 'nested.bar.scores' not found; typically this occurs with arrays which are not mapped as single value
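As a stopgap (not a fix), the rest of the document can at least be read by leaving the problematic path out of the read; this is only a sketch, assuming your ES-Hadoop version supports es.read.field.exclude:

// Workaround sketch (assumption, not a fix): exclude nested.bar.scores from the
// read so the remaining fields still load into a DataFrame.
val cfgNoScores = Map(
  "es.field.read.as.array.include" -> "nested.bar,foo",
  "es.read.field.exclude" -> "nested.bar.scores")
val dfNoScores = new SQLContext(sc).read
  .options(cfgNoScores)
  .format("org.elasticsearch.spark.sql")
  .load("spark/mappingtest")
println(dfNoScores.collect().toList)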