es.field.read.as.array.include only works one level deep #589

Closed
@jeffsteinmetz

I just ran into a real-world situation where an array of objects sits inside a document, and each of those objects contains another array.

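import org.apache.spark.SparkContext
import org.apache.spark.sql.SQLContext
import org.elasticsearch.spark._ // adds saveJsonToEs to RDDs

// `conf` is a SparkConf for the Elasticsearch cluster (not shown in this report)
val sc = new SparkContext(conf)
// fields that should be read back as arrays, including the nested one
val cfg = collection.mutable.Map("es.field.read.as.array.include" -> "nested.bar,foo,nested.bar.scores")
val json = """{"foo" : [5,6], "nested": { "bar" : [{"date":"2015-01-01", "scores":[1,2]},{"date":"2015-01-01", "scores":[3,4]}], "what": "now" } }"""
sc.makeRDD(Seq(json)).saveJsonToEs("spark/mappingtest")
val df = new SQLContext(sc).read
      .options(cfg)
      .format("org.elasticsearch.spark.sql")
      .load("spark/mappingtest")
println(df.collect().toList)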

Adding nested.bar.scores to es.field.read.as.array.include does not seem to give ES-Hadoop the hint that there is an array at this level.

The read throws:
EsHadoopIllegalStateException: Field 'nested.bar.scores' not found; typically this occurs with arrays which are not mapped as single value
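
For reference, here is a minimal sketch of which dotted paths in the sample document actually hold arrays, which is why the setting above lists three entries. The `ArrayPaths` object and its `arrayPaths` helper are hypothetical names used only for illustration and are not part of elasticsearch-hadoop. The document is written as plain Scala maps and sequences instead of JSON to keep the sketch dependency-free.

```scala
// Hypothetical helper (not an elasticsearch-hadoop API): walks a document
// represented as nested Maps/Seqs and collects the dotted path of every
// field whose value is an array.
object ArrayPaths {
  def arrayPaths(value: Any, prefix: String = ""): Set[String] = value match {
    case m: Map[_, _] =>
      // Descend into each field, extending the dotted path.
      m.toSeq.flatMap { case (k, v) =>
        val path = if (prefix.isEmpty) k.toString else s"$prefix.$k"
        arrayPaths(v, path)
      }.toSet
    case s: Seq[_] =>
      // The array lives at `prefix`; its elements share that same path.
      Set(prefix) ++ s.flatMap(arrayPaths(_, prefix))
    case _ =>
      Set.empty
  }

  // Same shape as the JSON document in the report.
  val doc: Map[String, Any] = Map(
    "foo" -> Seq(5, 6),
    "nested" -> Map(
      "bar" -> Seq(
        Map("date" -> "2015-01-01", "scores" -> Seq(1, 2)),
        Map("date" -> "2015-01-01", "scores" -> Seq(3, 4))),
      "what" -> "now"))

  // Comma-separated list in the form the include setting expects.
  val include: String = arrayPaths(doc).toSeq.sorted.mkString(",")
  // include == "foo,nested.bar,nested.bar.scores"
}
```

The second-level path nested.bar.scores is the one the connector does not appear to honor, even though it shows up here alongside the two paths that do work.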
