Ver. 2.1.0 throws java.util.NoSuchElementException: None.get with nested type #504

Closed
rleiwang opened this issue Jul 14, 2015 · 2 comments

Env: elasticsearch-1.6.0, elasticsearch-hadoop-2.1.0
Mappings:

{
   "myindex": {
      "mappings": {
         "type1": {
            "properties": {
               "objectField": {
                  "type": "nested",
                  "properties": {
                     "stringField": {
                        "type": "string",
                        "index": "not_analyzed"
                     }
                  }
               }
            }
         }
      }
   }
}
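
A hypothetical sample document for this mapping (field names taken from the mapping above; the values are invented for illustration). With the nested type, objectField typically holds an array of objects, each indexed by Elasticsearch as a hidden sub-document:

{
   "objectField": [
      { "stringField": "value1" },
      { "stringField": "value2" }
   ]
}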

Spark Driver:

        // sqlCtx is an org.apache.spark.sql.SQLContext (the read() API needs Spark 1.4+);
        // index and type hold the values from the mapping above: "myindex" and "type1".
        DataFrame df = sqlCtx.read()
                .format("org.elasticsearch.spark.sql")
                .option("pushdown", "true")
                .option("strict", "true")
                .option("es.nodes", "localhost")
                .option("es.port", "9208")
                .load(index + "/" + type);
        // Any action (here, count()) triggers the scroll read that throws below.
        System.out.println("count " + df.count());
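
A small diagnostic sketch (an illustration, not from the original report): printing the resolved schema before calling count() shows how the connector maps the nested field into Spark SQL types. df is the DataFrame from the snippet above; printSchema() is standard Spark API, and the schema should come from the index mapping rather than from a scroll, so this can reveal whether the nested type already confuses schema resolution (an assumption worth verifying):

        df.printSchema();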

Exception:

java.util.NoSuchElementException: None.get
    at scala.None$.get(Option.scala:313)
    at scala.None$.get(Option.scala:311)
    at org.elasticsearch.spark.sql.RowValueReader$class.rowOrder(RowValueReader.scala:24)
    at org.elasticsearch.spark.sql.ScalaRowValueReader.rowOrder(ScalaEsRowValueReader.scala:13)
    at org.elasticsearch.spark.sql.ScalaRowValueReader.createMap(ScalaEsRowValueReader.scala:32)
    at org.elasticsearch.hadoop.serialization.ScrollReader.map(ScrollReader.java:620)
    at org.elasticsearch.hadoop.serialization.ScrollReader.read(ScrollReader.java:559)
    at org.elasticsearch.hadoop.serialization.ScrollReader.map(ScrollReader.java:636)
    at org.elasticsearch.hadoop.serialization.ScrollReader.read(ScrollReader.java:559)
    at org.elasticsearch.hadoop.serialization.ScrollReader.readHitAsMap(ScrollReader.java:358)
    at org.elasticsearch.hadoop.serialization.ScrollReader.readHit(ScrollReader.java:293)
    at org.elasticsearch.hadoop.serialization.ScrollReader.read(ScrollReader.java:188)
    at org.elasticsearch.hadoop.serialization.ScrollReader.read(ScrollReader.java:167)
    at org.elasticsearch.hadoop.rest.RestRepository.scroll(RestRepository.java:403)
    at org.elasticsearch.hadoop.rest.ScrollQuery.hasNext(ScrollQuery.java:76)
    at org.elasticsearch.spark.rdd.AbstractEsRDDIterator.hasNext(AbstractEsRDDIterator.scala:43)
    at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
    at org.apache.spark.sql.execution.Aggregate$$anonfun$doExecute$1$$anonfun$6.apply(Aggregate.scala:129)
    at org.apache.spark.sql.execution.Aggregate$$anonfun$doExecute$1$$anonfun$6.apply(Aggregate.scala:126)
    at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$17.apply(RDD.scala:686)
    at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$17.apply(RDD.scala:686)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:70)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
    at org.apache.spark.scheduler.Task.run(Task.scala:70)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)

The same code works fine with this mapping:

{
   "myindex": {
      "mappings": {
         "type1": {
            "properties": {
               "objectField": {
                  "type": "object",
                  "properties": {
                     "stringField": {
                        "type": "string",
                        "index": "not_analyzed"
                     }
                  }
               }
            }
         }
      }
   }
}
costin (Member) commented Jul 27, 2015

If I understand correctly, the issue stems from the fact that in one case the mapping is nested while in the other it is object.
Can you please turn on TRACE logging for the org.elasticsearch.hadoop.rest package and put the logs in a gist? That will show the actual JSON returned by Elasticsearch in each of the two cases and help narrow down the problem.
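
One way to capture that logging, sketched under the assumption that the driver uses log4j 1.x (the default in Spark 1.x); the logger name comes from the comment above, the rest is illustrative:

        // Needs: import org.apache.log4j.Level;
        //        import org.apache.log4j.Logger;
        // Raise es-hadoop's REST layer to TRACE so the raw JSON
        // requests/responses show up in the driver and executor logs.
        Logger.getLogger("org.elasticsearch.hadoop.rest").setLevel(Level.TRACE);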

Cheers,

costin (Member) commented Oct 28, 2015

@rleiwang Since there hasn't been any update on this issue, I'm closing it. Feel free to open another one if the issue is still present.
