Ver. 2.1.0 throws java.util.NoSuchElementException: None.get with nested type #504

Closed
rleiwang opened this issue Jul 14, 2015 · 2 comments

Env: elasticsearch-1.6.0, elasticsearch-hadoop-2.1.0
Mappings:

{
   "myindex": {
      "mappings": {
         "type1": {
            "properties": {
               "objectField": {
                  "type": "nested",
                  "properties": {
                     "stringField": {
                        "type": "string",
                        "index": "not_analyzed"
                     }
                  }
               }
            }
         }
      }
   }
}
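
A hypothetical sample document for this mapping (field names taken from the mapping above; the values are invented for illustration). With the nested type, objectField typically holds an array of objects, each indexed by Elasticsearch as a hidden sub-document:

{
   "objectField": [
      { "stringField": "value1" },
      { "stringField": "value2" }
   ]
}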

Spark Driver:

        // sqlCtx is an org.apache.spark.sql.SQLContext (the read() API needs Spark 1.4+);
        // index and type hold the values from the mapping above: "myindex" and "type1".
        DataFrame df = sqlCtx.read()
                .format("org.elasticsearch.spark.sql")
                .option("pushdown", "true")
                .option("strict", "true")
                .option("es.nodes", "localhost")
                .option("es.port", "9208")
                .load(index + "/" + type);
        // Any action (here, count()) triggers the scroll read that throws below.
        System.out.println("count " + df.count());
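
A small diagnostic sketch (an illustration, not from the original report): printing the resolved schema before calling count() shows how the connector maps the nested field into Spark SQL types. df is the DataFrame from the snippet above; printSchema() is standard Spark API, and the schema should come from the index mapping rather than from a scroll, so this can reveal whether the nested type already confuses schema resolution (an assumption worth verifying):

        df.printSchema();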

Exception:

java.util.NoSuchElementException: None.get
    at scala.None$.get(Option.scala:313)
    at scala.None$.get(Option.scala:311)
    at org.elasticsearch.spark.sql.RowValueReader$class.rowOrder(RowValueReader.scala:24)
    at org.elasticsearch.spark.sql.ScalaRowValueReader.rowOrder(ScalaEsRowValueReader.scala:13)
    at org.elasticsearch.spark.sql.ScalaRowValueReader.createMap(ScalaEsRowValueReader.scala:32)
    at org.elasticsearch.hadoop.serialization.ScrollReader.map(ScrollReader.java:620)
    at org.elasticsearch.hadoop.serialization.ScrollReader.read(ScrollReader.java:559)
    at org.elasticsearch.hadoop.serialization.ScrollReader.map(ScrollReader.java:636)
    at org.elasticsearch.hadoop.serialization.ScrollReader.read(ScrollReader.java:559)
    at org.elasticsearch.hadoop.serialization.ScrollReader.readHitAsMap(ScrollReader.java:358)
    at org.elasticsearch.hadoop.serialization.ScrollReader.readHit(ScrollReader.java:293)
    at org.elasticsearch.hadoop.serialization.ScrollReader.read(ScrollReader.java:188)
    at org.elasticsearch.hadoop.serialization.ScrollReader.read(ScrollReader.java:167)
    at org.elasticsearch.hadoop.rest.RestRepository.scroll(RestRepository.java:403)
    at org.elasticsearch.hadoop.rest.ScrollQuery.hasNext(ScrollQuery.java:76)
    at org.elasticsearch.spark.rdd.AbstractEsRDDIterator.hasNext(AbstractEsRDDIterator.scala:43)
    at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
    at org.apache.spark.sql.execution.Aggregate$$anonfun$doExecute$1$$anonfun$6.apply(Aggregate.scala:129)
    at org.apache.spark.sql.execution.Aggregate$$anonfun$doExecute$1$$anonfun$6.apply(Aggregate.scala:126)
    at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$17.apply(RDD.scala:686)
    at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$17.apply(RDD.scala:686)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:70)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
    at org.apache.spark.scheduler.Task.run(Task.scala:70)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)

The same code works fine with this mapping:

{
   "myindex": {
      "mappings": {
         "type1": {
            "properties": {
               "objectField": {
                  "type": "object",
                  "properties": {
                     "stringField": {
                        "type": "string",
                        "index": "not_analyzed"
                     }
                  }
               }
            }
         }
      }
   }
}
costin (Member) commented Jul 27, 2015

If I understand correctly, the issue stems from the fact that in one case the mapping is nested while in the other it is object.
Can you please turn on TRACE logging for the org.elasticsearch.hadoop.rest package and put the logs in a gist? That will show the actual JSON returned by Elasticsearch in each of the two cases and help narrow down the problem.
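
One way to capture that logging, sketched under the assumption that the driver uses log4j 1.x (the default in Spark 1.x); the logger name comes from the comment above, the rest is illustrative:

        // Needs: import org.apache.log4j.Level;
        //        import org.apache.log4j.Logger;
        // Raise es-hadoop's REST layer to TRACE so the raw JSON
        // requests/responses show up in the driver and executor logs.
        Logger.getLogger("org.elasticsearch.hadoop.rest").setLevel(Level.TRACE);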

Cheers,

costin (Member) commented Oct 28, 2015

@rleiwang Since there hasn't been any update on this issue, I'm closing it. Feel free to open another one if the issue is still present.
