Unable to index JSON from HDFS using SchemaRDD.saveToEs()

Elasticsearch-hadoop: elasticsearch-hadoop-2.1.0.BUILD-20150224.023403-309.zip
Spark: 1.2.0

I'm attempting to take a very basic JSON document on HDFS and index it in ES using SchemaRDD.saveToEs().  According to the documentation under "writing existing JSON to elasticsearch" it should be as easy as creating a SchemaRDD via SQLContext.jsonFile() and then index using .saveToEs(), but I'm getting an error.

Replicating the problem:
1) Create JSON file on hdfs with the content:

```
{"key":"value"}
```

2) Execute code in spark-shell

```
import org.apache.spark.SparkContext._
import org.elasticsearch.spark._

val sqlContext = new org.apache.spark.sql.SQLContext(sc)
import sqlContext._

val input = sqlContext.jsonFile("hdfs://nameservice1/user/mshirley/test.json")
input.saveToEs("mshirley_spark_test/test")
```

Error:

```
 org.elasticsearch.hadoop.rest.EsHadoopInvalidRequest: Found unrecoverable error [Bad Request(400) - Invalid JSON fragment received[["value"]][MapperParsingException[failed to parse]; nested: ElasticsearchParseException[Failed to derive xcontent from (offset=13, length=9): [123, 34, 105, 110, 100, 101, 120, 34, 58, 123, 125, 125, 10, 91, 34, 118, 97, 108, 117, 101, 34, 93, 10]]; ]];
```

input object:

```
res1: org.apache.spark.sql.SchemaRDD = 
SchemaRDD[6] at RDD at SchemaRDD.scala:108
== Query Plan ==
== Physical Plan ==
PhysicalRDD [key#0], MappedRDD[5] at map at JsonRDD.scala:47
```

input.printSchema():

```
root
 |-- key: string (nullable = true)
```

Expected result:
I expect to be able to read a file from HDFS that contains a JSON document per line and submit that data to ES for indexing.


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Unable to index JSON from HDFS using SchemaRDD.saveToEs() #382

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Unable to index JSON from HDFS using SchemaRDD.saveToEs() #382

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions