java.util.NoSuchElementException: None.get on 2.1.0.rc1 when Array of objects in mapping #484
Hey, it seems I have the same error; I also have arrays of objects in the index. I am running Spark-ES-Connector 2.1.0 final with Spark 1.4.0. All the best,
Hi, unfortunately there's no easy solution for this one. The Elasticsearch mapping treats fields with one or multiple (array) values the same way, so the connector cannot tell from the mapping alone whether a field holds an array. In Spark SQL, however, the associated schema expects each field to hold a single value of the declared type, so reading an array into it fails. There are two issues here:
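(To make the ambiguity concrete - a minimal sketch, not from the original thread, assuming a hypothetical `test/docs` resource and the `org.elasticsearch.spark` RDD API: Elasticsearch accepts both a single value and an array for the same field, and the resulting mapping is identical, so there is nothing in the mapping for the connector to base an `ArrayType` column on.)

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.elasticsearch.spark._ // adds saveJsonToEs to RDDs

val sc = new SparkContext(new SparkConf().setAppName("array-ambiguity"))

// "tags" is a single object in one document and an array of objects in the
// other; Elasticsearch maps both identically, so arrayness is invisible to
// any schema derived from the mapping alone.
val docs = Seq(
  """{"name": "a", "tags": {"id": 1}}""",
  """{"name": "b", "tags": [{"id": 2}, {"id": 3}]}"""
)
sc.makeRDD(docs).saveJsonToEs("test/docs")
```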
I'd like to address both issues in the same go in the upcoming version. I will let you know once that happens.
FWIW, I have updated master and 2.x so that a better exception is thrown (the error still occurs since, again, the SQL schema expects a single value).
Is there a way to manually specify the schema, or the type of that one column, when creating the DataFrame right now? Also, can elasticsearch-hadoop handle the case where the mapping declares a single value but the actual value in the documents is an array? Right now it's incorrectly detecting it as a string.
Is this issue resolved? I applied a schema to the DataFrame, and printSchema shows the desired schema, but it still throws the same exception. Any other work-around?
Hi Costin, is it possible for you to provide an interim jar file with the fix? At the moment we are rather stuck due to this issue. The problem arises from the `SensitiveDataDomains` key in the ES document, which holds an array. printSchema() on the DataFrame should print `|-- SensitiveDataDomains: array (nullable = true)`, and we also apply that schema to the DataFrame, but the result is the same.
The issue is being worked on; hopefully there will be a fix over the weekend. As for the interim jars - that's what the dev builds are all about; we publish nightly builds between releases so folks can try things out right away, without having to build them or wait for a release. Once it is solved, you'll be notified on this thread. As a workaround, one can create the schema manually and specify which fields have to be arrays - for instance by creating the DataFrame with an explicitly defined schema (a sketch follows below). Thank you for your patience.
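A minimal sketch of that kind of workaround, assuming the `SensitiveDataDomains` field from the comment above lives in a hypothetical `idx/docs` resource, and going through the plain RDD API so that the schema (including the `ArrayType` column) stays fully under the caller's control:

```scala
import org.apache.spark.sql.{Row, SQLContext}
import org.apache.spark.sql.types._
import org.elasticsearch.spark._ // adds esRDD to SparkContext

// sc is an existing SparkContext
val sqlContext = new SQLContext(sc)

// Declare the schema explicitly, marking the field as an array of strings.
val schema = StructType(Seq(
  StructField("name", StringType, nullable = true),
  StructField("SensitiveDataDomains", ArrayType(StringType), nullable = true)
))

// esRDD returns (documentId, Map[String, Any]) pairs; normalize each document
// into a Row matching the schema, turning single values into one-element Seqs.
val rows = sc.esRDD("idx/docs").values.map { doc =>
  val domains = doc.get("SensitiveDataDomains") match {
    case Some(s: Seq[_]) => s.map(_.toString)
    case Some(single)    => Seq(single.toString)
    case None            => null
  }
  Row(doc.getOrElse("name", null), domains)
}

val df = sqlContext.createDataFrame(rows, schema)
df.printSchema() // SensitiveDataDomains: array (nullable = true)
```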
Hey folks, is there any update on this? I tried 2.2.0m1 and ran into the same issue. Cheers :)
Hi everyone, I've pushed a fix to master as well as the dev builds. Basically, one can now configure which fields the connector reads as arrays through a dedicated option. By default, no field is considered an array. So, for example, given a mapping that contains a field holding an array of objects, the detected schema will by default treat that field as a single value/struct; to tell the connector that the field should instead be exposed as an array, add its name to the option.
Multiple fields can be specified by separating them with commas, and even regexps can be used. Feedback is welcome! P.S. Thanks for your patience.
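For reference, in the released versions this setting surfaced as `es.read.field.as.array.include` (with a matching `es.read.field.as.array.exclude`). A minimal sketch of how it would be used, assuming a hypothetical `idx/docs` resource with an array-of-objects field named `SensitiveDataDomains`:

```scala
import org.apache.spark.sql.SQLContext

// sc is an existing SparkContext
val sqlContext = new SQLContext(sc)

// Tell the connector that SensitiveDataDomains must be exposed as an array,
// so the inferred Spark SQL schema uses ArrayType instead of a single value.
val df = sqlContext.read
  .format("org.elasticsearch.spark.sql")
  .option("es.read.field.as.array.include", "SensitiveDataDomains")
  .load("idx/docs")

df.printSchema() // SensitiveDataDomains now shows up as an array
```

Several fields (or patterns) would go into the same option as a comma-separated list, matching the comment above.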
@costin, which commit are you referring to? I was searching for it - thanks for being so quick :)
Introduce option to tell the connector what fields in ES need to be read as arrays. This way the connector always creates the appropriate structure, in particular in Spark SQL. relates #484
This one. The nightly build was pushed manually, so testing against it should address the situation.
Introduce option to tell the connector what fields in ES need to be read as arrays. This way the connector always creates the appropriate structure, in particular in Spark SQL. relates #484 (cherry picked from commit 4a76b47) Conflicts: spark/sql-13/src/main/scala/org/elasticsearch/spark/sql/SchemaUtils.scala
Folks, as there hasn't been any update, I'm closing the issue. Please see the new releases once they come out (should be a couple of hours).
Using:
When we query with Spark SQL we get an error:
I think the problem occurs because we have an array of objects in our index.
Our index:
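(The original version list, error output, and index mapping are not preserved in this thread; purely as a hedged illustration, this is the kind of read that triggers the failure on the affected versions, with the index and field names being hypothetical:)

```scala
import org.apache.spark.sql.SQLContext

// sc is an existing SparkContext
val sqlContext = new SQLContext(sc)

// The documents in idx/docs contain a field holding an array of objects;
// on 2.1.0.rc1 the inferred schema expects a single value there, and the
// query below fails with java.util.NoSuchElementException: None.get.
val df = sqlContext.read
  .format("org.elasticsearch.spark.sql")
  .load("idx/docs")

df.registerTempTable("docs")
sqlContext.sql("SELECT * FROM docs").show()
```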