Conversation
Force-pushed from 0fdcc5b to f00ebe8
Force-pushed from d669582 to 062d8cd
val fieldsQuery = query.fields(stringFields.toList: _*)

if(stringFields.size != fields.size)
Maybe it could be more precise and elegant to use:
stringFields.containsAll(fields)
Unfortunately, that operation doesn't exist. Besides, imagine it did: it would be O(n log n) at best, more probably O(n^2). Given that stringFields is the result of filtering elements out of fields, we know for sure that it is either empty or a subset of the latter. Under that condition, you can check that one differs from the other just by comparing their sizes.
If, on the other hand, you take into account that most Seq implementations provide an O(1) length operation, not taking advantage of being able to compare them by size alone would be a huge sacrifice of performance.
e.g., blazingly fast size checking of a huge sequence:
val s = (1 to 1000000000).toSeq
s.size
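To make the argument concrete, here is a minimal hypothetical sketch (isStringField and the sample data are made up; note that Scala's Seq has no containsAll, so the closest equivalent would be fields.forall(stringFields.contains), which scans both collections):

object SizeCheckSketch {
  // Hypothetical predicate standing in for the connector's real filter.
  def isStringField(field: String): Boolean = field != "age"

  def main(args: Array[String]): Unit = {
    val fields: Seq[String] = Seq("name", "surname", "age")
    // stringFields is built by filtering fields, so it is necessarily
    // empty or a subset of fields.
    val stringFields: Seq[String] = fields.filter(isStringField)
    // An O(1) size comparison detects whether anything was filtered out,
    // with no containsAll-style scan needed.
    if (stringFields.size != fields.size)
      println("some requested fields are not string fields")
  }
}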
}
}

if (filters != null && filters.size > 0) {
Could an Option be used in order to avoid this ugly code? 😃
Absolutely, but this is code coming from the connector, and we agreed that copy-pasted code should remain untouched for maintenance purposes.
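For reference, a minimal sketch of the Option-based alternative the reviewer has in mind (names and types are assumed; filters stands in for the connector's possibly-null collection):

object OptionFilterSketch {
  def main(args: Array[String]): Unit = {
    // filters stands in for the connector's possibly-null collection.
    val filters: Seq[String] = Seq("term", "range")
    // Wrapping in Option collapses the null check and the emptiness
    // check into a single expression.
    Option(filters).filter(_.nonEmpty).foreach { fs =>
      println(s"applying ${fs.size} filters")
    }
  }
}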
👍
LGTM
Force-pushed from fb134a9 to e938c82
… running Spark tasks in order to follow standard Spark result types. NOTE: This change relies on manually copying a significant portion of code from the Elasticsearch data source. This should be changed ASAP.
…ssing, so now it is possible to ask for sub-documents using native Elasticsearch.
Force-pushed from e938c82 to a5a30f0
Description
This PR adds missing type conversions from Elasticsearch native types to SparkSQL's.
It also changes the behavior of com.stratio.crossdata.connector.elasticsearch.DefaultSource by adding type conversions in the ElasticsearchXDRelation value extractor (executed at each node). Note that large parts of the original Elasticsearch data source have been copied into our code, adding maintenance hazards as the data source's development continues. This was done because of the limitations imposed by the author's use of access modifiers, so a PR, already merged (elastic/elasticsearch-hadoop#826), has been used to change that in the connector code.
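As an illustration only (not the actual extractor code; names are made up), such a conversion can be sketched as a pattern match from the raw Elasticsearch value to the SparkSQL type expected by the schema:

import java.sql.Timestamp
import org.apache.spark.sql.types._

object ConversionSketch {
  // Hypothetical sketch of a native-to-SparkSQL value conversion.
  def convert(value: Any, dataType: DataType): Any = (value, dataType) match {
    case (n: Number, IntegerType)        => n.intValue()
    case (n: Number, LongType)           => n.longValue()
    case (n: Number, DoubleType)         => n.doubleValue()
    case (millis: Number, TimestampType) => new Timestamp(millis.longValue())
    case (v, _)                          => v // already the expected type
  }

  def main(args: Array[String]): Unit =
    println(convert(java.lang.Long.valueOf(42L), IntegerType)) // prints 42
}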
When an elasticsearch-spark-sql13 artifact with the updated code gets published, the changes employing the data source improvements (pfcoperez@863e9aa) should be merged into the Crossdata development branch.
Finally, it adds support for native sub-document selection by both making it possible to select sub-documents and columns, and transforming the document into a row hierarchy, as sketched below.
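A rough sketch of what the document-to-row-hierarchy transformation amounts to (assumed names and sample data; the real logic lives in the copied connector code):

import org.apache.spark.sql.Row

object SubDocumentSketch {
  def main(args: Array[String]): Unit = {
    // A parsed Elasticsearch document as a map; sub-documents are nested maps.
    val doc: Map[String, Any] =
      Map("name" -> "john", "address" -> Map("city" -> "Madrid", "zip" -> "28001"))

    // Recursively turn nested maps into nested Rows, so a sub-document
    // becomes a Row inside its parent Row.
    def toRow(value: Any): Any = value match {
      case m: Map[_, _] => Row.fromSeq(m.values.toSeq.map(toRow))
      case v            => v
    }

    println(toRow(doc)) // prints something like [john,[Madrid,28001]]
  }
}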
Testing