[CROSSDATA] Elastic fix types #644

pfcoperez · 2016-08-24T07:48:33Z

Description

This PR adds the missing type conversions from Elasticsearch native types to SparkSQL types:

BINARY
TINYINT
FLOAT

It also changes the behavior of com.stratio.crossdata.connector.elasticsearch.DefaultSource by adding type conversions to the ElasticsearchXDRelation value extractor (executed at each node).
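As a rough illustration of what such a per-value conversion could look like, here is a minimal sketch. It is not the code in this PR: `TargetType`, its three cases and `NativeTypeConversions.convert` are hypothetical names, and it assumes that binary fields arrive as Base64-encoded strings and numeric fields as boxed `java.lang.Number` values.

```scala
import java.util.Base64

// Hypothetical stand-ins for the three SparkSQL target types listed above.
sealed trait TargetType
case object BinaryT extends TargetType
case object TinyIntT extends TargetType
case object FloatT extends TargetType

object NativeTypeConversions {
  // Convert one raw value to the JVM representation expected for the target type.
  // Values that need no conversion are passed through untouched.
  def convert(raw: Any, target: TargetType): Any = (raw, target) match {
    case (null, _)             => null
    case (s: String, BinaryT)  => Base64.getDecoder.decode(s) // Base64 text to Array[Byte]
    case (n: Number, TinyIntT) => n.byteValue                 // narrow to an 8-bit signed value
    case (n: Number, FloatT)   => n.floatValue                // narrow to single precision
    case (other, _)            => other
  }
}
```

Running the conversion inside the value extractor means that each executor converts its own rows, so the driver does not need a second pass over the results.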

Note that large parts of the original Elasticsearch data source have been copied into our code, which adds a maintenance hazard as development of the data source continues. It was done this way because of the limitations imposed by the access modifiers its author used. A PR that changes them in the connector code (elastic/elasticsearch-hadoop#826) has already been merged.

Once a version of elasticsearch-spark-sql13 containing the updated code is published, the changes that make use of the data source improvements (pfcoperez@863e9aa) should be merged into the crossdata development branch.

Finally, it adds support for native sub-document selection in two ways: by making it possible to select sub-documents as well as columns, and by transforming the returned document into a hierarchy of rows.
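The "hierarchy of rows" idea can be sketched as follows, assuming a document is available as a nested `Map[String, Any]`. The `Field`, `Leaf` and `Struct` types and the `toRow` function are hypothetical and only serve the illustration; the PR itself works with the SparkSQL schema and row types.

```scala
object SubDocumentRows {
  // Hypothetical schema description: a field is either a leaf column
  // or a sub-document with its own ordered list of fields.
  sealed trait Field { def name: String }
  final case class Leaf(name: String) extends Field
  final case class Struct(name: String, children: Seq[Field]) extends Field

  // Build a row (values in schema order). Sub-documents become nested rows,
  // and fields missing from the document become null.
  def toRow(doc: Map[String, Any], schema: Seq[Field]): Seq[Any] =
    schema.map {
      case Leaf(name) => doc.getOrElse(name, null)
      case Struct(name, children) =>
        doc.get(name) match {
          case Some(sub: Map[_, _]) =>
            toRow(sub.asInstanceOf[Map[String, Any]], children)
          case _ => null
        }
    }
}
```

Because the schema drives the traversal, selecting only part of a sub-document amounts to listing fewer children in the corresponding `Struct`.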

Testing

Unit

coveralls · 2016-08-25T06:52:35Z

Coverage increased (+0.5%) to 57.349% when pulling eb10863 on pfcoperez:elasticFixNativeTypes into f6b8b0e on Stratio:master.

coveralls · 2016-08-29T11:25:34Z

Coverage increased (+1.1%) to 58.467% when pulling 062d8cd on pfcoperez:elasticFixNativeTypes into 5348885 on Stratio:master.

coveralls · 2016-08-30T10:47:29Z

Coverage increased (+0.8%) to 58.405% when pulling 148cccc on pfcoperez:elasticFixNativeTypes into f3b47d5 on Stratio:master.

coveralls · 2016-08-30T11:14:26Z

Coverage increased (+0.7%) to 58.623% when pulling fc44690 on pfcoperez:elasticFixNativeTypes into 8798996 on Stratio:master.

coveralls · 2016-08-30T12:22:19Z

Coverage increased (+0.8%) to 58.685% when pulling 9a41911 on pfcoperez:elasticFixNativeTypes into 8798996 on Stratio:master.

mafernandez-stratio · 2016-08-30T22:33:17Z

...c/main/scala/com/stratio/crossdata/connector/elasticsearch/ElasticSearchQueryProcessor.scala

+
+      val fieldsQuery = query.fields(stringFields.toList: _*)
+
+      if(stringFields.size != fields.size)


Maybe it could be more precise and elegant to use:

stringFields.containsAll(fields)

Unfortunately, that operation doesn't exist. Besides, imagine it did: it would be O(n log n) at best, most probably O(n^2). Given that stringFields is the result of filtering elements out of fields, we know for sure that it is either empty or a subset of the latter. Under that condition you can check whether one differs from the other just by comparing their sizes.

Moreover, since most Seq implementations provide an O(1) length operation, not taking advantage of being able to compare them by size alone would be a huge sacrifice of performance.

E.g., blazing fast size checking of a huge sequence:

val s = (1 to 1000000000).toSeq
s.size
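The argument can be sketched in a few lines. The predicate `keep` below is a hypothetical stand-in for whatever the query processor uses to build stringFields; the point is only that, for a filtered copy of a sequence, a size comparison gives the same answer as an element-by-element containment check.

```scala
object SizeCheckSketch {
  // Hypothetical filter: keep only field names without a dot.
  def keep(field: String): Boolean = !field.contains(".")

  def filterFields(fields: Seq[String]): Seq[String] = fields.filter(keep)

  // Cheap check: the filter dropped something if and only if the sizes differ.
  def droppedBySize(fields: Seq[String]): Boolean =
    filterFields(fields).size != fields.size

  // Equivalent containment check, quadratic for plain sequences.
  def droppedByContainment(fields: Seq[String]): Boolean = {
    val kept = filterFields(fields)
    !fields.forall(f => kept.contains(f))
  }
}
```

The equivalence holds only because the second sequence is derived from the first by filtering. For two unrelated sequences, equal sizes say nothing about equal contents.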

mafernandez-stratio · 2016-08-30T22:50:43Z

elasticsearch/src/main/scala/org/elasticsearch/spark/sql/ElasticsearchXDRelation.scala

+      }
+    }
+
+    if (filters != null && filters.size > 0) {


Could an Option be used in order to avoid this ugly code? 😃

Absolutely, but this code comes from the connector, and we agreed that copy-pasted code should remain untouched for maintenance purposes.
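For reference, the Option-based form suggested above could look like the sketch below. It uses a plain `Seq` instead of the connector's filter type, and `nonEmptyFilters` and `describe` are hypothetical names, not part of the copied code.

```scala
object OptionalFilters {
  // Option(...) turns null into None; the filter then drops the empty case,
  // so callers only ever see Some(non-empty sequence) or None.
  def nonEmptyFilters[A](filters: Seq[A]): Option[Seq[A]] =
    Option(filters).filter(_.nonEmpty)

  // Example consumer: join the filters, or fall back to a default text.
  def describe(filters: Seq[String]): String =
    nonEmptyFilters(filters).fold("no filters")(fs => fs.mkString(" AND "))
}
```

This replaces the combined `!= null && size > 0` test with a single expression, at the cost of diverging from the upstream source.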

coveralls · 2016-08-31T06:56:51Z

Coverage increased (+0.9%) to 58.685% when pulling f74136c on pfcoperez:elasticFixNativeTypes into 39fa0d6 on Stratio:master.

darroyocazorla · 2016-08-31T08:14:08Z

👍

mafernandez-stratio · 2016-08-31T08:16:10Z

LGTM

coveralls · 2016-08-31T10:16:09Z

Coverage decreased (-8.9%) to 48.977% when pulling cf2c63c on pfcoperez:elasticFixNativeTypes into 39fa0d6 on Stratio:master.

coveralls · 2016-08-31T13:03:08Z

Coverage increased (+0.9%) to 58.685% when pulling a5a30f0 on pfcoperez:elasticFixNativeTypes into 39fa0d6 on Stratio:master.

pfcoperez added the Blocked label Aug 24, 2016

pfcoperez assigned pianista215 and mafernandez-stratio Aug 24, 2016

pfcoperez changed the title ~~Elastic fix native types~~ [WIP] Elastic fix types Aug 24, 2016

pfcoperez changed the title ~~[WIP] Elastic fix types~~ [WIP][CROSSDATA-] Elastic fix types Aug 24, 2016

pfcoperez added in progress and removed Blocked labels Aug 24, 2016

pfcoperez changed the title ~~[WIP][CROSSDATA-] Elastic fix types~~ [CROSSDATA] Elastic fix types Aug 26, 2016

pfcoperez force-pushed the elasticFixNativeTypes branch from 0fdcc5b to f00ebe8 Compare August 26, 2016 12:42

pfcoperez removed the in progress label Aug 26, 2016

pfcoperez force-pushed the elasticFixNativeTypes branch 2 times, most recently from d669582 to 062d8cd Compare August 29, 2016 11:03

pfcoperez assigned darroyocazorla Aug 30, 2016

darroyocazorla assigned mafernandez-stratio and unassigned mafernandez-stratio and pianista215 Aug 30, 2016

mafernandez-stratio reviewed Aug 30, 2016

pfcoperez force-pushed the elasticFixNativeTypes branch 2 times, most recently from fb134a9 to e938c82 Compare August 31, 2016 12:18

darroyo-stratio and others added 7 commits August 31, 2016 14:30

test elastic types

65208c3

Now ElasticSearch DefaultDataSource perform types transformation when…

b56ca6d

… running spark tasks in order to follow standard spark results types. NOTE: This change relies on manually copying a significant portion of code from the ElasticSearch data source. This should be changed ASAP

Added native type conversor for float values.

c9057a4

Added TINYINT and SMALLINT native types conversions.

26cc653

Added BINARY native type conversion.

e5e43b1

Fixed both the elastic search query generator and query results proce…

ead2d8f

…ssing so now it is possible to ask for subdocuments using native Elasticsearch.

Added types test es.nodes.wan.only option for elasticsearch.

a5a30f0

pfcoperez force-pushed the elasticFixNativeTypes branch from e938c82 to a5a30f0 Compare August 31, 2016 12:36

pfcoperez merged commit b64ee38 into Stratio:master Aug 31, 2016

pfcoperez deleted the elasticFixNativeTypes branch August 31, 2016 14:23

stratiocommit pushed a commit that referenced this pull request Jun 27, 2019

[CROSSDATA-1858] Aggregated health-checks & KPIs endpoint (#644)

2fd13a8
