[CARBONDATA-4166] Geo spatial Query Enhancements #4127

Indhumathi27 · 2021-04-29T13:00:58Z

Why is this PR needed?

Currently, for the IN_POLYGON_LIST and IN_POLYLINE_LIST UDFs, polygons must be
specified directly in the SQL statement. As the polygon list grows, the SQL statement
grows with it, which may hurt query performance because the cost of analysing the SQL
increases.
If polygons are instead defined as a column in a new dimension table, a spatial dimension
table join can be supported, which enables aggregation on spatial table columns
based on polygons.

What changes were proposed in this PR?

Support IN_POLYGON_LIST and IN_POLYLINE_LIST with a SELECT query on the
polygon table.
Support the IN_POLYGON filter as a join condition for spatial JOIN queries.

Does this PR introduce any user interface change?

Yes.

Is any new testcase added?

Yes

CarbonDataQA2 · 2021-04-29T13:15:25Z

Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_2.4.5/3542/

CarbonDataQA2 · 2021-04-29T14:22:25Z

Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12602/job/ApacheCarbonPRBuilder2.3/5289/

CarbonDataQA2 · 2021-04-29T16:44:44Z

Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_2.4.5/3545/

CarbonDataQA2 · 2021-04-29T16:48:18Z

Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12602/job/ApacheCarbonPRBuilder2.3/5290/

CarbonDataQA2 · 2021-04-30T14:18:40Z

Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_2.4.5/3549/

CarbonDataQA2 · 2021-04-30T14:25:42Z

Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12602/job/ApacheCarbonPRBuilder2.3/5294/

docs/spatial-index-guide.md

geo/src/main/java/org/apache/carbondata/geo/scan/expression/PolygonListExpression.java

geo/src/main/java/org/apache/carbondata/geo/scan/expression/PolygonRangeListExpression.java

ajantha-bhat · 2021-05-05T15:11:47Z

integration/spark/src/main/scala/org/apache/carbondata/geo/GeoUtilUDFs.scala

@@ -30,6 +32,7 @@ object GeoUtilUDFs {
    sparkSession.udf.register("LatLngToGeoId", new LatLngToGeoIdUDF)
    sparkSession.udf.register("ToUpperLayerGeoId", new ToUpperLayerGeoIdUDF)
    sparkSession.udf.register("ToRangeList", new ToRangeListUDF)
+    sparkSession.udf.register("ToRangeListAsString", new ToRangeListAsStringUDF)


Are these UDFs exposed to the user? If so, can we update the documentation?

This UDF is not exposed to the user; it is for internal use only.

ajantha-bhat · 2021-05-05T15:17:18Z

integration/spark/src/main/scala/org/apache/carbondata/geo/InPolygonUDF.scala

+      val matchedStr = matcher.group
+      range = matchedStr
+    }
+    val ranges = PolygonRangeListExpression.getRangeListFromString(range)


Do we need a null check on ranges here? Some places check and some don't (for example, ToRangeListAsStringUDF). Can we make it uniform? If the check is not required, remove it from the other places as well. Also, if possible, extract a common method for the matcher and find logic.

The null check is already handled here, at line 49. I have also refactored the common code into a new method.
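A minimal sketch of the kind of shared helper discussed in this thread, assuming a null input should be treated the same as "no match". The object name `MatcherSketch`, the method `firstMatch`, and the pattern in the usage lines are hypothetical and are not taken from the CarbonData code; only `java.util.regex` is used.

```scala
import java.util.regex.Pattern

object MatcherSketch {
  /**
   * Returns the first match of `pattern` in `input`.
   * A null input and an input without any match both yield None,
   * so callers handle the two cases in one place.
   */
  def firstMatch(pattern: Pattern, input: String): Option[String] =
    Option(input).flatMap { text =>
      val matcher = pattern.matcher(text)
      if (matcher.find()) Some(matcher.group()) else None
    }
}
```

With an illustrative pattern such as `Pattern.compile("\\d+ \\d+")`, `MatcherSketch.firstMatch(pattern, "RANGELIST (10 20, 30 40)")` yields `Some("10 20")`, while a null input yields `None`. Returning an `Option` is one way to keep the null handling uniform across call sites; it may not be what the refactoring in this PR actually does.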

ajantha-bhat · 2021-05-05T15:25:33Z

...ark/src/main/scala/org/apache/spark/sql/execution/joins/BroadCastPolygonFilterPushJoin.scala

+import org.apache.carbondata.geo.scan.expression.PolygonRangeListExpression
+import org.apache.carbondata.spark.rdd.CarbonScanRDD
+
+case class BroadCastPolygonFilterPushJoin(


Much of this is common with BroadCastSIFilterPushJoin. Can we not extend that class?

The BroadCastPolygonFilterPushJoin implementation differs from BroadCastSIFilterPushJoin in its filter handling and method definitions, so it is better to keep it separate.

integration/spark/src/main/scala/org/apache/spark/sql/execution/strategy/DMLStrategy.scala

CarbonDataQA2 · 2021-05-06T07:29:46Z

Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12602/job/ApacheCarbonPRBuilder2.3/5308/

CarbonDataQA2 · 2021-05-06T07:32:34Z

Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_2.4.5/3563/

ajantha-bhat · 2021-05-06T09:49:36Z

LGTM

Indhumathi27 force-pushed the spatial branch from ac286da to 39a2427 · April 29, 2021 14:07

Indhumathi27 force-pushed the spatial branch from 39a2427 to 1f86ef5 · April 29, 2021 15:10

Geo spatial improvements

da0283f

Indhumathi27 force-pushed the spatial branch from 1f86ef5 to da0283f · April 30, 2021 12:47

Indhumathi27 changed the title from "[WIP] Geo spatial improvements" to "[CARBONDATA-4166] Geo spatial Query Enhancements" · May 5, 2021

ajantha-bhat reviewed May 5, 2021

fix comments

7cc92d5

asfgit closed this in c825730 May 10, 2021
