-
Notifications
You must be signed in to change notification settings - Fork 702
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[CARBONDATA-4166] Geo spatial Query Enhancements #4127
Conversation
Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_2.4.5/3542/ |
Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12602/job/ApacheCarbonPRBuilder2.3/5289/ |
Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_2.4.5/3545/ |
Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12602/job/ApacheCarbonPRBuilder2.3/5290/ |
Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_2.4.5/3549/ |
Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12602/job/ApacheCarbonPRBuilder2.3/5294/ |
geo/src/main/java/org/apache/carbondata/geo/scan/expression/PolygonListExpression.java
Outdated
Show resolved
Hide resolved
geo/src/main/java/org/apache/carbondata/geo/scan/expression/PolygonRangeListExpression.java
Outdated
Show resolved
Hide resolved
@@ -30,6 +32,7 @@ object GeoUtilUDFs { | |||
sparkSession.udf.register("LatLngToGeoId", new LatLngToGeoIdUDF) | |||
sparkSession.udf.register("ToUpperLayerGeoId", new ToUpperLayerGeoIdUDF) | |||
sparkSession.udf.register("ToRangeList", new ToRangeListUDF) | |||
sparkSession.udf.register("ToRangeListAsString", new ToRangeListAsStringUDF) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
these UDF are exposed to user right ? can we update in the document ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This udf is not exposed to user. This is for internal purpose only.
val matchedStr = matcher.group | ||
range = matchedStr | ||
} | ||
val ranges = PolygonRangeListExpression.getRangeListFromString(range) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
here we need to check against null
for ranges? some places we check and some places we don't. (example ToRangeListAsStringUDF
) can we make it uniform? If not required, remove from other places also. also maybe extract a common method for matcher and find if possible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NULL check is already handled here in line:49. Handled some refactoring to extract common code to new method.
import org.apache.carbondata.geo.scan.expression.PolygonRangeListExpression | ||
import org.apache.carbondata.spark.rdd.CarbonScanRDD | ||
|
||
case class BroadCastPolygonFilterPushJoin( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
many things are common with BroadCastSIFilterPushJoin, we cannot extend the same class ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BroadCastPolygonFilterPushJoin implementation is not same as BroadCastSIFilterPushJoin in terms of filter and method definition. so, better to keep it seperate.
...ark/src/main/scala/org/apache/spark/sql/execution/joins/BroadCastPolygonFilterPushJoin.scala
Show resolved
Hide resolved
integration/spark/src/main/scala/org/apache/spark/sql/execution/strategy/DMLStrategy.scala
Outdated
Show resolved
Hide resolved
Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12602/job/ApacheCarbonPRBuilder2.3/5308/ |
Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_2.4.5/3563/ |
LGTM |
Why is this PR needed?
specified in SQL. If the polygon list grows in size, then the SQL will also be too long,
which may affect query performance, as SQL analysing cost will be more.
table join can be supported in order to support aggregation on spatial table columns
based on polygons.
What changes were proposed in this PR?
polygon table.
Does this PR introduce any user interface change?
Is any new testcase added?