New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SEDONA-186] Fix ordering of columns in results of queries like SELECT * A JOIN B #706
[SEDONA-186] Fix ordering of columns in results of queries like SELECT * A JOIN B #706
Conversation
2032442
to
83d31bb
Compare
sql/src/main/scala/org/apache/spark/sql/sedona_sql/strategy/join/TraitJoinQueryExec.scala
Show resolved
Hide resolved
logInfo(s"Planning spatial join for $relationship relationship") | ||
RangeJoinExec(planLater(planA), planLater(planB), a, b, spatialPredicate, extraCondition) :: Nil | ||
RangeJoinExec(planLater(planA), planLater(planB), a, b, swappedLeftAndRight, spatialPredicate, extraCondition) :: Nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can these just be handled like the broadcast join below where the swapping is handled here with pattern matching instead of pushed into the Exec node?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Exec node has to handle the swapping somewhere, unless we invert the spatial predicate in the join query detector.
Broadcast join exec node resolves if left and right were swapped using indexBuildSide
and windowJoinSide
, it does not seem to be more intuitive than simply pushing the swapping indicator to the TraitJoinQueryExec
node.
Inverting the spatial predicate seems to be a viable approach, I'll submit changes to this PR if you're OK with it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Kimahriman Adam, does @Kontinuation 's approach "Inverting the spatial predicate" sound good to you? :-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm trying to wrap my head around this a little bit. By inverting the spatial predicate do you mean like rewritting it? As in ST_Contains(a, b)
-> ST_Within(b, a)
or something like that?
What would happen if you just swapped the left and right shapes if needed? Like
matchExpressionsToPlans(a, b, left, right) match {
case Some((_, _, swapped)) =>
logInfo(s"Planning spatial join for $relationship relationship")
if (swapped) {
RangeJoinExec(planLater(left), planLater(right), b, a, spatialPredicate, extraCondition) :: Nil
} else {
RangeJoinExec(planLater(left), planLater(right), a, b, spatialPredicate, extraCondition) :: Nil
}
case None =>
logInfo(
s"Spatial join for $relationship with arguments not aligned " +
"with join relations is not supported")
Nil
}
I'm trying to think of why that wouldn't work
@Kontinuation I will merge this PR for now since we will release 1.3.0 in the next few days. Maybe we can address @Kimahriman 's comment in another PR. |
Did you read the Contributor Guide?
Is this PR related to a JIRA ticket?
What changes were proposed in this PR?
Fixes SEDONA-186 by properly assembling result rows of join queries when left and right sub plans were swapped.
How was this patch tested?
New tests were added to
SpatialJoinSuite
.Did this PR include necessary documentation updates?