Add a new type of spatial join query that returns the intersections directly #110

Closed
jiayuasu opened this issue Aug 7, 2017 · 11 comments

Comments

@jiayuasu
Member

commented Aug 7, 2017

Since several users have asked for this feature, GeoSpark will officially add support for it soon.

The description of this function is as follows:
Given two polygon datasets RDD1 and RDD2, find the polygonal intersection between every polygon in RDD1 and every polygon in RDD2.

The current GeoSpark workaround, until the upcoming release:
Step 1. Run a spatial join query between RDD1 and RDD2 to get a PairRDD of the following form:

<PolygonFromRDD1, IntersectedPolygonFromRDD2, IntersectedPolygonFromRDD2, IntersectedPolygonFromRDD2...>
<PolygonFromRDD1, IntersectedPolygonFromRDD2, IntersectedPolygonFromRDD2, IntersectedPolygonFromRDD2...>
<PolygonFromRDD1, IntersectedPolygonFromRDD2, IntersectedPolygonFromRDD2, IntersectedPolygonFromRDD2...>
...
...

Step 2. Run a Map on the PairRDD. The Map applies a transformation to each tuple in the PairRDD. The transformation pseudocode is as follows:

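// intersectedPolygonsFromRDD2 is the list of RDD2 polygons in the current tuple
List<Geometry> polygonalIntersections = new ArrayList<>();
for (Polygon intersectedPolygon : intersectedPolygonsFromRDD2)
{
    // intersection() returns a Geometry, which is not necessarily a Polygon
    Geometry polygonalIntersection = PolygonFromRDD1.intersection(intersectedPolygon);
    polygonalIntersections.add(polygonalIntersection);
}
return new Tuple2<>(PolygonFromRDD1, polygonalIntersections);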
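
The two steps above can be sketched in plain Scala. This is only an illustration under stated assumptions: `Rect` is a hypothetical axis-aligned rectangle standing in for a JTS polygon, ordinary `Seq` collections stand in for the RDDs, and none of the names below come from GeoSpark.

```scala
object GroupedJoinSketch {
  // Hypothetical stand-in for a polygon: an axis-aligned rectangle.
  final case class Rect(minX: Double, minY: Double, maxX: Double, maxY: Double) {
    // Overlapping region, or None when the overlap has no positive area.
    def intersection(other: Rect): Option[Rect] = {
      val r = Rect(
        math.max(minX, other.minX), math.max(minY, other.minY),
        math.min(maxX, other.maxX), math.min(maxY, other.maxY))
      if (r.minX < r.maxX && r.minY < r.maxY) Some(r) else None
    }
  }

  val rdd1: Seq[Rect] = Seq(Rect(0, 0, 4, 4), Rect(10, 10, 12, 12))
  val rdd2: Seq[Rect] = Seq(Rect(2, 2, 6, 6), Rect(3, 0, 5, 1), Rect(20, 20, 21, 21))

  // Step 1: grouped join. Each rdd1 shape is paired with every rdd2 shape
  // that overlaps it; this sketch drops shapes that have no match at all.
  val joined: Seq[(Rect, Seq[Rect])] =
    rdd1
      .map(a => (a, rdd2.filter(b => a.intersection(b).isDefined)))
      .filter(_._2.nonEmpty)

  // Step 2: map each tuple to (shape, list of intersection shapes).
  val intersections: Seq[(Rect, Seq[Rect])] =
    joined.map { case (a, matches) => (a, matches.flatMap(b => a.intersection(b))) }
}
```

Here `GroupedJoinSketch.intersections` holds one entry for the first rectangle with its two overlap regions, which mirrors the `<PolygonFromRDD1, intersections...>` shape described above.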

jiayuasu self-assigned this Aug 7, 2017

jiayuasu added this to the 0.9.0 milestone Oct 31, 2017

jiayuasu modified the milestones: 0.9.0, 1.1.0 Dec 12, 2017

@MikeChenfu

commented Feb 1, 2018

Thanks for the great tool.

I am trying to find the intersection area between each polygon in RDD1 and each polygon in RDD2.

I just want to know whether this function is working. I checked the SpatialJoinShp.scala file and saw some commented-out code:
// val query = JoinQuery.SpatialJoinQuery(wdpa, species, false, false)
// val join_result = query.rdd.map((tuple: (Polygon, util.HashSet[Polygon])) => (tuple._1, tuple._2.asScala.map(tuple._1.intersection(_).getArea)))
// val intersections = join_result.collect()

I tried to compile and run these lines but got errors. If possible, could you provide some sample code? Thank you so much.

@jiayuasu

Member Author

commented Feb 2, 2018

@MikeChenfu

I suggest you use the JoinQuery.SpatialJoinQueryFlat() API. It is a flatMap version of the SpatialJoinQuery API, which makes it easier to return all intersections.

It returns a <Polygon, Polygon> pair RDD in which the two polygons of each pair intersect each other.

The following line is the correct code to return the intersection area of each <Polygon, Polygon> pair. I have tested it.

val resultSize = JoinQuery.SpatialJoinQueryFlat(objectRDD, queryWindowRDD, true, false).rdd.map[Double](f => f._1.intersection(f._2).getArea)
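
To illustrate how a flat join differs from the grouped one, here is a minimal sketch in plain Scala. It assumes a hypothetical `Rect` type in place of JTS polygons and `Seq` in place of RDDs; it is not GeoSpark code.

```scala
object FlatJoinSketch {
  // Hypothetical stand-in for a polygon: an axis-aligned rectangle.
  final case class Rect(minX: Double, minY: Double, maxX: Double, maxY: Double) {
    def area: Double = (maxX - minX) * (maxY - minY)

    // Overlapping region, or None when the overlap has no positive area.
    def intersection(other: Rect): Option[Rect] = {
      val r = Rect(
        math.max(minX, other.minX), math.max(minY, other.minY),
        math.min(maxX, other.maxX), math.min(maxY, other.maxY))
      if (r.minX < r.maxX && r.minY < r.maxY) Some(r) else None
    }
  }

  val objects: Seq[Rect] = Seq(Rect(0, 0, 4, 4), Rect(10, 10, 12, 12))
  val windows: Seq[Rect] = Seq(Rect(2, 2, 6, 6), Rect(3, 0, 5, 1), Rect(20, 20, 21, 21))

  // Flat join: one (a, b) pair per overlapping combination,
  // instead of one group of matches per a.
  val pairs: Seq[(Rect, Rect)] =
    for {
      a <- objects
      b <- windows
      if a.intersection(b).isDefined
    } yield (a, b)

  // Map each pair to the area of its intersection.
  val areas: Seq[Double] =
    pairs.map { case (a, b) => a.intersection(b).map(_.area).getOrElse(0.0) }
}
```

With a flat list of pairs, a single `map` per pair is enough, which is why the one-line version above needs no inner loop.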
@MikeChenfu

commented Feb 5, 2018

@jiayuasu
Thanks for your help. It works well.
Could you tell me where the "getArea" function is defined? I looked through the code but found nothing. (It is my first time using the GeoSpark library, so please forgive me if the question is stupid.)
As a next step, I plan to find the intersection polygon between any two polygons in RDD1 and RDD2. Many thanks if you have any suggestions.

@jiayuasu

Member Author

commented Feb 6, 2018

Hi @MikeChenfu ,

To get the intersection polygon, simply omit the getArea call.
The code looks something like this:

val result = JoinQuery.SpatialJoinQueryFlat(objectRDD, queryWindowRDD, true, false).rdd.map[Geometry](f => f._1.intersection(f._2))

The getArea code is in JTSplus, a customized JTS library for GeoSpark. It is not in the GeoSpark repository. Its own repo is here:
https://github.com/jiayuasu/JTSplus

Original JTS repo is here:
https://github.com/locationtech/jts
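
For intuition about what an area call computes on a simple polygon ring, here is a minimal sketch using the shoelace formula. It is a generic illustration rather than the JTS implementation, and `ringArea` is a hypothetical name.

```scala
object AreaSketch {
  // Area of a simple (non-self-intersecting) polygon ring given as (x, y)
  // vertices in order; the last vertex is implicitly joined to the first.
  def ringArea(vertices: Seq[(Double, Double)]): Double = {
    val n = vertices.length
    val twiceSignedArea = (0 until n).map { i =>
      val (x1, y1) = vertices(i)
      val (x2, y2) = vertices((i + 1) % n)
      x1 * y2 - x2 * y1
    }.sum
    // The sign depends on vertex orientation, so take the absolute value.
    math.abs(twiceSignedArea) / 2.0
  }
}
```

For example, `AreaSketch.ringArea(Seq((0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)))` gives the area of a 4 by 4 square.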

@MikeChenfu

commented Feb 6, 2018

@jiayuasu
JoinQuery.SpatialJoinQueryFlat returns nothing when I use a CSV file instead of a Shapefile. I may have done something wrong.

This code loads the data from CSV files (WKT format):
val objectRDD = new PolygonRDD(sc, PolygonRDDInputLocation, PolygonRDDStartOffset, PolygonRDDEndOffset, PolygonRDDSplitter, true)
val queryWindowRDD = new PolygonRDD(sc, PolygonRDDInputLocationT, PolygonRDDStartOffset, PolygonRDDEndOffset, PolygonRDDSplitter, true)
I then analyze them, as the compiler message suggested:

objectRDD.analyze()
queryWindowRDD.analyze()
objectRDD.spatialPartitioning(GridType.QUADTREE)
queryWindowRDD.spatialPartitioning(wdpa.partitionTree)

The return value is NULL when I use JoinQuery.SpatialJoinQueryFlat. I think it should return a <Polygon, Polygon> pair RDD.
val result1 = JoinQuery.SpatialJoinQueryFlat(objectRDD, queryWindowRDD, false, false)
result1.take(1).foreach(line => println(line))

@jiayuasu

Member Author

commented Feb 7, 2018

@MikeChenfu
Why did you partition your queryWindowRDD using wdpa.partitionTree instead of objectRDD.getPartitioner?
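
Some background on why the partitioner matters: a partitioned join typically compares only objects that were placed in the same partition, so both datasets need to be split the same way. The sketch below illustrates that idea under stated assumptions. `Grid` is a hypothetical uniform grid, `Rect` is a hypothetical stand-in for a polygon, and GeoSpark's actual partitioners (such as the quadtree) are not modeled.

```scala
object SharedGridSketch {
  final case class Rect(minX: Double, minY: Double, maxX: Double, maxY: Double) {
    def intersects(o: Rect): Boolean =
      minX <= o.maxX && o.minX <= maxX && minY <= o.maxY && o.minY <= maxY
  }

  // Hypothetical uniform grid standing in for a spatial partitioner.
  final case class Grid(cellSize: Double) {
    // Every cell id (column, row) that the rectangle touches.
    def cellsOf(r: Rect): Seq[(Int, Int)] =
      for {
        col <- math.floor(r.minX / cellSize).toInt to math.floor(r.maxX / cellSize).toInt
        row <- math.floor(r.minY / cellSize).toInt to math.floor(r.maxY / cellSize).toInt
      } yield (col, row)

    // Group shapes by the cells they touch.
    def partition(rs: Seq[Rect]): Map[(Int, Int), Seq[Rect]] =
      rs.flatMap(r => cellsOf(r).map(c => (c, r)))
        .groupBy(_._1)
        .map { case (cell, entries) => (cell, entries.map(_._2)) }
  }

  // Join cell by cell: only shapes that share a cell id are ever compared.
  // Collecting into a Set removes pairs found in more than one cell.
  def join(left: Map[(Int, Int), Seq[Rect]],
           right: Map[(Int, Int), Seq[Rect]]): Set[(Rect, Rect)] =
    left.toSeq.flatMap { case (cell, as) =>
      for {
        a <- as
        b <- right.getOrElse(cell, Seq.empty)
        if a.intersects(b)
      } yield (a, b)
    }.toSet

  val a = Rect(12, 12, 13, 13)
  val b = Rect(12.5, 12.5, 13.5, 13.5)

  // Same grid on both sides: a and b land in the same cell and are matched.
  val sameGrid: Set[(Rect, Rect)] =
    join(Grid(10).partition(Seq(a)), Grid(10).partition(Seq(b)))

  // Different grids: the cell ids no longer line up and the match is missed.
  val mismatchedGrid: Set[(Rect, Rect)] =
    join(Grid(10).partition(Seq(a)), Grid(5).partition(Seq(b)))
}
```

In this toy model, `sameGrid` contains the overlapping pair while `mismatchedGrid` is empty, even though the two rectangles overlap in both cases.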

@MikeChenfu

commented Feb 7, 2018

@jiayuasu
It is a typo. "wdpa" is the old name of objectRDD. I am so sorry about that.

objectRDD.spatialPartitioning(GridType.QUADTREE)
queryWindowRDD.spatialPartitioning(objectRDD.getPartitioner)

objectRDD contains 6 polygons and queryWindowRDD also contains 6 polygons. I am trying to find the problem. I would really appreciate any suggestions.

@jiayuasu

Member Author

commented Feb 8, 2018

@MikeChenfu Are you sure your polygons intersect? We can chat on the GeoSpark Gitter chat for faster communication.

@MikeChenfu

commented Feb 8, 2018

@jiayuasu
Thanks for your kind help. I am still working on it and may have found the problem.
When I use JoinQuery.SpatialJoinQueryFlat(objectRDD,queryWindowRDD,true,true), I get the correct result.

@jiayuasu

Member Author

commented Feb 8, 2018

@MikeChenfu

Ohhhh! I forgot to tell you: the last parameter of SpatialJoinQueryFlat is ConsiderBoundaryIntersection.

You must set it to "true" in order to find intersecting polygons. Otherwise it will only return polygons that are fully contained by the other one.

The API explanation is here: http://www.public.asu.edu/~jiayu2/geospark/javadoc/0.9.1/
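
The effect of such a flag can be sketched in plain Scala. The code below is an illustration under assumptions, not GeoSpark's implementation: `Rect` is a hypothetical stand-in for a polygon, `flatJoin` is a hypothetical helper, and the choice that the window must contain the object when the flag is false is an assumption about which side is "the other one".

```scala
object BoundaryFlagSketch {
  final case class Rect(minX: Double, minY: Double, maxX: Double, maxY: Double) {
    // True when `other` lies entirely inside this rectangle (edges may touch).
    def contains(other: Rect): Boolean =
      minX <= other.minX && minY <= other.minY &&
        other.maxX <= maxX && other.maxY <= maxY

    // True when the two rectangles share at least one point.
    def intersects(other: Rect): Boolean =
      minX <= other.maxX && other.minX <= maxX &&
        minY <= other.maxY && other.minY <= maxY
  }

  // Hypothetical flat join: the flag switches the match predicate
  // from "fully contained" to "intersects".
  def flatJoin(objects: Seq[Rect],
               windows: Seq[Rect],
               considerBoundaryIntersection: Boolean): Seq[(Rect, Rect)] = {
    def matches(w: Rect, o: Rect): Boolean =
      if (considerBoundaryIntersection) w.intersects(o) else w.contains(o)

    for {
      w <- windows
      o <- objects
      if matches(w, o)
    } yield (w, o)
  }

  val window  = Rect(0, 0, 4, 4)
  val inside  = Rect(1, 1, 2, 2)     // fully inside the window
  val partial = Rect(3, 3, 6, 6)     // overlaps the window's boundary
  val far     = Rect(10, 10, 11, 11) // no overlap at all

  val strict: Seq[(Rect, Rect)] = flatJoin(Seq(inside, partial, far), Seq(window), false)
  val loose: Seq[(Rect, Rect)]  = flatJoin(Seq(inside, partial, far), Seq(window), true)
}
```

In this model `strict` keeps only the fully contained rectangle, while `loose` also keeps the partially overlapping one. A dataset whose shapes only overlap partially would therefore produce no pairs with the flag set to false.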

@MikeChenfu

commented Feb 8, 2018

@jiayuasu
Got it. Many thanks for your kind help.
