Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Decoupling the bounding box from the AOIs #33

Closed
Tracked by #40
joshuacortez opened this issue Jun 13, 2022 · 7 comments · Fixed by #47
Closed
Tracked by #40

Decoupling the bounding box from the AOIs #33

joshuacortez opened this issue Jun 13, 2022 · 7 comments · Fixed by #47
Labels
enhancement New feature or request

Comments

@joshuacortez
Copy link
Contributor

In the GridGenerator class here, the overall bounding box (minx, miny, maxx, maxy) are automatically derived from the projected gdf. Perhaps it would be good to make the overall bounding box be an optional parameter. If None, it could automatically compute the bounding box from the gdf, otherwise it uses the user defined bounding box.

Having a user defined bounding box is useful in making consistently defined grids. This is useful in cases where our AOIs are not necessarily encompass the entire country. We can define our overall bounding box based on the country admin boundaries, and regardless of what AOI we supply as a gdf input, the x and y coordinates of the grid tiles remain consistent.

@jtmiclat jtmiclat added the enhancement New feature or request label Jun 13, 2022
@butchtm
Copy link
Collaborator

butchtm commented Jun 13, 2022

@joshuacortez @jtmiclat
We could add a new constructor parameter and attribute total_bounds to GridGenerator which set to None by default.
then in the get_ranges method it would use this total_bounds instead of reprojected_gdf.total_bounds if it is not none.
What do you think?

@jtmiclat
Copy link
Contributor

jtmiclat commented Jun 13, 2022

@joshuacortez Can you clarify and give more instances a custom bounding box would be useful?

We can define our overall bounding box based on the country admin boundaries, and regardless of what AOI we supply as a gdf input, the x and y coordinates of the grid tiles remain consistent.

For this instance, I think having an optional parameter that skips the intersects would make sense more than a custom bounding box. It would skip this line

final = dest_reproject[dest_reproject.intersects(self.gdf.unary_union)]

So the workflow would be

country = gpd.frome_file()
aoi = gpd.from_file()
grids = GridGenerator(country, grid_size=10, intersect=False).generate_grids()
aoi_grids = grids.interesects(aoi)

You can then store grids and reuse them if a new aoi comes along. Rather than regenerating per aoi

@joshuacortez
Copy link
Contributor Author

Hmmm, my concern with the proposed workflow though is it forces us to generate grid tiles across the country, which is very memory and time intensive for fine grids. (also discussed in #35 ). The grids gdf can easily exceed typical memory limits of 16 GB

Example use case:
We want to generate 100m x 100m grid tiles across the country, but only in particular areas of interest (e.g. only areas with fishponds, or only populated areas). Ideally, no matter what area of interest we use, the x,y coordinates pertain to the same grid tile because it's based on the same overall country bounding box.

@jtmiclat
Copy link
Contributor

jtmiclat commented Jun 13, 2022

I still don't fully understand how a custom bounding box would help with the performance improvements but it might help if we have some code for this and #35

@tm-kah-alforja tm-kah-alforja mentioned this issue Jun 14, 2022
9 tasks
@joshuacortez
Copy link
Contributor Author

Ahh having a custom bounding box is more for coordinate system consistency rather than performance per se.

I have some sample code but it's from a project in a private repo. How could we best go about this?

@jtmiclat
Copy link
Contributor

@joshuacortez Feel free to just copy-paste the code sample here!

@joshuacortez
Copy link
Contributor Author

Here's sample usage from how I used it before. Functions are defined in this gist

Step 1: Get all index pairs (x_idx, y_idx) of cells that intersect the AOIs

# ref_bbox is (minx, miny, maxx, maxy)
# aoi_polys is a list of polygons
# x_offset and y_offset are float

grid_idx, x_spacing, y_spacing = make_gridding(ref_bbox, aoi_polys, x_offset, y_offset)

Step 2: Convert the index pairs into actual geometries

# grid_idx is a set of index pairs 
# x_spacing and y_spacing are lookup tables of index pairs to projected coordinates
# proj_crs is a string
# gridding is a gdf of grid cells

gridding = gridding_to_tiles(grid_idx, x_spacing, y_spacing, proj_crs)
gridding = gridding.to_crs("epsg:4326")

You could collapse the two steps together (creating the geometries right away instead of an intermediate step of saving the index pairs). But I split them together here to assess the memory usage

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
Status: Done
Development

Successfully merging a pull request may close this issue.

3 participants