Modularize the pdgstaging library #7

Closed
robyngit opened this issue Aug 5, 2022 · 3 comments
Assignees: robyngit
Labels: enhancement (New feature or request), pdg

Comments

@robyngit
Member

robyngit commented Aug 5, 2022

To eventually support more flexible workflows, the viz-staging package first needs to be broken down into smaller, more flexible classes and functions.

@robyngit robyngit added the enhancement New feature or request label Aug 5, 2022
@robyngit robyngit self-assigned this Aug 5, 2022
robyngit added a commit that referenced this issue Aug 5, 2022
@robyngit robyngit added the pdg label Aug 5, 2022
robyngit added a commit that referenced this issue Aug 9, 2022
robyngit added a commit that referenced this issue Aug 11, 2022
robyngit added a commit that referenced this issue Aug 12, 2022
robyngit added a commit that referenced this issue Aug 12, 2022
robyngit added a commit that referenced this issue Aug 19, 2022
robyngit added a commit that referenced this issue Aug 19, 2022
robyngit added a commit that referenced this issue Aug 22, 2022
robyngit added a commit that referenced this issue Aug 22, 2022
robyngit added a commit that referenced this issue Aug 30, 2022
- Remove __sjoin_polygons__ (not faster)
- Make which_cells method
- Add tile-specific methods to TMSGrid
- Update TileStager to make use of Grid changes
- Remove unused imports
- Add an option to set the tile index as a GDF multi-index

Relates to #7
@robyngit
Member Author

Changes made so far while working on this issue have resulted in some performance improvements. The new Grid class takes advantage of the grid's uniform structure to provide faster equivalents of GeoPandas' overlay and sjoin operations. I replaced the geopandas.overlay and geopandas.sjoin calls in the Tile class with the Grid equivalents. In a small test of staging 15 IWP files, the new methods made staging almost 10% faster overall, and the time saved grew with file size (i.e. the new methods saved the most time for the largest files).
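
To illustrate why a uniform grid allows faster lookups than a general spatial join, here is a minimal sketch. It is not the actual Grid implementation: the function name `cells_for_bbox`, its parameters, and the convention that the origin is the lower-left corner of the grid are all assumptions made for this example. Because every cell has the same size, the cells that a bounding box touches can be found with index arithmetic rather than by testing the geometry against each cell:

```python
import math
from typing import Iterator, Tuple

BBox = Tuple[float, float, float, float]  # (minx, miny, maxx, maxy)


def cells_for_bbox(
    bbox: BBox,
    origin: Tuple[float, float],
    cell_size: float,
    n_cols: int,
    n_rows: int,
) -> Iterator[Tuple[int, int]]:
    """Yield (col, row) indices of the grid cells that a bounding box touches.

    Hypothetical helper: assumes square cells and an origin at the
    lower-left corner of the grid, with rows counted upward.
    """
    minx, miny, maxx, maxy = bbox
    origin_x, origin_y = origin

    # Floor division by the cell size gives the index directly, so no
    # per-cell intersection test is needed. Clamp to the grid extent.
    first_col = max(0, math.floor((minx - origin_x) / cell_size))
    last_col = min(n_cols - 1, math.floor((maxx - origin_x) / cell_size))
    first_row = max(0, math.floor((miny - origin_y) / cell_size))
    last_row = min(n_rows - 1, math.floor((maxy - origin_y) / cell_size))

    for row in range(first_row, last_row + 1):
        for col in range(first_col, last_col + 1):
            yield (col, row)


# A box spanning x 0.5 to 2.5 and y 0.5 to 1.5 on a 4 x 4 grid of unit cells
cells = list(cells_for_bbox((0.5, 0.5, 2.5, 1.5), (0.0, 0.0), 1.0, 4, 4))
```

The cost of this lookup depends only on the number of cells the box touches, not on the total number of cells in the grid. A bounding-box lookup is only a first pass; an exact result would still need a geometric check of each polygon against its candidate cells.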

Since a 10% improvement in performance might help with our current work on processing the IWP dataset, I've merged all of the changes made so far into the develop branch. These changes DO NOT affect the API. In other words, everything should run as it did before, only with faster staging times. (FYI: @KastanDay)

@robyngit
Member Author

robyngit commented Sep 1, 2022

Just met with Chunli & others about her ArcticDEM change detection data, which provides another use case to consider while making this workflow more modular and flexible.

Here are some details about the data & requirements:

  • The data comprises 2 km x 2 km GeoTIFF tiles that cover the Arctic.
  • These tiles overlap somewhat, and in areas of overlap we should display data from just one of the tiles (rather than taking some aggregate of pixels at the same location).
  • The best way to determine which tile is preferred in areas of overlap is to read in data from another set of 2 km x 2 km tiles. For example, another layer of tiles shows the number of DEMs used to calculate the result for each pixel, so we could choose the tile with the most DEMs overall.
  • The workflow will involve deduplicating areas of overlap, as described above, and re-tiling the data to correspond with an OGC Two Dimensional Tile Matrix Set.
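
One way the "choose the tile with the most DEMs overall" rule could be sketched is shown below, assuming each tile's DEM-count layer has already been read into a 2D list of integers. The function names, the tile IDs, and the alphabetical tie-break are all assumptions for illustration and are not part of the workflow described above:

```python
from typing import Dict, Iterable, List, Sequence

DemCounts = Sequence[Sequence[int]]  # per-pixel number of DEMs for one tile


def total_dem_count(dem_counts: DemCounts) -> int:
    """Sum the per-pixel DEM counts for one tile."""
    return sum(sum(row) for row in dem_counts)


def rank_tiles(dem_counts_by_tile: Dict[str, DemCounts]) -> List[str]:
    """Return tile IDs ordered from most to least preferred.

    Tiles with more DEMs overall come first. Ties are broken by tile ID,
    which is an arbitrary choice made only to keep the order deterministic.
    """
    return sorted(
        dem_counts_by_tile,
        key=lambda tile_id: (-total_dem_count(dem_counts_by_tile[tile_id]), tile_id),
    )


def pick_tile(candidates: Iterable[str], ranking: Sequence[str]) -> str:
    """Among the tiles covering an area of overlap, return the preferred one."""
    priority = {tile_id: position for position, tile_id in enumerate(ranking)}
    return min(candidates, key=priority.__getitem__)


# Hypothetical DEM-count layers for three overlapping tiles
counts = {
    "tile_a": [[1, 2], [3, 4]],
    "tile_b": [[5, 5], [5, 5]],
    "tile_c": [[0, 0], [0, 1]],
}
ranking = rank_tiles(counts)
preferred = pick_tile(["tile_a", "tile_c"], ranking)
```

Ranking the tiles once up front means each area of overlap only needs a lookup, and data from every lower-ranked tile in that area can be masked out before re-tiling.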

@robyngit
Member Author

The new Grid class can be used independently of the staging step, making the library more modular than before. We can open new issues that lay out the specific tasks to be completed if we decide more modularization is needed.
