CRS comparison is extremely expensive #3022
Comments
@pomadchin #2890 is a separate use case... indeed slow to read, but this ticket is about Here are the benchmarks from RasterFrames: Standard GeoTrellis; With RasterFrame Hacks. (Sorry, changed the number of benchmarks during the process.)
@metasim appreciate it! Do you know whether it has been in the codebase forever, or was it some sort of regression?
@pomadchin In the codebase forever... we broke it with DataFrames containing a million CRSs 😈.
Again, see here for the hacks we've been using:
On RasterFrames we are having major performance issues with comparisons between `CRS`s, and are almost out of hacks to work around it. I suspect our usage patterns have broken past assumptions on this. Here's some more context, including some profiling results: locationtech/rasterframes#134.

In RasterFrames we use a type called `ProjectedRasterTile`, which is what is pushed around the most through the API. This results in one or more CRSs being implicitly included in every row. When `join`s happen, CRSs are invariably involved, and `CRS.equals` is usually called, which is extremely expensive. Hence the use of `LazyCRS`. But determining `!=` between CRSs is still very expensive.

I've thought about hitting the GT codebase to construct a fix, but that's pretty delicate code and I'm wary about doing it on my own. Interested in discussion on this.