-
Notifications
You must be signed in to change notification settings - Fork 37
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add raster type & functions to manage rasters in the spatial extension #298
base: v1.0.0
Are you sure you want to change the base?
Conversation
Hi! Thanks for this PR, it's really cool! Im currently on vacation so I've only skimmed this quickly, but I have some concerns:
That said im not entirely against merging this, I just need some time to look it over once I get back to work. As I mentioned I've been toying with the idea of splitting all the GDAL functionality into a separate extension with additional drivers (and caveats) relying on VCPKG to build the dependencies, in which case I think this would fit in great. |
Also, you need to update the DuckDB submodule as the |
Thanks @Maxxen for your comments,
Yes, nothing is stored in the database. I am managing POINTERs (temporary objects) to avoid to load in RAM that so big objects such as rasters. I share your concerns. Maybe this implementation does not fit all use cases you are thinking, I see a raster as temporary object, loaded from external data sources, use them in a query and finally save to other external store if it is necesary. Load & save methods from/to BLOB can exist to read/write the database, but I do not know if this is the way to go.
Yes, you are right, this PR is adding more dependencies to GDAL, but studing PostGIS source code, many operations are implemented using GDAL as well, I thought this was as a valid solution to do not reinvent wheels. Anyway, we could mitigate this dependency separating this raster support in other new extension, the spatial extension could drop its GDAL dependency when it comes possible. Anyway, of course, feel free about this MR, this is a PoC, and you feel it raises unacceptable issues to fit in the DuckDB engine model, I am comfortable with that. |
Alright, let me have another look when I get back. I think the primary blocker right now is just how to deal with persistence. I agree that it might be a good idea to avoid copying the raster data itself into DuckDB as blobs, but I think there should be some way to "restore" a set of rasters from disk when opening a duckdb database with a raster table (if they are available on disk - otherwise throw an error or return NULL or something, we can see how PostGIS handles it). Maybe by storing a file path or an ID to some other lookup structure instead of the raw pointers. Regarding GDAL - let's maybe not worry about this too much right now. We already have it as a dependency, and it's probably going to stay for a while. If we just use the built-in raster drivers for now it won't add much complexity to the build. Thanks again for your work! |
9f701db
to
03fa25b
Compare
f5dbaaf
to
9ca99bb
Compare
is merging this on the roadmap? |
Hi everyone. I am thinking about this old PR. Maybe is it better to manage a raster as other tabular Dataframe? Something similar how RasterFrames is managing them? A raster could be serialized as a table of chunks or tiles. A parquet (Rasquet :-)) con several metadata columns (CRS, BBOX width, height, datatype, ...) and 1-N band columns with the binary for each raster band. |
This PR adds support to manage rasters (A wrapper of GDALDataset) in the extension.
It provides a new type
RASTER
, and a set of functions to load, process and save this data type.Description of this code here.