Add support for seamless layers #10
Comments
The notable difference to non-seamless is that the dataset to look up needs to be determined from A workaround would be to require the user to create a single (possibly gigantic) GeoTiff, or to add all data files as separate layers and display them on top of each other. |
I just learned about terracotta (really awesome tool!) and this feature would be great. I frequently have a collection of rasters that are tiled, and would like to serve them as a single layer without merging them into a gigantic single GeoTiff. |
Maybe the best solution to larger-than-GeoTIFF layers is to store the data not in GeoTIFF tiles (e.g. Sentinel 2 tiles) but in even smaller chunks (say 256x256) in a fast database, like GEE seems to be doing it. But that might be beyond the scope of Terracotta - would be a hell of a driver... |
I've fallen out of love with seamless layer support because there are so many micro-decisions to make for relatively little gain (mainly how to handle overlap). But we could look into supporting GDAL-style VRTs instead, and offload all the heavy lifting to GDAL / rasterio. That way we'd hopefully only need very little additional driver code. |
If that is the main concern, we could require that there is no overlap (or undefined behavior in such case). But there are probably more?
VRTs over S3 files? - But why not? As far as I know, |
I'm mostly concerned about numerical roundoff and off-by-one errors. I'd expect it to take a lot of fiddling to get right. |
There is a vrt module in rasterio, for in-memory vrt. |
Yes, we use those extensively. I was thinking of the GDAL VRTs in XML form that allow you to combine datasets transparently ahead of time. |
@dionhaefner , ah I see. I've created VRT's with terracotta serve -r /{name}/{date}_{band}_{}.vrt |
It might actually work by accident if |
Okay, gave it a shot, but it didn't work. Here was my setup. I already had rasters on S3 that were optimized via
Build list to build vrt:
aws s3 ls s3://bucketname/prefix/ | \
rev | cut -d' ' -f1 | rev | \
sed 's|^|/vsis3/bucketname/prefix/|g' \
> 20190315_testing_filelist
Build the VRT:
gdalbuildvrt -input_file_list 20190315_testing_filelist 20190315_testing.vrt
Serve the vrt (from EC2 instance):
terracotta serve --allow-all-ips --port 8787 -r ./{date}_{name}.vrt
Connect from local machine:
terracotta connect INSTANCE-IP:8787
At this point the preview app opens in a browser, and I can see the dataset listed. When I put the cursor over that row, I get the correct outlined area on the map (plus the low-res sample image). When I select it, nothing is displayed, although I do get the correct limits set on the "Adjust contrast" slider bar. When I pan/zoom, it looks like the server is getting requests, just not displaying anything.
Server output:
Connection output:
|
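For anyone who prefers to build the file list without the `rev | cut | rev` trick, here is a minimal Python sketch of the same step. It assumes the listing has the usual `aws s3 ls` layout (date, time, size, key name) and that key names contain no spaces. The function name, the sample listing, and the `.tif` filter are illustrative additions, not part of the setup above.

```python
from typing import List

# Hypothetical sample of `aws s3 ls s3://bucketname/prefix/` output.
SAMPLE_LISTING = """\
                           PRE subdir/
2019-03-14 10:00:00    4194304 20190314_rastername_1.tif
2019-03-14 10:00:05    4194304 20190314_rastername_2.tif
2019-03-14 10:00:09        512 notes.txt
"""


def vsis3_paths(listing: str, bucket: str, prefix: str) -> List[str]:
    """Turn an `aws s3 ls` listing into GDAL /vsis3/ paths, one per raster."""
    paths = []
    for line in listing.splitlines():
        fields = line.split()
        if not fields:
            continue
        name = fields[-1]  # last whitespace-separated field is the key name
        # Assumption: only GeoTIFFs belong in the VRT; skip folders and other files.
        if not name.lower().endswith((".tif", ".tiff")):
            continue
        paths.append(f"/vsis3/{bucket}/{prefix.strip('/')}/{name}")
    return paths


paths = vsis3_paths(SAMPLE_LISTING, "bucketname", "prefix/")
print("\n".join(paths))
```

The printed lines can be written to a text file and passed to `gdalbuildvrt -input_file_list` as in the steps above.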
@mccarthyryanc can you post here the VRT file you created? |
That is very reassuring. If the preview image works, tile retrieval should work, too. And correct limits and footprint means the metadata computation works, too. I'm just a bit confused about all the |
@j08lue I can't provide the exact VRT, since it contains private info. But here is an example VRT that is pretty much the same. It was built with plain
<VRTDataset rasterXSize="61000" rasterYSize="56000">
<SRS>PROJCS["NAD83(2011) / UTM zone 16N",GEOGCS["NAD83(2011)",DATUM["NAD83_National_Spatial_Reference_System_2011",SPHEROID["GRS 1980",6378137,298.257222101,AUTHORITY["EPSG","7019"]],AUTHORITY["EPSG","1116"]],PRIMEM["Greenwich",0,AUTHORITY["EPSG","8901"]],UNIT["degree"$
<GeoTransform> 6.1400000000000000e+05, 1.0000000000000000e+00, 0.0000000000000000e+00, 3.8120000000000000e+06, 0.0000000000000000e+00, -1.0000000000000000e+00</GeoTransform>
<VRTRasterBand dataType="Float32" band="1">
<NoDataValue>-9999</NoDataValue>
<ColorInterp>Gray</ColorInterp>
<ComplexSource>
<SourceFilename relativeToVRT="0">/vsis3/bucketname/TESTING/20190314_rastername_1.tif</SourceFilename>
<SourceBand>1</SourceBand>
<SourceProperties RasterXSize="1000" RasterYSize="1000" DataType="Float32" BlockXSize="256" BlockYSize="256" />
<SrcRect xOff="0" yOff="0" xSize="1000" ySize="1000" />
<DstRect xOff="0" yOff="25000" xSize="1000" ySize="1000" />
<NODATA>-9999</NODATA>
</ComplexSource>
<ComplexSource>
<SourceFilename relativeToVRT="0">/vsis3/bucketname/TESTING/20190314_rastername_2.tif</SourceFilename>
<SourceBand>1</SourceBand>
<SourceProperties RasterXSize="1000" RasterYSize="1000" DataType="Float32" BlockXSize="256" BlockYSize="256" />
<SrcRect xOff="0" yOff="0" xSize="1000" ySize="1000" />
<DstRect xOff="1000" yOff="26000" xSize="1000" ySize="1000" />
<NODATA>-9999</NODATA>
</ComplexSource>
<ComplexSource>
<SourceFilename relativeToVRT="0">/vsis3/bucketname/TESTING/20190314_rastername_3.tif</SourceFilename>
<SourceBand>1</SourceBand>
<SourceProperties RasterXSize="1000" RasterYSize="1000" DataType="Float32" BlockXSize="256" BlockYSize="256" />
<SrcRect xOff="0" yOff="0" xSize="1000" ySize="1000" />
<DstRect xOff="60000" yOff="41000" xSize="1000" ySize="1000" />
<NODATA>-9999</NODATA>
</ComplexSource>
<ComplexSource>
<SourceFilename relativeToVRT="0">/vsis3/bucketname/TESTING/20190314_rastername_4.tif</SourceFilename>
<SourceBand>1</SourceBand>
<SourceProperties RasterXSize="1000" RasterYSize="1000" DataType="Float32" BlockXSize="256" BlockYSize="256" />
<SrcRect xOff="0" yOff="0" xSize="1000" ySize="1000" />
<DstRect xOff="60000" yOff="40000" xSize="1000" ySize="1000" />
<NODATA>-9999</NODATA>
</ComplexSource>
</VRTRasterBand>
<MaskBand>
<VRTRasterBand dataType="Byte">
<SimpleSource>
<SourceFilename relativeToVRT="0">/vsis3/bucketname/TESTING/20190314_rastername_1.tif</SourceFilename>
<SourceBand>mask,1</SourceBand>
<SourceProperties RasterXSize="1000" RasterYSize="1000" DataType="Byte" BlockXSize="256" BlockYSize="256" />
<SrcRect xOff="0" yOff="0" xSize="1000" ySize="1000" />
<DstRect xOff="0" yOff="25000" xSize="1000" ySize="1000" />
</SimpleSource>
<SimpleSource>
<SourceFilename relativeToVRT="0">/vsis3/bucketname/TESTING/20190314_rastername_2.tif</SourceFilename>
<SourceBand>mask,1</SourceBand>
<SourceProperties RasterXSize="1000" RasterYSize="1000" DataType="Byte" BlockXSize="256" BlockYSize="256" />
<SrcRect xOff="0" yOff="0" xSize="1000" ySize="1000" />
<DstRect xOff="1000" yOff="26000" xSize="1000" ySize="1000" />
</SimpleSource>
<SimpleSource>
<SourceFilename relativeToVRT="0">/vsis3/bucketname/TESTING/20190314_rastername_3.tif</SourceFilename>
<SourceBand>mask,1</SourceBand>
<SourceProperties RasterXSize="1000" RasterYSize="1000" DataType="Byte" BlockXSize="256" BlockYSize="256" />
<SrcRect xOff="0" yOff="0" xSize="1000" ySize="1000" />
<DstRect xOff="60000" yOff="41000" xSize="1000" ySize="1000" />
</SimpleSource>
<SimpleSource>
<SourceFilename relativeToVRT="0">/vsis3/bucketname/TESTING/20190314_rastername_4.tif</SourceFilename>
<SourceBand>mask,1</SourceBand>
<SourceProperties RasterXSize="1000" RasterYSize="1000" DataType="Byte" BlockXSize="256" BlockYSize="256" />
<SrcRect xOff="0" yOff="0" xSize="1000" ySize="1000" />
<DstRect xOff="60000" yOff="40000" xSize="1000" ySize="1000" />
</SimpleSource>
</VRTRasterBand>
</MaskBand>
|
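Since overlap between sources came up earlier as the main worry, a VRT like the one above can be checked for overlapping `DstRect` entries before serving it. The following is a rough sketch using only the Python standard library. The embedded VRT is a trimmed copy of the example (SRS, source properties, and mask band left out), and the helper names are made up for illustration.

```python
import xml.etree.ElementTree as ET
from itertools import combinations

# Trimmed copy of the example VRT: only file names and destination rectangles.
VRT = """<VRTDataset rasterXSize="61000" rasterYSize="56000">
  <VRTRasterBand dataType="Float32" band="1">
    <ComplexSource>
      <SourceFilename relativeToVRT="0">/vsis3/bucketname/TESTING/20190314_rastername_1.tif</SourceFilename>
      <DstRect xOff="0" yOff="25000" xSize="1000" ySize="1000" />
    </ComplexSource>
    <ComplexSource>
      <SourceFilename relativeToVRT="0">/vsis3/bucketname/TESTING/20190314_rastername_2.tif</SourceFilename>
      <DstRect xOff="1000" yOff="26000" xSize="1000" ySize="1000" />
    </ComplexSource>
    <ComplexSource>
      <SourceFilename relativeToVRT="0">/vsis3/bucketname/TESTING/20190314_rastername_3.tif</SourceFilename>
      <DstRect xOff="60000" yOff="41000" xSize="1000" ySize="1000" />
    </ComplexSource>
    <ComplexSource>
      <SourceFilename relativeToVRT="0">/vsis3/bucketname/TESTING/20190314_rastername_4.tif</SourceFilename>
      <DstRect xOff="60000" yOff="40000" xSize="1000" ySize="1000" />
    </ComplexSource>
  </VRTRasterBand>
</VRTDataset>"""


def dst_rects(vrt_xml):
    """Map each source file to its (xmin, ymin, xmax, ymax) pixel rectangle."""
    root = ET.fromstring(vrt_xml)
    rects = {}
    for source in root.find("VRTRasterBand"):
        if source.tag not in ("SimpleSource", "ComplexSource"):
            continue
        rect = source.find("DstRect")
        x, y = int(rect.get("xOff")), int(rect.get("yOff"))
        w, h = int(rect.get("xSize")), int(rect.get("ySize"))
        rects[source.findtext("SourceFilename")] = (x, y, x + w, y + h)
    return rects


def overlapping_pairs(rects):
    """Return pairs of sources whose rectangles share interior pixels."""
    pairs = []
    for (name_a, a), (name_b, b) in combinations(sorted(rects.items()), 2):
        if a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]:
            pairs.append((name_a, name_b))
    return pairs


rects = dst_rects(VRT)
print(len(rects), "sources,", len(overlapping_pairs(rects)), "overlapping pairs")
```

Rectangles that only touch along an edge are not counted as overlapping, which matches how adjacent tiles in a clean grid line up.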
@dionhaefner and @j08lue, something strange happened and it actually did work... However, it took probably 30min before the tiles were rendered. I tried to retest today, saw that nothing was rendering and then accidentally left the preview app open on a browser and the server running. After something like 30min tiles started popping up! So, maybe this is something to do with running terracotta server on an EC2 instance and then terracotta connect on a local machine?
Perhaps terracotta is falling back to non-cloud-optimized GeoTIFF reads when given a VRT file, even though the underlying rasters are cloud optimized? |
Your log even says that:
That's certainly bad news; maybe there is something we can do to optimize the VRT file, otherwise we might have to take it up with Rasterio / GDAL people. But you said the preview loaded fast? |
@dionhaefner , it loaded faster than the tiles on the map, but still slower than serving a large geotiff. |
From the GDAL docs:
But there seem to be ways to build overviews for VRTs ( |
Maybe there is some neat way to create a scale-dependent combining internal (GeoTIFF) overviews and VRT-level ones, like mentioned here:
GDAL black magic... |
I took the discussion here: https://rasterio.groups.io/g/main/topic/30751150#167 If everything works as it should, the performance issues should only be present for low zoom levels, for which we don't have overviews. |
@dionhaefner, I'm not sure I follow. Based on the groups.io post the workflow is:
Does step 3 require the same overview levels as provided during the COG creation step? |
Sorry, haven't pursued this any further. I think you got it right, just two points:
All of this is pretty experimental and I don't have any idea whether it will work. But it would be cool if you could play with it and report back. |
@dionhaefner sorry for the long wait, but I finally found some time to test this out! TL;DR: It works until I zoom in too far on the map.
I created a new branch and conda env to keep things isolated:
# New conda env
conda create -n tc-vrt python=3.6 -c conda-forge
conda activate tc-vrt
pip install git+https://github.com/mccarthyryanc/terracotta.git@RIO_ENV_KEYS
I had a collection of 873 tiles that I added to a text file:
/vsis3/bucket/prefix/tile1.tif
/vsis3/bucket/prefix/tile2.tif
...
/vsis3/bucket/prefix/tile873.tif
Built the VRT and overviews:
# Build VRT: This step took 1 min and 41 sec to finish
gdalbuildvrt -input_file_list vrt_filelist ovr_testing.vrt
# Build overviews: This step took 4 min and 52 sec to finish. OVR file was 385M
gdaladdo -ro --config COMPRESS_OVERVIEW DEFLATE ovr_testing.vrt 2 4 8 16
Serve and connect like usual:
# Serve from EC2 instance
terracotta serve --allow-all-ips --port 8788 -r ./{type}_{name}.vrt
# Connect from local instance
terracotta connect INSTANCE-IP:8788
At this point I can pan and zoom without issues, but subjectively it does seem a little slower than when using a single large GeoTiff. If I zoom in too far, I see no tiles and the server spits out the following error:
Server Error
I imagine the error is the result of my choice in overview levels. The tiles did not form a nice grid. I did try this with a subset of tiles that made a 4x4 grid and built overview levels at 2 and 4 (like you suggested). This worked without any issues. What I didn't try was uploading the VRT/OVR files to S3 and inserting an entry into MySQL via the terracotta driver to make it work from Lambda. But at this point I feel like the data has jumped through a few too many hoops for me to use this in production... I think I would rather wait (willing to help where I can) until terracotta supports seamless layers. |
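The overview factors in the `gdaladdo` call above (2 4 8 16) were picked by hand. One common rule of thumb, which is not something this thread prescribes and is not claimed to fix the error above, is to keep doubling the factor until the coarsest overview fits within a single tile. A small sketch of that rule follows; the function name and the 256-pixel threshold are assumptions for illustration.

```python
import math


def overview_factors(width, height, min_size=256):
    """Power-of-two decimation factors until the coarsest overview
    is at most `min_size` pixels along its longer side."""
    factors = []
    factor = 1
    while math.ceil(max(width, height) / factor) > min_size:
        factor *= 2
        factors.append(factor)
    return factors


# Dimensions taken from the example VRT posted earlier (61000 x 56000).
print(overview_factors(61000, 56000))
# A 4x4 grid of 1000-pixel tiles.
print(overview_factors(4000, 4000))
```

For a mosaic as large as the example VRT this yields many more levels than 2 4 8 16, which is one way to see why low zoom levels can stay slow when only a few overview levels exist.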
Thanks for reporting back!
Does this file exist / contain a valid GTiff? |
Yep, file exists and is a valid GeoTiff. |
Very odd, the error message reads almost like #139. If we could only get hold of a small example that demonstrates this bug... |
I've tried recreating this issue with a small set of rasters that I can share, but haven't had any luck. @dionhaefner in your original post you mention another workaround: adding all rasters as separate layers and displaying them on top of each other. Do you have any advice on trying that? I'm assuming this would be part of any viewing app. For example, adding a ton of XYZ TileLayers to a Leaflet map? |
Thanks for trying! I think I might ask someone from mapbox to have a look at this, maybe they have an idea what's going wrong.
Yes, exactly, you would do that in the client. But it does become infeasible at some point (I wouldn't do it for hundreds of layers), because then Terracotta gets bombarded with requests for tiles that are out of range. I think the only fully supported solution is to create one or a handful of big GeoTIFFs 😕 |
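One way a client could soften the out-of-range problem is to check each XYZ tile against the known bounds of every layer and only request tiles that can contain data. Below is a rough sketch of that check in Python, using the standard Web Mercator tile-to-longitude/latitude formula. The layer names and bounds are invented, and where the bounds come from (for example a metadata request) is left open.

```python
import math


def tile_bounds(x, y, z):
    """(west, south, east, north) in degrees for an XYZ Web Mercator tile."""
    n = 2 ** z
    west = x / n * 360.0 - 180.0
    east = (x + 1) / n * 360.0 - 180.0
    north = math.degrees(math.atan(math.sinh(math.pi * (1 - 2 * y / n))))
    south = math.degrees(math.atan(math.sinh(math.pi * (1 - 2 * (y + 1) / n))))
    return (west, south, east, north)


def layers_for_tile(x, y, z, layer_bounds):
    """Names of layers whose bounding box intersects the tile."""
    w, s, e, n = tile_bounds(x, y, z)
    return sorted(
        name
        for name, (lw, ls, le, ln) in layer_bounds.items()
        if lw < e and w < le and ls < n and s < ln
    )


# Hypothetical layers with (west, south, east, north) bounds in degrees.
LAYERS = {
    "tile_a": (5.0, 40.0, 15.0, 50.0),
    "tile_b": (-80.0, -10.0, -70.0, 0.0),
}

print(layers_for_tile(1, 0, 1, LAYERS))  # north-east quadrant of the world
print(layers_for_tile(0, 1, 1, LAYERS))  # south-west quadrant of the world
```

This only removes requests that cannot return data. Every layer that does intersect the view still costs one request per tile, so it does not make hundreds of layers practical.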
👋 Because I've never been able to work with VRT (and because you mostly need to create external overviews), I've been working on a simple (but fast) solution to get mosaic tiles from multiple COGs. The result is explained in https://medium.com/devseed/cog-talk-part-2-mosaics-bbbf474e66df The general idea is to create a We have an implementation example over at https://github.com/developmentseed/cogeo-mosaic mosaicJSON is still a WIP (work in progress) and we are looking for feedback 😄 . Please feel free to ping me if you have any questions. |
Do I understand correctly that you don't have overviews for the entire mosaic? So if I zoom out really far it touches all of the COGs? |
@dionhaefner yes you are correct, we don't generate overviews for the This is not perfect but still better than having to pre-generate static file (vrt overview). |
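To make the trade-off in this exchange concrete, here is a toy sketch of a quadkey-based mosaic index. It is only an assumption about the general shape of such an index (a mapping from quadkeys at one fixed zoom to lists of COG paths) and not the actual mosaicJSON format or the cogeo-mosaic API. The quadkey encoding is the standard one from the Bing Maps tile system. File names are hypothetical.

```python
def quadkey(x, y, z):
    """Standard quadkey string for an XYZ tile."""
    digits = []
    for i in range(z, 0, -1):
        mask = 1 << (i - 1)
        digit = 0
        if x & mask:
            digit += 1
        if y & mask:
            digit += 2
        digits.append(str(digit))
    return "".join(digits)


def assets_for_tile(x, y, z, index, index_zoom):
    """COGs needed for a tile, given an index keyed by quadkeys at index_zoom."""
    key = quadkey(x, y, z)
    if z >= index_zoom:
        # Zoomed in: the parent quadkey at index_zoom is a prefix of the tile's key.
        return index.get(key[:index_zoom], [])
    # Zoomed out: every indexed quadkey under this tile contributes its COGs.
    found = []
    for indexed_key, assets in sorted(index.items()):
        if indexed_key.startswith(key):
            for asset in assets:
                if asset not in found:
                    found.append(asset)
    return found


# Hypothetical index at zoom 3.
INDEX = {"212": ["a.tif", "b.tif"], "213": ["a.tif"]}

print(assets_for_tile(6, 10, 4, INDEX, 3))  # zoomed in: one index entry
print(assets_for_tile(0, 0, 0, INDEX, 3))   # zoomed out: every COG in the index
```

The second call shows the behaviour discussed above: without mosaic-level overviews, a tile at a very low zoom resolves to every COG in the index.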
@dionhaefner , do you think seamless layers will make it into terracotta? Or should I focus on cogeo-mosaic for this functionality? |
I am sort of waiting for an official solution from the GDAL / rasterio side. I don't think the use case is important enough to warrant significant feature creep in Terracotta. Up until that point, we will still require users to merge their datasets before serving. By the way, what makes merging so inconvenient for you? It should be fairly insignificant compared to cloud-optimizing and other processing steps? |
@dionhaefner , sorry for the silence, other priorities got in my way. I wanted to avoid merging because a single "layer" would have ~10K tiles to merge. I wanted to keep storage costs down by working off just the tiles. I'll play around with merging and see how that works out. I could always delete the base tiles after the merge. |