Out of memory when serving large rasters #50

Closed
mrpgraae opened this issue Aug 28, 2018 · 11 comments

@mrpgraae
Collaborator

I used an overview to compute metadata, to get around the issue in #49. When I serve the data in Terracotta, I sometimes see this:

[2018-08-28 14:41:43,573] ERROR in app: Exception on /singleband/20171231/3/3/5.png [GET]
Traceback (most recent call last):
  File "/home/phgr/.conda/envs/terracotta/lib/python3.6/site-packages/flask/app.py", line 1982, in wsgi_app
    response = self.full_dispatch_request()
  File "/home/phgr/.conda/envs/terracotta/lib/python3.6/site-packages/flask/app.py", line 1614, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/home/phgr/.conda/envs/terracotta/lib/python3.6/site-packages/flask/app.py", line 1517, in handle_user_exception
    reraise(exc_type, exc_value, tb)
  File "/home/phgr/.conda/envs/terracotta/lib/python3.6/site-packages/flask/_compat.py", line 33, in reraise
    raise value
  File "/home/phgr/.conda/envs/terracotta/lib/python3.6/site-packages/flask/app.py", line 1612, in full_dispatch_request
    rv = self.dispatch_request()
  File "/home/phgr/.conda/envs/terracotta/lib/python3.6/site-packages/flask/app.py", line 1598, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "/home/phgr/terracotta/terracotta/api/flask_api.py", line 52, in inner
    return fun(*args, **kwargs)
  File "/home/phgr/terracotta/terracotta/api/singleband.py", line 83, in get_singleband
    parsed_keys, tile_xyz, **options
  File "/home/phgr/terracotta/terracotta/handlers/singleband.py", line 35, in singleband
    tilesize=tile_size)
  File "/home/phgr/terracotta/terracotta/xyz.py", line 27, in get_tile_data
    return driver.get_raster_tile(keys, bounds=target_bounds, tilesize=tilesize, nodata=nodata)
  File "/home/phgr/terracotta/terracotta/drivers/base.py", line 274, in get_raster_tile
    nodata=nodata
  File "/home/phgr/.conda/envs/terracotta/lib/python3.6/site-packages/cachetools/__init__.py", line 87, in wrapper
    v = method(self, *args, **kwargs)
  File "/home/phgr/terracotta/terracotta/drivers/base.py", line 27, in inner
    return fun(self, *args, **kwargs)
  File "/home/phgr/terracotta/terracotta/drivers/base.py", line 208, in _get_raster_tile
    src.crs, target_crs, src.width, src.height, *src.bounds
  File "/home/phgr/.conda/envs/terracotta/lib/python3.6/site-packages/rasterio/env.py", line 363, in wrapper
    return f(*args, **kwds)
  File "/home/phgr/.conda/envs/terracotta/lib/python3.6/site-packages/rasterio/warp.py", line 418, in calculate_default_transform
    src_crs, dst_crs, width, height, left, bottom, right, top, gcps)
  File "rasterio/_warp.pyx", line 646, in rasterio._warp._calculate_default_transform
  File "rasterio/_io.pyx", line 1664, in rasterio._io.InMemoryRaster.__cinit__
  File "rasterio/_err.pyx", line 188, in rasterio._err.exc_wrap_pointer
rasterio._err.CPLE_OutOfMemoryError: memdataset.cpp, 1545: cannot allocate 5816105575 bytes

This becomes a 500 response. It happens when I zoom out a bit, which might indicate a problem with loading the overviews. The innermost (highest-resolution) overview is 43846x33163, which corresponds to 1.45 GB (the raster is uint8), so the attempted allocation of 5.8 GB looks like the innermost overview being cast to some 32-bit dtype.
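
As a quick sanity check of the numbers above, here is a minimal sketch of the arithmetic. The figures are the ones quoted in this issue (overview dimensions, uint8 dtype, and the byte count from the error message); the variable names are illustrative only.

```python
# Figures quoted in this issue; variable names are illustrative only.
overview_width, overview_height = 43846, 33163
bytes_per_pixel = 1  # the raster is uint8

overview_bytes = overview_width * overview_height * bytes_per_pixel
attempted_bytes = 5816105575  # from the CPLE_OutOfMemoryError message

# How many times larger than the overview is the attempted allocation?
ratio = attempted_bytes / overview_bytes

print(f"overview size:        {overview_bytes / 1e9:.2f} GB")
print(f"attempted allocation: {attempted_bytes / 1e9:.2f} GB")
print(f"ratio:                {ratio:.4f}")
```

A ratio close to 4 is what a 4-byte dtype over the overview would give, but it would equally match a uint8 raster with twice the width and height, so the arithmetic alone cannot tell the two apart.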

@mrpgraae
Collaborator Author

The real question here, though, is why Rasterio tries to load the entire overview just to calculate the default transform.

@j08lue
Collaborator

j08lue commented Aug 28, 2018

> The real question here, though, is why Rasterio tries to load the entire overview just to calculate the default transform.

Sounds like a question better asked of rasterio.

@mrpgraae
Collaborator Author

> Sounds like a question better asked of rasterio.

Sure, I just put it here as a note to ourselves that this is what causes the error.

@dionhaefner
Collaborator

dionhaefner commented Aug 29, 2018

This looks bad. Rasterio / GDAL constructs an in-memory raster of the full size to compute the default transform. This can't be related to zoom level, either, since the arguments to calculate_default_transform don't depend on it:

https://github.com/DHI-GRAS/terracotta/blob/4c13d7484cce76b790fc58aa66750788404bfee8/terracotta/drivers/base.py#L207-L209

@j08lue
Collaborator

j08lue commented Aug 29, 2018

Why would you need the raster to compute the transform...? Only to get the shape?

@dionhaefner
Collaborator

To get the "best" resolution in the target CRS. I remember, though, that we had some problems with that function at high latitudes, since it always keeps pixels square. I will experiment a little with non-square pixels in the VRT; that way we wouldn't need the default transform.
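
One way to picture the non-square-pixel idea: if the output grid is fixed by the tile's bounds and pixel dimensions, its affine transform follows directly from those, without a default transform derived from the source raster. The following is a minimal sketch in plain Python, assuming the tile bounds are already expressed in the target CRS. The function name `tile_transform` and the example bounds are hypothetical and not taken from Terracotta's code.

```python
from typing import Tuple

Bounds = Tuple[float, float, float, float]  # (left, bottom, right, top)


def tile_transform(bounds: Bounds, tile_width: int, tile_height: int):
    """Return affine coefficients (a, b, c, d, e, f) for a north-up tile.

    The coefficients map pixel (col, row) to coordinates (x, y):
        x = a * col + b * row + c
        y = d * col + e * row + f
    """
    left, bottom, right, top = bounds
    # Pixel sizes are computed per axis, so pixels need not be square.
    x_res = (right - left) / tile_width
    y_res = (top - bottom) / tile_height
    # No rotation terms; y decreases as the row index increases.
    return (x_res, 0.0, left, 0.0, -y_res, top)


# Hypothetical bounds, chosen only to show a non-square result.
bounds = (0.0, 0.0, 1000.0, 500.0)
transform = tile_transform(bounds, 256, 256)
a, b, c, d, e, f = transform
print(f"pixel width: {a}, pixel height: {-e}")
```

The six values are in the order that the `affine` package's `Affine(a, b, c, d, e, f)` constructor takes, so a result like this could in principle be handed to a warped VRT in place of a computed default transform. Whether that matches the eventual fix is not stated in this thread.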

@mrpgraae
Collaborator Author

> This looks bad. Rasterio / GDAL constructs an in-memory raster of the full size to compute the default transform. This can't be related to zoom level, either, since the arguments to calculate_default_transform don't depend on it:

Yes, it's probably that Rasterio tries to allocate 2x the full size of the raster for some reason.

> I will experiment a little with non-square pixels in the VRT; that way we wouldn't need the default transform.

Cool 👍 pretty ridiculous behaviour from Rasterio here, especially considering that we're using cloud-optimized rasters to get around having to load the entire thing.

@dionhaefner
Collaborator

It's probably a GDAL limitation, but a pretty severe one if you think about it ...

@mrpgraae
Collaborator Author

> It's probably a GDAL limitation, but a pretty severe one if you think about it ...

Looks like you're right, since this is the call that fails and makes Rasterio raise the exception:

https://github.com/mapbox/rasterio/blob/fd412a6ea6f21f2d92e06091daad496425f690d8/rasterio/_io.pyx#L1665

@dionhaefner
Collaborator

Seems to be a known issue: rasterio/rasterio#1131

Pretty sucky, looking into a workaround...

@mrpgraae
Collaborator Author

rasterio/rasterio#1435

> We pushed it off to post-1.0, but it's time to tackle it now.

-- Sean Gillies, 23 days ago

hopefulness intensifies
