
WMS does not show raster data hosted on private AWS S3 #946

Closed

sotosoul opened this issue Jul 21, 2023 · 8 comments

Comments

@sotosoul

sotosoul commented Jul 21, 2023

Description

The WMS service of my datacube-ows instance does not deliver raster data hosted on a private AWS S3 bucket.

Steps

I've set up a local dev/test instance of datacube-ows and run it using the Flask approach. I've also configured all env vars as described in the documentation (see the sanity check after the list), i.e.:

  • AWS_DEFAULT_REGION=us-west-2
  • AWS_REGION=us-west-2
  • AWS_NO_SIGN_REQUEST=0
  • AWS_SECRET_ACCESS_KEY=...
  • AWS_ACCESS_KEY_ID=...
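
A quick way to confirm the running process actually sees these variables (a minimal sketch; it reports presence only and never prints the secrets):

import os

for var in ("AWS_DEFAULT_REGION", "AWS_REGION", "AWS_NO_SIGN_REQUEST",
            "AWS_SECRET_ACCESS_KEY", "AWS_ACCESS_KEY_ID"):
    # Report presence only, never the secret values themselves.
    print(var, "set" if os.getenv(var) else "MISSING")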

My ows_conf.py file contains the following lines:

"s3_url": "https://bucket-name",
"s3_bucket": "bucket-name",
"s3_aws_zone": "us-west-2"

The following part works fine, indicating that datacube is able to fetch S3 data:

from datacube import Datacube

dc = Datacube()

dss = dc.find_datasets(product='s2_l2a_10m_v1')

# returns the correct xarray Dataset just fine:
data = dc.load(
    datasets=dss,
    latitude=(55.52, 55.7),
    longitude=(12.6, 12.75),
    output_crs="EPSG:3857",
    resolution=(-100, 100),
)

Environment

conda list returns the following versions:

datacube                  1.8.15             pyhd8ed1ab_0    conda-forge
datacube-ows              1.8.34                   pypi_0    pypi
@whatnick
Member

whatnick commented Jul 23, 2023

What error do you get in the stack? There may be a baked-in --no-sign-request somewhere. Have you tried using OWS as a library in the same environment to render some layers and checking the results?

Could you try "false" as the value for AWS_NO_SIGN_REQUEST and see the results? Reading the tests, it looks like that is the value being tested. We should improve the docs to clearly list the string values that are interpreted as true and false.

@sotosoul
Author

sotosoul commented Jul 24, 2023

If I set the logging level to INFO, I can see the following:

[2023-07-24 16:46:50,542] [INFO] S3 access configured with signed requests
[2023-07-24 16:46:50,543] [INFO] Establishing/renewing credentials
[2023-07-24 16:46:50,634] [INFO] Found credentials in environment variables.

indicating that the env vars are present and being picked up.
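
For reference, a minimal way to raise the level (plain standard-library logging, run before OWS configures its S3 access):

import logging

# Raise the root logger so the S3/credential setup messages become visible.
logging.basicConfig(level=logging.INFO)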

I suppose datacube-ows queries datacube, which returns xarrays. If this assumption holds, I could look into what's being returned and debug from there. What do you think?

Btw, changing to 'false' didn't have any effect...

@whatnick
Member

All of the read handling in OWS is done by datacube, so it's best to capture some intermediate results there and check. It would also be good to hear from community members using authenticated S3 buckets with OWS.

@valpesendorfer
Contributor

We're running OWS with a private S3 bucket (disclaimer: we're still on the fairly old datacube-ows version 1.8.18).

In the config, the only S3-related item we have set is s3_aws_zone (so we don't specify the bucket name or URL).

In the environment variables we set:

AWS_NO_SIGN_REQUEST=NO 
AWS_DEFAULT_REGION=eu-central-1

We don't set credentials as these are supplied by the attached role and acquired / refreshed by the datacube-ows app.
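
If in doubt about which credential source is actually in play, a minimal sketch with boto3 (assuming boto3 is available, as it typically is in datacube environments):

import boto3

# creds.method names the resolved source, e.g. 'env', 'iam-role',
# or 'shared-credentials-file'.
creds = boto3.Session().get_credentials()
print(creds.method if creds else "no credentials found")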

Can you confirm that

from datacube import Datacube

dc = Datacube()

dss = dc.find_datasets(product='s2_l2a_10m_v1')

# returns the correct xarray Dataset just fine:
data = dc.load(
    datasets=dss,
    latitude=(55.52, 55.7),
    longitude=(12.6, 12.75),
    output_crs="EPSG:3857",
    resolution=(-100, 100),
)

is actually loading the data (it should) and not returning a lazy dask array? Also, is that run from the same environment as OWS?
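
One way to check, continuing from the snippet above ("red" is a hypothetical band name; substitute any measurement in the product):

# numpy.ndarray here means the load was eager; dask.array.Array means lazy.
band = data["red"]
print(type(band.data))

# Real pixel statistics rather than all-nodata suggest data actually arrived.
print(band.values.min(), band.values.max())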

@sotosoul
Author

sotosoul commented Aug 1, 2023

I confirm that the dc.load call I included returns a properly populated xarray.Dataset that I can visualize with matplotlib, showing the actual raster data from S3. And yes, it runs in the same env; that's exactly why I included it: to show that datacube can access my S3 rasters just fine. I'm not sure, but the problem seems to lie between datacube-core and datacube-ows, if that makes sense...
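
For reference, the kind of quick visual check described here (a sketch assuming a band named "red", a time dimension, and matplotlib installed):

import matplotlib.pyplot as plt

# Plot the first timestep of one band; real imagery confirms the S3 reads work.
data["red"].isel(time=0).plot.imshow()
plt.show()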

@valpesendorfer
Contributor

I remember it can be tricky to sort out loading issues between these layers. If I had to guess: either requests are somehow set to unsigned, the credentials are not being picked up, or you are using a different set of credentials that lack the required permissions.

You probably need to dig into the logs a bit, setting the level to DEBUG and so on, to see what's happening behind the scenes with botocore, rasterio, etc.

You say that OWS does not deliver raster data. Perhaps you can share an error message or any other hint as to why that is the case.
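
A minimal sketch of the kind of targeted DEBUG logging meant here (standard-library logging; the logger names are the conventional ones for these libraries):

import logging

logging.basicConfig(level=logging.DEBUG)

# Narrow in on the layers doing the actual S3 and raster I/O.
for name in ("botocore", "rasterio", "datacube"):
    logging.getLogger(name).setLevel(logging.DEBUG)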

@SpacemanPaul
Contributor

SpacemanPaul commented Aug 10, 2023

Have you rerun datacube-ows-update --views; datacube-ows-update?

OWS does not use the same search mechanism as core: it maintains a separate PostGIS index in the database that allows more accurate spatial searches, and that index must be kept up to date as per the documentation here.

@SpacemanPaul
Contributor

I'm assuming we can close this now.
