WCS 2.0.1 GetCoverage request should not return multipart response when mediaType unset #6551

Closed
arbakker opened this issue Jun 29, 2022 · 19 comments

Comments

@arbakker

Expected behavior and actual behavior.

Expected behaviour:

A WCS 2.0.1 GetCoverage request with the format=image/tiff parameter and without the mediaType=multipart/related parameter should return a response with Content-Type=image/tiff.

Only WCS 2.0.1 GetCoverage with mediaType=multipart/related parameter should return a multipart response, with Content-Type=multipart/related; boundary=wcs.

Actual behaviour:

WCS 2.0.1 GetCoverage requests with format=image/tiff parameter and without mediaType=multipart/related parameter return a response with Content-Type=multipart/related; boundary=wcs.

Background issue

Note: this issue has already been raised in #3666, but since that ticket is quite old and closed, I am opening a new issue.

The WCS 2.0.1 specification (page 32) states regarding multipart response the following:

  1. Below Requirement 27

    The encoding format in which the coverage will be returned is specified by the combination of format and mediaType parameter. Admissible values (i.e, formats supported) are those listed in the server’s Capabilities document. Default is the coverage’s Native Format.

  2. Requirement 29 /req/core/getCoverage-acceptable-mediaType

    If a GetCoverage request contains a mediaType parameter then this parameter shall contain a MIME type identifier of fixed value "multipart/related".

  3. Table 16

    Name Definition Data Type Multiplicity
    mediaType if present, enforces a multi-part encoding anyURI, fixed to multipart/related zero or one (optional)
    format MIME type identifier of the format in which the coverage returned is encoded anyURI zero or one (optional)

This states clearly that when the mediaType parameter is set to multipart/related the response must have multi-part encoding.

Requirement 27 states that when the mediaType parameter is not set, the response should adhere to the format parameter and thus should not be a multipart/related response. This was also proposed in the old issue.
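The reading of the specification above can be summarised as a small shell sketch. This is only an illustration of the expected behaviour argued for in this issue, not MapServer code; the function name `expected_content_type` is hypothetical, and it assumes it is given only the query string (the part after `?`) with the parameter names spelled exactly `mediaType` and `format`.

```shell
# Hypothetical helper: prints the Content-Type that this issue expects
# for a WCS 2.0.1 GetCoverage query string.
expected_content_type() {
  rest=$1
  media_type=""
  format=""
  # Walk through the '&'-separated key=value pairs.
  while [ -n "$rest" ]; do
    pair=${rest%%&*}
    case $pair in
      mediaType=*) media_type=${pair#mediaType=} ;;
      format=*)    format=${pair#format=} ;;
    esac
    case $rest in
      *"&"*) rest=${rest#*&} ;;
      *)     rest="" ;;
    esac
  done
  if [ "$media_type" = "multipart/related" ]; then
    # mediaType present: multi-part encoding is enforced (Table 16).
    echo "multipart/related"
  else
    # mediaType absent: the format parameter decides; without it the
    # coverage's native format applies.
    echo "${format:-native format}"
  fi
}
```

For example, `expected_content_type 'service=WCS&version=2.0.1&format=image/tiff'` prints `image/tiff`, while adding `&mediaType=multipart/related` makes it print `multipart/related`.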

Steps to reproduce the problem.

Note: see Attached Test Case

To reproduce the issue first start this Docker container, which runs MapServer v7.6.4 on http://localhost/?service=WCS&request=getcapabilities:

docker run \
   --rm -d \
   -e MS_MAPFILE=/srv/data/service.map \
   -p 80:80 \
   --name mapserver-example \
   -v `pwd`/example:/srv/data \
   pdok/mapserver:7.6.4-buster-lighttpd

Then run the following Bash script:

get_cov_100="http://localhost/?service=WCS&version=1.0.0&request=GetCoverage&coverage=coverage&crs=EPSG:28992&response_crs=EPSG:28992&bbox=190365.000,444075.000,190465.000,444175.000&format=image/tiff&resx=0.5&resy=0.5"
get_cov_201="http://localhost/?service=WCS&Request=GetCoverage&version=2.0.1&CoverageId=coverage&format=image/tiff&subset=x(190365.000,190465.000)&subset=y(444075.000,444175.000)"
get_cov_201_multipart="http://localhost/?service=WCS&Request=GetCoverage&version=2.0.1&CoverageId=coverage&format=image/tiff&subset=x(190365.000,190465.000)&subset=y(444075.000,444175.000)&mediaType=multipart/related"
curl -s -o /dev/null -w 'WCS 1.0.0 - %{content_type}\n' "$get_cov_100"
curl -s -o /dev/null -w 'WCS 2.0.1 - no multipart - %{content_type}\n' "$get_cov_201"
curl -s -o /dev/null -w 'WCS 2.0.1 - multipart - %{content_type}\n' "$get_cov_201_multipart"

My expectation is that the WCS 2.0.1 - no multipart request returns a response only containing a geotiff file with a Content-Type=image/tiff response header.

Attached Test Case

example.zip

@arbakker
Author

Other related issue: https://trac.osgeo.org/mapserver/ticket/3666

@arbakker
Author

arbakker commented Sep 1, 2022

Just found out that by adding: CONFIG "GDAL_PAM_ENABLED" "NO" to the mapfile the WCS 2.0.1 GetCoverage requests behave as expected.

To add to that, the default value of GDAL_PAM_ENABLED is build dependent, so it might not be necessary to set the config option in the mapfile if GDAL has been built with GDAL_PAM_ENABLED=NO as the default value. See the GDAL docs.

I suspect this is the case for camptocamp/docker-mapserver, since I did not see the multipart response occurring with that Docker image.

@arbakker arbakker closed this as completed Sep 1, 2022
@jratike80

jratike80 commented Sep 1, 2022

I wonder why you need to add CONFIG "GDAL_PAM_ENABLED" "NO". I do not use that, but image/tiff has always returned a single-part GeoTIFF for me. Do you use plain TIFF (without Geo) as the outputformat? Then the georeferencing that cannot be put inside the TIFF must be written somewhere else, or skipped as you did now. Here is the outputformat from one of my mapfiles.

OUTPUTFORMAT
  NAME GEOTIFF
  DRIVER "GDAL/GTiff"
  MIMETYPE "image/tiff"
  IMAGEMODE BYTE
  EXTENSION "tif"
  FORMATOPTION "PHOTOMETRIC=RGB"
END

If you do create a GeoTIFF then there must be some other metadata, either coming from the source data or added by Mapserver, that GDAL cannot store into GeoTIFF tags and that triggers the creation of PAM (.aux.xml).

@arbakker
Author

arbakker commented Sep 1, 2022

I think it is caused by the FORMATOPTION "NULLVALUE=3.402823466385289e+38" in the outputformat config. Since the .aux.xml contains only the nodata value:

<PAMDataset>
  <PAMRasterBand band="1">
    <NoDataValue le_hex_equiv="010000E0FFFFEF47">3.40282346638529E+38</NoDataValue>
  </PAMRasterBand>
</PAMDataset>

Outputformat config:

OUTPUTFORMAT
      NAME          "GEOTIFF_FLOAT32"
      DRIVER        "GDAL/GTiff"
      MIMETYPE      "image/tiff"
      IMAGEMODE     FLOAT32
      EXTENSION     "tif"
      FORMATOPTION  "NULLVALUE=3.402823466385289e+38"
      FORMATOPTION  "COMPRESS=DEFLATE"
      FORMATOPTION  "PREDICTOR=3"
      FORMATOPTION  "RESAMPLING=bilinear"
END # OUTPUTFORMAT

I suppose if you do not have data with NODATA values you will not be running into this issue.

Another explanation could be that GDAL has been compiled with a default of NO for GDAL_PAM_ENABLED.

@jratike80

jratike80 commented Sep 1, 2022

I suppose that in this case GDAL should rather save the nodata value into TIFFTAG_GDAL_NODATA ASCII tag as documented in https://gdal.org/drivers/raster/gtiff.html#nodata-value.

Could you try to add FORMATOPTION "PROFILE=GDALGeoTIFF" to see if it forces GDAL to save the nodata inside the TIFF?

Another possibility is that GDAL thinks that your nodata value 3.402823466385289e+38 is not a valid float32 number.
I found versions with fewer decimals on the web, like 3.402823E+38 and 3.402823466 E + 38.
I suppose that you could test that with some other nodata value like FORMATOPTION "NULLVALUE=0".

I apologize that we did not guide you in the right direction earlier. We should have asked you to show your mapfile and outputformat immediately.

@jratike80

There may indeed be something in my latest guess:

gdal_translate -ot float32 -a_nodata 3.402823466385289e+38 test.tif nodata.tif
Input file size is 1473, 1790
Warning 1: for band 1, nodata value has been clamped to 340282346638528859811704183484516925440, the original value being out of range.
0...10...20...30...40...50...60...70...80...90...100 - done.

However, gdal_translate did not create a PAM file but stored the nodata this way:
NoData Value=3.4028234663852886e+38

Not a solution for your original problem but perhaps you should have a look at your nodata value anyway.
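The clamping warning above is consistent with the configured value lying just above the largest finite float32 number, which is 3.4028234663852886e+38 when written with 17 significant digits. A quick way to check this without GDAL is a double-precision comparison in awk; the helper name `exceeds_float32_max` is made up for this sketch.

```shell
# Largest finite float32 value, written with 17 significant digits
# (the same value gdal_translate stored as NoData above).
flt_max=3.4028234663852886e+38

# Hypothetical helper: reports whether a number is larger than the
# float32 maximum. awk performs the comparison in double precision.
exceeds_float32_max() {
  awk -v v="$1" -v max="$flt_max" \
    'BEGIN { if (v + 0 > max + 0) print "out of range"; else print "in range" }'
}
```

With this, `exceeds_float32_max 3.402823466385289e+38` prints `out of range`, whereas `exceeds_float32_max 3.402823466e+38` prints `in range`.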

@jratike80 jratike80 reopened this Sep 1, 2022
@arbakker
Author

arbakker commented Sep 1, 2022

@jratike80 thanks for looking into the issue, much appreciated!

Your suggestion of using 3.402823466E+38 as a NODATA value in the output format config solved the issue, as can be seen in the example repository I set up.

The interesting thing is that the issue occurs with WCS v2.0.1 and not with v1.0.0 (I did not test all available WCS versions). I suppose WCS 2.0.1 uses a different code path in MapServer to generate the GeoTIFF.

What is a bit confusing to me is that the NODATA value reported by gdalinfo for the GeoTIFF file that has been generated with the 3.402823466E+38 NODATA value is 3.40282344818115234:

curl -s "http://localhost?SERVICE=WCS&VERSION=1.0.0&REQUEST=GetCoverage&FORMAT=GEOTIFF_FLOAT32&COVERAGE=ahn3_5m_dtm&BBOX=231620,580897,235620,584897&CRS=EPSG:28992&RESPONSE_CRS=EPSG:28992&WIDTH=1000&HEIGHT=1000" | gdalinfo /vsistdin/ | grep NoData
> NoData Value=3.40282344818115234

The reported NODATA value does not look right, even though the resulting tif looks good with respect to rendering the NODATA values in QGIS:

(image: screenshot of the resulting tif rendered in QGIS)

Any idea what is happening here?

@jratike80

Any idea what is happening here?

Sorry, no. I am not an expert on NaN values, extreme floating point values, or inaccuracies in floating point arithmetic, but I have seen before that they can cause trouble, so they are my usual suspects.

@arbakker
Author

arbakker commented Sep 2, 2022

The issues with NaN values in GDAL are partially explained/touched upon in this GDAL issue:
OSGeo/gdal#1071

@arbakker
Author

arbakker commented Sep 2, 2022

Regarding the NoData value reported by GDAL as 3.40282344818115234, this was due to a missing e in FORMATOPTION "NULLVALUE=3.402823466e+38" . On top of that I also ran into a localization issue with gdal_translate; since I had LC_NUMERIC=nl_NL.UTF-8 the . in 3.402823466385289e+38 is interpreted as a thousands separator instead of a decimal separator.

export LC_NUMERIC=en_US.UTF-8
gdal_translate -ot float32 -a_nodata 3.4028234663852886e+38 test.tif nodata.tif
# or
export LC_NUMERIC=nl_NL.UTF-8
gdal_translate -ot float32 -a_nodata 3,4028234663852886e+38 test.tif nodata.tif
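Instead of exporting LC_NUMERIC for the whole shell session, the numeric locale can be overridden for a single command. The wrapper below is a minimal sketch; the name `with_c_numeric` is hypothetical, and it assumes the wrapped program reads its locale from the environment in the usual way.

```shell
# Hypothetical wrapper: runs one command with the C numeric locale, so '.'
# is the decimal separator, without changing the calling shell's locale.
# LC_ALL is emptied for that command because a non-empty LC_ALL would
# override LC_NUMERIC.
with_c_numeric() {
  LC_ALL= LC_NUMERIC=C "$@"
}
```

Usage would then look like `with_c_numeric gdal_translate -ot float32 -a_nodata 3.4028234663852886e+38 test.tif nodata.tif`, using the same file names as in the commands above.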

In the end I added the following outputformat config to my mapfile:

OUTPUTFORMAT
      NAME          "GEOTIFF_FLOAT32"
      DRIVER        "GDAL/GTiff"
      MIMETYPE      "image/tiff"
      IMAGEMODE     FLOAT32
      EXTENSION     "tif"
      FORMATOPTION  "NULLVALUE=3.4028234663852886e+38"
      FORMATOPTION  "COMPRESS=DEFLATE"
      FORMATOPTION  "PREDICTOR=3"
      FORMATOPTION  "RESAMPLING=bilinear"
END # OUTPUTFORMAT

Which results in the same NoData value as reported by gdalinfo:

curl -s "http://localhost?version=2.0.1&request=GetCoverage&service=WCS&CoverageID=ahn3_5m_dtm&crs=http://www.opengis.net/def/crs/EPSG/0/28992&format=image/tiff&scalesize=x(1000),y(1000)&subset=x(231620,235620)&subset=y(580897,584897)" | gdalinfo /vsistdin/ | grep NoData
>  NoData Value=3.4028234663852886e+38

@arbakker arbakker closed this as completed Sep 2, 2022
@nicolasvila

This bug is really annoying. I'm getting multipart content in WCS 2.0.1 and not in WCS 1.0.0 which is fine... Is this bug fixed or not?

@jratike80

jratike80 commented Jan 22, 2023

Do you also use some special value as nodata in the outputformat? If you read the comments you'll see that nothing was done for the Mapserver code but the line FORMATOPTION "NULLVALUE=3.4028234663852886e+38" was just fixed in the mapfile.

@nicolasvila

nicolasvila commented Jan 22, 2023

My Mapserver instance produces map tiles for elevation with BIL16 format, which is used by a 3D globe, but that's not the point.
The source files that I'm using behind Mapserver are COG files, and I don't use any "nodata" value in the output format. The same goes for the source GeoTIFFs. Here is the OUTPUTFORMAT I'm using:

  OUTPUTFORMAT
    NAME BIL16
    DRIVER "GDAL/EHdr"
    MIMETYPE "application/bil16"
    IMAGEMODE INT16
    EXTENSION "bil"
    FORMATOPTION "FORM=SIMPLE"
    FORMATOPTION "STORAGE=stream"
  END

It also happens with ASCII Grid output format. Another user complains about it here: WCS 2.0.1 sends multipart response for ascii grid format

  OUTPUTFORMAT
    NAME AAIGRID
    DRIVER "GDAL/AAIGRID"
    MIMETYPE "application/x-ascii-grid"
    IMAGEMODE FLOAT32
    EXTENSION "grd"
    FORMATOPTION "SIGNIFICANT_DIGITS=5"
  END

It's really strange: I have no issue with WCS 1.0.0, but when I make the same request with WCS 2.0.1, I get a multipart response with 4 files. Here are the 4 elements that are present in the multipart response instead of my image data:

  • out.bil
  • out.bil.aux.xml
  • out.hdr
  • out.prj

If I ask for the same part of the image in a different format, like PNG, I obtain only the image as expected. When using the bil16 output format, I get a multipart document which contains the BIL16 but also auxiliary data I don't care about. According to the documentation that shouldn't happen, as I NEVER set a "mediaType", which enforces a multi-part encoding (the only possible value is "multipart/related").

@jratike80

Mapserver is using GDAL for creating the output. You can simulate what happens by converting a GeoTIFF file into EHdr format https://gdal.org/drivers/raster/ehdr.html

gdal_translate -of EHdr scaled.tiff ehdrtest.bil

The result contains 4 parts:
.bil contains the raw pixel data
.hdr file contains the headers which defines the structure of the .bil file
.prj contains the definitions of the coordinate system
.aux.xml contains all metadata that is present in the source data but that cannot be stored into other parts of the EHdr data

The purpose of a WCS service is to deliver coverage data as close to the original as possible, without altering it. So I think that the current behavior is a good default. If the data cannot be expressed properly with one file, then it is expressed with as many files as the format requires, regardless of the mediaType. Certainly the intent of the WCS standard is not that, if the coverage format requires many files, only one part would be sent when mediaType is absent. But if the format supports saving all data and metadata into one file, like GeoTIFF, then the output is only one file.

I understand that for some use cases users could want to use WCS in an innovative way and then for example dropping out information about georeferencing and projection would not harm. In that case it may be possible to get the desired result by editing the outputformat:

  • setting the config option GDAL_PAM_ENABLED to NO should prevent writing the aux.xml file
  • the creation option WORLDFILE=NO should disable writing the world file for the png format (but you wrote that you get a single .png file from WCS; that feels like a bug to me, I would also expect .prj and .wld because otherwise the data are not complete)
  • I am not sure whether it is possible to send output that requires many parts as one zipped file with options like FORMATOPTION "FORM=zip" and FORMATOPTION "FILENAME=result.zip"

When it comes to EHdr format, is it even valid without the header file?

It seems that the one who complained about the ASCII Grid format was I. I do not remember right now how we resolved that problem but anyway, we have resolved that and we deliver WCS 2.0 data in GeoTIFF and .asc formats and the responses are in single part.

As for your special need to strip everything but the .bil part from the response, I suggest writing to the mapserver-users list and asking there. Maybe some other user has had a similar need and resolved it already.

@nicolasvila

@jratike80 , thanks for your reply.
My issue was more about the inconsistency between WCS 1.0.0 and WCS 2.0.1. With the former, even when GDAL produces more than one file, only the raster file is sent back to the client (which is fine). With WCS 2.0.1, every file is sent back as multipart content, even if we didn't explicitly ask for it.
It seemed to me that the option mediaType=multipart/related was meant to force such behavior.
Thanks again.

@jratike80

jratike80 commented Jan 23, 2023

According to the WCS 1.0.0 standard, Mapserver behaves incorrectly with WCS 1.0.0, and what feels fine to you (and to most other users, I think) is a bug.

9.3.3 Multi-file payloads
Several well-known coverage formats encode data and metadata into multiple files.
When returning a multi-file payload to the client, WCS 1.0.0 servers should use multipart MIME encoding [IETF RFC 2045].

And if I read WCS version 1.1.0 right, that version does not really support a direct single-part response at all, even if the format itself does not require many files.

I believe that the option mediaType=multipart/related in WCS 2.0 was meant to produce output similar to the default output of the previous WCS version:

When the "store" parameter is absent or has the value "false", the server shall transfer the
complete GetCoverage response to the client, either as a MIME multipart message (for KVP
or XML requests) or as a SOAP message with attachments (for SOAP requests). The
Coverages shall reference the other parts of the MIME multi-part message (or SOAP
attachments) as indicated in 10.3.11.3 below

@jratike80

jratike80 commented Jan 23, 2023

I checked, and in 2019 we could not configure Mapserver to serve the ascii grid format as single part. As a workaround we wrote a small routine that separates the .asc part from the multipart response before sending the GetCoverage response to our clients, and therefore I lost interest in testing with newer Mapserver versions.

A test with gdal_translate shows that with --config GDAL_PAM_ENABLED NO the aux.xml file is not created but I do not know if there is some trick to avoid writing the .prj file. Probably not because it seems to be hardcoded into https://github.com/OSGeo/gdal/blob/master/frmts/aaigrid/aaigriddataset.cpp#L1466. That does make sense because coordinates with crs info are generally much more useful than plain coordinates alone.
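The routine mentioned above is not shown in this thread. As a rough idea of what such a post-processing step could look like for text formats like ASCII Grid, here is an awk-based sketch. It is not the actual routine; the function name `extract_part` is hypothetical, and it assumes that the wanted part carries a `Content-Type` header line that exactly matches the given MIME type and that the boundary is known (`wcs` in the responses reported in this issue).

```shell
# Hypothetical helper: reads a multipart body on stdin and prints the body
# of the part whose Content-Type header equals the given MIME type.
#   $1: boundary without the leading dashes, e.g. wcs
#   $2: MIME type of the wanted part, e.g. application/x-ascii-grid
# Only suitable for text parts: trailing CR characters are stripped.
extract_part() {
  awk -v b="--$1" -v want="$2" '
    { sub(/\r$/, "") }                               # tolerate CRLF line endings
    ($0 == b) || ($0 == (b "--")) {                  # part delimiter or closing delimiter
      state = "headers"; keep = 0; next
    }
    state == "headers" && $0 == "" { state = "body"; next }   # blank line ends headers
    state == "headers" {
      if (tolower($0) == tolower("Content-Type: " want)) keep = 1
      next
    }
    state == "body" && keep { print }
  '
}
```

A saved response could then be filtered with something like `extract_part wcs application/x-ascii-grid < response.bin > out.asc`, where the file names are placeholders.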

@rouault
Contributor

rouault commented Jan 23, 2023

I've reviewed the code a bit, and I believe what mapserver does is the best it can do

  • for WCS 1.0, the code is a bit "stupid" and just outputs the main file generated by GDAL. Most of the time it is fine, but if you take for example the EHDR format, the main file is the grid. But EHDR without the ASCII header file is essentially useless. At least, this is not EHDR, but a raw binary grid
  • for WCS 2.0, if mediaType=multipart/related is specified, MapServer honours it. If it is not specified, MapServer looks at the output file(s) generated by GDAL. If there's a single one, then a single-part response is generated. Otherwise a multi-part one. The strictly conforming behaviour would be to generate an error indicating that mediaType=multipart/related is needed for that format. Would that be preferable?
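The WCS 2.0 decision described in the second bullet can be paraphrased as the following shell sketch. It only restates the logic from this comment and is not taken from the MapServer source; the function name `wcs20_response_kind` and its arguments are hypothetical.

```shell
# Hypothetical paraphrase of the WCS 2.0 behaviour described above.
#   $1: value of the mediaType parameter (empty string if not given)
#   $2: number of output files GDAL generated for the requested format
wcs20_response_kind() {
  if [ "$1" = "multipart/related" ]; then
    echo "multipart"      # an explicit mediaType is honoured
  elif [ "$2" -eq 1 ]; then
    echo "single-part"    # one output file: plain response
  else
    echo "multipart"      # several output files: multipart even without mediaType
  fi
}
```

Under this description, `wcs20_response_kind "" 4` (the EHdr case with .bil, .hdr, .prj and .aux.xml) yields `multipart`, while `wcs20_response_kind "" 1` yields `single-part`.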

@jratike80
Copy link

The strictly conforming behaviour would be to generate an error indicating that mediaType=multipart/related is needed for that format. Would that be preferable?

Definitely not, I would say, and I am not even sure that it would be more conforming. Or can you show a place in the standard that states that some of the supported formats may require the use of the otherwise optional mediaType=multipart/related? At least that information cannot be found in the GetCapabilities document, so clients would have to be made to retry with mediaType if a request happens to fail without it.
