Issue regarding the use of "dods:" in DODSNetcdfFile #161
Comments
Tagging @DennisHeimbigner
Ok, I think I've found the root issue here, so I'll cut right to the chase. The NCEI server does not handle redirects of encoded URLs properly. Let's follow the steps of a request. We can immediately see an issue if we make a request with the brackets encoded:
curl -G -v "http://www.ncei.noaa.gov/thredds/dodsC/cdr/gridsat/GridSat-Aggregation.ncml.ascii?time%5b0:1:2%5d"
* Trying 205.167.25.171...
* TCP_NODELAY set
* Connected to www.ncei.noaa.gov (205.167.25.171) port 80 (#0)
> GET /thredds/dodsC/cdr/gridsat/GridSat-Aggregation.ncml.ascii?time%5b0:1:2%5d HTTP/1.1
> Host: www.ncei.noaa.gov
> User-Agent: curl/7.58.0
> Accept: */*
>
< HTTP/1.1 301 Moved Permanently
< Date: Fri, 24 Jan 2020 20:21:09 GMT
< Server: Apache
< Location: https://www.ncei.noaa.gov/thredds/dodsC/cdr/gridsat/GridSat-Aggregation.ncml.ascii?time%255b0:1:2%255d
< Content-Length: 310
< Connection: close
< Content-Type: text/html; charset=iso-8859-1
<
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>301 Moved Permanently</title>
</head><body>
<h1>Moved Permanently</h1>
<p>The document has moved <a href="https://www.ncei.noaa.gov/thredds/dodsC/cdr/gridsat/GridSat-Aggregation.ncml.ascii?time%255b0:1:2%255d">here</a>.</p>
</body></html>
* Closing connection 0
Notice that in the response header, the Location field has been double encoded. Specifically, the initial request uses %5b and %5d, but the Location field contains %255b and %255d. Now if we try to use that location, we get:
curl -G -v "https://www.ncei.noaa.gov/thredds/dodsC/cdr/gridsat/GridSat-Aggregation.ncml.ascii?time%255b0:1:2%255d"
* Trying 205.167.25.178...
<snip>
> GET /thredds/dodsC/cdr/gridsat/GridSat-Aggregation.ncml.ascii?time%255b0:1:2%255d HTTP/1.1
> Host: www.ncei.noaa.gov
> User-Agent: curl/7.58.0
> Accept: */*
>
< HTTP/1.1 403 Forbidden
< Date: Fri, 24 Jan 2020 20:31:51 GMT
< Server: Apache-Coyote/1.1
< Strict-Transport-Security: max-age=31536000
< XDODS-Server: opendap/3.7
< Content-Description: dods-error
< Content-Type: text/plain
< Access-Control-Allow-Origin: *
< Access-Control-Allow-Headers: X-Requested-With, Content-Type
< Connection: close
< Transfer-Encoding: chunked
<
Error {
code = 403;
message = "Request too big=1.1117421067232E7 Mbytes, max=50.0";
};
* Closing connection 0
@oxelson - this smells like a
We also see in the 403 response header that the Server field is set as Apache-Coyote/1.1, whereas the 301 response reported Apache.
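The double encoding seen in the Location header can be reproduced offline. The following is a minimal sketch, assuming the redirecting layer simply percent-encodes the `%` characters of an already-encoded query; the class and method names are hypothetical and are not part of netCDF-Java or of the NCEI server configuration.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

class DoubleEncodingDemo {
    // Mimic a layer that encodes an already-encoded query a second time:
    // every '%' becomes "%25", so "%5b" turns into "%255b".
    static String reEncodePercent(String query) {
        return query.replace("%", "%25");
    }

    // Decode a query string exactly once, as a server normally would.
    static String decodeOnce(String query) {
        try {
            return URLDecoder.decode(query, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    public static void main(String[] args) {
        String sent = "time%5b0:1:2%5d";             // query from the first curl request
        String redirected = reEncodePercent(sent);   // query as it appears in the Location header

        System.out.println(redirected);              // time%255b0:1:2%255d
        System.out.println(decodeOnce(sent));        // time[0:1:2]
        System.out.println(decodeOnce(redirected));  // time%5b0:1:2%5d
    }
}
```

After one round of decoding, the redirected query still contains the literal text %5b and %5d rather than brackets, so a server that decodes once would not see the constraint `[0:1:2]`. Whether that is what leads to the "Request too big" response is an assumption, not something the logs confirm.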
@lesserwhirls no, I've not seen this before with mod_jk, but that doesn't mean it's not a mod_* proxy issue. That said, I'd really need to know what their environment looks like (apache/tomcat configs) to see if the server environment is causing the encoding issues.
I'll reach out to NCEI to see what's going on with the double encoding on the
TL;DR
Making an HTTP GET request to
http://www.ncei.noaa.gov/thredds/dodsC/cdr/gridsat/GridSat-Aggregation.ncml.dods?time
works and
https://www.ncei.noaa.gov/thredds/dodsC/cdr/gridsat/GridSat-Aggregation.ncml.dods?time[0:1:1]
works, but
http://www.ncei.noaa.gov/thredds/dodsC/cdr/gridsat/GridSat-Aggregation.ncml.dods?time[0:1:1]
fails with a 403 (request too big).
This is perhaps a server side issue, but netCDF-Java could make things work by using the proper protocol (https in this case).
Details
In ucar.nc2.dods.DODSNetcdfFile, any dataset url that starts with dods: is changed to use http:
netcdf-java/opendap/src/main/java/ucar/nc2/dods/DODSNetcdfFile.java
Lines 179 to 192 in 8f15ecf
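The embedded source lines are not reproduced here. As a rough sketch of the behavior just described, and not the actual DODSNetcdfFile code, the mapping amounts to a prefix swap. The class and method names below are hypothetical.

```java
class DodsSchemeSketch {
    // Hypothetical stand-in for the mapping described above:
    // a dataset URL starting with "dods:" is rewritten to start with "http:".
    static String toHttp(String datasetUrl) {
        if (datasetUrl.startsWith("dods:")) {
            return "http:" + datasetUrl.substring("dods:".length());
        }
        return datasetUrl; // other schemes are left untouched in this sketch
    }

    public static void main(String[] args) {
        String url = "dods://www.ncei.noaa.gov/thredds/dodsC/cdr/gridsat/GridSat-Aggregation.ncml";
        // Prints http://www.ncei.noaa.gov/thredds/dodsC/cdr/gridsat/GridSat-Aggregation.ncml
        System.out.println(toHttp(url));
    }
}
```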
Of course, that's not always the correct thing to do, but if redirects are handled properly, and the server responds properly, it should all just work. For certain code paths, everything does work. For example, if we look at the following dataset url:
dods://www.ncei.noaa.gov/thredds/dodsC/cdr/gridsat/GridSat-Aggregation.ncml
We can open the file using NetcdfDataset.acquireFile(), and we can successfully read the dds and das because redirects work and the server behaves well. However, if we try to open with NetcdfDataset.openDataset(), we fail because the OPeNDAP server returns a 403 when reading a slice (in this case, trying to get http://www.ncei.noaa.gov/thredds/dodsC/cdr/gridsat/GridSat-Aggregation.ncml.dods?time[0:1:108082]). It's the "reading a slice" part that seems to be the key.
Doing a GET request on
http://www.ncei.noaa.gov/thredds/dodsC/cdr/gridsat/GridSat-Aggregation.ncml.dods?time
works, but once I introduce the constraint, I run into problems. For example, if I try to HTTP GET http://www.ncei.noaa.gov/thredds/dodsC/cdr/gridsat/GridSat-Aggregation.ncml.dods?time[0:1:1], I get a 403 (request too big).
If I change the same request to use https:, it works. It's almost like the entire query (after the ?) is being dropped after a redirect when requesting a slice of data from a variable.
This behavior is also seen in the latest netCDF-Java 4.6.x code (current master branch over at https://github.com/unidata/thredds). The ability to handle dods: as a dataset url through NetcdfDataset used to work, at least as recently as 4.6.12-SNAPSHOT (from February of this year), so it's a somewhat recent change affecting both 4.6.x and 5.0.x.
It seems to me that, regardless of whether this is a server side issue or not (it likely is), netCDF-Java could handle this by making the right choice when trying to map dods: in DODSNetcdfFile.
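One way to picture "making the right choice" is sketched below. This is only an illustration under stated assumptions, not a proposed patch: the class, the method, and the idea of passing in a probe are all hypothetical, and the probe stands in for whatever check netCDF-Java might perform (for example, requesting the DDS over https).

```java
import java.util.function.Predicate;

class DodsSchemeChooser {
    // Hypothetical helper: map a dods: URL to https: when the supplied probe
    // reports that the https form is usable, and fall back to http: otherwise.
    static String resolve(String datasetUrl, Predicate<String> httpsWorks) {
        if (!datasetUrl.startsWith("dods:")) {
            return datasetUrl;
        }
        String rest = datasetUrl.substring("dods:".length());
        String httpsUrl = "https:" + rest;
        return httpsWorks.test(httpsUrl) ? httpsUrl : "http:" + rest;
    }

    public static void main(String[] args) {
        String url = "dods://www.ncei.noaa.gov/thredds/dodsC/cdr/gridsat/GridSat-Aggregation.ncml";
        // The lambda is a stand-in probe that always answers "yes";
        // a real probe would issue a network request.
        System.out.println(resolve(url, candidate -> true));
    }
}
```

Keeping the probe as a parameter leaves the policy question (always prefer https, or only after a failed http attempt) outside the mapping itself.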