BiG-CZ: Fetch CUAHSI Values using WaterML and Ulmo #2353

Merged
merged 3 commits into from
Oct 10, 2017

rajadain
Member

@rajadain rajadain commented Oct 9, 2017

Overview

Adds Ulmo, and uses it in Django endpoints for fetching CUAHSI details and values.

The expected usage is:

  1. Query /bigcz/details to fetch general information about all variables in a site
  2. Query /bigcz/values for each variable to fetch its values
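A minimal sketch of how a client might build the two requests, assuming a local development server at `localhost:8000`. The helper names `details_url` and `values_url` are hypothetical and not part of this PR; the query parameters are the ones used in the demo requests.

```python
from urllib.parse import urlencode

# Assumption: a local development server; adjust for other environments.
BASE = "http://localhost:8000"


def details_url(wsdl, site, base=BASE):
    """Build the /bigcz/details URL for one site (hypothetical helper)."""
    query = urlencode({"catalog": "cuahsi", "wsdl": wsdl, "site": site})
    return f"{base}/bigcz/details?{query}"


def values_url(wsdl, site, variable, from_date, to_date, base=BASE):
    """Build the /bigcz/values URL for one variable (hypothetical helper)."""
    query = urlencode({
        "catalog": "cuahsi",
        "wsdl": wsdl,
        "site": site,
        "variable": variable,
        "from_date": from_date,
        "to_date": to_date,
    })
    return f"{base}/bigcz/values?{query}"
```

A client would call `details_url` once per site, then `values_url` once per variable listed in the response.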

UI for this is coming in a following PR. The UI may be broken until that next PR is merged.

See commit messages for details.

Connects #2243
Connects #2238

Demo

  • Details

    http --print HhBb :8000/bigcz/details catalog==cuahsi wsdl=="http://hydroportal.cuahsi.org/nwisgw/cuahsi_1_1.asmx" site=="NWISGW:395801075101601"
    
    GET /bigcz/details?catalog=cuahsi&wsdl=http%3A%2F%2Fhydroportal.cuahsi.org%2Fnwisgw%2Fcuahsi_1_1.asmx&site=NWISGW%3A395801075101601 HTTP/1.1
    Accept: */*
    Accept-Encoding: gzip, deflate
    Connection: keep-alive
    Host: localhost:8000
    User-Agent: HTTPie/0.9.9
    
    
    
    HTTP/1.1 200 OK
    Allow: OPTIONS, GET
    Connection: keep-alive
    Content-Encoding: gzip
    Content-Type: application/json
    Date: Mon, 09 Oct 2017 17:29:25 GMT
    Server: nginx
    Transfer-Encoding: chunked
    Vary: Accept-Encoding
    Vary: Accept, Cookie
    
    {
        "code": "395801075101601",
        "elevation_m": "90",
        "location": {
            "latitude": "39.96705715",
            "longitude": "-75.17073369",
            "srs": "EPSG:4269"
        },
        "name": "PH   486",
        "network": "NWISGW",
        "series": {
            "NWISGW:72019": {
                "variable": {
                    "code": "72019",
                    "data_type": "unknown",
                    "general_category": "Hydrology",
                    "id": "3017",
                    "name": "Depth to water level, feet below land surface",
                    "no_data_value": "-999999",
                    "sample_medium": "Groundwater",
                    "speciation": "Not Applicable",
                    "time": {
                        "interval": "0",
                        "is_regular": true,
                        "units": {
                            "code": "103",
                            "name": "hour",
                            "type": "Time"
                        }
                    },
                    "units": {
                        "abbreviation": "ft"
                    },
                    "value_type": "Field Observation",
                    "vocabulary": "NWISGW"
                },
                "{http://www.cuahsi.org/water_ml/1.1/}method": {
                    "method_description": "No method specified",
                    "method_id": "0"
                },
                "{http://www.cuahsi.org/water_ml/1.1/}quality_control_level": {
                    "definition": "Unknown",
                    "quality_control_level_code": "-9999",
                    "quality_control_level_id": "-9999"
                },
                "{http://www.cuahsi.org/water_ml/1.1/}source": {
                    "citation": "http://waterservices.usgs.gov",
                    "organization": "U.S. Geological Survey (USGS)",
                    "source_description": "historical manually-recorded groundwater levels from hydrologic sites served by the USGS",
                    "source_id": "2"
                },
                "{http://www.cuahsi.org/water_ml/1.1/}value_count": {
                    "value_count": "1"
                },
                "{http://www.cuahsi.org/water_ml/1.1/}variable_time_interval": {
                    "begin_date_time": "1954-06-09T00:00:00",
                    "begin_date_time_utc": "1954-06-09T04:00:00",
                    "end_date_time": "1954-06-09T00:00:00",
                    "end_date_time_utc": "1954-06-09T04:00:00",
                    "variable_time_interval_type": "TimeIntervalType"
                }
            }
        },
        "site_property": {
            "pos_accuracy_m": "10",
            "site_comments": "02040203",
            "state": "PA"
        }
    }
  • Values

    http --print HhBb :8000/bigcz/values catalog==cuahsi wsdl=="http://hydroportal.cuahsi.org/nwisgw/cuahsi_1_1.asmx" site=="NWISGW:395801075101601" variable=="NWISGW:72019" from_date=="06/09/1954" to_date=="06/09/1954"
    
    GET /bigcz/values?catalog=cuahsi&wsdl=http%3A%2F%2Fhydroportal.cuahsi.org%2Fnwisgw%2Fcuahsi_1_1.asmx&site=NWISGW%3A395801075101601&variable=NWISGW%3A72019&from_date=06%2F09%2F1954&to_date=06%2F09%2F1954 HTTP/1.1
    Accept: */*
    Accept-Encoding: gzip, deflate
    Connection: keep-alive
    Host: localhost:8000
    User-Agent: HTTPie/0.9.9
    
    
    
    HTTP/1.1 200 OK
    Allow: OPTIONS, GET
    Connection: keep-alive
    Content-Encoding: gzip
    Content-Type: application/json
    Date: Mon, 09 Oct 2017 17:33:12 GMT
    Server: nginx
    Transfer-Encoding: chunked
    Vary: Accept-Encoding
    Vary: Accept, Cookie
    
    {
        "site": {
            "agency": "USGS",
            "code": "395801075101601",
            "location": {
                "latitude": "39.96705715",
                "longitude": "-75.17073369"
            },
            "name": "PH   486",
            "network": "NWISGW",
            "notes": {
                "county_cd": "42101",
                "huc_cd": "02040203",
                "site_type_cd": "GW",
                "state_cd": "42"
            },
            "timezone_info": {
                "default_tz": {
                    "abbreviation": "EST",
                    "offset": "-05:00"
                },
                "dst_tz": {
                    "abbreviation": "EDT",
                    "offset": "-04:00"
                },
                "uses_dst": false
            }
        },
        "values": [
            {
                "datetime": "1954-06-09T12:00:00",
                "value": "3.60"
            }
        ],
        "variable": {
            "code": "72019",
            "description": "Depth to water level, ft below land surface",
            "id": null,
            "name": "Depth to water level, feet below land surface",
            "network": "NWISGW",
            "no_data_value": "-999999",
            "oid": "52331280",
            "sample_medium": "Surface Water Observation",
            "statistic": {
                "code": "00000",
                "name": null
            },
            "time": {},
            "units": {
                "abbreviation": "ft"
            },
            "value_type": "Derived Value",
            "vocabulary": "NWISGW"
        }
    }
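
As a sketch of how a consumer might read the `details` response shown above, the following pulls the available time range for each series. It assumes the namespaced key `{http://www.cuahsi.org/water_ml/1.1/}variable_time_interval` appears exactly as in the demo output; `series_time_ranges` is a hypothetical helper, not code from this PR.

```python
# Namespace prefix as it appears on some keys in the demo response.
WML_NS = "{http://www.cuahsi.org/water_ml/1.1/}"


def series_time_ranges(details):
    """Map each series key to its (begin, end) local date-time strings.

    Missing intervals yield (None, None) rather than raising.
    """
    ranges = {}
    for key, series in details.get("series", {}).items():
        interval = series.get(WML_NS + "variable_time_interval", {})
        ranges[key] = (
            interval.get("begin_date_time"),
            interval.get("end_date_time"),
        )
    return ranges
```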

Notes

The WaterML objects themselves are complex and elaborate, with many nested keys and arrays. Fortunately, we don't need to write our own serializers for them, because Ulmo takes care of that. Furthermore, because WaterML is the underlying standard, we can reasonably expect the data to follow a certain shape, and rely on this assumption all the way to the client side, without having to verify it on the server.

Ulmo uses suds under the hood to make SOAP requests, and by default uses the suds cache, which is distinct from the redis cache used elsewhere in the app (for Geoprocessing, etc.). I have not looked into changing this default behavior, leaving it to a future investigation should the need arise.

Testing Instructions

Ulmo (https://github.com/ulmo-dev/ulmo/) is a data access
library designed to fetch information from various hydrology
and climatology sources. It works with WaterML, an XML data
format that CUAHSI serves over SOAP web services. We will use
it to fetch detailed values for sensor data from CUAHSI.

Also upgrade pip, which significantly speeds up our Python
dependency installation step.

Previously, a CUAHSI search would have a `concept_keywords` array
of strings, corresponding to each variable. Now that we want to
fetch more detailed values, we switch to having an array of
`variables`, each of which in addition to `id`, `name` (that
tends to be longer, more like a description than name), and
`concept_keyword`, has a `site` and `wsdl` which will be used
to fetch values for that variable in a given timespan.

This array is sorted by `concept_keyword`, which will also be
used as a display label in the UI.
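
A sketch of the shape described above. `build_variables` and the layout of its input records are hypothetical and only illustrate the listed fields and the sort order; this is not the function in this commit.

```python
def build_variables(records):
    """Return variable dicts with the fields named above, sorted by
    concept_keyword so the UI can use it as a display label.

    `records` is assumed to be an iterable of dicts that already carry
    these keys; the real search results may be shaped differently.
    """
    variables = [
        {
            "id": r["id"],
            "name": r["name"],
            "concept_keyword": r["concept_keyword"],
            "site": r["site"],
            "wsdl": r["wsdl"],
        }
        for r in records
    ]
    return sorted(variables, key=lambda v: v["concept_keyword"])
```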

For fetching CUAHSI variable values, we add two endpoints:

`details` expects `wsdl` and `site` parameters, and uses them to
fetch the WaterML siteinfo object for the given site from the
given WSDL URL. This siteinfo object contains details for the
variables in the site, including units, and the time ranges for
which each variable has values. This information is important,
because if we query for `values` with a time outside this range,
the endpoint crashes.
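
One way a caller could guard against out-of-range queries is to clamp the requested dates to the interval reported by `details` before calling `values`. This is a sketch under two assumptions inferred from the demo: the interval uses ISO-style timestamps such as `1954-06-09T00:00:00`, and `from_date` / `to_date` use `MM/DD/YYYY`. `clamp_range` is a hypothetical helper, not code from this PR.

```python
from datetime import datetime

ISO = "%Y-%m-%dT%H:%M:%S"  # begin_date_time / end_date_time in `details`
US_DATE = "%m/%d/%Y"       # assumed from_date / to_date format (from the demo)


def clamp_range(from_date, to_date, begin_date_time, end_date_time):
    """Clamp a requested date range to the available interval.

    Returns a (from_date, to_date) pair in MM/DD/YYYY form, or None when
    the requested range does not overlap the available interval at all.
    """
    begin = datetime.strptime(begin_date_time, ISO).date()
    end = datetime.strptime(end_date_time, ISO).date()
    start = max(datetime.strptime(from_date, US_DATE).date(), begin)
    stop = min(datetime.strptime(to_date, US_DATE).date(), end)
    if start > stop:
        return None
    return start.strftime(US_DATE), stop.strftime(US_DATE)
```

Returning `None` for a disjoint range lets the caller skip the request entirely instead of sending one that is known to fail.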

`values` expects `wsdl`, `site`, `variable`, `from_date`, and
`to_date` parameters, and uses them to fetch the values of that
variable at that site in the given time range. It returns the
WaterML value object.
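
A sketch of how a consumer might turn the returned object into numeric points, assuming the response shape shown in the demo: a `values` list of `datetime` / `value` strings and a `variable.no_data_value` sentinel. `numeric_values` is a hypothetical helper, not code from this PR.

```python
def numeric_values(response):
    """Return (datetime_string, float_value) pairs, skipping no-data points.

    Values arrive as strings in the demo response, so they are converted
    to floats and compared numerically against the no-data sentinel.
    """
    raw_sentinel = response.get("variable", {}).get("no_data_value")
    sentinel = float(raw_sentinel) if raw_sentinel is not None else None
    points = []
    for item in response.get("values", []):
        value = float(item["value"])
        if sentinel is not None and value == sentinel:
            continue  # drop placeholder readings
        points.append((item["datetime"], value))
    return points
```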

The expected use is one call to `details` for a site, to fetch
units and time ranges for each variable, and then a call to
`values` for each variable in the site, fetching its values for
the given time range.


@arottersman arottersman left a comment

+1, was able to test with a variety of NWISGW sites. NWISDV sites like 01467200 weren't going so well, but as you said in person, it's probably the underlying API, not us.

@arottersman arottersman assigned rajadain and unassigned arottersman Oct 9, 2017
@rajadain
Member Author

I was waiting to see if #2354 led to any major changes requiring tweaking here, but so far it doesn't look like it. I'm going to merge this in, and retarget #2354 to develop. Thanks for taking a look!

@rajadain rajadain merged commit 5dc6be2 into develop Oct 10, 2017
@rajadain rajadain deleted the tt/bigcz-cuahsi-detail-values-backend branch October 10, 2017 18:54
rajadain added a commit that referenced this pull request Oct 12, 2017
…lues-frontend

BiG-CZ: Fetch and Render CUAHSI Variable Values

Builds on #2353
Connects #2243
Connects #2238