Provide Opendap / THREDDS server for netcdf / hdf5 data #155

Closed
dlebauer opened this issue Aug 25, 2016 · 24 comments

@dlebauer
Member

Description

Provide access to netCDF data via a THREDDS server.

The first two files to post would be /projects/arpae/met/narr/all.nc and /projects/arpae/met/cruncep/all.nc. These could be provided within /projects/arpae/terraref/derived_data/, but they don't fit the 'site' concept: they are North American (NARR) and global (CRUNCEP) products.

We should also provide access to the environmental and hyperspectral data via this endpoint.

Context

In issue #89 it was decided that it would be straightforward to deploy an OPeNDAP server on ROGER to provide users with access to netCDF data and subsetting.

The met data will be useful for researchers evaluating environmental influences and predicting biomass production in different regions.

We can add other sensor data streams here:

  • Field scanner environmental sensor
  • Auxiliary met data from MAC
  • Kansas met
  • UIUC field (I will double check with Bernacchi)

Ideally the workflow would be something like

  1. query site geometry from betydb
  2. use this information + time domain to subset met products
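The two steps above can be sketched as URL construction against the public TERRA-REF endpoints. This is a minimal sketch, assuming a BETYdb API key and a THREDDS NetcdfSubset (ncss) service; the dataset path and variable name are hypothetical placeholders, not actual deployment values.

```python
import urllib.parse

BETYDB_SITES = "https://terraref.ncsa.illinois.edu/bety/api/beta/sites"
NCSS_BASE = "https://terraref.ncsa.illinois.edu/thredds/ncss"


def sites_query_url(api_key):
    """Step 1: URL that returns site records (including geometry) from BETYdb."""
    return BETYDB_SITES + "?" + urllib.parse.urlencode({"key": api_key})


def ncss_subset_url(dataset_path, var, north, south, east, west, t0, t1):
    """Step 2: NetcdfSubset request for one variable over a site's bounding box."""
    params = {
        "var": var,
        "north": north, "south": south, "east": east, "west": west,
        "time_start": t0, "time_end": t1,
        "accept": "netcdf",
    }
    return f"{NCSS_BASE}/{dataset_path}?" + urllib.parse.urlencode(params)
```

In practice, step 1's JSON response would be parsed for the site's bounding box, which then fills the north/south/east/west parameters of step 2.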

Further Suggestions / Request for Feedback

First, estimate how difficult it will be to set this up, then we can prioritize.

@ghost

ghost commented Sep 23, 2016

@dlebauer - is this a priority for the V0 release?

@ghost ghost assigned robkooper Sep 29, 2016
@ghost

ghost commented Sep 29, 2016

@robkooper - this has been transferred to you

@ghost ghost modified the milestone: January 2017 Nov 30, 2016
@ghost

ghost commented Dec 14, 2016

There are already existing API clients for this.

@max-zilla
Contributor

max-zilla commented Jan 5, 2017

Other steps:

  • how to query THREDDS server?
  • what netCDF files do we want query-able?
  • how do we get necessary data from betyDB to execute the query?

@robkooper @dlebauer let's review this issue and make sure we have the use case defined.

@dlebauer dlebauer assigned jterstriep and unassigned JeffWhiteAZ Jan 5, 2017
@dlebauer
Member Author

dlebauer commented Jan 5, 2017

@max-zilla

what netCDF files do we want query-able?

We can start with the hyperspectral Level 1 (reflectance) and Level 2 (indices) files. The use case is 'compute plot-level statistics (starting with the mean)'.

We may also consider converting GeoTIFF to netCDF (gdal_translate seems to do this). This could facilitate subsetting and standardize the geospatial query workflow.
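Assuming GDAL's netCDF driver is available, the conversion mentioned above could be a single gdal_translate call, sketched here as a command line built in Python (the file paths are placeholders):

```python
# Build the gdal_translate command that converts a GeoTIFF to netCDF;
# the "-of netCDF" output-format flag selects GDAL's netCDF driver.
def geotiff_to_netcdf_cmd(src_tif, dst_nc):
    return ["gdal_translate", "-of", "netCDF", src_tif, dst_nc]

# e.g. run with:
#   subprocess.run(geotiff_to_netcdf_cmd("plot.tif", "plot.nc"), check=True)
```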

how do we get necessary data from BETYdb to execute the query?

The easiest way to get plot boundaries from BETYdb is with the API call https://terraref.ncsa.illinois.edu/bety/api/beta/sites?key=9999999999999999999999999999999999999999. See the API documentation and/or ask @gsrohde how to query a specific subset by partial name matching (I think something like &sitename=~Season+2).

Open questions @jterstriep, @yanliu-chn, @max-zilla:

@gsrohde

gsrohde commented Jan 5, 2017

@max-zilla The API documentation for querying is here: https://pecan.gitbooks.io/betydb-data-access/content/API/beta_API.html. See the section "Matching using regular expressions" for use of the "=~" operator.
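For example, a sites query using the "=~" regex operator described there could be assembled like this; the key and pattern are placeholders, and only spaces are escaped here (a real client should URL-encode the pattern fully):

```python
# Sketch: restrict BETYdb sites by partial sitename match via "=~".
def sites_matching_url(api_key, pattern):
    base = "https://terraref.ncsa.illinois.edu/bety/api/beta/sites"
    return f"{base}?key={api_key}&sitename=~{pattern.replace(' ', '+')}"
```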

@dlebauer
Member Author

@jterstriep could you please meet with @robkooper to flesh out the steps required to implement this?

@ghost

ghost commented Jan 12, 2017

Need use cases to proceed (organized by hierarchy; is grouping needed?). Organize by day.

Start with the hyperspectral files.

The THREDDS server needs to be configured, or a Java script written, first - about one week of work.

@dlebauer
Member Author

@ashiklom could you define a few use cases?

@ghost

ghost commented Jan 12, 2017

@jterstriep are there other options? is there a new THREDDS version available? @czender?


@ashiklom
Member

ashiklom commented Jan 12, 2017

Off the top of my head...

  • Download only one reflectance spectrum for a particular measurement
  • For a set of species, under a particular set of treatments, download only visible reflectance (400:700 nm)
  • For all species, download the green- and red-edge wavelengths for calculating my own custom vegetation index.
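The second use case above boils down to turning a wavelength range into an index range, so that only that hyperslab is requested from the server. A sketch, assuming the file carries a 1-D wavelength coordinate in nm:

```python
def wavelength_slice(wavelengths, lo=400.0, hi=700.0):
    """Return the (first, last) band indices whose wavelength lies in [lo, hi]."""
    hits = [i for i, w in enumerate(wavelengths) if lo <= w <= hi]
    if not hits:
        raise ValueError(f"no bands between {lo} and {hi} nm")
    return hits[0], hits[-1]

# The resulting pair maps directly onto an OPeNDAP hyperslab, e.g.
# reflectance[first:last] in the DAP constraint expression.
```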

@dlebauer
Member Author

dlebauer commented Feb 2, 2017

@robkooper and @jterstriep could you please provide an ETA, convert this to an epic and create smaller issues if necessary?

@robkooper
Member

THREDDS is running, but not configured yet:
https://terraref.ncsa.illinois.edu/thredds/

@dlebauer
Member Author

dlebauer commented Feb 3, 2017 via email

@ghost ghost removed the help wanted label Feb 9, 2017
@ghost ghost modified the milestones: January 2017, February 2017 Feb 13, 2017
@ghost ghost removed this from the January 2017 milestone Feb 13, 2017
@dlebauer
Member Author

FYI, one use case is remote visualization. I'm going to check out Panoply today: https://www.giss.nasa.gov/tools/panoply/

@dlebauer
Member Author

dlebauer commented Mar 2, 2017

@robkooper does this need to be broken down into smaller issues?

@ghost

ghost commented Mar 2, 2017

No. - Rob

@ghost ghost removed the help wanted label Mar 2, 2017
@dlebauer
Member Author

dlebauer commented May 3, 2017

Problem

There is no easy way to update the server as new files come in.

@robkooper, can you please contact the developers?

@dlebauer
Member Author

dlebauer commented May 3, 2017

@robkooper can you start with something that is static by the end of April?

@dlebauer dlebauer modified the milestones: May 2017, February 2017 May 3, 2017
@ghost

ghost commented Jun 1, 2017

Use THREDDS 4.6, not 5.0.

Rob is still working on this.

@max-zilla
Contributor

@robkooper has been building the PEcAn VM, hopefully finishing this week; then he will talk to @jdmaloney about a good way to list the .nc files on ROGER.

@max-zilla max-zilla modified the milestones: June 2017, May 2017 Jun 22, 2017
@max-zilla
Contributor

@robkooper and @jdmaloney should discuss.

@robkooper
Member

The following script is run every day at midnight; the list of files is created by JD.

See https://terraref.ncsa.illinois.edu/thredds/catalog.html

#!/bin/bash

cat << EOF
<?xml version="1.0" encoding="UTF-8"?>
<catalog name="THREDDS Server Default Catalog : You must change this to fit your server!"
         xmlns="http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0"
         xmlns:xlink="http://www.w3.org/1999/xlink"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0
           http://www.unidata.ucar.edu/schemas/thredds/InvCatalog.1.0.6.xsd">

  <service name="all" base="" serviceType="compound">
    <service name="odap" serviceType="OpenDAP" base="/thredds/dodsC/" />
    <service name="dap4" serviceType="DAP4" base="/thredds/dap4/" />
    <service name="http" serviceType="HTTPServer" base="/thredds/fileServer/" />
    <!--service name="wcs" serviceType="WCS" base="/thredds/wcs/" /-->
    <!--service name="wms" serviceType="WMS" base="/thredds/wms/" /-->
    <service name="ncss" serviceType="NetcdfSubset" base="/thredds/ncss/" />
  </service>

  <datasetRoot path="uamac" location="/media/roger/sites/ua-mac/Level_1/hyperspectral"/>

  <dataset name="TERRA" ID="TERRA">

EOF

# INPUTS
echo '    <dataset name="UAMac" ID="UAMac">'
IFS=$'\n'
LAST_DATE=""
LAST_TIME=""
sort /media/roger/sites/ua-mac/Level_1/hyperspectral/nc_files | while read X; do
    # strip all whitespace, then drop the leading "./" from the path
    X="${X//[[:space:]]/}"
    X="${X:2}"

    DATE=$( echo "$X" | cut -d "/" -f1 )
    if [ "$DATE" != "$LAST_DATE" ]; then
      if [ "$LAST_DATE" != "" ]; then
        echo '        </dataset>'
        echo '      </dataset>'
      fi
      LAST_TIME=""
      LAST_DATE="$DATE"
      echo "      <dataset name=\"${DATE}\" ID=\"${DATE}\">"
    fi

    TIME=$( echo "$X" | cut -d "/" -f2)
    if [ "$TIME" != "$LAST_TIME" ]; then
      if [ "$LAST_TIME" != "" ]; then
        echo '        </dataset>'
      fi
      LAST_TIME="$TIME"
      NAME=$( echo "$TIME" | cut -d "_" -f3 | tr "-" ":" )
      echo "        <dataset name=\"${NAME}\" ID=\"${TIME}\">"
    fi

    NAME=$( echo "$X" | cut -d "/" -f3 )
    echo "          <dataset name=\"${NAME}\" ID=\"${X}\" urlPath=\"uamac/${X}\" serviceName=\"all\">"
    echo '          </dataset>'
done
echo '        </dataset>'
echo '      </dataset>'
echo '    </dataset>'

# FOOTER
echo '  </dataset>'
echo '</catalog>'
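The nightly schedule could be a cron entry along these lines (the script name and catalog path are assumptions, not the actual deployment):

```shell
# regenerate the THREDDS catalog every day at midnight
0 0 * * * /usr/local/bin/make-terra-catalog.sh > /opt/thredds/content/thredds/catalog.xml
```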
