modify temporalCoverage to support ongoing data sources #145

@mbjones

Author Name: Matt Jones (Matt Jones)
Original Redmine Issue: 1794, https://projects.ecoinformatics.org/ecoinfo/issues/1794
Original Date: 2004-12-01
Original Assignee: Matt Jones


---- Posted on behalf of Barbara Benson (bjbenson@wisc.edu) ----

I would like to raise some concerns that have arisen while developing EML
documents for the North Temperate Lakes LTER.

Our data reside in an Oracle database, and tables are updated with new data at
frequencies ranging from hourly to annually. We are creating EML documents to
describe these data, and the data can be accessed dynamically from our website.
Data from instrumented buoys are uploaded to the database every hour and are
thus accessible from our website current to within the last hour. Our problem
comes from trying to create temporal coverage for the NTL data. In order to
have valid EML, it would seem like our options are:

  1. to inaccurately describe the end date of a data set by choosing a static
    date; for example, the EML Best Practices document suggests using the end of the
    current year
  2. to choose not to populate temporal coverage, thus having data sets that
    won't be located by temporal searches
  3. to create data sets outside our database that are static
  4. to use the "kluge" solution from a previous draft of the EML Best Practices:
    setting the alternative time scale to "ongoing" and leaving the end date blank.

For data sets that are only updated annually, we are willing to create an end
date and just change that end date each year in the metadata. We have not
decided how to handle temporal coverage for data that are updated more
frequently, but none of the currently available (valid) options seems desirable.
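
To illustrate option 1 as it might apply to an annually updated data set, here is a minimal Python sketch that builds a temporalCoverage fragment whose end date is the last day of the current year. The element names (rangeOfDates, beginDate, endDate, calendarDate) are given as we understand EML's coverage module and should be checked against the schema. The function names and the begin date are hypothetical placeholders, not values from our database.

```python
import datetime
import xml.etree.ElementTree as ET


def end_of_year(today: datetime.date) -> datetime.date:
    """Return December 31 of the year containing `today`."""
    return datetime.date(today.year, 12, 31)


def temporal_coverage(begin: datetime.date, today: datetime.date) -> ET.Element:
    """Build a temporalCoverage element ending on the last day of the current year.

    Element names are assumed from EML's coverage module; verify them
    against the schema version in use.
    """
    coverage = ET.Element("temporalCoverage")
    date_range = ET.SubElement(coverage, "rangeOfDates")
    for tag, value in (("beginDate", begin), ("endDate", end_of_year(today))):
        calendar_date = ET.SubElement(ET.SubElement(date_range, tag), "calendarDate")
        calendar_date.text = value.isoformat()
    return coverage


if __name__ == "__main__":
    # The begin date below is a placeholder, not a real NTL value.
    element = temporal_coverage(datetime.date(2000, 1, 1), datetime.date.today())
    print(ET.tostring(element, encoding="unicode"))
```

Regenerating the fragment once a year keeps the document valid, but the stated end date remains inaccurate for data that arrive hourly, which is the concern raised above.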

The current focus for creation of EML documents is to harvest them to the
Metacat at the LTER Network Office. The rationale for this harvest is to
support the data discovery functionality through Metacat across the LTER
datasets. Given the well-developed functionality of the NTL dynamic database
access and the capability of capturing information about users accessing the NTL
data, we want the EML documents to point to our dynamic database access system
for each data set. Therefore, we don't find the creation of a static dataset a
viable option at the present time when our higher level of functionality is not
available centrally and not likely to become available in the near future.

To me the problems with creating temporal coverage for an ongoing data set
highlight what I perceive to be a more general problem regarding the
conceptualization of what objects EML is designed to describe. The set of
objects needs to be bigger than static data sets. There are other data sources
that need metadata description, e.g., database tables that are frequently
updated, data streams from sensor networks. Some features of the current
version of EML seem to be limited by this "static dataset" paradigm. It isn't
hard to envision applications for EML attached to data streams.

We would appreciate your response to these issues. We think the next version of
EML should accommodate ongoing data sets and allow the end date to be blank.

Thanks,
Barbara Benson
