modify temporalCoverage to support ongoing data sources #145
Comments
Original Redmine Comment We have discussed the problems associated with ongoing data. In particular, see http://www.ecoinformatics.org/pipermail/eml-dev/2004-October/001027.html We need to come to a resolution on this issue.
Original Redmine Comment Targeting for 2.1.0, although this may drop back to unspecified.
Original Redmine Comment Looks like the links to previous EML-dev discussion threads were wrong in Comment #1. The actual discussion occurred here: http://mercury.nceas.ucsb.edu/ecoinformatics/pipermail/eml-dev/2004-October/001030.html
The synopsis of the discussion is this: I maintain that temporal coverage should refer to the data that now exist (even in a dynamic database) and can be retrieved at the time the metadata are queried, not to what the sampling protocol is. This allows queries to use the coverage information to accurately retrieve relevant data. Information about future intended sampling that has not yet occurred should go in sampling design descriptions, which will tell people what is intended but not make coverage inaccurate if plans change. In contrast, others feel that it is OK to have a 'null' field or to hack the field type and put in 'ongoing' or the like. My feeling is that doing this makes it indeterminate whether a search engine should return a data set for any given temporal search, and therefore reduces the search effectiveness of the metadata.
For example, if someone has a so-called 'dynamic' database and enters a metadata record in 2002 saying that data span "2000- ", should a search engine return a 'hit' for a search for data from 2008? What if the research project ended and those data weren't collected after 2004? Would we forever be obliged to return that data set as a hit, even for a search for data in 2020? To me this is an issue more about metadata accuracy and update frequency than anything else.
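The search ambiguity described in the comment above can be illustrated with a small Python sketch. It is not how Metacat or any other search engine actually evaluates temporal coverage; the function name `coverage_matches` and the use of `None` for a blank end date are assumptions made for illustration only.

```python
from datetime import date
from typing import Optional


def coverage_matches(
    begin: date,
    end: Optional[date],
    query_begin: date,
    query_end: date,
) -> bool:
    """Return True if a coverage interval overlaps a query interval.

    end=None is a hypothetical way to model a blank ("ongoing") end date.
    """
    if end is None:
        # Open-ended coverage: every query that ends on or after the
        # begin date is a hit, whether or not data were ever collected.
        return begin <= query_end
    # Closed coverage: standard inclusive interval-overlap test.
    return begin <= query_end and query_begin <= end


# A record entered in 2002 as "2000- " is still a hit for a 2020 search.
ongoing_hit = coverage_matches(
    date(2000, 1, 1), None, date(2020, 1, 1), date(2020, 12, 31)
)

# The same record with an accurate end date of 2004 is not a hit for 2008.
closed_hit = coverage_matches(
    date(2000, 1, 1), date(2004, 12, 31), date(2008, 1, 1), date(2008, 12, 31)
)
```

Under these assumptions `ongoing_hit` is `True` and `closed_hit` is `False`, which is the indeterminacy the comment raises: the open-ended record matches any later search regardless of whether sampling actually continued.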
Original Redmine Comment Summary of 2008-09-30 discussion (Matt Jones, Margaret O'Brien, James Brunt, Mark Servilla, Inigo San Gil, Chris Jones, Corinna Gries, Ken Ramsey) Consensus:
Comments and observations:
Possible changes to EML:
A subset of eml-dev members will consider these issues:
Original Redmine Comment This bug is re-targeted for EML 2.2.
Original Redmine Comment Original Bugzilla ID was 1794.
Author Name: Matt Jones (Matt Jones)
Original Redmine Issue: 1794, https://projects.ecoinformatics.org/ecoinfo/issues/1794
Original Date: 2004-12-01
Original Assignee: Matt Jones
---- Posted on behalf of Barbara Benson (bjbenson@wisc.edu) ----
I would like to raise some concerns that have arisen while developing EML
documents for the North Temperate Lakes LTER.
Our data reside in an Oracle database, and tables are updated with new data at
frequencies ranging from hourly to annually. We are creating EML documents to
describe these data, and the data can be accessed dynamically from our website.
Data from instrumented buoys are uploaded to the database every hour and are
thus accessible from our website current to within the last hour. Our problem
comes from trying to create temporal coverage for the NTL data. In order to
have valid EML, it would seem like our options are:
date; for example, the EML Best Practices document suggests using the end of the
current year
won't be located by temporal searches
using the alternative time scale as "ongoing" and leaving the end date blank.
For data sets that are only updated annually, we are willing to create an end
date and just change that end date each year in the metadata. We have not
decided how to handle temporal coverage for data that are updated more
frequently but none of the currently available (valid) options seems desirable.
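As a rough illustration of the end-of-current-year workaround mentioned above, the placeholder end date could be computed as in the following Python sketch. The function name `placeholder_end_date` is hypothetical, and the sketch assumes only what the message states, namely that the end of the current year is used as the end date.

```python
from datetime import date
from typing import Optional


def placeholder_end_date(today: Optional[date] = None) -> date:
    """Return December 31 of the current year as a stand-in end date.

    `today` can be passed explicitly to make the result reproducible;
    otherwise the system date is used.
    """
    current = today if today is not None else date.today()
    return date(current.year, 12, 31)


# Metadata generated on 2004-12-01 would carry this end date.
end_2004 = placeholder_end_date(date(2004, 12, 1))
```

A value produced this way has to be regenerated every year, and for most of the year it claims coverage for dates on which no data exist yet.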
The current focus for creation of EML documents is to harvest them to the
Metacat at the LTER Network Office. The rationale for this harvest is to
support the data discovery functionality through Metacat across the LTER
datasets. Given the well developed functionality of the NTL dynamic database
access and the capability of capturing information about users accessing the NTL
data, we want the EML documents to point to our dynamic database access system
for each data set. Therefore, we don't find the creation of a static dataset a
viable option at the present time when our higher level of functionality is not
available centrally and not likely to become available in the near future.
To me the problems with creating temporal coverage for an ongoing data set
highlight what I perceive to be a more general problem regarding the
conceptualization of what objects EML is designed to describe. The set of
objects needs to be bigger than static data sets. There are other data sources
that need metadata description, e.g., database tables that are frequently
updated, data streams from sensor networks. Some features of the current
version of EML seem to be limited by this "static dataset" paradigm. It isn't
hard to envision applications for EML attached to data streams.
We would appreciate your response to these issues. We think the next version of
EML should accommodate ongoing data sets and allow the end date to be blank.
thanks
Barbara Benson