Initial conversion of spec to markdown.
Lots of reorganization remaining.
mbjones committed Nov 22, 2018
1 parent ef1f73d commit ce96534
Showing 4 changed files with 1,038 additions and 0 deletions.
3 changes: 3 additions & 0 deletions docs/_bookdown.yml
@@ -6,6 +6,9 @@ rmd_files: ["index.Rmd",
"../README.md",
"contributors.md",
"eml-220info.md",
"ch1-spec-preface.md",
"ch2-spec-overview.md",
"ch3-spec-architecture.md",
"release-notes.md",
"about-this-book.md",
"references.md"]
93 changes: 93 additions & 0 deletions docs/ch1-spec-preface.md
@@ -0,0 +1,93 @@
# Chapter 1. Preface

## Introduction

The Ecological Metadata Language (EML) is a metadata standard developed
by the ecology discipline and for the ecology discipline. It is based on
prior work done by the Ecological Society of America and associated
efforts (Michener et al., 1997, Ecological Applications). EML is
implemented as a series of XML document types that can be used in a
modular and extensible manner to document ecological data. Each EML
module is designed to describe one logical part of the total metadata
that should be included with any ecological dataset.

## Purpose Statement

To provide the ecological community with an extensible, flexible,
metadata standard for use in data analysis and archiving that will allow
automated machine processing, searching and retrieval.

## Features

The architecture of EML was designed to serve the needs of the
ecological community, and has benefitted from previous work in other
related metadata languages. EML has adopted the strengths of many of
these languages, but also addresses a number of shortcomings that have
proved to inhibit the automated processing and integration of dataset
resources via their metadata.

The following list represents some of the features of EML:

- Modularity: EML was designed as a collection of modules rather than
one large standard to facilitate future growth of the language in
both breadth and depth. By implementing EML with an extensible
architecture, groups may choose which of the core modules are
pertinent to describing their data, literature, and software
resources. Also, if EML falls short in a particular area, it may be
extended by creating a new module that describes the resource (e.g.
a detailed soils metadata profile that extends eml-dataset). The
intent is to provide a common set of core modules for information
exchange, but to allow for future customizations of the language
without the need to go through a lengthy 'approval' process.

- Detailed Structure: EML strives to balance the burden of too much
detail against the need for enough detail to enable advanced
services that process data by parsing the accompanying metadata.
Therefore, a driving question throughout the design was: 'Will this
particular piece of information be machine-processed, just human
readable, or both?' Information was then broken down into more
highly structured elements when the answer involved machine
processing.

- Compatibility: EML adopts much of its syntax from other
metadata standards that have evolved from the expertise of groups in
other disciplines. Whenever possible, EML adopted entire trees of
information in order to facilitate conversion of EML documents into
other metadata languages. EML was designed with the following
standards in mind: the Dublin Core Metadata Initiative, the Content
Standard for Digital Geospatial Metadata (CSDGM, from the US
Geological Survey's Federal Geographic Data Committee (FGDC)), the
Biological Profile of the CSDGM (from the National Biological
Information Infrastructure), the International Organization for
Standardization's Geographic Information Standard (ISO 19115), the ISO
8601 Date and Time Standard, the OpenGIS Consortium's Geography
Markup Language (GML), the Scientific, Technical, and Medical Markup
Language (STMML), and the Extensible Scientific Interchange Language
(XSIL).

- Strong Typing: EML is implemented in an XML-based language
known as [XML Schema](http://www.w3.org/XML/Schema), which
defines the rules that govern the EML syntax. XML
Schema is an internet recommendation from the [World Wide Web
Consortium](http://www.w3.org), and so a metadata document that is
said to comply with the syntax of EML will structurally meet the
criteria defined in the XML Schema documents for EML. Over and above
the structure (what elements can be nested within others,
cardinality, etc.), XML Schema provides the ability to use strong
data typing within elements. This allows for finer validation of the
contents of the element, not just its structure. For instance, an
element may be of type 'date', and so the value that is inserted
in the field will be checked against XML Schema's definition of a
date. Traditionally, XML documents (including previous versions of
EML) have been validated against Document Type Definitions (DTDs),
which do not provide a means to employ strong validation on field
values through typing.

- Content Model vs. Syntax: There is a distinction between the content model (i.e. the concepts
behind the structure of a document - which fields go where,
cardinality, etc.) and the syntactic implementation of that model
(the technology used to express the concepts defined in the content
model). The normative sections below define the content model and
the XML Schema documents distributed with EML define the syntactic
implementation. For the foreseeable future, XML Schema will be the
syntactic specification, although it may change later.
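
The difference between structure-only validation and strong typing, described under "Strong Typing" above, can be illustrated with a rough sketch. This is not how EML documents are actually validated: a real validator would apply the XML Schema documents distributed with EML. The element names `dataset`, `title`, and `pubDate` below are hypothetical placeholders, and Python's `date.fromisoformat` is used only as an approximation of XML Schema's date type.

```python
import xml.etree.ElementTree as ET
from datetime import date

# Hypothetical fragment: the element names are illustrative and are not
# taken from the EML schemas. The date value is deliberately invalid.
DOC = """
<dataset>
  <title>Soil cores</title>
  <pubDate>2018-13-45</pubDate>
</dataset>
"""

REQUIRED_CHILDREN = ("title", "pubDate")


def structure_ok(root):
    """DTD-style check: required children exist; their values are ignored."""
    return root.tag == "dataset" and all(
        root.find(name) is not None for name in REQUIRED_CHILDREN
    )


def typed_ok(root):
    """Typed check: structure plus pubDate must parse as a calendar date."""
    if not structure_ok(root):
        return False
    try:
        # Approximation of a date type check (assumption: plain YYYY-MM-DD).
        date.fromisoformat(root.findtext("pubDate").strip())
    except ValueError:
        return False
    return True


root = ET.fromstring(DOC.strip())
print("structure only:", structure_ok(root))
print("with typing:   ", typed_ok(root))
```

In this sketch the structural check accepts the fragment because both child elements are present, while the typed check rejects it because month 13 is not a valid date. That is the kind of error a DTD cannot catch and a typed schema can.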