Scope of the Global Names Index portal

dremsen edited this page Sep 14, 2010 · 8 revisions

Physical Components of the GNI portal

The current GNI web application functions as an access point for “GNI indexes.” A GNI index is a simple file format that pairs a name, as used in an arbitrary unit of content (e.g., a publication title, a gene sequence, a species page entry, a taxonomic data record), with a link (URL). The prototype developed between EoL and GBIF has the following components:

  1. A data registry that allows a user to authenticate and register one or more resources. The registry includes a field for identifying the location of a metadata record that defines the resource.
  2. A metadata profile for describing the resource that includes a field for identifying the location of a dataset formatted to the GNI schema. A minimal metadata profile was created because GBIF was still defining a more robust profile, which will be supported.
  3. A harvesting method that, on a provided schedule, accesses the dataset and parses it into an internal GNI database.
  4. A web interface that allows resources to be identified and both resource metadata and data to be searched, browsed, and viewed.
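
The harvesting step (component 3) can be pictured with a minimal sketch. It assumes, purely for illustration, that a GNI index is a tab-separated text file with one name and one URL per line; the actual GNI schema is not described on this page. `IndexRecord`, `parse_index`, and the sample data are hypothetical names, not part of the GNI code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexRecord:
    """One entry of a GNI index: a name as used in some content, plus a link."""
    name: str
    url: str


def parse_index(text):
    """Parse 'name<TAB>url' lines into IndexRecord objects.

    The tab-separated layout is an assumption made for this sketch, not the
    real GNI schema. Blank and malformed lines are skipped.
    """
    records = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        parts = line.split("\t")
        if len(parts) != 2:
            continue  # skip lines that do not have exactly two fields
        name, url = (part.strip() for part in parts)
        if name and url:
            records.append(IndexRecord(name, url))
    return records


# Hypothetical dataset content, as a harvester might fetch it.
sample = (
    "Puma concolor\thttp://example.org/species/1\n"
    "\n"
    "Puma concolor\thttp://example.org/publication/7\n"
    "malformed line without a link\n"
)
harvested = parse_index(sample)
```

Here `harvested` holds two records for the same name, each pointing at a different unit of content.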


At the Nomina II workshops in Woods Hole we proposed to develop a GNI prototype that would serve as a resource discovery index for the data resources associated with participating datasets (IPNI, ITIS, Tropicos, Index Fungorum, etc.). It was implied that we would do more than index resources associated with "name providers" (those who serve information about names). We would index any resource that provided a resolvable identifier and a text string that represented an organism identifier.

I put these in for starters. Add more and correct if I err. Maybe it makes more sense to have these within the API document (DPR)

Questions the GNI can answer:

For a registered resource, how many distinct taxon names does it serve?

For a registered resource, how many distinct records does it serve?
p(. A resource may serve more than one link tied to the same name.

For a registered resource, what different kinds of links are provided?
p(. The schema distinguishes URLs from Globally Unique Identifiers that can be further parsed and identified as LSIDs, DOIs, etc.

Given a taxon name, list all records that are linked to that name.

Given a taxon name, list all resources that have records that are linked to that name.
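
The five questions above can be sketched over a small in-memory table. This is an illustration under assumptions, not the GNI implementation: the `(resource, name, link)` tuple layout, the function names, and the sample rows (including the placeholder LSID and DOI) are hypothetical. The link classifier relies only on the general shapes of the identifiers: an LSID starts with `urn:lsid:` and a DOI name starts with a `10.` prefix.

```python
import re

# Hypothetical harvested records: (resource, name, link).
RECORDS = [
    ("resource-a", "Puma concolor", "http://example.org/a/1"),
    ("resource-a", "Puma concolor", "urn:lsid:example.org:names:42"),
    ("resource-a", "Lynx rufus", "doi:10.5555/example-3"),
    ("resource-b", "Puma concolor", "http://example.org/b/9"),
]

# A DOI name is "10.<registrant>/<suffix>", optionally written with "doi:".
_DOI_RE = re.compile(r"^(?:doi:)?10\.\d{4,9}/\S+$", re.IGNORECASE)


def classify_link(link):
    """Return 'LSID', 'DOI', 'URL', or 'unknown' for a link string."""
    value = link.strip()
    lowered = value.lower()
    if lowered.startswith("urn:lsid:"):
        # Layout: urn:lsid:<authority>:<namespace>:<object>[:<revision>]
        parts = value.split(":")
        return "LSID" if len(parts) in (5, 6) and all(parts[2:5]) else "unknown"
    if _DOI_RE.match(value):
        return "DOI"
    if lowered.startswith(("http://", "https://")):
        return "URL"
    return "unknown"


def distinct_name_count(records, resource):
    """How many distinct names does the resource serve?"""
    return len({name for res, name, _ in records if res == resource})


def record_count(records, resource):
    """How many records does the resource serve?"""
    return sum(1 for res, _, _ in records if res == resource)


def link_kinds(records, resource):
    """Which kinds of links does the resource provide?"""
    return sorted({classify_link(link) for res, _, link in records if res == resource})


def records_for_name(records, name):
    """All records linked to exactly this name string."""
    return [rec for rec in records if rec[1] == name]


def resources_for_name(records, name):
    """All resources with at least one record linked to this name string."""
    return sorted({res for res, rec_name, _ in records if rec_name == name})
```

With the sample rows, `distinct_name_count(RECORDS, "resource-a")` is 2 while `record_count(RECORDS, "resource-a")` is 3, which reflects the note that a resource may serve more than one link tied to the same name. Note that a DOI written as an `https://` resolver address would be classified as a plain URL by this simple classifier.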

Questions that the current GNI alone cannot answer:

Given a higher taxonomic group (e.g., Aves), show all resources containing records linked to names that belong to that group.

Given a taxon name, find all records that are tied to that name or to different synonym classes related to that name.

Which names in the index are authoritative or verified?
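
The synonym question above shows why the index alone falls short: it matches name strings exactly and holds no relationships between names. A minimal sketch, assuming a synonym table supplied by some external source (the `SYNONYMS` mapping, the function names, and the sample rows are hypothetical):

```python
# Hypothetical harvested records: (resource, name, link).
RECORDS = [
    ("resource-a", "Puma concolor", "http://example.org/a/1"),
    ("resource-b", "Felis concolor", "http://example.org/b/9"),
]

# Synonymy from an external source; the index itself holds no such data.
SYNONYMS = {
    "Puma concolor": {"Felis concolor"},
}


def exact_matches(records, name):
    """What the index alone supports: exact name-string matching."""
    return [rec for rec in records if rec[1] == name]


def matches_with_synonyms(records, name, synonyms):
    """Expand the query with externally supplied synonyms before matching."""
    wanted = {name} | synonyms.get(name, set())
    return [rec for rec in records if rec[1] in wanted]
```

An exact match on "Puma concolor" finds only the first record; the second is reached only once the external synonym table is consulted. The higher-taxon question has the same shape, with a classification in place of the synonym table.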