# Metadata

## Overview
At its core, metadata is data about data.  In day-to-day GIS data management workflows, data is created, updated, archived and used in various decision support systems.  The information management lifecycle of data includes maintenance, protection and preservation, as well as facilitating discovery.  Metadata helps meet these requirements.

## Core concepts
Documentation is critical in order to describe:

- who is responsible and who to contact for the data
- what the data represents (features, grids, etc.)
- where the data is located
- when the data was created and updated, and what time span it covers
- why the data exists
- how the data was generated

## Standards
Numerous standards exist for documenting data.  The [Dublin Core](https://dublincore.org) standard provides 15 core elements to describe any resource.  The [OGC Catalogue Service for the Web](https://opengeospatial.org/standards/cat) leverages Dublin Core in providing a core metadata model for geospatial catalogues and search.
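
To make the Dublin Core elements more concrete, here is a minimal sketch that builds a small XML record with Python's standard library.  The `record` wrapper element, the `build_dc_record` helper and the sample values are assumptions made for illustration; only the element names and the `dc` namespace URI come from Dublin Core itself.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core Metadata Element Set
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)

# Element names defined by the Dublin Core Metadata Element Set
DC_ELEMENTS = (
    "contributor", "coverage", "creator", "date", "description",
    "format", "identifier", "language", "publisher", "relation",
    "rights", "source", "subject", "title", "type",
)


def build_dc_record(values):
    """Build an XML string with one dc:* child per supplied value.

    The <record> wrapper element is a hypothetical container,
    not part of Dublin Core.
    """
    root = ET.Element("record")
    for name, text in values.items():
        if name not in DC_ELEMENTS:
            raise ValueError(f"not a Dublin Core element: {name}")
        child = ET.SubElement(root, f"{{{DC_NS}}}{name}")
        child.text = text
    return ET.tostring(root, encoding="unicode")


# Sample values are made up for illustration
xml_record = build_dc_record({
    "title": "Countries",
    "creator": "Example Mapping Agency",
    "date": "2021-06-15",
})
print(xml_record)
```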

The geospatial community has made long-standing efforts to develop metadata standards for geospatial data, including (but not limited to) [FGDC CSDGM](https://www.fgdc.gov/metadata/csdgm-standard), [DIF](https://earthdata.nasa.gov/esdis/eso/standards-and-references/directory-interchange-format-dif-standard), and [ISO 19115](https://www.iso.org/standard/26020.html).

Recently, [JSON](https://json.org) and [GeoJSON](https://geojson.org) have proliferated across the geospatial ecosystem for lightweight data exchange over the web.  Metadata is no exception here; the [OGC API](https://ogcapi.org) and [STAC](https://stacspec.org) efforts have focused on JSON as a core representation of geospatial metadata.
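
The sketch below shows what a JSON metadata record built on a GeoJSON `Feature` can look like, using only the standard library.  The property names and values are illustrative assumptions and are not guaranteed to match the OGC API - Records or STAC schemas; `ring_bbox` is a hypothetical helper.

```python
import json


def ring_bbox(ring):
    """Return [minx, miny, maxx, maxy] for a list of [x, y] positions."""
    xs = [position[0] for position in ring]
    ys = [position[1] for position in ring]
    return [min(xs), min(ys), max(xs), max(ys)]


# A GeoJSON Feature carrying descriptive metadata in its properties.
# Property names are illustrative, not taken from a specific schema.
record = {
    "type": "Feature",
    "id": "countries",
    "geometry": {
        "type": "Polygon",
        "coordinates": [[
            [-180, -90], [180, -90], [180, 90], [-180, 90], [-180, -90],
        ]],
    },
    "properties": {
        "title": "Countries",
        "description": "Country boundaries as polygon features",
    },
}

# GeoJSON allows an optional bbox member alongside the geometry
record["bbox"] = ring_bbox(record["geometry"]["coordinates"][0])

encoded = json.dumps(record, indent=2)  # serialize for exchange over the web
decoded = json.loads(encoded)           # any JSON client can read it back
print(encoded)
```

Because the record is a regular GeoJSON `Feature`, generic GeoJSON tooling can read its geometry without knowing anything about the metadata properties.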

Whichever standard you require or choose, standards-based geospatial metadata integrates easily into geospatial search catalogues and desktop GIS tools, helping to organize, categorize and find geospatial data.  The main challenge of geospatial metadata is its complexity.  Tools are needed to easily create and manage geospatial metadata.

## Easy metadata workflows with pygeometa
[pygeometa](https://geopython.github.io/pygeometa) provides a lightweight toolkit allowing users to easily create geospatial metadata in standards-based formats using simple configuration files (affectionately called metadata control files [MCF]).  Leveraging the simple but powerful YAML format, pygeometa can generate metadata in numerous standards.  Users can also create their own custom metadata formats which can be plugged into pygeometa for custom metadata format output.

For developers, pygeometa provides an intuitive Python API for integrating metadata generation tightly into their own systems.

## Creating metadata


Let's walk through examples of using pygeometa on the command line as well as through the API.

Let's start with the CLI below.

In [None]:
!pygeometa

In [None]:
!cat ../data/countries.yml

In [None]:
!pygeometa metadata generate ../data/countries.yml --schema iso19139 --output /tmp/countries.xml

In [None]:
!cat /tmp/countries.xml

Now let's output the metadata as an OGC API - Records metadata record.  Note that the record is represented as JSON, which is central to the emerging OGC API standards, and builds on GeoJSON, enabling broad interoperability.

In [None]:
!pygeometa metadata generate ../data/countries.yml --schema oarec-record

Now let's use the API to make some updates.  First, read the MCF into a Python `dict`:

In [None]:
from pygeometa.core import read_mcf
mdata = read_mcf('../data/countries.yml')
mdata

In [None]:
mdata['identification']['title']

Let's change the dataset title

In [None]:
mdata['identification']['title'] = 'Countries of the world'

Now let's select ISO 19139 as the output schema

In [None]:
from pygeometa.schemas.iso19139 import ISO19139OutputSchema
iso_os = ISO19139OutputSchema()

xml_string = iso_os.write(mdata)

Now let's inspect the `/gmd:MD_Metadata/gmd:identificationInfo/gmd:MD_DataIdentification/gmd:citation/gmd:CI_Citation/gmd:title` element in the output to see the updated title

In [None]:
print(xml_string)

Now try updating the `mdata` variable (`dict`) with updated values and use the pygeometa API to generate a new ISO XML.
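
One possible way to approach this exercise is to merge a small `dict` of changes into a copy of `mdata`, leaving the original untouched.  The sketch below uses only the standard library; `apply_updates` is a hypothetical helper, not part of pygeometa, and `mdata_example` is a stand-in for the `dict` returned by `read_mcf` (the `abstract` key is an assumption).

```python
import copy


def apply_updates(mdata, updates):
    """Return a deep copy of mdata with nested updates merged in."""
    result = copy.deepcopy(mdata)
    for key, value in updates.items():
        if isinstance(value, dict) and isinstance(result.get(key), dict):
            # Merge nested sections instead of replacing them wholesale
            result[key] = apply_updates(result[key], value)
        else:
            result[key] = copy.deepcopy(value)
    return result


# Stand-in for the dict returned by read_mcf(); values are made up
mdata_example = {
    "identification": {
        "title": "Countries",
        "abstract": "Country boundaries",
    },
}

updated = apply_updates(
    mdata_example,
    {"identification": {"title": "Countries of the world"}},
)
print(updated["identification"]["title"])
```

The merged `dict` can then be passed to `iso_os.write(...)` in the same way as `mdata` in the cells above.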

---
[<- Visualization](07-visualization.ipynb) | [Publishing ->](09-publishing.ipynb)