# Publishing

## Overview
Dissemination is a key component of any geospatial data management lifecycle, and the Internet is the main gateway for sharing your data with others.  Putting your data on the web has never been easier, with services from Google, GitHub, Amazon, Azure, DigitalOcean and others, as well as numerous tools for making data available through application programming interfaces (APIs).

Publishing geospatial data has varying degrees of complexity.  From simply posting files to a web server to provisioning services and APIs, there is no shortage of mechanisms to publish your data.

In this section we will focus on basic data and metadata publishing using [pygeoapi](https://pygeoapi.io) (which supports numerous [OGC API](https://ogcapi.org) standards) and [pycsw](https://pycsw.org) (which supports the OGC Catalogue Service for the Web [CSW] specification).  The basic workflow is as follows:

```text
Data publishing -> pygeoapi -> OGC API Clients
Metadata publishing -> pycsw -> OGC CSW and OGC API - Records Clients
```

More examples of these services, and of interacting with them remotely, are covered in Sections 10 and 11.

For this example we are using the [WMO WOUDC](https://woudc.org) list of [Ozone and UV monitoring stations](https://woudc.org/data/stations) (retrieved 2021-09-14).  The data can be found in:

- data: `../data/woudc-stations.geojson`
- metadata: `../data/woudc-stations.mcf`

## Publishing vector data

Let's inspect our current OGC API endpoint powered by pygeoapi:

http://localhost:5000/collections

To see the same listing in JSON:

http://localhost:5000/collections?f=json

You will see 10 feature collections listed.  In the JSON response, feature collections are
identified by `"itemType": "feature"` in the collection definition.
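The same check can be done programmatically.  The following is a minimal sketch using only the Python standard library; the function names and the sample document are made up for illustration, and the commented-out request assumes the code runs somewhere `http://localhost:5000` resolves to the workshop's pygeoapi instance (for example, your host machine).

```python
import json
from urllib.request import urlopen


def fetch_json(url):
    """Fetch a URL and parse the response body as JSON."""
    with urlopen(url) as response:
        return json.load(response)


def feature_collection_ids(collections_doc):
    """Return the ids of collections whose itemType is 'feature'."""
    return [
        collection["id"]
        for collection in collections_doc.get("collections", [])
        if collection.get("itemType") == "feature"
    ]


# Hypothetical, trimmed-down response used only for illustration.
sample = {
    "collections": [
        {"id": "example-features", "itemType": "feature"},
        {"id": "example-records", "itemType": "record"},
        {"id": "example-coverage"},
    ]
}
print(feature_collection_ids(sample))

# Against the workshop instance (requires the service to be running):
# doc = fetch_json("http://localhost:5000/collections?f=json")
# print(len(feature_collection_ids(doc)))
```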

Now let's add the WOUDC station data to our pygeoapi instance.

### Update pygeoapi configuration

- Using a text editor, open the file `workshop/services/pygeoapi-config.yml` in the directory in which you downloaded and extracted the workshop.  This is the runtime configuration for the pygeoapi instance at http://localhost:5000
- Jump to line 608 in the file
- Uncomment lines 608 to 632
- Save the file and exit your text editor
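If you prefer to script the edit, uncommenting a range of lines can be sketched as below.  This is only an illustration: it assumes each commented line starts with `#` in the first column, optionally followed by a single space, which may not match how the workshop configuration is commented.  The function name and the sample lines are hypothetical, and the sketch works on a list of lines rather than touching the file.

```python
def uncomment_lines(lines, start, end):
    """Strip one leading '#' (and one following space, if present) from
    lines start..end (1-based, inclusive); other lines are unchanged."""
    result = list(lines)
    for index in range(start - 1, min(end, len(result))):
        line = result[index]
        if line.startswith("# "):
            result[index] = line[2:]
        elif line.startswith("#"):
            result[index] = line[1:]
    return result


# Hypothetical three-line excerpt; the real file is much longer.
sample = [
    "resources:\n",
    "#     example-collection:\n",
    "#         type: collection\n",
]
for line in uncomment_lines(sample, 2, 3):
    print(line, end="")
```

Since YAML is sensitive to indentation, inspect the result before writing it back to disk.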

### Restart pygeoapi service

To ensure our updates are made available, we need to restart the Docker container that provides pygeoapi for this workshop:

```bash
docker restart geopython-workshop-pygeoapi
```

At this point the pygeoapi instance will provide the WOUDC stations as a feature collection.  To verify, inspect the following URLs:

http://localhost:5000/collections

Now you will see 11 feature collections listed on the resulting webpage.  To see the same listing in JSON:

http://localhost:5000/collections?f=json

Let's inspect our newly added feature collection:

http://localhost:5000/collections/woudc-stations

...and in JSON:

http://localhost:5000/collections/woudc-stations?f=json

Let's browse the items in the feature collection:

http://localhost:5000/collections/woudc-stations/items

...and in JSON:

http://localhost:5000/collections/woudc-stations/items?f=json
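The items endpoint also accepts query parameters.  As a rough sketch, the helper below builds an items URL with parameters such as `limit` (a standard OGC API - Features parameter), and a second helper computes the bounding box of the point features in a GeoJSON response.  Both function names and the sample data are hypothetical.

```python
from urllib.parse import urlencode


def items_url(base_url, collection_id, **params):
    """Build an OGC API - Features items URL with optional query parameters."""
    url = f"{base_url.rstrip('/')}/collections/{collection_id}/items"
    if params:
        url += "?" + urlencode(params)
    return url


def point_bbox(feature_collection):
    """Return (minx, miny, maxx, maxy) over all Point features, or None."""
    xs, ys = [], []
    for feature in feature_collection.get("features", []):
        geometry = feature.get("geometry") or {}
        if geometry.get("type") == "Point":
            x, y = geometry["coordinates"][:2]
            xs.append(x)
            ys.append(y)
    if not xs:
        return None
    return (min(xs), min(ys), max(xs), max(ys))


print(items_url("http://localhost:5000", "woudc-stations", f="json", limit=5))

# Hypothetical GeoJSON with made-up coordinates, for illustration only.
sample = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "geometry": {"type": "Point", "coordinates": [10, 50]}},
        {"type": "Feature", "geometry": {"type": "Point", "coordinates": [-75.5, 45.4]}},
        {"type": "Feature", "geometry": None},
    ],
}
print(point_bbox(sample))
```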

## Publishing raster data

pygeoapi can also publish raster data as coverages.  Our pygeoapi instance now has 11 collections; let's add an SRTM GeoTIFF coverage to it.

### Update pygeoapi configuration


- Using a text editor, open the file `workshop/services/pygeoapi-config.yml` in the directory in which you downloaded and extracted the workshop.  This is the runtime configuration for the pygeoapi instance at http://localhost:5000
- Jump to line 635 in the file
- Uncomment lines 635 to 660
- Save the file and exit your text editor

### Restart pygeoapi service

Let's restart the Docker container again to ensure our server configuration updates are made available:

```bash
docker restart geopython-workshop-pygeoapi
```

At this point the pygeoapi instance will provide the SRTM data as a collection of type coverage. To verify, inspect the following URLs:

http://localhost:5000/collections

You should see 12 collections at this point.  Let's inspect the SRTM collection:

http://localhost:5000/collections/srtm

Notice the "Coverage" links at the bottom of the webpage.  Let's see how this looks in the JSON response:

http://localhost:5000/collections/srtm?f=json

In the collection `links` section, notice the links where the `rel` properties start with `http://www.opengis.net/def/rel/ogc/1.0/coverage-*`.  This signifies that the collection has a coverage representation.  A client can then interact with the coverage via the OGC API - Coverages standard:

http://localhost:5000/collections/srtm/coverage/rangetype

http://localhost:5000/collections/srtm/coverage/domainset

http://localhost:5000/collections/srtm/coverage/coverage
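A client can discover these endpoints from the collection's `links` rather than hard-coding them.  Here is a minimal sketch of that lookup; the function name and the sample document are hypothetical, and the only thing taken from the response above is the link relation prefix.

```python
COVERAGE_REL_PREFIX = "http://www.opengis.net/def/rel/ogc/1.0/coverage"


def coverage_links(collection_doc):
    """Return (rel, href) pairs for links whose rel marks a coverage resource."""
    return [
        (link["rel"], link["href"])
        for link in collection_doc.get("links", [])
        if link.get("rel", "").startswith(COVERAGE_REL_PREFIX)
    ]


# Hypothetical, trimmed-down collection document for illustration.
sample = {
    "id": "demo",
    "links": [
        {"rel": "self", "href": "http://example.org/collections/demo"},
        {
            "rel": "http://www.opengis.net/def/rel/ogc/1.0/coverage-domainset",
            "href": "http://example.org/collections/demo/coverage/domainset",
        },
    ],
}
for rel, href in coverage_links(sample):
    print(rel, "->", href)
```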



## Publishing metadata

We all know that data is useless without metadata, right?  Let's use what we learned in [Section 08 - Metadata](08-metadata.ipynb) to publish a metadata record of the WOUDC stations to pycsw.


In [None]:
!pygeometa metadata generate ../data/woudc-stations.yml --schema iso19139 --output ../data/woudc-stations.xml

In [None]:
!ls -l ../data/woudc-stations.*
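Before loading the record, it can be useful to confirm which identifier it carries, since ISO 19139 records are typically looked up by their `gmd:fileIdentifier`.  The sketch below reads that element with the standard library; the function name and the inline sample record are hypothetical, and the commented-out lines assume the generated file is at the path used above.

```python
import xml.etree.ElementTree as ET

NAMESPACES = {
    "gmd": "http://www.isotc211.org/2005/gmd",
    "gco": "http://www.isotc211.org/2005/gco",
}


def iso_file_identifier(root):
    """Return the gmd:fileIdentifier text of an ISO 19139 record, or None."""
    element = root.find("gmd:fileIdentifier/gco:CharacterString", NAMESPACES)
    return element.text if element is not None else None


# Hypothetical minimal record used only to exercise the function.
sample = """<gmd:MD_Metadata
    xmlns:gmd="http://www.isotc211.org/2005/gmd"
    xmlns:gco="http://www.isotc211.org/2005/gco">
  <gmd:fileIdentifier>
    <gco:CharacterString>example-id</gco:CharacterString>
  </gmd:fileIdentifier>
</gmd:MD_Metadata>"""
print(iso_file_identifier(ET.fromstring(sample)))

# For the generated file:
# root = ET.parse("../data/woudc-stations.xml").getroot()
# print(iso_file_identifier(root))
```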

At this point, let's publish the record to the Docker container providing the pycsw service for this workshop.  Run the following command from a terminal.

```bash
docker exec -it geopython-workshop-pycsw pycsw-admin.py load-records -p /jupyter/content/data/woudc-stations.xml  -c /etc/pycsw/pycsw.cfg
```

### CSW examples

Now let's inspect the record in pycsw in the CSW default Dublin Core representation:

http://localhost:8001/csw?service=CSW&version=2.0.2&request=GetRecordById&id=woudc-stations

...via the ISO 19115:2003 representation:

http://localhost:8001/csw?service=CSW&version=2.0.2&request=GetRecordById&id=woudc-stations&outputschema=http://www.isotc211.org/2005/gmd

...using CSW 3.0 text search functionality:

http://localhost:8001/csw?service=CSW&version=3.0.0&request=GetRecords&typenames=csw:Record&q=ozone
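All three requests are plain key-value-pair (KVP) URLs, so they are easy to assemble in code.  The helper below is a hypothetical sketch (the name `csw_url` is made up); it percent-encodes parameter values, so a value such as the ISO output schema URL will look different from the hand-written link above while decoding to the same thing.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def csw_url(endpoint, request, version="2.0.2", **params):
    """Build a CSW KVP request URL from an endpoint and parameters."""
    query = {"service": "CSW", "version": version, "request": request}
    query.update(params)
    return f"{endpoint}?{urlencode(query)}"


endpoint = "http://localhost:8001/csw"

# Dublin Core record by identifier
by_id = csw_url(endpoint, "GetRecordById", id="woudc-stations")
print(by_id)

# CSW 3.0 full-text search
search = csw_url(endpoint, "GetRecords", version="3.0.0",
                 typenames="csw:Record", q="ozone")
print(search)

# Decoding the query string recovers the original values
print(parse_qs(urlsplit(search).query)["typenames"])
```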

If you have QGIS installed, use the MetaSearch plugin to:

- add the CSW at http://localhost:8001
- search the CSW for the WOUDC record

### OGC API - Records examples

Now let's view the same record through pycsw's OGC API - Records support:

http://localhost:8001/collections/metadata:main (metadata collection information)

http://localhost:8001/collections/metadata:main/queryables (queryables)

http://localhost:8001/collections/metadata:main/items (items)

http://localhost:8001/collections/metadata:main/items?f=json (items as JSON)

Note that OGC API - Records support in QGIS MetaSearch is currently pending and should
be made available in an upcoming release.  For now, let's interact with the pycsw
catalogue's OGC API - Records support via OWSLib:

In [None]:
from owslib.ogcapi.records import Records

cat = Records('http://geopython-workshop-pycsw:8000')
cat.collections()

In [None]:
catalogue_name = 'metadata:main'

my_catalogue = cat.collection(catalogue_name)

cat.collection_queryables(catalogue_name)

In [None]:
my_catalogue_query = cat.collection_items(catalogue_name)
my_catalogue_query
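The items response can be long.  To get a quick overview, the sketch below pulls out each record's identifier and title.  It assumes the response is a GeoJSON-like dictionary with a `features` list whose entries carry `id` and `properties.title`; the function name and the sample data are hypothetical.

```python
def record_titles(items_response):
    """Return (id, title) pairs for each record in an items response."""
    return [
        (record.get("id"), record.get("properties", {}).get("title"))
        for record in items_response.get("features", [])
    ]


# Hypothetical response used only for illustration.
sample = {
    "type": "FeatureCollection",
    "features": [
        {"id": "example-1", "properties": {"title": "First example record"}},
        {"id": "example-2", "properties": {}},
    ],
}
for identifier, title in record_titles(sample):
    print(identifier, "-", title)

# In the notebook, after the cell above:
# record_titles(my_catalogue_query)
```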

### OGC API and formats

Notice anything different about the response formats in the various OGC API requests?  You got it, JSON and HTML are now prevalent in these APIs, further lowering the barrier to adoption!  We'll talk a bit more about the emerging OGC API efforts in Section 11.

## Docker magic
As noted previously, we are using Docker to deploy the pygeoapi and pycsw services in an easy and robust fashion.  For the purposes of this workshop, we need to make parts of these services accessible to facilitate exercises (updating configuration, adding data/metadata).

### Local mounts
The configurations of pygeoapi and pycsw in their Docker containers are overridden by local mounts which are made available to the workshop.  As a result, changes made to these configurations from the workshop are reflected in the Docker containers.  This saves the workshop participant from logging into the Docker containers and updating configuration by hand.

### Docker command execution
Docker command execution (i.e. `docker exec` as exemplified above) allows the workshop participant to run commands on the Docker container without having to log in directly.  We use this approach to run the `pycsw-admin.py` tooling that publishes metadata from disk.

---
[<- Metadata](08-metadata.ipynb) | [Remote data ->](10-remote-data.ipynb)