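# Publishing

**Note: this section is optional and is not part of the FOSS4G 2024 workshop.**

If you have time left, or want to continue learning in your local environment, this section is still well worth working through.
Note that the service containers it relies on **only run in a local (Docker) environment**
and cannot be run from the cloud-based Binder version of the workshop.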

If you want to dive deeper into pygeoapi, we also recommend our full workshop,
[Diving into pygeoapi](https://dive.pygeoapi.io),
which is usually also offered at conferences such as FOSS4G 2023.

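## Overview

**Dissemination** is a key part of any geospatial data management lifecycle,
and the **internet** is the primary channel for sharing data with others.

Publishing data to the web has never been easier:
cloud services such as Google, GitHub, Amazon, Azure and DigitalOcean,
together with a large ecosystem of tools that publish data through **Application Programming Interfaces (APIs)**,
have greatly lowered the barrier to publishing.

Geospatial data publishing varies widely in complexity:
anything from simply placing files on a web server
to deploying full data services and APIs
is a common and workable approach.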

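In this section we will focus on a **basic data and metadata publishing workflow**, mainly using:

- [pygeoapi](https://pygeoapi.io)  
  (supports a number of [OGC API](https://ogcapi.org) standards)
- [pycsw](https://pycsw.org)  
  (supports the OGC Catalogue Service for the Web specification, **CSW** for short)

The basic workflow is therefore as follows: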

```bash
data publishing     -> pygeoapi -> OGC API clients
metadata publishing -> pycsw    -> OGC CSW and OGC API - Records clients
```

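More examples of these services, and of how to interact with them remotely,
are covered in Sections 10 and 11.

In this example we use the
[list of ozone and ultraviolet monitoring stations](https://woudc.org/data/stations)
provided by
[WMO WOUDC](https://woudc.org)
(data retrieved 2021-09-14).

The files are located at:

- data: `../data/woudc-stations.geojson`
- metadata: `../data/woudc-stations.mcf`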

## Running the services

Before you begin, you need to start the `pygeoapi` and `pycsw` services.

From a terminal (shell) in the `workshop` directory, run the following commands:

* start: `docker-compose -f docker-compose-services.yml up -d`
* stop: `docker-compose -f docker-compose-services.yml stop`

(If your Docker version requires the newer syntax, replace `docker-compose` with `docker compose`.)

## Publishing vector data

Let's inspect our current OGC API endpoint powered by pygeoapi:

http://localhost:5000/collections

To see the same listing in JSON:

http://localhost:5000/collections?f=json

Here you will see 10 feature collections listed on the resulting webpage.  Feature collections are
identified by the `"itemType": "feature"` in the collection definition in the JSON response.
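
The same check can be sketched in Python with only the standard library. This is a minimal illustration, assuming the pygeoapi service is reachable at `http://localhost:5000` as above; the helper names `fetch_collections` and `feature_collection_ids` are made up for this example and are not part of pygeoapi.

```python
import json
from urllib.request import urlopen


def fetch_collections(base_url="http://localhost:5000"):
    """Fetch the /collections document as a dict (needs the running service)."""
    with urlopen(f"{base_url}/collections?f=json") as response:
        return json.load(response)


def feature_collection_ids(collections_doc):
    """Return the ids of collections whose itemType is 'feature'."""
    return [
        c["id"]
        for c in collections_doc.get("collections", [])
        if c.get("itemType") == "feature"
    ]


# Hand-written sample shaped like the JSON response, for illustration only
sample = {
    "collections": [
        {"id": "lakes", "itemType": "feature"},
        {"id": "some-coverage"},
    ]
}
print(feature_collection_ids(sample))
```

With the services running, `feature_collection_ids(fetch_collections())` applies the same filter to the live endpoint.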

Now let's add the WOUDC station data to our pygeoapi instance.


### Access via OWSLib

Use OWSLib to access the pygeoapi OGC API endpoint from Python.
Within the Docker network, your locally running `pygeoapi` server has the address `geopython-workshop-pygeoapi`.
[Remote Data](10-remote-data.ipynb) goes into more detail on accessing spatial web services from Python.

```python
from owslib.ogcapi.features import Features
oa_feat = Features('http://geopython-workshop-pygeoapi')
oa_feat.links
```

```python
# Get collections (datasets) in endpoint
collections = oa_feat.collections()
print(f'This OGC API Features endpoint has {len(collections)} datasets')
```

```python
# Get items (paged) in Lakes collection
lakes = oa_feat.collection('lakes')
lakes_query = oa_feat.collection_items('lakes')
lakes_query['features'][0]
```

### Update pygeoapi configuration

- Using a text editor, open the file `workshop/services/pygeoapi-config.yml` in the directory where you downloaded and extracted the workshop.  This is the runtime configuration for the pygeoapi instance at http://localhost:5000
- Jump to line 608 in the file
- Uncomment lines 608 to 632
- Save the file and exit your text editor

### Restart pygeoapi service

To ensure our updates are made available, we need to restart the Docker container that provides pygeoapi for this workshop:

```bash
docker restart geopython-workshop-pygeoapi
```
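
A restarted container can take a moment before it answers requests. The sketch below shows one way to wait for it from Python, assuming the service answers at `http://localhost:5000`; `is_up` and `wait_until` are hypothetical helper names introduced here, not workshop or pygeoapi functions.

```python
import time
from urllib.request import urlopen


def is_up(url):
    """Return True if the URL answers with HTTP 200."""
    try:
        with urlopen(url, timeout=5) as response:
            return response.status == 200
    except OSError:
        # Connection refused, timeout, or an HTTP error status
        return False


def wait_until(check, attempts=10, delay=1.0):
    """Call check() until it returns True or the attempts run out."""
    for _ in range(attempts):
        if check():
            return True
        time.sleep(delay)
    return False
```

For example, `wait_until(lambda: is_up("http://localhost:5000/collections?f=json"))` returns `True` once the endpoint responds, or `False` after ten tries.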

At this point the pygeoapi instance will provide the WOUDC stations as a feature collection.  To verify, inspect the following URLs:

http://localhost:5000/collections

Now you will see 11 feature collections listed on the resulting webpage.  To see the same listing in JSON:

http://localhost:5000/collections?f=json

Let's inspect our newly added feature collection:

http://localhost:5000/collections/woudc-stations

...and in JSON:

http://localhost:5000/collections/woudc-stations?f=json

Let's browse the items in the feature collection:

http://localhost:5000/collections/woudc-stations/items

...and in JSON:

http://localhost:5000/collections/woudc-stations/items?f=json
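
The items response is GeoJSON, so it can also be inspected from Python. The following is a minimal standard-library sketch; `fetch_items` and `summarize` are illustrative names, and the sample data (including its property names) is invented rather than taken from the WOUDC file.

```python
import json
from urllib.request import urlopen


def fetch_items(base_url, collection_id, limit=10):
    """Fetch one page of items as GeoJSON (needs the running service)."""
    url = f"{base_url}/collections/{collection_id}/items?f=json&limit={limit}"
    with urlopen(url) as response:
        return json.load(response)


def summarize(feature_collection):
    """Return (features on this page, property names of the first feature)."""
    features = feature_collection.get("features", [])
    first_props = sorted(features[0]["properties"]) if features else []
    return len(features), first_props


# Hand-written sample in GeoJSON shape; the property names are made up
sample = {
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "geometry": {"type": "Point", "coordinates": [0.0, 0.0]},
            "properties": {"name": "example", "country": "XX"},
        }
    ],
}
print(summarize(sample))
```

With the services running, `summarize(fetch_items("http://localhost:5000", "woudc-stations"))` gives a quick look at the first page of stations.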

## Publishing raster data

pygeoapi also has the ability to publish raster data as coverages.  Our pygeoapi instance now has 11 collections, so let's add an SRTM GeoTIFF coverage to our pygeoapi instance.

### Update pygeoapi configuration


* Using a text editor, open the file `workshop/services/pygeoapi-config.yml` in the directory where you downloaded and extracted the workshop.  This is the runtime configuration for the pygeoapi instance at http://localhost:5000
* Jump to line 635 in the file
* Uncomment lines 635 to 660
* Save the file and exit your text editor

### Restart pygeoapi service

Let's restart the Docker container again to ensure our server configuration updates are made available:

```bash
docker restart geopython-workshop-pygeoapi
```

At this point the pygeoapi instance will provide the SRTM data as a collection of type coverage. To verify, inspect the following URLs:

http://localhost:5000/collections

You should see 12 collections at this point.  Let's inspect the SRTM collection:

http://localhost:5000/collections/srtm

Notice the "Coverage" links at the bottom of the webpage.  Let's see how this looks in the JSON response:

http://localhost:5000/collections/srtm?f=json

In the collection `links` section, notice the links where the `rel` properties start with `http://www.opengis.net/def/rel/ogc/1.0/coverage-*`.  This signifies that the collection has a coverage representation.  A client can then interact with the coverage via the OGC API - Coverages standard:

http://localhost:5000/collections/srtm/schema

http://localhost:5000/collections/srtm

http://localhost:5000/collections/srtm/coverage
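
A client can discover the coverage representation by scanning the collection's `links` for those `rel` values. Below is a small sketch of that lookup; the function name `coverage_links` and the sample document are assumptions made for illustration, not output copied from the server.

```python
COVERAGE_REL_PREFIX = "http://www.opengis.net/def/rel/ogc/1.0/coverage"


def coverage_links(collection_doc):
    """Return the hrefs of links whose rel marks a coverage representation."""
    return [
        link["href"]
        for link in collection_doc.get("links", [])
        if link.get("rel", "").startswith(COVERAGE_REL_PREFIX)
    ]


# Hand-written sample shaped like a collection document, for illustration only
sample = {
    "id": "srtm",
    "links": [
        {"rel": "self", "href": "http://localhost:5000/collections/srtm?f=json"},
        {
            "rel": "http://www.opengis.net/def/rel/ogc/1.0/coverage",
            "href": "http://localhost:5000/collections/srtm/coverage?f=json",
        },
    ],
}
print(coverage_links(sample))
```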



## Publishing metadata

We all know that data is useless without metadata, right? Let's use what we learned in [Section 08 - Metadata](08-metadata.ipynb) to publish a metadata record of the WOUDC stations to pycsw.


```python
!pygeometa metadata generate ../data/woudc-stations.yml --schema iso19139 --output ../data/woudc-stations.xml
```

```python
!ls -l ../data/woudc-stations.*
```

At this point let's publish the record to the Docker container providing the pycsw service for this workshop.  Run the following command from a terminal.

```bash
docker exec -it geopython-workshop-pycsw pycsw-admin.py load-records -p /jupyter/content/data/woudc-stations.xml  -c /etc/pycsw/pycsw.cfg
```

### CSW examples

Now let's inspect the record in pycsw in the CSW default Dublin Core representation:

http://localhost:8001/csw?service=CSW&version=2.0.2&request=GetRecordById&id=woudc-stations

...via the ISO 19115:2003 representation:

http://localhost:8001/csw?service=CSW&version=2.0.2&request=GetRecordById&id=woudc-stations&outputschema=http://www.isotc211.org/2005/gmd

...using CSW 3.0 text search functionality:

http://localhost:8001/csw?service=CSW&version=3.0.0&request=GetRecords&typenames=csw:Record&q=ozone
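
The CSW requests above are plain key-value URLs, so they are easy to build programmatically. The sketch below does so with the standard library, assuming the same endpoint `http://localhost:8001/csw`; `csw_url` is a hypothetical helper, not part of pycsw or OWSLib.

```python
from urllib.parse import urlencode

CSW_ENDPOINT = "http://localhost:8001/csw"


def csw_url(**params):
    """Build a CSW key-value-pair request URL; service=CSW is always added."""
    query = {"service": "CSW"}
    query.update(params)
    return f"{CSW_ENDPOINT}?{urlencode(query)}"


# Dublin Core (default) representation of the record
dc_url = csw_url(version="2.0.2", request="GetRecordById", id="woudc-stations")

# ISO representation: urlencode percent-encodes the schema URL
iso_url = csw_url(
    version="2.0.2",
    request="GetRecordById",
    id="woudc-stations",
    outputschema="http://www.isotc211.org/2005/gmd",
)

print(dc_url)
print(iso_url)
```

Unlike the hand-written URLs, `urlencode` percent-encodes values such as the `outputschema` URL, which keeps the request valid when a value contains reserved characters.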

If you have QGIS installed, use the MetaSearch plugin to:

- add the CSW at http://localhost:8001
- search the CSW for the WOUDC record

### OGC API - Records examples

Now, let's additionally see the record in the OGC API - Records functionality:

http://localhost:8001/collections/metadata:main (metadata collection information)

http://localhost:8001/collections/metadata:main/queryables (queryables)

http://localhost:8001/collections/metadata:main/items (items)

http://localhost:8001/collections/metadata:main/items?f=json (items as JSON)

Note that OGC API - Records support in QGIS MetaSearch is currently pending and should
be made available in an upcoming release.  For now, let's interact with the pycsw
catalogue's OGC API - Records support via OWSLib:

```python
from owslib.ogcapi.records import Records

cat = Records('http://geopython-workshop-pycsw:8000')
cat.collections()
```

```python
catalogue_name = 'metadata:main'

my_catalogue = cat.collection(catalogue_name)

cat.collection_queryables(catalogue_name)
```

```python
my_catalogue_query = cat.collection_items(catalogue_name)
my_catalogue_query
```
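
The items response is a GeoJSON-style dictionary, which can be reduced to something easier to read. This is a minimal sketch, assuming each record carries its title under `properties.title`; `record_titles` is an illustrative helper and the sample record is invented.

```python
def record_titles(items_response):
    """Map record id -> title for each record in an items response."""
    return {
        feature.get("id"): feature.get("properties", {}).get("title")
        for feature in items_response.get("features", [])
    }


# Hand-written sample shaped like an items response, for illustration only
sample = {
    "type": "FeatureCollection",
    "features": [
        {
            "id": "woudc-stations",
            "type": "Feature",
            "properties": {"title": "Example title"},
        },
    ],
}
print(record_titles(sample))
```

In the notebook, `record_titles(my_catalogue_query)` would apply the same reduction to the live catalogue response.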

### OGC API and formats

Notice anything different about the response formats in the various OGC API requests?  You got it, JSON and HTML are now prevalent in these APIs, further lowering the barrier to adoption!  We'll talk a bit more about the emerging OGC API efforts in Section 11.

## Docker magic
As noted previously, we are using Docker to be able to deploy pygeoapi and pycsw services in an easy and robust fashion.  For the purposes of this workshop, we need to be able to make parts of these services accessible to facilitate exercises (updating configuration, adding data/metadata).

### Local mounts
The configurations of pygeoapi and pycsw on their native Docker containers are overridden by local mounts which are made available to the workshop.  As a result, making changes to these configurations from the workshop results in these changes being reflected in the Docker containers.  This saves the workshop participant from logging into the Docker containers and updating configuration by hand.

### Docker command execution
Docker command execution (i.e. `docker exec`, as shown above) allows the workshop participant to run commands in the Docker container without having to log in to it directly.  We use this approach to run the `pycsw-admin.py` tooling that publishes metadata from disk.
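
The same `docker exec` call can be scripted from Python. The sketch below builds the argument list for the metadata loading command used earlier; `docker_exec_command` and `run_in_container` are hypothetical helpers, and the `-it` flags are left out on the assumption that no terminal is attached when the command is run from a script.

```python
import subprocess


def docker_exec_command(container, *command):
    """Build the argument list for running a command inside a container."""
    # No -it here: a TTY is usually not available when called from a script
    return ["docker", "exec", container, *command]


def run_in_container(container, *command):
    """Run the command in the container (requires Docker and the container)."""
    return subprocess.run(docker_exec_command(container, *command), check=True)


load_records = docker_exec_command(
    "geopython-workshop-pycsw",
    "pycsw-admin.py", "load-records",
    "-p", "/jupyter/content/data/woudc-stations.xml",
    "-c", "/etc/pycsw/pycsw.cfg",
)
print(" ".join(load_records))
```

Passing the command as a list avoids shell quoting issues, and `check=True` makes `run_in_container` raise an error if the command inside the container fails.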

---
[<- Metadata](08-metadata.ipynb) | [Remote data ->](10-remote-data.ipynb)