36 changes: 36 additions & 0 deletions .github/workflows/deploy-book.yml
@@ -0,0 +1,36 @@
# Based on https://github.com/rust-lang/mdBook/wiki/Automated-Deployment%3A-GitHub-Actions
name: Deploy mdbook
on:
  push:
    branches:
      - doc-book

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
    - uses: actions/checkout@v3
      with:
        fetch-depth: 0
    - name: Install mdbook
      run: |
        mkdir mdbook
        curl -sSL https://github.com/rust-lang/mdBook/releases/download/v0.4.14/mdbook-v0.4.14-x86_64-unknown-linux-gnu.tar.gz | tar -xz --directory=./mdbook
        echo `pwd`/mdbook >> $GITHUB_PATH
    - name: Deploy GitHub Pages
      run: |
        # This assumes your book is in the root of your repository.
        # Just add a `cd` here if you need to change to another directory.
        cd docs
        mdbook build
        git worktree add gh-pages
        git config user.name "Deploy from CI"
        git config user.email ""
        cd gh-pages
        # Delete the ref to avoid keeping history.
        git update-ref -d refs/heads/gh-pages
        rm -rf *
        mv ../book/* .
        git add .
        git commit -m "Deploy $GITHUB_SHA to gh-pages"
        git push --force --set-upstream origin gh-pages
3 changes: 2 additions & 1 deletion .gitignore
@@ -2,4 +2,5 @@ tests/tmp/*
output/
**/.coverage
**/__pycache__
-pgosm-data/*
+pgosm-data/*
+docs/book/*
291 changes: 3 additions & 288 deletions README.md
@@ -1,210 +1,12 @@
# PgOSM Flex

PgOSM Flex provides high quality OpenStreetMap datasets in PostGIS using the
[osm2pgsql Flex output](https://osm2pgsql.org/doc/manual.html#the-flex-output).
This project provides a curated set of Lua and SQL scripts to clean and organize
the most commonly used OpenStreetMap data, such as roads, buildings, and points of interest (POIs).
osm2pgsql Flex output.
See [https://pgosm-flex.com/](https://pgosm-flex.com/) for the main project
documentation.

The recommended way to use PgOSM Flex is via the PgOSM Docker image
[hosted on Docker Hub](https://hub.docker.com/repository/docker/rustprooflabs/pgosm-flex).
Basic usage instructions are included in this README.md file, full Docker
usage instructions are available in [docs/DOCKER-RUN.md](docs/DOCKER-RUN.md).


## Project decisions

A few decisions made in this project:

* ID column is `osm_id`
* Geometry stored in SRID 3857 (customizable)
* Geometry column named `geom`
* Defaults to same units as OpenStreetMap (e.g. km/hr, meters)
* Data not included in a dedicated column goes into the `osm.tags` table's `JSONB` column
* Points, Lines, and Polygons are not mixed in a single table
* Tracks latest Postgres, PostGIS, and osm2pgsql versions

This project's approach is to do as much processing as possible in the Lua styles
passed along to osm2pgsql, with post-processing steps creating indexes,
constraints and comments.

## Quick start

See the [Docker Usage](#docker-usage) section below for an explanation of
these commands.

```bash
mkdir ~/pgosm-data
export POSTGRES_USER=postgres
export POSTGRES_PASSWORD=mysecretpassword

docker run --name pgosm -d --rm \
-v ~/pgosm-data:/app/output \
-v /etc/localtime:/etc/localtime:ro \
-e POSTGRES_PASSWORD=$POSTGRES_PASSWORD \
-p 5433:5432 -d rustprooflabs/pgosm-flex

docker exec -it \
pgosm python3 docker/pgosm_flex.py \
--ram=8 \
--region=north-america/us \
--subregion=district-of-columbia
```


## Versions Supported

Minimum versions supported:

* Postgres 12
* PostGIS 3.0
* osm2pgsql 1.8.0

Defining [Postgres indexes in the Lua styles](https://osm2pgsql.org/doc/manual.html#defining-indexes)
bumps osm2pgsql minimum requirement to 1.8.0.


## Minimum Hardware

### RAM

osm2pgsql requires [at least 2 GB RAM](https://osm2pgsql.org/doc/manual.html#main-memory).

### Storage

Fast SSD drives are strongly recommended. PgOSM Flex should work on slower storage devices
(HDD, SD, etc.); however, the [osm2pgsql-tuner](https://github.com/rustprooflabs/osm2pgsql-tuner)
package used to determine the best osm2pgsql command assumes fast SSDs.


## PgOSM via Docker

The PgOSM Flex
[Docker image](https://hub.docker.com/r/rustprooflabs/pgosm-flex)
is hosted on Docker Hub.
The image includes all the prerequisite software and handles all of the options,
logic, and post-processing steps required. Features include:

* Automatic data download from Geofabrik and validation against checksum
* Custom Flex layers built in Lua
* Mix and match layers using Layersets
* Loads to Docker-internal Postgres, or externally defined Postgres
* Supports `osm2pgsql-replication` and `osm2pgsql --append` mode
* Export processed data via `pg_dump` for loading into additional databases


### Docker usage

This section outlines a typical import using Docker to run PgOSM Flex.
See the full Docker instructions in [docs/DOCKER-RUN.md](docs/DOCKER-RUN.md).

Create a directory for the `.osm.pbf` file, output `.sql` file, log output, and
the osm2pgsql command that was run.


```bash
mkdir ~/pgosm-data
```

Set environment variables for the temporary Postgres connection in Docker.
These are required for the Docker container to run.


```bash
export POSTGRES_USER=postgres
export POSTGRES_PASSWORD=mysecretpassword
```

Start the `pgosm` Docker container. At this point, Postgres / PostGIS
is available on port `5433`.

```bash
docker run --name pgosm -d --rm \
-v ~/pgosm-data:/app/output \
-v /etc/localtime:/etc/localtime:ro \
-e POSTGRES_PASSWORD=$POSTGRES_PASSWORD \
-p 5433:5432 -d rustprooflabs/pgosm-flex
```

Use `docker exec` to run the processing for the Washington D.C. subregion.
This example uses three (3) parameters to specify the total system RAM (8 GB)
along with a region/subregion.

* Total RAM for osm2pgsql, Postgres and OS (`8`)
* Region (`north-america/us`)
* Sub-region (`district-of-columbia`) (Optional)


```bash
docker exec -it \
pgosm python3 docker/pgosm_flex.py \
--ram=8 \
--region=north-america/us \
--subregion=district-of-columbia
```


The above command takes roughly 1 minute to run if the PBF for today
has already been downloaded.
If the PBF has not been downloaded, the total time is the time needed
to download the 17 MB PBF file plus roughly 1 minute of processing.
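
The `docker exec` invocation from this section can also be assembled programmatically, which is convenient when looping over several regions. The following is a minimal sketch, not part of PgOSM Flex: the helper name `build_pgosm_command` is hypothetical, and it uses only the three flags shown in this README (`--ram`, `--region`, `--subregion`) plus the container name `pgosm` from the `docker run` step.

```python
import shlex


def build_pgosm_command(ram, region, subregion=None, container="pgosm"):
    """Return the `docker exec` argument list for one PgOSM Flex run.

    Hypothetical helper: it only mirrors the command shown in this README.
    """
    cmd = [
        "docker", "exec", "-it", container,
        "python3", "docker/pgosm_flex.py",
        f"--ram={ram}",
        f"--region={region}",
    ]
    # The sub-region is optional, so the flag is added only when one is given.
    if subregion:
        cmd.append(f"--subregion={subregion}")
    return cmd


if __name__ == "__main__":
    args = build_pgosm_command(8, "north-america/us", "district-of-columbia")
    # Print a copy-pasteable shell command.
    # Passing `args` to subprocess.run() would execute it instead.
    print(shlex.join(args))
```

Building the command as a list rather than a single string avoids quoting problems if a region name ever contains characters that are special to the shell.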


### After processing

The processed OpenStreetMap data is also available in the Docker container on port `5433`.
You can connect and query directly in the Docker container.

```bash
psql -h localhost -p 5433 -d pgosm -U postgres -c "SELECT COUNT(*) FROM osm.road_line;"

┌───────┐
│ count │
╞═══════╡
│ 39865 │
└───────┘
```


The `~/pgosm-data` directory has two (2) files from a typical single run.
The PBF file and its MD5 checksum have been renamed with the date in the filename.
This enables loading the file downloaded today
again in the future, either with the same version of PgOSM Flex or the latest version. The `docker exec` command uses the `PGOSM_DATE` environment variable
to load these historic files.


If `--pg-dump` option is used the output `.sql` is also saved in
the `~/pgosm-data` directory.
This `.sql` file can be loaded into any other database with PostGIS and the proper
permissions.


```bash
ls -alh ~/pgosm-data/

-rw-r--r-- 1 root root 18M Jan 21 03:45 district-of-columbia-2023-01-21.osm.pbf
-rw-r--r-- 1 root root 70 Jan 21 04:39 district-of-columbia-2023-01-21.osm.pbf.md5
-rw-r--r-- 1 root root 163M Jan 21 16:14 north-america-us-district-of-columbia-default-2023-01-21.sql
```
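
The naming pattern visible in the listing above can be sketched in a few lines of Python. This is an illustration inferred only from the three example filenames, not the project's actual naming code: the function names are hypothetical, and treating `default` as the layerset name is an assumption. The real logic in `docker/pgosm_flex.py` may differ, for example when no subregion is given.

```python
from datetime import date


def pbf_filename(subregion, run_date):
    """Dated PBF filename, e.g. <subregion>-YYYY-MM-DD.osm.pbf (inferred pattern)."""
    return f"{subregion}-{run_date.isoformat()}.osm.pbf"


def md5_filename(subregion, run_date):
    """The checksum file sits next to the PBF with an added .md5 suffix."""
    return pbf_filename(subregion, run_date) + ".md5"


def sql_filename(region, subregion, layerset, run_date):
    """Dated pg_dump output name; slashes in the region become hyphens (inferred)."""
    region_part = region.replace("/", "-")
    return f"{region_part}-{subregion}-{layerset}-{run_date.isoformat()}.sql"


if __name__ == "__main__":
    run_date = date(2023, 1, 21)
    print(pbf_filename("district-of-columbia", run_date))
    print(md5_filename("district-of-columbia", run_date))
    print(sql_filename("north-america/us", "district-of-columbia", "default", run_date))
```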


## Layer Sets


PgOSM Flex includes a few layersets and makes it easy to customize your own.
See [docs/LAYERSETS.md](docs/LAYERSETS.md) for details.



## QGIS Layer Styles

If you use QGIS to visualize OpenStreetMap, there are a few basic
styles using the `public.layer_styles` table created by QGIS.
This data is loaded by default and can be excluded with `--data-only`.

See [the QGIS Style README.md](https://github.com/rustprooflabs/pgosm-flex/blob/main/db/qgis-style/README.md)
for more information.


## Explore data loaded

@@ -240,100 +42,13 @@ SELECT s_name, t_name, rows, size_plus_indexes



## Meta table

PgOSM Flex tracks processing metadata in the ``osm.pgosm_flex`` table. The initial import
has `osm2pgsql_mode = 'create'`, the subsequent update has
`osm2pgsql_mode = 'append'`.


```sql
SELECT osm_date, region, srid,
pgosm_flex_version, osm2pgsql_version, osm2pgsql_mode
FROM osm.pgosm_flex
;
```

```bash
┌────────────┬───────────────────────────┬──────┬────────────────────┬───────────────────┬────────────────┐
│ osm_date │ region │ srid │ pgosm_flex_version │ osm2pgsql_version │ osm2pgsql_mode │
╞════════════╪═══════════════════════════╪══════╪════════════════════╪═══════════════════╪════════════════╡
│ 2022-11-04 │ north-america/us-colorado │ 3857 │ 0.6.2-e1f140f │ 1.7.2 │ create │
│ 2022-11-25 │ north-america/us-colorado │ 3857 │ 0.6.2-e1f140f │ 1.7.2 │ append │
└────────────┴───────────────────────────┴──────┴────────────────────┴───────────────────┴────────────────┘
```


## Query examples

For example queries with data loaded by PgOSM-Flex see
[docs/QUERY.md](docs/QUERY.md).


## Points of Interest (POIs)

PgOSM Flex loads a range of tags into a materialized view (`osm.poi_all`) for
easily searching POIs.
Line and polygon data is forced to point geometry using
`ST_Centroid()`. This layer duplicates a bunch of other more specific layers
(shop, amenity, etc.) to provide a single place for simplified POI searches.

This special layer is included by the layer sets `run-all` and `run-no-tags`.
See `style/poi.lua` for logic on how to include POIs.
The topic of POIs is subjective, and this table likely does not include everything that should be considered
a POI. If there are POIs missing
from this table please submit a [new issue](https://github.com/rustprooflabs/pgosm-flex/issues/new)
with sufficient details about what is missing.
Pull requests also welcome! [See CONTRIBUTING.md](CONTRIBUTING.md).


Counts of POIs by `osm_type`.

```sql
SELECT osm_type, COUNT(*)
FROM osm.vpoi_all
GROUP BY osm_type
ORDER BY COUNT(*) DESC;
```

Results from Washington D.C. subregion (March 2020).

```
┌──────────┬───────┐
│ osm_type │ count │
╞══════════╪═══════╡
│ amenity │ 12663 │
│ leisure │ 2701 │
│ building │ 2045 │
│ shop │ 1739 │
│ tourism │ 729 │
│ man_made │ 570 │
│ landuse │ 32 │
│ natural │ 19 │
└──────────┴───────┘
```

Includes Points (`N`), Lines (`L`) and Polygons (`W`).


```sql
SELECT geom_type, COUNT(*)
FROM osm.vpoi_all
GROUP BY geom_type
ORDER BY COUNT(*) DESC;
```

```
┌───────────┬───────┐
│ geom_type │ count │
╞═══════════╪═══════╡
│ W │ 10740 │
│ N │ 9556 │
│ L │ 202 │
└───────────┴───────┘
```



## One table to rule them all
