# PostGIS on Greenplum Database

## Data

### ne_10m_admin_0_countries
Simple layer of countries at the Admin 0 level. Column geom contains the geometry, and name the common name of the country. Also includes various other data fields such as population, abbreviations, GDP estimates, and names in various other languages.

More information: https://www.naturalearthdata.com/downloads/10m-cultural-vectors/10m-admin-0-countries/

### ne_10m_admin_1_states_provinces
Contains level 1 administrative subdivisions such as states and provinces. Column geom contains the geometry, and name contains the name of the division. Various other fields are also available.

More information: https://www.naturalearthdata.com/downloads/10m-cultural-vectors/10m-admin-1-states-provinces/

### ne_10m_populated_places
Contains point locations of populated places. Column geom contains the geometry, and name contains the name of the place. Various other fields are also included. The queries below use the ne_10m_populated_places_simple variant of this layer.

More information: https://www.naturalearthdata.com/downloads/10m-cultural-vectors/10m-populated-places/

### ne_10m_lakes
Contains global lakes and reservoirs, including the Europe and North America supplements. Column geom contains the geometry, featurecla is a feature class to differentiate between lakes and reservoirs, and name contains the name of the feature.

More information: https://www.naturalearthdata.com/downloads/10m-physical-vectors/10m-lakes/

### ne_10m_rivers_lake_centerlines
Contains global river and lake centerlines, including the Europe and North America supplements. Column geom contains the geometry, featurecla is a feature class (the queries below filter on 'River'), and name contains the common name of the feature.

More information: https://www.naturalearthdata.com/downloads/10m-physical-vectors/10m-rivers-lake-centerlines/
                
### ne_10m_airports
Airport information derives from [Mile High Club](https://github.com/nvkelso/mile-high-club), a detailed GIS compilation of worldwide airports that is in the public domain. Column geom contains the geometry, and name contains the common name of the feature.

More information: https://www.naturalearthdata.com/downloads/10m-cultural-vectors/airports/

### ne_10m_time_zones
Time zones primarily derive from the Central Intelligence Agency map of Time Zones, downloaded from the World Factbook website May 2012. Boundaries were adjusted to fit the Natural Earth line work at a scale of 1:10 million and to follow twelve nautical mile territorial sea boundary lines when running along coasts.

More information: https://www.naturalearthdata.com/downloads/10m-cultural-vectors/timezones/

## Workflow

In [30]:
import os, re
from IPython.display import display_html, display_markdown

import numpy as np
import pandas as pd

CONNECTION_STRING = os.getenv('AWSGPDBCONN')

cs = re.match(r'^postgresql://(\S+):(\S+)@(\S+):(\S+)/(\S+)$', CONNECTION_STRING)

DB_USER   = cs.group(1)
DB_PWD    = cs.group(2)
DB_SERVER = cs.group(3)
DB_PORT   = cs.group(4)
DB_NAME   = cs.group(5)
con = CONNECTION_STRING 

%reload_ext sql
%sql $CONNECTION_STRING
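
The regular expression above assumes a URL of the form `postgresql://user:password@host:port/dbname`. As an alternative sketch, the standard library's `urllib.parse.urlsplit` can split the same kind of string without a hand-written pattern. The `parse_connection_string` helper and the sample URL are hypothetical and not part of the original notebook.

```python
from urllib.parse import urlsplit, unquote


def parse_connection_string(conn_str):
    """Split a postgresql:// URL into its parts (hypothetical helper)."""
    parts = urlsplit(conn_str)
    if parts.scheme != "postgresql":
        raise ValueError("expected a postgresql:// URL")
    return {
        "user": parts.username,
        # Passwords may be percent-encoded inside a URL.
        "password": unquote(parts.password) if parts.password else None,
        "host": parts.hostname,
        "port": parts.port,  # int, or None when the port is omitted
        "dbname": parts.path.lstrip("/"),
    }


# Made-up example values, not real credentials.
info = parse_connection_string("postgresql://gpadmin:secret@localhost:5432/dev")
print(info["user"], info["host"], info["port"], info["dbname"])
```

Unlike the regex, this version raises a clear error for an unexpected scheme instead of failing later on `cs.group(1)` when the match is `None`.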

In [31]:
%%sql $DB_USER@$DB_SERVER
SELECT UNNEST(ARRAY[version, postgis_full_version]) version_info FROM (SELECT version()) A, (SELECT postgis_full_version()) B

2 rows affected.


version_info
"PostgreSQL 9.4.24 (Greenplum Database 6.12.0 build commit:4c176763c7619fb678ce38095e6b3e8fb9548186) on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 6.4.0, 64-bit compiled on Oct 28 2020 19:42:15"
"POSTGIS=""2.5.4"" [EXTENSION] PGSQL=""94"" GEOS=""3.4.2-CAPI-1.8.2 r3921"" PROJ=""Rel. 4.8.0, 6 March 2012"" GDAL=""GDAL 1.11.1, released 2014/09/24"" LIBXML=""2.9.1"" LIBJSON=""0.12"" RASTER"


In [32]:
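def display_gist_url(url):
    """Render the GitHub gist URL behind a geojson.io '#id=gist:/<id>' link."""
    # Raw string avoids invalid-escape warnings; the dot in the host is escaped.
    gist = re.match(r'^http://geojson\.io/#id=gist:/(\S+)$', url)
    # The gist owner is hard-coded here.
    gist = 'https://gist.github.com/cantzakas/' + gist.group(1)
    return display_markdown(gist, raw=True)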

*** insert how to download and load the data here ****
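
One possible way to load a Natural Earth shapefile is PostGIS's `shp2pgsql` loader piped into `psql`. The sketch below only builds the command line and does not run it. The helper name, the shapefile name, and the SRID value are assumptions, since the notebook does not record how the tables were actually loaded.

```python
import shlex


def shp2pgsql_command(shapefile, table, srid, schema="public", geom_column="geom"):
    """Build a shell pipeline that loads one shapefile (hypothetical helper)."""
    loader = [
        "shp2pgsql",
        "-s", str(srid),    # SRID to assign to the loaded geometries
        "-g", geom_column,  # name of the geometry column
        shapefile,
        f"{schema}.{table}",
    ]
    # The generated SQL is piped into psql; in this sketch the connection
    # settings are expected to come from the usual PG* environment variables.
    return " ".join(shlex.quote(arg) for arg in loader) + " | psql"


# Assumed values: a local shapefile name and SRID 4326 for illustration only.
cmd = shp2pgsql_command("ne_10m_airports.shp", "ne_10m_airports", srid=4326)
print(cmd)
```

Passing the SRID explicitly keeps the choice visible; use whatever value matches the data you load.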

## Examples

### [PostGIS.ST_Within()](https://postgis.net/docs/ST_Within.html) - Find all airports in the United Kingdom

In [53]:
%%sql
SELECT airports.geom
    , airports.name_en
    , airports.abbrev
FROM public.ne_10m_airports airports
    , public.ne_10m_admin_0_countries countries 
WHERE 
    ST_Within(airports.geom, countries.geom)
    AND countries.name = 'United Kingdom'
LIMIT 5;

 * postgresql://gpadmin:***@ec2-35-178-74-236.eu-west-2.compute.amazonaws.com:5432/dev
5 rows affected.


geom,name_en,abbrev
010100002031BF0D0011343E86E59BF4BF3940696037F04D40,Sumburgh Airport,LSI
010100002031BF0D00FF0F778A935DFBBF1A295C32BF844B40,Newcastle Airport,NCL
010100002031BF0D0068FD0AD46FB70AC0605F760C06B34940,Cardiff Airport,CWL
010100002031BF0D0017467205AB8EFABF22E8C6133EEF4A40,Leeds Bradford International Airport,LBA
010100002031BF0D0002BAC8160EEA0AC017A6C4376AF94B40,Edinburgh Airport,EDI
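
The geom column in the output above is printed as hex-encoded EWKB (PostGIS's extended well-known binary). As a rough illustration of what those strings contain, the sketch below decodes a 2D point with the standard library only. The `decode_ewkb_point` helper is hypothetical and handles nothing but 2D points.

```python
import struct


def decode_ewkb_point(hex_ewkb):
    """Decode a hex-encoded EWKB 2D point into (srid, x, y)."""
    raw = bytes.fromhex(hex_ewkb)
    endian = "<" if raw[0] == 1 else ">"  # first byte: 1 = little-endian
    (geom_type,) = struct.unpack_from(endian + "I", raw, 1)
    offset = 5
    srid = None
    if geom_type & 0x20000000:  # EWKB flag: an SRID follows the type word
        (srid,) = struct.unpack_from(endian + "I", raw, offset)
        offset += 4
    # Low byte 1 = Point; the two top bits flag Z and M dimensions.
    if geom_type & 0xFF != 1 or geom_type & 0xC0000000:
        raise ValueError("only 2D points are handled in this sketch")
    x, y = struct.unpack_from(endian + "dd", raw, offset)
    return srid, x, y


# First row of the result above (Sumburgh Airport).
srid, x, y = decode_ewkb_point("010100002031BF0D0011343E86E59BF4BF3940696037F04D40")
print(srid, x, y)
```

In practice `ST_AsText(geom)` or `ST_AsGeoJSON(geom)` in the query gives a readable form directly; the decoder is only meant to show what the raw column holds.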


In [35]:
import geojsonio
import geopandas as gpd

sql = """
SELECT airports.geom
    , airports.name_en
    , airports.abbrev
FROM public.ne_10m_airports airports
    , public.ne_10m_admin_0_countries countries 
WHERE 
    ST_Within(airports.geom, countries.geom)
    AND countries.name = 'United Kingdom'
"""

df = gpd.read_postgis(sql, con)
url=geojsonio.display(df.to_json())

display_markdown(url, raw=True)
display_gist_url(url)

http://geojson.io/#id=gist:/3492df375d62c243e3e317fe65c35a47

https://gist.github.com/cantzakas/3492df375d62c243e3e317fe65c35a47

### [PostGIS.ST_Intersects()](https://postgis.net/docs/ST_Intersects.html) - Make a list of rivers in Poland

In [54]:
%%sql
SELECT rivers.name
    , rivers.name_en
    , ST_AsGeoJSON(rivers.geom)
FROM 
    public.ne_10m_admin_0_countries AS countries, 
    public.ne_10m_rivers_lake_centerlines AS rivers
WHERE 
    countries.name = 'Poland' 
    AND rivers.featurecla = 'River'
    AND rivers.name != '' 
    AND ST_Intersects(
        countries.geom,
        rivers.geom
    )
LIMIT 1;

 * postgresql://gpadmin:***@ec2-35-178-74-236.eu-west-2.compute.amazonaws.com:5432/dev
1 rows affected.


name,name_en,st_asgeojson
Mukhavyets,Mukhavets,"{""type"":""MultiLineString"",""coordinates"":[[[23.7497664722712,51.4848086609315],[23.7534285816462,51.4980736348899],[23.7538354826879,51.5094668640565],[23.7472436858129,51.5252546244732],[23.7535913420629,51.5324974630149],[23.7724715503962,51.5391706400982],[23.8007918628962,51.5447044942649],[23.8288680347712,51.5578880880149],[23.8474227222712,51.582831121869],[23.8651636076879,51.611883856244],[23.8913680347712,51.6347923848899],[23.9531356128962,51.6659610046815],[23.9775496753962,51.6782494161399],[23.9958602222712,51.694647528119],[24.0131942066462,51.6999779317649],[24.0338647795629,51.7019310567649],[24.0698348316462,51.7007103536399],[24.1001082691462,51.7063255880149],[24.1320906910212,51.717596746869],[24.2050073576879,51.7610944682232],[24.2067977222712,51.7737084005149],[24.2059839201879,51.7896589213482],[24.2148543628962,51.8232689473899],[24.2023218108129,51.8526472025982],[24.2041121753962,51.8648542338482],[24.2204695972712,51.8770612650982],[24.2485457691462,51.8908145203065],[24.2839461597712,51.9028587911399],[24.3447371753962,51.9175072286399],[24.3647567066462,51.9258080098899],[24.3812768889379,51.9298770203065],[24.4001570972712,51.9262962911399],[24.4213973316462,51.9284935567649],[24.4431258472712,51.9463972025982],[24.4604598316462,51.9718692078065],[24.4707137378962,51.9920514994732],[24.4696557951879,52.0100365255149],[24.4594832691462,52.0302188171815],[24.4431258472712,52.0479597025982],[24.4181421233129,52.068182684369],[24.4204207691462,52.0769310567649],[24.4186304045629,52.086981512494],[24.3790796233129,52.1108666036399],[24.3706160816462,52.1224225932232],[24.3778589201879,52.1327578796815],[24.3944604826879,52.1420352234315],[24.4043074878962,52.1523705098899],[24.3955184253962,52.165350653119],[24.3740340503962,52.1722272807232],[24.3535262378962,52.1679955098899],[24.3269149097712,52.1584740255149],[24.2562768889379,52.1482201192649],[24.2226668628962,52.1375593119732],[24.2009383472712,52.137274481244],[24.1806746753962,52.1408959005149],[24.1701766285212,52.1457787130149],[24.1625268889379,52.1469994161399],[24.1489363941462,52.1415469421815],[24.1262313160212,52.1381289734315],[24.0554305347712,52.1525332703065],[24.0079858733129,52.1548119161399],[23.9912215503962,52.1588809265565],[23.9757593108129,52.1520449890565],[23.9570418628962,52.1452090515565],[23.9296167326879,52.1214460307232],[23.9192000660212,52.1179466817649],[23.9137475920629,52.1146914734315],[23.9065047535212,52.1076927755149],[23.8969832691462,52.1006126973899],[23.8850203785212,52.0974388692649],[23.6355086597712,52.0837669942649]]]}"
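
The st_asgeojson column holds a GeoJSON geometry as text (the doubled quotes in the row above are CSV escaping from the export). As a small sketch of working with such a value on the client side, the hypothetical helper below parses a MultiLineString and reports its vertex count and bounding box. The sample string is a shortened version of the row above.

```python
import json


def summarize_multilinestring(geojson_text):
    """Return (vertex_count, (min_lon, min_lat, max_lon, max_lat))."""
    geom = json.loads(geojson_text)
    if geom["type"] != "MultiLineString":
        raise ValueError("expected a MultiLineString")
    # Flatten all line parts into one list of [lon, lat] pairs.
    points = [pt for line in geom["coordinates"] for pt in line]
    lons = [p[0] for p in points]
    lats = [p[1] for p in points]
    return len(points), (min(lons), min(lats), max(lons), max(lats))


# Shortened sample: only the first three vertices of the river geometry.
sample = (
    '{"type":"MultiLineString","coordinates":[[[23.7497664722712,51.4848086609315],'
    '[23.7534285816462,51.4980736348899],[23.7538354826879,51.5094668640565]]]}'
)
count, bbox = summarize_multilinestring(sample)
print(count, bbox)
```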


In [37]:
sql = """
SELECT rivers.name
    , rivers.name_en
    , rivers.geom
FROM 
    public.ne_10m_admin_0_countries AS countries, 
    public.ne_10m_rivers_lake_centerlines AS rivers
WHERE 
    countries.name = 'Poland' 
    AND rivers.featurecla = 'River'
    AND rivers.name != '' 
    AND ST_Intersects(
        countries.geom,
        rivers.geom
    )
"""
df = gpd.read_postgis(sql, con)
url=geojsonio.display(df.to_json())

display_markdown(url, raw=True)
display_gist_url(url)

http://geojson.io/#id=gist:/73b6e7ebed36af54d379c21a28c03756

https://gist.github.com/cantzakas/73b6e7ebed36af54d379c21a28c03756

### [PostGIS.ST_Union()](https://postgis.net/docs/ST_Union.html) - Find capital cities in Europe with a river running through them

In [55]:
%%sql
WITH ranked AS (
SELECT places.gid AS pgid
    , rivers.gid AS rgid
    , DENSE_RANK () OVER (ORDER BY places.pop_max DESC) AS rank
FROM
    public.ne_10m_populated_places_simple AS places,
    public.ne_10m_rivers_lake_centerlines AS rivers
WHERE 
    places.adm0cap = 1 
    AND rivers.name_en != '' 
    AND ST_Within(places.geom,
            (SELECT ST_Union(geom) FROM public.ne_10m_admin_0_countries WHERE continent='Europe')) 
    AND ST_Intersects(ST_Buffer(places.geom, 0.05), rivers.geom)
GROUP BY places.gid, rivers.gid, places.pop_max)

SELECT pl.name, ranked.rank, ST_AsGeoJSON(pl.geom)
FROM ranked, public.ne_10m_populated_places_simple AS pl
WHERE ranked.rank <= 10 AND pl.gid = ranked.pgid
UNION ALL
SELECT rr.name, ranked.rank, ST_AsGeoJSON(rr.geom)
FROM ranked, public.ne_10m_rivers_lake_centerlines AS rr
WHERE ranked.rank <= 10 AND rr.gid = ranked.rgid
LIMIT 1;

 * postgresql://gpadmin:***@ec2-35-178-74-236.eu-west-2.compute.amazonaws.com:5432/dev
1 rows affected.


name,rank,st_asgeojson
London,2,"{""type"":""Point"",""coordinates"":[-0.118667702475932,51.5019405883275]}"
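
The query uses `ST_Buffer(places.geom, 0.05)` on a geometry column, so the radius is expressed in the units of the coordinates. Assuming those are longitude/latitude degrees, as the GeoJSON output suggests, 0.05 is a distance in degrees rather than meters. The sketch below estimates what that means on the ground at London's latitude, using a spherical Earth approximation. The helper name and radius constant are assumptions for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean radius; spherical approximation (assumption)


def degrees_to_km(deg, latitude_deg):
    """Approximate north-south and east-west extent of `deg` degrees."""
    north_south = math.radians(deg) * EARTH_RADIUS_KM
    # Meridians converge toward the poles, so east-west distance shrinks.
    east_west = north_south * math.cos(math.radians(latitude_deg))
    return north_south, east_west


# Latitude taken from the London row above.
ns_km, ew_km = degrees_to_km(0.05, 51.5019405883275)
print(f"{ns_km:.1f} km north-south, {ew_km:.1f} km east-west")
```

Because the east-west extent shrinks with latitude, a fixed buffer in degrees covers a different area for each capital. Casting to `geography` and buffering in meters is one way to get a uniform search radius, if that matters for the analysis.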


In [51]:
sql = """
WITH ranked AS (
SELECT places.gid AS pgid
    , rivers.gid AS rgid
    , DENSE_RANK () OVER (ORDER BY places.pop_max DESC) AS rank
FROM
    public.ne_10m_populated_places_simple AS places,
    public.ne_10m_rivers_lake_centerlines AS rivers
WHERE 
    places.adm0cap = 1 
    AND rivers.name_en != '' 
    AND ST_Within(places.geom,
            (SELECT ST_Union(geom) FROM public.ne_10m_admin_0_countries WHERE continent='Europe')) 
    AND ST_Intersects(ST_Buffer(places.geom, 0.05), rivers.geom)
GROUP BY places.gid, rivers.gid, places.pop_max)

SELECT pl.name, ranked.rank, pl.geom
FROM ranked, public.ne_10m_populated_places_simple AS pl
WHERE ranked.rank <= 10 AND pl.gid = ranked.pgid
UNION ALL
SELECT rr.name, ranked.rank, rr.geom
FROM ranked, public.ne_10m_rivers_lake_centerlines AS rr
WHERE ranked.rank <= 10 AND rr.gid = ranked.rgid;
"""

df = gpd.read_postgis(sql, con)
url=geojsonio.display(df.to_json())

display_markdown(url, raw=True)
display_gist_url(url)

http://geojson.io/#id=gist:/5ccbde7ed51e4e75c8a0dae81965d64f

https://gist.github.com/cantzakas/5ccbde7ed51e4e75c8a0dae81965d64f

### Scratch pad