# Spatial Extensions to Relational Databases

Relational databases are specialized software designed to manage data of basic numeric and character types.
As discussed previously, these databases can often be extended to support additional data types and operations.
An important extension is the integration of spatial data types,
which adds geospatial geometry or feature data types as well as the operations needed to work with them.

A spatial database is a database that has been extended and optimized to manage spatial objects defined in a geometric space. 
Spatial databases can represent simple geometric objects such as points, lines, and polygons.
Advanced spatial databases handle more complex structures such as 3D objects, topological coverages, linear networks, and triangulated irregular networks (TINs).


A number of proprietary and specialized geospatial data formats exist, such as shapefiles, GeoDB, etc.
However, these are typically suited to desktop software packages and lack the scalability of a DBMS.
Additionally, these computing environments often lack the ability to integrate large-scale heterogeneous data, which is a fundamental capability of relational databases.

In modern information systems, location is becoming an increasingly important attribute of data.
This is evident in a variety of areas, including
  * location-based services,
  * the Internet of Things, and
  * spatial-statistical analysis.

This lesson is an introduction to geospatial extensions for relational databases.
Several database systems offer geospatial extensions, including
  * Oracle Spatial,
  * PostGIS (for PostgreSQL), and
  * SpatiaLite (for SQLite).

## Spatial Database Standards

(from Wikipedia)

The Open Geospatial Consortium (OGC) developed the Simple Features specification (first released in 1997), which sets standards for adding spatial functionality to database systems.
The SQL/MM Spatial ISO/IEC standard is part of the SQL/MM multimedia standard and extends the Simple Features standard with data types that support circular interpolations.


### Features of spatial databases

(from Wikipedia)

Database systems use indexes to quickly look up values, but the way that most databases index data is not optimal for spatial queries.
Instead, spatial databases use a spatial index to speed up database operations.
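
As a rough illustration of the idea, the sketch below buckets points into a uniform grid so that a bounding-box query only inspects nearby cells instead of scanning every stored point. This is a deliberately simple stand-in: production spatial databases typically use tree-based structures such as R-trees, and the `GridIndex` class, its method names, and the sample points are all hypothetical.

```python
from collections import defaultdict


class GridIndex:
    """Toy spatial index: points are bucketed into square grid cells."""

    def __init__(self, cell_size):
        self.cell_size = cell_size
        self.cells = defaultdict(list)

    def _cell(self, x, y):
        # Floor division maps a coordinate to the grid cell that contains it.
        return (int(x // self.cell_size), int(y // self.cell_size))

    def insert(self, name, x, y):
        self.cells[self._cell(x, y)].append((name, x, y))

    def query_box(self, xmin, ymin, xmax, ymax):
        """Return names of points inside the box, visiting only overlapping cells."""
        cx0, cy0 = self._cell(xmin, ymin)
        cx1, cy1 = self._cell(xmax, ymax)
        hits = []
        for cx in range(cx0, cx1 + 1):
            for cy in range(cy0, cy1 + 1):
                for name, x, y in self.cells.get((cx, cy), []):
                    # Exact check: a cell may extend beyond the query box.
                    if xmin <= x <= xmax and ymin <= y <= ymax:
                        hits.append(name)
        return hits


index = GridIndex(cell_size=10)
index.insert("a", 1, 1)
index.insert("b", 12, 3)
index.insert("c", 55, 55)
print(index.query_box(0, 0, 15, 15))  # ['a', 'b']
```

The query touches four cells and never looks at point `c`, which is the saving a spatial index is meant to provide.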

In addition to typical SQL queries such as SELECT statements, spatial databases can perform a wide variety of spatial operations. 
The following operations and many more are specified by the Open Geospatial Consortium standard:
  * Spatial Measurements: Compute line length, polygon area, the distance between geometries, etc.
  * Spatial Functions: Modify existing features to create new ones, for example by providing a buffer around them, intersecting features, etc.
  * Spatial Predicates: Allow true/false queries about spatial relationships between geometries. Examples include "do two polygons overlap?" or "is there a residence located within a mile of the area where we are planning to build the landfill?"
  * Geometry Constructors: Create new geometries, usually by specifying the vertices (points or nodes) that define the shape.
  * Observer Functions: Return specific information about a feature, such as the location of the center of a circle.
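
To make two of these categories concrete, here is a minimal pure-Python sketch of what a measurement and a predicate compute on planar coordinates. The function names (`line_length`, `ring_area`, `point_in_ring`) are illustrative and not taken from any standard; a spatial database ships its own, far more robust implementations.

```python
import math


def line_length(points):
    """Spatial measurement: total length of a path given as (x, y) tuples."""
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))


def ring_area(ring):
    """Spatial measurement: area of a closed ring via the shoelace formula."""
    twice_area = sum(x1 * y2 - x2 * y1
                     for (x1, y1), (x2, y2) in zip(ring, ring[1:]))
    return abs(twice_area) / 2.0


def point_in_ring(point, ring):
    """Spatial predicate: ray-casting test for a point inside a closed ring.

    Points exactly on the boundary are not handled specially in this sketch.
    """
    px, py = point
    inside = False
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        # Count edges crossed by a horizontal ray running right from the point.
        if (y1 > py) != (y2 > py):
            x_cross = x1 + (py - y1) * (x2 - x1) / (y2 - y1)
            if px < x_cross:
                inside = not inside
    return inside


square = [(0, 0), (4, 0), (4, 4), (0, 4), (0, 0)]
print(line_length([(0, 0), (3, 4)]))   # 5.0
print(ring_area(square))               # 16.0
print(point_in_ring((2, 2), square))   # True
print(point_in_ring((5, 2), square))   # False
```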


### Key Datatypes

#### Point

A `point` is a 2-D or 3-D position in the spatial reference system.
For example, in the classical latitude/longitude system, a 2-D point is (Longitude Latitude), corresponding to (X Y).
A 3-D variant adds elevation, typically relative to sea level, as (X Y Z).

Note: The order is always X, Y, Z.
Colloquially, we say Lat-Long (i.e., latitude and longitude), which is Y (latitude), X (longitude) order.
However, geometry constructors, functions, and rendering use (X, Y) order.
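
Because the two orderings are easy to confuse, a small helper can make the swap explicit. The sketch below is plain Python; `latlon_to_xy` is a hypothetical helper name and the sample coordinates are arbitrary values chosen for illustration.

```python
def latlon_to_xy(lat, lon):
    """Convert a colloquial (latitude, longitude) pair to (X, Y) order.

    X is longitude and Y is latitude, so the two values swap places.
    The range checks catch some, but not all, accidentally swapped inputs.
    """
    if not -90 <= lat <= 90:
        raise ValueError(f"latitude out of range: {lat}")
    if not -180 <= lon <= 180:
        raise ValueError(f"longitude out of range: {lon}")
    return (lon, lat)


print(latlon_to_xy(38.95, -92.33))  # (-92.33, 38.95)
```

Calling `latlon_to_xy(-92.33, 38.95)` with the arguments reversed raises a `ValueError`, because -92.33 is not a valid latitude.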

![spatial_grid.png MISSING](./images/spatial_grid.png)

#### Linestring

A `linestring` is an ordered list of `Points` that form a path.
Example: `(PointA, PointB, PointC)`.

#### Polygon

A `polygon` is a set of closed linestrings (where the first point of each is the same as its last point).
The first `linestring` is the _outer ring_ and any subsequent `linestring`s are the _inner rings_, aka **holes**.


Example of a simple polygon with no holes: `((PointA, PointB, PointC, PointA))`

Example of a polygon with a hole: `((PointA, PointB, PointC, PointA), (PointZ, PointY, PointX, PointZ))`
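
The ring structure can be sketched in a few lines of Python. This is only an illustration of the data layout described here, with hypothetical helper names (`is_closed_ring`, `describe_polygon`); it checks that every ring is closed and separates the outer ring from the holes, but it does not verify that holes actually lie inside the outer ring.

```python
def is_closed_ring(ring):
    """A ring needs at least 4 points, and its first and last points must match."""
    return len(ring) >= 4 and ring[0] == ring[-1]


def describe_polygon(rings):
    """Split a polygon into its outer ring and holes, rejecting open rings."""
    if not rings:
        raise ValueError("a polygon needs at least an outer ring")
    for ring in rings:
        if not is_closed_ring(ring):
            raise ValueError(f"ring is not closed: {ring}")
    return {"outer": rings[0], "holes": rings[1:]}


outer = [(35, 10), (45, 45), (15, 40), (10, 20), (35, 10)]
hole = [(20, 30), (35, 35), (30, 20), (20, 30)]

polygon = describe_polygon([outer, hole])
print(len(polygon["holes"]))                      # 1
print(is_closed_ring([(0, 0), (1, 0), (1, 1)]))   # False
```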






## Spatial versus Geospatial

Both spatial and geospatial databases have the same geometric datatypes.
A key difference is the spatial reference system.
Spatial systems use Cartesian or polar coordinates.
Geospatial systems typically use a state plane or ellipsoidal (geodetic) reference system.


![geographic-coordinate-system.jpg MISSING](./images/geographic-coordinate-system.jpg)

The geodetic spatial references attempt to accurately account for the shape of the Earth and the **projection** from the spheroid into a 2-dimensional reference space.

![sphere_to_cylinder.jpg MISSING](./images/sphere_to_cylinder.jpg)

In geospatial terms, the X coordinate (longitude) is the east-west measurement and the Y coordinate (latitude) is the north-south measurement.
If a Z coordinate is present, it is the elevation.
The most common spatial reference system used for Lat-Long is [EPSG:4326](http://spatialreference.org/ref/epsg/wgs-84/), also known as WGS 84.

```SQL
Expanded display is on.
dsa_ro=# select * from spatial_ref_sys where srid = 4326;
-[ RECORD 1 ]------------------------------------------------
srid      | 4326
auth_name | EPSG
auth_srid | 4326
srtext    | GEOGCS["WGS 84"
              ,DATUM["WGS_1984"
                  ,SPHEROID["WGS 84"
                      ,6378137,298.257223563
                      ,AUTHORITY["EPSG","7030"]
                  ]
                  ,AUTHORITY["EPSG","6326"]
              ]
              ,PRIMEM["Greenwich",0,AUTHORITY["EPSG","8901"]]
              ,UNIT["degree",0.0174532925199433
                  ,AUTHORITY["EPSG","9122"]
              ]
              ,AUTHORITY["EPSG","4326"]
            ]
proj4text | +proj=longlat +datum=WGS84 +no_defs
```

Be careful when computing in this spatial reference system: measurements (length, distance, area) are in degrees, not metric or imperial units.
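
To see why this matters, the sketch below compares the naive planar distance in degrees with an approximate great-circle distance in kilometers. It uses the haversine formula on a sphere with a mean Earth radius of 6371 km, which is an assumption of this example rather than something the spatial reference system prescribes; a spatial database would typically compute on the spheroid instead. The function names and sample points are illustrative.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean radius; spherical approximation (assumption)


def degree_distance(p, q):
    """Planar distance between two (lon, lat) points, in degrees."""
    return math.hypot(q[0] - p[0], q[1] - p[1])


def haversine_km(p, q):
    """Approximate great-circle distance between two (lon, lat) points, in km."""
    lon1, lat1, lon2, lat2 = map(math.radians, (*p, *q))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# One degree of longitude, measured at the equator and at 60 degrees north.
equator = ((0.0, 0.0), (1.0, 0.0))
north = ((0.0, 60.0), (1.0, 60.0))
for p, q in (equator, north):
    print(degree_distance(p, q), round(haversine_km(p, q), 1))
```

Both pairs are exactly 1.0 degree apart, yet under this spherical approximation the pair at 60 degrees north is only about half as far apart on the ground as the pair at the equator (roughly 55.6 km versus 111.2 km).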

#### `Multi` variants

Each of the above data types has a `MULTI` variant: `MULTIPOINT`, `MULTILINESTRING`, and `MULTIPOLYGON`.
These are effectively collections of geometries of the same type.


### Well-Known Text and Well-Known Binary

Two standards exist for rendering and storage of geometry objects.
  * Well-known Text (WKT) is a text markup language for representing vector geometry objects.
  * Well-known Binary (WKB) is the binary equivalent; WKB values are typically shown as hexadecimal strings.

More information can be found here: https://en.wikipedia.org/wiki/Well-known_text

#### Examples:
  * `POINT (30 10)` $\rightarrow$ a point at X = 30, Y = 10.
  * `LINESTRING (30 10, 10 30, 40 40)` $\rightarrow$ a two-segment linestring starting at point (30 10) and ending at point (40 40).
  * `POLYGON ((30 10, 40 40, 20 40, 10 20, 30 10))` $\rightarrow$ a polygon with only an outer ring, formed by a closed linestring that starts and ends at point (30 10).
  * `POLYGON ((35 10, 45 45, 15 40, 10 20, 35 10), (20 30, 35 35, 30 20, 20 30))` $\rightarrow$ a polygon defined by an outer ring followed by one or more inner rings (one here). The inner rings are holes.
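
To show how little structure WKT needs, the following is a minimal Python sketch of a reader for the three 2-D geometry types used in these examples. `parse_wkt` is a hypothetical helper written for this lesson, not a library function; it ignores `MULTI` types, Z coordinates, and `EMPTY` geometries, and it does little validation.

```python
import re


def parse_wkt(wkt):
    """Parse a 2-D POINT, LINESTRING, or POLYGON from WKT into tuples and lists."""
    match = re.fullmatch(r"\s*(POINT|LINESTRING|POLYGON)\s*\((.*)\)\s*", wkt,
                         flags=re.IGNORECASE | re.DOTALL)
    if match is None:
        raise ValueError(f"unsupported or malformed WKT: {wkt!r}")
    kind, body = match.group(1).upper(), match.group(2)

    def coords(text):
        # "30 10, 10 30" -> [(30.0, 10.0), (10.0, 30.0)]
        return [tuple(float(v) for v in pair.split()) for pair in text.split(",")]

    if kind == "POINT":
        return kind, coords(body)[0]
    if kind == "LINESTRING":
        return kind, coords(body)
    # POLYGON: each parenthesized group is one ring; the first is the outer ring.
    return kind, [coords(ring) for ring in re.findall(r"\(([^()]*)\)", body)]


print(parse_wkt("POINT (30 10)"))
kind, rings = parse_wkt(
    "POLYGON ((35 10, 45 45, 15 40, 10 20, 35 10), (20 30, 35 35, 30 20, 20 30))")
print(kind, len(rings))
```

The first call returns the geometry type together with an (X, Y) tuple; for the polygon, the result is a list of rings in which the first entry is the outer ring and the rest are holes.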
  

## Geospatial Data Represents Features on Earth

![Geospatial_data.png MISSING](./images/Geospatial_data.png)

These features could be point features, line / path features, or area / bounds features.

In the next labs you will interact with a PostGIS database and begin to explore basic (geo)spatial operations.