# Chapter 2: Geospatial Data

# Contents

* An overview of common data formats
* Data structures
    - Common traits 
        - Geolocation 
        - Subject information
* Spatial indexing
    - Indexing algorithms
        - Quadtree index
        - R-tree index
    - Grids
* Overviews 
* Metadata
* File structure
* Vector data
    - Shapefiles
    - CAD files
    - Tag-based and markup-based formats
    - GeoJSON
* Raster data
    - TIFF files
    - JPEG, GIF, BMP, and PNG
    - Compressed formats
    - ASCII Grids
    - World files
* Point cloud data
* Web services 
* Summary

# An overview of common data formats

* Spreadsheets and comma-separated files (CSV files) or tab-separated files (TSV files)
* Geotagged photos
* Lightweight binary points, lines, and polygons
* Multi-gigabyte satellite or aerial images
* Elevation data such as grids, point clouds, or integer-based images
* XML files
* JSON files
* Databases (both servers and file databases)
* Web services

TerraServer, which they relaunched around this time. In 2004, the Open Geospatial Consortium (OGC) updated the version of its Web Map Service (WMS) to

* asynchronous JavaScript and XML (AJAX)
* OpenLayers
* OpenStreetMap

* Global Positioning System (GPS)

* European Petroleum Survey Group (EPSG) 

The following URL provides an image taken from Wikipedia:
https://en.wikipedia.org/wiki/File:Tissot_mercator.png


<img src="figures/cap2.1.png" width=600 />

* Vector data
* Raster data

If you want to see a projection that shows the relative size of continents more accurately, refer to the Goode homolosine projection:
https://en.wikipedia.org/wiki/Goode_homolosine_projection

# Data structures
* Common traits 

## Common traits 
* Geolocation 
* Subject information

### Geolocation 

### Subject information

# Spatial indexing
* Indexing algorithms
* Grids

## Indexing algorithms
* Quadtree index
* R-tree index

### Quadtree index

<img src="figures/cap2.2.png" width=600 />

### R-tree index

<img src="figures/cap2.3.png" width=600 />

## Grids

# Overviews 

<img src="figures/cap2.4.png" width=600 />

# Metadata

* Federal Geographic Data Committee (FGDC)
* Content Standard for Digital Geospatial Metadata (CSDGM)
* Infrastructure for Spatial Information in the European Community (INSPIRE)

# File structure

When you unzip hancock.zip, you will see three files. For this example, we'll be using hancock.shp. The Esri shapefile format stores the bounding box at a fixed location and with a fixed data type in the file header: bytes 36 to 67 hold the minimum x, minimum y, maximum x, and maximum y values as four 8-byte doubles. In this example, we will execute the following steps:
1. Import the struct module.
2. Open the hancock.shp shapefile in binary read mode.
3. Navigate to byte 36.
4. Read each 8-byte double, specified as d, and unpack it using the struct module in little-endian order, as designated by the < sign.

The best way to execute this script is in the interactive Python interpreter. We will read the minimum longitude, minimum latitude, maximum longitude, and maximum latitude:

In [None]:
import struct

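In [None]:
# Open the shapefile in binary read mode and move to the bounding box at byte 36.
f = open("hancock.shp", "rb")
f.seek(36)

In [None]:
# Minimum x (longitude)
struct.unpack("<d", f.read(8))

In [None]:
# Minimum y (latitude)
struct.unpack("<d", f.read(8))

In [None]:
# Maximum x (longitude)
struct.unpack("<d", f.read(8))

In [None]:
# Maximum y (latitude)
struct.unpack("<d", f.read(8))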

You'll notice that when the struct module unpacks a value, it returns a Python tuple with one value. You can shorten the preceding unpacking code to one line by specifying all four doubles at once and increasing the byte length to 32 bytes as shown in the following code:

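In [None]:
# Return to the start of the bounding box.
f.seek(36)

In [None]:
# Read all four doubles (4 x 8 = 32 bytes) in a single call.
struct.unpack("<dddd", f.read(32))

In [None]:
# Close the file when finished.
f.close()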
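
The same header can also be read by a small helper that closes the file automatically. The following is a minimal sketch, not a full shapefile reader: the function name `read_shp_header` and the keys of the returned dictionary are assumptions made for illustration. The byte offsets follow the published Esri shapefile layout, in which the file code at byte 0 is big-endian and the version, shape type, and bounding box are little-endian.

```python
import struct


def read_shp_header(path):
    """Read selected fields from the 100-byte header of a .shp file.

    Hypothetical helper for illustration only.
    """
    with open(path, "rb") as f:
        header = f.read(100)
    if len(header) < 100:
        raise ValueError("file is too short to contain a shapefile header")
    # Bytes 0-3: file code, a big-endian integer that is always 9994.
    (file_code,) = struct.unpack(">i", header[0:4])
    if file_code != 9994:
        raise ValueError("not a shapefile: unexpected file code")
    # Bytes 28-35: version and shape type, both little-endian integers.
    version, shape_type = struct.unpack("<ii", header[28:36])
    # Bytes 36-67: minimum x, minimum y, maximum x, maximum y as doubles.
    bbox = struct.unpack("<dddd", header[36:68])
    return {"version": version, "shape_type": shape_type, "bbox": bbox}
```

Assuming hancock.shp is in the working directory, `read_shp_header("hancock.shp")["bbox"]` should return the same four values as the interactive session above.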

# Vector data
* Shapefiles
* CAD files
* Tag-based and markup-based formats
* GeoJSON

## Shapefiles

* ARC/INFO
* OGR library
* Shapely and Fiona

The .shp, .shx, and .dbf files are required for a valid shapefile.
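
Before opening a shapefile, it can be useful to confirm that all three required files are present. The following is a minimal sketch; the function name `missing_shapefile_parts` is an assumption for illustration, and the check assumes lowercase extensions, which may not hold on every dataset or file system.

```python
from pathlib import Path

# The three files that a valid shapefile requires.
REQUIRED_EXTENSIONS = (".shp", ".shx", ".dbf")


def missing_shapefile_parts(shp_path):
    """Return the required extensions that are absent for this shapefile.

    Hypothetical helper: it swaps the final suffix of the given path for
    each required extension and checks whether that file exists.
    """
    path = Path(shp_path)
    return [ext for ext in REQUIRED_EXTENSIONS
            if not path.with_suffix(ext).exists()]
```

An empty list means that the .shp, .shx, and .dbf files were all found next to each other.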

<img src="figures/cap2.5.png" width=600 />

<img src="figures/cap2.6.png" width=600 />

<img src="figures/cap2.7.png" width=600 />

## CAD files

* Curves
* Surfaces (for objects that are different from geospatial elevation surfaces)
* 3D solids
* Text (rendered as an object)
* Text styling
* Viewport configuration

## Tag-based and markup-based formats

* Well-known text (WKT)
* Keyhole Markup Language (KML)
* OpenStreetMap (OSM)
* Geography Markup Language (GML)
* Web Feature Service (WFS)

In [None]:
<?xml version="1.0" encoding="utf-8"?>
<kml xmlns="http://www.opengis.net/kml/2.2">
    <Placemark>
        <name>Mockingbird Cafe</name>
        <description>Coffee Shop</description>
        <Point>
            <coordinates>-89.329160,30.310964</coordinates>
        </Point>
    </Placemark>
</kml>

* It is a human-readable format
* It can be edited in a text editor
* It is well-supported by programming languages (especially Python!)
* It is, by definition, easily extensible
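
As an illustration of that Python support, the KML placemark shown earlier can be read with the standard library alone. This is a minimal sketch: the function name `read_placemarks` is an assumption for illustration, and it handles only placemarks that contain a single point.

```python
import xml.etree.ElementTree as ET

# KML elements live in this XML namespace, so every lookup must use it.
KML_NS = {"kml": "http://www.opengis.net/kml/2.2"}

kml_text = """<?xml version="1.0" encoding="utf-8"?>
<kml xmlns="http://www.opengis.net/kml/2.2">
    <Placemark>
        <name>Mockingbird Cafe</name>
        <description>Coffee Shop</description>
        <Point>
            <coordinates>-89.329160,30.310964</coordinates>
        </Point>
    </Placemark>
</kml>"""


def read_placemarks(text):
    """Return a list of (name, longitude, latitude) tuples.

    Hypothetical helper for illustration; placemarks without a
    point geometry are skipped.
    """
    root = ET.fromstring(text.encode("utf-8"))
    results = []
    for placemark in root.iterfind(".//kml:Placemark", KML_NS):
        name = placemark.findtext("kml:name", namespaces=KML_NS)
        coords = placemark.findtext("kml:Point/kml:coordinates",
                                    namespaces=KML_NS)
        if coords is None:
            continue
        # KML stores coordinates as longitude,latitude[,altitude].
        lon, lat = (float(v) for v in coords.strip().split(",")[:2])
        results.append((name, lon, lat))
    return results


placemarks = read_placemarks(kml_text)
```

Note that KML lists longitude before latitude, which is the reverse of how coordinates are often spoken.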

In [None]:
GEOGCS["WGS 84",
       DATUM["WGS_1984",
           SPHEROID["WGS 84",6378137,298.257223563,
               AUTHORITY["EPSG","7030"]],
           AUTHORITY["EPSG","6326"]],
       PRIMEM["Greenwich",0,
           AUTHORITY["EPSG","8901"]],
       UNIT["degree",0.01745329251994328,
           AUTHORITY["EPSG","9122"]],
       AUTHORITY["EPSG","4326"]]

## GeoJSON

In [None]:
{ "type": "GeometryCollection",
  "geometries": [
    { "type": "Point",
      "coordinates": [-89.33, 30.0]
    },
    { "type": "LineString",
      "coordinates": [ [-89.33, 30.30], [-89.36, 30.28] ]
    },
    { "type": "Polygon",
      "coordinates": [[
        [-104.05, 48.99],
        [-97.22, 48.98],
        [-96.58, 45.94],
        [-104.03, 45.94],
        [-104.05, 48.99]
      ]]
    }
  ]
}

In [None]:
gc = {
    "type": "GeometryCollection",
    "geometries": [
        {"type": "Point",
         "coordinates": [-89.33, 30.0]},
        {"type": "LineString",
         "coordinates": [[-89.33, 30.30], [-89.36, 30.28]]}
    ]
}
gc
# Output:
# {'type': 'GeometryCollection', 'geometries': [{'type': 'Point',
#   'coordinates': [-89.33, 30.0]}, {'type': 'LineString',
#   'coordinates': [[-89.33, 30.3], [-89.36, 30.28]]}]}
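
When GeoJSON arrives as text, for example from a file or a web service, the standard `json` module converts it to and from Python dictionaries. The following is a minimal sketch; the variable names and the helper `geometry_types` are assumptions for illustration.

```python
import json

geojson_text = """
{ "type": "GeometryCollection",
  "geometries": [
    { "type": "Point",
      "coordinates": [-89.33, 30.0] },
    { "type": "LineString",
      "coordinates": [ [-89.33, 30.30], [-89.36, 30.28] ] }
  ]
}
"""

# Parse the GeoJSON text into ordinary Python dictionaries and lists.
collection = json.loads(geojson_text)


def geometry_types(geometry_collection):
    """Return the type of each geometry in a GeometryCollection."""
    return [geom["type"] for geom in geometry_collection["geometries"]]


types = geometry_types(collection)

# Serialize the dictionary back to GeoJSON text.
round_trip_text = json.dumps(collection)
```

Because the parsed result is a plain dictionary, individual coordinates can be reached with normal indexing, such as `collection["geometries"][0]["coordinates"]`.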


# Raster data
* TIFF files
* JPEG, GIF, BMP, and PNG
* Compressed formats
* ASCII Grids
* World files

 ASCII text files or Binary Large Objects (BLOBs) in databases.

* Network Common Data Form (NetCDF), GRIB, and HDF5
* Geospatial Data Abstraction Library (GDAL)

## TIFF files

Tagged Image File Format (TIFF)

## JPEG, GIF, BMP, and PNG

## Compressed formats

* Multi-resolution Seamless Image Database (MrSID) (.sid)
* Enhanced Compression Wavelet (ECW) (.ecw)

## ASCII Grids   

In [None]:
<NCOLS xxx>
<NROWS xxx>
<XLLCENTER xxx | XLLCORNER xxx>
<YLLCENTER xxx | YLLCORNER xxx>
<CELLSIZE xxx>
{NODATA_VALUE xxx}
row 1
row 2
.
.
.
row n

* The number of columns
* The number of rows
* The x coordinate of the lower-left cell, given either as its center (XLLCENTER) or as its lower-left corner (XLLCORNER)
* The y coordinate of the lower-left cell, given either as its center (YLLCENTER) or as its lower-left corner (YLLCORNER)
* The cell size in mapping units
* The optional no-data value (typically -9999)
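
Because the format is plain text, a grid can be parsed with a few lines of Python. The following is a minimal sketch that assumes each header entry sits on its own line as a keyword followed by one value; the function name `parse_ascii_grid` and the sample grid are made up for illustration.

```python
# Header keywords allowed by the format, compared case-insensitively.
HEADER_KEYS = {"ncols", "nrows", "xllcenter", "xllcorner",
               "yllcenter", "yllcorner", "cellsize", "nodata_value"}


def parse_ascii_grid(text):
    """Parse ASCII grid text into (header dict, list of rows).

    Hypothetical helper for illustration only.
    """
    header = {}
    values = []
    for line in text.splitlines():
        tokens = line.split()
        if not tokens:
            continue
        key = tokens[0].lower()
        if key in HEADER_KEYS and len(tokens) == 2:
            header[key] = float(tokens[1])
        else:
            # Any other line holds cell values.
            values.extend(float(token) for token in tokens)
    ncols, nrows = int(header["ncols"]), int(header["nrows"])
    if len(values) != ncols * nrows:
        raise ValueError("cell count does not match NCOLS x NROWS")
    # Split the flat list of values into rows of ncols cells each.
    rows = [values[r * ncols:(r + 1) * ncols] for r in range(nrows)]
    return header, rows


sample_grid = """ncols 3
nrows 2
xllcorner 0.0
yllcorner 0.0
cellsize 10.0
NODATA_value -9999
1 2 3
4 -9999 6
"""

header, rows = parse_ascii_grid(sample_grid)
```

The first row in the file is the northernmost row of the grid, so `rows[0]` is the top of the raster.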

## World files

<img src="figures/cap2.8.png" width=600 />

The structure of a world file is very simple. It is a six-line text file as follows:
* Line 1: The cell size along the x axis in ground units
* Line 2: The rotation on the y axis
* Line 3: The rotation on the x axis
* Line 4: The cell size along the y axis in ground units
* Line 5: The center x-coordinate of the upper left cell
* Line 6: The center y-coordinate of the upper left cell

The following is an example of world file values:

In [None]:
15.0
0.0
0.0
-15.0
-89.38
45.0
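
The six values define an affine transformation from pixel positions to ground coordinates. The following is a minimal sketch of that calculation; the function names `parse_world_file` and `pixel_to_world` and the sample values are assumptions for illustration, and columns and rows are counted from zero at the upper-left cell.

```python
def parse_world_file(text):
    """Return the six world file values in file order as floats."""
    values = [float(token) for token in text.split()]
    if len(values) != 6:
        raise ValueError("a world file must contain exactly six values")
    return tuple(values)


def pixel_to_world(params, col, row):
    """Convert a pixel (col, row) to the ground coordinates of its center."""
    # File order: x cell size, y-axis rotation, x-axis rotation,
    # y cell size, then x and y of the upper-left cell center.
    x_size, y_rotation, x_rotation, y_size, x_origin, y_origin = params
    x = x_size * col + x_rotation * row + x_origin
    y = y_rotation * col + y_size * row + y_origin
    return x, y


sample_world_file = """15.0
0.0
0.0
-15.0
-89.38
45.0
"""

params = parse_world_file(sample_world_file)
upper_left = pixel_to_world(params, 0, 0)
```

The y cell size is usually negative because image rows count downward from the top while ground y values increase upward.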

U.S. Geological Survey (USGS)

<img src="figures/cap2.9.png" width=600 />

# Point cloud data

## LIDAR
LIDAR uses powerful laser range-finding systems to model the world with very high precision. The term LIDAR or LiDAR is a combination of the words light and radar. Some people claim it also stands for Light Detection and Ranging. LIDAR sensors can be mounted on aerial platforms including satellites, airplanes, or helicopters. They can also be mounted on vehicles for ground-based collection.

<img src="figures/cap2.10.png" width=600 />

# Web services 

 Representational State Transfer (REST)

# Summary

# References