# Spatial Data: Sources and Tools<br>UMN Day of Data 2019

In [None]:
%%html
<style>.rendered_html img {max-width: 1000px; margin-left: 0; margin-right: 0;}</style>

## Must be covered: Coordinate systems and projections

### What we know

We know the earth is neither flat nor spherical nor ellipsoidal, although it's commonly represented as an ellipsoid (oblate spheroid). We typically show maps in two dimensions and, for large-scale maps (showing a small area), we often use Cartesian coordinates (x, y rather than lat/lng).

![Mercator projection compared with Equal Earth projection](https://i.imgur.com/gIbwdBB.png)
Mercator projection (left), Equal Earth projection (right).<br>Images: [Equal Earth projection, shadedrelief.com](http://shadedrelief.com/ee_proj/)

### Geographic coordinate systems (GCS)

* Latitude and longitude
* Different GCS use different datums, or representations of the earth, usually an ellipsoid (semimajor and semiminor axis lengths) and a starting point
* WGS84: used by the global positioning system (GPS); based on the WGS84 ellipsoid
* NAD83: common in the US and Canada (with many variants); based on the GRS80 ellipsoid
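
To make the "ellipsoid" part concrete, here is a minimal sketch that derives the semiminor axis from the published defining parameters of the WGS84 and GRS80 ellipsoids (semimajor axis and inverse flattening). The function and variable names are just for illustration. The two shapes turn out to be nearly identical, so differences between the coordinate systems come mostly from how each datum is tied to the earth, not from the ellipsoid itself.

```python
# Defining parameters: semimajor axis a (meters) and inverse flattening 1/f
ELLIPSOIDS = {
    "WGS84": (6378137.0, 298.257223563),
    "GRS80": (6378137.0, 298.257222101),
}


def semiminor_axis(a, inv_f):
    """Return the semiminor (polar) axis b = a * (1 - f)."""
    return a * (1.0 - 1.0 / inv_f)


axes = {name: semiminor_axis(a, inv_f) for name, (a, inv_f) in ELLIPSOIDS.items()}
for name, b in axes.items():
    print(f"{name}: b = {b:.4f} m")

# The polar axes differ by a fraction of a millimeter
diff_mm = (axes["WGS84"] - axes["GRS80"]) * 1000
print(f"Difference in b: {diff_mm:.3f} mm")
```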

### Same location, different coordinates

A single location, somewhere in Bellingham, WA.

| Datum    | Longitude         | Latitude         |
| -------- | ----------------- | ---------------- |
| NAD 1927 | -122.466903686523 | 48.7440490722656 |
| NAD 1983 | -122.46818353793  | 48.7438798543649 |
| WGS 1984 | -122.46818353793  | 48.7438798534299 |

Example from: [Projection basics the GIS professional needs to know](http://webhelp.esri.com/arcgisdesktop/9.3/index.cfm?TopicName=Projection_basics_the_GIS_professional_needs_to_know)
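
How far apart are those coordinates if you ignore the datum and treat them as the same system? A rough sketch, assuming a spherical earth with a 6,371 km radius and the haversine formula (good enough for a ballpark figure, not for survey work):

```python
import math


def haversine_m(lon1, lat1, lon2, lat2, radius=6371000.0):
    """Great-circle distance in meters between two lon/lat points on a sphere."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    h = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * radius * math.asin(math.sqrt(h))


# Coordinates from the table above, as (longitude, latitude)
nad27 = (-122.466903686523, 48.7440490722656)
nad83 = (-122.46818353793, 48.7438798543649)

offset = haversine_m(*nad27, *nad83)
print(f"Apparent offset between NAD27 and NAD83 coordinates: {offset:.0f} m")
```

The apparent shift is on the order of 100 meters, which is plenty to put a point on the wrong side of a street or parcel line.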

### Projected coordinate systems (PCS)

* Set of mathematical equations to transform geographic coordinates into planar coordinates (trig!); sometimes with additional rules/cases
* PCS related to GCS, so datum does matter
* How do you flatten a sphere, ellipsoid, or other representation of the earth and retain features?
* Web Mercator is everywhere. Don't use it for analysis.
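
A minimal sketch of why Web Mercator is a poor choice for analysis, using the spherical Mercator equations with the WGS84 semimajor axis as the sphere radius. The function names are illustrative, not from any library. The linear scale factor grows as 1/cos(latitude), so measured lengths and areas inflate the farther you get from the equator.

```python
import math

R = 6378137.0  # sphere radius in meters (WGS84 semimajor axis)


def web_mercator(lon, lat):
    """Project lon/lat in degrees to spherical Mercator x, y in meters."""
    x = R * math.radians(lon)
    y = R * math.log(math.tan(math.pi / 4 + math.radians(lat) / 2))
    return x, y


def scale_factor(lat):
    """Linear scale distortion of the Mercator projection at a latitude."""
    return 1.0 / math.cos(math.radians(lat))


x, y = web_mercator(-93.237278, 44.974063)
k = scale_factor(44.974063)
print(f"x = {x:.0f} m, y = {y:.0f} m")
print(f"Lengths are stretched by {k:.2f}x, areas by {k * k:.2f}x")
```

At Minnesota latitudes, areas measured in Web Mercator come out at roughly twice their true size.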

### Distortion in projections

* Equal-area projections preserve area
* Conformal projections don't distort angles
* Equidistant projections preserve distance, but only from one or two points or along certain lines
* Azimuthal preserves direction from a single location
* Compromise (halfway happy?)

### Spatial reference ID (SRID)

* Unique identifiers for GCS/PCS
* Many from the European Petroleum Survey Group (EPSG)
* Check out [spatialreference.org](http://spatialreference.org/)

### Minnesota

* NAD83 / UTM zone 15N (EPSG:26915) is common for statewide (and TC metro)
* WGS84 / UTM zone 15N (EPSG:32615) appearing more
* Three State Plane Coordinate System zones
* MnDOT breaks the state down extensively
  * at least one coordinate system per county
  * North Shore
* Using ArcGIS and dealing with county coordinate systems? Install the coordinate systems add-on
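
If you ever need to work out the UTM zone yourself, the arithmetic is simple. A sketch, assuming the standard 6-degree zones and the WGS84 / UTM EPSG numbering (326xx north, 327xx south); it ignores the special-case zones around Norway and Svalbard, and the helper names are made up for this example:

```python
import math


def utm_zone(lon):
    """Return the standard UTM zone number (1-60) for a longitude in degrees."""
    return int(math.floor((lon + 180.0) / 6.0)) % 60 + 1


def wgs84_utm_epsg(lon, lat):
    """Return the EPSG code of the WGS84 / UTM zone containing lon/lat."""
    base = 32600 if lat >= 0 else 32700
    return base + utm_zone(lon)


# East Bank, Minneapolis
print(utm_zone(-93.237278))                   # zone 15
print(wgs84_utm_epsg(-93.237278, 44.974063))  # EPSG:32615
```

Note that the western and northeastern edges of the state fall outside zone 15 by longitude; statewide data is still conventionally kept in zone 15.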

[![](https://i.imgur.com/jCrIYrX.png)](https://www.dot.state.mn.us/surveying/pdf/mncoordsys.pdf)

### When is a foot not a foot?
When it's a [US survey foot](https://en.wikipedia.org/wiki/Foot_(unit)#US_survey_foot)!

The difference is small (about two parts per million), but it adds up when coordinates run into the millions of feet. Know your linear unit of measure.
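
A quick check of the size of the difference, using the exact definitions of the international foot (0.3048 m) and the US survey foot (1200/3937 m). The two-million-foot coordinate is a hypothetical value, chosen because state plane coordinates are often that large.

```python
from fractions import Fraction

INTERNATIONAL_FOOT_M = Fraction(3048, 10000)  # exactly 0.3048 m
US_SURVEY_FOOT_M = Fraction(1200, 3937)       # exactly 1200/3937 m

# Relative difference, in parts per million
ratio = US_SURVEY_FOOT_M / INTERNATIONAL_FOOT_M
ppm = float((ratio - 1) * 1_000_000)

# Error from reading a coordinate with the wrong kind of foot
coordinate_ft = 2_000_000  # hypothetical easting, in feet
error_m = float(coordinate_ft * (US_SURVEY_FOOT_M - INTERNATIONAL_FOOT_M))

print(f"Difference: {ppm:.3f} ppm")
print(f"Error at {coordinate_ft:,} ft: {error_m:.2f} m")
```

Mixing up the two at that coordinate value shifts your data by more than a meter.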

## Spatial data formats

* OGC GeoPackage (first on the list, but not so common yet)
* [Esri] Shapefile
* GeoJSON
* TopoJSON
* Esri geodatabase
    * enterprise
    * file
    * personal (don't do it!)
* KML/Z
* WKT
* WKB
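
To give a feel for two of the text-based formats, here is the same point written as a GeoJSON Feature and as WKT, built by hand with the standard library only. The `name` property is made up for illustration. Both put longitude first, then latitude.

```python
import json

lon, lat = -93.237278, 44.974063

# GeoJSON: geometry plus attributes, coordinates in [longitude, latitude] order
feature = {
    "type": "Feature",
    "geometry": {"type": "Point", "coordinates": [lon, lat]},
    "properties": {"name": "We are here"},
}
geojson_text = json.dumps(feature)

# WKT: geometry only, coordinates separated by a space
wkt_text = f"POINT ({lon} {lat})"

print(geojson_text)
print(wkt_text)
```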

## Data Sources

### The world

#### [Natural Earth](https://www.naturalearthdata.com/)

Cultural and physical vector data at three scales (1:10m, 1:50m, and 1:110m), plus small-scale raster data.

[![](https://i.imgur.com/NSeG8Pg.png)](https://www.naturalearthdata.com/)


#### [GADM (global administrative boundaries)](https://gadm.org/)

[![](https://i.imgur.com/73TCYpz.png)](https://gadm.org/)


#### [TransitFeeds.com](http://transitfeeds.com/)
General Transit Feed Specification (GTFS) data from around the world.

> An extensive archive of public transit data for software developers, transit agencies and more. Browse and download official GTFS & GTFS-realtime feeds from around the world.

#### [USGS EarthExplorer](https://earthexplorer.usgs.gov/)

Search by AOI, date, and more. Good source for Landsat and Sentinel-2 imagery, among many others.

### United States

#### [The National Map](https://www.usgs.gov/core-science-systems/national-geospatial-program/national-map)

Another source for a variety of maps and data, US only.

#### [U.S. Census Bureau](https://www.census.gov/)

* Decennial Census
* American Community Survey (ACS)
* [![American FactFinder banner and logo](https://i.imgur.com/PJgFQIF.png)](https://factfinder.census.gov/) [American FactFinder](https://factfinder.census.gov/) (data access)

* [![TIGER/Line logo](https://i.imgur.com/7ppcgu3.png)](https://www.census.gov/geo/maps-data/data/tiger-line.html)
Statistical and administrative units ([TIGER/Line](https://www.census.gov/geo/maps-data/data/tiger-line.html)) 

#### [National Historical Geographic Information System (NHGIS)](https://www.nhgis.org/)

[![](https://i.imgur.com/MrgAehR.png)](https://www.nhgis.org/)

* An IPUMS product
* Significantly more usable interface for obtaining Decennial Census, ACS, and other demographic data from the U.S. Census Bureau
* Same holds for geography/geometry
* Coastal waters (including the Great Lakes) are erased from their boundaries, which is great for working with Minnesota when Census cartographic boundaries won't cut it


#### [PRISM Climate Data](http://prism.oregonstate.edu/)
From the PRISM Climate Group at Oregon State, includes temperatures, precipitation, and more for the US, including 30-year normals, yearly, and monthly values.

![30-year normal precipitation (annual) map of the U.S.](https://i.imgur.com/E0dVQtN.png)


#### [USDA NRCS Geospatial Data Gateway](https://datagateway.nrcs.usda.gov/)
Find NAIP imagery and a variety of other data by location.

Looking for Minnesota NAIP imagery? See the goodies section.


### Minnesota

#### [Minnesota Geospatial Commons](https://gisdata.mn.gov/)
[![Minnesota Geospatial Commons wordmark](https://i.imgur.com/uvblB2D.png)](https://gisdata.mn.gov/)
Looking for geospatial data related to Minnesota? Start here. The U, state agencies, counties, the Met Council, and more.
* [MnDOT datasets](https://gisdata.mn.gov/organization/us-mn-state-dot) available via the Commons (e.g., centerlines, airports)
* [MetroGIS Regional Parcel Dataset (quarterly)](https://gisdata.mn.gov/dataset/us-mn-state-metrogis-plan-regonal-prcls-open), plus end-of-year options back to 2002
* [MNDNR Hydrography](https://gisdata.mn.gov/dataset/water-dnr-hydrography)
* [City, township, and unorganized territory (CTU) boundaries](https://gisdata.mn.gov/dataset/bdry-mn-city-township-unorg)

#### [MnTopo](http://arcgis.dnr.state.mn.us/maps/mntopo/)
Select an area of interest, get zipped data
* DEM
* Elevation contours
* Hillshades
* LiDAR points (LAS)
* Building footprints (your mileage will vary)

![Screenshot of MnTOPO interface with download options](https://i.imgur.com/dj0ygo1.png)

#### [Borchert Map Library](https://www.lib.umn.edu/borchert)
* [Minnesota Historical Aerial Photographs Online (MHAPO)](https://www.lib.umn.edu/apps/mhapo/)
* [Map & Geography Related Online Resources](https://www.lib.umn.edu/borchert/map-geography-online-resources)

#### [Remote Sensing and Geospatial Analysis Laboratory](https://rs.umn.edu/mndata)
* various land cover, imperviousness, and canopy products
* 15-meter resolution statewide, 1-meter metro areas
* [Minnesota Geospatial Image Service](http://www.mngeo.state.mn.us/chouse/wms/geo_image_server.html)

#### [Minnesota LiDAR information](http://www.mngeo.state.mn.us/chouse/elevation/lidar.html)
* Easy to get via MnTopo for small areas
* Note that it's a bit dated; there's hope for new collections (effort underway?)

### Goodies on `files.umn.edu`

U-Spatial keeps a collection of datasets on `files.umn.edu` for anyone at the U to use (at least, for research and teaching). These datasets tend to be large, hard to come by, or slow to download from their authoritative home.

`\\files.umn.edu\us\gis\u-spatial\UMN_Users\data`

You can also try getting there using [webfiles.umn.edu](https://webfiles.umn.edu).

* Census -- some enumeration units because the Census FTP server can be dreadful
* Esri Data -- Includes the latest address locator files from the Business Analyst US 2018 data (for research and academic use only)
* Imagery
    * National Agriculture Imagery Program (NAIP), whole-state four-band (RGB + NIR) for various years; not flown every year
    * Hennepin County<sup>*</sup>
    * Ramsey County<sup>*</sup>
    * <sup>*</sup>_For use other than research and academic use, check with U-Spatial_
* Soil -- Minnesota and surrounding states gridded SSURGO data

## Apps, tools, symbolization

### [QGIS](https://qgis.org/)

<img src="https://i.imgur.com/mOZFDzL.png" alt="QGIS logo" style="max-width:300px;">
QGIS is free and open source software, cross-platform, and has a solid set of plugins. Give it a spin, especially if you're not running Windows.

And, if you find yourself using ArcGIS primarily to view data, you might find QGIS quite a bit faster for some things.

### R/RStudio

FWIW, I'm not a regular R user, but R (with RStudio on top of it) has come a long way and is becoming more popular for making maps. As a stats application, it also offers a lot of power when working with attribute data.

As seen at FOSS4G NA 2018:

In [1]:
%%html
<blockquote class="twitter-tweet" data-lang="en"><p lang="en" dir="ltr">Thanks to everyone who came to Mapping American Community Survey with R! <a href="https://twitter.com/hashtag/FOSS4GNA?src=hash&amp;ref_src=twsrc%5Etfw">#FOSS4GNA</a> Workshop Materials: <a href="https://t.co/A3BKL6XLKe">https://t.co/A3BKL6XLKe</a></p>&mdash; Lee Hachadoorian (@LHachadoorian) <a href="https://twitter.com/LHachadoorian/status/996426940717379584?ref_src=twsrc%5Etfw">May 15, 2018</a></blockquote>
<script async src="https://platform.twitter.com/widgets.js" charset="utf-8"></script>

### Esri ArcGIS
The University of Minnesota has an enterprise agreement with Esri that makes most Esri software available to researchers, instructors, and students at no cost (some differences for admin use). Good for U-owned machines. For personal machines contact [uspatial@umn.edu](mailto:uspatial@umn.edu) for a license code for covered software. See the [U-Spatial software page](https://research.umn.edu/units/uspatial/software) for more details.

#### Some of the software
* [ArcGIS Online](https://umn.maps.arcgis.com/)
* ArcGIS Desktop
    * contemporary: [ArcGIS Pro](http://pro.arcgis.com/en/pro-app/)
    * "legacy": [ArcMap/ArcCatalog](http://desktop.arcgis.com/en/arcmap/)
* [Business Analyst Desktop](https://doc.arcgis.com/en/business-analyst/desktop/what-is-business-analyst.htm) with US dataset
* [Business Analyst Online](https://doc.arcgis.com/en/business-analyst/web/welcome.htm)
* [Survey123](https://survey123.arcgis.com/) (surveys that can easily take point location input)
* [Collector for ArcGIS](https://www.esri.com/en-us/arcgis/products/collector-for-arcgis/overview) (field data collection)
* [ArcGIS Enterprise](https://enterprise.arcgis.com/en/) (Server and Portal)

#### Data
ArcGIS Online offers basemaps and a variety of spatial data (e.g., [Living Atlas](https://livingatlas.arcgis.com/en/)) that can be used in ArcGIS Online and other Esri/ArcGIS products.

The Business Analyst US 2018 dataset includes ACS data, derived demographics, business listings, address coders, and more. 

#### Tip: Python scripting using `arcpy`

If you're using an ArcGIS Desktop product for the tools and want to develop a workflow using the Python `arcpy` module, do it! Easy option: Run a tool with the appropriate parameters in ArcGIS and "copy Python snippet" into a `.py` file somewhere else. Not everything in ArcGIS is available in `arcpy`, but you'll be mostly covered. If you're comfortable with Python, it's arguably a lot easier than using ModelBuilder.

![](https://i.imgur.com/38c1HkV.png)

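```python
import arcpy

# Add a nullable short-integer field named PathOrder to the feature class.
# Positional arguments left as None (precision, scale, length, alias, domain)
# fall back to the tool defaults.
arcpy.management.AddField(
    r"C:\workspaceC\MWMO_flowpath_pro\MWMO_flowpath_pro.gdb\subshed_initial\subshed_flow_path_points",
    "PathOrder",
    "Short",
    None,
    None,
    None,
    None,
    "NULLABLE",
    "REQUIRED",
    None,
)
```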

### Python modules `fiona` and `shapely`

`fiona` is a go-to Python module for reading and writing shapefiles.

Want to operate on geometries? See your geometries easily in Jupyter? Check out `shapely`. Want convex hulls? Buffers? Manipulation is easy, and it's easy to tie together with Fiona and GeoJSON.

In [None]:
# Check the coordinate reference system (CRS) and schema of the source file
import os
import fiona
from pprint import pprint

with fiona.open(os.path.join('data', 'MN_tract_2010_wgs84.shp')) as _in:
    print('CRS:', _in.crs, '\n')
    pprint(_in.schema)

In [None]:
# Who needs geometry? Count tracts by county (FIPS), get the seven most common
from collections import Counter
with fiona.open(os.path.join('data', 'MN_tract_2010_wgs84.shp')) as _in:
    tractsByCounty = Counter(x['properties']['COUNTYFP10'] for x in _in)
pprint(tractsByCounty.most_common(7))

In [None]:
# Why download county population counts? Roll them up from tracts
popByCounty = Counter()
with fiona.open(os.path.join('data', 'MN_tract_2010_wgs84.shp')) as _in:
    for x in _in:
        # Add each tract's population to its county's running total
        popByCounty[x['properties']['COUNTYFP10']] += x['properties']['POP2010']
pprint(popByCounty.most_common(7))

In [None]:
# Stop reading the file over and over, just keep it in memory
with fiona.open(os.path.join('data', 'MN_tract_2010_wgs84.shp')) as _in:
    tracts = list(_in)

In [None]:
# Union the tracts together to get counties
from collections import defaultdict
from shapely.geometry import shape
from shapely.ops import unary_union
# Group tract geometries by county FIPS code
tractGeoms = defaultdict(list)
for x in tracts:
    tractGeoms[x['properties']['COUNTYFP10']].append(shape(x['geometry']))
# Dissolve each county's tracts into a single geometry
countyGeoms = {k: unary_union(v) for k, v in tractGeoms.items()}

In [None]:
# Ramsey
countyGeoms['123']

In [None]:
# Hennepin
countyGeoms['053']

In [None]:
# The seven county metro
from shapely.geometry import MultiPolygon
MultiPolygon([g for k, g in countyGeoms.items() if k in ['003', '019', '037', '053', '123', '139', '163']])

In [None]:
# Union the counties together to get the state
# (pass a list; a dict view is not a sequence of geometries)
mn = unary_union(list(countyGeoms.values()))
mn

In [None]:
# A most exciting point
from shapely.geometry import Point
ourCoords = (-93.237278, 44.974063)
weAreHere = Point(ourCoords)
weAreHere

In [None]:
# Basic intersection testing
print('We are in Minnesota?', weAreHere.intersects(mn))
print('We are in Ramsey County?', weAreHere.intersects(countyGeoms['123']))
print('We are in Hennepin County?', weAreHere.intersects(countyGeoms['053']))

In [None]:
# What tract are we in? Better ways to do this, no doubt, but this is quick enough
for t in [x for x in tracts if x['properties']['COUNTYFP10'] == '053']:
    if weAreHere.intersects(shape(t['geometry'])):
        print('Located Hennepin County, {}'.format(t['properties']['NAMELSAD10'])) 
        break

In [None]:
# How many transit stops (excluding MVTA) are there within
# a half kilometer of our point?
import csv
bufferDist = 500
with open(os.path.join('data', 'transit_stops.txt')) as _in:
    reader = csv.DictReader(_in)
    stops = [row for row in reader]

# Buffering geographic coordinates doesn't work well, convert
# to UTM15/WGS84 (EPSG:32615). Using utm, could also use pyproj.
# The linear unit of measure is meters. 
import utm
bufferedHere = Point(utm.from_latlon(*reversed(ourCoords))[:2]).buffer(bufferDist)

# Create points for all the stops in the same PCS and test
nearbyStops = []
for s in stops:
    pt = Point(utm.from_latlon(float(s['stop_lat']), float(s['stop_lon']))[:2])
    if pt.intersects(bufferedHere):
        nearbyStops.append(s['stop_name'])

print(f'There are {len(nearbyStops)} stops within {bufferDist}m of our location:')
for x in sorted(nearbyStops):
    print('-', x)
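
Why doesn't buffering geographic coordinates work well? A degree of longitude is shorter than a degree of latitude everywhere but the equator, so a "circle" drawn in degrees is squashed on the ground. A rough sketch, assuming a spherical earth with a 6,371 km radius; the 0.005 degree radius is just an example value:

```python
import math

EARTH_RADIUS_M = 6371000.0


def meters_per_degree(lat):
    """Approximate ground length of one degree of longitude and latitude."""
    per_deg_lat = math.pi * EARTH_RADIUS_M / 180.0
    per_deg_lon = per_deg_lat * math.cos(math.radians(lat))
    return per_deg_lon, per_deg_lat


lon_m, lat_m = meters_per_degree(44.974063)

# Ground size of a buffer with a radius of 0.005 degrees
radius_deg = 0.005
ew = radius_deg * lon_m  # east-west reach in meters
ns = radius_deg * lat_m  # north-south reach in meters
print(f"East-west: {ew:.0f} m, north-south: {ns:.0f} m")
```

At our latitude the buffer reaches noticeably farther north-south than east-west, which is why the cell above projects to UTM before buffering.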

### [mapshaper](https://mapshaper.org/)
Need to convert a shapefile to GeoJSON? TopoJSON? Some other direction? Mapshaper can help. Simplification options are included, coordinate precision control is provided (avoid GeoJSON bloat!), fields can be excluded, and more. Node.js app with CLI and GUI interfaces.

A note on TopoJSON: It's great for data with shared boundaries, e.g. states, counties, and Census statistical enumeration units, not so useful without replicated geometry.
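
On the GeoJSON bloat point: coordinates are stored as text, so every extra decimal place costs bytes on every vertex. A small illustration with a made-up two-vertex line, using only the standard library. This shows the general idea and is not how mapshaper implements its precision option. Five decimal places of a degree is roughly a meter on the ground.

```python
import json


def round_coords(coords, ndigits):
    """Recursively round every number in a nested GeoJSON coordinate array."""
    if isinstance(coords, (int, float)):
        return round(coords, ndigits)
    return [round_coords(c, ndigits) for c in coords]


# Hypothetical geometry with far more precision than the data supports
geometry = {
    "type": "LineString",
    "coordinates": [
        [-93.23727812345678, 44.97406312345678],
        [-93.23512398765432, 44.97598723456789],
    ],
}

trimmed = dict(geometry, coordinates=round_coords(geometry["coordinates"], 5))

before = len(json.dumps(geometry))
after = len(json.dumps(trimmed))
print(f"{before} characters before, {after} after")
```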

![Screenshot of Minnesota voting districts and export options in MapShaper](https://i.imgur.com/xFCrHYa.png)

#### Quick demo of mapshaper GUI
1. [Download MN county boundaries](https://maps.umn.edu/day-of-data-2019/shp_bdry_counties_in_minnesota.zip) ([alternate link](ftp://ftp.gisdata.mn.gov/pub/gdrs/data/pub/us_mn_state_dnr/bdry_counties_in_minnesota/shp_bdry_counties_in_minnesota.zip))
2. Visit [mapshaper](https://mapshaper.org)
3. Upload the boundaries
4. Try out simplification

### [LAStools](https://rapidlasso.com/LAStools/)
Working with LiDAR data? Need to generate a DEM, DSM, or other product from classified points? Check out LAStools, which the U licenses. See the [LAStools section on the U-Spatial software page](https://research.umn.edu/units/uspatial/software#lastools) for more details.

![View of Bruininks Hall and more using lasview](https://i.imgur.com/jhU6bO7.png)

Data: [LiDAR Elevation, Twin Cities Metro Region, Minnesota, 2011](https://gisdata.mn.gov/dataset/elev-lidar-metro2011)

### [BBBike Map Compare tool](https://mc.bbbike.org/mc/)
You might just want a basemap. Side-by-side comparison of basemaps. Choose wisely; know the purpose of your map before choosing a basemap and before symbolizing what you put on top of the basemap.
[![screenshot of BBBike Map Compare tool](https://i.imgur.com/Ishawog.jpg)](https://mc.bbbike.org/mc/)

### [ColorBrewer](http://colorbrewer2.org)

Color vision deficiency (CVD) affects roughly 4-5% of the population, predominantly men, and can pose significant accessibility and usability problems with maps, websites, and more. ColorBrewer presents a range of color palettes and ramps, with a colorblind-friendly option available.

ColorBrewer styles are being baked into more apps, e.g. D3, ArcGIS Pro, and QGIS. Default color schemes may get tiresome, but at least a colorblind-friendly default won't cause accessibility problems.

[![screenshot of colorbrewer2.org](https://i.imgur.com/Kdvz9fM.png)](http://colorbrewer2.org)

## Learning, training, etc

### U-Spatial
[U-Spatial offers training](https://research.umn.edu/units/uspatial/training) throughout fall and spring semesters. These are typically half-day or full-day workshops, at little or no cost for UMN attendees. Most training materials are available for download off the webpage.

### QGIS learning
* [QGIS training material](https://www.qgis.org/en/site/forusers/trainingmaterial/)
* [PyQGIS 101](https://anitagraser.com/pyqgis-101-introduction-to-qgis-python-programming-for-non-programmers/)

### ArcGIS training
Many (all?) of Esri's web courses are available at no charge to UMN students, faculty, and staff. Visit the [Esri course catalog](https://www.esri.com/training/catalog/search/) when signed in to Esri with your Internet ID to see the offerings. Try limiting to "web courses."

[![screenshot of Esri training catalog search](https://i.imgur.com/u7bHOt1.png)](https://www.esri.com/training/catalog/search/)

### Esri event videos
U-Spatial purchases videos of sessions from the Esri User Conference and Esri Developer Summit. Find them near the goodies at `\\files.umn.edu\us\gis\u-spatial\UMN_Users\videos` (or [webfiles.umn.edu](https://webfiles.umn.edu)).

### FOSS4G (international and more)
Some FOSS4G conferences have slides and/or video for sessions available, for example [FOSS4G 2017 (Boston)](http://2017.foss4g.org/post_conference/#slides).