In [19]:
# HIDDEN
from datascience import *
import math
import numpy as np
from geopy.distance import vincenty # Used to calculate distances on an ellipsoid
import json

from osgeo import ogr, osr, gdal

# Enable GDAL/OGR exceptions
gdal.UseExceptions()

# If gdal is having trouble finding the *.csv files
# that contain the Proj definitions, set the path explicitly
gdal.SetConfigOption('GDAL_DATA','/opt/conda/share/gdal')
#print(gdal.GetConfigOption('GDAL_DATA'))

# GDAL error handler function so we get more descriptive and helpful error messages
# Source: The Python GDAL/OGR Cookbook
def gdal_error_handler(err_class, err_num, err_msg):
    errtype = {
            gdal.CE_None:'None',
            gdal.CE_Debug:'Debug',
            gdal.CE_Warning:'Warning',
            gdal.CE_Failure:'Failure',
            gdal.CE_Fatal:'Fatal'
    }
    err_msg = err_msg.replace('\n',' ')
    err_class = errtype.get(err_class, 'None')
    print('Error Number: %s' % (err_num))
    print('Error Type: %s' % (err_class))
    print('Error Message: %s' % (err_msg))

gdal.PushErrorHandler(gdal_error_handler)

0

## Homework Exercise 3: Exploring Coordinate Reference Systems and Distance, Area, and Direction Calculations
*last updated 2/16/2016 by pattyf@berkeley.edu*


## What to Submit

- Do your work in this notebook. Add **Markdown or Code cells** if needed to input your work / responses to questions.

- Save your work frequently as you go along.

- When you are done, download your work as both an IPython notebook and a PDF file. Zip the two together and give the file a name like **yourlastname_hw3.zip**.

- Upload your zip file on bCourses as your submission for this assignment.


## Part 1. Distance Calculations 


### Introduction

You may have learned in high school that you can use the Pythagorean Theorem to calculate the shortest, or straight-line distance, between two points in a plane. The term **plane** is used to describe two-dimensional space and is also called a **Cartesian coordinate system**, named after the French philosopher, mathematician and scientist Rene Descartes. In geospatial texts this is also called a **planar coordinate system**. The theorem can be written as an equation that relates the 3 sides of a right triangle:

$$a^2 + b^2 = c^2$$



Here, the distance between the two points, $(x_0,y_0)$ and $(x_1,y_1)$, is represented by the hypotenuse **c** and is equal to the square root of the sum of the squared differences between the points' coordinates. This distance **D** is known as the **Euclidean distance** and is calculated as follows:

$$D = \sqrt{(x_0-x_1)^2 + (y_0-y_1)^2}$$


We can implement this formula in python using the function below.

In [5]:
# a function to calculate distance using the pythagorean theorem
def euclidean_dist ( x1,y1, x2, y2 ):
    x_dist = x1 - x2  
    y_dist = y1 - y2    
    dist_sq = x_dist**2 + y_dist**2   

    distance = round(math.sqrt(dist_sq), 4) 

    return distance

Let's use this function to calculate the distance between two international airports, Seattle Tacoma International Airport (SEA) and Dallas / Fort Worth International Airport (DFW). We can get the geographic coordinates for these places by searching http://geonames.org. Geonames is a **gazetteer**, an index or database that provides the geographic coordinates for place names as well as other information like place type.

Using Geonames the following coordinates were retrieved:

- **47.449, -122.309** # Seattle Tacoma International Airport (SEA)
- **32.896, -97.037** # Dallas / Fort Worth International Airport (DFW)

The distance between these two airports, according to the Great Circle Mapper (**GCM**) (http://www.gcmap.com/) is **1660** miles or **2671.5** kilometers (km). Let's figure out the best method to calculate an accurate distance between these locations and why.


Below we create python tuples that contain coordinates for a set of airport locations using coordinates from geonames.org.

In [6]:
SEA = (47.449, -122.309) # Seattle Tacoma International Airport (SEA)
DFW = (32.896, -97.037)  # Dallas / Fort Worth International Airport (DFW)
SFO = (37.619, -122.376)  # San Francisco International Airport
LAX = (33.943, -118.409) # Los Angeles International Airport
MNL = (14.505, 121.004)  # Ninoy Aquino International in Manila, Philippines

# not used yet
BLR = (13.201, 77.709)   # Kempegowda International Airport in Bengaluru India
BKK = (13.692, 100.751)  # Suvarnabhumi Airport Bangkok Thailand
ATL = (33.641, -84.423)  # Hartsfield-Jackson Atlanta International Airport

We can then use the function to calculate the distance between DFW and SEA as follows:

In [7]:
euclidean_dist(SEA[1],SEA[0],DFW[1], DFW[0])

29.1627

### Question 1
a) In the points for SEA and DFW, which index (0 or 1) represents the geographic east-west value, or longitude, and which the north-south value, or latitude? 

b) Does coordinate order (x,y vs y,x) matter when calculating Euclidean distance?

c) In the cell below re-write the function above to take two points as arguments rather than the four x,y values.

In [8]:
def euclidean_dist ( pt1, pt2 ):
    # your code here
    return distance

### Interpreting the output

What does that result mean (29.16..)? What are the units? Is that the correct distance?

The value 29 is nowhere near the distance of 2671.5 km given by [Great Circle Mapper](http://www.gcmap.com/). The units are decimal degrees, which are angular units, not linear ones. One might be tempted to convert decimal degrees to kilometers, reasoning that both measure distance on the surface of the Earth so there must be a conversion factor. However, because the Earth is not flat, the conversion is not that simple:

- 1 degree of latitude on the earth measures about 111.2 km or 69 miles. It varies very slightly, from 110.6 km at the equator to 111.7 km at the poles.
- 1 degree of longitude varies from 111.3 km  at the equator to 0 at the poles.
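
The shrinking length of a degree of longitude can be approximated by scaling the equatorial value by the cosine of the latitude. The sketch below assumes a perfectly spherical Earth and the 111.3 km figure quoted above; `km_per_degree_lon` is a hypothetical helper name used only for illustration.

```python
import math

KM_PER_DEGREE_LON_AT_EQUATOR = 111.3  # value quoted in the text above

def km_per_degree_lon(lat_degrees):
    """Approximate length (km) of one degree of longitude at a given latitude.

    Assumes a spherical Earth, so this is only an approximation.
    """
    return KM_PER_DEGREE_LON_AT_EQUATOR * math.cos(math.radians(lat_degrees))

# Print the approximate length at a few latitudes, from the equator to the pole
for lat in (0, 30, 60, 90):
    print(lat, round(km_per_degree_lon(lat), 1))
```

The printed values fall steadily from the equatorial figure toward zero at the pole, which is why a single degrees-to-kilometers factor cannot work for longitude.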


### Question 2

Given the above information, in the cell below write expressions to:

a) Estimate the circumference of the earth in kilometers along the equator.

b) Estimate the circumference along the poles given a flattening ratio of 0.0033

c) Do these values differ? Why? What is the relationship between these two values called?


In [9]:
# Replace the ones with your code
earth_circumference_along_equator = 1 # replace with your formula
flattening_ratio = 0.0033
earth_circumference_along_poles = 1 # replace with your formula
semi_major_axis = 1 # replace with formula
semi_minor_axis = 1 # replace with formula

print("The circumference of the earth along the equator is approximately ", earth_circumference_along_equator, "km." )
print("and along the poles is approximately ", earth_circumference_along_poles, "km.")
print("These values are different because ...")
print('Based on these values the flattening ratio is  ', flattening_ratio)
print('Earth Radius at equator: ', semi_major_axis)
print('Earth Radius at poles: ', semi_minor_axis)

The circumference of the earth along the equator is approximately  1 km.
and along the poles is approximately  1 km.
These values are different because ...
Based on these values the flattening ratio is   0.0033
Earth Radius at equator:  1
Earth Radius at poles:  1


### Converting Decimal Degrees to Kilometers
Let's simplify and assume that one degree of latitude or longitude is about 111 km and convert the distance between these two airports from degrees to kilometers.

In [10]:
euclidean_dist(SEA,DFW) * 111

NameError: name 'distance' is not defined

Our distance calculation of 3237 km is off by about **565** km from the distance given by Great Circle Mapper (2671.5 km). Clearly, we cannot assume a flat earth and use Euclidean geometry to calculate the distance between geographic coordinates. The results would range from inaccurate to nonsensical.
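
Much of that error comes from treating a degree of longitude as 111 km everywhere. As a rough illustration (not a recommended method), the sketch below scales the longitude difference by the cosine of the mean latitude before applying the Pythagorean theorem. The function name `scaled_planar_dist_km` and the per-degree constants are assumptions taken from the figures quoted earlier, and the sketch ignores cases such as routes that cross the 180 degree meridian.

```python
import math

SEA = (47.449, -122.309)  # (lat, lon)
DFW = (32.896, -97.037)   # (lat, lon)

KM_PER_DEGREE_LAT = 111.2             # approximate value from the text
KM_PER_DEGREE_LON_AT_EQUATOR = 111.3  # approximate value from the text

def scaled_planar_dist_km(pt1, pt2):
    """Planar distance estimate with longitude scaled by cos(mean latitude)."""
    mean_lat = math.radians((pt1[0] + pt2[0]) / 2)
    y_km = (pt1[0] - pt2[0]) * KM_PER_DEGREE_LAT
    x_km = (pt1[1] - pt2[1]) * KM_PER_DEGREE_LON_AT_EQUATOR * math.cos(mean_lat)
    return math.sqrt(x_km ** 2 + y_km ** 2)

print(round(scaled_planar_dist_km(SEA, DFW), 1))
```

For this pair of airports the estimate lands within a few tens of kilometers of the Great Circle Mapper figure, far closer than the unscaled calculation, but it is still an approximation that degrades over longer distances and at high latitudes.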

### Measuring Distance on the Spherical Earth

The simplest model of the non-flat Earth is a sphere. The shortest path between two points on a sphere is along a circle formed when the sphere is sliced through those points and the center of the sphere. This is known as a **great circle arc**. We can calculate the length of the great circle arc, also called *great circle distance*, between two points $(Lon_0,Lat_0)$ and $(Lon_1,Lat_1)$ using the following formula (GISS, p. 89).

$$D = R \cos^{-1}\left[\sin(Lat_0)\sin(Lat_1) + \cos(Lat_0)\cos(Lat_1)\cos(Lon_0 - Lon_1)\right]$$

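Strictly speaking, this is the spherical law of cosines form of the great circle distance. It is closely related to the [haversine formula](https://en.wikipedia.org/wiki/Haversine_formula), which gives the same result on a sphere but behaves better numerically for very short distances; this notebook uses the name haversine for it. It can be used to estimate the distance between two points on the earth if we know their latitude (**Lat**) and longitude (**Lon**) values and the radius (**R**) of the earth. The latitude and longitude values are first converted to radians (one radian is equal to 180/pi degrees). Note that the inverse cosine ($\cos^{-1}$) is `math.acos` in Python.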

Below is a function to calculate the distance between two points specified by longitude and latitude using the haversine formula.

In [11]:
# a function to calculate distance on a sphere using the haversine formula
EARTH_RADIUS_KM = 6371  # Radius of Earth
                        # If the radius is specified in kilometers, 
                        # the return value will also be in kilometers
def haversine_dist(pt1,pt2,order=0):
    lon0 = math.radians(pt1[0+order])
    lon1 = math.radians(pt2[0+order])
    lat0 = math.radians(pt1[1-order])
    lat1 = math.radians(pt2[1-order])
    
    d = EARTH_RADIUS_KM * math.acos((math.sin(lat0)*math.sin(lat1)) + 
                        (math.cos(lat0)*math.cos(lat1)*math.cos(lon0-lon1))) 
    d = round(d,4)

    return d
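
The function above evaluates the great circle distance with an inverse cosine. The haversine formula is usually written with sines of half-angle differences instead. Below is a minimal sketch of that form for comparison; `haversine_form_dist` is a hypothetical name, and the points are assumed to be in (lat, lon) order.

```python
import math

EARTH_RADIUS_KM = 6371  # same spherical radius used above

def haversine_form_dist(pt1, pt2, radius_km=EARTH_RADIUS_KM):
    """Great circle distance using the half-angle (haversine) form.

    pt1 and pt2 are assumed to be (lat, lon) tuples in decimal degrees.
    """
    lat0, lon0 = math.radians(pt1[0]), math.radians(pt1[1])
    lat1, lon1 = math.radians(pt2[0]), math.radians(pt2[1])
    dlat = lat1 - lat0
    dlon = lon1 - lon0
    a = math.sin(dlat / 2) ** 2 + math.cos(lat0) * math.cos(lat1) * math.sin(dlon / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))

SFO = (37.619, -122.376)
LAX = (33.943, -118.409)
print(round(haversine_form_dist(SFO, LAX), 4))
```

Both forms describe the same distance on a sphere. The half-angle form is often preferred for nearby points because `math.acos` is sensitive to rounding when its argument is very close to 1.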


### Question 3.
a) Use the above function to compute the distance between SEA and DFW with the haversine Formula. Enter the code and a print statement below.

b) In a comment in the cell below, describe the **order** argument and show what it should be for this calculation.

In [12]:
# Haversine distance between SEA and DFW
print("your code and answer in here" )

your code and answer in here


Using the haversine formula, our distance calculation is now only about 3 km off of the Great Circle Mapper distance (2671.226 - 2668.2822 km). That's pretty good. Why not just call it a day? There are two main reasons: 1) some applications require a higher level of accuracy (e.g., a drone delivering a package to your doorstep), and 2) the Earth, as much as we would like it to be, is not a sphere. Its shape is an **ellipsoid** (sometimes referred to as a spheroid). Also, we have only calculated the distance between one pair of points; distance calculations based on a spherical earth model may be more or less accurate for other locations.

### Measuring Distance on the Ellipsoidal Earth

An ellipsoidal model of the Earth is specified by the lengths of its semi-major and semi-minor axes, or by the length of the semi-major axis and the polar flattening ratio. The main reference ellipsoids in use in North America are:

- World Geodetic System of 1984, or **WGS-84** (6378.137 km, 6356.7523142 km)
- The Geodetic Reference System of 1980, or **GRS-80** (6378.137 km, 6356.7523141 km)
- **Clarke (1866)** (6378.206 km, 6356.5838 km, inverse flattening 294.978698214)
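
The two ways of specifying an ellipsoid are linked by the flattening, f = (a - b) / a, where a is the semi-major axis and b is the semi-minor axis. As a small sketch, the code below derives the flattening and its inverse from the WGS-84 and GRS-80 axis lengths listed above; the dictionary and the function name `flattening` are illustrative assumptions, not part of geopy or GDAL.

```python
# Semi-major and semi-minor axes in kilometers, as listed above
ELLIPSOID_AXES_KM = {
    "WGS-84": (6378.137, 6356.7523142),
    "GRS-80": (6378.137, 6356.7523141),
}

def flattening(semi_major, semi_minor):
    """Polar flattening f = (a - b) / a of an ellipsoid."""
    return (semi_major - semi_minor) / semi_major

for name, (a, b) in ELLIPSOID_AXES_KM.items():
    f = flattening(a, b)
    print(name, "flattening:", f, "inverse flattening:", 1 / f)
```

The two ellipsoids differ only far down in the decimal places of the semi-minor axis, which is why WGS-84 and GRS-80 are so often treated as interchangeable.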

WGS84 is the default ellipsoidal model and is implemented by the WGS84 Geographic Coordinate Reference System (CRS). GRS-80 is the ellipsoidal model used by the North American Datum of 1983 (NAD83) and the NAD83 CRS. The Clarke 1866 ellipsoid is used by the North American Datum of 1927, or NAD27 CRS. NAD27 is obsolete, but some data that reference it are still in use. NAD27 locations can differ from NAD83 or WGS84 locations by up to 100 meters. See https://en.wikipedia.org/wiki/North_American_Datum for a discussion and details.

The NAD83 and WGS84 CRSs are almost identical. However, when performing geoprocessing operations in software *almost* isn't good enough - you may get errors if you do not transform values to the same CRS.

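Distance calculations based on the ellipsoidal model of the earth employ **Vincenty's formula**. The math is much more complicated, so we will use its implementation in the Python **geopy** module. Note that `vincenty` was deprecated in geopy 1.13 and removed in geopy 2.0 in favor of `geodesic`, so the code below requires an older geopy release. You can read about it here: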
https://github.com/geopy/geopy/blob/master/geopy/distance.py

In [13]:
# try vincenty - default ellipsoid is WGS84. Other ellipsoids are listed at: 
# https://github.com/geopy/geopy/blob/master/geopy/distance.py
print(vincenty(SEA, DFW).km)


2671.1369110333435


We now have approximately the same distance using the vincenty formula as we got from Great Circle Mapper! 
That is because GCM uses this formula (see: http://www.gcmap.com/faq/gccalc#gchow0). 
                                                                                                                

### Question 4.
By how many **meters** does the Vincenty function differ from the Haversine function for the distance between SEA and DFW? Show your work in the code block below.

In [14]:
# Calculate difference between Vincenty and Haversine Distances

### Question 5.
Complete the code block below to compute the Vincenty distance between these two airports using all three ellipsoids mentioned above. See the geopy url above for the function arguments. TIP: you may need to define a custom ellipsoid. The geopy documentation shows you how to do this. Discuss any differences in the results. In what order does the vincenty() function require the coordinates?

In [15]:
### Insert your code:
#wgs84_dist = vincenty(..).km
#grs80_dist = ..
#clark1866_dist = ..
print('WGS84 distance between SEA and DFW is %.2f km' % wgs84_dist)
#...add your print code
print('The difference between WGS84 and GRS80 distance calculations is ...')
print('The difference between GRS80 and Clark1866 distance calculations is ...')

NameError: name 'wgs84_dist' is not defined

### Question 6.

a) Calculate the Haversine and Vincenty distances for SEA and MNL. 

b) Is the difference between these values similar to the difference between the values for SEA and DFW? If not, why not?

## Map Projections

If we can accurately calculate distances using the ellipsoidal earth model why do we need map projections? Map projections are used to transform 3D geographic coordinates to a 2D planar surface. We do this so we can produce 2D visualizations of geographic locations, e.g., on printed maps and computer screens. We also need map projections because ellipsoidal calculations more complex than point distances are too computationally intensive and are not widely implemented.  Map projections are designed to minimize distortion in area, shape, distance and/or direction for specific regions on the earth. No map projection can eliminate all forms of distortion for all regions. The selection of the appropriate map projection is extremely important and must be made with respect to the needs of the application at hand.

### We can use the GDAL & OGR libraries to transform geographic coordinates to projected coordinates.

In order to transform geographic coordinates to projected coordinates we need to import some python modules to help with the calculations. The osgeo gdal, ogr and osr libraries are python ports of the powerful and widely used GDAL/OGR libraries for importing, manipulating and transforming geospatial data. See http://gdal.org for details on its functionality. The Python GDAL/OGR Cookbook (http://pcjericks.github.io/py-gdalogr-cookbook) provides lots of examples for implementing this functionality. GDAL/OGR use the [proj4](https://trac.osgeo.org/proj/) library for coordinate reference system definitions and transformations.

### Coordinate Reference System (CRS) / Spatial Reference System (SRS) 

The terms coordinate reference system (CRS) and spatial reference system (SRS) are used interchangeably to refer to the definition of the space that gives geographic meaning to locations expressed as coordinates. There are thousands of CRSs in current use but only a handful are widely used. You can use http://spatialreference.org to find the parameters for most CRSs. The easiest way to look up a CRS on spatialreference.org is by using the EPSG (European Petroleum Survey Group) code.

- **Geographic Coordinate Reference Systems** (aka unprojected CRS)
    - 4326 - WGS 84  # used by GPS and international mapping efforts
    - 4269 - NAD 83  # used by datasets produced by U.S. agencies like the Census
    
    
- **Projected Coordinate Reference Systems** (aka map projections)
    - 3310 - California Albers Equal Area # for CA-wide or large area data
    - 5070 - Conus (continental US) NAD83 # for US-wide data
    - 26910 - UTM Zone 10N, NAD 83 # for Northern CA data
    - 26911 - UTM Zone 11N, NAD 83 # for Southern CA data
    - 3857 - Web Mercator (aka Google Mercator, Spherical Mercator, Pseudo Mercator) # for web mapping

We can use the EPSG codes to create a spatial reference object and use it to transform coordinates from one CRS to another. When EPSG codes are not available, you can use the **proj4** string from spatialreference.org to define the CRS.

In [16]:
# Create and define a CRS using an EPSG code
WGS84_CRS = osr.SpatialReference()
WGS84_CRS.ImportFromEPSG(4326)
print(WGS84_CRS.ExportToWkt())

# Create and define the same CRS using the proj4 string rather than the EPSG code
WGS84_CRS2 = osr.SpatialReference()
WGS84_CRS2.ImportFromProj4('+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs')
print(WGS84_CRS2.ExportToWkt())

#Are the strings identical? No, slight differences

Error Number: 4
Error Type: Failure
Error Message: Unable to open EPSG support file gcs.csv. Try setting the GDAL_DATA environment variable to point to the directory containing EPSG csv files.

GEOGCS["WGS 84",DATUM["WGS_1984",SPHEROID["WGS 84",6378137,298.257223563,AUTHORITY["EPSG","7030"]],TOWGS84[0,0,0,0,0,0,0],AUTHORITY["EPSG","6326"]],PRIMEM["Greenwich",0,AUTHORITY["EPSG","8901"]],UNIT["degree",0.0174532925199433,AUTHORITY["EPSG","9108"]],AUTHORITY["EPSG","4326"]]


In the next code block we create SpatialReference objects for each of the CRSs we listed above. We can then use those to transform our coordinates from one CRS to another.

In [17]:
# Define the CRSs that we will use

CalAlbers_CRS= osr.SpatialReference()
CalAlbers_CRS.ImportFromEPSG(3310)

NAD83_CRS = osr.SpatialReference()
NAD83_CRS.ImportFromEPSG(4269)  # NAD 83 is EPSG:4269 (4326 is WGS 84)

ConUS_CRS= osr.SpatialReference()
ConUS_CRS.ImportFromEPSG(5070)
 
UTM10_CRS= osr.SpatialReference()
UTM10_CRS.ImportFromEPSG(26910)

UTM11_CRS= osr.SpatialReference()
UTM11_CRS.ImportFromEPSG(26911)

Web_CRS= osr.SpatialReference()
Web_CRS.ImportFromEPSG(3857)

Error Number: 4
Error Type: Failure
Error Message: Unable to open EPSG support file gcs.csv. Try setting the GDAL_DATA environment variable to point to the directory containing EPSG csv files.

6

### Well-Known Text (WKT) Format 
In order to leverage the power of the GDAL/OGR libraries we must put our coordinates in a format that those libraries can work with. One format that works is WKT.  In the next section we will convert our points for SFO and LAX to WKT and then compute the straight line, or as the crow flies, distance between these two airports.

First, create WKT points for these two locations.

In [18]:
# Create points in well-known text format (WKT)
LAX_wkt = "POINT ("+ str(LAX[1]) + " " + str(LAX[0]) + ")" 
SFO_wkt = "POINT ("+ str(SFO[1]) + " " + str(SFO[0]) + ")"

print("LAX: ", LAX_wkt) # Note: there is no comma between the coords in WKT format
print("SFO: ", SFO_wkt)

LAX:  POINT (-118.409 33.943)
SFO:  POINT (-122.376 37.619)
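
Because WKT points are written in x y order (longitude first), while the airport tuples in this notebook store latitude first, it can help to wrap the string building in a small helper. The sketch below is one way to do that; `latlon_to_wkt_point` is a hypothetical name, and it assumes its input is a (lat, lon) tuple.

```python
def latlon_to_wkt_point(latlon):
    """Build a WKT point string from a (lat, lon) tuple.

    WKT expects x (longitude) before y (latitude), separated by a space.
    """
    lat, lon = latlon
    return f"POINT ({lon} {lat})"

LAX = (33.943, -118.409)  # (lat, lon), as defined earlier in the notebook
print(latlon_to_wkt_point(LAX))
```

Keeping the coordinate swap in one place reduces the chance of mixing up lat and lon order when more points are added.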


### Transforming Coordinates

Now that we have defined our CRSs and have our WKT, let's create a function to transform geographic to projected coordinates.


In [20]:
# Define a function to transform geometry in WKT format between two CRSs
def wktTransformCRS(wkt_feature, sourceCRS, targetCRS):
    feature_geom = ogr.CreateGeometryFromWkt(wkt_feature)  # OGR geometry (points in this notebook)
    coordTransform = osr.CoordinateTransformation(sourceCRS, targetCRS)
    feature_geom.Transform(coordTransform)  # reprojects the geometry in place

    # Return JSON because it is easiest to convert the coordinates
    # from JSON to a simple list format
    return feature_geom.ExportToJson()


Test the function with the SFO and LAX coordinates and view the output.

In [21]:
# Transform the output from a JSON string to a list

LAX_json = wktTransformCRS(LAX_wkt, WGS84_CRS, CalAlbers_CRS)
SFO_json = wktTransformCRS(SFO_wkt, WGS84_CRS, CalAlbers_CRS)
print("JSON Feature: ", SFO_json)
LAX_xy = json.loads(LAX_json)['coordinates']
SFO_xy = json.loads(SFO_json)['coordinates']
print("Projected Coordinates: ", LAX_xy[0], LAX_xy[1])

JSON Feature:  { "type": "Point", "coordinates": [ -209409.964107661653543, -41549.634357079397887 ] }
Projected Coordinates:  147083.38813425685 -451255.20336084766


### Question 7.
Compute the Vincenty distance using geographic coordinates and Euclidean distances using projected coordinates between SFO and LAX. Then print the difference between these values.

In [22]:
# get distance between two projected coordinates in kilometers
projected_dist_km = 0 #update
vincenty_dist = 0 # update code

print('Projected distance in %.2f km' % projected_dist_km)
print('Vincenty distance in %.2f km' % vincenty_dist)

the_diff =  vincenty_dist - projected_dist_km
print("Difference between Vincenty distance and projected distance is %.2f km" % the_diff)

Projected distance in 0.00 km
Vincenty distance in 0.00 km
Difference between Vincenty distance and projected distance is 0.00 km


### Summary
Using projected coordinates and simple Euclidean geometry, we can get very accurate distance measurements without the computational overhead of ellipsoidal calculations. However, for larger distances, e.g., greater than the length of a country, use ellipsoidal calculations when high accuracy is required.

### Question 8.

Elephant seals migrate to Año Nuevo State Park (37.1436, -122.3472) to mate between the months of December and March (http://www.parks.ca.gov/?page_id=1115).
When they depart, males and females go their separate ways. The males travel approximately as far as (56.9643, -153.2407) while the females can travel as far as (19.2469, -155.1130). (http://www.esajournals.org/doi/abs/10.1890/0012-9615%282000%29070%5B0353%3AFEONES%5D2.0.CO%3B2)

a) Do the males or females travel farther? *Show your work - indicating how far males and females travel and how you determined those values.*

b) Are the coordinates above listed as (lat, lon) or (lon,lat)? How can you tell?

c) Where do the males and females go, respectively? For each group, name the closest land mass. Tip: enter the coordinates in lat,lon order in the Google Maps search bar.

In [1]:
#Put any code and text (as comments) for Question 8 in this cell.





## Part 2. Map Projections and Area Measurements

Calculating the distance between two points is the same as calculating the length of a line defined by two points. For a line with more than two points you would sum the length of each line segment. Let's now consider polygons, or vector delineations of areas, to see how area measurements are impacted by different map projections. First, import a table that contains the WKT definitions for polygons. 

In [23]:
wkt_file = "./ca_counties_wkt.csv"
myco = Table.read_table(wkt_file)
myco

WKT,STATEFP,COUNTYFP,NAMELSAD,ALAND,AWATER
"MULTIPOLYGON (((-120.655585 39.69356,-120.654227 39.7066 ...",6,91,Sierra County,2468686345,23299112
"MULTIPOLYGON (((-121.188571 38.714308,-121.151857 38.711 ...",6,67,Sacramento County,2499176690,76080730
"MULTIPOLYGON (((-120.581897 34.098557,-120.582264 34.107 ...",6,83,Santa Barbara County,7083926262,2729818040
"MULTIPOLYGON (((-120.630933 38.3411,-120.631795 38.34602 ...",6,9,Calaveras County,2641819811,43808748
"MULTIPOLYGON (((-119.636302 33.27304,-119.63382 33.28932 ...",6,111,Ventura County,4773381212,945947858
"MULTIPOLYGON (((-118.667602 33.477489,-118.666719 33.493 ...",6,37,Los Angeles County,10510365728,1794809423
"MULTIPOLYGON (((-122.93506 38.313952,-122.939709 38.3108 ...",6,97,Sonoma County,4081401375,497545782
"MULTIPOLYGON (((-119.958925 36.255468,-119.959227 36.400 ...",6,31,Kings County,3598582246,5468555
"MULTIPOLYGON (((-117.437426 33.17953,-117.467871 33.2124 ...",6,73,San Diego County,10895213672,826296339
"MULTIPOLYGON (((-121.065436 39.006533,-121.061653 39.010 ...",6,61,Placer County,3644346108,246376805


### About the data
The original source of these data was the US Census TIGER/Line California County data. These data include two columns, **ALAND** and **AWATER**, that contain the area in square meters for the land and water portions of the county respectively. In this next section we will calculate the area for each county using the WKT and compare it to the values reported by the Census. In order to compute the area we must first transform the coordinate data from unprojected geographic coordinates (latitude and longitude values) to projected coordinates. In order to transform the CRS you need to know that the source CRS for this and all Census geographic data (when obtained from the Census) is NAD83. We can then use the [GDAL/OGR GetArea() function](https://pcjericks.github.io/py-gdalogr-cookbook/geometry.html#calculate-the-area-of-a-geometry) to compute the area of the feature.

In the code block below we define a simple function to transform area data in WKT format from one CRS to another and return the area.

In [24]:
# A function to transform the data in WKT format
# from one CRS to another
# and then compute area
# assumes polygon data!
def getProjectedArea(wkt_feature, sourceCRS,targetCRS):
    feature_geom = ogr.CreateGeometryFromWkt(wkt_feature) # ogr polygon
    coordTransform = osr.CoordinateTransformation(sourceCRS,targetCRS)
    feature_geom.Transform(coordTransform)
    return feature_geom.GetArea()
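
Once coordinates are projected, area is a planar calculation in the units of the projection. To make that concrete, here is a minimal sketch of the shoelace formula for a simple polygon ring. It is offered only to illustrate the idea and is not a description of how GDAL implements GetArea(); `shoelace_area` is a hypothetical helper, and it ignores holes and multi-part geometries.

```python
def shoelace_area(ring):
    """Planar area of a simple polygon given as [(x, y), ...] in projected units.

    Works whether or not the first point is repeated at the end of the ring.
    """
    total = 0.0
    n = len(ring)
    for i in range(n):
        x0, y0 = ring[i]
        x1, y1 = ring[(i + 1) % n]
        total += x0 * y1 - x1 * y0
    return abs(total) / 2.0

# A 1 km by 1 km square expressed in meters
square_m = [(0, 0), (1000, 0), (1000, 1000), (0, 1000)]
print(shoelace_area(square_m) / (1000 * 1000), "sq km")
```

Applying a formula like this directly to latitude and longitude values would give an answer in "square degrees", which is why the coordinates are transformed to a projected CRS first.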

We can now compute the area of one polygon in the data table and compare it to the sum of the Census values.

In [25]:
# Compare the area reported by the census with our calculation
census_area = (myco['ALAND'][0] + myco['AWATER'][0]) / (1000 * 1000)
calc_area = getProjectedArea(myco['WKT'][0],NAD83_CRS,CalAlbers_CRS) / (1000 * 1000)
area_diff = census_area - calc_area
print('Census area in sq km: ', census_area)
print('Projected area in sq km: ', calc_area)
print('Difference in sq km:', abs(area_diff))  # area_diff is already in sq km

Census area in sq km:  2491.985457
Projected area in sq km:  2492.0388487105547
Difference in sq km: 0.0533917105545


We can see that the difference for this county is less than one sq km, which is quite small given the total size of the county. We can use the same approach to compute the area from the WKT geometry for each county.

In [26]:
# Compute the area in square meters for the entire table
myco['area_albers'] = myco.apply(lambda x: getProjectedArea (x, NAD83_CRS, CalAlbers_CRS), ['WKT'])
myco

WKT,STATEFP,COUNTYFP,NAMELSAD,ALAND,AWATER,area_albers
"MULTIPOLYGON (((-120.655585 39.69356,-120.654227 39.7066 ...",6,91,Sierra County,2468686345,23299112,2492040000.0
"MULTIPOLYGON (((-121.188571 38.714308,-121.151857 38.711 ...",6,67,Sacramento County,2499176690,76080730,2575380000.0
"MULTIPOLYGON (((-120.581897 34.098557,-120.582264 34.107 ...",6,83,Santa Barbara County,7083926262,2729818040,9806650000.0
"MULTIPOLYGON (((-120.630933 38.3411,-120.631795 38.34602 ...",6,9,Calaveras County,2641819811,43808748,2685480000.0
"MULTIPOLYGON (((-119.636302 33.27304,-119.63382 33.28932 ...",6,111,Ventura County,4773381212,945947858,5717830000.0
"MULTIPOLYGON (((-118.667602 33.477489,-118.666719 33.493 ...",6,37,Los Angeles County,10510365728,1794809423,12301400000.0
"MULTIPOLYGON (((-122.93506 38.313952,-122.939709 38.3108 ...",6,97,Sonoma County,4081401375,497545782,4577120000.0
"MULTIPOLYGON (((-119.958925 36.255468,-119.959227 36.400 ...",6,31,Kings County,3598582246,5468555,3603400000.0
"MULTIPOLYGON (((-117.437426 33.17953,-117.467871 33.2124 ...",6,73,San Diego County,10895213672,826296339,11721300000.0
"MULTIPOLYGON (((-121.065436 39.006533,-121.061653 39.010 ...",6,61,Placer County,3644346108,246376805,3892310000.0


Let's compute the area from the WKT for one county using a number of different map projections.

In [27]:
# Let's take a close look at Sacramento County
sacto = myco.where('NAMELSAD', 'Sacramento County')
sacto_wkt = sacto['WKT'][0]
sacto_area_km = (sacto['ALAND'][0] + sacto['AWATER'][0]) / (1000 * 1000)
print('Area of Sacramento County is: ', sacto_area_km, 'sq km')

print("Area differences in sq km for", sacto['NAMELSAD'][0], "between source data and:")
print("- ConUS CRS data: ", getProjectedArea(sacto_wkt, NAD83_CRS,ConUS_CRS) / (1000 * 1000) - sacto_area_km)
print("- CalAlbers CRS data: ", getProjectedArea(sacto_wkt, NAD83_CRS,CalAlbers_CRS)/ (1000 * 1000) - sacto_area_km)
print("- UTM10 data: ", getProjectedArea(sacto_wkt, NAD83_CRS,UTM10_CRS)/ (1000 * 1000) - sacto_area_km)
print("- UTM11 data: ", getProjectedArea(sacto_wkt, NAD83_CRS,UTM11_CRS) / (1000 * 1000) - sacto_area_km)
print("- Web Mercator: ", getProjectedArea(sacto_wkt, NAD83_CRS,Web_CRS) / (1000 * 1000) - sacto_area_km)


Area of Sacramento County is:  2575.25742 sq km
Area differences in sq km for Sacramento County between source data and:
- ConUS CRS data:  0.118206518671
- CalAlbers CRS data:  0.121499768839
- UTM10 data:  -0.597591841197
- UTM11 data:  7.22489207854
- Web Mercator:  1629.86222832


### Question 9.

For the CRS results shown above, explain the differences between the Census data area totals (ALAND + AWATER)  and the area that was calculated using the different map projections. Specifically, what map projection gives you the most and least accurate results and why? 



In [3]:
# Enter your response to Question 9 in this cell as code, comments or markdown.

### Summary 


- Measuring distance between two points within a small geographic region can be extremely fast and reasonably accurate when using **projected data**, or coordinate data referenced to a map projection. This approach scales up well when working with large amounts of data or many complex spatial calculations.
- For larger distances (and areas), calculations will be more accurate on a **spheroid** (and to a lesser extent on a sphere) than with a planar map projection, but a bit slower.
- Since lots of data is available in WGS84, it is often easier to work with unprojected geographic coordinates, especially when you don't fully understand map projections.
- Not all software provides functionality for spheroidal or spherical calculations.

#### Reminders

- Pay attention to lat vs lon order.
- Be mindful of units

#### References
- http://postgis.refractions.net/documentation/manual-1.5/ch04.html
- https://source.opennews.org/en-US/learning/choosing-right-map-projection


## Part 3. Direction Calculations



In navigation, the term bearing, or azimuth, refers to the direction of motion. See https://en.wikipedia.org/wiki/Azimuth for a discussion. For example, when flying from one airport to another, a plane is on a course with an initial bearing, which will change along the way. That bearing is the angle between a line from the start point toward true north (0 degrees) and a line from the start point to the destination point. When the result is a *compass bearing*, the angle is measured clockwise from north and is always at least 0 and less than 360 degrees.

Let's calculate the initial bearing from one point to another. First, we define a function to compute bearing using geographic coordinates.

In [29]:
def getBearingOnSphere(pt1,pt2,latlonOrder=0,returnCompass=0):
    # After https://gist.github.com/jeromer/2005586

    order = latlonOrder
    lon1 = math.radians(pt1[0+order])
    lon2 = math.radians(pt2[0+order])
    lat1 = math.radians(pt1[1-order])
    lat2 = math.radians(pt2[1-order])    
    
    bearing = math.atan2(math.sin(lon2-lon1)*math.cos(lat2), math.cos(lat1)*math.sin(lat2)-math.sin(lat1) * math.cos(lat2)* math.cos(lon2-lon1))
    bearing = math.degrees(bearing)
    
    if returnCompass > 0:
        #return compass bearing
        bearing = (bearing + 360) % 360
    
    return bearing

We can use the function to determine the initial bearing from LAX to SFO.

In [30]:
print(getBearingOnSphere(LAX,SFO,1,1))
print(getBearingOnSphere(LAX,SFO,1,0))

319.9429598628572
-40.05704013714277


### Question 10.

How are the two results above the same and how are they different?


We need to use a different formula to calculate bearing with projected coordinates.

In [31]:
def getBearingPlanar(pt1,pt2,order=0,returnCompass=0):
    lon1 =  pt1[0+order]
    lon2 =  pt2[0+order]
    lat1 =  pt1[1-order]
    lat2 =  pt2[1-order]
    
    bearing = (180/math.pi)*math.atan2(lon2-lon1,lat2-lat1)
    
    if returnCompass > 0:
        #return compass bearing
        bearing = (bearing + 360) % 360
    
    return bearing
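
To see why the planar formula passes the x difference first and the y difference second to `atan2`, it helps to try the four cardinal directions. The sketch below uses a compact, hypothetical helper named `planar_compass_bearing` that assumes points are given as (x, y), i.e. (easting, northing).

```python
import math

def planar_compass_bearing(start_xy, end_xy):
    """Compass bearing in degrees (clockwise from north) between two planar points."""
    dx = end_xy[0] - start_xy[0]  # easting difference
    dy = end_xy[1] - start_xy[1]  # northing difference
    # atan2(dx, dy) measures the angle from the +y axis (north), clockwise positive
    return (math.degrees(math.atan2(dx, dy)) + 360) % 360

origin = (0.0, 0.0)
targets = [("north", (0, 1)), ("east", (1, 0)), ("south", (0, -1)), ("west", (-1, 0))]
for label, target in targets:
    print(label, planar_compass_bearing(origin, target))
```

Swapping the arguments to `atan2(dy, dx)` would instead give the mathematical convention, measured counterclockwise from the x axis, which is a common source of bearing errors.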

In [32]:
SFO_wkt = "POINT(-122.375923156738 37.6189384460449)"
LAX_wkt = "POINT(-118.408981323242 33.9425506591797)"

SFO_json = wktTransformCRS(SFO_wkt, WGS84_CRS, CalAlbers_CRS)
LAX_json = wktTransformCRS(LAX_wkt, WGS84_CRS, CalAlbers_CRS)


SFO_xy = json.loads(SFO_json)['coordinates']
LAX_xy = json.loads(LAX_json)['coordinates']

print(getBearingPlanar(SFO_xy,LAX_xy,0,1)) 


138.97607509956845


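You can see above that the bearing computed with projected coordinates differs slightly from the one computed with geographic coordinates when the projected CRS is CalAlbers. Note that the planar example runs from SFO to LAX while the spherical example runs from LAX to SFO, so compare the reverse of the planar bearing (138.98 + 180 = 318.98 degrees) with the spherical compass bearing of 319.94 degrees.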


### Question 11.

What projected CRS gives a bearing closest to the one calculated using geographic coordinates? Why do these results vary?

We can translate compass bearings into directions like North, South, East or West. This is often more helpful than the angular measurements.

In [2]:
def getDirectionFromCompassBearing(angle):
    if (45 < angle <= 135):
        theDirection = 'east'
    elif (135 < angle <= 225):
        theDirection = 'south'
    elif (225 < angle <= 315 ):
        theDirection = 'west'
    else:
        theDirection = 'north'
        
    return theDirection
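
The four-way split above is coarse: a bearing of 46 degrees and one of 134 degrees are both reported as east. If finer labels are useful, the same idea extends to eight compass points. The following is a sketch under that assumption; `eight_point_direction` and `COMPASS_POINTS` are hypothetical names, and the input is assumed to be a compass bearing in degrees.

```python
COMPASS_POINTS = ["north", "northeast", "east", "southeast",
                  "south", "southwest", "west", "northwest"]

def eight_point_direction(compass_bearing):
    """Map a compass bearing in degrees to one of eight direction names.

    Each direction covers a 45 degree sector centered on its nominal bearing.
    """
    sector = int(((compass_bearing % 360) + 22.5) // 45) % 8
    return COMPASS_POINTS[sector]

for angle in (0, 45, 138.98, 319.94):
    print(angle, eight_point_direction(angle))
```

Shifting by 22.5 degrees before dividing centers each sector on its direction, so bearings just west of due north (for example 350 degrees) still map to north.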

We can use the two functions **getDirectionFromCompassBearing** and **getBearingOnSphere**  to compute the initial direction from SFO to LAX and from LAX to SFO.

In [3]:
print('Initial direction from SFO to LAX:', getDirectionFromCompassBearing(getBearingOnSphere(SFO,LAX,1,1)))

NameError: name 'getBearingOnSphere' is not defined

### Question 12.
Complete the code in the cell below to print out distance and directions for the array of point locations. Hint: you can enter the points in Google Maps to identify the locations and see if the directions make sense.


In [4]:
#array of points
mypts = np.array([[37.867984, -122.265605],[37.868729, -122.259146],[37.871558, -122.259929],[37.872024, -122.257837]])

i = 0 # index to start with the first point in array
while i + 1 < len(mypts):
    startpt =  ... #your code
    endpt =  ... #your code
    
    the_distance = ... #your code
    the_direction = ... #your code
    print("Walk %s for %.2f miles." %  (the_direction, the_distance))

    i = i + 1 #increment the array index

print("You have arrived!")

NameError: name 'np' is not defined