<a id="section7"></a>
## 1.7 Coordinate Reference Systems (CRS) and Map Projections

Before moving on to our next lesson, let's talk about how coordinate reference systems (CRS) and map projections are handled by GeoPandas.

In fact, we have gotten pretty far without talking about these!

<img src="http://www.pngall.com/wp-content/uploads/2016/03/Light-Bulb-Free-PNG-Image.png" width="20" align=left >  Do you have experience with Coordinate Reference Systems?

As a refresher, a CRS describes how the coordinates in a geospatial dataset relate to locations on the surface of the earth. 

A `geographic CRS` consists of:
- a 3D model of the shape of the earth (a `datum`), approximated as a sphere or spheroid (aka ellipsoid),
- the `units` of the coordinate system (e.g., decimal degrees, meters, feet), and
- the `origin` (0,0 location), specified as the intersection of the `equator` and the `prime meridian`.

A `projected CRS` consists of:
- a geographic CRS
- a **map projection** and related parameters used to transform the geographic coordinates to `2D` space.
  - a map projection is a mathematical model used to transform coordinate data
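
To make the idea of a map projection as a mathematical model concrete, here is a minimal sketch of one projection, the spherical Mercator (the basis of Web Mercator). The function name `spherical_mercator` and the sample coordinates are illustrative assumptions, not part of GeoPandas; in practice GeoPandas does this work for you.

```python
import math

EARTH_RADIUS_M = 6378137.0  # WGS84 semi-major axis, used here as the sphere radius


def spherical_mercator(lon_deg, lat_deg):
    """Project longitude/latitude in degrees to spherical Mercator x, y in meters."""
    lon_rad = math.radians(lon_deg)
    lat_rad = math.radians(lat_deg)
    x = EARTH_RADIUS_M * lon_rad
    y = EARTH_RADIUS_M * math.log(math.tan(math.pi / 4 + lat_rad / 2))
    return x, y


# Approximate longitude/latitude for Berkeley, CA (illustrative values)
x, y = spherical_mercator(-122.27, 37.87)
print(round(x), round(y))
```

The inputs are angles on the 3D model of the earth; the outputs are planar coordinates in meters.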

### A Geographic vs Projected CRS
<img src ="https://www.e-education.psu.edu/natureofgeoinfo/sites/www.e-education.psu.edu.natureofgeoinfo/files/image/projection.gif" height="100" width="500">

### There are many, many CRSs

Theoretically the number of CRSs is unlimited!

Why? Primarily because there are many different definitions of the shape of the earth. Our understanding of its shape and our ability to measure it have changed greatly over time.

### Why are CRSs Important?

- You need to know the data about your data (or `metadata`) to use it appropriately.


- All projected CRSs introduce distortion in shape, area, and/or distance. So understanding what CRS best maintains the characteristics you need for your area of interest and your analysis is important.


- Some analysis methods expect geospatial data to be in a projected CRS
  - For example, `geopandas` expects a geodataframe to be in a projected CRS for area or distance based analyses.


- Some Python libraries, but not all, implement dynamic reprojection from the input CRS to the required CRS and assume a specific CRS (WGS84) when a CRS is not explicitly defined.


- Most Python spatial libraries, including GeoPandas, require geospatial datasets to be in the same CRS if they are being analyzed together.
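
As a rough sketch of the last point, you might compare the `crs` attributes of two layers before analyzing them together and reproject one if they differ. The two single-point GeoDataFrames below are made-up examples, not data from this notebook.

```python
import geopandas as gpd
from shapely.geometry import Point

# Two tiny illustrative layers (made-up points) stored in different CRSs
pts_wgs84 = gpd.GeoDataFrame(geometry=[Point(-122.27, 37.87)], crs="EPSG:4326")
pts_nad83 = gpd.GeoDataFrame(geometry=[Point(-122.27, 37.80)], crs="EPSG:4269")

# Reproject the first layer only if the two CRSs do not match
if pts_wgs84.crs != pts_nad83.crs:
    pts_wgs84 = pts_wgs84.to_crs(pts_nad83.crs)

print(pts_wgs84.crs == pts_nad83.crs)
```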

### What you need to know when working with CRSs

- What CRSs are used in your study area and their main characteristics
- How to identify, or `get`, the CRS of a geodataframe
- How to `set` the CRS of a geodataframe (i.e. define the projection)
- How to `transform` the CRS of a geodataframe (i.e. reproject the data)

### Codes for CRSs commonly used with CA data

CRSs are typically referenced by an [EPSG code](http://wiki.gis.com/wiki/index.php/European_Petroleum_Survey_Group).  

It's important to know the commonly used CRSs and their EPSG codes for your geographic area of interest.  

For example, below is a list of commonly used CRSs for California geospatial data along with their EPSG codes.

##### Geographic CRSs
- `4326: WGS84` (units decimal degrees) - the most commonly used geographic CRS

- `4269: NAD83` (units decimal degrees) - the geographic CRS customized to best fit the USA. This is used by all Census geographic data.

> `NAD83 (epsg:4269)` coordinates are approximately the same as `WGS84 (epsg:4326)` coordinates, although locations can differ by up to 1 meter in the continental USA and by up to 3 meters elsewhere. That is not a big issue with census tract data, as these data are only accurate to within +/- 7 meters.

##### Projected CRSs

- `5070: CONUS NAD83` (units meters) projected CRS for mapping the entire contiguous USA (CONUS)

- `3857: Web Mercator` (units meters) conformal (shape preserving) CRS used as the default in web mapping

- `3310: CA Albers Equal Area, NAD83` (units meters) projected CRS for CA statewide mapping and spatial analysis

- `26910: UTM Zone 10N, NAD83` (units meters) projected CRS for northern CA mapping & analysis

- `26911: UTM Zone 11N, NAD83` (units meters) projected CRS for southern CA mapping & analysis

- `102641 to 102646: CA State Plane zones 1-6, NAD83` (units feet) projected CRS used for local analysis. Note that these are ESRI codes rather than EPSG codes.

You can find the full CRS details on the website https://www.spatialreference.org
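
You can also look up these details in Python. The sketch below assumes the `pyproj` library is installed (GeoPandas relies on it for CRS handling) and simply loops over the EPSG codes listed above.

```python
from pyproj import CRS

# EPSG codes from the lists above
epsg_codes = (4326, 4269, 5070, 3857, 3310, 26910, 26911)

for code in epsg_codes:
    crs = CRS.from_epsg(code)
    kind = "projected" if crs.is_projected else "geographic"
    unit = crs.axis_info[0].unit_name  # unit of the first coordinate axis
    print(code, crs.name, kind, unit)
```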

### Getting the CRS of a gdf

GeoPandas GeoDataFrames have a `crs` attribute that returns the CRS of the data.

In [None]:
# Check the CRS of our gdf
tracts_acs_gdf_ac.crs

The above CRS definition specifies:
- the name of the CRS (`NAD83`),
- the axes (`latitude` and `longitude`) and their units,
- the shape (`datum`),
- the origin (`Prime Meridian` and the equator), and
- the area for which it is best suited (`North America`).
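
If you want individual pieces of the definition rather than the full display, the `crs` object exposes them as attributes. This is a minimal sketch using a made-up one-point GeoDataFrame in place of `tracts_acs_gdf_ac`; the attributes shown come from the `pyproj` CRS object that GeoPandas stores.

```python
import geopandas as gpd
from shapely.geometry import Point

# Stand-in for the notebook's data: one made-up point in NAD83
gdf = gpd.GeoDataFrame(geometry=[Point(-122.27, 37.87)], crs="EPSG:4269")
crs = gdf.crs

print(crs.name)           # name of the CRS
print(crs.to_epsg())      # EPSG code, if one can be matched
print(crs.is_geographic)  # True for a geographic CRS
print(crs.is_projected)   # True for a projected CRS
print(crs.axis_info)      # axis names, directions, and units
```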

> Notes:
>    - `geocentric` latitude and longitude assume a spherical (round) model of the shape of the earth
>    - `geodetic` latitude and longitude assume a spheroidal (ellipsoidal) model, which is closer to the true shape.
>    - `geodesy` is the study of the shape of the earth.

Note that the output looks very different if you print it.

In [None]:
print(tracts_acs_gdf_ac.crs)

Printing the crs is useful because it outputs the code you should use if you want to `set` the CRS.


### Setting the CRS

You can set the CRS of a gdf with the `crs` attribute. You would set the CRS if it is not defined or if you think it is incorrectly defined.

> In desktop GIS terminology setting the CRS is called `defining the projection`

As an example, let's set the CRS of our data to `None`

In [None]:
# first set the CRS to None
tracts_acs_gdf_ac.crs = None

In [None]:
# Check it again
tracts_acs_gdf_ac.crs

...hummm...

If a variable has a null value (None) then displaying it without printing it won't display anything!

In [None]:
# Check it again
print(tracts_acs_gdf_ac.crs)

In [None]:
# Set it to 4326
tracts_acs_gdf_ac.crs = "epsg:4326"

In [None]:
# Show it
tracts_acs_gdf_ac.crs

Oops, that was wrong. The CRS is `4269`.

In [None]:
# Set it to 4269
tracts_acs_gdf_ac.crs = "epsg:4269"

> #### Important note
> - You can `set` the CRS to anything you like - that doesn't make it correct!
> - Setting the CRS does not change the coordinate data. It just tells the software how to interpret it.
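
The difference between setting and transforming can be checked directly. The sketch below uses a made-up point rather than the tract data, and uses the GeoPandas `set_crs` method (with `allow_override=True`) as an alternative to assigning the `crs` attribute, since recent GeoPandas versions may warn when an existing CRS is overwritten by assignment.

```python
import geopandas as gpd
from shapely.geometry import Point

# One made-up point with NAD83 coordinates
gdf = gpd.GeoDataFrame(geometry=[Point(-122.27, 37.87)], crs="EPSG:4269")

# Setting: only the CRS label changes, the coordinates stay the same
relabeled = gdf.set_crs("EPSG:4326", allow_override=True)

# Transforming: the coordinates are recomputed in the new CRS
reprojected = gdf.to_crs("EPSG:26910")

print(gdf.geometry.iloc[0])
print(relabeled.geometry.iloc[0])
print(reprojected.geometry.iloc[0])
```

The first two points print identically, while the third is expressed in UTM meters.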

### Transforming or Reprojecting the CRS
You can transform the CRS of a geodataframe with the `to_crs` method.


> In desktop GIS terminology transforming the CRS is called `projecting the data`

When you do this you want to save the output to a new geodataframe.

In [None]:
tracts_acs_ac_utm10 = tracts_acs_gdf_ac.to_crs('epsg:26910')

Now take a look at the CRS.

In [None]:
tracts_acs_ac_utm10.crs

You can see the result immediately by plotting the data.

- What two key differences do you see?

In [None]:
# plot geographic gdf
tracts_acs_gdf_ac.plot();

# plot utm gdf
tracts_acs_ac_utm10.plot();

#### Exercise

In the code cell below:
1. transform the CRS of the `tracts_acs_gdf_ac` geodataframe to the `CA Albers Equal Area` CRS and save it to a new geodataframe
2. display the CRS definition of the output geodataframe
3. plot the data to see how the shape and range of coordinate values differ from those for the `tracts_acs_gdf_ac` and `tracts_acs_ac_utm10` geodataframes.





In [None]:
# Your code here

*Double-click here to view the solution*

<!--
tracts_acs_gdf_ac_3310 = tracts_acs_gdf_ac.to_crs('epsg:3310')
tracts_acs_gdf_ac_3310.crs
tracts_acs_gdf_ac_3310.plot()
-->

### Geopandas for Spatial Measurement Calculations

To see the immediate usefulness of this transformation from a geographic to a projected CRS, let's consider our calculation of population density above.

That calculation was based on the ALAND column, or land area in sq meters, that is included in the census tract data.

- What if the data did not contain that column?

If your geodataframe is in a projected CRS that is appropriate for area or distance calculations you can calculate these values for each feature using the `area` or `length` attributes. 

For geodataframes with polygon geometry,
- `geodataframe_name.area` will return the area of each row's geometry

For geodataframes with line or polygon geometry,
- `geodataframe_name.length` will return the length (or perimeter) of each row's geometry


The output units will be the units of the CRS.
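
To illustrate how the units follow the CRS, here is a small sketch with a hypothetical 0.1 x 0.1 degree box near the study area (not part of the tract data). The same polygon gives an area in square degrees in a geographic CRS and in square meters in `CA Albers Equal Area`.

```python
import geopandas as gpd
from shapely.geometry import box

# Hypothetical 0.1 x 0.1 degree box, defined in WGS84
square = gpd.GeoDataFrame(geometry=[box(-122.3, 37.7, -122.2, 37.8)], crs="EPSG:4326")

# Area in the geographic CRS is in square degrees (GeoPandas typically warns here)
area_deg = square.area.iloc[0]

# Area after reprojecting to CA Albers Equal Area is in square meters
area_m2 = square.to_crs("EPSG:3310").area.iloc[0]

print(area_deg)
print(area_m2 / 1_000_000)  # square kilometers
```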

In [None]:
tracts_acs_ac_utm10.area # returns the area of each feature

In [None]:
tracts_acs_ac_utm10.length # returns the perimeter of each feature in meters

We can also get the total area or length.

In [None]:
tracts_acs_ac_utm10.area.sum()

So if we want to calculate the area of Alameda County, we could do so as follows.

- *Below we use the constants we defined earlier.*

In [None]:
tracts_acs_ac_utm10.area.sum()  / SQMETER_PER_SQKM

How does this value compare to the one we get using the column `ALAND`?

In [None]:
tracts_acs_ac_utm10.ALAND.sum() / SQMETER_PER_SQKM

### Getting Help with CRSs and Map Projections

See the [GeoPandas](https://geopandas.org/projections.html) website for more info on managing projections of geodataframes.

As you work with geospatial data in GeoPandas or in any software you will want to transform your data to the CRS that is most appropriate for your work.

Most spatial analysis operations will assume a projected CRS. For example, you would not want to compute area using a geographic CRS.
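
One way to see why is that a degree is not a fixed distance. The sketch below assumes a spherical earth with a mean radius of 6371 km (a simplification) and computes the ground length of one degree of longitude at a few latitudes; the function name is illustrative.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean earth radius, spherical approximation


def km_per_degree_longitude(lat_deg):
    """Approximate ground length in km of one degree of longitude at a latitude."""
    return math.radians(1.0) * EARTH_RADIUS_KM * math.cos(math.radians(lat_deg))


for lat in (0, 37.8, 60):
    print(lat, round(km_per_degree_longitude(lat), 1))
```

Because the length shrinks toward the poles, an area computed in square degrees does not correspond to a consistent ground area.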

For more introductory materials on CRSs and Map Projections see the references listed at the end of this notebook.
