# Geospatial Data Carpentry

For this practice, you will build on what you have learned and your previous data carpentry skills to acquire, stage, ingest, and render various datasets.

We will be accessing data linked at the US Government's Geospatial Platform: https://www.geoplatform.gov/


Each dataset is in a different format. Some you may have seen before; others will be new.
 * [New Mexico Populated Places (GNIS), 2009](http://gstore.unm.edu/apps/rgis/datasets/c73b5e4d-fd64-4a2c-8a93-668e47d982d8/gnis_nm_poppl09.derived.csv)
 * [Bureau of Land Management Land Grant Boundaries](http://gstore.unm.edu/apps/rgis/datasets/3d23ac95-2b28-4c1f-b5cc-b656133a018f/land_grants.original.zip/)
 * http://gstore.unm.edu/apps/rgis/datasets/b4ae8f53-8dff-46bb-9058-e5501cabdd1b/school_district_boundaries.derived.gml
 * http://gstore.unm.edu/apps/rgis/datasets/ab17adb4-0992-436b-8ae4-575d8405d188/gpsrdsddshp.derived.kml

These datasets, while discoverable on geoplatform.gov, are hosted at the University of New Mexico.

## A First Data Set

The first dataset we will work with is [http://gstore.unm.edu/apps/rgis/datasets/c73b5e4d-fd64-4a2c-8a93-668e47d982d8/gnis_nm_poppl09.derived.csv](http://gstore.unm.edu/apps/rgis/datasets/c73b5e4d-fd64-4a2c-8a93-668e47d982d8/gnis_nm_poppl09.derived.csv).

Read about this dataset [here](https://catalog.data.gov/dataset/new-mexico-populated-places-gnis-2009).

### Acquire
If you click the link in the bullet-list above, your browser will try to download a [ZIP file](https://en.wikipedia.org/wiki/Zip_%28file_format%29).
We can do the same download in Python using the standard library's `urllib.request` module, along with a few other tools.

In [None]:
import urllib.request
import shutil
from pathlib import Path

In [None]:
# Designate the URL for a file we want;
file_URL = 'http://gstore.unm.edu/apps/rgis/datasets/c73b5e4d-fd64-4a2c-8a93-668e47d982d8/gnis_nm_poppl09.derived.csv'

# Designate the local filename
local_file_name = 'gnis_nm_populated_place.zip'

# Designate the local file name with a path to a temp directory.
     # Your repo comes with this folder. If not, use a terminal,
     # navigate to course folder > module2, and then run: mkdir temp
file_Path = Path('../temp/')  
file_Path /= local_file_name


# Download the file from `file_URL` and save it locally at `file_Path`:
with urllib.request.urlopen(file_URL) as response,  file_Path.open(mode='w+b') as out_file:
    shutil.copyfileobj(response, out_file)
    


The above cell will open the URL and pull the HTTP Response Data into the binary file specified in the `temp/` directory of `module2`.
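The same download pattern can be wrapped in a small reusable function. The sketch below is one possible arrangement, not part of the course material: the name `download_file` is hypothetical, and it also creates the target folder when it is missing. So that it runs without network access, the demonstration uses a `file://` URL as a stand-in for the real `http://` link.

```python
import shutil
import tempfile
import urllib.request
from pathlib import Path


def download_file(url, destination):
    """Stream the response body at `url` into the file `destination`.

    `download_file` is a hypothetical helper name used for illustration.
    """
    destination = Path(destination)
    # Create the target folder (for example ../temp/) if it is missing.
    destination.parent.mkdir(parents=True, exist_ok=True)
    with urllib.request.urlopen(url) as response, destination.open(mode='wb') as out_file:
        shutil.copyfileobj(response, out_file)
    return destination


# Offline demonstration: a file:// URL stands in for the real http:// link.
with tempfile.TemporaryDirectory() as scratch:
    source = Path(scratch) / 'source.bin'
    source.write_bytes(b'stand-in payload')
    saved = download_file(source.as_uri(), Path(scratch) / 'temp' / 'copy.bin')
    print(saved.name, saved.stat().st_size)
```

With the real dataset, the call would take `file_URL` and `file_Path` from the cell above as its two arguments.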

Using a Terminal:
 * Replace "course_folder" with the actual course folder name!
```BASH
cd course_folder/module2/temp
ls -lt
total 288
-rw-r--r-- 1 scottgs dsa_user 294807 Jan 14 11:15 gnis_nm_populated_place.zip
```

### Stage 
Next, peek inside the archive without unpacking it. Note the `-l` option (a lowercase letter L).
```BASH
unzip -l gnis_nm_populated_place.zip 

Archive:  gnis_nm_populated_place.zip
  Length      Date    Time    Name
---------  ---------- -----   ----
   235767  01-05-2018 12:38   gnis_nm_poppl09.csv
    58782  01-05-2018 12:38   gnis_nm_poppl09.csv.xml
---------                     -------
   294549                     2 files
```
  * Repeating without the -l will unpack the files! **Do that now**.
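If you prefer to stay inside the notebook, the standard library's `zipfile` module can do roughly what `unzip -l` and `unzip` do. The following is a minimal sketch under stated assumptions: the helper names `list_archive` and `unpack_archive` are hypothetical, and the archive is a tiny stand-in built on the fly, not the real download.

```python
import tempfile
import zipfile
from pathlib import Path


def list_archive(zip_path):
    """Return the member names, similar to what `unzip -l` lists."""
    with zipfile.ZipFile(zip_path) as archive:
        return archive.namelist()


def unpack_archive(zip_path, target_dir):
    """Extract every member into `target_dir`, similar to plain `unzip`."""
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(target_dir)
    # Report what now sits in the target folder.
    return sorted(p.name for p in Path(target_dir).iterdir())


# Stand-in archive so the sketch runs without the real download.
with tempfile.TemporaryDirectory() as scratch:
    demo_zip = Path(scratch) / 'demo.zip'
    with zipfile.ZipFile(demo_zip, mode='w') as archive:
        archive.writestr('places.csv', 'name,lon,lat\n')
        archive.writestr('places.csv.xml', '<metadata/>')
    print(list_archive(demo_zip))
    print(unpack_archive(demo_zip, Path(scratch) / 'unpacked'))
```

For the dataset above, `zip_path` would point at `gnis_nm_populated_place.zip` in your `temp/` folder.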

We are very familiar with CSV files, so you can just load it up with Pandas.
Please refer to the recipe in your [GeoPandas Information Sheet](../../resources/GeoPandasInfoSheet.pdf), as well as the [API documentation for the shapely package](http://shapely.readthedocs.io/en/stable/manual.html).

### Ingest and Render

#### TODO: 
 1. Load data with Pandas
 2. Convert to a GeoPandas GeoDataFrame with geometry
 3. Render points!


In [None]:
## M2:P1:Cell01
## ----- Add Ingestion Code Below -----------






In [None]:
## M2:P1:Cell02
## ----- Add more Ingestion Code Below -----------

import geopandas as gpd
from shapely.geometry import Point

# Generate a List of shapely.geometry.Point
#   from the (lon, lat) pair for each row
geom_list = None  # TODO: replace None with a List Comprehension







In [None]:
## M2:P1:Cell03
## ----- Add Render Code Below -----------






## A Second Data Set

The second dataset we will work with is [http://gstore.unm.edu/apps/rgis/datasets/3d23ac95-2b28-4c1f-b5cc-b656133a018f/land_grants.original.zip](http://gstore.unm.edu/apps/rgis/datasets/3d23ac95-2b28-4c1f-b5cc-b656133a018f/land_grants.original.zip).

Read about this dataset [here](https://catalog.data.gov/dataset/bureau-of-land-management-land-grant-boundaries).

### Acquire and Stage
If you click the link in the bullet-list above, your browser will try to download a [ZIP file](https://en.wikipedia.org/wiki/Zip_%28file_format%29).

Repeat the steps above.
However, be aware that the ZIP file in this case contains the set of files that make up the Shapefile format.

![images/land_grants_zip.png MISSING](../images/land_grants_zip.png)

You should create a sub-folder in `temp/` named **`land_grants`** and unzip the files in there!
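After unzipping, it can be worth confirming that the pieces of the Shapefile all arrived. A Shapefile needs at least the `.shp`, `.shx`, and `.dbf` files; others such as `.prj` are optional. The sketch below uses a hypothetical helper, `missing_shapefile_parts`, and a scratch folder with stand-in files. It assumes lowercase file extensions, which may not hold for every archive.

```python
import tempfile
from pathlib import Path

# The three mandatory parts of a Shapefile; .prj and others are optional.
REQUIRED_SUFFIXES = ('.shp', '.shx', '.dbf')


def missing_shapefile_parts(folder):
    """Map each .shp base name in `folder` to the required suffixes it lacks."""
    folder = Path(folder)
    report = {}
    for shp in sorted(folder.glob('*.shp')):
        missing = [s for s in REQUIRED_SUFFIXES if not shp.with_suffix(s).exists()]
        report[shp.stem] = missing
    return report


# Stand-in folder: demo.shx is deliberately left out.
with tempfile.TemporaryDirectory() as scratch:
    for name in ('demo.shp', 'demo.dbf'):
        (Path(scratch) / name).touch()
    print(missing_shapefile_parts(scratch))
```

An empty list for a base name means all three required parts are present. For this practice, the folder to check would be `../temp/land_grants`.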

In [None]:
## M2:P1:Cell04
## ----- Add Acquisition Code Below -----------






### Ingest and Render

See your labs and related material!

In [None]:
import fiona
import json

GEODATA_FILE = '../temp/land_grants'

## M2:P1:Cell05
## ----- Add Exploration (Fiona) Code Below -----------




In [None]:
## M2:P1:Cell06
## ----- Add Ingest Code Below -----------






In [None]:
## M2:P1:Cell07
## ----- Add Render Code Below -----------





### The last two files will be part of an exercise.  

#### <span style="background:yellow">They will require some more advanced parsing, as they are both XML derivative file formats.</span>
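As a small preview of what parsing an XML derivative format involves, here is a hedged sketch that reads point coordinates from a KML string with the standard library's `xml.etree.ElementTree`. The sample document, the helper name `placemark_points`, and the KML 2.2 namespace are assumptions made for illustration. The real files may declare a different namespace version and may hold lines or polygons instead of points.

```python
import xml.etree.ElementTree as ET

# Assumed namespace: KML 2.2. Check the root element of the real file.
KML_NS = {'kml': 'http://www.opengis.net/kml/2.2'}

# Stand-in document, not taken from the course datasets.
SAMPLE_KML = """<kml xmlns="http://www.opengis.net/kml/2.2">
  <Document>
    <Placemark>
      <name>Stand-in point</name>
      <Point><coordinates>-106.65,35.08,0</coordinates></Point>
    </Placemark>
  </Document>
</kml>
"""


def placemark_points(kml_text):
    """Return (name, lon, lat) for every Placemark that holds a Point."""
    root = ET.fromstring(kml_text)
    points = []
    for placemark in root.iterfind('.//kml:Placemark', KML_NS):
        name = placemark.findtext('kml:name', default='', namespaces=KML_NS)
        coords = placemark.findtext('kml:Point/kml:coordinates', namespaces=KML_NS)
        if coords is None:
            continue  # Not a point geometry; skip it in this sketch.
        # KML stores coordinates as lon,lat[,altitude].
        lon, lat = coords.strip().split(',')[:2]
        points.append((name, float(lon), float(lat)))
    return points


print(placemark_points(SAMPLE_KML))
```

Every element lookup must carry the namespace prefix; forgetting it is a common reason for searches that silently return nothing.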

# Save Your Notebook
## Then Notebook Menu: File > Close and Halt