# Protected Areas Database of the United States (PAD-US 1.4) into the Spatial Feature Registry

#### This code is in progress.  It registers the Protected Areas Database of the United States (PAD-US 1.4) into the Spatial Feature Registry (SFR, within the Data Distillery's GC2 instance) using the following workflow.  All data are retained from the source unaltered, three registration fields are added (_id, reg_date, reg_source), and the data are exported to a GeoJSON file.  The GeoJSON file is then uploaded to ScienceBase to document the final data as represented in the SFR.  Data are currently uploaded to the SFR manually, with plans to automate this step in the future.

#### General workflow involves:
     1: Retrieve data from the source (ScienceBase Item: https://www.sciencebase.gov/catalog/item/56bba648e4b08d617f657960)
     2: Create a GeoDataFrame and identify the native CRS
     3: Define variables needed throughout the process
     4: Create a new ScienceBase item to describe the registration process
     5: Build and export a GeoJSON representation of the data.  This step adds the registration fields: a registered UUID (_id), the registration date (reg_date), and the registration source (reg_source, which points to the new SB item).
     6: Upload the GeoJSON file to the new ScienceBase item to document what was registered into the SFR, along with additional information about when and how registration occurred.  This process will likely change as we introduce a more systematic way of tracking provenance.  During this step the user also uploads the data to GC2 (SFR schema).  Currently this is done manually through the UI.
     

#### Code by: Daniel Wieferich (USGS)

Date: 20180402

In [1]:
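#Import needed packages
import geopandas as gpd
import urllib.request as ur
import subprocess
import geojson
#Helper functions used in Steps 4-6 (imported by name rather than with *)
from sfr_load_utils import (build_sb_tags, sb_login, sfr_item_info,
                            build_new_sfr_sbitem, df_to_geojson,
                            verify_correct_count, export_geojson, zip_geojson)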

#### Step 1: Retrieve data from source

In [2]:
# Step 1: Retrieve the dataset from ScienceBase: https://www.sciencebase.gov/catalog/item/56bba648e4b08d617f657960, file = PADUS1_4Shapefile.zip
# For PADUS1_4 the shapefile triggers a ScienceBase large file download, which creates a temporary download link,
# so at this time we can't access the file programmatically.  Download the file to the working directory manually.
# After download, a field called FeatureID was added and the values from fid were copied into it.
# This step was included because geopandas doesn't import fid.
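
Once the archive has been downloaded by hand, unpacking it can still be scripted. The sketch below is a minimal example under assumptions: `extract_shapefile` is a hypothetical helper (it is not part of `sfr_load_utils`), and the archive is assumed to sit in the working directory under the file name given in the Step 1 comment.

```python
import zipfile
from pathlib import Path


def extract_shapefile(zip_path, out_dir="."):
    """Unpack a manually downloaded archive and list the shapefiles inside it.

    Hypothetical helper for illustration; not part of sfr_load_utils.
    """
    zip_path = Path(zip_path)
    if not zip_path.is_file():
        # Fail early with a hint, since the download step is manual
        raise FileNotFoundError(
            f"{zip_path} not found; download it from ScienceBase first"
        )
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(out_dir)
        members = archive.namelist()
    # Report only the .shp files that came from this archive
    return [out_dir / name for name in members if name.lower().endswith(".shp")]
```

A call such as `extract_shapefile('PADUS1_4Shapefile.zip')` would return the extracted `.shp` paths, one of which can then be passed to `gpd.read_file` in Step 2.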

#### Step 2: Import shapefile into GeoDataFrame and identify native crs

In [3]:
#Path to the shapefile inside the extracted PADUS1_4Shapefile folder
file = 'PADUS1_4Shapefile/PADUS1_4Combined.shp'
#Create GeoDataFrame from downloaded shapefile
df = gpd.read_file(file)

In [4]:
#Eventually we will need a coded method to extract the EPSG number (used as a variable later).  This might be tricky because the CRS is returned as a dict of PROJ parameters rather than an EPSG code.
df.crs

{'datum': 'NAD83',
 'lat_0': 23,
 'lat_1': 29.5,
 'lat_2': 45.5,
 'lon_0': -96,
 'no_defs': True,
 'proj': 'aea',
 'units': 'm',
 'x_0': 0,
 'y_0': 0}
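
One way to derive the EPSG number in code is to compare the returned parameters against a small table of definitions the workflow expects. The sketch below is only an illustration under assumptions: `KNOWN_CRS` and `match_epsg` are hypothetical names, and the table holds just the EPSG:5070 (NAD83 / Conus Albers) parameters. Where pyproj is available, its `CRS.to_epsg()` method may be a more general alternative.

```python
# Hypothetical lookup table covering only the CRS this notebook expects.
# EPSG:5070 is NAD83 / Conus Albers.
KNOWN_CRS = {
    "5070": {
        "proj": "aea", "datum": "NAD83", "units": "m",
        "lat_0": 23, "lat_1": 29.5, "lat_2": 45.5,
        "lon_0": -96, "x_0": 0, "y_0": 0,
    },
}


def _same(a, b, tol=1e-9):
    """Compare two parameter values, tolerating float noise in numbers."""
    numeric = (int, float)
    if isinstance(a, numeric) and isinstance(b, numeric):
        return abs(a - b) <= tol
    return a == b


def match_epsg(crs_dict, known=KNOWN_CRS):
    """Return the EPSG code whose parameters all appear in crs_dict, else None."""
    for code, params in known.items():
        if all(k in crs_dict and _same(crs_dict[k], v) for k, v in params.items()):
            return code
    return None


# The dict printed by df.crs above
native_crs = {"datum": "NAD83", "lat_0": 23, "lat_1": 29.5, "lat_2": 45.5,
              "lon_0": -96, "no_defs": True, "proj": "aea", "units": "m",
              "x_0": 0, "y_0": 0}
epsg = {"code": match_epsg(native_crs)}
```

For the dict shown above this produces the same `{'code': '5070'}` form that Step 3 sets by hand. An unmatched CRS returns `None`, which is a signal to extend the table or inspect the data.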

In [6]:
#Reproject to EPSG:5070 (NAD83 / Conus Albers)
#This dataset is already Albers Equal Area, but its CRS was defined manually rather than by EPSG code.  This transform assigns the EPSG code and should not alter the geospatial data.
df = df.to_crs(epsg=5070)

In [7]:
df.head()

Unnamed: 0,Access,Access_Src,Agg_Src,Category,Comments,Date_Est,Des_Tp,FeatureID,GAPCdDt,GAPCdSrc,...,d_Category,d_Des_Tp,d_GAP_Sts,d_IUCN_Cat,d_Mang_Nam,d_Mang_Typ,d_Own_Name,d_Own_Type,d_State_Nm,geometry
0,UK,GAP - Default,NOAA_PADUS1_4MPA_MPAIMember_Eligble2016,Designation,,1990,MPA,0,2016,GAP - NOAA,...,Designation,Marine Protected Area,3 - managed for multiple uses - subject to ext...,Other Conservation Area,U.S. Fish & Wildlife Service,Federal,U.S. Fish & Wildlife Service,Federal,South Carolina,"(POLYGON ((1424277.798300002 1184612.6506, 142..."
1,UK,GAP - Default,NOAA_PADUS1_4MPA_MPAIMember_Eligble2016,Designation,,1980,MPA,1,2016,GAP - NOAA,...,Designation,Marine Protected Area,2 - managed for biodiversity - disturbance eve...,IV: Habitat / species management,U.S. Fish & Wildlife Service,Federal,U.S. Fish & Wildlife Service,Federal,Not Applicable,"(POLYGON ((-3799758.5226 4890818.809499999, -3..."
2,UK,GAP - Default,NOAA_PADUS1_4MPA_MPAIMember_Eligble2016,Designation,,1980,MPA,2,2016,GAP - NOAA,...,Designation,Marine Protected Area,2 - managed for biodiversity - disturbance eve...,IV: Habitat / species management,U.S. Fish & Wildlife Service,Federal,U.S. Fish & Wildlife Service,Federal,Alaska,(POLYGON ((-2468323.288700001 4003732.81449999...
3,UK,GAP - Default,NOAA_PADUS1_4MPA_MPAIMember_Eligble2016,Designation,,1984,MPA,3,2016,GAP - NOAA,...,Designation,Marine Protected Area,2 - managed for biodiversity - disturbance eve...,IV: Habitat / species management,U.S. Fish & Wildlife Service,Federal,U.S. Fish & Wildlife Service,Federal,North Carolina,(POLYGON ((1768150.140400001 1590880.056099996...
4,UK,GAP - Default,NOAA_PADUS1_4MPA_MPAIMember_Eligble2016,Designation,,1963,MPA,4,2016,GAP - NOAA,...,Designation,Marine Protected Area,2 - managed for biodiversity - disturbance eve...,IV: Habitat / species management,U.S. Fish & Wildlife Service,Federal,U.S. Fish & Wildlife Service,Federal,Texas,(POLYGON ((139868.2466999998 744803.6877999974...


#### Step 3: Define Variables

In [8]:
#User Defined Variables
epsg = {'code':'5070'}
expected_geom_type = 'MultiPolygon'
outfile_name = 'padus1_4'
source_sbitem = '56bba648e4b08d617f657960'
list_tags = ['Jurisdictional Units','Protected Areas','BIS Spatial Feature Registry','United States']
date = '2018-04-02'


#### Step 4: Create SB Item to describe SFR Registration 

In [9]:
#Build SB Item to house SFR GeoJSON File, including description of item.  
#This step outputs source_uri (uri to the new sb item that describes the data) to be included as registration information.

#Turns list of tags into json format accepted by SB
sb_tags = build_sb_tags(list_tags)
#Create SB session and log in
sb = sb_login()   
#Creates JSON needed to build and describe new SB item
item_info = sfr_item_info(sb,source_sbitem, sb_tags, date)
#Builds new SB item
new_item = build_new_sfr_sbitem(sb,item_info)
#URI of new SB item.  This is inserted into the GeoJSON so we have a direct connection in the SFR to documentation... this step may not
#be needed as we build provenance capabilities.
source_uri = str(new_item['link']['url'])
print(source_uri)

username: dwieferich@usgs.gov
········
https://www.sciencebase.gov/catalog/item/5ac272d5e4b0e2c2dd0aa3e7
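
The helper functions in this cell come from `sfr_load_utils`, which is not shown here. As a rough illustration of the tag conversion only, the sketch below assumes ScienceBase accepts tags as a list of objects with a `name` key and an optional `type` key. `sketch_build_sb_tags` is a hypothetical stand-in, not the actual `build_sb_tags` implementation.

```python
def sketch_build_sb_tags(tag_names, tag_type=None):
    """Turn a plain list of tag strings into a list of tag objects.

    Hypothetical stand-in for build_sb_tags; the output structure is an
    assumption about what ScienceBase accepts.
    """
    tags = []
    for name in tag_names:
        tag = {"name": name}
        if tag_type is not None:
            # Optional grouping label applied to every tag
            tag["type"] = tag_type
        tags.append(tag)
    return tags


list_tags = ['Jurisdictional Units', 'Protected Areas',
             'BIS Spatial Feature Registry', 'United States']
sb_tags_sketch = sketch_build_sb_tags(list_tags)
```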


#### Step 5: Build and export GeoJSON representation of data.  Add registration id and source_uri (newly created SB item).  Verify that the correct number of features was included in the GeoJSON dataset.

In [10]:
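#Build the GeoJSON feature collection, adding the registration fields to each feature
collection = df_to_geojson(df, epsg, source_uri, expected_geom_type)
#Confirm the collection holds the same number of features as the GeoDataFrame
print(verify_correct_count(collection, df))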

Correct number of features
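
The registration step can be sketched on plain GeoJSON-style dictionaries. This is an illustration under assumptions, not the `df_to_geojson` implementation: `add_registration_fields` and `count_matches` are hypothetical names, promoting Polygon geometries to MultiPolygon is a guess at what `expected_geom_type` is used for, and the `crs` member follows the older named-CRS GeoJSON convention.

```python
import uuid


def promote_to_multi(geometry, expected_geom_type):
    """Wrap a Polygon as a single-part MultiPolygon when that type is expected."""
    if expected_geom_type == "MultiPolygon" and geometry["type"] == "Polygon":
        return {"type": "MultiPolygon", "coordinates": [geometry["coordinates"]]}
    return geometry


def add_registration_fields(features, source_uri, reg_date, epsg_code,
                            expected_geom_type="MultiPolygon"):
    """Build a FeatureCollection with _id, reg_date and reg_source on each feature."""
    registered = []
    for feature in features:
        # Copy the properties so the source attributes are retained unaltered
        props = dict(feature.get("properties") or {})
        props["_id"] = str(uuid.uuid4())   # registered uuid
        props["reg_date"] = reg_date       # when the feature was registered
        props["reg_source"] = source_uri   # points to the new SB item
        registered.append({
            "type": "Feature",
            "geometry": promote_to_multi(feature["geometry"], expected_geom_type),
            "properties": props,
        })
    return {
        "type": "FeatureCollection",
        "crs": {"type": "name", "properties": {"name": f"EPSG:{epsg_code}"}},
        "features": registered,
    }


def count_matches(collection, expected_count):
    """True when the collection holds exactly the expected number of features."""
    return len(collection["features"]) == expected_count


# Tiny made-up feature, used only to exercise the sketch
sample = [{
    "type": "Feature",
    "geometry": {"type": "Polygon",
                 "coordinates": [[[0, 0], [1, 0], [1, 1], [0, 0]]]},
    "properties": {"FeatureID": 0},
}]
sample_collection = add_registration_fields(
    sample, "https://www.sciencebase.gov/catalog/item/5ac272d5e4b0e2c2dd0aa3e7",
    "2018-04-02", "5070")
```

Generating `_id` with `uuid.uuid4()` gives each feature an identifier that does not depend on the source attributes, so the original fields can stay untouched.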


In [11]:
#Write the collection to a GeoJSON file, then zip it for upload
file = export_geojson(outfile_name, collection)
outfile_zip = zip_geojson(outfile_name)
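
For readers without access to `sfr_load_utils`, the export and zip steps might look roughly like the sketch below. The function names, the `.geojson` extension, and the output location are assumptions made for illustration. UTF-8 encoding is used because Step 6 asks for it when loading into GC2.

```python
import json
import zipfile
from pathlib import Path


def sketch_export_geojson(outfile_name, collection, out_dir="."):
    """Write a FeatureCollection dict to <outfile_name>.geojson as UTF-8."""
    path = Path(out_dir) / f"{outfile_name}.geojson"
    with open(path, "w", encoding="utf-8") as handle:
        # ensure_ascii=False keeps non-ASCII attribute values readable
        json.dump(collection, handle, ensure_ascii=False)
    return path


def sketch_zip_geojson(geojson_path):
    """Compress the GeoJSON file into a .zip next to it and return the zip path."""
    geojson_path = Path(geojson_path)
    zip_path = geojson_path.with_suffix(".zip")
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        # Store only the file name so the archive has no directory prefix
        archive.write(geojson_path, arcname=geojson_path.name)
    return zip_path
```

With `outfile_name = 'padus1_4'`, these two calls would write `padus1_4.geojson` and `padus1_4.zip` in the chosen directory.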

#### Step 6: Upload GeoJSON file to ScienceBase Item and also upload to GC2 using UI (make sure to specify UTF-8 encoding and MultiPolygon).

In [12]:
#Attach the zipped GeoJSON to the new SB item; the updated item JSON is returned
sb.upload_file_to_item(new_item, outfile_zip)

{'body': 'Protected Areas Database of the United States (PAD-US) data registered into the spatial feature registry.&nbsp; Source data are documented at&nbsp;https://www.sciencebase.gov/catalog/item/56bba648e4b08d617f657960.',
 'contacts': [{'active': True,
   'contactType': 'person',
   'email': 'dwieferich@usgs.gov',
   'firstName': 'Daniel',
   'jobTitle': 'Physical Scientist',
   'lastName': 'Wieferich',
   'middleName': 'J',
   'name': 'Daniel J Wieferich',
   'oldPartyId': 66431,
   'orcId': '0000-0003-1554-7992',
   'organization': {'displayText': 'Biogeographic Characterization'},
   'primaryLocation': {'building': 'DFC Bldg 810',
    'buildingCode': 'KBT',
    'faxPhone': '3032024710',
    'mailAddress': {},
    'name': 'CN=Daniel J Wieferich,OU=CSS,OU=Users,OU=OITS,OU=DI,DC=gs,DC=doi,DC=net - Primary Location',
    'officePhone': '3032024603',
    'streetAddress': {'city': 'Lakewood',
     'line1': 'W 6th Ave Kipling St',
     'state': 'CO',
     'zip': '80225'}},
   'type': '

In [None]:
#The new SB item still needs some additional information added.  The UI can be used for this for now, but in the future we want to build as much of this as possible into the process.