# States of Mexico into the Spatial Feature Registry

#### This code is in progress. It registers the States of Mexico into the Spatial Feature Registry (SFR, within Data Distilleries GC2 instance) using the workflow below. All source data are retained unaltered, three registration fields are added (_id, reg_date, reg_source), and the data are exported to a GeoJSON file. The GeoJSON file is then uploaded to ScienceBase to document the final data as represented in the SFR. Data are currently uploaded to the SFR manually, with plans to automate this step in the future.

#### General workflow involves:
     1: Retrieve Data From Source (ScienceBase Item: https://www.sciencebase.gov/catalog/item/5ab57393e4b081f61ab781f4)
     2: Create GeoDataFrame and identify native crs
     3: Define Variables needed throughout process
     4: Create new ScienceBase item to describe registration process
     5: Build and export GeoJSON representation of the data. This process adds registration fields that document the registration (reg_source points to the new SB item) and a registered UUID (_id).
     6: Upload GeoJSON file to new ScienceBase item to document what was registered into SFR, along with additional information about when and how registration occurred. This process will likely change as we introduce a more systematic way of tracking provenance. During this step the user also uploads the data to GC2 (SFR schema). Currently this upload is done manually through the UI.
     

Code by: Daniel Wieferich (USGS)

Date: 20180330

In [1]:
#Import Needed Packages
import geopandas as gpd
import urllib.request as ur
import subprocess
import geojson
from sfr_load_utils import *

#### Step 1: Retrieve data from source

In [None]:
### Step 1: Retrieve Dataset from ScienceBase
#Geostatistical Framework of Mexico dataset stored at https://www.sciencebase.gov/catalog/item/5ab555c6e4b081f61ab78093

#Define url of zipped shapefile download
downloadUrl ='https://www.sciencebase.gov/catalog/file/get/5ab57393e4b081f61ab781f4?f=__disk__a3%2Fb6%2F61%2Fa3b6610d637a38e0f76ca42a86b07607b2abd7c7'
#Download government unit file to local directory
ur.urlretrieve(downloadUrl, '889463142683_s.zip')
#Unzip the file into the working directory (assumes 7-Zip is installed at this Windows path)
subprocess.call(r'"C:\Program Files\7-Zip\7z.exe" x ' + '889463142683_s.zip')
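
The unzip step above depends on a Windows install of 7-Zip. As a portable alternative, here is a minimal sketch using Python's standard `zipfile` module. The helper name `unzip_archive` is hypothetical and not part of `sfr_load_utils`, and the sketch assumes the download is an ordinary zip archive.

```python
import zipfile
from pathlib import Path


def unzip_archive(zip_path, dest_dir="."):
    """Extract every member of zip_path into dest_dir and return the member names.

    Hypothetical helper for illustration; it is not part of sfr_load_utils.
    """
    dest = Path(dest_dir)
    # Create the destination folder if it does not exist yet
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
        return archive.namelist()
```

Calling `unzip_archive('889463142683_s.zip')` would extract into the working directory, much like the 7-Zip call.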

#### Step 2: Import shapefile into GeoDataFrame and identify native crs

In [3]:
#GC2 currently does not handle the original EPSG of 6372, and transforming in Python ran into issues, so ESRI arcpy was used for this step.
import arcpy
input_file = 'conjunto_de_datos/areas_geoestadisticas_estatales.shp'
file = 'mexico_states_4326.shp'
out_coord_sys = arcpy.SpatialReference('WGS 1984')
arcpy.Project_management(input_file, file, out_coord_sys)

In [5]:
#Create GeoDataFrame from downloaded shapefile
df = gpd.read_file(file)

In [6]:
#Eventually we will need a coded method to extract the EPSG number (used as a variable later); this might be tricky given how the crs is returned
df.crs

{'init': 'epsg:4326'}
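
One possible way to extract the EPSG number in code is sketched below. It assumes the crs is returned either as a dict like the one printed above or as a string containing `epsg:<number>`. The function name `epsg_from_crs` is hypothetical.

```python
import re


def epsg_from_crs(crs):
    """Return the EPSG code as a string, or None if no code can be found.

    Hypothetical helper; assumes crs is a dict such as {'init': 'epsg:4326'}
    or a string that contains 'epsg:<number>'.
    """
    if isinstance(crs, dict):
        text = str(crs.get('init', ''))
    else:
        text = str(crs)
    # Match the digits that follow 'epsg:' regardless of letter case
    match = re.search(r'epsg:(\d+)', text, flags=re.IGNORECASE)
    return match.group(1) if match else None


# Build the variable in the same shape used later in this notebook
epsg = {'code': epsg_from_crs({'init': 'epsg:4326'})}
```

In newer geopandas releases `df.crs` is a pyproj `CRS` object rather than a dict, in which case `df.crs.to_epsg()` may be the simpler route.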

#### Step 3: Define Variables

In [7]:
#User Defined Variables
epsg = {'code':'4326'}    #starts as https://epsg.io/6372 but GC2 can't render this so transformed to 4326 (see above)
expected_geom_type = 'MultiPolygon'
outfile_name = 'mexico_states'
source_sbitem = '5ab57393e4b081f61ab781f4'
list_tags = ['Jurisdictional Units','Area Beyond National Jurisdiction','BIS Spatial Feature Registry','Mexico']
date = '2018-04-06'


#### Step 4: Create SB Item to describe SFR Registration 

In [8]:
#Build SB Item to house SFR GeoJSON File, including description of item.  
#This step outputs source_uri (uri to the new sb item that describes the data) to be included as registration information.

#Turns list of tags into json format accepted by SB
sb_tags = build_sb_tags(list_tags)
#Create SB session and log in
sb = sb_login()   
#Creates JSON needed to build and describe new SB item
item_info = sfr_item_info(sb,source_sbitem, sb_tags, date)
#Builds new SB item
new_item = build_new_sfr_sbitem(sb,item_info)
#URI of new SB item. This is inserted into the GeoJSON so we have a direct connection in SFR to documentation. This step may not
#be needed as we build provenance capabilities.
source_uri = str(new_item['link']['url'])
print(source_uri)

username: dwieferich@usgs.gov
········
https://www.sciencebase.gov/catalog/item/5ac79b0ee4b0e2c2dd1014d4
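
The helper `build_sb_tags` comes from `sfr_load_utils`, which is not shown here. As a rough illustration of what turning a list of tags into JSON can look like, the sketch below assumes ScienceBase accepts tags as a list of objects that each carry a `name` key. Both the function `build_tags_sketch` and the default `type` value are assumptions, not the author's implementation.

```python
def build_tags_sketch(tag_names, tag_type='Label'):
    """Turn a list of tag names into a list of tag dictionaries.

    Hypothetical stand-in for build_sb_tags; the 'type' value is an assumption.
    """
    return [{'type': tag_type, 'name': name} for name in tag_names]


example_tags = build_tags_sketch(['BIS Spatial Feature Registry', 'Mexico'])
```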


#### Step 5: Build and export GeoJSON representation of data. Add registration id and source_uri (newly created SB item). Verify that the correct number of features was included in the GeoJSON dataset.

In [9]:
#Build the GeoJSON collection, then verify it contains the correct number of features
collection = df_to_geojson(df, epsg, source_uri, expected_geom_type)
print(verify_correct_count(collection, df))


Correct number of features
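
The functions `df_to_geojson` and `verify_correct_count` live in `sfr_load_utils` and are not shown here. The sketch below illustrates the general idea of adding the registration fields (`_id`, `reg_date`, `reg_source`) to each feature and checking the feature count. The function names, and the choice to store the fields under `properties`, are assumptions for illustration only.

```python
import uuid
from datetime import date


def add_registration_fields(features, source_uri, reg_date=None):
    """Return a FeatureCollection whose features carry registration fields.

    Hypothetical sketch; storing the fields under 'properties' is an assumption.
    """
    reg_date = reg_date or date.today().isoformat()
    registered = []
    for feature in features:
        # Copy the source properties so the original data stay unaltered
        props = dict(feature.get('properties') or {})
        props['_id'] = str(uuid.uuid4())   # registered UUID
        props['reg_date'] = reg_date       # date of registration
        props['reg_source'] = source_uri   # points to the new SB item
        registered.append({
            'type': 'Feature',
            'geometry': feature.get('geometry'),
            'properties': props,
        })
    return {'type': 'FeatureCollection', 'features': registered}


def counts_match(collection, expected_count):
    """True when the collection holds exactly expected_count features."""
    return len(collection['features']) == expected_count
```

With a GeoDataFrame, the input features could come from `df.__geo_interface__['features']`, and `len(df)` would supply the expected count.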


In [10]:
#Export the collection to a GeoJSON file, then zip it for upload to the SB item
file = export_geojson(outfile_name, collection)
outfile_zip = zip_geojson(outfile_name)
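
For readers without access to `sfr_load_utils`, the export and zip steps could be approximated with the standard library alone. This is a minimal sketch rather than the author's implementation. The function name `export_and_zip_sketch`, the `.geojson` extension, and the `out_dir` parameter are assumptions. It writes UTF-8 explicitly, since the GC2 upload expects that encoding.

```python
import json
import os
import zipfile


def export_and_zip_sketch(outfile_name, collection, out_dir='.'):
    """Write collection to <outfile_name>.geojson and zip it; return the zip path.

    Hypothetical stand-in for export_geojson followed by zip_geojson.
    """
    geojson_path = os.path.join(out_dir, outfile_name + '.geojson')
    # ensure_ascii=False keeps accented characters readable in the UTF-8 file
    with open(geojson_path, 'w', encoding='utf-8') as handle:
        json.dump(collection, handle, ensure_ascii=False)
    zip_path = os.path.join(out_dir, outfile_name + '.zip')
    with zipfile.ZipFile(zip_path, 'w', compression=zipfile.ZIP_DEFLATED) as archive:
        # Store only the file name inside the archive, not the full path
        archive.write(geojson_path, arcname=os.path.basename(geojson_path))
    return zip_path
```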

#### Step 6: Upload GeoJSON file to ScienceBase Item and also upload to GC2 using UI (make sure to specify UTF-8 encoding and MultiPolygon).

In [11]:
sb.upload_file_to_item(new_item, outfile_zip)

{'dates': [{'dateString': '2018-04-06',
   'label': 'Creation',
   'type': 'creation'},
  {'dateString': '2017', 'label': 'Begin Position', 'type': 'beginPosition'},
  {'dateString': '2017', 'label': 'End Position', 'type': 'endPosition'}],
 'distributionLinks': [{'files': [{'contentType': 'application/zip',
     'name': 'mexico_states.zip',
     'size': 10383132,
     'title': None}],
   'name': 'SpatialFeatureR.zip',
   'rel': 'alternate',
   'title': 'Download Attached Files',
   'type': 'downloadLink',
   'typeLabel': 'Download Link',
   'uri': 'https://www.sciencebase.gov/catalog/file/get/5ac79b0ee4b0e2c2dd1014d4'}],
 'files': [{'checksum': None,
   'contentEncoding': None,
   'contentType': 'application/zip',
   'dateUploaded': '2018-04-06T16:06:56Z',
   'downloadUri': 'https://www.sciencebase.gov/catalog/file/get/5ac79b0ee4b0e2c2dd1014d4?f=__disk__45%2Fea%2F7f%2F45ea7f89fc2647b6b07c4c92f05272879074173a',
   'imageHeight': None,
   'imageWidth': None,
   'name': 'mexico_states.zi

In [None]:
#The new SB item currently needs some additional information added after upload. The UI can be used for this for now, but in the future we will want to build as much as we can into this process.