## DRAFT: Visualization of Significance to Fish Habitat Condition of Landscape Level Disturbance Variables in a Selected Spatial Unit
#### Daniel Wieferich -USGS

This notebook builds a table of landscape-level disturbance variables that were tested in the 2015 National Fish Habitat Partnership (NFHP) National Inland Assessment of Streams. The user passes a spatial unit of interest from the Spatial Feature Registry's placeNameLookup table (e.g. a national park, state, or ecoregion). The table displays all variables tested in the assessment for that unit, highlighting those that were significant in influencing fish habitat condition. More information about the 2015 NFHP assessment methodology can be found in the following chapter of the Through a Fish's Eye report.

http://assessment.fishhabitat.org/#578a9a48e4b0c1aacab8976c/578a99a6e4b0c1aacab895dd   



This code is in progress and may change through time.

#### Generalized Steps
    #1.Access Data from GC2
    #2.Build base table
    #3.Finalize table based on spatial unit specific information




##### An example of what the table should look like is in other-materials as "Example_Disturbance_Table.jpg"

In [1]:
# Python 3.6.1

# import packages and dependencies
import requests
import pandas as pd
from bis2 import gc2


#Define database information needed for the notebook to run
#This currently requires the bis2 package, which houses credentials for the database 
thisRun = {}
thisRun['url'] = gc2.baseURLs["sqlapi_datadistillery_bcb_beta"]

In [2]:
#These functions need some work. Eventually they should be generalized to accept a query and return a dataframe or json.

def request_tested_dist(url):
    #Returns a dataframe listing every disturbance variable and whether it was tested at each spatial scale
    query_tested_disturbances = url+"?q=select * from nfhp.nfhp2015_tested_dist_variables"
    nfhp_request = requests.get(url=query_tested_disturbances).json()

    #Each feature in the response holds one table row in its 'properties'
    data = [feature['properties'] for feature in nfhp_request['features']]
    df = pd.DataFrame(data)
    return df


#Function that takes the base url and a place of interest (supplied by the user or an application)
#and returns the properties of the matching summary record as a dict (parsed json)

def requestDist(url, place):
    query_disturbance = (url + "?q=select place_name, lb_list_dist, nb_list_dist, lc_list_dist, nc_list_dist "
                         "from nfhp.hci2015_summaries_mp where source_id='" + place + "'")

    hci_disturbance = requests.get(url=query_disturbance).json()
    #Only the first matching record is used; an IndexError is raised if no record matches the place
    data = hci_disturbance['features'][0]['properties']

    return data
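
The comment at the top of the cell above notes that these functions should eventually be generalized to take a query and return a dataframe or json. A minimal sketch of one way to do that follows. It assumes, as the two functions above do, that the SQL API accepts the statement in a `q` parameter and returns a response with a `features` list whose items carry a `properties` dict. The names `build_query_url`, `features_to_records` and `run_query`, and the `example.org` URL, are hypothetical and not part of the notebook or of bis2.

```python
from urllib.parse import quote, urlencode


def build_query_url(base_url, sql):
    """Return base_url with the SQL statement percent-encoded in the q parameter."""
    return base_url + "?" + urlencode({"q": sql}, quote_via=quote)


def features_to_records(response_json):
    """Collect the 'properties' dict of every feature in the response."""
    return [feature["properties"] for feature in response_json.get("features", [])]


def run_query(base_url, sql, as_dataframe=True):
    """Run one SQL statement and return a DataFrame, or a list of dicts."""
    import requests  # imported here so the pure helpers above work without it

    response = requests.get(build_query_url(base_url, sql))
    response.raise_for_status()
    records = features_to_records(response.json())
    if as_dataframe:
        import pandas as pd
        return pd.DataFrame(records)
    return records


# Offline demonstration with a hand-made response (hypothetical data and URL)
sample = {"features": [{"properties": {"disturbance": "Cultivated crops", "tested": 1}}]}
records = features_to_records(sample)
query_url = build_query_url("https://example.org/sql", "select 1")
print(records)
print(query_url)
```

With helpers like these, `request_tested_dist(url)` would reduce to `run_query(url, "select * from nfhp.nfhp2015_tested_dist_variables")`. An empty `features` list yields an empty result rather than an error, which may or may not be the desired behavior for a missing place.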

#### Step 1: Access Data from GC2
Data are accessed from 2 tables.  

The first is nfhp2015_tested_dist_variables (represented in next cell as df_tested_dist).  This table will give us a list of all tested variables for each of 4 spatial scales.

The second table is hci2015_summaries_mp (represented as json in Step 3), calculated by nfhp-2015-hci-summarized-using-midpoint.ipynb (in this repo). This table holds summaries of NFHP information for each placeNameLookup spatial feature. We will access the lists of significant disturbance variables for each spatial scale. These lists are used to populate the value "Significant" in the table.


In [3]:
df_tested_dist = request_tested_dist(thisRun['url'])
df_tested_dist.head()

Unnamed: 0,disturbance,gid,scale,super_category,tested
0,Agriculture water withdrawal,1,local catchment,water withdrawal,0
1,All mine density,2,local catchment,mines,1
2,CERCLIS site density,3,local catchment,point source pollution,1
3,Coal mine density,4,local catchment,mines,1
4,Cultivated crops,5,local catchment,agriculture land use,1


#### Step 2: Build base table

The result of this step is a base table that represents all tested disturbance variables, noting when a disturbance variable was not tested for a specific spatial scale ("NT").

In [4]:

#From dataframe extract list of field names for table and complete list of tested disturbances
field_names = []
disturbance = []
for row in df_tested_dist.itertuples():
    if (row.scale).replace(" ","_") not in field_names:
        field_names.append((row.scale).replace(" ","_"))
    if row.disturbance not in disturbance:
        disturbance.append(row.disturbance)

#print (field_names)
#print (disturbance)

In [5]:
#Create the beginning of the target table using a dataframe 
dist_df = pd.DataFrame(data={'disturbance_variable':disturbance})
for name in field_names:
    dist_df[name]=''
dist_df.head()

Unnamed: 0,disturbance_variable,local_catchment,network_catchment,local_buffer,network_buffer
0,Agriculture water withdrawal,,,,
1,All mine density,,,,
2,CERCLIS site density,,,,
3,Coal mine density,,,,
4,Cultivated crops,,,,


In [7]:
#Identify which disturbance variables were not tested for each spatial scale and update our table (dist_df) with this information

not_tested_lc = set()
not_tested_nc = set()
not_tested_lb = set()
not_tested_nb = set()

for row in df_tested_dist.itertuples():
    if row.tested == 0 and row.scale == 'local catchment':
        not_tested_lc.add(row.disturbance)
    elif row.tested == 0 and row.scale == 'network catchment':
        not_tested_nc.add(row.disturbance)
    elif row.tested == 0 and row.scale == 'local buffer':
        not_tested_lb.add(row.disturbance)
    elif row.tested == 0 and row.scale == 'network buffer':
        not_tested_nb.add(row.disturbance)   

dist_df.loc[dist_df.disturbance_variable.isin(not_tested_lc), 'local_catchment'] = 'NT'
dist_df.loc[dist_df.disturbance_variable.isin(not_tested_nc), 'network_catchment'] = 'NT'
dist_df.loc[dist_df.disturbance_variable.isin(not_tested_lb), 'local_buffer'] = 'NT'
dist_df.loc[dist_df.disturbance_variable.isin(not_tested_nb), 'network_buffer'] = 'NT'

In [8]:
#This is the base table for all visualizations. This will be the same for ALL iterations (is not specific to SFR spatial feature).
#From this table we will add spatial feature specific "Significant" values to the table in the next step.
dist_df.head(25)

Unnamed: 0,disturbance_variable,local_catchment,network_catchment,local_buffer,network_buffer
0,Agriculture water withdrawal,NT,,NT,NT
1,All mine density,,,NT,NT
2,CERCLIS site density,,,NT,NT
3,Coal mine density,,,NT,NT
4,Cultivated crops,,,,
5,Domestic water withdrawal,NT,,NT,NT
6,Downstream mainstem dam density,NT,,NT,NT
7,High intensity urban,,,,
8,Impervious surface,,,,
9,Industrial water withdrawal,NT,,NT,NT


#### Step 3: Return list of disturbances per spatial scale that were significant

This list CHANGES with each iteration of the table (based on the selected spatial unit). Each significant disturbance will be denoted in the table by the word 'Significant', and these cells should be shaded to make them stand out. This step also returns the title of the table.

In [9]:
data = requestDist(thisRun['url'], 'doi lands:5324')

In [10]:
#This example run is for spatial feature 'doi lands:5324' from the placeNameLookup table.  
#This feature represents "Buenos Aires National Wildlife Refuge"
data = requestDist(thisRun['url'], 'doi lands:5324')


#Title of Table
title = 'Disturbances Influencing Risk to Fish Habitat Condition in ' + data['place_name']
print(title)


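def clean_data(data):
    #Parse a list string such as {"a","b"} into a set of standardized disturbance names,
    #correcting known misspellings found in the summary table
    set_data = set()
    data = data.replace('{', '').replace('}', '').replace('"', '')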
    for word in data.split(','):
        if word in ['No limiting disturbace', 'No limiting disturbance', 'No limiting disurbance']:
            continue
        else:
            if word in ['CERCLIS site denistyLIS site density', 'CERCLIS site density']:
                word = 'CERCLIS site density'
            elif word in ['COAL','Coal mine density mine density','Coal mine density']:
                word = 'Coal mine density'
            elif word in ['Cultivated Cultivated cropss','Cultivated crop', 'Cultivated crops','Cultivated crop density','Cultviated crops']:
                word = 'Cultivated crops'
            elif word in ['Downstream mainstem dam density.1','Downstream maistem dam density','Downstream mainstem dam density']:
                word = 'Downstream mainstem dam density'
            elif word in ['High intenisty urban', 'High intensity urban ', 'High intensity urban']:
                word = 'High intensity urban'
            elif word in ['Impervious Surface', 'Impervious surfaceervious surface','Impervious surface']:
                word = 'Impervious surface'
            elif word == 'Industrial water withdawal':
                word = 'Industrial water withdrawal' 
            elif word in ['Pasture', 'PastureURE']:
                word = 'Pasture and hay'
            elif word in ['TOTWW','Total water withdrawal']:
                word = 'Total water withdrawal'        
            elif word in ['TRI site denisty', 'TRI site density']:
                word = 'TRI site density'
            elif word in ['Upstream network dam density']:
                word = 'Upstream mainstem dam density'
            set_data.add(word)
    return set_data   


for name in field_names:
    if name == 'local_catchment':
        lc_significant = clean_data(data['lc_list_dist']) 
        dist_df.loc[dist_df.disturbance_variable.isin(lc_significant), 'local_catchment'] = 'Significant'
    elif name == 'network_catchment':
        nc_significant = clean_data(data['nc_list_dist'])
        dist_df.loc[dist_df.disturbance_variable.isin(nc_significant), 'network_catchment'] = 'Significant'
    elif name == 'local_buffer':
        lb_significant = clean_data(data['lb_list_dist']) 
        dist_df.loc[dist_df.disturbance_variable.isin(lb_significant), 'local_buffer'] = 'Significant'
    elif name == 'network_buffer':
        nb_significant = clean_data(data['nb_list_dist']) 
        dist_df.loc[dist_df.disturbance_variable.isin(nb_significant), 'network_buffer'] = 'Significant'

Disturbances Influencing Risk to Fish Habitat Condition in Buenos Aires National Wildlife Refuge
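
The name standardization in `clean_data` can also be expressed as a lookup table, which may be easier to extend as new misspellings turn up. The sketch below is a hedged alternative, not the notebook's implementation: `ALIASES` holds only a few of the spellings listed above, `parse_disturbance_list` is a hypothetical name, and, unlike `clean_data`, it strips surrounding whitespace from each name before matching.

```python
# Known variant spelling -> standardized name (a subset of those in clean_data)
ALIASES = {
    "Cultviated crops": "Cultivated crops",
    "Cultivated crop": "Cultivated crops",
    "High intenisty urban": "High intensity urban",
    "Impervious Surface": "Impervious surface",
    "TOTWW": "Total water withdrawal",
}

# Placeholder entries, including misspellings, that mean no disturbance was listed
NO_DISTURBANCE = {
    "No limiting disturbance",
    "No limiting disturbace",
    "No limiting disurbance",
}


def parse_disturbance_list(raw):
    """Turn a string like '{"a","b"}' into a set of standardized names."""
    names = set()
    for token in raw.strip("{}").split(","):
        name = token.strip().strip('"').strip()
        if not name or name in NO_DISTURBANCE:
            continue
        # Fall back to the name itself when it has no known alias
        names.add(ALIASES.get(name, name))
    return names


cleaned = parse_disturbance_list(
    '{"Cultviated crops","Impervious Surface","No limiting disturbance"}'
)
print(sorted(cleaned))
```

As in `clean_data`, splitting on commas assumes that no disturbance name itself contains a comma.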


In [11]:
#Show final dataframe with information stored similar to expected table output
dist_df.head(25)

Unnamed: 0,disturbance_variable,local_catchment,network_catchment,local_buffer,network_buffer
0,Agriculture water withdrawal,NT,,NT,NT
1,All mine density,,,NT,NT
2,CERCLIS site density,,,NT,NT
3,Coal mine density,,,NT,NT
4,Cultivated crops,,,,Significant
5,Domestic water withdrawal,NT,Significant,NT,NT
6,Downstream mainstem dam density,NT,,NT,NT
7,High intensity urban,,,,
8,Impervious surface,,,Significant,
9,Industrial water withdrawal,NT,,NT,NT
