In [1]:
"""
Reporting Gap Analysis
Author: Liam Megraw, RIT Environmental Science Technician
Date last edited: 3/14/2023
ESRI ArcGIS Pro Version 2.7

Description:
This code uses results from the RIT-developed computer 
vision model and iMapInvasives records to identify gaps in reporting
on a per-species basis. The first part compares all iMap and model 
records for the species of interest at the same time, while the 
second part compares them on a per-species basis.

Inputs:
> Single point dataset of model prediction points with n species each
  having their own confidence score column
> 7 iMapInvasives datasets for n species
    > Presence points (confirmed)
    > Presence lines (confirmed)
    > Presence polygons (confirmed)
    > Presence points (unconfirmed)
    > Presence lines (unconfirmed)
    > Presence polygons (unconfirmed)
    > Not detected polygons
> n point datasets of model presence predictions at a threshold (For per-species approach)

Outputs:
The final outputs are two polygon layers at a 1 km resolution with 
overall and per-species attributes detailing the type of records 
within a cell (model only, iMap only, or both), and if there is 
overlap, a comparison value between the two types of records.

How to Use:
These layers can be hosted on the ArcGIS Online Public and Manager Dashboards.
"""


In [2]:
"""
Pseudocode Overview

Assign workspace
(Optionally) create state-wide fishnets at 1 km resolution
Create lists of input files
    iMap: point, line, polygon
    model: point
Define function to create field mappings for spatial joins

Thresholdless approach
    Effectively, for both model data and iMap data, and each imap geometry type:
            Spatial join records to fishnet
            Add & calculate fields
                Total join count for that species
                Overlap type if statement:
                    Cells where model data join count is above zero and iMap join count is above zero: both (i.e., overlap)
                    Cells where model data join count is above zero and iMap join count is zero: model only
                    Cells where model data join count is zero and iMap join count is above zero: iMap only
                Calculate comparison between model and iMap
Thresholded (per-species) approach
    For each species:
        Export iMap records to layers split by species, record type, and geometry
        Spatially join records to fishnet
        
Export results

"""

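The overlap-type rule in the pseudocode can be sketched in plain Python, outside of arcpy. This is only an illustration: the function name `overlap_type` and the sample cell counts are assumptions made for the example, not part of the notebook.

```python
def overlap_type(model_count, imap_count):
    """Classify a grid cell by which record sources fall inside it."""
    if model_count > 0 and imap_count > 0:
        return "both"        # overlap between model and iMap records
    if model_count > 0:
        return "model only"
    if imap_count > 0:
        return "iMap only"
    return None              # neither source has records in this cell


# Hypothetical join counts per cell: (model count, iMap count)
cells = {"A": (3, 2), "B": (5, 0), "C": (0, 4), "D": (0, 0)}
labels = {cell: overlap_type(m, i) for cell, (m, i) in cells.items()}
print(labels)
```

Cells with no records from either source keep a null value, which mirrors how the notebook leaves those cells uncalculated.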

In [1]:
#----- Get and set workspace to gdb -----
import arcpy
from arcpy import env
arcpy.env.workspace = r'C:\Users\ltmsbi\Documents\ArcGIS\Projects\Final_Deployment\Final_Deployment.gdb'

arcpy.env.overwriteOutput = True # Allow geoprocessing tools to overwrite existing outputs

In [2]:
# Necessary input files
model_pred = ["pred_finalDeployment_all",] # each species must have their own column

def create_SJ_FieldMappings(targetLayer, joinLayer): # Return field mappings for spatial joins when called 
    
    # List starting fields for spatial joins that'll be updated with each successive join
    keepFields = list()
    omitFields = ["OBJECTID", "Shape", "Shape_Area", "Shape_Length"]

    for field in arcpy.ListFields(targetLayer):
        if field.name not in omitFields:
            keepFields.append(field.name)
    
    fieldMappings = arcpy.FieldMappings() # field mapping variable; this will store all field mappings

    # Create list of field names to keep in the output file
    targetTable = []
    for i in arcpy.ListFields(targetLayer):
        if i.name in (keepFields):
            targetTable.append(i.name)

    # List of input feature classes for the spatial join
    f = [targetLayer, joinLayer]

    for k in targetTable: # loop through main table
        #print("Field: ",k)
        fieldMap = arcpy.FieldMap() # create an empty field map variable
        fieldMap.addInputField(targetLayer,k) # insert the target layer as the first input into the field map
        for feature in f: # loop through feature classes
            for field in arcpy.ListFields(feature): # loop through field of each feature class
                if k in field.name: # check if any field matches with our target field then append it as an input field
                    fieldMap.addInputField(feature,field.name) 
        fieldMappings.addFieldMap(fieldMap) # add the current field map to the main field map variable
    return(fieldMappings)

def generateWhereClauses(type): # Return a dictionary of where clauses to select records for calculations when called 
    if type == "THRESHOLDED":
        l_suffix = "_"+species
        points = "model"+l_suffix
        extras = ["",")"]
    elif type == "NOT_THRESHOLDED":
        l_suffix = ""
        points = "model_points"
        extras = [" And iMap_nd = 0", " Or iMap_nd > 0)"]

    iMap_cnfrm = "imap_cnfrm"+l_suffix
    iMap_uncnfrm = "imap_uncnfrm"+l_suffix
    iMap_nd = "imap_nd"+l_suffix
    model_points = "model_points"
    model_positives = "model"+l_suffix # Unused name in the thresholdless version

    whereClauses = { # SQL queries used to select records
                # These will be set to negative integers
                "model-only_conf": points+" > 0 And "+iMap_cnfrm+" = 0"+extras[0], 
                "model-only_unconf": points+" > 0 And "+iMap_cnfrm+" = 0 And "+iMap_uncnfrm+" = 0"+extras[0],
                # These three will be set to 0
                "iMap-only_conf": points+" = 0 And ("+iMap_cnfrm+" > 0"+extras[1], 
                "iMap-only_unconf": points+" = 0 And ("+iMap_cnfrm+" > 0 Or "+iMap_uncnfrm+" > 0"+extras[1],
                # These three will be set to positive floats
                "Overlap_conf": points+" > 0 And ("+iMap_cnfrm+" > 0"+extras[1], 
                "Overlap_unconf": points+" > 0 And ("+iMap_cnfrm+" > 0 Or "+iMap_uncnfrm+" > 0"+extras[1],
                # Cells with neither record type will retain a null designation during calculation
                }

    if type == "THRESHOLDED": # Add extra conditions
        whereClauses["model-only_nd"] = model_points+" > 0 And "+model_points+" > "+model_positives+" And "+iMap_nd+" = 0"
        whereClauses["iMap-only_nd"] = model_points+" = 0 And "+iMap_nd+" > 0"
        whereClauses["Overlap_nd"] = model_points+" > 0 And "+model_points+" > "+model_positives+" And "+iMap_nd+" > 0"
    
    return(whereClauses) # Return dictionary of where clauses when called

def generateCalcFieldDict(type): # Return a dictionary of fields to assign calculated value 
    fields_dict = generateWhereClauses(type) # Create a reference to the whereClause dict
    
    # Update dict entries with the values being the field where the value to be calculated later should be stored
    if type == "THRESHOLDED":
        s_suffix = "_"+species
        for field in ["model-only_nd", "iMap-only_nd", "Overlap_nd"]:
            fields_dict[field] = "NDc"+s_suffix
    elif type == "NOT_THRESHOLDED":
        s_suffix = "_overall"
    for field in ["model-only_conf", "iMap-only_conf", "Overlap_conf"]:
        fields_dict[field] = "Cc"+s_suffix
    for field in ["model-only_unconf", "iMap-only_unconf", "Overlap_unconf"]:
        fields_dict[field] = "CUc"+s_suffix
    
    return(fields_dict)

def generateCalcExpressions(type): # Return a dictionary of expressions to calculate comparison values 
    exp_dict = generateWhereClauses(type) # Create reference to main whereClause dict
    
    if type == "THRESHOLDED":
        l_suffix = "_"+species
        points = "!model"+l_suffix+"!"
        extra = ""
    elif type == "NOT_THRESHOLDED":
        l_suffix = ""
        points = "!model_points!"
        extra = "+ !iMap_nd!"
        
    # Generate proper layer names to reference in calculations 
    iMap_cnfrm = "!imap_cnfrm"+l_suffix+"!"
    iMap_uncnfrm = "!imap_uncnfrm"+l_suffix+"!"
    iMap_nd = "!imap_nd"+l_suffix+"!"
    model_positives = "!model"+l_suffix+"!" # Unused name in the thresholdless version

    # Update dictionary with the values being the expression used to calculate the field determined by generateCalcFieldDict()
    for field in ["model-only_conf", "model-only_unconf"]:
        exp_dict[field] = "-"+points
    for field in ["iMap-only_conf", "iMap-only_unconf"]:
        exp_dict[field] = "0"
    exp_dict["Overlap_conf"] = points+"/("+iMap_cnfrm+extra+")" # Effectively, "extra" adds iMap_nd if thresholdless and doesn't for thresholded
    exp_dict["Overlap_unconf"] = points+"/("+iMap_cnfrm+" + "+iMap_uncnfrm+extra+")" # Same effect as above line
    if type == "THRESHOLDED":
        exp_dict["iMap-only_nd"] = "0"
        exp_dict["model-only_nd"] = "-(!model_points! - "+model_positives+")"
        exp_dict["Overlap_nd"] = "(!model_points! - "+model_positives+")/"+iMap_nd
    
    return(exp_dict)
    
# Create list of imap features to iterate over
imap_data = list() # To be filled to include points, lines, polygons, and not detected
geometries = ["POINT","LINE","POLYGON"]
types = ["CONFIRMED","UNCONFIRMED"]
for geometry in geometries:
    for record_type in types:
        if record_type == "CONFIRMED": # Compare by value; "is" tests object identity
            imap_data.append("PRESENCE_"+geometry)
        else:
            imap_data.append("PRESENCE_"+geometry+"_"+record_type)
imap_data.append("NOT_DETECTED_POLYGON")

print("iMap datasets:")
print(imap_data)

# Create dictionary of long names
# Names used for filtering in ArcGIS Online
species_fullnames = {
    "phrag": "'Phragmites, Unspecified'", # extra single quotes are intentional since these are used in a field calculation
    "knot": "'Knotweed, Unspecified'",
    "wp": "'Wild Parsnip'",
    "toh": "'Tree-of-Heaven (Ailanthus)'",
    "pl": "'Purple Loosestrife'"
}

# Extract only the keys to a list
species_shortnames = list(species_fullnames.keys())

# IDs that iMap assigns to the various species of interest
jurisdiction_ids = {
    "phrag": 1277,
    "wp": 1182,
    "pl": 1265,
    "toh": 1167,
    "knot": (1074, 1191, 1278, 1479) # Includes Japanese knotweed, giant knotweed, bohemian knotweed, and knotweed species unknown
}

iMap datasets:
['PRESENCE_POINT', 'PRESENCE_POINT_UNCONFIRMED', 'PRESENCE_LINE', 'PRESENCE_LINE_UNCONFIRMED', 'PRESENCE_POLYGON', 'PRESENCE_POLYGON_UNCONFIRMED', 'NOT_DETECTED_POLYGON']
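The three helper dictionaries above encode one rule per cell: model-only cells get a negative value, iMap-only cells get zero, and overlap cells get a positive ratio. A minimal plain-Python sketch of that rule follows. The function name `comparison_value` is hypothetical, and `imap_count` stands for whichever iMap counts the comparison includes (confirmed only, confirmed plus unconfirmed, and, in the thresholdless version, not-detected as well).

```python
def comparison_value(model_count, imap_count):
    """Return the reporting gap comparison value for one grid cell."""
    if model_count > 0 and imap_count == 0:
        return -float(model_count)       # model only: negative magnitude
    if model_count == 0 and imap_count > 0:
        return 0.0                       # iMap only: zero
    if model_count > 0 and imap_count > 0:
        return model_count / imap_count  # overlap: positive ratio
    return None                          # no records: stays null


examples = [(4, 0), (0, 3), (6, 3), (0, 0)]
values = [comparison_value(m, i) for m, i in examples]
print(values)
```

Keeping the sign convention in one place makes it easier to check that the where clauses and the calculation expressions agree.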


In [5]:
# # Code to create a fishnet for the state if you do not already have one
# aoi = ['4,481,032.099500 105,606.381800 4,985,489.904000 770,761.900100'] # New york state boundary coordinates in UTM Zone 18N projection (Coordinates are expressed in the order of x-min, y-min, x-max, y-max)
# cellsize = '1' # The width and height argument for the fishnet function
# fishnet_output_name = 'grid_nys_18N_1km'
# # Create fishnet 
# arcpy.management.CreateFishnet(fishnet_output_name, '4,985,489.904000 105,606.381800', '4,481,032.099500 105,606.381800', cellsize, cellsize, '0', '0', {corner_coord}, 'NO_LABELS', aoi, 'POLYGON')

# For Thresholdless Only

In [6]:
# For thresholdless reporting analysis
dataset_lists = [imap_data, model_pred]

# Lists of fields to fill in later
tmpCnfrm = list()
tmpUncnfrm = list()

tmpFeatures = list() # List of temp features to delete later

# Define target feature for first run
targetFeature = "reporting_analysis_grid_empty"

for datasets in dataset_lists:
    if datasets is model_pred:
        print("Joining model data")
    else:
        print("Joining iMap data")
    for joinFeature in datasets:

        # If model data ----------
        if datasets is model_pred: 

            outFeature = "tmpRAG_model"
            fieldName = "model_points"
                
        # If iMap data ----------    
        else:
        
            # Define unique output names
            if "UNCONFIRMED" in joinFeature:
                if "POINT" in joinFeature:
                    outFeature = "tmpRAG_point_uncnfrm"
                    fieldName = "imap_point_uncnfrm"
                if "LINE" in joinFeature:
                    outFeature = "tmpRAG_line_uncnfrm"
                    fieldName = "imap_line_uncnfrm"
                if "POLYGON" in joinFeature:
                    outFeature = "tmpRAG_poly_uncnfrm"
                    fieldName = "imap_poly_uncnfrm"
                tmpUncnfrm.append("!"+fieldName+"!")
            elif "NOT_DETECTED" in joinFeature:
                outFeature = "tmpRAG_nd"
                fieldName = "iMap_nd"
            else:
                if "POINT" in joinFeature:
                    outFeature = "tmpRAG_point_cnfrm"
                    fieldName = "imap_point_cnfrm"
                if "LINE" in joinFeature:
                    outFeature = "tmpRAG_line_cnfrm"
                    fieldName = "imap_line_cnfrm"
                if "POLYGON" in joinFeature:
                    outFeature = "tmpRAG_poly_cnfrm"
                    fieldName = "imap_poly_cnfrm"

                # Add field to list for use in calculating later
                tmpCnfrm.append("!"+fieldName+"!")
        
        print("***"+joinFeature)
        fm = create_SJ_FieldMappings(targetFeature, joinFeature) # Create the field mappings for the join
        arcpy.analysis.SpatialJoin(targetFeature, joinFeature, outFeature, "JOIN_ONE_TO_ONE", "KEEP_ALL", fm) # Count the features within each grid cell
        
        arcpy.management.AlterField(outFeature, "JOIN_COUNT", fieldName, fieldName) # Rename join_count field
        arcpy.management.DeleteField(outFeature, "TARGET_FID") # Delete unnecessary field
        
        
        tmpFeatures.append(outFeature) # Add feature to list to delete later
        
        targetFeature = outFeature # Make the output feature the input for the next join

# Copy the final output feature that has all desired fields
print("Copying final output")
RAG_tl = "reporting_analysis_grid_thresholdless"
arcpy.management.CopyFeatures(outFeature, RAG_tl)

# Calculate the total number of iMap features joined
print("Calculating total iMap features joined")

cName = "iMap_cnfrm"
uName = "iMap_uncnfrm"
arcpy.management.AddFields(RAG_tl, [
    [cName, 'SHORT'],
    [uName, 'SHORT']
])

tmpCnfrm = tuple(tmpCnfrm)
print(tmpCnfrm)
arcpy.management.CalculateField(RAG_tl, cName, tmpCnfrm[0]+"+"+tmpCnfrm[1]+"+"+tmpCnfrm[2])
print(tmpUncnfrm)
tmpUncnfrm = tuple(tmpUncnfrm)
arcpy.management.CalculateField(RAG_tl, uName, tmpUncnfrm[0]+"+"+tmpUncnfrm[1]+"+"+tmpUncnfrm[2])

print("Adding comparison fields")
# Names must match the fields returned by generateCalcFieldDict("NOT_THRESHOLDED")
ccName = "Cc_overall"
cucName = "CUc_overall"

arcpy.management.AddFields(RAG_tl, [
    [ccName, 'FLOAT'],
    [cucName, 'FLOAT']
])

print("Calculating comparison values")
type = "NOT_THRESHOLDED"
fieldDict = generateCalcFieldDict(type)
expDict = generateCalcExpressions(type)
wCs = generateWhereClauses(type)

for key in wCs: # Loop to calculate the reporting analysis values
    sel = arcpy.management.SelectLayerByAttribute(RAG_tl, "NEW_SELECTION", wCs[key]) # Make the selection
    arcpy.management.CalculateField(sel, fieldDict[key], expDict[key]) # Calculate values in field based on expression
    del sel

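for field_list in [tmpCnfrm, tmpUncnfrm]:
    # Strip the "!" delimiters used in calculation expressions to get the actual field names
    arcpy.management.DeleteField(RAG_tl, [f.strip("!") for f in field_list]) # Delete per-geometry fields that are no longer needed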
    
del type, fieldDict, expDict, wCs

print("Done!")

Joining iMap data
***PRESENCE_POINT
***PRESENCE_POINT_UNCONFIRMED
***PRESENCE_LINE
***PRESENCE_LINE_UNCONFIRMED
***PRESENCE_POLYGON
***PRESENCE_POLYGON_UNCONFIRMED
***NOT_DETECTED_POLYGON
Joining model data
***pred_finalDeployment_all
Copying final output
Calculating total iMap features joined
('!imap_point_cnfrm!', '!imap_line_cnfrm!', '!imap_poly_cnfrm!')
['!imap_point_uncnfrm!', '!imap_line_uncnfrm!', '!imap_poly_uncnfrm!']
Adding comparison fields
Calculating comparison values
Done!
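Each spatial join above amounts to counting how many records fall in each 1 km grid cell. For point data, that idea can be sketched without arcpy as shown below. The sketch assumes projected coordinates in meters and a grid anchored at the origin. The function name and sample points are hypothetical, and it does not reproduce how `SpatialJoin` handles lines or polygons that cross several cells.

```python
from collections import Counter


def count_points_per_cell(points, cell_size=1000.0):
    """Count points per square grid cell, keyed by (column, row) index."""
    counts = Counter()
    for x, y in points:
        cell = (int(x // cell_size), int(y // cell_size))
        counts[cell] += 1
    return counts


# Hypothetical point coordinates in meters
points = [(150.0, 220.0), (980.0, 10.0), (1500.0, 300.0), (2100.0, 2999.0)]
counts = count_points_per_cell(points)
print(dict(counts))
```

Cells that never appear in `counts` correspond to a join count of zero in the grid.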


In [8]:
# Delete temporary files
# This way is necessary to delete the feature itself and not just its contents
import os
cws = arcpy.env.workspace

# Delete each temporary spatial join output if it exists
for input in tmpFeatures:
  input_path = os.path.join(cws, input)
  if arcpy.Exists(input_path):
    arcpy.Delete_management(input_path)

# For Species-Based Approach

In [6]:
# Change this if running for different thresholds

# Define the threshold used to determine presences
# Used as a suffix and as a field in the viewing version of the final layer
threshold = 'precision'

In [10]:
# Separate out iMap records by species and geometry ----------

# Add model data layer names to processing list ----------


# Create list of n + n*7 files to process, where n is the number of species
# (n model layers, plus 1 not-detected and 6 presence layers per species)
ps_records = [] # empty list to append items onto

# Add names to list
for n in range(0,len(species_shortnames)):
    # Add n items for model data 
    ps_records.append('SVI_Project_presences_'+species_shortnames[n]+"_"+threshold)

    
# ----------

# Define list of geometries in the iMap data
imap_geometries = ["POINT", "LINE", "POLYGON"]

# Define imap records types
imap_record_types = {
    "cnfrm": "_Conf", # Only the keys are used below; the source layer suffix for confirmed records is blank
    "uncnfrm": "_Unconf"
}

# Add n*2*3 + n items for iMap data (accounts for 2 record and 3 geometry types, 
# plus not-detected records) 
for n in range(0,len(species_shortnames)):
    # Set up per-species query for selecting by attribute
    if species_shortnames[n] == "knot": # Compare by value; "is" tests object identity
        # Set initial SQL query
        idClause = "jurisdiction_species_id = "+str(jurisdiction_ids["knot"][0])
        # Add more conditions to query
        for ID in jurisdiction_ids["knot"][1:]:
            idClause = idClause + " Or jurisdiction_species_id = " + str(ID)
    else:
        idClause = "jurisdiction_species_id = "+str(jurisdiction_ids[species_shortnames[n]]) 
    
    # Copy a per-species subset of not detected polygons
    sel = arcpy.management.SelectLayerByAttribute("NOT_DETECTED_POLYGON", "NEW_SELECTION", idClause)
    imap_nd = "iMap_nd_"+species_shortnames[n]
    arcpy.management.CopyFeatures(sel, imap_nd)
    ps_records.append(imap_nd)
    # Remove variables and selections to save memory
    del sel, imap_nd
    arcpy.management.SelectLayerByAttribute("NOT_DETECTED_POLYGON", "CLEAR_SELECTION")
    
    # Copy a per-species subset for each record and geometry type
    for rt in range(0,len(imap_record_types)):
        if rt == 0: # Confirmed
            rt_suffix = ""
        elif rt == 1: # Unconfirmed
            rt_suffix = "_UNCONFIRMED"
        for g in range(0,len(imap_geometries)):
            sel = arcpy.management.SelectLayerByAttribute("PRESENCE_"+imap_geometries[g]+rt_suffix, "NEW_SELECTION", idClause)
            imap_subset = "iMap_"+imap_geometries[g].lower()+"_"+list(imap_record_types.keys())[rt]+"_"+species_shortnames[n]
            arcpy.management.CopyFeatures(sel, imap_subset)
            ps_records.append(imap_subset)
            # Remove variables and selections to save memory
            del sel, imap_subset
            arcpy.management.SelectLayerByAttribute("PRESENCE_"+imap_geometries[g]+rt_suffix, "CLEAR_SELECTION") # Clear the selection on the same layer that was selected above
        
        
# ----------

print("Number of layers to process: "+str(len(ps_records)))
print(ps_records)

del imap_geometries, imap_record_types

Number of layers to process: 40
['SVI_Project_presences_phrag_precision', 'SVI_Project_presences_knot_precision', 'SVI_Project_presences_wp_precision', 'SVI_Project_presences_toh_precision', 'SVI_Project_presences_pl_precision', 'iMap_nd_phrag', 'iMap_point_cnfrm_phrag', 'iMap_line_cnfrm_phrag', 'iMap_polygon_cnfrm_phrag', 'iMap_point_uncnfrm_phrag', 'iMap_line_uncnfrm_phrag', 'iMap_polygon_uncnfrm_phrag', 'iMap_nd_knot', 'iMap_point_cnfrm_knot', 'iMap_line_cnfrm_knot', 'iMap_polygon_cnfrm_knot', 'iMap_point_uncnfrm_knot', 'iMap_line_uncnfrm_knot', 'iMap_polygon_uncnfrm_knot', 'iMap_nd_wp', 'iMap_point_cnfrm_wp', 'iMap_line_cnfrm_wp', 'iMap_polygon_cnfrm_wp', 'iMap_point_uncnfrm_wp', 'iMap_line_uncnfrm_wp', 'iMap_polygon_uncnfrm_wp', 'iMap_nd_toh', 'iMap_point_cnfrm_toh', 'iMap_line_cnfrm_toh', 'iMap_polygon_cnfrm_toh', 'iMap_point_uncnfrm_toh', 'iMap_line_uncnfrm_toh', 'iMap_polygon_uncnfrm_toh', 'iMap_nd_pl', 'iMap_point_cnfrm_pl', 'iMap_line_cnfrm_pl', 'iMap_polygon_cnfrm_pl', 'iMap
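The per-species ID query above is built by chaining `Or` conditions for knotweed. A more compact alternative is an SQL `IN` list, sketched below. The helper name `species_id_clause` is hypothetical, and the sketch assumes the workspace accepts standard SQL `IN` syntax, which is worth confirming on the target geodatabase before relying on it.

```python
def species_id_clause(ids, field="jurisdiction_species_id"):
    """Build a where clause matching one ID or any ID in a tuple."""
    if isinstance(ids, int):
        ids = (ids,)  # treat a single ID like a one-item tuple
    if len(ids) == 1:
        return f"{field} = {ids[0]}"
    return f"{field} IN ({', '.join(str(i) for i in ids)})"


phrag_clause = species_id_clause(1277)
knot_clause = species_id_clause((1074, 1191, 1278, 1479))
print(phrag_clause)
print(knot_clause)
```

Handling both the single-ID and multi-ID cases in one function would remove the special case for knotweed.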

## Create analysis-ready version (1 layer with fields for all species)

In [11]:
# Spatial join repeatedly
RAG_tl = "reporting_analysis_grid_thresholdless" # Defined a few cells above
input_grid = RAG_tl

import os # For deleting intermediate spatial joins

tmpFeatures = list()

# Used for tracking iterations which enables field calculations to be performed after 
# confirmed and unconfirmed iMap records of all geometries are spatially joined
counter = 1

# Effectively, for iMap and model data, each species, and geometry/record type (if iMap data):
for i, r in enumerate(ps_records):
    # Define field naming convention
    if "SVI" in r:
        # Slice to only retain the species and create new field name
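        species = r[len("SVI_Project_presences_"):-(len(threshold)+1)] # Works for any threshold name, not only 10-character suffixes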
        field_name = "model_"+species
    else:
        field_name = r

    print(field_name)

    fm = create_SJ_FieldMappings(input_grid, r) # Create the field mappings for this join

    # Perform the spatial join & clean the table ----------

    out_grid = "tmp_grid_"+str(i)
    tmpFeatures.append(out_grid) # add to list for deleting later
    arcpy.analysis.SpatialJoin(input_grid,r,out_grid,"JOIN_ONE_TO_ONE","KEEP_ALL",fm)
    del fm

    # Rename Join_Count field
    arcpy.management.AlterField(out_grid, "Join_Count", field_name, field_name)
    # Delete TARGET_FID field
    arcpy.management.DeleteField(out_grid, "TARGET_FID")

    # ----------


    # Sum records only after confirmed or unconfirmed iMap records of all geometries are spatially joined ----------

    if i >= len(species_shortnames): # If our iteration index is past processing model data
        

        # Determine the oscillation
        if "nd" in r:
            x = 4
        elif "point_uncnfrm" in r:
            x = 3


        # Effectively oscillate between running every 4th and 3rd entry, respectively 
        if (counter == x): # If the counter is at the oscillating xth entry we're targeting
            counter = 0 # Reset the count

            sum_fields = ps_records[i-2:i+1] # Select three fields of interest to sum
            sum_name = "iMap_"+r[13:] # Create desired field name
            express = "!"+sum_fields[0]+"!+!"+sum_fields[1]+"!+!"+sum_fields[2]+"!" # Create expression for calculating sum 
            arcpy.management.AddField(out_grid, sum_name, "SHORT") # Create the field
            arcpy.management.CalculateField(out_grid,sum_name,express) # Sum the fields
            for field in sum_fields: # Loop through the fields used when summing
                arcpy.management.DeleteField(out_grid, field) # Delete the field from the feature since they're no longer necessary

        counter += 1 # Add 1 to the count

    # ----------


    input_grid = "tmp_grid_"+str(i) # Make the next run's target grid the output of this iteration's spatial join

    if i == len(ps_records)-1: # If the last iteration
        RAG_final = "SVI_Proj_reporting_analysis_grid_1km"
        arcpy.management.CopyFeatures(out_grid, RAG_final) # copy to a new feature
        
        del counter, input_grid, out_grid

# Delete temporary files
# This way is necessary to delete the feature itself and not just its contents
cws = arcpy.env.workspace

# Delete each intermediate grid if it exists
for input in tmpFeatures:
  input_path = os.path.join(cws, input)
  if arcpy.Exists(input_path):
    arcpy.Delete_management(input_path)
    
del tmpFeatures
    
print("All done")

model_phrag
model_knot
model_wp
model_toh
model_pl
iMap_nd_phrag
iMap_point_cnfrm_phrag
iMap_line_cnfrm_phrag
iMap_polygon_cnfrm_phrag
iMap_point_uncnfrm_phrag
iMap_line_uncnfrm_phrag
iMap_polygon_uncnfrm_phrag
iMap_nd_knot
iMap_point_cnfrm_knot
iMap_line_cnfrm_knot
iMap_polygon_cnfrm_knot
iMap_point_uncnfrm_knot
iMap_line_uncnfrm_knot
iMap_polygon_uncnfrm_knot
iMap_nd_wp
iMap_point_cnfrm_wp
iMap_line_cnfrm_wp
iMap_polygon_cnfrm_wp
iMap_point_uncnfrm_wp
iMap_line_uncnfrm_wp
iMap_polygon_uncnfrm_wp
iMap_nd_toh
iMap_point_cnfrm_toh
iMap_line_cnfrm_toh
iMap_polygon_cnfrm_toh
iMap_point_uncnfrm_toh
iMap_line_uncnfrm_toh
iMap_polygon_uncnfrm_toh
iMap_nd_pl
iMap_point_cnfrm_pl
iMap_line_cnfrm_pl
iMap_polygon_cnfrm_pl
iMap_point_uncnfrm_pl
iMap_line_uncnfrm_pl
iMap_polygon_uncnfrm_pl
All done
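The counter in the cell above decides when the three geometry fields for one species and record type are ready to be summed. The same grouping can be derived from the layer names alone, as in the sketch below. It assumes names follow the `iMap_<geometry>_<record type>_<species>` pattern seen in the output and that species short names contain no underscores. The function names are hypothetical.

```python
from collections import defaultdict


def group_geometry_fields(layer_names):
    """Group per-geometry iMap field names by record type and species."""
    groups = defaultdict(list)
    for name in layer_names:
        parts = name.split("_")
        if len(parts) == 4 and parts[0] == "iMap":
            _, geometry, record_type, species = parts
            groups[f"iMap_{record_type}_{species}"].append(name)
    return dict(groups)


def sum_expression(fields):
    """Build a CalculateField-style expression that sums the given fields."""
    return "+".join(f"!{f}!" for f in fields)


names = ["iMap_nd_phrag", "iMap_point_cnfrm_phrag",
         "iMap_line_cnfrm_phrag", "iMap_polygon_cnfrm_phrag"]
groups = group_geometry_fields(names)
expression = sum_expression(groups["iMap_cnfrm_phrag"])
print(expression)
```

Not-detected layers have only three name parts, so they are skipped and keep their own field.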


In [3]:
# Calculate per-species ND ratios, confirmed ratios, unconfirmed ratios, and ratio percentiles

from scipy.stats import percentileofscore # For calculating the percentile of each value
import pandas as pd

# RAG_final = "SVI_Proj_reporting_analysis_grid_1km" # Defined a cell above

for species in species_shortnames:
    print("Adding comparison fields")
    ccName = "Cc_"+species
    cucName = "CUc_"+species
    ndcName = "NDc_"+species
    arcpy.management.AddFields(RAG_final, [
        [ccName, 'FLOAT'],
        [cucName, 'FLOAT'],
        [ndcName, 'FLOAT']
    ])

    print("Calculating comparison values")
    type = "THRESHOLDED"
    fieldDict = generateCalcFieldDict(type)
    expDict = generateCalcExpressions(type)
    wCs = generateWhereClauses(type)

    a = len(fieldDict)
    b = len(expDict)
    c = len(wCs)
    
    if a == b and b == c:
        pass
    else:
        print("Error in key names of one or more of fieldDict, expDict, and wCs")
        quit()

    for key in wCs: # Loop to calculate the reporting analysis values
        sel = arcpy.management.SelectLayerByAttribute(RAG_final, "NEW_SELECTION", wCs[key]) # Make the selection
        arcpy.management.CalculateField(sel, fieldDict[key], expDict[key]) # Calculate values in field based on expression
        del sel

    del type, fieldDict, expDict, wCs

    # Calculate percentile field that describes the percentile of model-only record magnitude or imap-model overlap ratio ----------

    import scipy.stats # Needed for the scipy.stats.percentileofscore() call below

    filters = [" < 0", " > 0"]
    comp_types = ["Cc_", "CUc_", "NDc_"]

    for comp_type in comp_types: # For each comparison type (Confirmed, Unconfirmed, Not detected)
        ratio_field = comp_type+species
        pct_field = "pct_"+ratio_field

        arcpy.management.AddField(RAG_final, pct_field, "FLOAT")

        for filter in filters: # For (effectively) both the overlap and model-only sets
            rga_values = [abs(row.getValue(ratio_field)) for row in arcpy.SearchCursor(RAG_final, ratio_field+filter)] # Return a list of all reporting gap analysis magnitudes (only positives or negatives, no nulls)
            cursor = arcpy.UpdateCursor(RAG_final, ratio_field+filter)
            
            if filter == " < 0": # Compare by value; "is" tests object identity
                x = -1 # Make model-only percentiles negative
            else:
                x = 1 # Keep overlap percentiles positive
            
            for i, row in enumerate(cursor): # For each row in the filtered dataset
                pct_v = x*scipy.stats.percentileofscore(rga_values, abs(row.getValue(ratio_field))) # Calculate the percentile of the reporting gap analysis value
        #         if i < 20: # View the first 20 ratio and percentile values
        #             print("r ",abs(row.getValue(ratio_field)))
        #             print("p ",pct_v)
                row.setValue(pct_field, pct_v) # Assign the percentile of that reporting gap analysis value
                cursor.updateRow(row) # Write the edited row back to the table
                del pct_v
                
            del rga_values, cursor

        # Add zeroes to cells where there's only iMap data
        cursor = arcpy.UpdateCursor(RAG_final, ratio_field+" = 0")
        for row in cursor:
            row.setValue(pct_field, 0)
            cursor.updateRow(row)
        del cursor
        

    # ----------
del filters
print("Analysis version of the reporting gap analysis complete")

Adding comparison fields
Calculating comparison values
Adding comparison fields
Calculating comparison values
Adding comparison fields
Calculating comparison values
Adding comparison fields
Calculating comparison values
Adding comparison fields
Calculating comparison values
Analysis version of the reporting gap analysis complete
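The percentile step ranks model-only magnitudes and overlap ratios separately, then signs the result so that model-only cells stay negative. The sketch below illustrates that logic on a plain list. `percentile_of_score` is a hand-written stand-in that is assumed to match the default `kind='rank'` behaviour of `scipy.stats.percentileofscore`. The function names and sample ratios are hypothetical.

```python
def percentile_of_score(values, score):
    """Percentile rank of score within values, averaging ranks of ties."""
    n = len(values)
    left = sum(1 for v in values if v < score)
    right = sum(1 for v in values if v <= score)
    plus1 = 1 if right > left else 0
    return (left + right + plus1) * 50.0 / n


def signed_percentiles(ratios):
    """Rank negatives and positives separately; keep the original sign."""
    negatives = [abs(v) for v in ratios if v is not None and v < 0]
    positives = [v for v in ratios if v is not None and v > 0]
    out = []
    for v in ratios:
        if v is None:
            out.append(None)   # no records: stays null
        elif v < 0:
            out.append(-percentile_of_score(negatives, abs(v)))
        elif v > 0:
            out.append(percentile_of_score(positives, v))
        else:
            out.append(0.0)    # iMap only: zero
    return out


ratios = [-1.0, -3.0, 0.0, 0.5, 2.0, None]
pcts = signed_percentiles(ratios)
print(pcts)
```

Ranking the two groups separately means a percentile is only comparable with other cells of the same type.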


## Create the viewing-ready version (1 layer with combined comp and pct fields)

In [4]:
# Section pseudocode
# For each species:
    # Select records with non-null values for that species
    # Copy to new feature
    # Collapse per-species ratio and percentile fields into two new ones 
    # Delete per-species ratio and percentile fields
    # Add & calculate common name, jurisdiction id, and op criteria fields
# Merge layers into one for upload into AGOL

In [7]:
merge_sets = list() # Updated to contain the per-species layers to merge

for species in species_shortnames: # For each species
    whereClause = ""
    # Create where clause used to select cells where there are records
    for i, comp_type in enumerate(comp_types):
        ratio_field = comp_type+species
        
        if i < 1:
            whereClause = ratio_field + " IS NOT NULL"
        else: 
            whereClause = whereClause + " or " + ratio_field + " IS NOT NULL"
        del ratio_field

    sel = arcpy.management.SelectLayerByAttribute(RAG_final, "NEW_SELECTION", whereClause) # Select cells where there are records
    
    ps_viewing_layer = "tmp_ps_view_"+species
    arcpy.management.CopyFeatures(sel, ps_viewing_layer) # Create a copy of records for just this species
    merge_sets.append(ps_viewing_layer) # Add to list for merging later

    
    for comp_type in comp_types:
        ratio_field = comp_type+species
        pct_field = "pct_"+ratio_field
        
        # Collapse per-species field values into two fields
        arcpy.management.CalculateField(ps_viewing_layer, comp_type+"ps", "!"+ratio_field+"!")
        arcpy.management.CalculateField(ps_viewing_layer, comp_type+"ps_pct", "!"+pct_field+"!")

        arcpy.management.DeleteField(ps_viewing_layer, [ratio_field, pct_field]) # Delete per-species fields
    
    # Add some attribute fields to enable filtering in the ArcGIS Online dashboard
    arcpy.management.CalculateField(ps_viewing_layer, "Common_Nam", species_fullnames[species])
    if species == "knot": # Compare by value; "is" tests object identity
        jsid = 1479 # Knotweed unspecified
    else:
        jsid = jurisdiction_ids[species]
    arcpy.management.CalculateField(ps_viewing_layer, "jurisdiction_species_id", jsid)
    arcpy.management.CalculateField(ps_viewing_layer, "op_criteria", "'"+threshold+"'")

arcpy.management.Merge(merge_sets, "SVI_Proj_RAG_1km_Viewing")

print("Viewing layer of the reporting gap analysis complete")

Viewing layer of the reporting gap analysis complete
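The viewing layer turns one wide row per cell (a set of fields for every species) into one row per cell and species with shared field names. A minimal sketch of that reshaping on plain dictionaries follows. The `cell_id` key, the function name, and the sample values are assumptions made for illustration only.

```python
def to_viewing_rows(cell, species_names, comp_types=("Cc_", "CUc_", "NDc_")):
    """Split one wide grid-cell record into per-species viewing rows."""
    rows = []
    for species in species_names:
        values = {c: cell.get(c + species) for c in comp_types}
        if all(v is None for v in values.values()):
            continue  # skip species with no records in this cell
        row = {"cell_id": cell["cell_id"], "species": species}
        for c, v in values.items():
            row[c + "ps"] = v
            row[c + "ps_pct"] = cell.get("pct_" + c + species)
        rows.append(row)
    return rows


# Hypothetical wide record for one grid cell
cell = {"cell_id": 7, "Cc_phrag": 2.0, "pct_Cc_phrag": 80.0, "Cc_knot": None}
rows = to_viewing_rows(cell, ["phrag", "knot"])
print(rows)
```

The shared field names are what allow a single species filter to drive the dashboard.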


In [9]:
# Delete temporary files
# This way is necessary to delete the feature itself and not just its contents
cws = arcpy.env.workspace

import os
tmpFeatures = list()
for species in species_shortnames:
    tmpFeatures.append("tmp_ps_view_"+species) # Add per-species layers
    tmpFeatures.append("iMap_nd_"+species) # Add iMap not-detected
    for rt in ['cnfrm', 'uncnfrm']:
        for geometry in ['point', 'line', 'polygon']:
            tmpFeatures.append("iMap_"+geometry+"_"+rt+"_"+species) # Add iMap per-geometry, per-record-type layers
    
# Delete each temporary per-species layer if it exists
for input in tmpFeatures:
  input_path = os.path.join(cws, input)
  if arcpy.Exists(input_path):
    arcpy.Delete_management(input_path)

# Delete user-defined variables
for obj in dir():
    if not obj.startswith("__"): # If not a system var
        del globals()[obj] # Delete