# Python lists and custom sorting

## Prepare GIS data for scenario

In [56]:
import pandas as pd
import arcgis
import arcpy
import os

The code below prepares the data we want for the analysis.
- Runs the Spatial Join geoprocessing tool to add to each species population the name of the EPA Region with the greatest overlap.
- Returns the results of the spatial join analysis as a pandas dataframe, spatially enabled through the ArcGIS API for Python.

The <code>field_mapping</code> parameter for <code>arcpy.analysis.SpatialJoin</code> limits which attributes are transferred, since we just want the EPA Region name.

In [57]:
gdb = r"C:\Demos\PYTS\ViablePops.gdb"

viablePops = 'RareSpecies'
epaRegions = 'EPA_Regions'
viablePopsRegions = 'RareSpecies_SpatialJoin'

arcpy.env.overwriteOutput = True
arcpy.env.workspace = gdb
    

In [58]:
arcpy.analysis.SpatialJoin(
    target_features=viablePops,
    join_features=epaRegions,
    out_feature_class=viablePopsRegions,
    join_operation="JOIN_ONE_TO_ONE",
    join_type="KEEP_ALL",
    field_mapping=f'Species "Species" true true false 255 Text 0 0,First,#,{viablePops},Species,0,255;Viability "Viability" true true false 1 Text 0 0,First,#,{viablePops},Viability,0,1;ObservationDate "ObservationDate" true true false 8 Date 0 0,First,#,{viablePops},ObservationDate,-1,-1;EPAREGION "EPA Region" true true false 50 Text 0 0,First,#,{epaRegions},EPAREGION,0,50',
    match_option="LARGEST_OVERLAP",
    search_radius=None,
    distance_field_name=""
)
print("EPA Region and viable populations now combined.")

EPA Region and viable populations now combined.
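
The long <code>field_mapping</code> string above is hard to read on one line. As a sketch only, it can be assembled from one tuple per output field. The helper <code>build_field_mapping</code> is hypothetical (it is not part of <code>arcpy</code>), and every value in the tuples is copied from the string in the cell above rather than derived from the tool's documentation.

```python
# Same dataset names as in the notebook
viablePops = 'RareSpecies'
epaRegions = 'EPA_Regions'

# One tuple per output field:
# (field name, alias, length, type, source dataset, start, end)
# All values are copied verbatim from the notebook's field_mapping string.
fields = [
    ('Species', 'Species', 255, 'Text', viablePops, 0, 255),
    ('Viability', 'Viability', 1, 'Text', viablePops, 0, 1),
    ('ObservationDate', 'ObservationDate', 8, 'Date', viablePops, -1, -1),
    ('EPAREGION', 'EPA Region', 50, 'Text', epaRegions, 0, 50),
]


def build_field_mapping(specs):
    """Hypothetical helper: join one entry per field with ';'."""
    parts = []
    for name, alias, length, ftype, source, start, end in specs:
        parts.append(
            f'{name} "{alias}" true true false {length} {ftype} 0 0,'
            f'First,#,{source},{name},{start},{end}'
        )
    return ';'.join(parts)


field_mapping = build_field_mapping(fields)
print(field_mapping)
```

The resulting string could then be passed as <code>field_mapping=field_mapping</code> in the <code>SpatialJoin</code> call.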


In [59]:
df = pd.DataFrame.spatial.from_featureclass(viablePopsRegions)
df.head(5)

Unnamed: 0,OBJECTID,Join_Count,TARGET_FID,Species,Viability,ObservationDate,EPAREGION,SHAPE
0,1,1,1,S1,A,,Region 4,"{""rings"": [[[-8872754.7646, 4267030.036399998]..."
1,2,1,2,S1,A,,Region 3,"{""rings"": [[[-9111767.8654, 4635699.698799998]..."
2,3,1,3,S1,A,,Region 7,"{""rings"": [[[-10369664.3401, 5079475.3587], [-..."
3,4,1,4,S1,A,,Region 7,"{""rings"": [[[-10484599.0434, 4587810.239100002..."
4,5,1,5,S1,H,,Region 2,"{""rings"": [[[-8616910.1151, 5226336.368500002]..."


## List <code>sort</code> method

We have a list that represents population viability, from best (A) to worst (H): 
<code>['A','B','E','C','D','F','X','H']</code>.

The current order of the elements in the list is the order in which we want to sort datasets (of species populations in various regions). In this list, <code>E</code> is better than <code>C</code>, and <code>X</code> is better than <code>H</code>.

A list's <code>sort</code> method can arrange values in ascending or descending order. Neither helps in this scenario, because alphabetical order places <code>C</code> before <code>E</code> and <code>H</code> before <code>X</code>.

In [60]:
popViability = ['A','B','E','C','D','F','X','H']
popViability.sort()
popViability

['A', 'B', 'C', 'D', 'E', 'F', 'H', 'X']
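
For reference, a minimal sketch of both built-in orderings. It uses <code>sorted</code>, which returns a new list and leaves the original untouched, whereas <code>list.sort</code> sorts in place. Neither result keeps <code>E</code> ahead of <code>C</code> the way our viability ranking requires.

```python
popViability = ['A', 'B', 'E', 'C', 'D', 'F', 'X', 'H']

# sorted() returns a new list; popViability itself is not modified
ascending = sorted(popViability)
descending = sorted(popViability, reverse=True)

print(ascending)
print(descending)
print(popViability)  # still in the original viability order
```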

## List <code>sort</code> method's <code>key</code> parameter

We will use the <code>Viability</code> attribute from our spatial join results by converting it to a <code>list</code>.

We will need a dictionary to associate a numeric value with each viability score. This dictionary will be used in the custom function for the <code>key</code> parameter.

In [61]:
viabilityList = df['Viability'].to_list()

popViabilityD = {
    'A':0,
    'B':1,
    'E':2,
    'C':3,
    'D':4,
    'F':5,
    'X':6,
    'H':7
}
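
Typing the dictionary by hand works for eight values. As an alternative sketch, the same dictionary can be derived from the ordered list with <code>enumerate</code>, so the ranking only has to be written once. The names <code>viabilityOrder</code> and <code>popViabilityFromList</code> are illustrative and do not appear elsewhere in this notebook.

```python
# Desired order, best (A) to worst (H)
viabilityOrder = ['A', 'B', 'E', 'C', 'D', 'F', 'X', 'H']

# enumerate() yields (position, value) pairs; the comprehension flips them
# so that each viability score maps to its position in the list
popViabilityFromList = {score: rank for rank, score in enumerate(viabilityOrder)}

print(popViabilityFromList)
```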

The <code>sort</code> method has an optional parameter, <code>key</code>, that lets you provide a custom function defining the sorting criteria. Here's how the code below works:
- Access the <code>sort</code> method for the list.
- Use the <code>key</code> parameter.
- <code>lambda</code> is an anonymous function. A what? Details:
    - This is useful so we don't have to define a named function elsewhere in the code that will only be used once.
    - A way of reading the <code>lambda</code> function is "Hey Python, <code>popViabilityD</code> is a dictionary. Take each element of the list as <code>x</code>. Return the dictionary's value for that key".
    - This way, the <code>key</code> parameter uses <code>popViabilityD</code>'s values as the custom sort order to apply to the list, <code>viabilityList</code>.

In [62]:
print("Original list from the data:")
print(viabilityList)

# Sort in place, ranking each score by its value in popViabilityD
viabilityList.sort(key=lambda x: popViabilityD[x])

print("That list sorted using a custom key:")
print(viabilityList)

Original list from the data:
['A', 'A', 'A', 'A', 'H', 'H', 'B', 'B', 'X', 'X', 'A', 'A', 'E', 'B', 'X', 'C', 'D', 'A', 'B', 'B', 'F', 'E', 'B', 'E', 'D', 'D']
That list sorted using a custom key:
['A', 'A', 'A', 'A', 'A', 'A', 'A', 'B', 'B', 'B', 'B', 'B', 'B', 'E', 'E', 'E', 'C', 'D', 'D', 'D', 'F', 'X', 'X', 'X', 'H', 'H']
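
To make the <code>lambda</code> less mysterious, here is a sketch that sorts a short sample list twice: once with the <code>lambda</code> and once with an equivalent named function. The function name <code>viability_rank</code> and the sample values are made up for illustration. The last part applies the same idea to tuples, and shows that Python's sort is stable: items with equal keys keep their original relative order.

```python
popViabilityD = {'A': 0, 'B': 1, 'E': 2, 'C': 3, 'D': 4, 'F': 5, 'X': 6, 'H': 7}


def viability_rank(score):
    """Named equivalent of: lambda x: popViabilityD[x]"""
    return popViabilityD[score]


sample = ['H', 'A', 'X', 'E', 'C']  # illustrative values only

by_lambda = sorted(sample, key=lambda x: popViabilityD[x])
by_function = sorted(sample, key=viability_rank)
print(by_lambda)
print(by_function)

# The key function can also pick the score out of a larger record.
# These (species, viability) pairs are illustrative, not from the dataset.
records = [('S1', 'H'), ('S2', 'A'), ('S3', 'X'), ('S4', 'A')]
sorted_records = sorted(records, key=lambda rec: popViabilityD[rec[1]])
print(sorted_records)  # S2 stays ahead of S4 because the sort is stable
```

Note that a score missing from <code>popViabilityD</code> would raise a <code>KeyError</code> with either form.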


## Apply custom sorting to a <code>pandas</code> <code>DataFrame</code>

This list is now sorted in the order we need. However, this list was extracted from a <code>DataFrame</code>. Let's see how we can sort the whole <code>DataFrame</code> according to the <code>Viability</code> attribute using this custom ordering.

- To start, we will create a <code>DataFrame</code> containing the desired order of values, <code>orderViableDF</code>.
- Then move its positional index into a column and set the <code>Viability</code> attribute of <code>orderViableDF</code> as the index, so that each value is associated with its position in the sequence.
- Then, sort the <code>DataFrame</code> of the data according to that order. The cell below maps each row's <code>Viability</code> to its position in a new <code>V_Num</code> column that can serve as the sort key.

In [65]:
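# One row per viability score, in the desired order (row position = rank)
orderViableDF = pd.DataFrame({'Viability':['A','B','E','C','D','F','X','H']})
# reset_index() copies the row position into an 'index' column;
# set_index('Viability') then lets us look that position up by score
sort = orderViableDF.reset_index().set_index('Viability')
# Add each row's rank to the data as a numeric column
df['V_Num'] = df['Viability'].map(sort['index'])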
df.head(10)

Unnamed: 0,OBJECTID,Join_Count,TARGET_FID,Species,Viability,ObservationDate,EPAREGION,SHAPE,V_Num
0,1,1,1,S1,A,,Region 4,"{""rings"": [[[-8872754.7646, 4267030.036399998]...",0
1,2,1,2,S1,A,,Region 3,"{""rings"": [[[-9111767.8654, 4635699.698799998]...",0
2,3,1,3,S1,A,,Region 7,"{""rings"": [[[-10369664.3401, 5079475.3587], [-...",0
3,4,1,4,S1,A,,Region 7,"{""rings"": [[[-10484599.0434, 4587810.239100002...",0
4,5,1,5,S1,H,,Region 2,"{""rings"": [[[-8616910.1151, 5226336.368500002]...",7
5,6,1,6,S1,H,,Region 5,"{""rings"": [[[-10012089.7077, 5478554.189499997...",7
6,7,1,7,S1,B,,Region 4,"{""rings"": [[[-8958521.5943, 4175960.8857000023...",1
7,8,1,8,S1,B,,Region 4,"{""rings"": [[[-9015988.946, 3958862.0018000007]...",1
8,9,1,9,S1,X,,Region 6,"{""rings"": [[[-11489105.403, 4372719.2205], [-1...",6
9,10,3,10,S2,X,,Region 5,"{""rings"": [[[-10694065.5447, 5502922.940700002...",6
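
With the numeric rank in place, the remaining step is the sort itself. The sketch below uses a small stand-in <code>DataFrame</code> (<code>demo</code>, with made-up rows) instead of the spatial join results, and a plain dictionary in place of <code>orderViableDF</code>. It shows two options: sorting on a <code>V_Num</code> column with <code>sort_values</code>, and converting <code>Viability</code> to an ordered <code>pd.Categorical</code> so the column sorts in the custom order directly. Treat it as one possible approach under those assumptions.

```python
import pandas as pd

order = ['A', 'B', 'E', 'C', 'D', 'F', 'X', 'H']
rank = {score: i for i, score in enumerate(order)}

# Stand-in for the spatial join results; the rows are illustrative only
demo = pd.DataFrame({
    'Species': ['S1', 'S1', 'S2', 'S2', 'S3'],
    'Viability': ['H', 'A', 'X', 'E', 'C'],
})

# Option 1: add a numeric rank column, then sort on it
by_rank = demo.assign(V_Num=demo['Viability'].map(rank)).sort_values('V_Num')

# Option 2: make Viability an ordered categorical and sort on it directly
by_category = demo.assign(
    Viability=pd.Categorical(demo['Viability'], categories=order, ordered=True)
).sort_values('Viability')

print(by_rank)
print(by_category)
```

On the notebook's own data, the first option would correspond to <code>df.sort_values('V_Num')</code>.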
