# Prerequisites

## Python
1. Variables
2. Control flow
    1. If else statements
    2. For loops
    3. Logical operators
3. Data types
    1. Lists
    2. Dictionaries
    3. Sets
4. Function declaration, argument passing
5. File handling
6. Exception handling
7. Object-oriented programming

## GIS
1. Data types
2. Spatial referencing
3. Geoprocessing tools
4. Familiarity with the ArcGIS interface

## Data
[Survey of India Administrative Boundary Database](https://onlinemaps.surveyofindia.gov.in/Digital_Product_Show.aspx)

# Introduction to arcpy

arcpy is a Python library offered by Esri for manipulating geographic data.

Uses:
- Automating repetitive steps and geoprocessing workflows.
- Building larger applications by combining it with other applications.
- Building custom toolboxes.
- Working alongside the ArcGIS API for Python, which manages data in ArcGIS Online or a portal.

# Environment settings
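1. Many code examples in the Esri documentation begin by setting environment settings through `arcpy.env`. For example:

`arcpy.env.workspace = "somepath"`

2. Setting the workspace tells arcpy where to look for input data and where to write outputs, so datasets can be referred to by name instead of by full path. Other environment settings, such as the output coordinate system, are set in the same way.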
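
As a rough sketch of what the workspace setting provides, the snippet below sets two environment settings and shows the full path that a bare dataset name would then refer to. The folder `C:\data\training` and the helper `dataset_path` are made up for illustration; the `try`/`except` around the import is only there so the snippet can also be run outside an ArcGIS Python environment.

```python
import ntpath  # Windows-style path joining, whatever OS runs this snippet

try:
    import arcpy  # only importable inside an ArcGIS Python environment
except ImportError:
    arcpy = None

# Hypothetical folder; replace it with the folder that holds your data
WORKSPACE = r"C:\data\training"


def dataset_path(workspace, dataset_name):
    """Full path that a bare dataset name refers to inside a workspace."""
    return ntpath.join(workspace, dataset_name)


if arcpy is not None:
    # Inputs and outputs given by name are now looked up in WORKSPACE
    arcpy.env.workspace = WORKSPACE
    # Allow tools to overwrite outputs that already exist
    arcpy.env.overwriteOutput = True

print(dataset_path(WORKSPACE, "STATE_BOUNDARY.shp"))
```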

# Tools and non-tools

1. Tools and non-tools can both be viewed as functions available in arcpy. The basic difference is that every tool has a counterpart in a dedicated toolbox, while a non-tool function does not.
2. Example tool: https://pro.arcgis.com/en/pro-app/latest/tool-reference/data-management/copy.htm
3. Example non-tool: https://pro.arcgis.com/en/pro-app/latest/arcpy/functions/exists.htm
4. Tools return a Result object whose output may be a path, a number or a Boolean. Non-tool functions return ordinary Python values directly.

# Data Preprocessing

In [1]:
import arcpy

# Environment settings
arcpy.env.workspace = r"C:\Users\nomitrawat\Documents\Training\Data\Administrative Boundary Database"
arcpy.env.overwriteOutput = True

# Example of a non-tool method that returns boolean
if arcpy.Exists("STATE_BOUNDARY.shp"):
    print("Found the feature class")

# Example of a tool that returns a path
result = arcpy.management.Copy("STATE_BOUNDARY.shp", "STATE_BOUNDARY_COPY.shp")

print("The output is at the following path:")
print(result)

Found the feature class
The output is at the following path:
C:\Users\nomitrawat\Documents\Training\Data\Administrative Boundary Database\STATE_BOUNDARY_COPY.shp
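
The path printed above comes from converting the Result object to text. A small helper along the following lines could collect every output of a tool run instead. `result_outputs` is a hypothetical name, and the sketch relies only on the `outputCount` property and the `getOutput()` method of arcpy's Result object. It assumes the workspace has already been set as in the cell above; without arcpy or the shapefile, the tool call is simply skipped.

```python
def result_outputs(result):
    """Return all outputs held by an arcpy Result-like object as a list."""
    return [result.getOutput(index) for index in range(result.outputCount)]


try:
    import arcpy  # only importable inside an ArcGIS Python environment
except ImportError:
    arcpy = None

if arcpy is not None and arcpy.Exists("STATE_BOUNDARY.shp"):
    arcpy.env.overwriteOutput = True
    result = arcpy.management.Copy("STATE_BOUNDARY.shp", "STATE_BOUNDARY_COPY.shp")
    # For Copy there is a single output: the path of the new dataset
    print(result_outputs(result))
```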


# Some key points

1. arcpy is divided into modules in the same way that the toolboxes are divided (for example, `arcpy.management` corresponds to the Data Management toolbox).
2. arcpy can be used in standalone Python scripts.
3. Use conda to clone the arcpy environment and build applications on top of the clone.
4. [Useful link on accessing the ArcGIS Python interpreter in Visual Studio Code and using conda to clone the environment](https://resources.esri.ca/getting-technical/how-to-configure-visual-studio-code-with-arcgis-pro-s-python-environment)
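
Before building on a cloned environment, it can help to confirm that the interpreter your editor picked up can actually see arcpy. One possible check, using only the standard library (`arcpy_available` is a hypothetical helper name):

```python
import importlib.util
import sys


def arcpy_available():
    """True if the active Python interpreter can locate the arcpy package."""
    return importlib.util.find_spec("arcpy") is not None


# Shows which python.exe is running and whether it can import arcpy
print("Interpreter:", sys.executable)
print("arcpy importable:", arcpy_available())
```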

# Working with data

1. Inspect the attribute table for "STATE_BOUNDARY_COPY.shp".
2. Manually identifying all the errors is tedious.
3. Use cursors to go through the table programmatically instead.

# Searching for errors using a cursor
1. A cursor goes through the attribute table one row at a time.
2. There are three types:
    1. Search: reads rows
    2. Update: modifies or deletes rows
    3. Insert: adds new rows

In [1]:
# Fields to check, and a dictionary that maps each suspect character
# to the object IDs where it occurs
fields = ["OID@", "STATE"]
characters = {}

# Search cursor
with arcpy.da.SearchCursor(result, fields) as cursor:
    # Each row in the attribute table
    for row in cursor:
        feature_id, state_name = row[0], row[1]
        
        for letter in state_name:
            
            # Skip characters that are valid in a state name:
            # spaces, "&", commas and brackets
            if letter in " &(),":
                continue

            # Record any other non-alphabetic character with the feature's object ID
            if not letter.isalpha():
                characters.setdefault(letter, []).append(feature_id)

# Print each non-alphabetic character with the object IDs where it occurs
for character, feature_ids in characters.items():
    print(f"Non-alphabetic character: {character}")
    print("Occurs at the following Object IDs:")
    print(", ".join(str(feature_id) for feature_id in feature_ids))
        

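
The same check can be packaged as a function that takes the rows as an argument, so it does not depend on a variable defined in an earlier cell and can be tried on plain tuples before pointing it at a cursor. This is only a sketch: `find_suspect_characters` and the sample rows are invented for illustration and are not taken from the dataset.

```python
# Characters that are acceptable in a name besides letters
ALLOWED = set(" &(),")


def find_suspect_characters(rows):
    """Map each non-alphabetic character to the object IDs where it occurs.

    `rows` is any iterable of (object_id, name) pairs, for example an
    arcpy.da.SearchCursor opened on ["OID@", "STATE"].
    """
    suspects = {}
    for object_id, name in rows:
        for letter in name:
            if letter in ALLOWED or letter.isalpha():
                continue
            suspects.setdefault(letter, []).append(object_id)
    return suspects


# Made-up rows standing in for the attribute table
sample_rows = [(0, "BIHAR"), (1, "B|HAR"), (2, "DADRA & NAGAR HAVELI"), (3, "GUJAR>T")]
print(find_suspect_characters(sample_rows))
```

With arcpy available, the function could be called as `find_suspect_characters(cursor)` inside a `with arcpy.da.SearchCursor(...)` block.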

# Updating errors

We will update all the instances of errors using an update cursor.

In [3]:
# Update cursor using the same fields as declared above
with arcpy.da.UpdateCursor(result, fields) as cursor:
    for row in cursor:
        state_name = row[1]
        new_state_name = ""
        
        for letter in state_name:
            if letter == ">":
                new_state_name += "A"
            elif letter == "|":
                new_state_name += "I"
            # The state name for Chhattisgarh contains a lowercase "t"
            elif letter.islower():
                new_state_name += letter.upper()
            else:
                new_state_name += letter
        
        row[1] = new_state_name
        
        # Don't forget to update the row
        cursor.updateRow(row)
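
The per-letter replacement can also be written as a small pure function, which makes the rule easy to test before any data is modified. The sketch below assumes the same two substitutions as the loop above; `clean_state_name` is a hypothetical helper, and the sample names are illustrative.

```python
# Characters that were mistyped in the data and the letters they stand for
REPLACEMENTS = str.maketrans({">": "A", "|": "I"})


def clean_state_name(name):
    """Replace the known bad characters and upper-case the whole name."""
    return name.translate(REPLACEMENTS).upper()


print(clean_state_name("CHHAtTISGARH"))
print(clean_state_name("B|HAR"))
```

Inside the update cursor loop, this would reduce the body to `row[1] = clean_state_name(row[1])` followed by `cursor.updateRow(row)`.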

# Some other useful methods

1. Listing the fields in a feature class.
2. Listing all the feature classes in the workspace/geodatabase.
3. Checking the spatial reference.

In [24]:
fields = arcpy.ListFields("STATE_BOUNDARY_COPY.shp")
print("Fields present :")
for field in fields:
    print(field.name)
    
feature_classes = arcpy.ListFeatureClasses()
print("Feature classes present :")
for feature_class in feature_classes:
    print(feature_class)

spatial_reference = arcpy.Describe("STATE_BOUNDARY_COPY.shp").spatialReference
print("Name of the spatial reference:")
print(spatial_reference.name)

Fields present :
FID
Shape
STATE
State_LGD
Shape_Leng
Shape_Area
Feature classes present :
DISTRICT_BOUNDARY.shp
DISTRICT_HQ.shp
MAJOR_TOWNS.shp
STATE_BOUNDARY.shp
STATE_BOUNDARY_COPY.shp
STATE_HQ.shp
SUBDISTRICT_BOUNDARY.shp
Name of the spatial reference:
LCC_WGS84
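
Field names alone do not show whether a corrected value will fit in a field. A sketch that also reports each field's type and length, using the `name`, `type` and `length` properties of arcpy's Field objects (`field_summary` is a hypothetical helper, and the workspace is assumed to be set already):

```python
def field_summary(fields):
    """Return (name, type, length) for each arcpy Field-like object."""
    return [(field.name, field.type, field.length) for field in fields]


try:
    import arcpy  # only importable inside an ArcGIS Python environment
except ImportError:
    arcpy = None

if arcpy is not None and arcpy.Exists("STATE_BOUNDARY_COPY.shp"):
    listed = arcpy.ListFields("STATE_BOUNDARY_COPY.shp")
    for name, field_type, length in field_summary(listed):
        print(f"{name}: {field_type} (length {length})")
```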


# Exercise
1. Check the following feature classes for errors and correct any that you find:
    1. DISTRICT_BOUNDARY.shp
    2. SUBDISTRICT_BOUNDARY.shp

2. Delete the geometries having disputed boundaries.