# Introduction
In this assignment, you will do exercises on arcpy.Exists(), arcpy.da.Walk(), arcpy.da.Describe(), Python list comprehensions, arcpy.da.SearchCursor(), and arcpy.da.InsertCursor(). In each section, a code block is provided but is not ready to run. You need to add this notebook to ArcGIS Pro, modify each block to make it runnable, keep the output message, and write an explanation of the code block.

 Data preparation: 
Use the same zip file from assignment 4 for this exercise. The zip file contains a geodatabase and a folder with shapefiles. Download the data from the [data](../data) folder of the GitHub website. Extract the zip file and use the geodatabase and the folder containing the shapefiles accordingly.

## Read a text file for data and write them into a feature class using InsertCursor

- This code uses the points.csv file in the same folder as the notebook
- Modify the geodatabase_path variable to make it work and write the "points" feature class in your geodatabase


In [3]:
import arcpy
import csv

# Prompt for user input
csv_file_path = r"C:\Users\cecil\Documents\ArcGIS\Projects\GEOG4057\points.csv"
geodatabase_path = r"C:\Users\cecil\Documents\ArcGIS\Projects\GEOG4057\ex8\Austin_data.gdb"
feature_class_name = "points"

# Determine the number of fields and field names from the CSV header
with open(csv_file_path, 'r') as csv_file:
    csv_reader = csv.reader(csv_file)
    header = next(csv_reader)
    num_fields = len(header)

# Create a new point feature class
sr = arcpy.SpatialReference(4326)  
arcpy.CreateFeatureclass_management(geodatabase_path, feature_class_name, "POINT", spatial_reference=sr)

# Add fields to the feature class based on the CSV header
field_info = arcpy.ListFields(geodatabase_path + "/" + feature_class_name)
field_names = [field.name for field in field_info]
for field in header:
    if field not in field_names:
        arcpy.AddField_management(geodatabase_path + "/" + feature_class_name, field, "DOUBLE")

# Create a cursor for inserting point features with additional fields
cursor_fields = ["SHAPE@X", "SHAPE@Y"] + header
with arcpy.da.InsertCursor(geodatabase_path + "/" + feature_class_name, cursor_fields) as cursor:
    # Read data from the CSV file and create point features with attributes
    with open(csv_file_path, 'r') as csv_file:
        csv_reader = csv.reader(csv_file)
        
        # Skip the header row (if present)
        next(csv_reader, None)
        
        for row in csv_reader:
            x = float(row[0])
            y = float(row[1])
            additional_fields = row
            cursor.insertRow((x, y) + tuple(additional_fields))

print(f"New point feature class '{feature_class_name}' created in '{geodatabase_path}' with {num_fields} fields.")

New point feature class 'points' created in 'C:\Users\cecil\Documents\ArcGIS\Projects\GEOG4057\ex8\Austin_data.gdb' with 5 fields.


**Edit this block to answer these three questions**

### Q1: Line by line, explain what was done in the code. (20 pnts)

1. imports the `arcpy` module
2. imports the `csv` module
3. blank line for readability
4. comment explaining the function of the following code block to the user
5. defines `csv_file_path` and assigns it the path to a user-specified CSV file
6. defines `geodatabase_path` and assigns it the path to user-specified geodatabase file
7. defines `feature_class_name` as the string "points"
8. blank line for readability
9. comment explaining the function of the following code block to the user
10. uses a `with` statement to open `csv_file_path` in read mode
11. creates a `csv_reader` object using the `csv.reader()` function from the `csv` module
12. retrieves the header row of the CSV using the `next()` function
13. calculates the number of fields by getting the length of the `header` list using the `len()` function
14. blank line for readability
15. comment explaining the function of the following code block to the user
16. creates the spatial reference object `sr` using the `arcpy.SpatialReference` class and assigns it the spatial reference for WGS 84, which corresponds to the EPSG code 4326
17. the `arcpy.CreateFeatureclass_management()` function uses the previously defined objects to create a new feature class in the specified geodatabase
18. blank line for readability
19. comment explaining the function of the following code block to the user
20. uses the `arcpy.ListFields()` function to create a list named `field_info` of `Field` objects from the feature class
21. creates a list of field names by extracting the `name` attribute from each `Field` object in `field_info`
22. uses a `for` loop to iterate over each field name in `header`
23. checks if the field from the `header` does not already exist in the `field_names` list
24. uses the `arcpy.AddField_management()` function to add the field to the feature class if it is missing and assigns it the data type "DOUBLE"
25. blank line for readability
26. comment explaining the function of the following code block to the user
27. creates a list of fields that will include the x and y coordinates (`SHAPE@X`, `SHAPE@Y`) for points and all of the field names from the `header` list
28. uses a `with` statement to open an `InsertCursor` from the `arcpy.da` module to insert new points into the feature class
29. comment explaining the function of the following code block to the user
30. uses a `with` statement to open `csv_file_path` in read mode
31. creates a `csv_reader` object using the `csv.reader()` function from the `csv` module
32. blank line for readability
33. comment explaining the function of the following code block to the user
34. uses `next()` to skip the header row in `csv_reader`, if present
35. blank line for readability
36. uses a `for` loop to iterate over each row in `csv_reader`
37. extracts and converts the x-coordinate from the first element of `row` to a float data type
38. extracts and converts the y-coordinate from the second element of `row` to a float data type
39. assigns the entire `row` to `additional_fields` for including all additional attribute data
40. uses the `insertRow` method of the `cursor` object to write the x and y coordinates along with all of the additional fields to the feature class
41. blank line for readability
42. uses the `print()` function to inform the user that the new point feature class was created successfully and provides details about the feature class
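
The row-building steps in items 36 to 40 can be tried without ArcGIS. The following is a minimal sketch under stated assumptions: the CSV text and its column names (`X`, `Y`, `Field1` to `Field3`) are made up, and a plain list stands in for the `InsertCursor`. Unlike the original block, it converts every value to `float` before building the row.

```python
import csv
import io

# Hypothetical CSV content; the real points.csv may use different column names.
SAMPLE_CSV = "X,Y,Field1,Field2,Field3\n-97.74,30.27,1,2,3\n-97.75,30.28,4,5,6\n"


def build_rows(csv_text):
    """Return (cursor_fields, rows) shaped the way the insert loop builds them."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)  # the first line holds the field names
    cursor_fields = ["SHAPE@X", "SHAPE@Y"] + header
    rows = []
    for record in reader:
        values = tuple(float(value) for value in record)
        # x and y appear twice: once as geometry, once as ordinary attributes
        rows.append((values[0], values[1]) + values)
    return cursor_fields, rows


cursor_fields, rows = build_rows(SAMPLE_CSV)
print(cursor_fields)
for row in rows:
    print(row)
```

Each row has exactly one value per entry in `cursor_fields`, which is the shape `insertRow` expects.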
   
### Q2: What does the code 4326 represent? (5 pnts)
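- 4326 is the EPSG code (well-known ID) for WGS 84, a geographic coordinate system that stores coordinates as longitude and latitude in decimal degrees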
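
Because WGS 84 coordinates are degrees, a quick range check can catch swapped or projected values before they are inserted. This is only an illustrative sketch: the function name `looks_like_wgs84` and the sample coordinates are hypothetical and not part of the assignment code.

```python
def looks_like_wgs84(x, y):
    """Return True if x is a valid longitude and y is a valid latitude."""
    return -180.0 <= x <= 180.0 and -90.0 <= y <= 90.0


# Hypothetical sample values
in_range = looks_like_wgs84(-97.74, 30.27)       # longitude, latitude
swapped = looks_like_wgs84(30.27, -97.74)        # latitude and longitude swapped
projected = looks_like_wgs84(621000.0, 3349000.0)  # metres, not degrees

print(in_range, swapped, projected)
```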

### Q3: Explain what are `SHAPE@X` and `SHAPE@Y` (5 pnts)
- `SHAPE@X` and `SHAPE@Y` are geometry tokens specific to `arcpy.da`. They are used to access or assign the x and y coordinates of point geometries when working with cursors.

## Use arcpy.Exists()

- This code checks the existence of a specified dataset within an ArcGIS workspace.
- Set dataset_name to the feature class name (points) and workspace_path to the geodatabase from the last block
- Print a message indicating whether the dataset exists or not.

In [6]:
import arcpy
import os

# Prompt for user input
dataset_name = "points"
workspace_path = r"C:\Users\cecil\Documents\ArcGIS\Projects\GEOG4057\ex8\Austin_data.gdb"
fullname = os.path.join(workspace_path,dataset_name)
# Check if the dataset exists
if arcpy.Exists(fullname):
    print(f"The dataset '{dataset_name}' exists in the workspace.")
else:
    print(f"The dataset '{dataset_name}' does not exist in the workspace.")

The dataset 'points' exists in the workspace.


**Edit this block to answer the question**

### Q4: Line by line describe what was done by the code in the block above. (10 pnts) 

1. imports the `arcpy` module
2. imports the `os` module
3. blank line for readability
4. comment explaining the function of the following code block to the user
5. defines `dataset_name` as the string "points"
6. defines `workspace_path` and assigns it the path to a user-specified default geodatabase
7. uses the `os.path.join()` function to concatenate `workspace_path` and `dataset_name`, resulting in the full path to the dataset (`fullname`)
8. comment explaining the function of the following code block to the user
9. the `if` statement uses the `arcpy.Exists()` function to check if the dataset "points" exists at the filepath specified by `fullname` and triggers the `print()` statement below if it does
10. uses the `print()` function to inform the user that the dataset exists in the provided geodatabase
11. the `else:` statement triggers the `print()` function below if the `if` condition is not met
12. uses the `print()` function to inform the user that the dataset does not exist in the provided geodatabase
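
The join-then-check pattern described above can be sketched as a small function. This is a hedged illustration rather than the assignment's code: `dataset_message` is a hypothetical helper, and `os.path.exists` on a temporary folder stands in for `arcpy.Exists` on a geodatabase so that the example runs outside ArcGIS Pro.

```python
import os
import tempfile


def dataset_message(workspace_path, dataset_name, exists=os.path.exists):
    """Build the same message as the block above.

    `exists` is any callable that takes a path and returns True or False.
    In ArcGIS Pro, arcpy.Exists could be passed instead; os.path.exists is
    only a stand-in here.
    """
    fullname = os.path.join(workspace_path, dataset_name)
    if exists(fullname):
        return f"The dataset '{dataset_name}' exists in the workspace."
    return f"The dataset '{dataset_name}' does not exist in the workspace."


# Demonstration on a throwaway folder instead of a geodatabase.
with tempfile.TemporaryDirectory() as workspace:
    open(os.path.join(workspace, "points"), "w").close()
    found_message = dataset_message(workspace, "points")
    missing_message = dataset_message(workspace, "no_such_dataset")

print(found_message)
print(missing_message)
```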


## Use arcpy.da.Walk()

- This code uses arcpy.da.Walk() to iterate through all feature datasets in the geodatabase you just used.
- Modify the name of "workspace" to make the code work
- Run the code to list all the feature classes within each dataset.


In [7]:
import arcpy

# Define the workspace
workspace = r"C:\Users\cecil\Documents\ArcGIS\Projects\GEOG4057\ex8\Austin_data.gdb" # note that you need to put a full path name for this workspace, even when running it in ArcGIS Pro.

# Use arcpy.da.Walk() to iterate through feature datasets
for dirpath, dirnames, filenames in arcpy.da.Walk(workspace, datatype="FeatureClass"):
    for filename in filenames:
        print(f"Feature Class in {dirpath}: {filename}")

Feature Class in C:\Users\cecil\Documents\ArcGIS\Projects\GEOG4057\ex8\Austin_data.gdb: addresses
Feature Class in C:\Users\cecil\Documents\ArcGIS\Projects\GEOG4057\ex8\Austin_data.gdb: base
Feature Class in C:\Users\cecil\Documents\ArcGIS\Projects\GEOG4057\ex8\Austin_data.gdb: buildings
Feature Class in C:\Users\cecil\Documents\ArcGIS\Projects\GEOG4057\ex8\Austin_data.gdb: facilities
Feature Class in C:\Users\cecil\Documents\ArcGIS\Projects\GEOG4057\ex8\Austin_data.gdb: historical_landmarks
Feature Class in C:\Users\cecil\Documents\ArcGIS\Projects\GEOG4057\ex8\Austin_data.gdb: hospitals
Feature Class in C:\Users\cecil\Documents\ArcGIS\Projects\GEOG4057\ex8\Austin_data.gdb: parks
Feature Class in C:\Users\cecil\Documents\ArcGIS\Projects\GEOG4057\ex8\Austin_data.gdb: sidewalks
Feature Class in C:\Users\cecil\Documents\ArcGIS\Projects\GEOG4057\ex8\Austin_data.gdb: trees
Feature Class in C:\Users\cecil\Documents\ArcGIS\Projects\GEOG4057\ex8\Austin_data.gdb: points


**Edit this block to answer the question**

### Q5: Line by line, describe what was done by the code in the block above. (10 pnts)

1. imports the arcpy module
2. blank line for readability
3. comment explaining the function of the following code block to the user
4. defines `workspace` and assigns it the path to a user-specified geodatabase 
5. blank line for readability
6. comment explaining the function of the following code block to the user
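7. uses a `for` loop with `arcpy.da.Walk()` to traverse the workspace; each iteration yields the path of the current workspace (`dirpath`), the names of any child workspaces such as feature datasets (`dirnames`), and the names of the datasets found there (`filenames`), limited to feature classes by `datatype="FeatureClass"`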
8. uses a nested `for` loop to iterate over each `filename` (FeatureClass) within the current directory
9. uses the `print()` function to display the directory path (`dirpath`) and filename (`filename`) for each FeatureClass in the workspace
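
A common next step is to join `dirpath` and `filename` into a full dataset path. The sketch below assumes hand-written triples in the same `(dirpath, dirnames, filenames)` shape that `arcpy.da.Walk()` yields; the feature dataset name `Transportation` and its contents are invented for illustration, and no geodatabase is read.

```python
import os

# Hypothetical triples shaped like the ones arcpy.da.Walk() yields.
walk_triples = [
    ("Austin_data.gdb", ["Transportation"], ["parks", "trees"]),
    (os.path.join("Austin_data.gdb", "Transportation"), [], ["streets"]),
]

# One full path per feature class, flattened across all workspaces.
feature_class_paths = [
    os.path.join(dirpath, filename)
    for dirpath, dirnames, filenames in walk_triples
    for filename in filenames
]

for path in feature_class_paths:
    print(path)
```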

## Use List Comprehension

- The following block uses list comprehension to generate a list of .shp files in a specified folder.
- Choose the folder name from assignment 4 data/paris subfolder where many shapefiles are located and use it for folder_path
- Run the code to print the list of file names


- In the second block, the code combines the os.walk() function with the list comprehension to list all shapefiles in a folder including subfolders



In [1]:
import os

# Specify the folder path
folder_path = r"C:\Users\cecil\Documents\ArcGIS\Projects\GEOG4057\ModelBuilder_data\ParisData"

# Use list comprehension to generate a list of .shp files
shp_files = [file for file in os.listdir(folder_path) if file.endswith(".shp")]

# Print the list of .shp files
print("Shapefiles in the folder:")
for shp_file in shp_files:
    print(shp_file)

Shapefiles in the folder:
Metro_Entrances.shp
Warehouses.shp


In [2]:
import os
# Specify the folder path
folder_path = r"C:\Users\cecil\Documents\ArcGIS\Projects\GEOG4057\ModelBuilder_data\ParisData"

print("Shapefiles in the folder:")
for root, dirs, files in os.walk(folder_path):
    shp_files = [file for file in files if file.endswith(".shp")]
    for shp_file in shp_files:
        print(shp_file)

Shapefiles in the folder:
Metro_Entrances.shp
Warehouses.shp
Stores.shp
Parks.shp
Metro_Lines.shp
Metro_LinesAOI.shp
Metro_Stations.shp
Streets.shp


**Edit this block to answer the question**

### Q6: Describe what was done by the code in the first block above. (10 pnts)

- After specifying the folder path, the code uses `os.listdir` to list all files in the specified folder and a list comprehension with a conditional `if file.endswith(".shp")` to filter for shapefiles. The filtered list of shapefiles is then printed line by line using a `for` loop. This allows the user to see all the shapefiles in the specified folder.
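
To make the filter concrete, the sketch below writes it once as an explicit loop and once as a list comprehension. The file list is a stand-in for what `os.listdir` might return: the two `.shp` names come from the output above, while the sidecar files and the upper-case name are hypothetical.

```python
# Stand-in for os.listdir(folder_path); sidecar files and OLD_PARKS.SHP are made up.
names = [
    "Metro_Entrances.shp",
    "Metro_Entrances.dbf",
    "Warehouses.shp",
    "Warehouses.shx",
    "OLD_PARKS.SHP",
]

# Explicit loop
shp_loop = []
for name in names:
    if name.endswith(".shp"):
        shp_loop.append(name)

# The same filter as a list comprehension
shp_comprehension = [name for name in names if name.endswith(".shp")]

# Variant that ignores the case of the extension
shp_any_case = [name for name in names if name.lower().endswith(".shp")]

print(shp_comprehension)
print(shp_any_case)
```

Note that `endswith(".shp")` is case-sensitive, so the plain filter skips a file whose extension is written in upper case.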


### Q7: Describe what was done by the code in the second block above. (10 pnts)

- After specifying the folder path, the code uses a `for` loop with `os.walk` to iterate through the specified folder and all of its subfolders. For each folder, it filters the list of files down to shapefiles using a list comprehension with the condition `if file.endswith(".shp")`. The filtered list is then printed line by line using a nested `for` loop. Unlike `os.listdir`, which lists only the contents of the specified folder, `os.walk` also descends into every subfolder, which is why the second block finds eight shapefiles and the first finds only two.
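
The loop and the comprehension can also be merged into a single expression that keeps the folder each shapefile came from. This is a sketch under stated assumptions: `find_shapefiles` is a hypothetical helper, and it runs against a temporary folder tree created only for the demonstration rather than the ParisData folder.

```python
import os
import tempfile


def find_shapefiles(folder_path):
    """Return the full path of every .shp file under folder_path, sorted."""
    return sorted(
        os.path.join(root, name)
        for root, dirs, files in os.walk(folder_path)
        for name in files
        if name.endswith(".shp")
    )


# Build a small throwaway tree: one shapefile at the top, one in a subfolder.
with tempfile.TemporaryDirectory() as folder:
    os.makedirs(os.path.join(folder, "sub"))
    for relative in ("Warehouses.shp", "notes.txt", os.path.join("sub", "Parks.shp")):
        open(os.path.join(folder, relative), "w").close()
    found = [os.path.relpath(path, folder) for path in find_shapefiles(folder)]

print(found)
```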

## Use arcpy.da.SearchCursor

- This code uses arcpy.da.SearchCursor to extract attribute information from a feature class.
- Enter the name of the "points" feature class and the field(s) from the feature class to extract.
- Run and display the extracted data.


In [6]:
import arcpy

# Prompt for user input
fc_path = r"C:\Users\cecil\Documents\ArcGIS\Projects\GEOG4057\GEOG4057.gdb\Metro_Stations_Project_RandomPoints"
fields_to_extract = ["NAME", "ID_STATION", "ID_LINE"]

# Use arcpy.da.SearchCursor to extract data
with arcpy.da.SearchCursor(fc_path, fields_to_extract) as cursor:
    print("Extracted Data:")
    for row in cursor:
        print([row[i] for i in range(len(fields_to_extract))])

Extracted Data:
['Saint-Jacques', 131, 72]
['Montparnasse-Bienvenue', 135, 72]
['Boissière', 143, 72]
['Porte de Clignancourt', 74, 70]
['Mouton-Duvernet', 96, 70]


**Edit this block to answer the question**

### Q8: Line by line, explain what was done in the code. (10 pnts)

1. imports the arcpy module
2. blank line for readability
3. comment explaining the function of the following code block to the user
4. defines `fc_path` and assigns it the path to a user-specified feature class
5. defines `fields_to_extract` and assigns it a list of user-specified fields to extract from the feature class
6. blank line for readability
7. comment explaining the function of the following code block to the user
8. uses a `with` statement to open a `SearchCursor` from the `arcpy.da` module, enabling access to the user-specified fields of the feature class
9. uses the `print()` function to display a heading "Extracted Data:"
10. uses a `for` loop to iterate through the rows in the `SearchCursor` results
11. uses the `print()` function to display the values of the user-specified fields for each row of the feature class
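
Each row from a `SearchCursor` is a tuple ordered like the field list, so the index-based comprehension in the block can be written more simply. The sketch below uses plain tuples as stand-ins for cursor rows (values copied from the output above) and does not open a cursor.

```python
fields_to_extract = ["NAME", "ID_STATION", "ID_LINE"]

# Tuples standing in for the rows a SearchCursor would yield.
rows = [
    ("Saint-Jacques", 131, 72),
    ("Montparnasse-Bienvenue", 135, 72),
]

# Index-based form used in the block, and the shorter equivalent
by_index = [[row[i] for i in range(len(fields_to_extract))] for row in rows]
as_lists = [list(row) for row in rows]

# Pairing names with values makes each value addressable by field name
records = [dict(zip(fields_to_extract, row)) for row in rows]

for record in records:
    print(record["NAME"], record["ID_STATION"], record["ID_LINE"])
```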

## Use AddField and Field Calculator

- Run the following code block (with "points" feature class added to the last map before you switched into the notebook interface)

In [7]:
import arcpy
fc = "points"
newfieldName = "all"
arcpy.AddField_management(fc, newfieldName, "DOUBLE")
expression = "sum(!Field1!,!Field2!,!Field3!)"
codeblock = """
def sum(*fields):
    sum = 0
    for field in fields:
        sum += field
    return sum
"""
arcpy.CalculateField_management(fc, newfieldName, expression, "", codeblock)

**Edit this block to answer the question**

### Q9: Line by line, describe what was done by the code in the block above. (10 pnts)

1. imports the `arcpy` module
2. defines `fc` and assigns it the name of the feature class "points" in the current map
3. defines `newfieldName` and assigns it the name "all"
4. calls `AddField_management()` from `arcpy` to add a new field to the feature class (`fc`), naming it `newfieldName` and assigning it the data type "DOUBLE"
5. defines `expression` as the string `sum(!Field1!,!Field2!,!Field3!)`, which specifies how the new field will be calculated
6. begins a triple-quoted string, assigned to `codeblock`, that holds the Python code defining the logic for the calculation
7. defines a function named `sum` that can accept multiple fields as arguments (`*fields`)
8. creates a variable `sum` and assigns it the initial value of `0` to store the cumulative total
9. uses a `for` loop to iterate through each field passed to the `sum` function
10. adds the value of each field to the running total (`sum`)
11. returns the final computed sum after all the field values have been processed
12. closes the triple-quoted string with `"""`
13. calls the `CalculateField_management()` function from `arcpy` to fill in the rows of the newly created field (`newfieldName`) with the results of the calculation defined by `expression` and `codeblock`


### Q10: Open the attribute table of "points" and check whether it has a new field "all" with correct values. Right-click the "all" field and click Field Calculator. Describe what you see in the Field Calculator interface. Compare the Python code with the Field Calculator interface. (10 pnts)

- The Field Calculator interface has input fields for the Input Table (the dataset being modified), the Field Name (either an existing field or a new one to calculate values for), and Expression Type (the coding language to use, such as Python or Arcade). Below these, there are fields for the Expression (the formula or calculation) and an optional Code Block (for defining custom functions).
- Because we already calculated this field in our notebook, the fields have been pre-populated with the corresponding values:
    - The Input Table is set to the feature class `"points"`
    - The Field Name is set to the new field `"all"`
    - The Expression is populated with `"sum(!Field1!,!Field2!,!Field3!)"`
    - The Code Block contains the custom Python function used to calculate the sum of the specified fields
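
The function in the Code Block is ordinary Python, so its logic can be checked outside ArcGIS. The sketch below is an illustration under stated assumptions: the function is renamed `sum_fields` (a hypothetical name chosen to avoid shadowing the built-in `sum`), and the three attribute values are made up.

```python
def sum_fields(*fields):
    """Add up any number of field values passed as positional arguments."""
    total = 0
    for value in fields:
        total += value
    return total


# Hypothetical attribute values for one row (Field1, Field2, Field3).
row_values = (1.5, 2.0, 3.5)
result = sum_fields(*row_values)

print(result)
print(result == sum(row_values))  # compare with the built-in sum
```

In the Field Calculator, the expression plays the role of the call `sum_fields(*row_values)`: it is evaluated once per row with that row's field values substituted in.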