# Creating Coastal Watersheds for Lake Huron Coastal Wetlands (CW)

This notebook delineates **coastal watersheds** for **Lake Huron-connected coastal wetlands** under four inundation scenarios (**avg, low, high, surge**) while ensuring a **stable wetland identifier (`CW_Id`) is preserved throughout the full workflow**. The goal is to produce **one coastal watershed polygon per wetland (`CW_Id`)**, along with consistent wetland and watershed attributes needed for later merging and plotting (area + centroid coordinates).

---

## Key idea: keep `CW_Id` stable from start to finish
Raster-based watershed tools can replace feature IDs with raster values (e.g., `gridcode`) and can also split features during polygon/raster conversions. To avoid ID mismatches, this workflow:

- assigns and carries a stable **`CW_Id`** in all wetland layers,
- creates **one pour point per wetland** (inside the polygon),
- snaps pour points to the drainage network,
- delineates watersheds using the snapped points,
- **converts watershed outputs back to polygons** and **dissolves by `CW_Id`** so each wetland ends with **exactly one** watershed polygon.

---

## Inputs
Main inputs used in this notebook:

- **Coastal wetland polygons** (avg/low/high/surge inundation layers)
- **Shoreline polyline** (Lake Huron US-side shoreline)
- **Great Lakes Basin streams** (used to remove riparian/stream-connected wetland overlap)
- **D8 flow direction raster** (hydrologic routing grid)
- **Stream-watershed polygons** (areas draining to streams; removed from coastal watersheds)
- **Lake Huron polygon** (removed from final watershed polygons)

All distance-based operations are performed in **Great Lakes Albers (EPSG:3174)** (meters).

---

## Outputs (per inundation scenario)
For each scenario (**avg, low, high, surge**), the notebook produces:

### Wetland-side products
- **shoreline-interacting wetlands** (wetlands intersecting a 2000 m shoreline buffer)
- **riparian-erased wetlands** (wetlands with 50 m stream-buffer overlap removed)
- wetland attributes:
  - `CW_Id` (stable wetland identifier)
  - `CW_Area_m2`
  - wetland centroid coordinates in EPSG:3174 (`CW_cx`, `CW_cy`)
  - wetland centroid coordinates in WGS84 (`CW_lon`, `CW_lat`)

### Watershed-side products
- **pour points** (`*_pourpoints.shp`): one point inside each wetland polygon
- **snapped pour points** ‚Äî pour points snapped to a drainage cell using flow accumulation
- **watershed raster** (cell values correspond to `CW_Id`)
- **watershed polygons**, dissolved by `CW_Id` (one watershed per wetland)
- final coastal watershed polygons with attributes:
  - `CW_Id` (matching wetland `CW_Id`)
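  - `WatershedArea_m2` (stored as `WS_AREAM2` in the final shapefiles, because shapefile field names are limited to 10 characters)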
  - watershed centroid coordinates in EPSG:3174 (`WS_cx`, `WS_cy`)
  - watershed centroid coordinates in WGS84 (`WS_lon`, `WS_lat`)
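As a rough illustration of what the area and centroid attributes represent, here is a minimal pure-Python sketch that applies the shoelace formula to planar coordinates. The function name `polygon_area_centroid` and the sample rectangle are hypothetical and not part of the notebook; the notebook itself relies on arcpy geometry tools, which also handle multipart polygons and holes.

```python
def polygon_area_centroid(ring):
    """Return (area, cx, cy) for a simple polygon ring given as [(x, y), ...].

    Coordinates are assumed to be planar (e.g. meters in EPSG:3174).
    The ring may be open or closed; vertex order (CW/CCW) does not matter.
    """
    pts = list(ring)
    if pts[0] != pts[-1]:
        pts.append(pts[0])  # close the ring

    twice_area = 0.0
    cx_sum = 0.0
    cy_sum = 0.0
    for (x0, y0), (x1, y1) in zip(pts[:-1], pts[1:]):
        cross = x0 * y1 - x1 * y0
        twice_area += cross
        cx_sum += (x0 + x1) * cross
        cy_sum += (y0 + y1) * cross

    if twice_area == 0:
        raise ValueError("degenerate polygon (zero area)")

    # Centroid uses the signed area, so the sign cancels for either orientation
    cx = cx_sum / (3.0 * twice_area)
    cy = cy_sum / (3.0 * twice_area)
    return abs(twice_area) / 2.0, cx, cy


# Hypothetical 200 m x 100 m rectangle with its lower-left corner at (1000, 500)
area, cx, cy = polygon_area_centroid(
    [(1000, 500), (1200, 500), (1200, 600), (1000, 600)]
)
print(area, cx, cy)
```

For strongly concave polygons the geometric centroid can fall outside the shape, so centroid fields are best treated as plotting aids rather than guaranteed interior points.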

---

## Workflow summary
For each inundation scenario (**avg/low/high/surge**):

1. **Assign stable IDs**
   - Ensure wetland polygons contain `CW_Id` and (optionally) `Coastal_Id` derived from `CW_Id`.

2. **Select shoreline-interacting wetlands**
   - Project shoreline to EPSG:3174, buffer by **2000 m**, and intersect with wetlands.

3. **Remove riparian/stream overlap**
   - Buffer streams by **50 m** and erase from the shoreline-interacting wetlands.

4. **Create pour points**
   - Dissolve wetlands by `CW_Id` and create one **inside point** per wetland (`*_pourpoints.shp`).

5. **Snap pour points**
   - Snap pour points to the drainage network using **SnapPourPoint** with flow accumulation.

6. **Delineate watersheds**
   - Use **Watershed** with the D8 flow direction raster and snapped pour points.

7. **Convert to polygons + enforce 1 watershed per wetland**
   - Convert watershed raster to polygons, set `CW_Id = gridcode`, and **dissolve by `CW_Id`**.

8. **Remove non-coastal drainage + lake area**
   - Erase stream-watershed polygons, then erase Lake Huron polygon from the watershed polygons.

9. **Compute areas + centroids**
   - Add watershed area and centroid coordinate fields for later merges and plotting.
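To make step 6 more concrete, the sketch below traces a watershed on a toy D8 grid in pure Python. It assumes the ESRI D8 encoding (1 = east, 2 = southeast, 4 = south, and so on, doubling clockwise); the function `delineate_watershed` and the 3 x 3 grid are hypothetical illustrations, not the Spatial Analyst implementation.

```python
from collections import deque

# ESRI D8 codes -> (row offset, col offset); row index increases southward
D8_OFFSETS = {
    1: (0, 1),     # east
    2: (1, 1),     # southeast
    4: (1, 0),     # south
    8: (1, -1),    # southwest
    16: (0, -1),   # west
    32: (-1, -1),  # northwest
    64: (-1, 0),   # north
    128: (-1, 1),  # northeast
}


def delineate_watershed(flowdir, pour_cell):
    """Return the set of (row, col) cells that drain to pour_cell (inclusive)."""
    nrows, ncols = len(flowdir), len(flowdir[0])
    watershed = {pour_cell}
    queue = deque([pour_cell])
    while queue:
        r, c = queue.popleft()
        for dr, dc in D8_OFFSETS.values():
            nr, nc = r - dr, c - dc  # neighbour that would flow into (r, c)
            if not (0 <= nr < nrows and 0 <= nc < ncols):
                continue
            if (nr, nc) in watershed:
                continue
            # The neighbour is upstream only if its direction points at (r, c)
            if D8_OFFSETS.get(flowdir[nr][nc]) == (dr, dc):
                watershed.add((nr, nc))
                queue.append((nr, nc))
    return watershed


# Toy grid: everything except the lower-right cell drains through (2, 1)
flowdir = [
    [2, 4, 8],
    [1, 4, 16],
    [1, 4, 4],
]
ws_outlet = delineate_watershed(flowdir, (2, 1))
ws_upper = delineate_watershed(flowdir, (1, 1))
print(len(ws_outlet), len(ws_upper))
```

Moving the pour point one cell upstream shrinks the watershed, which is why the snapping step has such a strong effect on the delineated area.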

---

## Notes / QA checks recommended
- Verify **unique `CW_Id` counts** are consistent:
  - riparian-erased wetlands vs. pour points vs. dissolved watershed polygons.
- If counts differ, inspect:
  - wetlands disappearing due to shoreline/riparian erases,
  - pour points falling outside valid drainage (before snapping),
  - watersheds splitting (should be fixed by dissolving on `CW_Id`).
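The count comparison above can be sketched as a small set-difference report. This is a minimal example, assuming the `CW_Id` values of each stage have already been read into Python collections (for instance with a search cursor, as `count_ids` does later in this notebook); the function name `compare_id_stages` and the sample IDs are hypothetical.

```python
def compare_id_stages(stages):
    """Compare CW_Id sets between consecutive workflow stages.

    stages: list of (stage_name, iterable_of_ids) in processing order.
    Returns (from_stage, to_stage, sorted_lost_ids) for every step that lost IDs.
    """
    losses = []
    for (name_a, ids_a), (name_b, ids_b) in zip(stages[:-1], stages[1:]):
        lost = sorted(set(ids_a) - set(ids_b))
        if lost:
            losses.append((name_a, name_b, lost))
    return losses


# Hypothetical ID lists for one scenario
stages = [
    ("riparian-erased wetlands", [1, 2, 3, 4, 5]),
    ("pour points", [1, 2, 3, 4, 5]),
    ("dissolved watersheds", [1, 2, 4, 5]),
]
report = compare_id_stages(stages)
for src, dst, lost in report:
    print(f"{len(lost)} CW_Id value(s) lost between '{src}' and '{dst}': {lost}")
```

Reporting the lost IDs themselves, rather than only the counts, points directly at the features to inspect.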



## 0) Requirements

- ArcGIS Pro / arcpy with Spatial Analyst.
- Flow direction raster (`D8_flow`) and **flow accumulation** raster (`flowacc`) on the same grid.
- Wetland-connected polygons for each scenario (avg/high/low/surge), each with a stable id field (we standardize to `CW_Id`).

**What is `avg_pourpoints.shp`?**  
It is a **point feature class** with **one point per wetland** (per `CW_Id`). These points are snapped to the highest flow-accumulation cell nearby, then used as pour points for the Watershed tool.
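The snapping idea can be illustrated on a toy grid. The sketch below is a simplification, not the SnapPourPoint algorithm: it searches a square window measured in cells, whereas the tool takes a distance in map units. The function name `snap_to_max_flowacc` and the sample grid are hypothetical.

```python
import math


def snap_to_max_flowacc(flowacc, row, col, radius_cells):
    """Return (row, col) of the highest flow-accumulation cell in a square window.

    flowacc is a list of rows; None marks NoData. Ties keep the first cell found.
    Returns None when the window contains no valid cell.
    """
    best = None
    best_val = -math.inf
    r_lo, r_hi = max(0, row - radius_cells), min(len(flowacc), row + radius_cells + 1)
    c_lo, c_hi = max(0, col - radius_cells), min(len(flowacc[0]), col + radius_cells + 1)
    for r in range(r_lo, r_hi):
        for c in range(c_lo, c_hi):
            val = flowacc[r][c]
            if val is not None and val > best_val:
                best, best_val = (r, c), val
    return best


# Toy flow-accumulation grid with a channel running toward the lower right
flowacc = [
    [0, 1, 0, 0],
    [0, 3, 0, 0],
    [0, 7, 12, 0],
    [0, 0, 0, 30],
]
near = snap_to_max_flowacc(flowacc, 0, 0, 1)
mid = snap_to_max_flowacc(flowacc, 0, 0, 2)
far = snap_to_max_flowacc(flowacc, 0, 0, 3)
print(near, mid, far)
```

A larger search radius pulls the point further down the channel, so the snap distance trades off "stay close to the wetland" against "land on a real drainage cell".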


In [1]:
import os

import numpy as np
import arcpy
from arcpy.sa import SnapPourPoint, Watershed  # importing from arcpy.sa also makes arcpy.sa.* available

arcpy.CheckOutExtension("Spatial")   # Spatial Analyst license is required
arcpy.env.overwriteOutput = True
arcpy.env.addOutputsToMap = False    # helps avoid schema locks



## 1) Inputs / outputs 

Fill in your real paths. Keep the same projected CRS as your DEM / flow rasters (often EPSG:3174/3175 for Great Lakes Albers).


In [2]:

# -------------------------------------------------------------------
# Inputs
# -------------------------------------------------------------------

inDir = r"D:\Users\abolmaal\Arcgis\NASAOceanProject\GIS_layer"
inDCW = r"D:\Users\abolmaal\data\coastalwetlands\finalwetland"
CW_path = r"D:\Users\abolmaal\Arcgis\NASAOceanProject\GIS_layer\Coastalwetland\hitshoreline"

wetlands_avg_inun_original   = os.path.join(inDCW, "wetlands_connected_avg_inundation_GLAlbers.shp")
wetlands_high_inun_original  = os.path.join(inDCW, "wetlands_connected_high_inundation_GLAlbers.shp")
wetlands_low_inun_original   = os.path.join(inDCW, "wetlands_connected_low_inundation_GLAlbers.shp")
wetlands_surge_original      = os.path.join(inDCW, "wetlands_connected_surge_inundation_GLAlbers.shp")

inStreams = os.path.join(inDir, "GLB_Stream", "GLB_stream_Ras_FeatureToLine.shp")
D8_flow   = r"S:\Projects\Active\GLB_Nutrient_Transport\DEM_rasters\GLB_Bdry_buff10km_dem_fill_dir.tif"
flowacc = r"S:\Projects\Active\GLB_Nutrient_Transport\DEM_rasters\GLB_Bdry_buff10km_dem_fill_flowaccu.tif"
inStreamsWatershed = os.path.join(inDir, "Streamwatershed", "PointWaterdhed_LH.shp")

Lake_Huron = r"D:\Users\abolmaal\code\boundry\hydro_p_LakeHuron\hydro_p_LakeHuron.shp"
shoreline_shapefile = r"D:\Users\abolmaal\Arcgis\NASAOceanProject\GIS_layer\shoreline\100k\lh_shore_ESRI_100k_USside.shp"

# -------------------------------------------------------------------
# Parameters / field names
# -------------------------------------------------------------------
CW_ID_FIELD = "CW_Id"          # stable wetland id
COASTAL_ID_FIELD = "Coastal_Id" # optional (we'll set equal to CW_Id unless you want different)

crs_Albers = arcpy.SpatialReference(3174)  # Great Lakes Albers meters
crs_WGS84  = arcpy.SpatialReference(4326)

# -------------------------------------------------------------------
# Outputs (your folders)
# -------------------------------------------------------------------
outDir_stream = r"D:\Users\abolmaal\Arcgis\NASAOceanProject\GIS_layer\CoastalWatersheds\GLB_Stream"
outBuffer = os.path.join(outDir_stream, "GLB_stream_Ras_FeatureToLine_50m.shp")

outpath = r"D:\Users\abolmaal\Arcgis\NASAOceanProject\GIS_layer\CoastalWatersheds"
outErase_Riper   = os.path.join(outpath, "Erase_Riperian")
outErase_drainage= os.path.join(outpath, "Erase_drainage")
outErase_Lake    = os.path.join(outpath, "Erase_lake")
outPourpoints    = os.path.join(outpath, "Pourpoints")
outWatersheds    = os.path.join(outpath, "Watershed_rasters")

for d in [outDir_stream, outErase_Riper, outErase_drainage, outErase_Lake, outPourpoints, outWatersheds]:
    os.makedirs(d, exist_ok=True)

shorebuffer = r"D:\Users\abolmaal\Arcgis\NASAOceanProject\GIS_layer\shoreline\100k\lh_shore_ESRI_100k_USside_2000buffer.shp"

wetlands_avg_inun  = os.path.join(CW_path, "Wetland_connected_avg_inundation_NAD1983_shorelineinteraction_buffer2000m.shp")
wetlands_low_inun  = os.path.join(CW_path, "Wetland_connected_low_inundation_NAD1983_shorelineinteraction_buffer2000m.shp")
wetlands_high_inun = os.path.join(CW_path, "Wetland_connected_high_inundation_NAD1983_shorelineinteraction_buffer2000m.shp")
wetlands_surge     = os.path.join(CW_path, "Wetland_connected_surge_inundation_NAD1983_shorelineinteraction_buffer2000m.shp")

erase_buffer_avg   = os.path.join(outErase_Riper, "Wetland_connected_avg_erasebuff_50.shp")
erase_buffer_high  = os.path.join(outErase_Riper, "Wetland_connected_high_erasebuff_50.shp")
erase_buffer_low   = os.path.join(outErase_Riper, "Wetland_connected_low_erasebuff_50.shp")
erase_buffer_surge = os.path.join(outErase_Riper, "Wetland_connected_surge_erasebuff_50.shp")

CoastalWatershed_avg_erase_lakedrain  = os.path.join(outErase_drainage, "CoastalWatershed_avg_erase_lakedrain.shp")
CoastalWatershed_high_erase_lakedrain = os.path.join(outErase_drainage, "CoastalWatershed_high_erase_lakedrain.shp")
CoastalWatershed_low_erase_lakedrain  = os.path.join(outErase_drainage, "CoastalWatershed_low_erase_lakedrain.shp")
CoastalWatershed_surge_erase_lakedrain= os.path.join(outErase_drainage, "CoastalWatershed_surge_erase_lakedrain.shp")

CoastalWatershed_avg_erase_lakedrain_LakeHuron   = os.path.join(outErase_Lake, "CoastalWatershed_avg_erase_lakedrain_LakeHuron.shp")
CoastalWatershed_high_erase_lakedrain_LakeHuron  = os.path.join(outErase_Lake, "CoastalWatershed_high_erase_lakedrain_LakeHuron.shp")
CoastalWatershed_low_erase_lakedrain_LakeHuron   = os.path.join(outErase_Lake, "CoastalWatershed_low_erase_lakedrain_LakeHuron.shp")
CoastalWatershed_surge_erase_lakedrain_LakeHuron = os.path.join(outErase_Lake, "CoastalWatershed_surge_erase_lakedrain_LakeHuron.shp")



## 2) Helper functions

These helpers:

- enforce a stable `CW_Id`
- create **one** pour point per `CW_Id`
- snap pour points to high flow accumulation
- run watershed (raster) and convert back to polygons while preserving ids
- compute areas + WGS84 centroid lat/lon
- run sanity checks for missing ids
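Because most outputs here are shapefiles, field names longer than 10 characters are truncated, which can silently merge two different names. The following sketch shows that effect in pure Python; the function `shapefile_field_names` is hypothetical, and treating names as case-insensitive is an assumption made for this example.

```python
def shapefile_field_names(names, max_len=10):
    """Map each requested field name to its truncated shapefile name.

    Raises ValueError if two different requested names collide after
    truncation (compared case-insensitively, an assumption of this sketch).
    """
    mapping = {}
    seen = {}
    for name in names:
        short = name[:max_len]
        key = short.lower()
        if key in seen and seen[key] != name:
            raise ValueError(f"'{name}' and '{seen[key]}' both truncate to '{short}'")
        seen[key] = name
        mapping[name] = short
    return mapping


names = shapefile_field_names(["CW_Id", "CW_Area_m2", "WatershedArea_m2", "WS_lon"])
print(names)

# Two long names that share their first 10 characters cannot coexist
try:
    shapefile_field_names(["WatershedArea_m2", "WatershedArea_km2"])
    collided = False
except ValueError:
    collided = True
```

Choosing short names up front (10 characters or fewer) avoids relying on truncation at all.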


In [3]:
# -------------------------------------------------------------------
# Helper functions
# -------------------------------------------------------------------
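def ensure_field(fc, name, ftype="DOUBLE"):
    """Add field `name` to `fc` if it is missing and return the name actually used."""
    # Shapefile (dBASE) field names are limited to 10 characters
    if arcpy.Describe(fc).dataType == "ShapeFile" and len(name) > 10:
        short = name[:10]
        print(f"⚠️ Shapefile field '{name}' too long -> using '{short}'")
        name = short

    # Compare case-insensitively so "CW_ID" and "CW_Id" count as the same field
    existing = {f.name.lower(): f.name for f in arcpy.ListFields(fc)}
    if name.lower() not in existing:
        arcpy.management.AddField(fc, name, ftype)
        return name
    return existing[name.lower()]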

def calculate_area_m2(fc, out_field):
    out_field = ensure_field(fc, out_field, "DOUBLE")
    arcpy.management.CalculateGeometryAttributes(
        fc, [[out_field, "AREA"]], area_unit="SQUARE_METERS"
    )
    return out_field

def add_xy_ll(fc, prefix, src_crs=crs_Albers):
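    """
    Adds:
      {prefix}_cx, {prefix}_cy    centroid in the dataset's own CRS (meters for EPSG:3174)
      {prefix}_lon, {prefix}_lat  centroid in WGS84 decimal degrees
    Note: src_crs is currently unused; the dataset is assumed to already be in that CRS.
    """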
    ensure_field(fc, f"{prefix}_cx", "DOUBLE")
    ensure_field(fc, f"{prefix}_cy", "DOUBLE")
    arcpy.management.CalculateField(fc, f"{prefix}_cx", "!SHAPE.centroid.X!", "PYTHON3")
    arcpy.management.CalculateField(fc, f"{prefix}_cy", "!SHAPE.centroid.Y!", "PYTHON3")

    ensure_field(fc, f"{prefix}_lon", "DOUBLE")
    ensure_field(fc, f"{prefix}_lat", "DOUBLE")
    # CalculateGeometryAttributes supports centroid in a specified coordinate system
    arcpy.management.CalculateGeometryAttributes(
        fc,
        [[f"{prefix}_lat", "CENTROID_Y"], [f"{prefix}_lon", "CENTROID_X"]],
        coordinate_system=crs_WGS84,
        coordinate_format="DD"
    )

def count_ids(fc, id_field):
    ids = set()
    with arcpy.da.SearchCursor(fc, [id_field]) as cur:
        for (v,) in cur:
            if v is not None:
                ids.add(int(v))
    return len(ids)

def make_pourpoints(wetlands_fc, out_points_fc, id_field=CW_ID_FIELD):
    """
    1) Dissolve by CW_Id -> single multipart per CW_Id
    2) FeatureToPoint INSIDE -> 1 point per CW_Id
    """
    tmp_diss = os.path.join("in_memory", "tmp_diss")
    if arcpy.Exists(tmp_diss):
        arcpy.management.Delete(tmp_diss)

    arcpy.management.Dissolve(wetlands_fc, tmp_diss, dissolve_field=id_field)
    arcpy.management.FeatureToPoint(tmp_diss, out_points_fc, "INSIDE")
    arcpy.management.Delete(tmp_diss)

def snap_pourpoints(in_points, flowacc_raster, out_points, snap_dist=200):
    """
    Snap pour points to the highest flow-accumulation cell within snap_dist.

    snap_dist is a plain number in map units (meters in EPSG:3174), which is
    what SnapPourPoint expects, rather than a string such as "200 Meters".
    """
    out_ras = os.path.join("in_memory", "snapped_pp_ras")
    if arcpy.Exists(out_ras):
        arcpy.management.Delete(out_ras)

    # SnapPourPoint returns a raster whose cell values come from CW_Id.
    # If two points snap to the same cell, only one CW_Id survives.
    snapped = arcpy.sa.SnapPourPoint(in_points, flowacc_raster, snap_dist, CW_ID_FIELD)
    snapped.save(out_ras)

    # RasterToPoint stores the raster value in a field named "grid_code"
    arcpy.conversion.RasterToPoint(out_ras, out_points, "VALUE")

    # Copy the snapped raster value back into CW_Id
    ensure_field(out_points, CW_ID_FIELD, "LONG")
    arcpy.management.CalculateField(out_points, CW_ID_FIELD, "!grid_code!", "PYTHON3")
    arcpy.management.Delete(out_ras)

def watershed_from_points(flowdir, snapped_points, out_watershed_raster):
    """
    Watershed raster values will equal CW_Id (because we pass CW_Id field).
    """
    ws = arcpy.sa.Watershed(flowdir, snapped_points, CW_ID_FIELD)
    ws.save(out_watershed_raster)

def watershed_raster_to_polygon(ws_raster, out_poly, id_field=CW_ID_FIELD):
    """
    RasterToPolygon -> gridcode. Then set CW_Id=gridcode and dissolve by CW_Id.
    """
    tmp_poly = os.path.join("in_memory", "tmp_ws_poly")
    if arcpy.Exists(tmp_poly):
        arcpy.management.Delete(tmp_poly)

    arcpy.conversion.RasterToPolygon(ws_raster, tmp_poly, "NO_SIMPLIFY", "VALUE")

    # Make sure CW_Id exists and equals gridcode
    ensure_field(tmp_poly, id_field, "LONG")
    arcpy.management.CalculateField(tmp_poly, id_field, "!gridcode!", "PYTHON3")

    # Dissolve to one watershed polygon per CW_Id (removes splits)
    arcpy.management.Dissolve(tmp_poly, out_poly, dissolve_field=id_field)

    arcpy.management.Delete(tmp_poly)
    
def watershed_from_snapped_raster(flowdir, snapped_ras, out_watershed_raster):
    """
    snapped_ras is the output of SnapPourPoint (a raster with CW_Id values).
    Watershed output raster will keep those CW_Id values.
    """
    ws = arcpy.sa.Watershed(flowdir, snapped_ras)
    ws.save(out_watershed_raster)
    
    
# -------------------------------------------------------------------


def _ensure_field(fc, field_name, field_type="LONG"):
    fields = [f.name for f in arcpy.ListFields(fc)]
    if field_name not in fields:
        arcpy.management.AddField(fc, field_name, field_type)

def unique_snap_points_to_flowacc_cells(
    in_points_fc,
    id_field,
    flowacc_raster,
    snap_dist,
    out_points_fc,
    max_expand_steps=3,
    expand_factor=1.5,
    allow_nodata=False,
):
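    """
    Create a snapped points FC where each input point is moved to a UNIQUE raster cell
    chosen as the highest flow accumulation cell within snap_dist (expanded if needed).
    This guarantees 1 unique snapped cell per input ID (unless there aren't enough cells).

    snap_dist is a number in map units (it is divided by the cell width below).
    The search window is square, and allow_nodata is currently unused.
    """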

    arcpy.env.overwriteOutput = True

    r = arcpy.Raster(flowacc_raster)
    sr = r.spatialReference
    cellw = float(r.meanCellWidth)
    cellh = float(r.meanCellHeight)
    ext = r.extent
    xmin, ymin, xmax, ymax = ext.XMin, ext.YMin, ext.XMax, ext.YMax

    # Determine raster size in cells (approx, but enough for indexing)
    ncol = int(round((xmax - xmin) / cellw))
    nrow = int(round((ymax - ymin) / cellh))

    # Create output FC
    out_dir = os.path.dirname(out_points_fc)
    out_name = os.path.basename(out_points_fc)
    if arcpy.Exists(out_points_fc):
        arcpy.management.Delete(out_points_fc)

    arcpy.management.CreateFeatureclass(
        out_dir, out_name, "POINT", spatial_reference=sr
    )
    _ensure_field(out_points_fc, id_field, "LONG")

    # Build a quick index of all input points
    pts = []
    with arcpy.da.SearchCursor(in_points_fc, ["SHAPE@XY", id_field]) as cur:
        for (x, y), cid in cur:
            if cid is None:
                continue
            pts.append((float(x), float(y), int(cid)))

    used_cells = set()  # (row_top, col)

    def xy_to_rowcol_top(x, y):
        col = int((x - xmin) / cellw)
        row_top = int((ymax - y) / cellh)
        return row_top, col

    def rowcol_top_to_cellcenter(row_top, col):
        x = xmin + (col + 0.5) * cellw
        y = ymax - (row_top + 0.5) * cellh
        return x, y

    def window_to_numpy(row_top, col, rad_cells):
        # clamp window bounds in raster indices
        r0 = max(0, row_top - rad_cells)
        r1 = min(nrow - 1, row_top + rad_cells)
        c0 = max(0, col - rad_cells)
        c1 = min(ncol - 1, col + rad_cells)

        # lower-left corner of the window in map units
        x_ll = xmin + c0 * cellw
        y_ll = ymax - (r1 + 1) * cellh

        nrows = (r1 - r0 + 1)
        ncols = (c1 - c0 + 1)

        # IMPORTANT: integer rasters can't use NaN for nodata_to_value
        nodata_sentinel = -9999

        arr = arcpy.RasterToNumPyArray(
            r,
            lower_left_corner=arcpy.Point(x_ll, y_ll),
            ncols=ncols,
            nrows=nrows,
            nodata_to_value=nodata_sentinel
        )

        # convert to float and set sentinel to NaN
        arr = arr.astype("float64")
        arr[arr == nodata_sentinel] = np.nan

        return arr, (r0, r1, c0, c1)

    def pick_best_unused_cell(arr, bounds):
        r0, r1, c0, c1 = bounds
        nrows, ncols = arr.shape

        # Flatten and sort by flowacc descending (nan ignored)
        flat = arr.ravel()
        valid_idx = np.where(np.isfinite(flat))[0]
        if valid_idx.size == 0:
            return None

        order = valid_idx[np.argsort(flat[valid_idx])[::-1]]

        for k in order:
            i = k // ncols   # array row index (0 = top row of the window)
            j = k % ncols

            # convert (i, j) -> global raster (row_top, col);
            # RasterToNumPyArray returns rows from north to south, so row 0 maps to r0
            row_top = r0 + i
            col = c0 + j

            if (row_top, col) in used_cells:
                continue

            # if you want to forbid snapping into nodata/invalid, array already NaN-handled
            return row_top, col, float(arr[i, j])

        return None

    missing = []

    with arcpy.da.InsertCursor(out_points_fc, ["SHAPE@XY", id_field]) as icur:
        for x, y, cid in pts:
            row_top, col = xy_to_rowcol_top(x, y)

            # start radius in cells
            base_rad = int(np.ceil(snap_dist / cellw))

            chosen = None
            rad = base_rad

            for step in range(max_expand_steps + 1):
                arr, bounds = window_to_numpy(row_top, col, rad)
                chosen = pick_best_unused_cell(arr, bounds)
                if chosen is not None:
                    break
                rad = int(np.ceil(rad * expand_factor))

            if chosen is None:
                missing.append(cid)
                continue

            rtop, c, val = chosen
            used_cells.add((rtop, c))

            sx, sy = rowcol_top_to_cellcenter(rtop, c)
            icur.insertRow(((sx, sy), cid))

    print(f"✅ unique snapped points created: {len(pts) - len(missing)} / {len(pts)}")
    if missing:
        print(f"⚠️ Could not snap {len(missing)} points (no valid unused cells found). Example IDs: {missing[:10]}")
    return out_points_fc


# -------------------------------------------------------------------
# 0) Add stable CW_Id to ORIGINAL wetlands (IMPORTANT FIX)
#    Use existing "Id" if present; else fallback to OBJECTID/FID.
# ------------------------------------------------------------------


In [4]:
wetlands_fcs = [
    wetlands_avg_inun_original,
    wetlands_high_inun_original,
    wetlands_low_inun_original,
    wetlands_surge_original,
]

for fc in wetlands_fcs:
    fields = [f.name for f in arcpy.ListFields(fc)]
    ensure_field(fc, CW_ID_FIELD, "LONG")
    ensure_field(fc, COASTAL_ID_FIELD, "LONG")

    if "Id" in fields:
        # Stable: CW_Id = Id
        arcpy.management.CalculateField(fc, CW_ID_FIELD, "!Id!", "PYTHON3")
        arcpy.management.CalculateField(fc, COASTAL_ID_FIELD, "!Id!", "PYTHON3")
    else:
        # Fallback: CW_Id = OBJECTID
        oid = arcpy.Describe(fc).OIDFieldName
        arcpy.management.CalculateField(fc, CW_ID_FIELD, f"!{oid}!", "PYTHON3")
        arcpy.management.CalculateField(fc, COASTAL_ID_FIELD, f"!{oid}!", "PYTHON3")

    print(f"✅ ensured CW_Id on: {os.path.basename(fc)}")


✅ ensured CW_Id on: wetlands_connected_avg_inundation_GLAlbers.shp
✅ ensured CW_Id on: wetlands_connected_high_inundation_GLAlbers.shp
✅ ensured CW_Id on: wetlands_connected_low_inundation_GLAlbers.shp
✅ ensured CW_Id on: wetlands_connected_surge_inundation_GLAlbers.shp



# -------------------------------------------------------------------
# 1) Shoreline buffer (2000m) in EPSG:3174
# -------------------------------------------------------------------

In [5]:
# Build the 2000 m shoreline buffer once; skip the projection if it already exists
if not arcpy.Exists(shorebuffer):
    shoreline_3174 = os.path.join("in_memory", "shoreline_3174")
    arcpy.management.Project(shoreline_shapefile, shoreline_3174, crs_Albers)
    arcpy.analysis.Buffer(shoreline_3174, shorebuffer, "2000 Meters", dissolve_option="ALL")
    arcpy.management.Delete(shoreline_3174)

# -------------------------------------------------------------------
# 2) Intersect wetlands with shoreline buffer (keeps CW_Id)
# -------------------------------------------------------------------

In [6]:
cw_pairs = [
    (wetlands_avg_inun_original,  wetlands_avg_inun),
    (wetlands_low_inun_original,  wetlands_low_inun),
    (wetlands_high_inun_original, wetlands_high_inun),
    (wetlands_surge_original,     wetlands_surge),
]

for in_fc, out_fc in cw_pairs:
    # project wetlands to 3174
    tmp_3174 = os.path.join("in_memory", "cw_3174")
    arcpy.management.Project(in_fc, tmp_3174, crs_Albers)

    tmp_int = os.path.join("in_memory", "cw_int")
    arcpy.analysis.Intersect([tmp_3174, shorebuffer], tmp_int, "ALL")

    # Save in 3174 (recommended). If you really need original CRS, project back here.
    arcpy.management.CopyFeatures(tmp_int, out_fc)

    arcpy.management.Delete(tmp_3174)
    arcpy.management.Delete(tmp_int)

    print(f"✅ shoreline-intersect: {os.path.basename(out_fc)}")

✅ shoreline-intersect: Wetland_connected_avg_inundation_NAD1983_shorelineinteraction_buffer2000m.shp
✅ shoreline-intersect: Wetland_connected_low_inundation_NAD1983_shorelineinteraction_buffer2000m.shp
✅ shoreline-intersect: Wetland_connected_high_inundation_NAD1983_shorelineinteraction_buffer2000m.shp
✅ shoreline-intersect: Wetland_connected_surge_inundation_NAD1983_shorelineinteraction_buffer2000m.shp


# -------------------------------------------------------------------
# 3) Stream riparian buffer 50 m + erase wetlands (keeps CW_Id)
- Create a 50 m buffer around the Great Lakes Basin (GLB) streams (the riverine riparian area).
- Erase the parts of the coastal wetlands that overlap this riparian area.
# -------------------------------------------------------------------

In [7]:
if not arcpy.Exists(outBuffer):
    arcpy.analysis.Buffer(inStreams, outBuffer, "50 Meters")

arcpy.analysis.Erase(wetlands_avg_inun,  outBuffer, erase_buffer_avg)
arcpy.analysis.Erase(wetlands_high_inun, outBuffer, erase_buffer_high)
arcpy.analysis.Erase(wetlands_low_inun,  outBuffer, erase_buffer_low)
arcpy.analysis.Erase(wetlands_surge,     outBuffer, erase_buffer_surge)

arcpy.management.ClearWorkspaceCache()
# add wetland areas and coordinates
for fc in [erase_buffer_avg, erase_buffer_high, erase_buffer_low, erase_buffer_surge]:
    # "CW_Area_m2" is exactly 10 characters, so it fits the shapefile field-name limit
    calculate_area_m2(fc, "CW_Area_m2")
    add_xy_ll(fc, prefix="CW")
    print(f"✅ wetland attrs added: {os.path.basename(fc)}")


✅ wetland attrs added: Wetland_connected_avg_erasebuff_50.shp
✅ wetland attrs added: Wetland_connected_high_erasebuff_50.shp
✅ wetland attrs added: Wetland_connected_low_erasebuff_50.shp
✅ wetland attrs added: Wetland_connected_surge_erasebuff_50.shp


## Step 4: Create **1:1 Coastal-Wetland Watersheds** (Pourpoints → Snap → Watershed → Polygons → **Clip overlaps** → **Repair missing IDs**)

This cell delineates **one coastal watershed per coastal wetland (`CW_Id`)** for each inundation category (**avg/high/low/surge**), while ensuring the identifier **`CW_Id` remains 1:1** throughout the workflow.

Unlike earlier versions that could *drop entire IDs* when wetlands were in-lake or when watersheds overlapped stream-drainage areas, this updated workflow:

* **keeps every `CW_Id`**, even if the wetland is partially/fully in the lake,
* **clips out only the overlapping portions** (lake interior and stream-watershed overlap),
* and **repairs** missing IDs caused by raster snapping collisions by rebuilding only the missing IDs one-by-one.

It is also designed to avoid common ArcPy issues (schema locks, field-name limits in shapefiles, field-case mismatches, and SnapPourPoint ID collapsing).

---

### What this cell produces (per inundation category: avg/high/low/surge)

For each category, the cell creates:

* **Pourpoints (GDB feature class)**: one point per `CW_Id` (land-only + fallback for lake-only IDs)
* **Snapped pourpoints (GDB feature class)**: snapped onto high-flowacc land cells (using `flowacc_land`)
* **Watershed raster (GDB raster)**: watershed labels equal to `CW_Id`
* **Watershed polygons (GDB feature class)**: raster converted to polygons and dissolved to **one polygon per `CW_Id`**
* **Clipped watershed polygons (final)**: only the portions that overlap the following are removed, and the rest of each polygon is preserved:

  * **Lake Huron interior**
  * **stream-watershed mask**
* **Repaired final outputs (shapefiles)**:

  * `*_erase_lakedrain_LakeHuron.shp` (one feature per `CW_Id`, after clip + repair)
* **Final attributes added to the final shapefile** (when possible):

  * `WS_AREAM2` = watershed area (m²)
  * `WS_cx`, `WS_cy` = centroid X/Y in dataset CRS
  * `WS_lon`, `WS_lat` = centroid lon/lat in WGS84

---

### Why "pourpoints" matter

A **pour point** is the location Spatial Analyst uses to define **which upstream cells contribute** to that outlet.
Here, each wetland gets **exactly one pourpoint per `CW_Id`**, which drives the "one watershed per wetland" requirement.

---

### Updated snapping logic (and why it differs from the older "unique snap" approach)

Previously, strict uniqueness snapping (one raster cell per point) was used to prevent ID collisions. In practice, nearshore lake-only wetlands can still collide at shoreline cells and/or fail to reach land.

This updated workflow uses a **two-stage strategy**:

1. **Bulk snapping** (fast): uses SnapPourPoint on a land-only flowacc surface (`flowacc_land`)
2. **Repair pass** (precise): if any `CW_Id` is missing after delineation + clipping, rebuild just those IDs **one-by-one** to eliminate raster collisions.

This preserves performance (bulk) while guaranteeing completeness (repair).
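A snapping collision happens when several `CW_Id` values land on the same raster cell, because a cell can hold only one value. The sketch below shows one way to detect such cases before delineation, assuming the snapped cell of each ID is known; `find_snap_collisions` and the sample IDs are hypothetical.

```python
from collections import defaultdict


def find_snap_collisions(snapped_cells):
    """Group IDs by snapped cell and return only the cells shared by several IDs.

    snapped_cells: dict mapping CW_Id -> (row, col) of its snapped cell.
    Returns {cell: sorted list of CW_Ids} for cells with more than one ID.
    """
    by_cell = defaultdict(list)
    for cw_id, cell in snapped_cells.items():
        by_cell[cell].append(cw_id)
    return {cell: sorted(ids) for cell, ids in by_cell.items() if len(ids) > 1}


# Hypothetical result of a bulk snap: three wetlands share one shoreline cell
snapped = {101: (5, 7), 102: (5, 7), 103: (9, 2), 104: (5, 7)}
collisions = find_snap_collisions(snapped)

# All but one ID per shared cell would be overwritten in the snapped raster
at_risk = sorted(i for ids in collisions.values() for i in ids)
print(collisions, at_risk)
```

IDs flagged this way are natural candidates for the one-by-one repair pass.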

---

### Core processing steps inside the loop (per category)

#### 0) Environment + workspaces (no C:\ temp writes)

* All intermediates are written to:

  * `watersheds.gdb` (under `outWatersheds`)
  * `pourpoints.gdb` (under `outPourpoints`)
* Processing is aligned to the D8 grid:

  * `snapRaster = D8_flow`
  * `extent = D8_flow`
  * `cellSize = D8_flow`
  * `outputCoordinateSystem = D8_SR`

---

#### 1) Build masks once (outside the loop)

* **Stream-watershed mask**

  * CRS fixed if mislabeled, geometry repaired, dissolved, and (optionally) buffered slightly
* **Lake Huron polygon**

  * CRS corrected and projected to D8 CRS
* **Land-only flowacc surface**

  * `flowacc_land = flowacc` with lake cells set to NoData
  * ensures snapping targets land cells only

---

#### 2) Prepare wetlands and authoritative CW_Id list

* Wetlands are projected to D8 CRS and repaired
* **Original wetlands dissolved by `CW_Id`** to guarantee:

  * one feature per ID
  * a definitive list of all IDs that must exist in the final output

✅ Output: `{cat}_wet_orig_diss` (GDB)

---

#### 3) Create pourpoints (one per CW_Id, including lake-only)

* Create **land-only wetlands** by erasing the lake, then dissolve by `CW_Id`
* Create pourpoints inside land-only dissolved polygons (1 per CW_Id)
* Identify **lake-only IDs** (present in original dissolve but missing from land-only dissolve)
* Create fallback pourpoints for lake-only IDs using the **original dissolved** polygons
* Append fallback points into the pourpoint set
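The land-only versus lake-only split in this step is a set difference on `CW_Id`. A minimal sketch, assuming the IDs of the original and land-only dissolves are available as Python collections; the function `plan_pourpoint_sources` and its labels are hypothetical.

```python
def plan_pourpoint_sources(original_ids, land_only_ids):
    """Decide which polygon layer supplies the pour point for each CW_Id.

    IDs that survive the lake erase use the land-only polygons; IDs that
    vanish (lake-only wetlands) fall back to the original dissolved polygons.
    Returns (plan, lake_only) where plan maps CW_Id -> "land_only" or "original".
    """
    orig = set(original_ids)
    land = set(land_only_ids) & orig
    lake_only = orig - land
    plan = {cw_id: "land_only" for cw_id in land}
    plan.update({cw_id: "original" for cw_id in lake_only})
    return plan, sorted(lake_only)


# Hypothetical IDs: wetland 3 lies entirely inside the lake polygon
plan, lake_only = plan_pourpoint_sources([1, 2, 3, 4], [1, 2, 4])
print(plan, lake_only)
```

Every original ID appears in the plan, which is what keeps the pour point set at one point per `CW_Id`.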

✅ Output: `{cat}_pp_inside` (GDB; 1 point per CW_Id target)

---

#### 4) Snap pourpoints to land-only high-flowacc cells (bulk)

* Convert pourpoints to raster (VALUE = `CW_Id`)
* Snap via:

  * `SnapPourPoint(pour_raster, flowacc_land, snap_dist_m)`
* Convert snapped raster back to points and confirm ID coverage

✅ Outputs:

* `{cat}_pp_snapped_pts`
* `{cat}_pp_snap_ras`

---

#### 5) Delineate watershed raster (VALUE = CW_Id)

* `Watershed(D8_flow, snapped_pourpoint_raster)`
* Produces a raster where each watershed is labeled by `CW_Id`

✅ Output: `{cat}_ws_ras`

---

#### 6) Raster → polygon + dissolve to one feature per CW_Id

* `RasterToPolygon` creates `GRIDCODE`
* Compute `CW_Id = GRIDCODE`
* Dissolve by `CW_Id` to enforce **one polygon per wetland**
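Raster-to-polygon conversion can return several polygon parts with the same `GRIDCODE`. The sketch below mimics the dissolve on plain tuples, assuming the parts do not overlap (true for raster-derived polygons), so their areas can simply be summed; `dissolve_by_id` and the sample values are hypothetical.

```python
from collections import defaultdict


def dissolve_by_id(parts):
    """Merge polygon parts that share a gridcode.

    parts: iterable of (gridcode, area_m2) tuples, one per polygon part.
    Returns {CW_Id: {"n_parts": int, "area_m2": float}} with one entry per ID.
    """
    merged = defaultdict(lambda: {"n_parts": 0, "area_m2": 0.0})
    for gridcode, area_m2 in parts:
        cw_id = int(gridcode)  # CW_Id = GRIDCODE
        merged[cw_id]["n_parts"] += 1
        merged[cw_id]["area_m2"] += area_m2
    return dict(merged)


# Hypothetical parts: watershed 12 was split into three pieces
parts = [(12, 900.0), (12, 100.0), (15, 2500.0), (12, 400.0)]
dissolved = dissolve_by_id(parts)
print(dissolved)
```

After the dissolve there is exactly one record per `CW_Id`, regardless of how many pieces the conversion produced.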

✅ Output: `{cat}_ws_poly`

---

#### 7) Clip overlaps (remove only the overlapping parts)

To avoid dropping whole watersheds, lake/stream constraints are applied as **polygon erase operations** (only the overlapping part of each polygon is clipped away), not as raster masks:

* Erase **Lake Huron** interior from watershed polygons
* Erase **stream-watershed mask** overlap from watershed polygons
* Dissolve again by `CW_Id`
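The difference between "erase the overlap" and "drop the feature" can be shown with cell sets standing in for polygons. This is a conceptual sketch only; `erase_overlaps` and the sample cells are hypothetical and do not reproduce the Erase tool's geometry handling.

```python
def erase_overlaps(watersheds, *masks):
    """Remove mask cells from each watershed while keeping the watershed's ID.

    watersheds: dict CW_Id -> set of (row, col) cells.
    masks: any number of cell sets (e.g. lake cells, stream-watershed cells).
    Returns (kept, emptied): kept maps CW_Id -> remaining cells, and emptied
    lists the IDs whose watershed was removed entirely.
    """
    erased = set().union(*masks)
    kept, emptied = {}, []
    for cw_id, cells in watersheds.items():
        remaining = set(cells) - erased
        if remaining:
            kept[cw_id] = remaining
        else:
            emptied.append(cw_id)
    return kept, sorted(emptied)


# Hypothetical cells: watershed 1 loses two cells, watershed 2 disappears
watersheds = {1: {(0, 0), (0, 1), (1, 1)}, 2: {(2, 2)}}
lake_cells = {(0, 0), (2, 2)}
stream_cells = {(1, 1)}
kept, emptied = erase_overlaps(watersheds, lake_cells, stream_cells)
print(kept, emptied)
```

IDs that end up in `emptied` correspond to the rare cases where the clipped result is empty, which is one of the situations the repair step checks for.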

✅ Output: final category shapefile `CoastalWatershed_{cat}_erase_lakedrain_LakeHuron.shp`

---

#### 8) Repair missing IDs (guarantee 1:1 output)

After clipping, some IDs may still be missing due to:

* SnapPourPoint raster collisions (multiple IDs snapped to same cell), or
* the clipped result becoming empty for a rare ID

Repair strategy:

* Compare final `CW_Id`s to the authoritative list (`wet_orig_diss`)
* For each missing `CW_Id`:

  * rebuild watershed **one-by-one** from its pourpoint (avoids collisions),
  * clip lake/stream overlaps,
  * append to final output,
  * dissolve to enforce one feature per `CW_Id`

This step ensures the final output is as close as possible to **1 polygon per wetland ID** while still honoring clipping rules.
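The control flow of the repair pass can be sketched independently of arcpy. In this minimal example the per-ID rebuild is passed in as a callable; `repair_missing_ids`, `rebuild_one`, and `fake_rebuild` are hypothetical names, and the real rebuild (watershed from a single pour point, then lake/stream erase) is not shown.

```python
def repair_missing_ids(authoritative_ids, final_ids, rebuild_one):
    """Rebuild watersheds for IDs that are missing from the final output.

    rebuild_one(cw_id) should return the rebuilt watershed, or None when the
    clipped result is empty. Returns (repaired, still_missing).
    """
    missing = sorted(set(authoritative_ids) - set(final_ids))
    repaired, still_missing = {}, []
    for cw_id in missing:
        result = rebuild_one(cw_id)  # one ID at a time, so no raster collisions
        if result is not None:
            repaired[cw_id] = result
        else:
            still_missing.append(cw_id)
    return repaired, still_missing


def fake_rebuild(cw_id):
    # Stand-in for the real rebuild: pretend ID 9 clips to nothing
    return None if cw_id == 9 else f"watershed_{cw_id}"


repaired, still_missing = repair_missing_ids([1, 2, 3, 7, 9], [1, 2, 3], fake_rebuild)
print(repaired, still_missing)
```

Reporting `still_missing` separately makes it clear which IDs could not be recovered even after the repair, instead of silently ending with fewer polygons than wetlands.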

---

### Built-in sanity checks (what to watch in the console)

The cell prints (per category):

* `unique wetland IDs (original, dissolved)`
* `unique snapped pourpoint IDs (target wet_n)`
* `watersheds BEFORE clipping unique IDs`
* `final watersheds AFTER clip unique IDs`
* `Missing IDs after clip`
* `final watersheds AFTER rebuild`

---

### Common adjustments you may want

* If many IDs are missing after clip:

  * increase `snap_dist_m` (lake-only points may need more distance to reach land)
  * reduce stream buffer distance if it removes too much nearshore area
* If you get schema lock issues:

  * close attribute tables, stop drawing layers, avoid adding outputs to map during run
* If the repair step runs long:

  * it scales with the number of missing IDs; reducing collision likelihood (snap distance + point density) reduces repair workload

---


In [33]:
# --- QUICK CRS CHECK (run this before the big cell) ---
import arcpy

def _sr_str(sr):
    if sr is None:
        return "None"
    try:
        return f"{sr.name} | factoryCode={sr.factoryCode} | type={sr.type}"
    except Exception:
        return str(sr)

def check_crs(fc, name):
    d = arcpy.Describe(fc)
    sr = getattr(d, "spatialReference", None)
    print(f"\n{name}")
    print(f"  path: {fc}")
    print(f"  dataType: {d.dataType}")
    print(f"  sr: {_sr_str(sr)}")

    # Helpful checks
    unknown = (sr is None) or (sr.name in [None, "", "Unknown"])
    if unknown:
        print("  ⚠️ CRS is UNKNOWN (you must DefineProjection before Project).")
    else:
        try:
            print(f"  linearUnit: {sr.linearUnitName}")
        except Exception:
            pass

    # extent can hint degrees vs meters
    try:
        e = d.extent
        print(f"  extent: XMin={e.XMin:.3f}, XMax={e.XMax:.3f}, YMin={e.YMin:.3f}, YMax={e.YMax:.3f}")
        if abs(e.XMin) <= 180 and abs(e.XMax) <= 180 and abs(e.YMin) <= 90 and abs(e.YMax) <= 90:
            print("  hint: extent looks like *degrees* (likely EPSG:4326 or similar)")
        else:
            print("  hint: extent looks like *projected units* (meters/feet)")
    except Exception:
        pass

# --- run checks ---
check_crs(inStreamsWatershed, "inStreamsWatershed (stream watershed mask)")
check_crs(Lake_Huron,        "Lake_Huron (lake mask)")



inStreamsWatershed (stream watershed mask)
  path: D:\Users\abolmaal\Arcgis\NASAOceanProject\GIS_layer\Streamwatershed\PointWaterdhed_LH.shp
  dataType: ShapeFile
  sr: GCS_WGS_1984 | factoryCode=4326 | type=Geographic
  linearUnit: 
  extent: XMin=929846.850, XMax=1166434.867, YMin=651452.122, YMax=1020243.911
  hint: extent looks like *projected units* (meters/feet)

Lake_Huron (lake mask)
  path: D:\Users\abolmaal\code\boundry\hydro_p_LakeHuron\hydro_p_LakeHuron.shp
  dataType: ShapeFile
  sr: Geographic | factoryCode=0 | type=Geographic
  linearUnit: 
  extent: XMin=-84.752, XMax=-79.668, YMin=42.996, YMax=46.333
  hint: extent looks like *degrees* (likely EPSG:4326 or similar)
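
The two layers above fail in different ways: the stream-watershed mask is labeled EPSG:4326 but its extent is in projected units, while the lake polygon is in degrees but carries a generic geographic CRS with `factoryCode=0`. The sketch below restates that decision logic in plain Python using the extents printed above. The function names and return labels are assumptions made for illustration. The notebook applies the equivalent checks inside `fix_define_and_project_to_gdb`.

```python
def looks_like_degrees(xmin, xmax, ymin, ymax):
    """True when every coordinate fits inside the lon/lat value range."""
    return (abs(xmin) <= 180 and abs(xmax) <= 180
            and abs(ymin) <= 90 and abs(ymax) <= 90)

def classify_crs(sr_type, factory_code, extent):
    """Label a layer from its declared CRS type, EPSG code, and extent."""
    if sr_type == "Geographic" and not looks_like_degrees(*extent):
        return "mislabeled: projected coordinates under a geographic CRS"
    if sr_type == "Geographic" and factory_code in (0, None):
        return "generic geographic: degrees without an EPSG code"
    return "consistent"

# Extents (XMin, XMax, YMin, YMax) copied from the check output above.
stream_mask = classify_crs("Geographic", 4326,
                           (929846.850, 1166434.867, 651452.122, 1020243.911))
lake = classify_crs("Geographic", 0, (-84.752, -79.668, 42.996, 46.333))
print(stream_mask)
print(lake)
```

The first case needs only `DefineProjection` to the correct projected CRS, whereas the second needs `DefineProjection` to a real geographic CRS followed by `Project`.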


In [None]:
# --- FINAL ROBUST CELL (UPDATED): 1 Coastal Watershed per Coastal Wetland (CW_Id)
# Goal:
#   - ALWAYS produce one watershed per CW_Id
#   - Then CLIP OUT only the portions inside Lake Huron or inside Stream-Watershed mask
#   - Do NOT drop whole IDs due to tiny overlaps
#   - No C:\ temp writes (all temp in ws_gdb / pp_gdb)
# Notes:
#   - This workflow can still lose some CW_Ids due to SnapPourPoint raster collisions.
#     We fix that with a "REPAIR missing IDs" step that assigns each missing CW_Id
#     the nearest existing watershed polygon (no per-ID watershed rebuild).

import os, gc, time, sys
import arcpy
from arcpy import sa

arcpy.env.overwriteOutput = True
arcpy.env.addOutputsToMap = False
arcpy.CheckOutExtension("Spatial")

# -------------------------
# REQUIRED: folders exist (must be defined in your notebook)
# -------------------------
# outPourpoints, outWatersheds, inStreamsWatershed, Lake_Huron, D8_flow
# erase_buffer_avg/high/low/surge
# CoastalWatershed_* output paths must exist in your notebook variables
os.makedirs(outPourpoints, exist_ok=True)
os.makedirs(outWatersheds, exist_ok=True)

# -------------------------
# Reliable workspaces (GDBs)
# -------------------------
pp_gdb = os.path.join(outPourpoints, "pourpoints.gdb")
if not arcpy.Exists(pp_gdb):
    arcpy.management.CreateFileGDB(outPourpoints, "pourpoints.gdb")

ws_gdb = os.path.join(outWatersheds, "watersheds.gdb")
if not arcpy.Exists(ws_gdb):
    arcpy.management.CreateFileGDB(outWatersheds, "watersheds.gdb")

# FORCE all scratch/workspace to your ws_gdb (no C:\temp)
arcpy.env.workspace = ws_gdb
arcpy.env.scratchWorkspace = ws_gdb

# -------------------------
# Inputs
# -------------------------
flowacc = r"S:\Projects\Active\GLB_Nutrient_Transport\DEM_rasters\GLB_Bdry_buff10km_dem_fill_flowaccu.tif"
print(f"✅ flowacc: {flowacc}", flush=True)

cats = {
    #"avg":   (erase_buffer_avg,   CoastalWatershed_avg_erase_lakedrain,   CoastalWatershed_avg_erase_lakedrain_LakeHuron),
    "high":  (erase_buffer_high,  CoastalWatershed_high_erase_lakedrain,  CoastalWatershed_high_erase_lakedrain_LakeHuron),
    "low":   (erase_buffer_low,   CoastalWatershed_low_erase_lakedrain,   CoastalWatershed_low_erase_lakedrain_LakeHuron),
    "surge": (erase_buffer_surge, CoastalWatershed_surge_erase_lakedrain, CoastalWatershed_surge_erase_lakedrain_LakeHuron),
}

CW_ID_FIELD = "CW_Id"

# -------------------------
# Align env to D8 grid
# -------------------------
D8_SR = arcpy.Describe(D8_flow).spatialReference
arcpy.env.snapRaster = D8_flow
arcpy.env.cellSize   = D8_flow
arcpy.env.extent     = D8_flow
arcpy.env.outputCoordinateSystem = D8_SR
cellsize = float(arcpy.Describe(D8_flow).meanCellWidth)

print(f"✅ workspace: {ws_gdb}", flush=True)
print(f"✅ scratch:   {ws_gdb}", flush=True)
print(f"✅ D8_flow CRS: {D8_SR.name} (factoryCode={D8_SR.factoryCode})", flush=True)

# ============================================================
# Helpers
# ============================================================
def _log(msg):
    print(msg, flush=True)
    sys.stdout.flush()

def _clear_locks():
    try:
        arcpy.ClearWorkspaceCache_management()
    except Exception:
        pass
    gc.collect()

def _safe_delete(p):
    try:
        if arcpy.Exists(p):
            arcpy.management.Delete(p)
    except Exception:
        _clear_locks()
        if arcpy.Exists(p):
            arcpy.management.Delete(p)

def _field_map_lower(fc):
    return {f.name.lower(): f.name for f in arcpy.ListFields(fc)}

def _find_field(fc, candidates):
    fmap = _field_map_lower(fc)
    for c in candidates:
        if c and c.lower() in fmap:
            return fmap[c.lower()]
    return None

def get_field_name_ci(fc, target_name):
    if not target_name:
        return None
    t = target_name.lower()
    for f in arcpy.ListFields(fc):
        if f.name.lower() == t:
            return f.name
    return None

def _is_shp(path):
    return isinstance(path, str) and path.lower().endswith(".shp")

def _ensure_field(fc, desired_name, field_type="LONG", fallbacks=()):
    if not desired_name or not str(desired_name).strip():
        desired_name = "CW_Id"

    existing = get_field_name_ci(fc, desired_name)
    if existing:
        return existing

    candidates = [desired_name] + list(fallbacks)
    for nm in candidates:
        if not nm:
            continue

        safe = arcpy.ValidateFieldName(nm, os.path.dirname(fc) if isinstance(fc, str) else "")
        if _is_shp(fc) and len(safe) > 10:
            safe = safe[:10]

        existing = get_field_name_ci(fc, safe)
        if existing:
            return existing

        try:
            arcpy.management.AddField(fc, safe, field_type)
            return get_field_name_ci(fc, safe) or safe
        except Exception:
            _clear_locks()
            continue

    raise RuntimeError(f"Cannot add field '{desired_name}' to {fc}")

def count_unique(fc, id_field):
    fld = get_field_name_ci(fc, id_field) or _find_field(fc, [id_field])
    if not fld:
        return 0
    vals = set()
    with arcpy.da.SearchCursor(fc, [fld]) as cur:
        for (v,) in cur:
            if v is not None:
                vals.add(int(v))
    return len(vals)

def _idset(fc, id_field):
    fld = get_field_name_ci(fc, id_field) or _find_field(fc, [id_field])
    s = set()
    with arcpy.da.SearchCursor(fc, [fld]) as cur:
        for (v,) in cur:
            if v is not None:
                s.add(int(v))
    return s

def calculate_area_m2(fc, field="WS_AREAM2"):
    try:
        field = _ensure_field(fc, field, "DOUBLE", fallbacks=("AREA_M2","A_M2","AREA"))
        arcpy.management.CalculateGeometryAttributes(fc, [[field, "AREA"]], area_unit="SQUARE_METERS")
    except Exception as e:
        _log(f"⚠️ area field skipped: {e}")
    return field

def add_xy_ll(fc, prefix="WS"):
    try:
        cx = _ensure_field(fc, f"{prefix}_cx", "DOUBLE", fallbacks=(f"{prefix}X",))
        cy = _ensure_field(fc, f"{prefix}_cy", "DOUBLE", fallbacks=(f"{prefix}Y",))
        arcpy.management.CalculateField(fc, cx, "!SHAPE.centroid.X!", "PYTHON3")
        arcpy.management.CalculateField(fc, cy, "!SHAPE.centroid.Y!", "PYTHON3")

        lon = _ensure_field(fc, f"{prefix}_lon", "DOUBLE", fallbacks=(f"{prefix}LON",))
        lat = _ensure_field(fc, f"{prefix}_lat", "DOUBLE", fallbacks=(f"{prefix}LAT",))
        arcpy.management.CalculateGeometryAttributes(
            fc,
            [[lat, "CENTROID_Y"], [lon, "CENTROID_X"]],
            coordinate_system=arcpy.SpatialReference(4326),
            coordinate_format="DD"
        )
    except Exception as e:
        _log(f"⚠️ xy/ll fields skipped: {e}")

def gp(label, func, *args, **kwargs):
    _log(f"▶ {label}")
    t0 = time.time()
    out = func(*args, **kwargs)
    _log(f"✅ DONE {label} ({(time.time()-t0)/60:.2f} min)")
    return out

# ---------- CRS FIX + PROJECT (TEMP WRITES INTO ws_gdb) ----------
def fix_define_and_project_to_gdb(in_fc, out_fc, out_sr, assumed_src_if_mislabeled=None, name="layer"):
    r"""
    Robust CRS fixer:
    - If dataset is mislabeled but coordinates already match out_sr, we:
        CopyFeatures -> DefineProjection(out_sr) and STOP (no Project).
    - Otherwise we DefineProjection to known source CRS (e.g. EPSG:4326) then Project.
    All temp writes stay in ws_gdb (no C:\).
    """
    def _looks_like_degrees(ext):
        return (abs(ext.XMin) <= 180 and abs(ext.XMax) <= 180 and abs(ext.YMin) <= 90 and abs(ext.YMax) <= 90)

    def _looks_like_projected(ext):
        return (max(abs(ext.XMin), abs(ext.XMax), abs(ext.YMin), abs(ext.YMax)) > 1000)

    def _pick_transform(in_sr, out_sr):
        try:
            tx = arcpy.ListTransformations(in_sr, out_sr)
            return tx[0] if tx else None
        except Exception:
            return None

    _safe_delete(out_fc)

    d = arcpy.Describe(in_fc)
    sr = d.spatialReference
    ext = d.extent

    _log(f"\n[{name}] input: {in_fc}")
    _log(f"[{name}] sr: {sr.name if sr else None} | factoryCode={getattr(sr,'factoryCode',None)} | type={getattr(sr,'type',None)}")
    _log(f"[{name}] extent: XMin={ext.XMin:.3f} XMax={ext.XMax:.3f} YMin={ext.YMin:.3f} YMax={ext.YMax:.3f}")

    tmp_copy = os.path.join(ws_gdb, f"tmp_{name}_copy")
    _safe_delete(tmp_copy)

    # Copy without any implicit projection
    with arcpy.EnvManager(outputCoordinateSystem=None, extent=None, snapRaster=None, cellSize=None):
        gp(f"CopyFeatures {name}", arcpy.management.CopyFeatures, in_fc, tmp_copy)

    d2 = arcpy.Describe(tmp_copy)
    sr2 = d2.spatialReference
    ext2 = d2.extent

    mislabeled_projected = (
        sr2 is not None and getattr(sr2, "type", None) == "Geographic"
        and _looks_like_projected(ext2) and not _looks_like_degrees(ext2)
    )

    if mislabeled_projected:
        _log(f"[{name}] ⚠️ MISLABELED Geographic but coords are projected. SKIP Project; DefineProjection -> {out_sr.name}")
        gp(f"DefineProjection {name}", arcpy.management.DefineProjection, tmp_copy, out_sr)
        gp(f"Copy to output {name}", arcpy.management.CopyFeatures, tmp_copy, out_fc)
        _safe_delete(tmp_copy)
        return out_fc

    generic_degrees = (
        sr2 is not None and getattr(sr2, "type", None) == "Geographic"
        and _looks_like_degrees(ext2)
        and getattr(sr2, "factoryCode", 0) in (0, None)
    )

    if generic_degrees:
        if assumed_src_if_mislabeled is None:
            assumed_src_if_mislabeled = arcpy.SpatialReference(4326)
        _log(f"[{name}] ⚠️ GENERIC geographic degrees. DefineProjection -> EPSG:4326 then Project.")
        gp(f"DefineProjection {name}", arcpy.management.DefineProjection, tmp_copy, assumed_src_if_mislabeled)

    sr_fixed = arcpy.Describe(tmp_copy).spatialReference
    if sr_fixed and (sr_fixed.factoryCode == out_sr.factoryCode) and (sr_fixed.name == out_sr.name):
        _log(f"[{name}] Already in target CRS. CopyFeatures only (no Project).")
        gp(f"Copy to output {name}", arcpy.management.CopyFeatures, tmp_copy, out_fc)
        _safe_delete(tmp_copy)
        return out_fc

    transform = _pick_transform(sr_fixed, out_sr)
    _log(f"[{name}] Project -> {out_sr.name} | transform={transform}")

    if transform:
        gp(f"Project {name}", arcpy.management.Project, tmp_copy, out_fc, out_sr, transform)
    else:
        gp(f"Project {name}", arcpy.management.Project, tmp_copy, out_fc, out_sr)

    _safe_delete(tmp_copy)
    return out_fc

def polygon_to_mask_raster(poly_fc, out_ras, value=1):
    """
    Polygon -> raster aligned to D8 (snap/cellsize/extent already set).
    Uses a constant field to burn 'value' into raster.
    """
    _safe_delete(out_ras)
    fld = "MASKVAL"
    if fld not in [f.name for f in arcpy.ListFields(poly_fc)]:
        arcpy.management.AddField(poly_fc, fld, "SHORT")
        # CalculateField expects the expression as a string
        arcpy.management.CalculateField(poly_fc, fld, str(value), "PYTHON3")
    gp(f"PolygonToRaster {os.path.basename(out_ras)}", arcpy.conversion.PolygonToRaster,
       poly_fc, fld, out_ras, "CELL_CENTER", "", cellsize)
    return out_ras

# ============================================================
# 0) Build projected masks ONCE (stored in ws_gdb)
# ============================================================
# Stream watershed mask: mislabeled as 4326 but projected coords
inStreamsWS_tgt = os.path.join(ws_gdb, "inStreamsWS_tgt")
fix_define_and_project_to_gdb(
    in_fc=inStreamsWatershed,
    out_fc=inStreamsWS_tgt,
    out_sr=D8_SR,
    name="inStreamsWS"
)

gp("RepairGeometry stream mask", arcpy.management.RepairGeometry, inStreamsWS_tgt)

inStreams_single = os.path.join(ws_gdb, "inStreamsWS_single")
_safe_delete(inStreams_single)
gp("MultipartToSinglepart stream mask", arcpy.management.MultipartToSinglepart, inStreamsWS_tgt, inStreams_single)

inStreamsWS_tgt_diss = os.path.join(ws_gdb, "inStreamsWS_tgt_diss")
_safe_delete(inStreamsWS_tgt_diss)
gp("Dissolve stream mask", arcpy.management.Dissolve, inStreams_single, inStreamsWS_tgt_diss)

# Buffer stream mask (helps remove slivers; does NOT delete entire IDs because we clip polygons later)
stream_buf_m = 60
inStreamsWS_buf = os.path.join(ws_gdb, f"inStreamsWS_buf{stream_buf_m}m")
_safe_delete(inStreamsWS_buf)
gp("Buffer stream mask", arcpy.analysis.Buffer, inStreamsWS_tgt_diss, inStreamsWS_buf, f"{stream_buf_m} Meters", "FULL", "ROUND", "ALL")

# Lake polygon: generic geographic in degrees
Lake_tgt = os.path.join(ws_gdb, "LakeHuron_tgt")
fix_define_and_project_to_gdb(
    in_fc=Lake_Huron,
    out_fc=Lake_tgt,
    out_sr=D8_SR,
    assumed_src_if_mislabeled=arcpy.SpatialReference(4326),
    name="LakeHuron"
)

gp("RepairGeometry lake", arcpy.management.RepairGeometry, Lake_tgt)

# Lake mask raster (for flowacc_land only)
lake_mask_ras = os.path.join(ws_gdb, "LakeHuron_mask_ras")
polygon_to_mask_raster(Lake_tgt, lake_mask_ras, value=1)
_log(f"✅ lake_mask_ras: {lake_mask_ras}")

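# Land-only flowacc for snapping (NoData on lake)
flowacc_land = os.path.join(ws_gdb, "flowacc_land")
_safe_delete(flowacc_land)
_log("▶ Build flowacc_land = flowacc where NOT lake")
# lake_mask_ras is 1 on the lake and NoData everywhere else. SetNull(mask, flowacc)
# would propagate that NoData and blank the land cells as well, so test IsNull(mask):
# keep flowacc where the mask is NoData (land) and leave lake cells as NoData.
fa_land = sa.Con(sa.IsNull(sa.Raster(lake_mask_ras)), sa.Raster(flowacc))
fa_land.save(flowacc_land)
_log(f"✅ flowacc_land: {flowacc_land}")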

# ============================================================
# MAIN LOOP
# ============================================================
for cat, (wet_fc, out_ws_drain, out_ws_lake) in cats.items():

    _log(f"\n==================== {cat.upper()} ====================")

    # --- 1) Project wetlands into D8 SR (work in GDB)
    wet_tgt = os.path.join(ws_gdb, f"{cat}_wet_tgt")
    _safe_delete(wet_tgt)
    gp(f"[{cat}] Project wetlands", arcpy.management.Project, wet_fc, wet_tgt, D8_SR)

    wet_id_f = _ensure_field(wet_tgt, CW_ID_FIELD, "LONG", fallbacks=("CWID","CW_ID","CW_Id"))
    gp(f"[{cat}] RepairGeometry wetlands", arcpy.management.RepairGeometry, wet_tgt)

    # --- 2) Dissolve ORIGINAL wetlands by CW_Id (guarantees 1 polygon per CW_Id)
    wet_orig_diss = os.path.join(ws_gdb, f"{cat}_wet_orig_diss")
    _safe_delete(wet_orig_diss)
    gp(f"[{cat}] Dissolve ORIGINAL wetlands by CW_Id", arcpy.management.Dissolve, wet_tgt, wet_orig_diss, wet_id_f)

    wet_n = count_unique(wet_orig_diss, wet_id_f)
    _log(f"[{cat}] unique wetland IDs (original, dissolved): {wet_n}")

    # --- 3) LAND-ONLY wetlands (erase lake) then dissolve (for land pourpoints)
    wet_land = os.path.join(ws_gdb, f"{cat}_wet_land")
    _safe_delete(wet_land)
    gp(f"[{cat}] Erase lake from wetlands (land-only)", arcpy.analysis.Erase, wet_tgt, Lake_tgt, wet_land)

    wet_land_diss = os.path.join(ws_gdb, f"{cat}_wet_land_diss")
    _safe_delete(wet_land_diss)
    gp(f"[{cat}] Dissolve land-only wetlands by CW_Id", arcpy.management.Dissolve, wet_land, wet_land_diss, wet_id_f)

    land_n = count_unique(wet_land_diss, wet_id_f)
    _log(f"[{cat}] unique wetland IDs (land-only, dissolved): {land_n}")

    # --- 4) Pourpoints from land-only dissolved (1 per CW_Id)
    pp_inside = os.path.join(pp_gdb, f"{cat}_pp_inside")
    _safe_delete(pp_inside)
    gp(f"[{cat}] FeatureToPoint INSIDE (land-only dissolved)", arcpy.management.FeatureToPoint, wet_land_diss, pp_inside, "INSIDE")

    # --- 5) Add fallback pourpoints for lake-only IDs from ORIGINAL dissolved (still 1 per CW_Id)
    orig_ids = _idset(wet_orig_diss, wet_id_f)
    land_ids = _idset(wet_land_diss, wet_id_f)
    missing_lakeonly = sorted(list(orig_ids - land_ids))

    if missing_lakeonly:
        _log(f"⚠️ [{cat}] {len(missing_lakeonly)} wetlands appear fully in-lake after erase; adding 1 fallback point per CW_Id.")

        # Select missing polygons from wet_orig_diss using a chunked IN() approach
        wet_missing_poly = os.path.join(ws_gdb, f"{cat}_wet_missing_poly")
        _safe_delete(wet_missing_poly)

        lyr = f"lyr_{cat}_missing"
        arcpy.management.MakeFeatureLayer(wet_orig_diss, lyr)
        chunk = 900
        for i in range(0, len(missing_lakeonly), chunk):
            sub = missing_lakeonly[i:i+chunk]
            where = f"{arcpy.AddFieldDelimiters(lyr, wet_id_f)} IN ({','.join(map(str, sub))})"
            arcpy.management.SelectLayerByAttribute(lyr, "ADD_TO_SELECTION", where)
        gp(f"[{cat} missing] Copy selected", arcpy.management.CopyFeatures, lyr, wet_missing_poly)
        arcpy.management.Delete(lyr)

        pp_fallback = os.path.join(pp_gdb, f"{cat}_pp_fallback")
        _safe_delete(pp_fallback)
        gp(f"[{cat}] FeatureToPoint INSIDE (fallback dissolved)", arcpy.management.FeatureToPoint, wet_missing_poly, pp_fallback, "INSIDE")

        gp(f"[{cat}] Append fallback points", arcpy.management.Append, pp_fallback, pp_inside, "NO_TEST")
        _safe_delete(pp_fallback)
        _safe_delete(wet_missing_poly)

    # ensure ID field exists on points
    pp_id_f = _ensure_field(pp_inside, wet_id_f, "LONG", fallbacks=("CWID","CW_ID","CW_Id"))

    # --- 6) SnapPourPoint in raster space (bulk) using LAND-ONLY flowacc
    pp_ras = os.path.join(pp_gdb, f"{cat}_pp_ras")
    _safe_delete(pp_ras)
    gp(f"[{cat}] PointToRaster pourpoints", arcpy.conversion.PointToRaster,
       pp_inside, pp_id_f, pp_ras, "MAXIMUM", "", cellsize)

    snapped_pp_ras = os.path.join(pp_gdb, f"{cat}_pp_snapped_ras")
    _safe_delete(snapped_pp_ras)

    # For lake-only points you typically need a larger snap distance to reach land.
    # But larger distances increase collisions. We'll keep it moderate and REPAIR missing IDs later.
    snap_dist_m = 150
    _log(f"[{cat}] SnapPourPoint distance = {snap_dist_m} m (land-only)")
    sa.SnapPourPoint(sa.Raster(pp_ras), sa.Raster(flowacc_land), snap_dist_m).save(snapped_pp_ras)

    # Convert snapped raster to points (bulk snapped pourpoints)
    snapped_pts = os.path.join(pp_gdb, f"{cat}_pp_snapped_pts")
    _safe_delete(snapped_pts)
    gp(f"[{cat}] RasterToPoint snapped pourpoints", arcpy.conversion.RasterToPoint, snapped_pp_ras, snapped_pts, "VALUE")

    val_field = _find_field(snapped_pts, ["GRID_CODE", "GRIDCODE", "VALUE"])
    if not val_field:
        raise RuntimeError(f"[{cat}] Could not find VALUE/GRIDCODE in snapped points.")

    pp_final_id = _ensure_field(snapped_pts, wet_id_f, "LONG", fallbacks=("CWID","CW_ID","CW_Id"))
    gp(f"[{cat}] Calculate CW_Id on snapped points", arcpy.management.CalculateField,
       snapped_pts, pp_final_id, f"!{val_field}!", "PYTHON3")

    uniq_pp = count_unique(snapped_pts, pp_final_id)
    _log(f"[{cat}] unique snapped pourpoint IDs: {uniq_pp} (target {wet_n})")

    # --- 7) Convert snapped points back to raster for Watershed
    pp_snap_ras = os.path.join(pp_gdb, f"{cat}_pp_snap_ras")
    _safe_delete(pp_snap_ras)
    gp(f"[{cat}] PointToRaster snapped points", arcpy.conversion.PointToRaster,
       snapped_pts, pp_final_id, pp_snap_ras, "MAXIMUM", "", cellsize)

    # --- 8) Watershed raster (NO lake/stream raster masking here!)
    ws_ras = os.path.join(ws_gdb, f"{cat}_ws_ras")
    _safe_delete(ws_ras)
    _log(f"▶ [{cat}] Watershed")
    t0 = time.time()
    sa.Watershed(D8_flow, pp_snap_ras).save(ws_ras)
    _log(f"✅ DONE [{cat}] Watershed ({(time.time()-t0)/60:.2f} min)")

    # --- 9) RasterToPolygon (NO raster masking)
    ws_poly_raw = os.path.join(ws_gdb, f"{cat}_ws_poly_raw")
    _safe_delete(ws_poly_raw)
    gp(f"[{cat}] RasterToPolygon", arcpy.conversion.RasterToPolygon, ws_ras, ws_poly_raw, "NO_SIMPLIFY", "VALUE")

    grid_field = _find_field(ws_poly_raw, ["GRIDCODE","GRID_CODE"])
    if not grid_field:
        raise RuntimeError(f"[{cat}] GRIDCODE missing in watershed polygons.")

    ws_id_f = _ensure_field(ws_poly_raw, wet_id_f, "LONG", fallbacks=("CWID","CW_ID","CW_Id"))
    gp(f"[{cat}] Calculate CW_Id on watershed polygons", arcpy.management.CalculateField,
       ws_poly_raw, ws_id_f, f"!{grid_field}!", "PYTHON3")

    ws_poly = os.path.join(ws_gdb, f"{cat}_ws_poly")
    _safe_delete(ws_poly)
    gp(f"[{cat}] Dissolve watersheds by CW_Id", arcpy.management.Dissolve, ws_poly_raw, ws_poly, ws_id_f)

    ws_before = count_unique(ws_poly, ws_id_f)
    _log(f"[{cat}] watersheds BEFORE clipping unique IDs: {ws_before} (target {wet_n})")

    # --- 10) CLIP OUT ONLY overlap parts (lake + stream) as polygons
    tmp_no_lake = os.path.join(ws_gdb, f"{cat}_ws_no_lake")
    tmp_no_stream = os.path.join(ws_gdb, f"{cat}_ws_no_stream")
    _safe_delete(tmp_no_lake); _safe_delete(tmp_no_stream)

    gp(f"[{cat}] Erase lake from watersheds (clip)", arcpy.analysis.Erase, ws_poly, Lake_tgt, tmp_no_lake)
    gp(f"[{cat}] Erase stream mask from watersheds (clip)", arcpy.analysis.Erase, tmp_no_lake, inStreamsWS_buf, tmp_no_stream)

    _safe_delete(out_ws_lake)
    gp(f"[{cat}] Dissolve final output (clip result)", arcpy.management.Dissolve, tmp_no_stream, out_ws_lake, ws_id_f)

    ws_after = count_unique(out_ws_lake, ws_id_f)
    _log(f"[{cat}] final watersheds AFTER clip unique IDs: {ws_after} (target {wet_n})")

    # ============================================================
    # 11) REPAIR missing IDs (ASSIGNMENT ONLY - FAST)
    #     This guarantees 1 feature per CW_Id without per-ID watershed rebuild.
    # ============================================================

    final_ids = _idset(out_ws_lake, ws_id_f)
    missing_ids = sorted(list(orig_ids - final_ids))
    _log(f"[{cat}] Missing IDs after clip: {len(missing_ids)}")

    if missing_ids:
        # Extract pourpoints for missing IDs
        pp_layer = f"lyr_{cat}_pp_inside"
        arcpy.management.MakeFeatureLayer(pp_inside, pp_layer)

        miss_pts = os.path.join(pp_gdb, f"{cat}_missing_pts")
        _safe_delete(miss_pts)

        arcpy.management.SelectLayerByAttribute(pp_layer, "CLEAR_SELECTION")
        chunk = 900
        for i in range(0, len(missing_ids), chunk):
            sub = missing_ids[i:i+chunk]
            where = f"{arcpy.AddFieldDelimiters(pp_layer, pp_id_f)} IN ({','.join(map(str, sub))})"
            arcpy.management.SelectLayerByAttribute(pp_layer, "ADD_TO_SELECTION", where)

        gp(f"[{cat}] Copy missing pourpoints", arcpy.management.CopyFeatures, pp_layer, miss_pts)
        arcpy.management.Delete(pp_layer)

        # Near -> closest existing final watershed polygon
        gp(f"[{cat}] Near(missing pts -> final watersheds)", arcpy.analysis.Near, miss_pts, out_ws_lake)

        # Build donor geometry lookup (OID -> geometry)
        oid_field = arcpy.Describe(out_ws_lake).OIDFieldName
        donor_geom = {}
        with arcpy.da.SearchCursor(out_ws_lake, [oid_field, "SHAPE@"]) as cur:
            for oid, geom in cur:
                donor_geom[int(oid)] = geom

        # Create assigned FC (one polygon per missing CW_Id)
        assigned_fc = os.path.join(ws_gdb, f"{cat}_assigned_missing")
        _safe_delete(assigned_fc)
        gp(f"[{cat}] Create assigned FC", arcpy.management.CreateFeatureclass,
        ws_gdb, os.path.basename(assigned_fc), "POLYGON", None, "DISABLED", "DISABLED", D8_SR)

        assigned_id = _ensure_field(assigned_fc, ws_id_f, "LONG", fallbacks=("CWID","CW_ID","CW_Id"))

        near_fld = _find_field(miss_pts, ["NEAR_FID"])
        n_assigned = 0
        with arcpy.da.SearchCursor(miss_pts, [pp_id_f, near_fld]) as cur, \
            arcpy.da.InsertCursor(assigned_fc, [assigned_id, "SHAPE@"]) as ic:
            for cw, near_fid in cur:
                if near_fid is None or int(near_fid) < 0:
                    continue
                geom = donor_geom.get(int(near_fid))
                if geom is None:
                    continue
                ic.insertRow((int(cw), geom))
                n_assigned += 1

        _log(f"[{cat}] Assigned polygons added: {n_assigned}")

        if int(arcpy.management.GetCount(assigned_fc)[0]) > 0:
            gp(f"[{cat}] Append assigned -> final", arcpy.management.Append, assigned_fc, out_ws_lake, "NO_TEST")

            out_ws_lake_diss2 = os.path.join(ws_gdb, f"{cat}_final_diss2")
            _safe_delete(out_ws_lake_diss2)

            # Use PairwiseDissolve if available (faster)
            if hasattr(arcpy.analysis, "PairwiseDissolve"):
                gp(f"[{cat}] PairwiseDissolve (enforce 1 per CW_Id)", arcpy.analysis.PairwiseDissolve,
                out_ws_lake, out_ws_lake_diss2, ws_id_f)
            else:
                gp(f"[{cat}] Dissolve (enforce 1 per CW_Id)", arcpy.management.Dissolve,
                out_ws_lake, out_ws_lake_diss2, ws_id_f)

            _safe_delete(out_ws_lake)
            gp(f"[{cat}] Copy enforced final", arcpy.management.CopyFeatures, out_ws_lake_diss2, out_ws_lake)

        _safe_delete(miss_pts)

    ws_after_final = count_unique(out_ws_lake, ws_id_f)
    _log(f"[{cat}] final watersheds AFTER assignment-repair: {ws_after_final} (target {len(orig_ids)})")


    # --- 12) Add attributes safely (do not crash if locked)
    calculate_area_m2(out_ws_lake, "WS_AREAM2")
    add_xy_ll(out_ws_lake, prefix="WS")

    _log(f"✅ final watershed: {cat} -> {os.path.basename(out_ws_lake)}")
    _clear_locks()

_log("\n🎉 DONE")


✅ flowacc: S:\Projects\Active\GLB_Nutrient_Transport\DEM_rasters\GLB_Bdry_buff10km_dem_fill_flowaccu.tif
✅ workspace: D:\Users\abolmaal\Arcgis\NASAOceanProject\GIS_layer\CoastalWatersheds\Watershed_rasters\watersheds.gdb


  All temp writes stay in ws_gdb (no C:\).


✅ scratch:   D:\Users\abolmaal\Arcgis\NASAOceanProject\GIS_layer\CoastalWatersheds\Watershed_rasters\watersheds.gdb
✅ D8_flow CRS: NAD_1983_Great_Lakes_Basin_Albers (factoryCode=3174)

[inStreamsWS] input: D:\Users\abolmaal\Arcgis\NASAOceanProject\GIS_layer\Streamwatershed\PointWaterdhed_LH.shp
[inStreamsWS] sr: GCS_WGS_1984 | factoryCode=4326 | type=Geographic
[inStreamsWS] extent: XMin=929846.850 XMax=1166434.867 YMin=651452.122 YMax=1020243.911
▶ CopyFeatures inStreamsWS
✅ DONE CopyFeatures inStreamsWS (0.01 min)
[inStreamsWS] ⚠️ MISLABELED Geographic but coords are projected. SKIP Project; DefineProjection -> NAD_1983_Great_Lakes_Basin_Albers
▶ DefineProjection inStreamsWS
✅ DONE DefineProjection inStreamsWS (0.00 min)
▶ Copy to output inStreamsWS
✅ DONE Copy to output inStreamsWS (0.01 min)
▶ RepairGeometry stream mask
✅ DONE RepairGeometry stream mask (0.00 min)
▶ MultipartToSinglepart stream mask
✅ DONE MultipartToSinglepart stream mask (0.01 min)
▶

In [65]:
import arcpy

def find_disconnected_ids(wet_fc, ws_fc, id_field="CW_Id"):
    arcpy.management.MakeFeatureLayer(wet_fc, "wet_lyr")
    arcpy.management.MakeFeatureLayer(ws_fc,  "ws_lyr")

    # Select watersheds that DO NOT intersect any wetland
    arcpy.management.SelectLayerByLocation(
        "ws_lyr",
        overlap_type="INTERSECT",
        select_features="wet_lyr",
        selection_type="NEW_SELECTION",
        invert_spatial_relationship="INVERT"
    )

    bad = set()
    with arcpy.da.SearchCursor("ws_lyr", [id_field]) as cur:
        for (cid,) in cur:
            if cid is not None:
                bad.add(int(cid))

    arcpy.management.Delete("wet_lyr")
    arcpy.management.Delete("ws_lyr")
    return sorted(bad)

bad_ids = find_disconnected_ids(wet_fc, out_ws_lake, id_field="CW_Id")
print("Disconnected count:", len(bad_ids))
print("Examples:", bad_ids[:20])


Disconnected count: 2578
Examples: [5295, 6056, 6172, 6174, 6245, 6328, 6344, 7088, 7596, 7924, 8040, 8693, 8704, 8712, 8809, 8867, 9107, 9231, 9587, 9965]


In [52]:
def diagnose_bad_ids(bad_ids, wet_fc, snapped_pp_fc, lake_fc, flowdir_ras, id_field="CW_Id"):
    arcpy.management.MakeFeatureLayer(wet_fc, "wet_lyr")
    arcpy.management.MakeFeatureLayer(snapped_pp_fc, "pp_lyr")

    for tid in bad_ids[:30]:  # limit print
        fldW = arcpy.AddFieldDelimiters(wet_fc, id_field)
        fldP = arcpy.AddFieldDelimiters(snapped_pp_fc, id_field)

        arcpy.management.SelectLayerByAttribute("wet_lyr", "NEW_SELECTION", f"{fldW} = {tid}")
        arcpy.management.SelectLayerByAttribute("pp_lyr",  "NEW_SELECTION", f"{fldP} = {tid}")

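        # next(..., None) avoids StopIteration when an ID has no wetland or snapped point
        wet_row = next(arcpy.da.SearchCursor("wet_lyr", ["SHAPE@"]), None)
        pp_row  = next(arcpy.da.SearchCursor("pp_lyr",  ["SHAPE@"]), None)
        if wet_row is None or pp_row is None:
            print(f"CW_Id {tid}: wetland or snapped pour point not found; skipped")
            continue
        wet_geom, pp_geom = wet_row[0], pp_row[0]

        dist = wet_geom.distanceTo(pp_geom)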

        # inside lake?
        inside_lake = False
        if lake_fc:
            arcpy.management.MakeFeatureLayer(lake_fc, "lake_lyr")
            arcpy.management.SelectLayerByLocation("lake_lyr", "INTERSECT", pp_geom)
            inside_lake = int(arcpy.management.GetCount("lake_lyr")[0]) > 0
            arcpy.management.Delete("lake_lyr")

        # NoData check on flowdir
        cellval = arcpy.management.GetCellValue(flowdir_ras, f"{pp_geom.centroid.X} {pp_geom.centroid.Y}").getOutput(0)
        nodata = (cellval in ["NoData", None])

        print(f"CW_Id {tid}: dist={dist:.2f} m | inside_lake={inside_lake} | D8_NoData={nodata} | D8_val={cellval}")

    arcpy.management.Delete("wet_lyr")
    arcpy.management.Delete("pp_lyr")

diagnose_bad_ids(bad_ids, wet_fc, unique_snapped_pp, Lake_tgt, D8_flow, id_field="CW_Id")


CW_Id 4505: dist=96.16 m | inside_lake=True | D8_NoData=False | D8_val=64
CW_Id 4635: dist=146.96 m | inside_lake=True | D8_NoData=False | D8_val=128
CW_Id 4674: dist=125.76 m | inside_lake=True | D8_NoData=False | D8_val=1
CW_Id 6132: dist=69.60 m | inside_lake=False | D8_NoData=False | D8_val=64
CW_Id 6328: dist=87.79 m | inside_lake=False | D8_NoData=False | D8_val=1
CW_Id 6344: dist=137.36 m | inside_lake=False | D8_NoData=False | D8_val=1
CW_Id 7290: dist=103.96 m | inside_lake=False | D8_NoData=False | D8_val=64
CW_Id 7924: dist=132.98 m | inside_lake=False | D8_NoData=False | D8_val=128
CW_Id 8712: dist=49.62 m | inside_lake=False | D8_NoData=False | D8_val=64
CW_Id 8779: dist=97.66 m | inside_lake=False | D8_NoData=False | D8_val=128
CW_Id 8780: dist=1.15 m | inside_lake=True | D8_NoData=False | D8_val=64
CW_Id 8867: dist=129.64 m | inside_lake=False | D8_NoData=False | D8_val=64
CW_Id 8889: dist=53.54 m | inside_lake=False | D8_NoData=False | D8_val=32
CW_Id 8890: dist=116.52 
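
The diagnostic lines printed above can be tallied to see which failure mode dominates among the bad IDs. The following is a minimal sketch, not part of the original notebook. It assumes the log format stays exactly as printed by `diagnose_bad_ids` and that the flow-direction raster uses the standard ArcGIS D8 encoding (1 = E, 2 = SE, 4 = S, 8 = SW, 16 = W, 32 = NW, 64 = N, 128 = NE). The names `D8_DIRECTIONS`, `parse_diag_line`, and `summarize` are hypothetical helpers introduced only for illustration.

```python
import re
from collections import Counter

# Standard ArcGIS D8 flow-direction codes (assumed to apply to this raster).
D8_DIRECTIONS = {1: "E", 2: "SE", 4: "S", 8: "SW", 16: "W", 32: "NW", 64: "N", 128: "NE"}

# Matches one complete line in the format printed by diagnose_bad_ids.
DIAG_RE = re.compile(
    r"CW_Id (?P<cw_id>\d+): dist=(?P<dist>[\d.]+) m \| "
    r"inside_lake=(?P<inside_lake>True|False) \| "
    r"D8_NoData=(?P<nodata>True|False) \| D8_val=(?P<d8>\S+)"
)

def parse_diag_line(line):
    """Parse one diagnostic line into a dict, or return None if it does not match."""
    m = DIAG_RE.search(line)
    if m is None:
        return None  # e.g. a truncated line
    d8 = m.group("d8")
    return {
        "cw_id": int(m.group("cw_id")),
        "dist_m": float(m.group("dist")),
        "inside_lake": m.group("inside_lake") == "True",
        "d8_nodata": m.group("nodata") == "True",
        "d8_dir": D8_DIRECTIONS.get(int(d8)) if d8.isdigit() else None,
    }

def summarize(lines):
    """Count how many lines fall into each likely failure mode."""
    counts = Counter()
    for line in lines:
        rec = parse_diag_line(line)
        if rec is None:
            counts["unparsed"] += 1
        elif rec["d8_nodata"]:
            counts["d8_nodata"] += 1
        elif rec["inside_lake"]:
            counts["inside_lake"] += 1
        else:
            counts["on_land_valid_d8"] += 1
    return counts

# Three lines copied from the output above (the last one is truncated in the log).
sample = [
    "CW_Id 4505: dist=96.16 m | inside_lake=True | D8_NoData=False | D8_val=64",
    "CW_Id 6132: dist=69.60 m | inside_lake=False | D8_NoData=False | D8_val=64",
    "CW_Id 8890: dist=116.52 ",
]
print(summarize(sample))
```

Records counted as `on_land_valid_d8` are the ones where neither the lake mask nor a NoData cell explains the mismatch, so they are the natural candidates for a closer look with a per-wetland debug export.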

In [66]:
# pick a few IDs to test
test_ids = [415116]  # add more if you want

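# make layers
arcpy.management.MakeFeatureLayer(unique_snapped_pp, "snapped_lyr")
arcpy.management.MakeFeatureLayer(wet_fc, "wet_lyr")

# field delimiters depend only on the data source, so build them once
fldW = arcpy.AddFieldDelimiters(wet_fc, wet_id_f)
fldP = arcpy.AddFieldDelimiters(unique_snapped_pp, pp_id_f)

for tid in test_ids:
    arcpy.management.SelectLayerByAttribute("wet_lyr", "NEW_SELECTION", f"{fldW} = {tid}")
    arcpy.management.SelectLayerByAttribute("snapped_lyr", "NEW_SELECTION", f"{fldP} = {tid}")

    # get geometries (next(..., None) avoids StopIteration when a cursor returns no rows)
    with arcpy.da.SearchCursor("wet_lyr", ["SHAPE@"]) as cur:
        wet_row = next(cur, None)
    with arcpy.da.SearchCursor("snapped_lyr", ["SHAPE@"]) as cur:
        pp_row = next(cur, None)
    if wet_row is None or pp_row is None:
        print(f"CW_Id {tid}: no wetland or snapped pour point row returned; skipped")
        continue

    # `cat` (the scenario name) must already be defined earlier in the notebook
    dist = wet_row[0].distanceTo(pp_row[0])
    print(f"[{cat}] CW_Id {tid}: snapped->wetland distance = {dist:.2f} m")

# clean up the temporary layers
arcpy.management.Delete("snapped_lyr")
arcpy.management.Delete("wet_lyr")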


[surge] CW_Id 415116: snapped->wetland distance = 98.53 m


In [50]:
def debug_one_cw(cat, cw_id, wet_fc, pourpoints_fc, uniq_pp_fc, ws_poly_gdb, out_gdb):
    arcpy.env.overwriteOutput = True
    if not arcpy.Exists(out_gdb):
        arcpy.management.CreateFileGDB(os.path.dirname(out_gdb), os.path.basename(out_gdb))

    def sel_to_fc(fc, where, out_name):
        """Copy the features of `fc` that match `where` into `out_gdb` as `out_name`."""
        out = os.path.join(out_gdb, out_name)
        if arcpy.Exists(out):
            arcpy.management.Delete(out)
        lyr = f"lyr_{out_name}"
        arcpy.management.MakeFeatureLayer(fc, lyr, where)
        arcpy.management.CopyFeatures(lyr, out)
        arcpy.management.Delete(lyr)
        return out

    # export the pieces
    wet_sel  = sel_to_fc(wet_fc,        f"{CW_ID_FIELD} = {cw_id}", f"{cat}_wet_{cw_id}")
    pp_sel   = sel_to_fc(pourpoints_fc, f"{CW_ID_FIELD} = {cw_id}", f"{cat}_pp_{cw_id}")
    upp_sel  = sel_to_fc(uniq_pp_fc,    f"{CW_ID_FIELD} = {cw_id}", f"{cat}_upp_{cw_id}")
    ws_sel   = sel_to_fc(ws_poly_gdb,   f"{CW_ID_FIELD} = {cw_id}", f"{cat}_ws_{cw_id}")

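    def first_geom(fc):
        # next(..., None) avoids StopIteration when an export has no features
        with arcpy.da.SearchCursor(fc, ["SHAPE@"]) as cur:
            row = next(cur, None)
        return row[0] if row else None

    wet_geom, ws_geom, upp_geom = first_geom(wet_sel), first_geom(ws_sel), first_geom(upp_sel)

    # check intersection (prints None if the wetland or watershed export is empty)
    inter = None
    if wet_geom is not None and ws_geom is not None:
        inter = not wet_geom.disjoint(ws_geom)
    print("Watershed intersects wetland?", inter)

    # distance from snapped pourpoint to wetland (meters if projected; None if an export is empty)
    dist = None
    if upp_geom is not None and wet_geom is not None:
        dist = upp_geom.distanceTo(wet_geom)
    print("Distance snapped pourpoint -> wetland:", dist)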

    return wet_sel, pp_sel, upp_sel, ws_sel


CW_Id = 415116

debug_one_cw(
    cat="avg",
    cw_id=CW_Id,
    wet_fc=erase_buffer_avg,
    pourpoints_fc=os.path.join(outPourpoints, "avg_pourpoints.shp"),
    uniq_pp_fc=os.path.join(pp_gdb, "avg_uniq_pp"),
    ws_poly_gdb=os.path.join(ws_gdb, "avg_ws_poly"),
    out_gdb=r"D:\Users\abolmaal\Arcgis\NASAOceanProject\GIS_layer\CoastalWatersheds\temp\cw_debug_avg.gdb"
)

Watershed intersects wetland? False
Distance snapped pourpoint -> wetland: 100.2234555610154


('D:\\Users\\abolmaal\\Arcgis\\NASAOceanProject\\GIS_layer\\CoastalWatersheds\\temp\\cw_debug_avg.gdb\\avg_wet_415116',
 'D:\\Users\\abolmaal\\Arcgis\\NASAOceanProject\\GIS_layer\\CoastalWatersheds\\temp\\cw_debug_avg.gdb\\avg_pp_415116',
 'D:\\Users\\abolmaal\\Arcgis\\NASAOceanProject\\GIS_layer\\CoastalWatersheds\\temp\\cw_debug_avg.gdb\\avg_upp_415116',
 'D:\\Users\\abolmaal\\Arcgis\\NASAOceanProject\\GIS_layer\\CoastalWatersheds\\temp\\cw_debug_avg.gdb\\avg_ws_415116')