## 🗂️ IndexMapr
### From field to folder—automatically mapped.
#### IndexMapr is a streamlined GIS tool that automatically organizes site photos by street and house number, creating a spatially-aware folder structure from any table with `Street` and `House` columns. Designed for field data workflows, it reads your survey data (CSV, Excel, or feature class), builds a clean hierarchy of folders by location, and downloads associated images into the right place.

#### Whether you’re tracking installations, inspections, or field notes, IndexMapr turns messy tables into map-friendly media libraries, with zero stress.
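
As a rough illustration of the layout, the sketch below mirrors the grouping rule used in the notebook code (first letter of the street, or numeric tens for numbered streets) and builds the relative folder for one record. The helper name `photo_folder` is hypothetical and not part of IndexMapr; it only computes a path and creates nothing on disk.

```python
import re
from pathlib import PurePosixPath


def group_key(street: str) -> str:
    """Top-level group: numeric tens ("120s") or first letter ("E streets")."""
    m = re.match(r"(\d+)", street)
    if m:
        return f"{int(m.group(1)) // 10 * 10}s"
    return f"{street[0].upper()} streets"


def photo_folder(street: str, house) -> PurePosixPath:
    """Hypothetical helper: relative folder for one record (Group/Street/House)."""
    return PurePosixPath(group_key(street)) / street / str(house)


print(photo_folder("128th St W", 7018))        # 120s/128th St W/7018
print(photo_folder("Edinborough Way", 12681))  # E streets/Edinborough Way/12681
```

Grouping by tens keeps numbered streets in a handful of folders instead of one folder per leading digit.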

In [None]:
import pandas as pd
import requests
import urllib.parse
import pathlib
import arcpy
import re
from tqdm.notebook import tqdm

"""
🗺️ IndexMapr
------------------------------------------------
IndexMapr is a geospatial utility designed for organizing field photos using structured data. 
Given a CSV, Excel file, or a table from an ArcGIS feature class, it builds a folder hierarchy by street
and house number, detects valid image URLs, and downloads those images into the appropriate subfolders.

Why it matters: In fieldwork and asset management, properly sorted images save time, reduce error,
and enhance reporting clarity. This tool bridges the gap between field data and usable outputs
in mapping workflows.

Features:
- Accepts multiple input formats (CSV, Excel, feature class)
- Detects image URL columns automatically
- Organizes outputs as: [Group] > [Street] > [House]
- Simple progress tracking with `tqdm`

❗ Note: Direct image hosting is preferred. Avoid platforms like Google Drive that serve HTML pages instead of raw image bytes.

Ideal for: planners, surveyors, field crews, GIS analysts, and anyone who has wrestled with folders full of unlabeled photos.
"""

"""
🧠 What This Tool Does — Plain English Summary
------------------------------------------------
This notebook lets you take a spreadsheet (or ArcGIS table) full of site photos and related info, and turn it
into an organized folder system. You pick where the output should go. It reads your table, figures out
which columns are images, and automatically downloads those photos into folders sorted by street and house.

For example, every house on "Emmer Place" gets its own folder inside an "Emmer Place" folder, which sits inside a parent folder called "E streets."
If the street name starts with a number, like "128th", it is grouped by tens, so it lands under "120s."

This saves time for teams that need quick access to field documentation — like utility inspections or tree inventory photos.
"""

# 🎯 USER INPUTS ----------------------------------------------------
input_path_raw = input("📁 Enter path to your input table (CSV, Excel, or feature class): ").strip()
input_path = input_path_raw.replace('\\', '/')  # Normalize for ArcGIS compatibility
output_dir = pathlib.Path(input("📂 Enter output folder path: ").strip())

# 📦 LOAD TABLE ------------------------------------------------------
"""
📄 Data Input Flexibility
------------------------------------------------
This tool supports multiple input types: CSV files, Excel workbooks (.xlsx or .xls),
and tables from ArcGIS geodatabases (feature classes). This flexibility makes it adaptable
for both GIS and non-GIS workflows.
"""

if input_path.lower().endswith('.csv'):
    df = pd.read_csv(input_path)
elif input_path.lower().endswith(('.xlsx', '.xls')):
    df = pd.read_excel(input_path)
elif '.gdb/' in input_path.lower():
    # Backslashes were already normalized to '/', so one separator covers both cases.
    split_at = input_path.lower().index('.gdb/') + len('.gdb')
    workspace, table_name = input_path[:split_at], input_path[split_at + 1:]
    arcpy.env.workspace = workspace
    print(f"🧭 Loading feature class from GDB: {table_name}")
    arr = arcpy.da.TableToNumPyArray(table_name, '*')
    df = pd.DataFrame({field: arr[field].tolist() for field in arr.dtype.names})
elif arcpy.Exists(input_path):
    print(f"🧭 Loading feature class from path: {input_path}")
    arr = arcpy.da.TableToNumPyArray(input_path, '*')
    df = pd.DataFrame({field: arr[field].tolist() for field in arr.dtype.names})
else:
    raise ValueError("❌ Couldn't load input table. Double-check your path.")

# 🧹 CLEANUP & CHECKS ------------------------------------------------
col_map = {col.lower(): col for col in df.columns}
if 'street' not in col_map or 'house' not in col_map:
    raise KeyError("❌ Required columns 'Street' and 'House' are missing.")
df.rename(columns={col_map['street']: 'Street', col_map['house']: 'House'}, inplace=True)
df['Street'] = df['Street'].astype(str)

url_cols = [col for col in df.columns if df[col].astype(str).str.lower().str.startswith(('http://', 'https://')).any()]
print("🔗 Detected URL columns:", url_cols)

# 🗺️ GROUP LOGIC ------------------------------------------------------
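def group_key(name):
    name = name.strip()
    m = re.match(r"(\d+)", name)
    if m:
        # Numbered streets are bucketed by tens: "128th St W" -> "120s"
        return f"{int(m.group(1)) // 10 * 10}s"
    if not name:
        # Blank street names would otherwise raise IndexError on name[0]
        return "Unknown streets"
    return f"{name[0].upper()} streets"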

# 📂 FOLDER SETUP -----------------------------------------------------
"""
📁 Folder Structure Logic
------------------------------------------------
This tool creates a two-level folder grouping before reaching the house number:
1. **Group Folder**: Based on the first letter (e.g., "E streets") or numeric tens (e.g., "120s") from the street name.
   - Streets like *Emmer Court* and *Edinborough Way* go under **E streets**.
   - Streets like *128th St W* go under **120s**.
2. **Street Folder**: One folder per unique street name.
3. **House Folder**: Each house number gets its own folder within the street folder.

Example:
  _OUTPUT_FOLDER/120s/128th St W/7018/...
  _OUTPUT_FOLDER/E streets/Edinborough Way/12681/...
"""

for g in set(df['Street'].apply(group_key)):
    (output_dir / g).mkdir(parents=True, exist_ok=True)

# 📸 DOWNLOAD PHOTOS --------------------------------------------------
"""
📸 Image Download Logic
------------------------------------------------
For each row, the tool checks each URL column for a valid link (starts with http/https).
Each image is saved into its associated house folder using the column name as the filename.
The file extension is taken from the URL path (like `.jpg` or `.png`). If the path has no
extension, `.jpg` is assumed. Keeping the column name in the filename makes each image traceable
to the data column it came from, which is useful for analysis, reporting, or auditing.
"""

print("\n🚀 Processing records and downloading images...")
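for row in tqdm(df.to_dict('records'), total=len(df), desc='Processing'):
    # Dict rows keep the original column names; itertuples() renames columns
    # that are not valid Python identifiers (e.g. "Photo URL"), which breaks getattr().
    street = row['Street']
    g = group_key(street)
    street_dir = output_dir / g / street
    street_dir.mkdir(parents=True, exist_ok=True)
    house_dir = street_dir / str(row['House'])
    house_dir.mkdir(exist_ok=True)

    for col in url_cols:
        url = row[col]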
        if isinstance(url, str) and url.lower().startswith(('http://', 'https://')):
            parsed = urllib.parse.urlparse(url)
            ext = pathlib.Path(parsed.path).suffix or '.jpg'
            filename = f"{col}{ext}"
            target = house_dir / filename
            try:
                r = requests.get(url, timeout=15)
                r.raise_for_status()
                with open(target, 'wb') as f:
                    f.write(r.content)
            except Exception as e:
                print(f"⚠️ Failed to download from {url}: {e}")

print("\n✅ All done! Your images are organized by group > street > house.")


📁 Enter path to your input table (CSV, Excel, or feature class):  C:\Users\leahe\Desktop\mgis\_SPRING25\GEOCOMP_GradPrj\GEOCOMP_GradPrj.gdb\DoFormsData
📂 Enter output folder path:  C:\Users\leahe\Desktop\mgis\_SPRING25\GEOCOMP_GradPrj\_OUTPUT_TESTING


🧭 Loading feature class from GDB: DoFormsData
🔗 Detected URL columns: ['meter_before_installation', 'meter_after_installation', 'sump_pump', 'sump_pump_discharge', 'customer_signature', 'minode']

🚀 Processing records and downloading images...


Processing:   0%|          | 0/10 [00:00<?, ?it/s]


✅ All done! Your images are organized by group > street > house.
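
The closing message prints even when a download fails or when a host returns an HTML page instead of image bytes (the Google Drive caveat noted in the notebook). One way to catch that case is to inspect the response before writing it to disk. The sketch below is an assumption about how such a check could look, not part of IndexMapr: `looks_like_image` is a hypothetical helper that takes the `Content-Type` header value and the first bytes of the body, and it only recognizes JPEG, PNG, and GIF signatures.

```python
# File signatures (magic numbers) for a few common image formats.
IMAGE_SIGNATURES = (
    b"\xff\xd8\xff",        # JPEG
    b"\x89PNG\r\n\x1a\n",   # PNG
    b"GIF87a",
    b"GIF89a",
)


def looks_like_image(content_type: str, body: bytes) -> bool:
    """Hypothetical check: True if the response plausibly holds image bytes."""
    # An explicit HTML content type is a strong sign of a viewer or login page.
    if content_type.lower().startswith("text/html"):
        return False
    # bytes.startswith accepts a tuple and matches any of the signatures.
    return body.startswith(IMAGE_SIGNATURES)


print(looks_like_image("image/png", b"\x89PNG\r\n\x1a\n" + b"\x00" * 8))  # True
print(looks_like_image("text/html; charset=utf-8", b"<!DOCTYPE html>"))   # False
```

In the download loop this could be called as `looks_like_image(r.headers.get("Content-Type", ""), r.content)` before opening the target file, printing a warning instead of saving when it returns `False`.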
