# CSV Table Ingestion

This notebook automates the ingestion of CSV files from a Unity Catalog volume into Delta tables.

**Process:**
* Scans a volume directory for CSV files
* Reads each CSV with automatic schema inference
* Creates a separate table for each file
* Overwrites any existing table that has the same name

In [0]:
# Define the source volume path
volume_path = "/Volumes/mkr_gcp_sandbox_euw3/default/source_vol/tables/"

# List all CSV files in the volume
csv_files = [f for f in dbutils.fs.ls(volume_path) if f.name.endswith('.csv')]

print(f"Found {len(csv_files)} CSV files to ingest:\n")

# Ingest each CSV into its own table
for file_info in csv_files:
    # Derive the table name: drop the 24-character timestamp prefix and the .csv extension
    table_name = file_info.name[24:-len('.csv')]
    file_path = file_info.path
    
    print(f"Ingesting: {file_info.name}")
    print(f"  → Table: {table_name}")
    
    # Read CSV with header and infer schema
    df = (
        spark.read.format("csv")
        .option("header", "true")
        .option("inferSchema", "true")
        .load(file_path)
    )
    
    # Write to table (overwrite mode replaces the table if it already exists)
    try:
        df.write.mode("overwrite").saveAsTable(table_name)
        row_count = df.count()
        print(f"  ✓ Wrote table with {row_count} rows\n")
    except Exception as e:
        # Report the failure, then re-raise so the run stops at the failing file
        print(f"  ✗ Error: {e}\n")
        raise

print("CSV ingestion process completed!")

## Ingestion Results

The cell above:
* Lists all CSV files found in the source volume
* Creates a table for each file, named after the filename minus the `.csv` extension and the timestamp prefix (see the note below)
* Uses `OVERWRITE` mode to replace existing data
* Reports row counts for each successfully created table
* Prints the error and re-raises it if a write fails, which stops the loop at the failing file
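
What happens to an existing table depends on the save mode passed to `df.write.mode(...)`. As a minimal sketch, the helper below maps a policy name to one of Spark's standard save modes. The policy names (`replace`, `skip`, `fail`, `append`) and the function `save_mode_for` are hypothetical and not part of the notebook; only the mode strings come from Spark.

```python
# Hypothetical policy names mapped to Spark's standard save modes.
SAVE_MODES = {
    "replace": "overwrite",    # replace the existing table's data
    "skip": "ignore",          # leave an existing table untouched, no error
    "fail": "errorifexists",   # raise if the table already exists
    "append": "append",        # add the new rows to the existing table
}


def save_mode_for(policy: str) -> str:
    """Return the Spark save mode for a policy name (names are illustrative)."""
    try:
        return SAVE_MODES[policy]
    except KeyError:
        raise ValueError(
            f"Unknown policy {policy!r}; expected one of {sorted(SAVE_MODES)}"
        ) from None


# In the ingestion loop this could be used as, for example:
#   df.write.mode(save_mode_for("skip")).saveAsTable(table_name)
```

The notebook as written corresponds to the `replace` policy. Switching to `skip` or `fail` would be the way to protect tables that already hold data.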

**Note:** Table names are derived from the CSV filenames by dropping the first 24 characters, which are a timestamp prefix, and the `.csv` extension.
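
The naming rule in the note can be sketched as a small standalone function, which makes it easy to check against sample filenames before running the ingestion. The function name `table_name_from_filename`, the constant `PREFIX_LEN`, and the character replacement step are assumptions for illustration. The notebook itself only slices the string, and the example filename uses a placeholder prefix because the real timestamp format is not shown here.

```python
import re

PREFIX_LEN = 24  # length of the timestamp prefix, per the note above


def table_name_from_filename(filename: str, prefix_len: int = PREFIX_LEN) -> str:
    """Derive a table name from a CSV filename (hypothetical helper)."""
    # Remove only a trailing ".csv", not every occurrence of the substring.
    stem = filename[:-4] if filename.lower().endswith(".csv") else filename
    name = stem[prefix_len:]
    if not name:
        raise ValueError(
            f"{filename!r} has nothing left after the {prefix_len}-character prefix"
        )
    # Assumption: replace anything other than letters, digits, and underscores
    # so the result can be used as an unquoted identifier.
    return re.sub(r"[^0-9A-Za-z_]", "_", name)


# Placeholder 24-character prefix standing in for the real timestamp.
example = "0" * PREFIX_LEN + "orders-2024.csv"
print(table_name_from_filename(example))  # orders_2024
```

Raising on an empty result catches files whose names are shorter than the prefix, which would otherwise produce an empty table name.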