# Unity Catalog External Locations - Quick Reference

## What is an External Location?

An **External Location** is a Unity Catalog object that:
* **Registers** an external storage path (e.g., Azure ADLS, AWS S3) with Unity Catalog
* **Pairs** that path with a storage credential, so Unity Catalog can authorize access to the underlying cloud storage
* **Does NOT create or modify** the actual storage - it only registers metadata

**Analogy**: It is like giving Unity Catalog the address of a house you already own (the storage path) and a key to it (the storage credential).

---

## When Do You Need External Locations?

### ✅ Required for External Tables
```python
# Writing to a custom storage path: the path must be covered by an External Location first
df.write.option("path", "abfss://raw@storage.dfs.core.windows.net/points") \
    .saveAsTable("main.demo.points_raw")
```

```sql
-- SQL with a LOCATION clause: the path must be covered by an External Location first
CREATE TABLE main.demo.points_raw
USING DELTA
LOCATION 'abfss://raw@storage.dfs.core.windows.net/points';
```

### ❌ NOT Required for Managed Tables
```python
# No path specified = Unity Catalog manages storage automatically
df.write.saveAsTable("main.demo.points_raw")
```

---

## Setup Process (One-Time)

### Step 1: Check Existing Storage Credentials
```sql
SHOW STORAGE CREDENTIALS;
```

### Step 2: Check Existing External Locations
```sql
SHOW EXTERNAL LOCATIONS;
```

### Step 3: Create External Location (if needed)
```sql
CREATE EXTERNAL LOCATION IF NOT EXISTS raw_container
URL 'abfss://raw@trimblegeospatialdemo.dfs.core.windows.net/'
WITH (STORAGE CREDENTIAL `trimble_adls_credential`)
COMMENT 'Raw point cloud data storage';
```

### Step 4: Verify Creation
```sql
DESCRIBE EXTERNAL LOCATION raw_container;
```
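Steps 3 and 4 can also be scripted, which helps when several containers need registering. The following is a minimal Python sketch, not an official API: the helper names (`quote_identifier`, `quote_string`, `create_external_location_sql`) are hypothetical, and the string escaping assumes Spark SQL's default backslash-escape behavior for string literals. It only builds the statement text; running it is left to the caller.

```python
from typing import Optional


def quote_identifier(name: str) -> str:
    """Backtick-quote an identifier, doubling any embedded backtick."""
    return "`" + name.replace("`", "``") + "`"


def quote_string(value: str) -> str:
    """Single-quote a string literal, escaping backslashes and single quotes.

    Assumption: the SQL parser uses its default backslash-escape rules.
    """
    escaped = value.replace("\\", "\\\\").replace("'", "\\'")
    return "'" + escaped + "'"


def create_external_location_sql(
    name: str, url: str, credential: str, comment: Optional[str] = None
) -> str:
    """Build an idempotent CREATE EXTERNAL LOCATION statement (hypothetical helper)."""
    lines = [
        f"CREATE EXTERNAL LOCATION IF NOT EXISTS {quote_identifier(name)}",
        f"URL {quote_string(url)}",
        f"WITH (STORAGE CREDENTIAL {quote_identifier(credential)})",
    ]
    # COMMENT is optional in the statement, so only emit it when provided.
    if comment is not None:
        lines.append(f"COMMENT {quote_string(comment)}")
    return "\n".join(lines)


# Same values as the Step 3 example above.
statement = create_external_location_sql(
    name="raw_container",
    url="abfss://raw@trimblegeospatialdemo.dfs.core.windows.net/",
    credential="trimble_adls_credential",
    comment="Raw point cloud data storage",
)
print(statement)
# In a Databricks notebook, where `spark` is predefined: spark.sql(statement)
```

Because the statement uses `IF NOT EXISTS`, re-running the script should leave an existing location with the same name in place.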

---

## External Table vs Managed Table

| Feature | External Table | Managed Table |
|---------|----------------|---------------|
| **Storage Control** | You control path | Unity Catalog manages |
| **External Location** | ✅ Required | ❌ Not required |
| **DROP TABLE** | Deletes metadata only | Deletes data + metadata |
| **Use Case** | Shared storage, existing data | Standard data warehouse |
| **Setup Complexity** | Higher (need External Location) | Lower (automatic) |

---

## Production Workflow

### One-Time Setup (Admin)
```sql
-- Register all storage containers as External Locations
CREATE EXTERNAL LOCATION raw_container
  URL 'abfss://raw@storage.dfs.core.windows.net/' ...;

CREATE EXTERNAL LOCATION processed_container
  URL 'abfss://processed@storage.dfs.core.windows.net/' ...;

CREATE EXTERNAL LOCATION aggregated_container
  URL 'abfss://aggregated@storage.dfs.core.windows.net/' ...;
```

### Daily Development (Data Engineers)
```python
# ✅ Write directly to paths under a registered External Location
df.write.option("path", "abfss://raw@.../points") \
    .saveAsTable("main.demo.points_raw")

df.write.option("path", "abfss://processed@.../bronze") \
    .saveAsTable("main.demo.bronze_points")
```
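Before writing to a path, it can be useful to check which registered External Location covers it. Unity Catalog performs the real check itself; the sketch below only illustrates the idea as a longest-prefix match. The function names and the hard-coded `LOCATIONS` dictionary are assumptions for illustration (in practice the mapping could be populated from the output of `SHOW EXTERNAL LOCATIONS`).

```python
from typing import Dict, Optional

# Hypothetical snapshot of registered locations: name -> URL.
LOCATIONS: Dict[str, str] = {
    "raw_container": "abfss://raw@storage.dfs.core.windows.net/",
    "processed_container": "abfss://processed@storage.dfs.core.windows.net/",
    "aggregated_container": "abfss://aggregated@storage.dfs.core.windows.net/",
}


def normalize(url: str) -> str:
    """Ensure exactly one trailing slash so prefixes match on path boundaries."""
    return url.rstrip("/") + "/"


def find_external_location(path: str, locations: Dict[str, str]) -> Optional[str]:
    """Return the name of the most specific location covering `path`, or None."""
    target = normalize(path)
    best_name: Optional[str] = None
    best_len = -1
    for name, url in locations.items():
        prefix = normalize(url)
        # Prefer the longest matching prefix when locations are nested.
        if target.startswith(prefix) and len(prefix) > best_len:
            best_name, best_len = name, len(prefix)
    return best_name


covering = find_external_location(
    "abfss://raw@storage.dfs.core.windows.net/points", LOCATIONS
)
if covering is None:
    print("No External Location covers this path; ask an admin to register one.")
else:
    print(f"Path is covered by External Location: {covering}")
```

Normalizing both sides to a trailing slash keeps a container such as `rawdata` from being mistaken for a path under `raw`.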

---

## Key Takeaways

1. **External Location = Registration, NOT Creation**
   * Your storage already exists
   * External Location just tells Unity Catalog "this path is trusted"

2. **One-Time Setup**
   * Register External Locations once
   * Reuse them for every table under that path (users still need privileges on the location, such as `CREATE EXTERNAL TABLE`)

3. **Rule of Thumb**
   * Using `LOCATION` or `.option("path")` → Need External Location
   * No path specified → Managed Table (no External Location needed)

4. **Current Setup (Trimble Geospatial Demo)**
   * ✅ `raw_container` → `abfss://raw@trimblegeospatialdemo.dfs.core.windows.net/`
   * ✅ `processed_container` → `abfss://processed@...`
   * ✅ `aggregated_container` → `abfss://aggregated@...`
   * ✅ Storage Credential: `trimble_adls_credential`

---

## Common Commands

```sql
-- List all external locations
SHOW EXTERNAL LOCATIONS;

-- Get details of a specific location
DESCRIBE EXTERNAL LOCATION raw_container;

-- Check permissions
SHOW GRANTS ON EXTERNAL LOCATION raw_container;

-- Drop external location (does NOT delete storage data)
DROP EXTERNAL LOCATION IF EXISTS old_location;
```
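`SHOW GRANTS` only reports permissions; engineers who create external tables also need privileges granted on the location. The sketch below builds a `GRANT` statement for a small subset of external location privileges. The helper name `grant_sql`, the `ALLOWED_PRIVILEGES` set, and the `data_engineers` group are hypothetical, and the set deliberately lists only three privileges rather than everything Unity Catalog supports.

```python
from typing import Iterable

# Subset of privileges that can be granted on an external location.
ALLOWED_PRIVILEGES = {"READ FILES", "WRITE FILES", "CREATE EXTERNAL TABLE"}


def grant_sql(location: str, principal: str, privileges: Iterable[str]) -> str:
    """Build a GRANT ... ON EXTERNAL LOCATION statement (hypothetical helper)."""
    requested = [p.strip().upper() for p in privileges]
    if not requested:
        raise ValueError("At least one privilege is required")
    unknown = [p for p in requested if p not in ALLOWED_PRIVILEGES]
    if unknown:
        raise ValueError(f"Unsupported privileges in this sketch: {unknown}")
    # Backtick-quote names, doubling embedded backticks.
    loc = "`" + location.replace("`", "``") + "`"
    who = "`" + principal.replace("`", "``") + "`"
    return f"GRANT {', '.join(requested)} ON EXTERNAL LOCATION {loc} TO {who}"


# `data_engineers` is a placeholder group name.
statement = grant_sql(
    "raw_container", "data_engineers", ["CREATE EXTERNAL TABLE", "READ FILES"]
)
print(statement)
# In a Databricks notebook, where `spark` is predefined: spark.sql(statement)
```

Validating against an explicit allow-list keeps typos from reaching the SQL engine and makes the granted scope easy to review.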