# Delta Sharing: Secure Data Sharing Across Organizations

**Delta Sharing** is an open protocol developed by Databricks for secure data sharing with other organizations regardless of the computing platforms they use. It allows you to share data "live" without copying it to another system.

### Key Features:
1.  **Open Protocol:** It is not limited to Databricks. You can share data with users using Power BI, Tableau, Pandas, Apache Spark, or any system that supports the Delta Sharing open protocol.
2.  **Live Data:** It shares the data as it exists in your Data Lake. There is no replication or movement of data.
3.  **Secure & Governed:** Managed centrally via **Unity Catalog**. You can audit who is accessing what data.
4.  **Cross-Cloud:** The provider and recipient can be on different clouds, for example AWS to Azure or Azure to GCP.

### Two Modes of Sharing
1.  **Databricks-to-Databricks:**
    *   Optimized experience.
    *   The recipient is another Databricks Workspace (even in a different account/cloud).
    *   Authentication is handled automatically via **Sharing Identifiers**.
2.  **Open Sharing (Token-Based):**
    *   For recipients not using Databricks.
    *   Uses a secure token and a credential file (`.share` profile).
    *   The recipient uses an open-source client (e.g., Python `delta-sharing` library) to read data.

---

### Core Terminology in Unity Catalog
| Object | Description |
| :--- | :--- |
| **Provider** | The entity sharing the data. In the recipient's workspace, this object represents the source of the data. |
| **Recipient** | The entity receiving the data. In the provider's workspace, this object represents who you are sharing with. |
| **Share** | A logical container (like a folder) that groups the tables, views, and volumes you want to share. |

## Prerequisites & Setup

### 1. Permissions
To manage Delta Sharing, you typically need **Metastore Admin** privileges or specific delegations:
*   `CREATE SHARE`
*   `CREATE RECIPIENT`
*   `CREATE PROVIDER`

### 2. Enable Delta Sharing on Metastore
If you are sharing data **outside** your Databricks account (e.g., to an external client or a different organization), you must explicitly enable Delta Sharing at the Metastore level.

**Steps:**
1.  Go to **Account Console** -> **Catalog**.
2.  Select your Metastore.
3.  Check **"Allow Delta Sharing with parties outside your organization"**.
4.  Set a default token lifetime (optional).

## Workflow: Databricks-to-Databricks Sharing

In this scenario, we share data from a **Provider Workspace (Source)** to a **Recipient Workspace (Target)**.

### Step 1: Provider Creates a "Share"
First, the data provider creates a Share object and adds assets (Tables/Volumes) to it.

```sql
-- 1. Create a Share
CREATE SHARE IF NOT EXISTS bronze_share
COMMENT 'Share containing raw bronze data tables';

-- 2. Add Tables to the Share
-- Note: You need SELECT permission on the table to add it.
ALTER SHARE bronze_share ADD TABLE dev.bronze.sales_raw;
ALTER SHARE bronze_share ADD TABLE dev.bronze.customers;

-- You can also share Volumes (Files)
ALTER SHARE bronze_share ADD VOLUME dev.bronze.raw_files_volume;

-- Check what is inside the share
SHOW ALL IN SHARE bronze_share;
```

### Step 2: Get Recipient's Sharing Identifier
To securely link the two metastores, the **Recipient** must provide their unique identifier.

**On the Recipient Workspace:**
1.  Go to **Catalog Explorer** -> **Delta Sharing**.
2.  Click the Organization name at the top right.
3.  Copy the **Sharing Identifier**.
    *   Format looks like: `aws:region:guid` or `azure:region:guid`.
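
Because the identifier is a colon-separated string, the provider can sanity-check it before running `CREATE RECIPIENT`. The following is a minimal sketch, assuming the three-part `cloud:region:guid` layout shown above. The `parse_sharing_identifier` helper and the set of accepted cloud prefixes are illustrative assumptions, not part of any Databricks API.

```python
import uuid
from typing import NamedTuple


class SharingIdentifier(NamedTuple):
    cloud: str
    region: str
    guid: str


# Assumption: the prefixes shown in this guide plus "gcp"; adjust as needed.
KNOWN_CLOUDS = {"aws", "azure", "gcp"}


def parse_sharing_identifier(raw: str) -> SharingIdentifier:
    """Split a 'cloud:region:guid' string and validate each part."""
    parts = raw.strip().split(":")
    if len(parts) != 3:
        raise ValueError(f"expected 3 colon-separated parts, got {len(parts)}")
    cloud, region, guid = parts
    if cloud not in KNOWN_CLOUDS:
        raise ValueError(f"unknown cloud prefix: {cloud!r}")
    if not region:
        raise ValueError("region must not be empty")
    # uuid.UUID raises ValueError when the last part is not a valid UUID.
    guid = str(uuid.UUID(guid))
    return SharingIdentifier(cloud, region, guid)


ident = parse_sharing_identifier(
    "azure:centralindia:908b9a77-1b41-411a-8b54-424eb0574d6d"
)
print(ident.cloud, ident.region, ident.guid)
```

A check like this catches copy-paste mistakes (a missing segment, stray whitespace) early, but it cannot confirm that the identifier belongs to a real metastore.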

### Step 3: Provider Creates a "Recipient"
The provider registers the recipient using the identifier obtained in the previous step.

```sql
-- Replace the string below with the actual Identifier provided by the Recipient
CREATE RECIPIENT IF NOT EXISTS external_marketing_team
USING ID 'azure:centralindia:908b9a77-1b41-411a-8b54-424eb0574d6d'
COMMENT 'External marketing agency on Azure Databricks';
```

### Step 4: Grant Access
Now, the Provider grants the Recipient access to the specific Share.

```sql
GRANT SELECT ON SHARE bronze_share TO RECIPIENT external_marketing_team;

-- Verify the grant
SHOW GRANTS ON SHARE bronze_share;
```
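
The provider-side steps (create the share, add tables, register the recipient, grant access) can also be scripted from a Python notebook cell. The sketch below only assembles the SQL statements used in this workflow. `provider_statements` is a hypothetical helper, and executing the statements is assumed to go through the `spark.sql` call available in a Databricks notebook.

```python
from typing import List


def provider_statements(
    share: str, tables: List[str], recipient: str, sharing_id: str
) -> List[str]:
    """Return the provider-side SQL for a Databricks-to-Databricks share."""
    stmts = [f"CREATE SHARE IF NOT EXISTS {share}"]
    # One ALTER SHARE statement per table, in the order given.
    stmts += [f"ALTER SHARE {share} ADD TABLE {table}" for table in tables]
    stmts.append(
        f"CREATE RECIPIENT IF NOT EXISTS {recipient} USING ID '{sharing_id}'"
    )
    stmts.append(f"GRANT SELECT ON SHARE {share} TO RECIPIENT {recipient}")
    return stmts


statements = provider_statements(
    share="bronze_share",
    tables=["dev.bronze.sales_raw", "dev.bronze.customers"],
    recipient="external_marketing_team",
    sharing_id="azure:centralindia:908b9a77-1b41-411a-8b54-424eb0574d6d",
)
for stmt in statements:
    print(stmt)
    # In a Databricks notebook (assumption): spark.sql(stmt)
```

The names are interpolated directly into the SQL text, so this is only suitable for trusted, hard-coded values. Rerunning it may also fail at an `ALTER SHARE ... ADD TABLE` step if the table is already in the share.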

---
## Recipient Side: Accessing the Data

Once the Provider has granted access, the **Recipient** will see the Provider in their workspace under **Delta Sharing > Shared with me**.

To query the data, the Recipient must create a **Catalog** from the Share.

```sql
-- 1. View available providers
SHOW PROVIDERS;

-- 2. View shares available from a specific provider
SHOW SHARES IN PROVIDER `ease-with-data-provider`;

-- 3. Create a Catalog from the Share
-- This makes the shared data appear like any other catalog in Unity Catalog
CREATE CATALOG IF NOT EXISTS shared_bronze_data
USING SHARE `ease-with-data-provider`.bronze_share;

-- 4. Query the data (catalog.schema.table; the schema name comes from the provider)
SELECT * FROM shared_bronze_data.bronze.sales_raw LIMIT 10;
```
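
A catalog created from a share replaces only the catalog part of the name; the schema and table names carry over from the provider, assuming the table was added to the share without an alias. A small sketch of that mapping follows. `recipient_table_name` is a hypothetical helper written for this illustration.

```python
def recipient_table_name(provider_table: str, recipient_catalog: str) -> str:
    """Map a provider's catalog.schema.table name to the recipient's catalog."""
    parts = provider_table.split(".")
    if len(parts) != 3:
        raise ValueError(f"expected catalog.schema.table, got {provider_table!r}")
    _, schema, table = parts
    # Only the catalog changes; schema and table keep the provider's names.
    return f"{recipient_catalog}.{schema}.{table}"


print(recipient_table_name("dev.bronze.sales_raw", "shared_bronze_data"))
# prints: shared_bronze_data.bronze.sales_raw
```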

## Workflow: Open Sharing (Non-Databricks Recipient)

If the recipient is not on Databricks (for example, they use open-source Apache Spark, Pandas, or Power BI), the flow changes at the **Recipient Creation** step.

1.  **Create Recipient (Token Mode):** Omit the `USING ID` clause and create the recipient by name only.
2.  **Generate Activation Link:** Databricks generates a URL to download a credential file (`config.share`).
3.  **Share File:** You securely send this file to the recipient.

**SQL for Open Sharing Recipient:**

```sql
-- 1. Create Recipient without an ID (implies Token based)
CREATE RECIPIENT IF NOT EXISTS powerbi_users;

-- 2. Grant access to share
GRANT SELECT ON SHARE bronze_share TO RECIPIENT powerbi_users;

-- 3. Retrieve the activation link to send to the user
DESCRIBE RECIPIENT powerbi_users;
-- The output contains an 'activation_link'. Open this URL to download the credential file.
```
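
The downloaded credential file is a small JSON document, so the recipient can inspect it before connecting. The sketch below parses one and reports whether its token has expired. The field names (`shareCredentialsVersion`, `endpoint`, `bearerToken`, `expirationTime`) follow the open Delta Sharing profile format; the sample values and the `check_profile` helper are made up for illustration.

```python
import json
from datetime import datetime, timezone
from typing import Optional


def check_profile(profile_text: str, now: Optional[datetime] = None) -> dict:
    """Parse a Delta Sharing profile and flag an expired bearer token."""
    profile = json.loads(profile_text)
    for key in ("shareCredentialsVersion", "endpoint", "bearerToken"):
        if key not in profile:
            raise ValueError(f"profile is missing required field: {key}")
    now = now or datetime.now(timezone.utc)
    expired = False
    expires = profile.get("expirationTime")  # optional field
    if expires:
        # Older Python versions do not accept a trailing "Z" in fromisoformat.
        expiry = datetime.fromisoformat(expires.replace("Z", "+00:00"))
        if expiry.tzinfo is None:
            # Assumption: a timestamp without an offset is in UTC.
            expiry = expiry.replace(tzinfo=timezone.utc)
        expired = expiry <= now
    # The bearer token is deliberately left out of the result.
    return {"endpoint": profile["endpoint"], "expired": expired}


# Hypothetical profile contents; a real file comes from the activation link.
sample = json.dumps({
    "shareCredentialsVersion": 1,
    "endpoint": "https://example.com/delta-sharing/",
    "bearerToken": "<redacted>",
    "expirationTime": "2030-01-01T00:00:00Z",
})
result = check_profile(sample, now=datetime(2025, 1, 1, tzinfo=timezone.utc))
print(result)
```

Since the file contains the bearer token, it should be stored and transmitted like any other secret.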

### Python Example: Reading Open Share
The recipient can use Python to read the data using the downloaded profile file.

```python
# pip install delta-sharing

import delta_sharing

# Path to the downloaded profile file
profile_file = "config.share"

# Create a Sharing Client
client = delta_sharing.SharingClient(profile_file)

# List all shared tables
print(client.list_all_tables())

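# Load a specific table as a Pandas DataFrame.
# Table URL format: <profile-path>#<share>.<schema>.<table>
table_url = profile_file + "#bronze_share.bronze.sales_raw"
df = delta_sharing.load_as_pandas(table_url)

# display() exists only in Databricks notebooks; print() works in any Python environment
print(df.head())
```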

## Summary
*   **Delta Sharing** gives recipients live access to data without ETL or copying.
*   **Databricks-to-Databricks** sharing links two Unity Catalog metastores through a sharing identifier, with no tokens to manage.
*   **Open Sharing** uses tokens/profile files for broad compatibility.
*   The data remains in the Provider's storage account; the Recipient only gets read access.