
<div style="text-align: center; line-height: 0; padding-top: 9px;">
  <img
    src="https://databricks.com/wp-content/uploads/2018/03/db-academy-rgb-1200px.png"
    alt="Databricks Learning"
  >
</div>

# 2.1 DEMO: Implementing Delta Sharing (Databricks-to-Databricks)

## Overview
In this demo, we will implement Delta Sharing to securely share data between Databricks workspaces. This demonstration is split into two parts:

**Provider Notebook:** We will create a Unity Catalog with sample customer and sales data, configure Delta Sharing by creating a share, set up a recipient organization, and generate an activation link for secure access.

**Recipient Notebook (This Notebook):** The receiving organization (using Databricks) mounts the share as a local catalog, queries the shared tables directly without copying data, and performs analytics on the shared datasets.

This demo showcases the Databricks-to-Databricks (D2D) sharing pattern, where both the provider and recipient use Databricks workspaces. You'll see how Delta Sharing enables secure, live data sharing without data duplication or complex ETL processes.

## Background

**Scenario:**
You work in the __West Division__ of a fictitious company, __*"Acme Corp"*__. An administrator in the __East Division__ has shared their customer data with you. In this notebook, you'll set up the recipient side of Delta Sharing by mounting the Delta Share to a catalog, which gives you immediate access without copying or moving data.

In [0]:
%run ./Includes/Demo-Setup-2

## Step 1: View Available Providers

Execute the command below to show all of the providers you have access to on the metastore assigned to your workspace.  

⚠️ The recipient user mounting the share needs to have the `USE PROVIDER` privilege for the provider OR needs to be a __Metastore Admin__ for the metastore assigned to the recipient workspace.

In [0]:
-- show all providers available (that the calling user has USE PROVIDER privilege on)
SHOW PROVIDERS;

In [0]:
-- describe a provider
DESCRIBE PROVIDER acme_corp;

## Step 2: Show the Shares Available in the Provider

Execute the command below to show all of the shares available from the provider.  

In [0]:
-- show all of the shares available in a provider
SHOW SHARES IN PROVIDER acme_corp;

## Step 3: Mount the Share to a Catalog

On the **recipient side**, a privileged user must create a catalog from the share to make the shared data accessible. This catalog acts as a container for the shared tables and allows you to query them using standard SQL.

**Permissions required (recipient side):**
- Metastore admin, OR
- User with both `CREATE CATALOG` and `USE PROVIDER` privileges, OR  
- User with `CREATE CATALOG` privilege and ownership of the provider object
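
The three alternatives above can be written as a single boolean check. This is a minimal sketch for reasoning about the rule, not a Databricks API: `can_mount_share` and its flag names are hypothetical, and the actual decision is made by Unity Catalog when the `CREATE CATALOG ... USING SHARE` statement runs.

```python
def can_mount_share(
    is_metastore_admin: bool = False,
    has_create_catalog: bool = False,
    has_use_provider: bool = False,
    owns_provider: bool = False,
) -> bool:
    """Return True if the flags satisfy one of the three listed alternatives."""
    # Alternative 1: metastore admins can always mount the share.
    if is_metastore_admin:
        return True
    # Alternatives 2 and 3: CREATE CATALOG plus either USE PROVIDER
    # or ownership of the provider object.
    return has_create_catalog and (has_use_provider or owns_provider)


print(can_mount_share(has_create_catalog=True, has_use_provider=True))
print(can_mount_share(has_use_provider=True))
```

The second call returns `False` because `USE PROVIDER` on its own is not enough to create the catalog.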

The catalog created from a share has a catalog type of "Delta Sharing" and provides read-only access to the shared tables. All shared data appears under the standard three-level namespace: `catalog.schema.table`.

**Note:** If the recipient workspace is using Databricks Community Edition (Free), these permissions are automatically granted.


In [0]:
-- mount the remote share to a local catalog (recipient side)
-- this creates a Delta Sharing catalog that provides read-only access to shared tables
CREATE CATALOG IF NOT EXISTS recipient_retail_catalog
USING SHARE acme_corp.internal_retail;
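
When the catalog, provider, or share name comes from a variable, the same statement can be assembled in Python. The sketch below only builds the SQL text; `quote_ident` and `mount_share_statement` are hypothetical helper names, and in a notebook the result would typically be passed to `spark.sql`.

```python
def quote_ident(name: str) -> str:
    """Wrap an identifier in backticks, doubling any embedded backtick."""
    return "`" + name.replace("`", "``") + "`"


def mount_share_statement(catalog: str, provider: str, share: str) -> str:
    """Build the CREATE CATALOG ... USING SHARE statement for one share."""
    return (
        f"CREATE CATALOG IF NOT EXISTS {quote_ident(catalog)}\n"
        f"USING SHARE {quote_ident(provider)}.{quote_ident(share)}"
    )


# Names taken from the cell above.
print(mount_share_statement("recipient_retail_catalog", "acme_corp", "internal_retail"))
```

Quoting each part separately keeps names that contain hyphens or other special characters valid.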

You will see your mounted share under __Delta Shares Received__ in the __Catalog__ view or in __Catalog Explorer__. The schema(s), __retail__ in this case, are inherited from the Delta Share.
<br />
<br />
<div style="text-align: center; line-height: 0; padding-top: 9px;">
  <img
    src="https://github.com/stackql/databricks-data-sharing-and-collaboration/blob/main/images/recipient%20-%20delta%20shares%20received.png?raw=true"
    alt="Delta Shares Received"
  >
</div>
<br />
<br />

# 2.1 DEMO: Implementing Delta Sharing (Databricks-to-Databricks)

## Recipient Workspace Setup

**Learning Objectives:**
- Activate a Delta Share using an activation link
- Create a catalog from the shared data
- Query shared tables
- Understand recipient permissions and limitations

**Scenario:**
You are a data analyst at a partner organization who has been granted access to customer and sales data via Delta Sharing.

In [0]:
%run ../Includes/00-recipient-setup

## Step 1: Activate the Share

**Prerequisites:** You must have received an activation link from the provider.

The activation link looks like:
```
https://accounts.cloud.databricks.com/activation?activationCode=...
```

**To activate:**
1. Click on the activation link in your browser
2. Log in to your Databricks workspace
3. Accept the share
4. Note the share name (you'll use it to create a catalog)
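
Before opening a link, it can help to confirm that it has the expected shape. The sketch below is an assumption-laden sanity check, not part of Delta Sharing: `activation_code` is a hypothetical helper, and the only things it verifies are the `https` scheme, a path ending in `/activation`, and the presence of an `activationCode` query parameter, as in the example URL above. The value `PLACEHOLDER` is made up.

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse


def activation_code(link: str) -> Optional[str]:
    """Return the activationCode value, or None if the link looks wrong."""
    parsed = urlparse(link)
    if parsed.scheme != "https" or not parsed.path.endswith("/activation"):
        return None
    codes = parse_qs(parsed.query).get("activationCode", [])
    return codes[0] if codes else None


# PLACEHOLDER stands in for a real code, which should never be hard-coded.
link = "https://accounts.cloud.databricks.com/activation?activationCode=PLACEHOLDER"
print(activation_code(link))
```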

## Step 2: Create a Catalog from the Share

After activating the share, create a catalog in your workspace that references the shared data.

In [0]:
%sql
-- Create a catalog from the activated share.
-- Replace <provider-workspace-name> with the actual provider workspace name
-- and make sure ${c.share_name} resolves to the share name from the activation.
-- Example: `e2-demo-west`.user_customer_share
CREATE CATALOG IF NOT EXISTS ${c.recipient_catalog}
USING SHARE `<provider-workspace-name>`.${c.share_name}
COMMENT 'Catalog for accessing shared customer and sales data';

In [0]:
%sql
-- Verify the catalog was created
SHOW CATALOGS LIKE '${c.recipient_catalog}';

## Step 3: Explore the Shared Data

Let's explore what's available in the shared catalog.

In [0]:
%sql
-- List all schemas in the shared catalog
SHOW SCHEMAS IN ${c.recipient_catalog};

In [0]:
%sql
-- List all tables in the shared schema
SHOW TABLES IN ${c.shared_schema};

## Step 4: Query Shared Tables

Now let's query the shared data. Note that you have read-only access.

In [0]:
%sql
-- Query the shared customers table
SELECT * FROM ${c.shared_customers}
ORDER BY customer_id;

In [0]:
%sql
-- Query the shared sales transactions table
SELECT * FROM ${c.shared_sales}
ORDER BY transaction_date DESC;

## Step 5: Perform Analytics on Shared Data

Let's run some analytical queries to demonstrate the power of Delta Sharing.

In [0]:
%sql
-- Customer segmentation analysis
SELECT 
  customer_segment,
  COUNT(*) as customer_count,
  COUNT(DISTINCT country) as countries
FROM ${c.shared_customers}
GROUP BY customer_segment
ORDER BY customer_count DESC;

In [0]:
%sql
-- Sales analysis by customer
SELECT 
  c.customer_name,
  c.customer_segment,
  c.country,
  COUNT(s.transaction_id) as total_transactions,
  SUM(s.total_amount) as total_revenue
FROM ${c.shared_customers} c
JOIN ${c.shared_sales} s ON c.customer_id = s.customer_id
GROUP BY c.customer_name, c.customer_segment, c.country
ORDER BY total_revenue DESC;

In [0]:
%sql
-- Revenue by region and product
SELECT 
  region,
  product_name,
  SUM(quantity) as total_quantity,
  SUM(total_amount) as total_revenue
FROM ${c.shared_sales}
GROUP BY region, product_name
ORDER BY region, total_revenue DESC;

## Step 6: Create Local Views or Tables

Recipients can create local views or materialize shared data into their own tables for further analysis.

In [0]:
%python
# Create a local schema for our analysis
local_catalog = "main"  # or your default catalog
local_schema = f"{username}_analysis"

spark.sql(f"CREATE SCHEMA IF NOT EXISTS {local_catalog}.{local_schema}")
print(f"Created schema: {local_catalog}.{local_schema}")
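
The cell above assumes `username` is already safe to use inside a schema name. If it is ever derived from an email address (an assumption, since the setup notebook is not shown here), a small normalization step avoids invalid identifiers. `schema_name_for` is a hypothetical helper and the sample address is made up.

```python
import re


def schema_name_for(username: str, suffix: str = "analysis") -> str:
    """Turn a user name or email address into a lowercase schema name."""
    # Keep only the part before "@", then replace anything that is not
    # a lowercase letter, digit, or underscore.
    base = username.split("@")[0].lower()
    safe = re.sub(r"[^a-z0-9_]", "_", base)
    return f"{safe}_{suffix}"


print(schema_name_for("Jane.Doe@example.com"))  # jane_doe_analysis
```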

In [0]:
%sql
-- Create a view combining shared data
CREATE OR REPLACE VIEW main.${c.username}_analysis.customer_revenue_summary AS
SELECT 
  c.customer_id,
  c.customer_name,
  c.customer_segment,
  c.country,
  COUNT(s.transaction_id) as transaction_count,
  SUM(s.total_amount) as lifetime_value,
  MAX(s.transaction_date) as last_purchase_date
FROM ${c.shared_customers} c
LEFT JOIN ${c.shared_sales} s ON c.customer_id = s.customer_id
GROUP BY c.customer_id, c.customer_name, c.customer_segment, c.country;
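
The view uses a `LEFT JOIN`, so customers without any sales still appear in the summary. The sketch below reproduces that behavior on a tiny in-memory dataset. It uses SQLite from the Python standard library as a stand-in for Databricks SQL, the rows are invented, and the column list is reduced to the ones needed for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customers (customer_id INTEGER, customer_name TEXT);
    CREATE TABLE sales (transaction_id INTEGER, customer_id INTEGER,
                        total_amount REAL, transaction_date TEXT);

    -- Customer 2 has no sales on purpose.
    INSERT INTO customers VALUES (1, 'Alpha'), (2, 'Beta');
    INSERT INTO sales VALUES (10, 1, 50.0, '2024-01-05'),
                             (11, 1, 25.0, '2024-02-01');

    CREATE VIEW customer_revenue_summary AS
    SELECT c.customer_id,
           c.customer_name,
           COUNT(s.transaction_id) AS transaction_count,
           SUM(s.total_amount)     AS lifetime_value,
           MAX(s.transaction_date) AS last_purchase_date
    FROM customers c
    LEFT JOIN sales s ON c.customer_id = s.customer_id
    GROUP BY c.customer_id, c.customer_name;
    """
)

rows = conn.execute(
    "SELECT * FROM customer_revenue_summary ORDER BY customer_id"
).fetchall()
for row in rows:
    print(row)
```

Customer 2 is returned with a `transaction_count` of 0 and `NULL` (`None` in Python) for `lifetime_value` and `last_purchase_date`. With an inner `JOIN`, as in the Step 5 sales analysis, that customer would be dropped from the result.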

## Step 7: Understanding Limitations

As a recipient, there are some limitations to be aware of:

In [0]:
%sql
-- This will FAIL - recipients have read-only access
-- INSERT INTO ${c.shared_customers} VALUES (999, 'Test', 'test@test.com', 'USA', current_date(), 'Test');

In [0]:
%sql
-- This will FAIL - cannot modify shared tables
-- UPDATE ${c.shared_customers} SET customer_segment = 'VIP' WHERE customer_id = 1;

In [0]:
%sql
-- This will FAIL - cannot delete from shared tables
-- DELETE FROM ${c.shared_customers} WHERE customer_id = 1;
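
Because every write against a shared table fails, it can be useful to catch such statements before they are sent. The sketch below is a rough heuristic, not a SQL parser: `is_write_statement` is a hypothetical helper that skips leading `--` comment lines and checks the first keyword. `MERGE` is included as an assumption, since it also writes to its target table.

```python
import re

# Statements that modify table data; the first three come from the cells above.
WRITE_KEYWORDS = {"INSERT", "UPDATE", "DELETE", "MERGE"}


def is_write_statement(sql: str) -> bool:
    """Return True if the first keyword of the statement is a write keyword."""
    # Drop full-line "--" comments, then look at the first word that remains.
    lines = [ln for ln in sql.splitlines() if not ln.strip().startswith("--")]
    match = re.match(r"\s*([A-Za-z]+)", "\n".join(lines))
    return bool(match) and match.group(1).upper() in WRITE_KEYWORDS


print(is_write_statement("UPDATE customers SET customer_segment = 'VIP'"))
print(is_write_statement("SELECT * FROM customers"))
```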

## Summary

✅ **What we accomplished:**

1. Activated a Delta Share using the activation link
2. Created a catalog in our workspace from the share
3. Explored the shared schemas and tables
4. Queried shared data with standard SQL `SELECT` statements
5. Performed analytical queries joining shared tables
6. Created local views based on shared data
7. Understood recipient limitations (read-only access)

**Key Benefits:**
- 🚀 **No Data Copying**: Query data directly from the provider
- 🔒 **Secure**: Provider controls access and can revoke at any time
- ⚡ **Near real-time**: Queries read the provider's current data rather than a stale copy
- 💰 **Cost-effective**: No storage costs for recipients unless they materialize local copies
- 🔧 **Standard SQL**: Use familiar SQL syntax for queries

**Key Concepts:**
- **Activation**: One-time process to accept a share
- **Read-only Access**: Recipients can query but not modify shared data
- **Live Data**: Changes in provider tables become visible to recipients in near real time
- **Local Materialization**: Recipients can create local copies if needed