# Sharing data between organizations using Databricks

With Databricks' Unity Catalog and Delta Sharing, sharing data with another organization that also uses Databricks is much easier.

This is often referred to as Databricks-to-Databricks (D2D) sharing.

All you need to do is provide your metastore ID to the organization sharing the data. They can then grant you access directly, with no credential file to exchange or manage.

In [0]:
%run ./_resources/00-setup $reset_all_data=false

## Step 1: Receiver needs to share its metastore ID

To get access to your provider's data, you need to send them your metastore ID, which you can retrieve with `current_metastore()`.

As a **Receiver**, run the following and send the result to the provider:

In [0]:
SELECT current_metastore();

## Step 2: Provider creates the recipient using the receiver metastore ID

The data provider can now create a recipient using this metastore ID.

As a **Provider**, create the share and the recipient, then grant access:

In [0]:
-- Start with the share creation
CREATE SHARE IF NOT EXISTS dbdemos_my_d2d_share COMMENT 'My share containing data for other organization';
-- For the demo we'll grant ownership to all users. Typical deployments would have admin groups or similar.
-- ALTER SHARE dbdemos_my_d2d_share OWNER TO `account users`;
-- Add our tables (as many as you want, see previous notebook for more details)
-- Note that we can turn on Change Data Feed on the table and share (Note: this is not yet supported with serverless/managed storage)
-- ALTER TABLE dbdemos_sharing_airlinedata.lookupcodes SET TBLPROPERTIES (delta.enableChangeDataFeed = true);
ALTER SHARE dbdemos_my_d2d_share ADD TABLE main.dbdemos_sharing_airlinedata.lookupcodes ; --WITH CHANGE DATA FEED

-- Create the recipient using the metastore ID shared by the receiver (the output of current_metastore() in Step 1, formatted as <cloud>:<region>:<uuid>)
CREATE RECIPIENT IF NOT EXISTS dbdemos_databricks_to_databricks_demo USING ID 'aws:us-west-2:<the_receiver_recipient>' COMMENT 'Recipient for my external customer using Databricks';
-- Grant select access to the share
GRANT SELECT ON SHARE dbdemos_my_d2d_share TO RECIPIENT dbdemos_databricks_to_databricks_demo;
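
Before telling the receiver that the share is ready, the provider may want to confirm what was set up. The following is a minimal sketch that reuses the share and recipient names from the cell above. It assumes you own the share and recipient, or are a metastore admin.

```sql
-- List every object currently exposed through the share
SHOW ALL IN SHARE dbdemos_my_d2d_share;

-- Inspect the recipient, including the sharing identifier it was created with
DESCRIBE RECIPIENT dbdemos_databricks_to_databricks_demo;

-- Check which recipients have been granted access to the share
SHOW GRANTS ON SHARE dbdemos_my_d2d_share;
```

If the recipient does not show up in the last result, the `GRANT SELECT ON SHARE` statement has not been applied yet.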

## Step 3: Accept and mount the share as a receiver

As a receiver, the organization sharing data with us now appears in our metastore as a `PROVIDER`.

In [0]:
SHOW PROVIDERS;

In [0]:
DESC PROVIDER `dbdemos_databricks_to_databricks_demo`

In [0]:
SHOW SHARES IN PROVIDER `dbdemos_databricks_to_databricks_demo`

To make the data available to your whole organization, a metastore admin only needs to create a new catalog from this share.

You'll then be able to `GRANT` permissions as you would on any other table, and start querying the data directly:

In [0]:
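-- The catalog name is required; it is set here to match the catalog used in the query below
CREATE CATALOG IF NOT EXISTS `dbdemos_databricks_to_databricks_demo` USING SHARE `dbdemos_databricks_to_databricks_demo`.dbdemos_my_d2d_share;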

In [0]:
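-- Shared tables keep their source schema name (dbdemos_sharing_airlinedata) unless aliased when added to the share
SELECT * FROM `dbdemos_databricks_to_databricks_demo`.dbdemos_sharing_airlinedata.lookupcodes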
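
Once the catalog exists, access can be opened up with standard Unity Catalog grants. The following is a minimal sketch, assuming the share was mounted as the catalog `dbdemos_databricks_to_databricks_demo`. The group name `data_analysts` is hypothetical; replace it with a group that exists in your account.

```sql
-- Let the (hypothetical) data_analysts group browse and read everything in the shared catalog
GRANT USE CATALOG, USE SCHEMA, SELECT
  ON CATALOG `dbdemos_databricks_to_databricks_demo`
  TO `data_analysts`;

-- Review the resulting permissions
SHOW GRANTS ON CATALOG `dbdemos_databricks_to_databricks_demo`;
```

Catalogs created from a share are read-only, so read privileges such as these are the relevant ones to grant.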

## Subscribing to Change Data Feed
If your data is being updated or deleted, you'll likely want to share the incremental changes so that the external organization can apply them.

A typical use-case is GDPR deletion: you want to make sure other organizations also capture this information so that they can DELETE the data downstream.

To do so, you can use the Delta Lake `table_changes()` function on top of your share (see [the documentation](https://docs.databricks.com/delta/delta-change-data-feed.html) for more details).

Note that as a provider, you first need to turn on CDF at the table level, then add the table to the share with its change data feed:

`ALTER TABLE dbdemos_sharing_airlinedata.lookupcodes SET TBLPROPERTIES (delta.enableChangeDataFeed = true);`<br/>
`ALTER SHARE dbdemos_my_d2d_share ADD TABLE dbdemos_sharing_airlinedata.lookupcodes WITH CHANGE DATA FEED;`

**Note: this isn't supported yet with Databricks managed storage - stay tuned as we'll update the content accordingly**

In [0]:
-- Read the changes committed between table versions 2 and 4 (both inclusive)
SELECT * FROM table_changes('dbdemos_databricks_to_databricks_demo.dbdemos_sharing_airlinedata.lookupcodes', 2, 4)
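
For the GDPR scenario described above, the receiver is mostly interested in deleted rows. The following is a minimal sketch that filters the change feed on the `_change_type` metadata column. It assumes the share was mounted as the catalog `dbdemos_databricks_to_databricks_demo`, that the table keeps its source schema name `dbdemos_sharing_airlinedata`, and that the provider shared it `WITH CHANGE DATA FEED`. Adjust the three-part name and the version range to your setup.

```sql
-- Keep only the rows the provider deleted between versions 2 and 4
SELECT *
FROM table_changes('dbdemos_databricks_to_databricks_demo.dbdemos_sharing_airlinedata.lookupcodes', 2, 4)
WHERE _change_type = 'delete'
ORDER BY _commit_version;
```

Each returned row carries the deleted record along with `_commit_version` and `_commit_timestamp`, which the receiver can use to drive matching `DELETE` statements on downstream tables.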


# Conclusion
To recap, Delta Sharing is a cloud- and platform-agnostic solution to share your data with external consumers. 

It's simple (pure SQL), open (can be used on any system) and scalable.

All recipients can access your data, using Databricks or any other system on any cloud.

Delta Sharing enables critical use cases around Data Sharing and Data Marketplace. 

When combined with Databricks' Unity Catalog, it's a strong fit for accelerating your data mesh deployment and improving your data governance.

[Back to Overview]($./01-Delta-Sharing-presentation)