# Implementing global data governance and security with Unity Catalog

<img style="float: right; margin-top: 30px" width="500px" src="https://raw.githubusercontent.com/databricks-demos/dbdemos-resources/refs/heads/main/images/manufacturing/lakehouse-iot-turbine/team_flow_emily.png" />

Data governance and security are hard to get right across a complete data platform. SQL GRANTs on tables aren't enough: security must be enforced across every kind of data asset (dashboards, models, files, etc.).

Our data has been saved as Delta tables by our Data Engineering team. The next step is to secure this data while still giving other teams access to it:

* Data Engineers / Jobs can read and update the main data/schemas (ETL part)
* Data Scientists can read the final tables and update their feature tables
* Data Analysts have read access to the Data Engineering and feature tables and can ingest/transform additional data in a separate schema
* Data is masked/anonymized dynamically based on each user's access level

This is made possible by Unity Catalog. When tables are saved in Unity Catalog, they can be made accessible to the entire organization, across workspaces and users. Unity Catalog provides:
* Fine-grained ACLs
* Audit log
* Data lineage
* Data exploration & discovery
* Data sharing with external organizations (Delta Sharing)

In [0]:
%run ../_resources/00-setup $reset_all_data=false

In [0]:
SELECT CURRENT_CATALOG();

In [0]:
-- Our tables are available under our catalog:
SHOW TABLES;

In [0]:
-- Let's grant our analysts SELECT permission on the tables they need:
-- Note: make sure the `analysts` and `dataengineers` groups exist first.
GRANT SELECT ON TABLE iot_turbine.dev.sensor_bronze TO `analysts`;
GRANT SELECT ON TABLE iot_turbine.dev.sensor_hourly TO `analysts`;
GRANT SELECT ON TABLE iot_turbine.dev.historical_turbine_status TO `analysts`;

-- Data engineers also need MODIFY, granted here on the whole schema:
GRANT SELECT, MODIFY ON SCHEMA iot_turbine.dev TO `dataengineers`;

In [0]:
SHOW GRANT ON SCHEMA iot_turbine.dev;
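
In Unity Catalog, a table-level `SELECT` is not usable on its own: the principal also needs `USE CATALOG` on the parent catalog and `USE SCHEMA` on the parent schema. The sketch below adds those privileges, in case the setup has not already granted them, and shows one way the analysts' separate schema from the introduction might be set up. The `analysts_sandbox` schema name is hypothetical and not part of the demo.

```sql
-- Parent-level privileges: without these, the table-level SELECT grants cannot be exercised.
GRANT USE CATALOG ON CATALOG iot_turbine TO `analysts`;
GRANT USE SCHEMA ON SCHEMA iot_turbine.dev TO `analysts`;

GRANT USE CATALOG ON CATALOG iot_turbine TO `dataengineers`;
GRANT USE SCHEMA ON SCHEMA iot_turbine.dev TO `dataengineers`;

-- Hypothetical sandbox schema where analysts can ingest and transform their own data.
CREATE SCHEMA IF NOT EXISTS iot_turbine.analysts_sandbox;
GRANT USE SCHEMA, CREATE TABLE ON SCHEMA iot_turbine.analysts_sandbox TO `analysts`;

-- Check what a single group ends up with on the main schema.
SHOW GRANTS `analysts` ON SCHEMA iot_turbine.dev;
```

Privileges granted on a catalog or schema are inherited by the objects inside it, which is why the schema-level `SELECT, MODIFY` for `dataengineers` covers every table in `iot_turbine.dev`.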


## Dynamically filtering data based on the current user: row- and column-level filtering

Unity Catalog can be used to filter data and return different results depending on who is querying it. Let's pretend we're based in Chicago and want the `parts` table to return only the parts stocked at the Chicago location, since this is where we operate.

In [0]:
-- Create the table mapping each user to the country/location they are allowed to see
CREATE OR REPLACE TABLE parts_users_country_permission (email STRING, country STRING);

INSERT INTO parts_users_country_permission (email, country)
VALUES (current_user(), 'America/Chicago'),
       ('john@mycompany.com', 'America/Honolulu'),
       ('lea@mycompany.com', 'America/Denver');

In [0]:
-- Create our new protected view:
CREATE OR REPLACE VIEW parts_secured AS
  SELECT CASE 
          WHEN is_account_group_member('iot_admin') THEN EAN  -- admins see the real EAN
          ELSE '***' -- everyone else gets a masked value instead of the EAN
         END as EAN,
         p.* EXCEPT (EAN)
  FROM parts p 
  INNER JOIN parts_users_country_permission u -- Join the country/location permission table
  ON p.stock_location = u.country 
  -- Keep only the locations mapped to the current user (admins match every mapped location)
  AND (u.email = current_user() OR is_account_group_member('iot_admin'));

-- Let's test our secured view. We'll only see the 'America/Chicago' parts, and the EAN will be masked unless we belong to `iot_admin`.
SELECT * FROM parts_secured;
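
The same behavior can also be attached to a table directly with Unity Catalog row filters and column masks, so that users cannot bypass the rules by querying the underlying table. The following is a minimal sketch, not part of the original demo. It assumes `EAN` and `stock_location` are `STRING` columns, that your compute supports row filters and column masks, and that the filter function may look up the permission table in a subquery. The names `parts_protected`, `ean_mask` and `parts_location_filter` are hypothetical.

```sql
-- Work on a copy (hypothetical name) so the original `parts` table and the view above stay untouched.
CREATE OR REPLACE TABLE parts_protected AS SELECT * FROM parts;

-- Column mask: members of iot_admin see the real value, everyone else a placeholder.
-- The function must return the same type as the masked column (assumed STRING here).
CREATE OR REPLACE FUNCTION ean_mask(ean STRING)
  RETURN CASE WHEN is_account_group_member('iot_admin') THEN ean ELSE '***' END;

ALTER TABLE parts_protected ALTER COLUMN EAN SET MASK ean_mask;

-- Row filter: a row is returned only when the function evaluates to TRUE for it.
CREATE OR REPLACE FUNCTION parts_location_filter(part_location STRING)
  RETURN is_account_group_member('iot_admin')
      OR EXISTS (
           SELECT 1
           FROM parts_users_country_permission u
           WHERE u.email = current_user()
             AND u.country = part_location
         );

ALTER TABLE parts_protected SET ROW FILTER parts_location_filter ON (stock_location);

-- Non-admins should only get the rows for their own location, with a masked EAN.
SELECT * FROM parts_protected;
```

Unlike the join in the view, the `EXISTS` form lets admins see every row, including parts stocked at locations that are missing from the permission table, and it never duplicates a part when several users share a location.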


## Going further with Data governance & security

By bringing all your data assets together, Unity Catalog lets you build complete yet simple governance that helps you scale your teams.

Unity Catalog scales from simple GRANT statements to a complete data mesh organization.

<img src="https://github.com/QuentinAmbard/databricks-demo/raw/main/product_demos/uc/lineage/lineage-table.gif" style="float: right; margin-left: 10px"/>

### Fine-grained ACL

Need more advanced control? You can choose to dynamically change your table output based on the user's permissions: `dbdemos.install('uc-01-acl')`

### Secure external location (S3/ADLS/GCS)

Unity Catalog lets you secure your managed tables as well as your external locations: `dbdemos.install('uc-02-external-location')`

### Lineage 

UC automatically captures table dependencies and lets you track how your data is used, down to the column level: `dbdemos.install('uc-03-data-lineage')`

This lets you analyze downstream impact, or monitor sensitive information across the entire organization (GDPR).
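
As a rough illustration of downstream impact analysis, lineage can also be queried in SQL once system tables are enabled for your account. The sketch below assumes you have access to the `system.access.table_lineage` table and that its columns match the names used here. The 30-day window is an arbitrary choice.

```sql
-- Tables that were written from sensor_bronze during the last 30 days.
SELECT DISTINCT
       target_table_full_name,
       entity_type              -- what produced the link (for example a notebook or a job)
FROM system.access.table_lineage
WHERE source_table_full_name = 'iot_turbine.dev.sensor_bronze'
  AND target_table_full_name IS NOT NULL
  AND event_time >= current_timestamp() - INTERVAL 30 DAYS;
```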


### Audit log

UC captures all events. Need to know who is accessing which data? Query your audit log: `dbdemos.install('uc-04-audit-log')`
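
A minimal sketch of such a query, assuming audit logs are available through the `system.access.audit` system table in your account. The `getTable` action and the `full_name_arg` request parameter are assumptions about how table access is recorded, so check the events in your own workspace before relying on them.

```sql
-- Who accessed tables in iot_turbine.dev over the last 7 days?
SELECT event_time,
       user_identity.email              AS user_email,
       action_name,
       request_params['full_name_arg']  AS table_name
FROM system.access.audit
WHERE service_name = 'unityCatalog'
  AND action_name = 'getTable'          -- assumed action name for table metadata access
  AND request_params['full_name_arg'] LIKE 'iot_turbine.dev.%'
  AND event_date >= current_date() - INTERVAL 7 DAYS
ORDER BY event_time DESC
LIMIT 100;
```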


### Upgrading to UC

Already using Databricks without UC? Upgrading your tables to benefit from Unity Catalog is simple: `dbdemos.install('uc-05-upgrade')`