# Dynamic views: Securing data at the row level using Databricks Unity Catalog

**Note: dynamic views were the solution used before row-level filters and column-level masks with SQL functions were available.**

**We recommend using the previous [01-Row-Column-access-control]($./01-Row-Column-access-control) notebook over adding dynamic views when possible.**

As seen in the previous notebook, Unity Catalog lets you grant table ACLs using standard SQL GRANT on all objects (CATALOG, SCHEMA, TABLE).

But this alone isn't enough. UC lets you create more advanced access patterns that dynamically filter your data based on who queries it.

This is useful to mask sensitive PII, or to restrict access to a subset of the data without having to create and maintain multiple tables.

*Note that Unity Catalog will provide more advanced data masking capabilities in the future; this demo covers what can be done now.*

*Note: This is currently only supported with shared clusters (Python/SQL). Single node requires access to the underlying view.*

See the [documentation](https://docs.databricks.com/security/access-control/table-acls/object-privileges.html#dynamic-view-functions) for more details.


### A cluster has been created for this demo
To run this demo, just select the cluster `dbdemos-uc-01-acl-maynard` from the dropdown menu ([open cluster configuration](https://adb-3759185753378633.13.azuredatabricks.net/#setting/clusters/0601-075249-kvumx2hn/configuration)). <br />
*Note: If the cluster was deleted after 30 days, you can re-create it with `dbdemos.create_cluster('uc-01-acl')` or re-install the demo: `dbdemos.install('uc-01-acl')`*

## Cluster setup for UC

<img src="https://github.com/databricks-demos/dbdemos-resources/blob/main/images/product/uc/clusters_shared.png?raw=true" width="600" style="float: right"/>

To be able to run this demo, make sure you create a cluster with the security mode enabled.

1. Go to the Compute page and create a new cluster

2. Under "Access mode", select "Shared"

In [0]:
%run ./_resources/00-setup

## Current user and is member (group)

Databricks provides two functions: `current_user()` and `is_account_group_member()`.

These functions can be used to dynamically get the user running the query and to check whether that user is a member of a given group.

In [0]:
SELECT current_user();

In [0]:
-- Note: The account should have been set up by adding all users to the ANALYST_USA group
SELECT is_account_group_member('account users'), is_account_group_member('ANALYST_USA'), is_account_group_member('ANALYST_FR');

## Dynamic Views: Restricting data to a subset based on a field

We'll be using the previous customers table. Let's review its content first.

*Note: Make sure you run the [previous notebook]($00-UC-Table-ACL) first*

In [0]:
SELECT * FROM customers

As you can see, this table has a `country` field. We want to restrict access to the table based on this country.

Data analysts and data scientists in the USA should only access their local dataset, and the same applies to the FR team.

### Using groups
One option is to create groups in Unity Catalog. We can name each group by concatenating `ANALYST_` and the country, i.e. `CONCAT("ANALYST_", country)`:
* `ANALYST_FR`
* `ANALYST_USA`
* `ANALYST_SPAIN`

You can then add a view with a `CASE ... WHEN` statement based on your groups to define when the data can be accessed.

See the [documentation](https://docs.databricks.com/security/access-control/table-acls/object-privileges.html#dynamic-view-functions) for more details on that.
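
The `CASE ... WHEN` idea can be tried outside Databricks with a small local simulation. The sketch below is an assumption-heavy stand-in, not the Databricks implementation: it uses Python's built-in `sqlite3`, registers a hypothetical local function named `is_account_group_member` that reads from a hard-coded `USER_GROUPS` set, and uses made-up sample rows.

```python
import sqlite3

# Assumption: the simulated user belongs only to ANALYST_USA.
USER_GROUPS = {"ANALYST_USA"}

def is_account_group_member(group_name):
    """Local stand-in for the Databricks function of the same name."""
    return 1 if group_name in USER_GROUPS else 0

conn = sqlite3.connect(":memory:")
conn.create_function("is_account_group_member", 1, is_account_group_member)

# Hypothetical sample rows; the real `customers` table has more columns.
conn.execute("CREATE TABLE customers (id INTEGER, firstname TEXT, country TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [(1, "Alice", "USA"), (2, "Bruno", "FR"), (3, "Carmen", "SPAIN")],
)

# One hard-coded CASE branch per group: the first matching branch decides
# which country the user may read; users in no listed group see nothing.
GROUP_FILTER_QUERY = """
SELECT id, firstname, country FROM customers
WHERE CASE
    WHEN is_account_group_member('ANALYST_USA') THEN country = 'USA'
    WHEN is_account_group_member('ANALYST_FR')  THEN country = 'FR'
    ELSE 0
END
ORDER BY id
"""

visible_rows = conn.execute(GROUP_FILTER_QUERY).fetchall()
print(visible_rows)
```

Each new country needs a new branch in this form, which is the maintenance cost the column-based check avoids.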

But what makes the `is_account_group_member()` function powerful is that you can combine it with a column. Let's see how we can use it to dynamically check access based on the row.
 
We'll create a field named `group_name` as the concatenation of ANALYST and the country, and then for each value check if the current user is a member of this group:

In [0]:
-- As an analyst from the USA (ANALYST_USA group), every USA row now returns true
SELECT is_account_group_member(group_name), * FROM (
  SELECT CONCAT("ANALYST_", country) AS group_name, country, id, firstname FROM customers)

As you can see, the check returns true only for the rows whose group we belong to.
We can create a view securing this data and grant our analysts access to this view only:

In [0]:
CREATE OR REPLACE VIEW customer_dynamic_view  AS (
  SELECT * FROM customers as customers WHERE is_account_group_member(CONCAT("ANALYST_", country))
);
-- Then grant select access on the view only
GRANT SELECT ON VIEW customer_dynamic_view TO `account users`;
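
The filtering behaviour of such a view can be sketched locally. This is a simulation under stated assumptions: SQLite (via Python's `sqlite3`) stands in for Databricks, `is_account_group_member` is a hypothetical local function backed by the mutable `current_groups` set, the sample rows are invented, and SQLite's `||` operator replaces `CONCAT`.

```python
import sqlite3

# Assumption: groups of the simulated user, changed below to show the effect.
current_groups = set()

def is_account_group_member(group_name):
    """Local stand-in for the Databricks function of the same name."""
    return 1 if group_name in current_groups else 0

conn = sqlite3.connect(":memory:")
# Allow application-defined functions inside views (already the SQLite default).
conn.execute("PRAGMA trusted_schema = ON")
conn.create_function("is_account_group_member", 1, is_account_group_member)

# Hypothetical sample rows.
conn.execute("CREATE TABLE customers (id INTEGER, firstname TEXT, country TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [(1, "Alice", "USA"), (2, "Bruno", "FR"), (3, "Carmen", "SPAIN")],
)

# The group name is derived from each row's country, so no branch per
# country is needed.
conn.execute("""
CREATE VIEW customer_dynamic_view AS
SELECT * FROM customers
WHERE is_account_group_member('ANALYST_' || country)
""")

def visible_countries():
    """Return the countries the simulated user can read through the view."""
    rows = conn.execute("SELECT country FROM customer_dynamic_view ORDER BY id")
    return [row[0] for row in rows]

no_group = visible_countries()    # user in no group
current_groups.add("ANALYST_FR")  # simulate adding the user to a group
fr_only = visible_countries()
print(no_group, fr_only)
```

The membership check runs at query time for every row, so changing group membership changes the result without touching the view.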

A user who is not part of any of these groups won't have access to the data. Users in the `ANALYST_FR` group get a filter that limits them to the FR country.

All we have to do now is add our users to the groups to give them access.

In [0]:
-- We should be part of the ANALYST_USA group. As a result, a row-level filter is applied in our secured view and we only see the USA country:
select * from customer_dynamic_view

## Dynamic Views & data masking

The country example was a first level of row-level security. We can implement more advanced features using the same pattern.

Let's see how dynamic views can also be used to add data masking. For this example we'll be using the `current_user()` function.

Let's create a table with all our current analyst permissions, including a GDPR permission flag: `analyst_permissions`.

This table has three fields:

* `analyst_email`: to identify the analyst (we could work with groups instead)
* `country_filter`: we'll filter the dataset based on this value
* `gdpr_filter`: if true, we'll filter the PII from the table. If not set, the user can see all the information

*Of course this could be implemented with the previous `is_account_group_member()` function instead of saving individual user information in a permission table.*

Let's query this table and check our current user permissions. As you can see, I don't have the GDPR filter enabled, and a filter on FR is applied for me in the permission table we created.

In [0]:
select * from analyst_permissions where analyst_email = current_user()

In [0]:
CREATE OR REPLACE VIEW customer_dynamic_view_gdpr AS (
  SELECT
    id,
    creation_date,
    country,
    gender,
    age_group,
    -- Hash the PII columns when the analyst's GDPR flag is set
    CASE WHEN permissions.gdpr_filter = 1 THEN sha1(firstname) ELSE firstname END AS firstname,
    CASE WHEN permissions.gdpr_filter = 1 THEN sha1(lastname)  ELSE lastname  END AS lastname,
    CASE WHEN permissions.gdpr_filter = 1 THEN sha1(email)     ELSE email     END AS email
  FROM
    customers AS customers INNER JOIN
    analyst_permissions AS permissions ON permissions.country_filter = customers.country
  WHERE
    -- Keep only the permission rows of the user running the query
    permissions.analyst_email = current_user()
);
-- Then grant select access on the view only
GRANT SELECT ON VIEW customer_dynamic_view_gdpr TO `account users`;


## Querying the secured view
Let's now query the view. Because I have a filter on `COUNTRY=FR` and `gdpr_filter=0`, I'll see all the information of the FR customers.

In [0]:
%sql select * from customer_dynamic_view_gdpr 

Let's now change my permissions. We'll enable the `gdpr_filter` flag and change our `country_filter` to USA.

As you can see, querying the same secured view now returns all the USA customers, and the PII has been obfuscated:

In [0]:
UPDATE analyst_permissions SET country_filter='USA', gdpr_filter=1 where analyst_email=current_user();

select * from customer_dynamic_view_gdpr ;
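
The effect of the permission change can be reproduced in a local sketch. Everything here is an assumption made for illustration: SQLite (via Python's `sqlite3`) replaces Databricks, `current_user` and `sha1` are registered as hypothetical local functions, the email address and the rows are invented, and only `firstname` is masked to keep the example short.

```python
import hashlib
import sqlite3

# Assumption: identity of the simulated user.
SIMULATED_USER = "analyst@example.com"

conn = sqlite3.connect(":memory:")
# Allow application-defined functions inside views (already the SQLite default).
conn.execute("PRAGMA trusted_schema = ON")
# Local stand-ins for the Databricks built-ins used by the view.
conn.create_function("current_user", 0, lambda: SIMULATED_USER)
conn.create_function(
    "sha1", 1, lambda text: hashlib.sha1(text.encode("utf-8")).hexdigest()
)

conn.executescript("""
CREATE TABLE customers (id INTEGER, firstname TEXT, country TEXT);
INSERT INTO customers VALUES (1, 'Alice', 'USA'), (2, 'Bruno', 'FR');

CREATE TABLE analyst_permissions (
    analyst_email TEXT, country_filter TEXT, gdpr_filter INTEGER);
INSERT INTO analyst_permissions VALUES ('analyst@example.com', 'FR', 0);

-- Row filter comes from the join, masking from the CASE expression.
CREATE VIEW customer_dynamic_view_gdpr AS
SELECT c.id, c.country,
       CASE WHEN p.gdpr_filter = 1 THEN sha1(c.firstname)
            ELSE c.firstname END AS firstname
FROM customers AS c
INNER JOIN analyst_permissions AS p ON p.country_filter = c.country
WHERE p.analyst_email = current_user();
""")

def query_view():
    """Read the secured view as the simulated user."""
    return conn.execute(
        "SELECT id, country, firstname FROM customer_dynamic_view_gdpr ORDER BY id"
    ).fetchall()

before = query_view()  # FR rows, clear-text names
conn.execute(
    "UPDATE analyst_permissions SET country_filter = 'USA', gdpr_filter = 1 "
    "WHERE analyst_email = current_user()"
)
after = query_view()   # USA rows, hashed names
print(before, after)
```

Because the view reads `analyst_permissions` at query time, a single `UPDATE` changes both the visible rows and the masking without redefining the view.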

## Conclusion

As we've seen, data masking and filtering can be implemented at the row level using groups, users, and even an extra table that you can use to manage more advanced permissions.

You're now ready to deploy the Lakehouse for your entire organisation, securing data according to your own governance rules and complying with PII regulations.