## Row-level access control

<img src="https://github.com/databricks-demos/dbdemos-resources/blob/main/images/product/uc/acls/table_uc_rls.png?raw=true" width="200" style="float: right; margin-top: 20; margin-right: 20" alt="databricks-demos"/>

Row-level security allows you to automatically **hide a subset of your rows** based on who is attempting to query them, without having to maintain any separate copies of your data.

A typical use case is to filter rows based on your country or business unit: you only see the data (financial transactions, orders, customer information...) pertaining to your region, and have no access to the rest of the dataset.

While this filter can be applied at the user / principal level, it is recommended to implement access policies using groups instead.
<br style="clear: both"/>

<img src="https://github.com/databricks-demos/dbdemos-resources/blob/main/images/product/uc/acls/table_uc_cls.png?raw=true" width="200" style="float: right; margin-top: 20; margin-right: 20; margin-left: 20" alt="databricks-demos"/>

## Column-level access control

Similarly, column-level access control helps you **mask or anonymise the data that is in certain columns** of your table, depending on the user or service principal that is trying to access it. This is typically used to mask or remove sensitive PII (email, SSN...) from your end users.

In [0]:
%run ./_resources/00-setup


## 1. Prepare the demo:
To see the desired results for this demo, this notebook assumes that the user 
-  __is__ a member of groups `ANALYST_USA` and `region_admin_SPAIN`
-  is __not__ a member of groups `bu_admin` and `fr_analysts`

If you are not a member of these groups, add yourself (or ask an admin to do so) via the workspace admin console:

__Workspace settings / Identity and access / Groups__

In [0]:
SELECT 
  assert_true(is_account_group_member('account users')),
  assert_true(is_account_group_member('ANALYST_USA')),
  assert_true(not is_account_group_member('bu_admin'));


In [0]:
-- Cleanup any row-filters or masks that may have been added in previous runs of the demo:
ALTER TABLE customers DROP ROW FILTER;
ALTER TABLE customers ALTER COLUMN address DROP MASK;

In [0]:
SELECT * FROM customers;

In [0]:
SELECT DISTINCT(country) FROM customers;


##  2. Row-level access control

<img src="https://github.com/databricks-demos/dbdemos-resources/blob/main/images/product/uc/acls/table_uc_rls.png?raw=true" width="200" style="float: right; margin-left: 20; margin-right: 20" alt="databricks-demos"/>

In this part of the demo, we will show you how you can enforce a policy where an analyst can only access data related to customers in their country.


To capture the current user and check their membership in a particular group, Databricks provides two built-in functions:
- `current_user()`
- `is_account_group_member()`

In [0]:
-- Get the current user (for informational purposes):
SELECT current_user(), is_account_group_member('account users');

### 2.1. Define the access rule:

To declare an access control rule, you will need to create a SQL function that returns a **boolean**.
Unity Catalog will then hide the row if the function returns `False`.

Inside your SQL function, you can combine several conditions and implement complex logic to produce this boolean return value (e.g. `IF(condition, value_if_true, value_if_false)` or a `CASE WHEN ... THEN ... ELSE ... END` expression).

The function can also refer to columns of the table it is applied to: declare them as function parameters, and map the table columns to those parameters when you attach the filter.

Here, we will apply the following logic :

1. If the user is a `bu_admin` group member, they can access data from all countries (using the `is_account_group_member('group_name')` function we saw earlier).

2. If the user is not a `bu_admin` group member, we'll restrict access to rows whose country starts with `US` (for example `USA`), our default region. All other customers will be hidden!

In [0]:
CREATE OR REPLACE FUNCTION region_filter(region_param STRING)
RETURN 
  is_account_group_member('bu_admin') OR region_param LIKE 'US%';

SELECT region_filter('USA'), region_filter('SPAIN');

### 2.2. Apply the access rule:

With our rule function declared, all that's left to do is apply it on a table and see it in action!
A simple `SET ROW FILTER` followed by a call to the function is all it takes.

**Note: if this is failing, make sure you're using a Shared Cluster!**

In [0]:
-- country will be the column sent as parameter to our SQL function (region_param).
ALTER TABLE customers SET ROW FILTER region_filter ON (country);

In [0]:
SELECT * FROM customers;

In [0]:
SELECT DISTINCT(country) FROM customers;
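
To double-check which filter function is attached to the table, one option is to query the catalog's information schema. The sketch below assumes that the setup notebook has already selected the demo catalog and schema, and that the `information_schema.row_filters` relation is available in your runtime; adjust the names if your environment differs.

```sql
-- Assumption: the current catalog and schema are the ones holding `customers`.
-- Lists the row filter (if any) currently bound to the table.
SELECT *
FROM information_schema.row_filters
WHERE table_schema = current_schema()
  AND table_name = 'customers';
```

An empty result would indicate that no row filter is currently set on `customers`.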

In [0]:
-- Drop the current filter:
ALTER TABLE customers DROP ROW FILTER;

-- Confirming that we can once again see all countries:
SELECT DISTINCT(country) FROM customers;

### 2.3. More advanced dynamic filters:
Let's imagine we have a few regional user groups named `ANALYST_USA`, `ANALYST_SPAIN`, etc., and we want to use these groups to *dynamically* filter on the country value.

This can easily be done by checking the group based on the region value.

In [0]:
CREATE OR REPLACE FUNCTION region_filter_dynamic(country_param STRING)
RETURN
  is_account_group_member('bu_admin') -- bu_admin can access all regions
  OR is_account_group_member(CONCAT('ANALYST_', country_param)); -- regional analysts can only access rows for their own region

ALTER TABLE customers SET ROW FILTER region_filter_dynamic ON (country);

SELECT region_filter_dynamic('USA'), region_filter_dynamic('SPAIN');

In [0]:
SELECT DISTINCT(country) FROM customers;

## 3. Column-level access control:

Note: In this demo we have only one column mask function to apply. In real life, you may want to apply different column masks on different columns within the same table.

### 3.1. Define the access rule (masking PII data):

<img src="https://github.com/databricks-demos/dbdemos-resources/blob/main/images/product/uc/acls/table_uc_cls.png?raw=true" width="200" style="float: right; margin-top: 20; margin-left: 20; margin-right: 20" alt="databricks-demos"/>

Declaring a rule to implement column-level access control is very similar to what we did earlier for our row-level access control rule.

In this example, we'll create a SQL function with the following `IF-THEN-ELSE` logic:

- If the current user is a member of the group `bu_admin`, return the column value as-is (here `ssn`).
- If not, mask it completely with a constant string (here `REDACTED`).

In [0]:
CREATE OR REPLACE FUNCTION simple_mask(column_value STRING)
RETURNS STRING
RETURN
  CASE WHEN is_account_group_member('bu_admin')
        THEN column_value
       ELSE 'REDACTED' 
  END;

### 3.2. Apply the access rule:

To change things a bit, instead of applying a rule on an existing table, we'll demonstrate here how to apply a rule when creating a new table.

In [0]:
CREATE OR REPLACE TABLE patient_ssn (
  name STRING,
  ssn STRING MASK simple_mask
);

INSERT INTO patient_ssn
VALUES ('Jane Doe', '111-11-1111'), ('Joe Doe', '222-33-4444');

In [0]:
SELECT * FROM patient_ssn;

## 4. Combine RL and CL access control:

<img src="https://github.com/databricks-demos/dbdemos-resources/blob/main/images/product/uc/acls/table_uc_rlscls.png?raw=true" width="200" style="float: right; margin-top: 20; margin-left: 20; margin-right: 20" alt="databricks-demos"/>

Let's go back to our `customers` table and apply the column mask there too, this time targeting the `address` column!

In [0]:
ALTER TABLE customers
  ALTER COLUMN address SET MASK simple_mask;

In [0]:
SELECT * FROM customers;
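
As a variation on the cell above, a second mask with a different shape could be attached to another column of the same table. The following is only a sketch: the `email_mask` function is hypothetical, and it assumes that `customers` has a STRING column named `email`. The extra mask is dropped at the end so the rest of the demo behaves as described.

```sql
-- Hypothetical mask: keeps the domain of an email address and hides the local part.
CREATE OR REPLACE FUNCTION email_mask(email_value STRING)
RETURNS STRING
RETURN
  CASE
    WHEN is_account_group_member('bu_admin') THEN email_value
    WHEN email_value LIKE '%@%'
      THEN CONCAT('****', SUBSTRING(email_value, INSTR(email_value, '@')))
    ELSE '****'  -- NULL or malformed values are fully masked
  END;

-- Assumption: `customers` has a STRING column named `email`.
ALTER TABLE customers ALTER COLUMN email SET MASK email_mask;

-- `address` still goes through simple_mask, `email` through email_mask.
SELECT email, address FROM customers LIMIT 5;

-- Remove the extra mask so the rest of the demo is unaffected.
ALTER TABLE customers ALTER COLUMN email DROP MASK;
```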

## 5. Change the definition of the access control rules:
If the business ever decides to change a rule's conditions or the way they want the data to be returned in response to these conditions, it is easy to adapt with Unity Catalog.

Since the function is the central element, all you need to do is update it and the effects will automatically be reflected on all the tables that it has been attached to.

In this example, we'll rewrite our `simple_mask` column mask function and change the way we anonymise data, from the rather simplistic constant `REDACTED` to the built-in SQL `MASK` function ([see documentation](https://docs.databricks.com/sql/language-manual/functions/mask.html)).

In [0]:
CREATE OR REPLACE FUNCTION simple_mask (maskable_param STRING)
RETURN
  IF(is_account_group_member('bu_admin'), maskable_param, MASK(maskable_param, '*', '*'));

In [0]:
SELECT * FROM customers;
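
For reference, the built-in `mask` function accepts optional replacement characters for uppercase letters, lowercase letters, digits and other characters, in that order. The call in `simple_mask` above only overrides the first two, so digits fall back to the function's default replacement character. A small sketch on a literal value (the sample string is made up for illustration):

```sql
SELECT
  mask('12 Main Street', '*', '*')      AS letters_overridden,  -- digits use the default replacement character
  mask('12 Main Street', '*', '*', '*') AS digits_overridden;   -- digits are replaced with '*' as well
```

If digits in addresses should be hidden in the same way as letters, passing a third replacement character as in the second column is one option.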

## 6. Dynamic access rules with lookup data:

So far we've seen how, through functions, Unity Catalog gives us the flexibility to overwrite the definition of an access rule, and also to combine multiple rules on a single table to implement complex, multi-dimensional access control on our data.

Let's take this a step further by adding an intermediate table describing a permission model. We'll use this table to **look up pre-defined mappings** of users to their corresponding data, on which we'll base the behavior of our access control function.

### 6.1. Create the mapping data table:

Imagine an organization with one user group per supported language. This table maps each of these groups to the countries it covers.

- Members of `ANALYST_USA` are mapped to data for `USA` and `CANADA`.
- Members of `ANALYST_SPAIN` are mapped to data for `SPAIN`, `MEXICO` and `ARGENTINA`.

In our case, our user belongs to `ANALYST_USA`.

In [0]:
CREATE TABLE IF NOT EXISTS map_country_group (
  identity_group STRING,
  countries ARRAY<STRING>
);

INSERT OVERWRITE map_country_group
VALUES ('ANALYST_FR', ARRAY('FR', 'BELGIUM', 'CANADA', 'SWITZERLAND')),
       ('ANALYST_SPAIN',  ARRAY('SPAIN', 'MEXICO', 'ARGENTINA')),
       ('ANALYST_USA', ARRAY('USA', 'CANADA'));

SELECT * FROM map_country_group;

In [0]:
-- Query the map_country_group table to see how current user is mapped:
SELECT * FROM map_country_group 
WHERE is_account_group_member(identity_group); 
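
The mapping stores countries as an array, so it can help to flatten it when checking which countries the current user should end up seeing. A minimal sketch using `LATERAL VIEW explode` (the alias names `c` and `country` are arbitrary):

```sql
-- One row per country the current user is mapped to through their groups.
SELECT DISTINCT country
FROM map_country_group
LATERAL VIEW explode(countries) c AS country
WHERE is_account_group_member(identity_group);
```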

### 6.2. Define the access rule with lookup data:

Let's now update our dynamic row filter function to use this new table-lookup approach.

- If the current user is in the group `bu_admin`, they will be able to see all rows.
- If the user is in another group which has a row in the `map_country_group` table, allow access to rows for the corresponding countries.

The Spark optimizer will **execute** this as **an efficient JOIN** between the **map_country_group** table and **your main table**. You can check the query execution in the Spark SQL UI for more details.

In [0]:
CREATE OR REPLACE FUNCTION region_filter_dynamic(region_param STRING)
RETURN 
  is_account_group_member('bu_admin') 
  OR EXISTS (SELECT 1 FROM map_country_group
             WHERE is_account_group_member(identity_group) 
             AND array_contains(countries, region_param));

In [0]:
SELECT DISTINCT(country) FROM customers;

## 7. Dissociate a rule from a table:
Because rules are decoupled from the objects they are applied to, you can also stop applying a rule to the table of your choice at any time, all without:
- Impacting the other tables this rule is attached to.
- Discontinuing the other rules that are also applied to your table.

In [0]:
-- Removing the column mask on 'address' from the 'customers' table:
ALTER TABLE customers ALTER COLUMN address DROP MASK;

In [0]:
-- Dropping the row filter:
ALTER TABLE customers DROP ROW FILTER;