
<div style="text-align: center; line-height: 0; padding-top: 9px;">
  <img src="https://databricks.com/wp-content/uploads/2018/03/db-academy-rgb-1200px.png" alt="Databricks Learning">
</div>


# Controlling Access to Data
In this demo, we explore the capabilities of Databricks' metastore, focusing on fine-grained access control through column masking, row filtering, and dynamic views. We will learn to analyze the structure and components of the metastore, implement SQL queries to examine catalogs, schemas, tables, and views, and control access to data objects. Through practical exercises, we will delve into techniques such as column masking to obscure sensitive information, row filtering to selectively retrieve data based on criteria, and dynamic views for conditional access control.

### Learning Objectives
By the end of this demo, you will be able to:
1. Analyze the structure and components of a metastore.
2. Implement SQL queries to analyze current catalogs, schemas, tables, and views within a classroom setup.
3. Implement row and column security techniques such as column masking and row filtering using SQL functions.
4. Develop user-defined functions to perform column masking and row filtering based on specific criteria.
5. Design dynamic views to protect columns and rows by applying functions conditional on user identity or group membership.

## Prerequisites
In order to follow along with this demo, you will need:
* Account administrator capabilities
* Cloud resources to support the metastore
* Metastore admin capability, in order to create and manage a catalog

## REQUIRED - SELECT SERVERLESS

Before executing cells in this notebook, select the **Serverless** cluster in the lab. **Serverless** is enabled by default.

Follow these steps to select the Serverless cluster:

- Navigate to the top-right of this notebook and click the drop-down menu to select your cluster. By default, the notebook will use **Serverless**.

## A. Classroom Setup

Run the following cell to configure your working environment for this course. It will also set your default catalog to your specific catalog and the schema to the schema name shown below using the `USE` statements.
<br></br>


```
USE CATALOG <your catalog>;
USE SCHEMA <your catalog>.<schema>;
```

**NOTE:** The `DA` object is only used in Databricks Academy courses and is not available outside of these courses. It will dynamically reference the information needed to run the course.

In [0]:
%run ./Includes/Classroom-Setup-4

Note: you may need to restart the kernel using %restart_python or dbutils.library.restartPython() to use updated packages.


Created the silver table and vw_gold view in your catalog labuser11086062_1754032747 with the example schema.
Set the default catalog to labuser11086062_1754032747.
Set the default schema to example.




Let's check the current catalog and schema

In [0]:
SELECT current_catalog(), current_schema();

current_catalog(),current_schema()
labuser11086062_1754032747,example


## Task 1: Controlling Access to Data

In this section we're going to configure permissions on data objects we created. To keep things simple, we will be granting privileges to everyone. If you're working with a group, you can have others in the group test your work by attempting to access your data objects.

### 1.1 Generate an SQL Query to Access the Table

By default, if you run this command as written in your notebook, it will execute successfully because you are querying a version of the view that you own. However, to properly test the expected failure scenario, you should attempt to run the command on a view owned by someone else.

To do this, replace the view name with the fully qualified name of a view that belongs to another user, using Unity Catalog's three-level namespace format.

Example: `SELECT * FROM someone_elses_catalog.schema.view;`

In [0]:
SELECT * 
FROM vw_gold;

mrn,name,avg_heartrate,date
52804177,Lynn Russell,63.48006043547143,2020-02-01T00:00:00Z
40580129,,54.329958376700006,2020-02-01T00:00:00Z
65300842,Samuel Hughes,1327.5316620643855,2020-02-01T00:00:00Z
52804177,,92.5136468131,2020-02-01T00:00:00Z
40580129,Nicholas Spears,54.0142113348625,2020-02-01T00:00:00Z
65300842,,1000052.1354807864,2020-02-01T00:00:00Z


If someone else were to run this query, it would currently fail because no privileges have been granted yet. Only you (the owner) can access the view at this time.

By default, no permissions are implied by the metastore. In order to access any data objects, users need appropriate permissions for the data object in question (a view, in this case), as well as all containing elements (the schema and catalog).
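The requirement above can be sketched as a small check. This is a toy model in plain Python, not Unity Catalog's implementation; the `grants` set, the `can_select` helper, and the principal and object names are all hypothetical and exist only for illustration.

```python
# Toy model of the permission chain described above (not Unity Catalog code).
# A grant is a (principal, privilege, securable) triple; all names are hypothetical.
grants = {
    ("analysts", "USE CATALOG", "main"),
    ("analysts", "USE SCHEMA", "main.example"),
}


def can_select(principal: str, view: str, grants: set) -> bool:
    """Return True only if the principal holds every link of the chain."""
    catalog, schema, _ = view.split(".")
    required = {
        (principal, "USE CATALOG", catalog),
        (principal, "USE SCHEMA", f"{catalog}.{schema}"),
        (principal, "SELECT", view),
    }
    # Every required grant must be present; one missing link denies access.
    return required <= grants


# SELECT on the view itself is still missing, so access is denied.
denied = can_select("analysts", "main.example.vw_gold", grants)

# Adding the last link completes the chain.
grants.add(("analysts", "SELECT", "main.example.vw_gold"))
allowed = can_select("analysts", "main.example.vw_gold", grants)
```

In this sketch a principal needs all three grants at once, which mirrors the point that access to the view alone is not enough without access to its containing schema and catalog.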

Unity Catalog's security model accommodates two distinct patterns for managing data access permissions:

1. Granting permissions en masse by taking advantage of Unity Catalog's privilege inheritance.
1. Explicitly granting permissions to specific objects. This pattern is quite secure, but involves more work to set up and administer.

We'll explore both approaches to provide an understanding of how each one works.

### 1.2 Inherited Privileges

As we've seen, securable objects in Unity Catalog are hierarchical, and privileges are inherited downward. Using this property makes it easy to set up default access rules for your data. Using privilege inheritance, let's build a permission chain that will allow anyone to access the *gold* view.

#### **NOTE:** You will encounter a **`PERMISSION_DENIED`** error because you are working in a shared training workspace. You do not have permission to provide users access to your catalog.

In [0]:
%python
spark.sql(f"GRANT MANAGE ON CATALOG {DA.catalog_name} TO `account users`")

---------------------------------------------------------------------------
Py4JJavaError                             Traceback (most recent call last)
File <command-525414864206626>, line 1
----> 1 spark.sql(f"GRANT MANAGE ON CATALOG {DA.catalog_name} TO `account users`")

File /databricks/spark/python/pyspark/instrumentation_utils.py:47, in _wrap_function.<locals>.wrapper(*args, **kwargs)
     45 start = time.perf_counter()
     46 try:
---> 47     res = func(*args, **kwargs)
     48     logger.log_success(
     49         module_name, class_name, function_name, time

If a catalog-level grant of `USE CATALOG`, `USE SCHEMA`, and `SELECT` were in place, anyone else who ran the earlier query would succeed, because they would hold all the required permissions. That is:

* `USE CATALOG` on the catalog
* `USE SCHEMA` on the schema
* `SELECT` on the view

A single statement at the catalog level can grant all of these permissions. As convenient as this is, there are some very important things to keep in mind with this approach:

* The grantee (everyone, in this case) now has the `SELECT` privilege on **all** applicable objects (that is, tables and views) in **all** schemas within the catalog
* This privilege will also be extended to any future tables/views, as well as any future schemas that appear within the catalog
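The two caveats above follow directly from how downward inheritance behaves. The sketch below is a toy model in plain Python, not Unity Catalog code; `ancestors_and_self`, `has_privilege`, and the catalog, schema, and table names are hypothetical.

```python
# Toy model of downward privilege inheritance (not Unity Catalog code).
# Securables are dotted names: catalog, catalog.schema, catalog.schema.object.


def ancestors_and_self(securable: str) -> list:
    """Expand 'c.s.t' into ['c', 'c.s', 'c.s.t']."""
    parts = securable.split(".")
    return [".".join(parts[: i + 1]) for i in range(len(parts))]


def has_privilege(principal: str, privilege: str, securable: str, grants: set) -> bool:
    """A privilege counts if granted on the object or on any container above it."""
    return any(
        (principal, privilege, s) in grants for s in ancestors_and_self(securable)
    )


# One catalog-level grant of SELECT (hypothetical catalog name) ...
grants = {("account users", "SELECT", "my_catalog")}

# ... covers an existing view and also objects that appear later in any schema.
existing = has_privilege("account users", "SELECT", "my_catalog.example.vw_gold", grants)
future = has_privilege("account users", "SELECT", "my_catalog.new_schema.new_table", grants)

# Objects in a different catalog are unaffected.
other_catalog = has_privilege("account users", "SELECT", "other.example.t", grants)
```

Because the check walks up the hierarchy rather than listing objects, nothing has to be re-granted when a new table or schema is created, which is both the convenience and the risk of this pattern.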

While this can be very convenient for granting access to hundreds or thousands of tables, we must be very careful how we set this up when using privilege inheritance because it's much easier to grant permissions to the wrong things accidentally. Also keep in mind the above approach is extreme. A slightly less permissive compromise can be made, while still leveraging privilege inheritance, with the following two grants. Note, you don't need to run these statements; they're merely provided as an example to illustrate the different types of privilege structures you can create that take advantage of inheritance.
<br></br>
```
GRANT USE CATALOG ON CATALOG ${clean_username} TO `account users`;
GRANT USE SCHEMA,SELECT ON SCHEMA ${clean_username}.example TO `account users`;
```
Basically, this pushes the `USE SCHEMA` and `SELECT` down a level, so that grantees only have access to all applicable objects in the *example* schema.


### 1.3 Revoking Privileges

No data governance platform would be complete without the ability to revoke previously issued grants. In preparation for testing the next approach to granting privileges, let's unwind what we just did using **`REVOKE`**.

#### **NOTE:** You will encounter a **`PERMISSION_DENIED`** error because you are working in a shared training workspace. You do not have permission to provide users access to your catalog.

In [0]:
%python
spark.sql(f"REVOKE USE CATALOG,USE SCHEMA,SELECT ON CATALOG {DA.catalog_name} FROM `account users`")

---------------------------------------------------------------------------
Py4JJavaError                             Traceback (most recent call last)
File <command-525414864206629>, line 1
----> 1 spark.sql(f"REVOKE USE CATALOG,USE SCHEMA,SELECT ON CATALOG {DA.catalog_name} FROM `account users`")

File /databricks/spark/python/pyspark/instrumentation_utils.py:47, in _wrap_function.<locals>.wrapper(*args, **kwargs)
     45 start = time.perf_counter()
     46 try:
---> 47     res = func(*args, **kwargs)
     48     logger.log_success(
     49         module_name, class_name, f

### 1.4 Explicit Privileges

Using explicit privilege grants, let's build a permission chain that will allow anyone to access the *gold* view.

**NOTE:** For users to access a schema within a catalog, you must also grant `USE CATALOG` on the catalog. That grant will not work in this shared training environment because you do not have permission to share your catalog.

In [0]:
%python
## GRANT USE CATALOG will return an error here, so it is commented out. You can still grant access to objects you own within your catalog, such as schemas and views.
#spark.sql(f"GRANT USE CATALOG ON CATALOG {DA.catalog_name} TO `account users`")

## You own the schema and view and can grant access. You do not own the catalog.
spark.sql(f"GRANT USE SCHEMA ON SCHEMA {DA.catalog_name}.example TO `account users`")
spark.sql(f"GRANT SELECT ON VIEW {DA.catalog_name}.example.vw_gold TO `account users`")

DataFrame[]

With these grants in place, and assuming `USE CATALOG` has also been granted on the catalog, anyone else who queries the view again will still succeed because all the appropriate permissions are in place; we've just taken a very different approach to establishing them.

This seems more complicated. One statement from earlier has been replaced with three, and this only provides access to a single view. Following this pattern, we'd have to do an additional **`SELECT`** grant for each additional table or view we wanted to permit. But this complication comes with the benefit of security. Now, users can only read the *gold* view, but nothing else. There's no chance they could accidentally get access to some other object. So this is very explicit and secure, but one can imagine it would be very cumbersome when dealing with lots of tables and views.

### Views vs. Tables

We've explored two different approaches to managing permissions, and we now have permissions configured such that anyone can access the *gold* view, which processes and displays data from the *silver* table. 

But suppose someone else were to try to directly access the *silver* table. This could be accomplished by replacing *gold* in the previous query with *silver*.

With explicit privileges in place, the query would fail. How then, does the query against the *gold* view work? Because the view's **owner** has appropriate privileges on the *silver* table (through ownership). This property gives rise to interesting applications of views in table security, which we cover in the next section.
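The owner-based behavior described above can be illustrated with a toy model. This is a simplified sketch in plain Python, not how Unity Catalog is implemented; the `owners`, `select_grants`, and `view_sources` structures and the user names `alice` and `bob` are hypothetical.

```python
# Toy model: a caller may query a view if the caller can read the view itself
# and the view's OWNER can read the underlying table (hypothetical names).
owners = {"cat.example.silver": "alice", "cat.example.vw_gold": "alice"}
select_grants = {("bob", "cat.example.vw_gold")}
view_sources = {"cat.example.vw_gold": "cat.example.silver"}


def may_read(user: str, obj: str) -> bool:
    """True for the object's owner or for a user with an explicit SELECT grant."""
    return owners.get(obj) == user or (user, obj) in select_grants


def may_query(user: str, obj: str) -> bool:
    if not may_read(user, obj):
        return False
    source = view_sources.get(obj)
    # For a view, the source table is checked against the view owner, not the caller.
    return source is None or may_read(owners[obj], source)


bob_view = may_query("bob", "cat.example.vw_gold")   # allowed through the view
bob_table = may_query("bob", "cat.example.silver")   # no direct access to the table
```

Here `bob` reaches the data only through the view, which is what makes views usable as a security boundary in front of a table.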

## Task 2: Row and Column Security
Column masks and row filters are techniques used in Databricks to implement fine-grained access control. These methods involve adding additional metadata to tables to specify functions that either mask column values or filter rows based on specific conditions.

To implement column masking, functions are created for each column that needs to be masked. These user-defined functions \(UDFs\) contain the logic to conditionally mask column values.

Row filters, on the other hand, allow you to apply a filter to a table so that only rows meeting certain criteria are returned in subsequent queries.

While column masking requires a separate function for each masked column, row filtering only requires a single function to filter any number of rows.

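In both cases, the masking or filtering function is evaluated at query runtime. For a column mask, each reference to the target column is replaced with the result of the function; for a row filter, only rows for which the function returns true are returned.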
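To make the two mechanisms concrete, here is a toy model in plain Python of a mask and a filter being applied while a query runs. It is only a sketch of the idea, not Databricks code; the `query` helper, the `is_admin` flag (a stand-in for a group-membership check), and the sample rows are hypothetical.

```python
# Toy model of a column mask and a row filter applied at query time.
# `is_admin` stands in for a group-membership check; all names are hypothetical.
rows = [
    {"device_id": 23, "mrn": "40580129", "heartrate": 54.0},
    {"device_id": 37, "mrn": "65300842", "heartrate": 52.1},
]


def mrn_mask(mrn: str, is_admin: bool) -> str:
    """Column mask: one function per masked column."""
    return mrn if is_admin else "REDACTED"


def device_filter(device_id: int, is_admin: bool) -> bool:
    """Row filter: one function decides whether each row is kept."""
    return True if is_admin else device_id < 30


def query(rows: list, is_admin: bool) -> list:
    # The stored rows are never modified; mask and filter run on every read.
    return [
        {**row, "mrn": mrn_mask(row["mrn"], is_admin)}
        for row in rows
        if device_filter(row["device_id"], is_admin)
    ]


as_user = query(rows, is_admin=False)
as_admin = query(rows, is_admin=True)
```

The same stored data yields different results depending on who is asking, and the underlying rows stay unchanged in both cases.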

### 2.1 Column Masking
Let us implement column masking on the **silver** table and analyze it.

#### 2.1.1 Query the Table before Masking
Let us analyze the **silver** table before applying a column mask.

In [0]:
SELECT * 
FROM silver

device_id,mrn,name,time,heartrate
23,40580129,Nicholas Spears,2020-02-01T00:01:58Z,54.0122153343
17,52804177,Lynn Russell,2020-02-01T00:02:55Z,92.5136468131
37,65300842,Samuel Hughes,2020-02-01T00:08:58Z,52.1354807863
23,40580129,Nicholas Spears,2020-02-01T00:16:51Z,54.6477014191
17,52804177,Lynn Russell,2020-02-01T00:18:08Z,95.033344842
37,65300842,Samuel Hughes,2020-02-01T00:23:58Z,57.3391541312
23,40580129,Nicholas Spears,2020-02-01T00:31:58Z,56.6165053697
17,52804177,Lynn Russell,2020-02-01T00:32:56Z,94.8134313932
37,65300842,Samuel Hughes,2020-02-01T00:38:54Z,56.2469995332
23,40580129,Nicholas Spears,2020-02-01T00:46:57Z,54.8372685558


#### 2.1.2 Create a Function to Perform Column Masking


Check to see if you are a member of *metastore_admins*. View the results. Notice you are not part of *metastore_admins*.

The `is_account_group_member()` function returns *true* if the session (connected) user is a direct or indirect member of the specified group at the account level. In this example the function returns *false* since you are not a member.

View the [is_account_group_member function documentation](https://docs.databricks.com/en/sql/language-manual/functions/is_account_group_member.html) for more information.
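The phrase "direct or indirect member" refers to nested groups. The following toy model in plain Python shows one way such a lookup could be resolved; it is an illustration only and says nothing about how Databricks implements the function. The `member_of` mapping, the `is_group_member` helper, and the user and group names other than *metastore_admins* are hypothetical.

```python
# Toy model of direct and indirect (nested) group membership.
# `member_of` maps a user or group to the groups it directly belongs to.
member_of = {
    "user@example.com": {"data_engineers"},
    "data_engineers": {"metastore_admins"},
}


def is_group_member(principal: str, group: str, member_of: dict) -> bool:
    """Walk upward through parent groups until `group` is found or none remain."""
    seen = set()
    stack = [principal]
    while stack:
        current = stack.pop()
        for parent in member_of.get(current, ()):
            if parent == group:
                return True
            if parent not in seen:  # guard against cyclic group definitions
                seen.add(parent)
                stack.append(parent)
    return False


direct = is_group_member("user@example.com", "data_engineers", member_of)
indirect = is_group_member("user@example.com", "metastore_admins", member_of)
outsider = is_group_member("other@example.com", "metastore_admins", member_of)
```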

In [0]:
SELECT is_account_group_member('metastore_admins');

is_account_group_member(metastore_admins)
False


Let us create the function **mrn_mask** to redact the **mrn** column.

In [0]:
CREATE OR REPLACE FUNCTION mrn_mask(mrn STRING)
  RETURN CASE WHEN is_account_group_member('metastore_admins') 
    THEN mrn 
    ELSE 'REDACTED' 
  END;

#### 2.1.3 Alter the Table to Apply the Mask
Let us alter the **silver** table to apply the mask function to redact the mrn column.

In [0]:
ALTER TABLE silver 
  ALTER COLUMN mrn 
  SET MASK mrn_mask;

#### 2.1.4 Query the Table with Masking
Let us analyze the **silver** table after applying the column mask. Notice that the **mrn** column is now redacted since you are not part of the group.

In [0]:
SELECT * 
FROM silver;

device_id,mrn,name,time,heartrate
23,REDACTED,Nicholas Spears,2020-02-01T00:01:58Z,54.0122153343
17,REDACTED,Lynn Russell,2020-02-01T00:02:55Z,92.5136468131
37,REDACTED,Samuel Hughes,2020-02-01T00:08:58Z,52.1354807863
23,REDACTED,Nicholas Spears,2020-02-01T00:16:51Z,54.6477014191
17,REDACTED,Lynn Russell,2020-02-01T00:18:08Z,95.033344842
37,REDACTED,Samuel Hughes,2020-02-01T00:23:58Z,57.3391541312
23,REDACTED,Nicholas Spears,2020-02-01T00:31:58Z,56.6165053697
17,REDACTED,Lynn Russell,2020-02-01T00:32:56Z,94.8134313932
37,REDACTED,Samuel Hughes,2020-02-01T00:38:54Z,56.2469995332
23,REDACTED,Nicholas Spears,2020-02-01T00:46:57Z,54.8372685558


#### 2.1.5 Alter the Table to Drop the Mask
Let us alter the **silver** table to drop the mask function.

In [0]:
ALTER TABLE silver 
  ALTER COLUMN mrn DROP MASK;

#### 2.1.6 Query the Table after Removing the Mask
Let us analyze the silver table after removing the column mask. Notice that the **mrn** column is not redacted anymore.

In [0]:
SELECT * 
FROM silver;

device_id,mrn,name,time,heartrate
23,40580129,Nicholas Spears,2020-02-01T00:01:58Z,54.0122153343
17,52804177,Lynn Russell,2020-02-01T00:02:55Z,92.5136468131
37,65300842,Samuel Hughes,2020-02-01T00:08:58Z,52.1354807863
23,40580129,Nicholas Spears,2020-02-01T00:16:51Z,54.6477014191
17,52804177,Lynn Russell,2020-02-01T00:18:08Z,95.033344842
37,65300842,Samuel Hughes,2020-02-01T00:23:58Z,57.3391541312
23,40580129,Nicholas Spears,2020-02-01T00:31:58Z,56.6165053697
17,52804177,Lynn Russell,2020-02-01T00:32:56Z,94.8134313932
37,65300842,Samuel Hughes,2020-02-01T00:38:54Z,56.2469995332
23,40580129,Nicholas Spears,2020-02-01T00:46:57Z,54.8372685558


#### 2.1.7 Drop the Mask Function
Let us drop the **mrn_mask** function that we had created earlier.

In [0]:
DROP FUNCTION IF EXISTS mrn_mask;

### 2.2 Row Filtering
Let us implement row filtering on the **silver** table and analyze it.

#### 2.2.1 Query the Table before Row Filtering
View the **silver** table with the **device_id** sorted. Notice that *30* rows are returned with **device_id** values ranging from *17* to *37*.

In [0]:
SELECT * 
FROM silver
ORDER BY device_id DESC;

device_id,mrn,name,time,heartrate
37,65300842,Samuel Hughes,2020-02-01T00:08:58Z,52.1354807863
37,65300842,Samuel Hughes,2020-02-01T00:23:58Z,57.3391541312
37,65300842,Samuel Hughes,2020-02-01T00:38:54Z,56.2469995332
37,65300842,,2020-02-01T00:08:58Z,1000052.1354807864
37,65300842,Samuel Hughes,2020-02-01T00:23:58Z,7.0
37,65300842,Samuel Hughes,2020-02-01T00:38:54Z,9000.0
37,65300842,,2020-02-01T00:08:58Z,1000052.1354807864
37,65300842,Samuel Hughes,2020-02-01T00:23:58Z,90.0
37,65300842,Samuel Hughes,2020-02-01T00:38:54Z,30.0
23,40580129,Nicholas Spears,2020-02-01T00:01:58Z,54.0122153343


#### 2.2.2 Create a Function to Perform Row Filtering


Check to see if you are a member of *admin*. View the results. Notice you are not part of *admin*.

In [0]:
SELECT is_account_group_member('admin')

is_account_group_member(admin)
False


Let us create a function **device_filter** that, for users who are not part of the group *admin*, keeps only rows whose **device_id** is less than 30 and filters out all other rows.

In [0]:
CREATE OR REPLACE FUNCTION device_filter(device_id INT)
  RETURN IF(IS_ACCOUNT_GROUP_MEMBER('admin'), true, device_id < 30);

#### 2.2.3 Alter the Table to Apply the Row Filter
Let us alter the **silver** table to apply the row filter function so that only rows whose device_id is less than 30 are returned.

In [0]:
ALTER TABLE silver 
SET ROW FILTER device_filter ON (device_id);

#### 2.2.4 Query the Table with Row Filtering
Let us analyze the **silver** table after applying the row filter. Notice only *21* rows are returned where **device_id** values are less than *30*.

In [0]:
SELECT * 
FROM silver
ORDER BY device_id DESC;

device_id,mrn,name,time,heartrate
23,40580129,Nicholas Spears,2020-02-01T00:01:58Z,54.0122153343
23,40580129,Nicholas Spears,2020-02-01T00:16:51Z,54.6477014191
23,40580129,Nicholas Spears,2020-02-01T00:31:58Z,56.6165053697
23,40580129,Nicholas Spears,2020-02-01T00:46:57Z,54.8372685558
23,40580129,,2020-02-01T00:01:58Z,54.0122153343
23,40580129,,2020-02-01T00:16:51Z,54.6477014191
23,40580129,Nicholas Spears,2020-02-01T00:31:58Z,6.0
23,40580129,Nicholas Spears,2020-02-01T00:46:57Z,66.0
23,40580129,,2020-02-01T00:01:58Z,54.0122153343
23,40580129,,2020-02-01T00:16:51Z,54.6477014191


#### 2.2.5 Alter the Table to Drop the Row Filter
Let us alter the **silver** table to drop the row filter function.

In [0]:
ALTER TABLE silver DROP ROW FILTER;

#### 2.2.6 Query the Table after Removing the Row Filter
Let us analyze the **silver** table after removing the row filter. Notice that all 30 rows are returned.

In [0]:
SELECT * 
FROM silver
ORDER BY device_id DESC

device_id,mrn,name,time,heartrate
37,65300842,Samuel Hughes,2020-02-01T00:08:58Z,52.1354807863
37,65300842,Samuel Hughes,2020-02-01T00:23:58Z,57.3391541312
37,65300842,Samuel Hughes,2020-02-01T00:38:54Z,56.2469995332
37,65300842,,2020-02-01T00:08:58Z,1000052.1354807864
37,65300842,Samuel Hughes,2020-02-01T00:23:58Z,7.0
37,65300842,Samuel Hughes,2020-02-01T00:38:54Z,9000.0
37,65300842,,2020-02-01T00:08:58Z,1000052.1354807864
37,65300842,Samuel Hughes,2020-02-01T00:23:58Z,90.0
37,65300842,Samuel Hughes,2020-02-01T00:38:54Z,30.0
23,40580129,Nicholas Spears,2020-02-01T00:01:58Z,54.0122153343


#### 2.2.7 Drop the Row Filter Function
Let us drop the **device_filter** function that we had created earlier.

In [0]:
DROP FUNCTION IF EXISTS device_filter;

## Task 3: Protecting Columns and Rows with Dynamic Views

Now, let's explore dynamic views, an alternative approach to securing rows and columns. While they have been available in Databricks for some time, they are no longer the primary method for controlling access to rows and columns. However, they still serve a purpose in certain scenarios.

We have seen that Unity Catalog's treatment of views provides the ability for views to protect access to tables; users can be granted access to views that manipulate, transform, or obscure data from a source table, without needing to provide direct access to the source table.

Dynamic views provide the ability to do fine-grained access control of columns and rows within a table, conditional on the principal running the query. Dynamic views are an extension to standard views that allow us to do things like:
* partially obscure column values or redact them entirely
* omit rows based on specific criteria

Access control with dynamic views is achieved through the use of functions within the definition of the view. These functions include:
* **`current_user()`**: returns the email address of the user querying the view
* **`is_account_group_member()`**: returns TRUE if the user querying the view is a member of the specified group
* **`is_member()`**: returns TRUE if the user querying the view is a member of the specified workspace-local group

Note: Databricks generally advises against using the **`is_member()`** function in production, since it references workspace-local groups and hence introduces a workspace dependency into a metastore that potentially spans multiple workspaces.

### 3.1 Redacting columns

Suppose we want everyone to be able to see aggregated data trends from the *gold* view, but we don't want to disclose patient PII to everyone. Let's redefine the view to redact the *mrn* and *name* columns, so that only members of *metastore_admins* can see them, using the **`is_account_group_member()`** function.

#### 3.1.1 Recreate the View
Let us recreate a **gold** view while redacting the mrn and name columns.

In [0]:
SELECT is_account_group_member('metastore_admins')

is_account_group_member(metastore_admins)
False


In [0]:
CREATE OR REPLACE VIEW vw_gold AS
SELECT
  CASE WHEN
    is_account_group_member('metastore_admins') THEN mrn 
    ELSE 'REDACTED'
  END AS mrn,
  CASE WHEN
    is_account_group_member('metastore_admins') THEN name
    ELSE 'REDACTED'
  END AS name,
  MEAN(heartrate) avg_heartrate,
  DATE_TRUNC("DD", time) date
FROM silver
GROUP BY mrn, name, DATE_TRUNC("DD", time);

#### 3.1.2 Re-issue Grant Access to View
We'll re-issue the grant since the above statement replaced the previous object and thus any grants applied directly on the object would have been lost.

In [0]:
GRANT SELECT ON VIEW vw_gold TO `account users`

#### 3.1.3 Query the View
Now let's query the view.

In [0]:
SELECT * 
FROM vw_gold

mrn,name,avg_heartrate,date
REDACTED,REDACTED,63.48006043547143,2020-02-01T00:00:00Z
REDACTED,REDACTED,54.329958376700006,2020-02-01T00:00:00Z
REDACTED,REDACTED,1327.5316620643855,2020-02-01T00:00:00Z
REDACTED,REDACTED,92.5136468131,2020-02-01T00:00:00Z
REDACTED,REDACTED,54.0142113348625,2020-02-01T00:00:00Z
REDACTED,REDACTED,1000052.1354807864,2020-02-01T00:00:00Z


Does this output surprise you?

As the owner of the view and table, you do not need any privileges to access these objects, yet the query returns redacted columns. This is because of the way the view is defined: its `CASE` expressions depend on group membership rather than ownership, so for anyone who is not a member of the **`metastore_admins`** group, including you, the *mrn* and *name* columns are redacted.

### 3.2 Restrict Rows

Now let's suppose we want a view that, rather than aggregating and redacting columns, simply filters out rows from the source. Let's apply the same **`is_account_group_member()`** function to create a view that, for users who are not members of *metastore_admins*, passes through only rows whose *device_id* is less than 30. Row filtering is done by applying the conditional as a **`WHERE`** clause.

#### 3.2.1 Recreate the View
Let us recreate a **gold** view while filtering out the rows from the source.

In [0]:
CREATE OR REPLACE VIEW vw_gold AS
SELECT
  mrn,
  time,
  device_id,
  heartrate
FROM silver
WHERE
  CASE WHEN
    is_account_group_member('metastore_admins') THEN TRUE
    ELSE device_id < 30
  END;

#### 3.2.2 Re-issue Grant Access to View
We'll re-issue the grant since the above statement replaced the previous object and thus any grants applied directly on the object would have been lost.

In [0]:
-- Re-issue the grant --
GRANT SELECT ON VIEW vw_gold TO `account users`

#### 3.2.3 Query the View
Now let's query the view.

In [0]:
SELECT * 
FROM vw_gold
ORDER BY device_id DESC

mrn,time,device_id,heartrate
40580129,2020-02-01T00:01:58Z,23,54.0122153343
40580129,2020-02-01T00:16:51Z,23,54.6477014191
40580129,2020-02-01T00:31:58Z,23,56.6165053697
40580129,2020-02-01T00:46:57Z,23,54.8372685558
40580129,2020-02-01T00:01:58Z,23,54.0122153343
40580129,2020-02-01T00:16:51Z,23,54.6477014191
40580129,2020-02-01T00:31:58Z,23,6.0
40580129,2020-02-01T00:46:57Z,23,66.0
40580129,2020-02-01T00:01:58Z,23,54.0122153343
40580129,2020-02-01T00:16:51Z,23,54.6477014191


Nine records are omitted. Those records contained values for *device_id* that were caught by the filter. Only members of **`metastore_admins`** would see those.

### 3.3 Data Masking
One final use case for dynamic views is data masking, or partially obscuring data. This is fairly common practice (for example, displaying the last 4 digits of a credit card number, or the last two digits of a phone number). Masking is similar in principle to redaction except we are displaying some of the data rather than displaying none of it. For this simple example, we'll use the *dbacademy_mask()* user-defined function, defined in the cell below, to mask the *mrn* column.
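As a rough illustration of partial masking, the following plain-Python sketch keeps a short prefix and replaces the rest with asterisks. It is an analogue written for this explanation, not the SQL function used in the notebook; the name `mask_keep_prefix` and its `visible` parameter are hypothetical.

```python
def mask_keep_prefix(value: str, visible: int = 2) -> str:
    """Keep the first `visible` characters and replace the rest with '*'."""
    hidden = max(len(value) - visible, 0)  # never a negative repeat count
    return value[:visible] + "*" * hidden


masked = mask_keep_prefix("40580129")
short = mask_keep_prefix("7")
```

The masked value keeps the original length, so a reader can still tell records apart by prefix without seeing the full identifier.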

#### 3.3.1 Recreate the View
Let us recreate a **gold** view and apply data masking on the **mrn** column. We'll also re-issue the grant since the below statement will replace the previous object and thus any grants applied previously on the object will be lost.

In [0]:
-- Create function
CREATE OR REPLACE FUNCTION dbacademy_mask(x STRING)
  RETURNS STRING
  RETURN CONCAT(LEFT(x, 2) , REPEAT("*", LENGTH(x) - 2));


-- Create view
CREATE OR REPLACE VIEW vw_gold AS
SELECT
  CASE WHEN
    is_account_group_member('metastore_admins') THEN mrn
    ELSE dbacademy_mask(mrn)
  END AS mrn,
  time,
  device_id,
  heartrate
FROM silver
WHERE
  CASE WHEN
    is_account_group_member('metastore_admins') THEN TRUE
    ELSE device_id < 30
  END;


-- Re-issue the grant --
GRANT SELECT ON VIEW vw_gold TO `account users`;

#### 3.3.2 Query the View
Now let's query the view.

In [0]:
SELECT * 
FROM vw_gold
ORDER BY device_id DESC;

mrn,time,device_id,heartrate
40******,2020-02-01T00:01:58Z,23,54.0122153343
40******,2020-02-01T00:16:51Z,23,54.6477014191
40******,2020-02-01T00:31:58Z,23,56.6165053697
40******,2020-02-01T00:46:57Z,23,54.8372685558
40******,2020-02-01T00:01:58Z,23,54.0122153343
40******,2020-02-01T00:16:51Z,23,54.6477014191
40******,2020-02-01T00:31:58Z,23,6.0
40******,2020-02-01T00:46:57Z,23,66.0
40******,2020-02-01T00:01:58Z,23,54.0122153343
40******,2020-02-01T00:16:51Z,23,54.6477014191


For us, all values in the **mrn** column will be masked.

## Conclusion
In this demo, we delved into the functionalities of Databricks' metastore, emphasizing fine-grained access control through techniques such as column masking, row filtering, and dynamic views. By analyzing the structure and components of the metastore, implementing SQL queries, and managing permissions on data objects, we gained a comprehensive understanding of metadata management and security measures. Through practical exercises, we learned to implement row and column security techniques, including creating user-defined functions for masking and filtering, and designing dynamic views for conditional access control based on user identity or group membership. This hands-on exploration equipped us with essential skills to effectively manage metadata and enforce security policies to safeguard sensitive data within a Databricks environment.


&copy; 2025 Databricks, Inc. All rights reserved. Apache, Apache Spark, Spark, the Spark Logo, Apache Iceberg, Iceberg, and the Apache Iceberg logo are trademarks of the <a href="https://www.apache.org/" target="blank">Apache Software Foundation</a>.<br/>
<br/><a href="https://databricks.com/privacy-policy" target="blank">Privacy Policy</a> | 
<a href="https://databricks.com/terms-of-use" target="blank">Terms of Use</a> | 
<a href="https://help.databricks.com/" target="blank">Support</a>
