## Managing Data Security in the Hive Metastore

In Databricks, the Hive metastore has traditionally been used to manage metadata and enforce data governance. It acts as a workspace-local metadata store for tables, columns, partitions, and schemas. Although it is limited to a single catalog (`hive_metastore`), it supports access control at several levels of the object hierarchy, which makes it a key part of data security.

### Access Control in the Hive Metastore
Data access is managed through SQL-based commands such as `GRANT`, `REVOKE`, and `DENY`, allowing administrators to define who can view, modify, or manage data objects.

**Granting Permissions**

- Use the `GRANT` statement to assign privileges on a data object to a principal (a user or a group):
- `GRANT <privilege> ON <object-type> <object-name> TO <principal>;`

### Object Privileges
| Privilege        | Description                                                      |
| ---------------- | ---------------------------------------------------------------- |
| `SELECT`         | Read/query data                                                  |
| `MODIFY`         | Insert, update, and delete data                                  |
| `CREATE`         | Create new objects (tables, views)                               |
| `READ_METADATA`  | View object structure and metadata                               |
| `USAGE`          | Required for using other privileges but has no effect on its own |
| `ALL PRIVILEGES` | Grants all of the above in one command                           |
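
Because `USAGE` unlocks nothing by itself, read-only access to a single table typically takes two grants. The following is a minimal sketch, assuming a hypothetical group named `report_readers` and the `retail_db.products` table created in the walkthrough at the end of this section:

```sql
-- Assumption: `report_readers` is a hypothetical group; substitute a real principal.
-- USAGE on the schema is the prerequisite; it exposes no data by itself.
GRANT USAGE ON SCHEMA hive_metastore.retail_db TO `report_readers`;

-- SELECT on the table is what actually allows queries against it.
GRANT SELECT ON TABLE hive_metastore.retail_db.products TO `report_readers`;
```

With only the second statement, the group would hold `SELECT` but still lack the `USAGE` prerequisite on the schema.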

<br>

---

### Types of Securable Objects
Permissions can be applied at different levels:

| Object Type | Scope Description                             |
| ----------- | --------------------------------------------- |
| CATALOG     | Access to the entire `hive_metastore` catalog |
| SCHEMA      | Access to a specific database/schema          |
| TABLE       | Access to managed or external tables          |
| VIEW        | Access to SQL views                           |
| FUNCTION    | Access to user-defined functions              |
| ANY FILE    | Access to underlying file storage             |


**Example:**

```
GRANT MODIFY ON TABLE product TO `user1@example.com`;
GRANT CREATE ON SCHEMA retaildb TO `user1@example.com`;
GRANT ALL PRIVILEGES ON SCHEMA retaildb TO `user1@example.com`;
```
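
File-level access sits outside the schema and table hierarchy, so it is granted on `ANY FILE` rather than on a named object. A hedged sketch, reusing the illustrative `user1@example.com` principal from the example above:

```sql
-- Assumption: user1@example.com is an illustrative principal, not a real account.
-- Lets the user read data files directly from storage, which can bypass
-- table- and view-level restrictions, so it is usually reserved for trusted users.
GRANT SELECT ON ANY FILE TO `user1@example.com`;

-- Review who currently holds file-level access.
SHOW GRANTS ON ANY FILE;
```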

### Role-Based Access Control (RBAC)
Privileges in the Hive metastore can be granted only by administrators and by the owners of the relevant objects:

| Role                         | Can Grant Access To                            |
| ---------------------------- | ---------------------------------------------- |
| **Databricks Administrator** | All objects in the catalog + file system       |
| **Catalog Owner**            | All objects within the catalog                 |
| **Database Owner**           | All objects within a specific schema           |
| **Object Owner**             | Only the object (table/view/function) they own |
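
Since the right to grant access follows ownership, reassigning an owner changes who can manage an object. The sketch below uses `ALTER ... OWNER TO`; the principals `admin1@example.com` and `user1@example.com` are illustrative assumptions, and the objects are the ones created in the walkthrough at the end of this section:

```sql
-- Assumption: both principals are placeholders; substitute real accounts.
-- Make admin1 the owner of the schema.
ALTER SCHEMA hive_metastore.retail_db OWNER TO `admin1@example.com`;

-- Hand ownership of a single table to an individual user.
ALTER TABLE hive_metastore.retail_db.products OWNER TO `user1@example.com`;
```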


### Understanding the Hierarchy
1. **Catalog** (`hive_metastore`)
    - The root of the object tree. Only one catalog exists in the Hive metastore.
1. **Schema (Database)**
    - Logical grouping of objects like tables, views, and functions.
1. **Tables / Views / Functions**
    - Objects inside schemas where privileges can be assigned individually.
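
Privileges granted on a parent object are inherited by the objects beneath it, so a single schema-level grant covers the tables and views in that schema. A minimal sketch, assuming a hypothetical `sales_readers` group:

```sql
-- Assumption: `sales_readers` is a hypothetical group.
-- One schema-level grant applies to the tables and views inside retail_db.
GRANT USAGE, SELECT ON SCHEMA hive_metastore.retail_db TO `sales_readers`;

-- Inspect the grants that apply to the group on one specific table in that schema.
SHOW GRANTS `sales_readers` ON TABLE hive_metastore.retail_db.products;
```

Granting at the schema level keeps the number of statements small; table-level grants are only needed where access should differ from the rest of the schema.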

### Advanced Privilege Management
**🔓 REVOKE**

- Removes a privilege that was previously granted to, or denied for, a principal:
- `REVOKE <privilege> ON <object-type> <object-name> FROM <principal>;`

**🚫 DENY**

- Explicitly blocks a privilege for a principal, regardless of any grants that would otherwise allow it:
- `DENY <privilege> ON <object-type> <object-name> TO <principal>;`

**📋 SHOW GRANTS**

- Displays all privileges assigned on an object:
- `SHOW GRANTS ON <object-type> <object-name>;`
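
How these statements interact is easiest to see in sequence. The following is a sketch rather than a prescribed workflow; it assumes a hypothetical `contractors` group and the `retail_db.products` table from the walkthrough that follows:

```sql
-- Assumption: `contractors` is a hypothetical group.
-- 1. Broad read access through the schema.
GRANT USAGE, SELECT ON SCHEMA hive_metastore.retail_db TO `contractors`;

-- 2. Carve out one table: the DENY takes precedence over the schema-level SELECT.
DENY SELECT ON TABLE hive_metastore.retail_db.products TO `contractors`;

-- 3. Check what is recorded for the group on that table.
SHOW GRANTS `contractors` ON TABLE hive_metastore.retail_db.products;

-- 4. Lift the restriction: REVOKE clears the explicit DENY,
--    after which the inherited schema-level SELECT applies again.
REVOKE SELECT ON TABLE hive_metastore.retail_db.products FROM `contractors`;
```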


```
-- Drop any existing retail_db database, then recreate it at a fixed location
DROP DATABASE IF EXISTS hive_metastore.retail_db CASCADE;
CREATE DATABASE IF NOT EXISTS hive_metastore.retail_db
LOCATION 'dbfs:/tmp/db/retail_db.db';
```

```
-- Create the products table in the retail_db database
CREATE TABLE hive_metastore.retail_db.products
(product_id INT, product_name STRING, price DOUBLE, category STRING);

-- Insert sample data into the products table
INSERT INTO hive_metastore.retail_db.products
VALUES (101, 'Laptop', 850.00, 'Electronics'),
       (102, 'Smartphone', 650.00, 'Electronics'),
       (103, 'Coffee Maker', 120.00, 'Appliances'),
       (104, 'Air Fryer', 140.00, 'Appliances'),
       (105, 'Desk Lamp', 45.00, 'Furniture'),
       (106, 'Office Chair', 210.00, 'Furniture'),
       (107, 'Tablet', 430.00, 'Electronics');
```

```
-- Create a view to display only electronics products
CREATE VIEW hive_metastore.retail_db.electronics_products_vw
AS SELECT * FROM hive_metastore.retail_db.products WHERE category = 'Electronics';
```

```
-- Grant SELECT, MODIFY, READ_METADATA, and CREATE privileges on the schema to the retail_analysts group
GRANT SELECT, MODIFY, READ_METADATA, CREATE
ON SCHEMA hive_metastore.retail_db TO retail_analysts;
```

```
-- Review the privileges assigned on the schema
SHOW GRANTS ON SCHEMA hive_metastore.retail_db;
```

```
-- Users must have the USAGE privilege on the schema to perform any action on its objects
-- Grant USAGE on the schema to the retail_analysts group
GRANT USAGE ON SCHEMA hive_metastore.retail_db TO retail_analysts;
```

```
-- Grant SELECT privilege on the view to the user
GRANT SELECT
ON VIEW hive_metastore.retail_db.electronics_products_vw TO `user2@pankajacksgmail.onmicrosoft.com`;
```

```
-- Review the privileges assigned on the view
SHOW GRANTS ON VIEW hive_metastore.retail_db.electronics_products_vw;
```

```
-- Deny SELECT and MODIFY privileges on the table to the user
DENY SELECT, MODIFY ON
TABLE hive_metastore.retail_db.products
TO `user1@pankajacksgmail.onmicrosoft.com`;
```

```
-- Revoke SELECT and MODIFY on the table from the user
-- (this also clears the explicit DENY issued in the previous cell)
REVOKE SELECT, MODIFY ON
TABLE hive_metastore.retail_db.products
FROM `user1@pankajacksgmail.onmicrosoft.com`;
```
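
To confirm the effect of the `DENY` and `REVOKE` statements above, the privileges recorded for that user can be listed again. A short sketch using the same table and principal as the preceding cells:

```sql
-- List the privileges currently recorded for the user on the products table.
SHOW GRANTS `user1@pankajacksgmail.onmicrosoft.com`
ON TABLE hive_metastore.retail_db.products;
```

Running this after each step is a simple way to check that a grant, deny, or revoke had the intended result.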