## Gravitino access control

This demo authorizes the Hive catalog through Gravitino and then queries that Hive data source with Spark, showing that each user operation is checked and either allowed or denied.
The demo runs as the user `root`. You can log in to the Apache Ranger admin service to inspect the resulting permissions.

+ Apache Ranger admin service: http://localhost:6080/ (login user name `admin`, password `rangerR0cks!`)
+ Apache Gravitino access control document: https://gravitino.apache.org/docs/latest/security/access-control

### Initialize PySpark

In [None]:
import pyspark
from pyspark.sql import SparkSession

spark = SparkSession.builder \
    .appName("PySpark SQL Example") \
    .config("spark.plugins", "org.apache.gravitino.spark.connector.plugin.GravitinoSparkPlugin") \
    .config("spark.sql.gravitino.uri", "http://gravitino:8090") \
    .config("spark.sql.gravitino.metalake", "metalake_demo") \
    .config("spark.sql.gravitino.enableIcebergSupport", "true") \
    .config("spark.sql.catalog.catalog_rest", "org.apache.iceberg.spark.SparkCatalog") \
    .config("spark.sql.catalog.catalog_rest.type", "rest") \
    .config("spark.sql.catalog.catalog_rest.uri", "http://gravitino:9001/iceberg/") \
    .config("spark.locality.wait.node", "0") \
    .config("spark.sql.warehouse.dir", "hdfs://hive:9000/user/hive/warehouse") \
    .config("spark.jars", "/opt/spark/jars/*") \
    .config("spark.driver.extraClassPath", "/opt/spark/conf") \
    .config("spark.sql.extensions", "org.apache.kyuubi.plugin.spark.authz.ranger.RangerSparkExtension") \
    .enableHiveSupport() \
    .getOrCreate()

### Show the databases under `catalog_hive`

In [None]:
spark.sql("use catalog_hive")
spark.sql("show databases").show()

### Any user has permission to create databases, but not to create tables

In [None]:
spark.sql("CREATE DATABASE IF NOT EXISTS access_control;")
spark.sql("use catalog_hive")
spark.sql("show databases").show()

In [None]:
from py4j.protocol import Py4JJavaError

try:
    spark.sql("USE access_control;")
    spark.sql("CREATE TABLE IF NOT EXISTS employees (id INT, name STRING, age INT) PARTITIONED BY (department STRING) STORED AS PARQUET;")
except Py4JJavaError as e:
    print("An error occurred: ", e.java_exception)

### Grant the Spark execution user `root` permission to create tables

#### Add the Spark execution user `root` to Gravitino
+ https://gravitino.apache.org/docs/0.6.0-incubating/security/access-control#add-a-user

In [None]:
import requests
import json

headers = {
    'Accept': 'application/vnd.gravitino.v1+json',
    'Content-Type': 'application/json',
}

data = {
  "name": "root"
}

response = requests.post('http://gravitino:8090/api/metalakes/metalake_demo/users', headers=headers, data=json.dumps(data))

# Print the response body returned by Gravitino
print(response.text)

#### Create a role with the `CREATE_TABLE`, `MODIFY_TABLE`, and `SELECT_TABLE` privileges on `catalog_hive.access_control`
+ https://gravitino.apache.org/docs/0.6.0-incubating/security/access-control#create-a-role

In [None]:
import requests
import json

url = "http://gravitino:8090/api/metalakes/metalake_demo/roles"
headers = {
    "Accept": "application/vnd.gravitino.v1+json",
    "Content-Type": "application/json",
}
data = {
    "name": "role1",
    "properties": {"k1": "v1"},
    "securableObjects": [
        {
            "fullName": "catalog_hive.access_control",
            "type": "SCHEMA",
            "privileges": [
                {
                    "name": "CREATE_TABLE",
                    "condition": "ALLOW"
                },
                {
                    "name": "MODIFY_TABLE",
                    "condition": "ALLOW"
                },
                {
                    "name": "SELECT_TABLE",
                    "condition": "ALLOW"
                }
            ]    
        }
    ]
}

response = requests.post(url, headers=headers, data=json.dumps(data))

# Print the response body
print(response.text)
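The role request body has a regular shape: one securable object with a list of privileges that all use the `ALLOW` condition. A small sketch of a builder for that body follows. The function name `build_role_payload` is hypothetical; the field names are copied from the request in this notebook, not from a separate API reference.

```python
import json


def build_role_payload(role_name, full_name, object_type, privileges, properties=None):
    """Build a role request body with one securable object.

    Every privilege name in `privileges` is granted with condition ALLOW,
    mirroring the hand-written request body used in this notebook.
    """
    return {
        "name": role_name,
        "properties": properties or {},
        "securableObjects": [
            {
                "fullName": full_name,
                "type": object_type,
                "privileges": [
                    {"name": name, "condition": "ALLOW"} for name in privileges
                ],
            }
        ],
    }


payload = build_role_payload(
    "role1",
    "catalog_hive.access_control",
    "SCHEMA",
    ["CREATE_TABLE", "MODIFY_TABLE", "SELECT_TABLE"],
    properties={"k1": "v1"},
)
print(json.dumps(payload, indent=2))
```

The resulting dictionary can be passed to `requests.post(..., data=json.dumps(payload))` in place of the literal body.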

#### Grant the role to the Spark execution user `root`
+ https://gravitino.apache.org/docs/0.6.0-incubating/security/access-control#grant-roles-to-a-user

In [None]:
import requests
import json

url = "http://gravitino:8090/api/metalakes/metalake_demo/permissions/users/root/grant"
headers = {
    "Accept": "application/vnd.gravitino.v1+json",
    "Content-Type": "application/json",
}
data = {
    "roleNames": ["role1"]
}

response = requests.put(url, headers=headers, data=json.dumps(data))

# print status code and response text
print(response.status_code)
print(response.text)
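Printing the raw response works, but a one-line summary is easier to scan when several requests run in sequence. The sketch below is an illustration only: `describe_response` is a hypothetical helper, and it assumes that a JSON body may carry `code`, `type`, and `message` fields, with `code` equal to `0` on success. Check the actual responses in your environment before relying on those field names.

```python
import json


def describe_response(status_code, body_text):
    """Summarize an HTTP response in one line.

    Assumes (not verified here) that a JSON body may contain the fields
    "code", "type" and "message", with code 0 meaning success.
    """
    try:
        body = json.loads(body_text)
    except ValueError:
        return f"HTTP {status_code}: non-JSON body"
    if not isinstance(body, dict):
        return f"HTTP {status_code}: unexpected JSON body"
    code = body.get("code")
    if 200 <= status_code < 300 and code in (0, None):
        return f"HTTP {status_code}: ok"
    return (
        f"HTTP {status_code}: error code={code} "
        f"type={body.get('type')} message={body.get('message')}"
    )
```

With the `response` object from the cell above, it would be called as `print(describe_response(response.status_code, response.text))`.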

#### Because of a bug in Gravitino 0.6.0, you must manually add table=`*` to the `catalog_hive.access_control` Ranger policy
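The manual step is done in the Ranger admin UI by editing the policy and adding `*` as the table resource. To make the intended change concrete, here is a sketch of the same edit applied to a policy represented as a Python dictionary. It assumes the policy follows Ranger's JSON layout, where `resources` maps a resource name such as `database` or `table` to an object with a `values` list. The function name `with_table_wildcard` and the sample policy are hypothetical, and the sketch does not call Ranger at all.

```python
import copy


def with_table_wildcard(policy):
    """Return a copy of a policy dict whose table resource is '*'.

    Assumption: the policy uses a `resources` map keyed by resource name,
    each entry holding a `values` list. An existing non-empty table entry
    is left untouched.
    """
    updated = copy.deepcopy(policy)
    resources = updated.setdefault("resources", {})
    table = resources.get("table") or {}
    if not table.get("values"):
        resources["table"] = {
            "values": ["*"],
            "isExcludes": False,
            "isRecursive": False,
        }
    return updated


# Hypothetical policy fragment, for illustration only
sample_policy = {"resources": {"database": {"values": ["access_control"]}}}
print(with_table_wildcard(sample_policy)["resources"])
```

The input is copied rather than modified, so the original policy can still be compared against the updated one before any change is saved.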

#### The Spark execution user `root` now has permission to create tables under `catalog_hive.access_control`

In [None]:
from py4j.protocol import Py4JJavaError

try:
    spark.sql("use catalog_hive")
    spark.sql("use access_control;")
    spark.sql("create table customers (customer_id int, customer_name varchar(100), customer_email varchar(100));")
    spark.sql("show tables in access_control").show()
except Py4JJavaError as e:
    print("An error occurred: ", e.java_exception)

In [None]:
spark.sql("use catalog_hive")
spark.sql("use access_control;")
spark.sql("insert into customers (customer_id, customer_name, customer_email) values (11,'Rory Brown','rory@123.com');")
spark.sql("insert into customers (customer_id, customer_name, customer_email) values (12,'Jerry Washington','jerry@dt.com');")
spark.sql("select * from customers").show()

#### Delete the role; `root` no longer has permission to access the `customers` table
+ https://gravitino.apache.org/docs/0.6.0-incubating/security/access-control#delete-a-role

In [None]:
import requests

headers = {
    'Accept': 'application/vnd.gravitino.v1+json',
    'Content-Type': 'application/json',
}

response = requests.delete('http://gravitino:8090/api/metalakes/metalake_demo/roles/role1', headers=headers)

# Print the response body returned by Gravitino
print(response.text)

In [None]:
spark.sql("show tables in access_control").show()

In [None]:
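from py4j.protocol import Py4JJavaError
from pyspark.sql.utils import AnalysisException

try:
    spark.sql("use catalog_hive")
    spark.sql("use access_control;")
    # Query the table created earlier; the role that allowed access is now deleted
    spark.sql("SELECT * from customers").show()
except (AnalysisException, Py4JJavaError) as e:
    # The denial may surface as either exception type
    print("An error occurred: ", e)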