6 changes: 6 additions & 0 deletions CHANGELOG.md
@@ -4,6 +4,12 @@ All notable changes to this project will be documented in this file.

## [Unreleased]

### Added

- Add OPA authorization using the operator-rs `OpaConfig` ([#652]).

[#652]: https://github.com/stackabletech/hive-operator/pull/652

## [25.11.0] - 2025-11-07

## [25.11.0-rc1] - 2025-11-06
27 changes: 27 additions & 0 deletions deploy/helm/hive-operator/crds/crds.yaml
@@ -49,6 +49,33 @@ spec:
required:
- kerberos
type: object
authorization:
description: |-
Authorization options for Hive.
Learn more in the [Hive authorization usage guide](https://docs.stackable.tech/home/nightly/hive/usage-guide/security#authorization).
nullable: true
properties:
opa:
description: |-
Configure the OPA stacklet [discovery ConfigMap](https://docs.stackable.tech/home/nightly/concepts/service_discovery)
and the name of the Rego package containing your authorization rules.
Consult the [OPA authorization documentation](https://docs.stackable.tech/home/nightly/concepts/opa)
to learn how to deploy Rego authorization rules with OPA.
nullable: true
properties:
configMapName:
description: |-
The [discovery ConfigMap](https://docs.stackable.tech/home/nightly/concepts/service_discovery)
for the OPA stacklet that should be used for authorization requests.
type: string
package:
description: The name of the Rego package containing the Rego rules for the product.
nullable: true
type: string
required:
- configMapName
type: object
type: object
database:
description: Database connection specification for the metadata database.
properties:
115 changes: 115 additions & 0 deletions docs/modules/hive/pages/usage-guide/security.adoc
@@ -45,3 +45,118 @@ The `kerberos.secretClass` is used to give Hive the possibility to request keyta
=== 5. Access Hive
To access Hive, we recommend starting a client Pod that connects to Hive rather than shelling into the master.
We have an https://github.com/stackabletech/hive-operator/blob/main/tests/templates/kuttl/kerberos/70-install-access-hive.yaml.j2[integration test] for this exact purpose, where you can see how to connect and get a valid keytab.


== Authorization
The Stackable Operator for Apache Hive supports the following authorization methods.

=== Open Policy Agent (OPA)
The Apache Hive metastore can be configured to delegate authorization decisions to an Open Policy Agent (OPA) instance.
More information on the setup and configuration of OPA can be found in the xref:opa:index.adoc[OPA Operator documentation].
To enable OPA authorization for a Hive cluster, add this section to the HiveCluster resource:

[source,yaml]
----
spec:
  clusterConfig:
    authorization:
      opa:
        configMapName: opa # <1>
        package: hms # <2>
----
<1> The name of the discovery ConfigMap of your OPA Stacklet (`opa` in this case).
<2> The Rego package to use for policy decisions.
This is optional and defaults to the name of the Hive Stacklet.
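
The fallback described in callout <2> can be pictured with a minimal Rust sketch. This is an illustration only, not the operator's implementation: `OpaAuthorization` and `effective_package` are hypothetical names, and the struct merely mirrors the two fields of the `authorization.opa` section.

```rust
/// Hypothetical mirror of the `authorization.opa` section
/// (an assumption for illustration, not the operator-rs `OpaConfig` type).
struct OpaAuthorization {
    config_map_name: String,
    package: Option<String>,
}

/// Returns the Rego package to query: the explicit `package` if it is set,
/// otherwise the name of the Hive Stacklet.
fn effective_package<'a>(opa: &'a OpaAuthorization, hive_stacklet_name: &'a str) -> &'a str {
    opa.package.as_deref().unwrap_or(hive_stacklet_name)
}

fn main() {
    let explicit = OpaAuthorization {
        config_map_name: "opa".to_string(),
        package: Some("hms".to_string()),
    };
    let implicit = OpaAuthorization {
        config_map_name: "opa".to_string(),
        package: None,
    };

    // The Hive Stacklet in this sketch is assumed to be named `hive`.
    println!("{}: {}", explicit.config_map_name, effective_package(&explicit, "hive")); // prints "opa: hms"
    println!("{}: {}", implicit.config_map_name, effective_package(&implicit, "hive")); // prints "opa: hive"
}
```

In practice this means that a Rego bundle for a Stacklet named `hive` without an explicit `package` would need to declare `package hive`.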

==== Defining Rego rules
For a general explanation of how rules are written, please refer to the {opa-rego-docs}[OPA documentation].
Authorization with OPA is done using the https://github.com/boschglobal/hive-metastore-opa-authorizer[hive-metastore-opa-authorizer] plugin.

===== OPA Inputs
The payload that Hive sends to OPA with each request is available in Rego rules as `input` and has the following structure:

[source,json]
----
{
  "identity": {
    "username": "<user>",
    "groups": ["<group1>", "<group2>"]
  },
  "resources": {
    "database": null,
    "table": null,
    "partition": null,
    "columns": ["col1", "col2"]
  },
  "privileges": {
    "readRequiredPriv": [],
    "writeRequiredPriv": [],
    "inputs": null,
    "outputs": null
  }
}
----
* `identity`: Contains user information.
** `username`: The name of the user.
** `groups`: A list of groups the user belongs to.
* `resources`: Specifies the resources involved in the request.
** `database`: The database object.
** `table`: The table object.
** `partition`: The partition object.
** `columns`: A list of column names involved in the request.
* `privileges`: Details the privileges required for the request.
** `readRequiredPriv`: A list of required read privileges.
** `writeRequiredPriv`: A list of required write privileges.
** `inputs`: Input tables for the request.
** `outputs`: Output tables for the request.

===== Example OPA Rego Rule
Below is a basic Rego policy that demonstrates how to evaluate the input document sent from the Hive authorizer to OPA:

[source,rego]
----
package hms

default database_allow = false
default table_allow = false
default column_allow = false
default partition_allow = false
default user_allow = false

database_allow if {
  input.identity.username == "stackable"
  input.resources.database.name == "test_db"
}

table_allow if {
  input.identity.username == "stackable"
  input.resources.table.dbName == "test_db"
  input.resources.table.tableName == "test_table"
  input.privileges.readRequiredPriv[0].priv == "SELECT"
}

table_allow if {
  input.identity.username == "stackable"
  input.resources.table.dbName == "test_db"
  input.privileges.writeRequiredPriv[0].priv == "CREATE"
}
----
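* `database_allow` grants access if the user is `stackable` and the database is `test_db`.
* `table_allow` grants access if the user is `stackable`, the table's database is `test_db`, and either:
** the table is `test_table` and the first required read privilege is `SELECT`, or
** the first required write privilege is `CREATE`, regardless of the table name.
* `column_allow`, `partition_allow` and `user_allow` have no allow rule in this example and therefore keep their default of `false`.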
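
For readers less familiar with Rego, the decision logic above can be paraphrased in Rust. This is a sketch under stated assumptions, not part of the authorizer plugin: the `Input` and `Table` types are hypothetical and model only the fields the example rules read. It shows two Rego semantics: expressions inside one rule body are combined with AND, while several bodies for the same rule name are combined with OR. A missing (`null`) table makes every `table_allow` body undefined, so the default `false` applies.

```rust
/// Hypothetical, simplified view of the OPA input document.
/// Only the fields read by the example rules are modelled.
struct Input<'a> {
    username: &'a str,                  // input.identity.username
    database_name: Option<&'a str>,     // input.resources.database.name
    table: Option<Table<'a>>,           // input.resources.table
    read_required_priv: Vec<&'a str>,   // input.privileges.readRequiredPriv[_].priv
    write_required_priv: Vec<&'a str>,  // input.privileges.writeRequiredPriv[_].priv
}

struct Table<'a> {
    db_name: &'a str,    // dbName
    table_name: &'a str, // tableName
}

/// Single rule body: all expressions must hold.
fn database_allow(input: &Input) -> bool {
    input.username == "stackable" && input.database_name == Some("test_db")
}

/// Two rule bodies with the same name: either one may grant access.
fn table_allow(input: &Input) -> bool {
    // A null table leaves both bodies undefined, so the default (false) wins.
    let Some(table) = &input.table else {
        return false;
    };
    let common = input.username == "stackable" && table.db_name == "test_db";
    // First body: specific table, first required read privilege is SELECT.
    let read_rule = table.table_name == "test_table"
        && input.read_required_priv.first().copied() == Some("SELECT");
    // Second body: first required write privilege is CREATE, any table name.
    let write_rule = input.write_required_priv.first().copied() == Some("CREATE");
    common && (read_rule || write_rule)
}

fn main() {
    let select_request = Input {
        username: "stackable",
        database_name: Some("test_db"),
        table: Some(Table { db_name: "test_db", table_name: "test_table" }),
        read_required_priv: vec!["SELECT"],
        write_required_priv: vec![],
    };
    println!("database_allow: {}", database_allow(&select_request));
    println!("table_allow: {}", table_allow(&select_request));
}
```

Note that, like the Rego rules, the sketch inspects only the first entry of each privilege list; a request whose first required read privilege is something other than `SELECT` is denied by the first rule body even if `SELECT` appears later in the list.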

==== Configuring policy URLs

The policy URLs for `database_allow`, `table_allow`, `column_allow`, `partition_allow`, and `user_allow` can be changed with config overrides for the following properties in `hive-site.xml`:

* `com.bosch.bdps.opa.authorization.policy.url.database`
* `com.bosch.bdps.opa.authorization.policy.url.table`
* `com.bosch.bdps.opa.authorization.policy.url.column`
* `com.bosch.bdps.opa.authorization.policy.url.partition`
* `com.bosch.bdps.opa.authorization.policy.url.user`
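
As a rough illustration of what such overrides might contain, the following Rust sketch derives one URL per property. Several things here are assumptions rather than facts from the plugin or the operator: the helper name `policy_url_overrides` is hypothetical, the base URL `http://opa:8081` is a placeholder, and the URL layout assumes the plugin expects a full OPA Data API address of the form `<base>/v1/data/<package path>/<rule>`.

```rust
/// Builds (property name, policy URL) pairs for the five policy URL properties.
/// Hypothetical helper; the URL layout assumes OPA's Data API (`/v1/data/<path>`).
fn policy_url_overrides(opa_base_url: &str, package: &str) -> Vec<(String, String)> {
    let base = opa_base_url.trim_end_matches('/');
    // Dots in a Rego package name become slashes in the Data API path.
    let path = package.replace('.', "/");
    ["database", "table", "column", "partition", "user"]
        .iter()
        .map(|kind| {
            (
                format!("com.bosch.bdps.opa.authorization.policy.url.{kind}"),
                format!("{base}/v1/data/{path}/{kind}_allow"),
            )
        })
        .collect()
}

fn main() {
    // `http://opa:8081` is a placeholder base URL for this sketch.
    for (property, url) in policy_url_overrides("http://opa:8081", "hms") {
        println!("{property} = {url}");
    }
}
```

Each resulting pair would then be set as a config override for `hive-site.xml`; check the plugin documentation for the exact URL format it expects before relying on this layout.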

==== TLS secured OPA cluster

Stackable OPA clusters secured with TLS are supported without further configuration.
The Stackable Hive operator automatically adds the CA certificate from the SecretClass used to secure the OPA cluster to the Hive truststore.
89 changes: 89 additions & 0 deletions examples/hive-opa-cluster.yaml
@@ -0,0 +1,89 @@
# helm install postgresql oci://registry-1.docker.io/bitnamicharts/postgresql \
# --version 16.5.0 \
# --namespace default \
# --set image.repository=bitnamilegacy/postgresql \
# --set volumePermissions.image.repository=bitnamilegacy/os-shell \
# --set metrics.image.repository=bitnamilegacy/postgres-exporter \
# --set global.security.allowInsecureImages=true \
# --set auth.username=hive \
# --set auth.password=hive \
# --set auth.database=hive \
# --set primary.extendedConfiguration="password_encryption=md5" \
# --wait
---
apiVersion: hive.stackable.tech/v1alpha1
kind: HiveCluster
metadata:
  name: hive
spec:
  image:
    productVersion: 4.1.0
    pullPolicy: IfNotPresent
  clusterConfig:
    authorization:
      opa:
        configMapName: opa
        package: hms
    database:
      connString: jdbc:postgresql://postgresql:5432/hive
      credentialsSecret: hive-postgresql-credentials
      dbType: postgres
  metastore:
    roleGroups:
      default:
        replicas: 1
        config:
          resources:
            cpu:
              min: 300m
              max: "2"
            memory:
              limit: 5Gi
---
apiVersion: v1
kind: Secret
metadata:
  name: hive-postgresql-credentials
type: Opaque
stringData:
  username: hive
  password: hive
---
apiVersion: opa.stackable.tech/v1alpha1
kind: OpaCluster
metadata:
  name: opa
spec:
  image:
    productVersion: 1.8.0
  servers:
    config:
      logging:
        enableVectorAgent: false
        containers:
          opa:
            console:
              level: INFO
            file:
              level: INFO
            loggers:
              decision:
                level: INFO
    roleGroups:
      default: {}
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: hive-opa-bundle
  labels:
    opa.stackable.tech/bundle: "hms"
data:
  hive.rego: |
    package hms

    database_allow = true
    table_allow = true
    column_allow = true
    partition_allow = true
    user_allow = true
22 changes: 17 additions & 5 deletions rust/operator-binary/src/command.rs
@@ -1,16 +1,20 @@
 use stackable_operator::crd::s3;

-use crate::crd::{
-    DB_PASSWORD_ENV, DB_PASSWORD_PLACEHOLDER, DB_USERNAME_ENV, DB_USERNAME_PLACEHOLDER,
-    HIVE_METASTORE_LOG4J2_PROPERTIES, HIVE_SITE_XML, STACKABLE_CONFIG_DIR,
-    STACKABLE_CONFIG_MOUNT_DIR, STACKABLE_LOG_CONFIG_MOUNT_DIR, STACKABLE_TRUST_STORE,
-    STACKABLE_TRUST_STORE_PASSWORD, v1alpha1,
+use crate::{
+    config::opa::HiveOpaConfig,
+    crd::{
+        DB_PASSWORD_ENV, DB_PASSWORD_PLACEHOLDER, DB_USERNAME_ENV, DB_USERNAME_PLACEHOLDER,
+        HIVE_METASTORE_LOG4J2_PROPERTIES, HIVE_SITE_XML, STACKABLE_CONFIG_DIR,
+        STACKABLE_CONFIG_MOUNT_DIR, STACKABLE_LOG_CONFIG_MOUNT_DIR, STACKABLE_TRUST_STORE,
+        STACKABLE_TRUST_STORE_PASSWORD, v1alpha1,
+    },
 };

 pub fn build_container_command_args(
     hive: &v1alpha1::HiveCluster,
     start_command: String,
     s3_connection_spec: Option<&s3::v1alpha1::ConnectionSpec>,
+    hive_opa_config: Option<&HiveOpaConfig>,
 ) -> Vec<String> {
     let mut args = vec![
         // copy config files to a writeable empty folder in order to set s3 access and secret keys
@@ -51,6 +55,14 @@ pub fn build_container_command_args(
         }
     }

+    if let Some(opa) = hive_opa_config {
+        if let Some(ca_cert_dir) = opa.tls_ca_cert_mount_path() {
+            args.push(format!(
+                "cert-tools generate-pkcs12-truststore --pkcs12 {STACKABLE_TRUST_STORE}:{STACKABLE_TRUST_STORE_PASSWORD} --pem {ca_cert_dir}/ca.crt --out {STACKABLE_TRUST_STORE} --out-password {STACKABLE_TRUST_STORE_PASSWORD}"
+            ));
+        }
+    }
+
     // db credentials
     args.extend([
         format!("echo replacing {DB_USERNAME_PLACEHOLDER} and {DB_PASSWORD_PLACEHOLDER} with secret values."),
1 change: 1 addition & 0 deletions rust/operator-binary/src/config/mod.rs
@@ -1 +1,2 @@
pub mod jvm;
pub mod opa;