policy topic restructure (closes #111)
lisakowen authored and dyozie committed Apr 3, 2017
1 parent 245c21c commit 1b0ae87e7e31d49b44e20d70bd0287d7f32436c9
Showing 11 changed files with 574 additions and 383 deletions.
@@ -0,0 +1,75 @@
---
title: Introducing HAWQ Authorization
---

<!--
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied. See the License for the
specific language governing permissions and limitations
under the License.
-->

Native HAWQ authorization provides SQL-standard authorization at the database and table level for specific users and roles using the `GRANT` and `REVOKE` SQL commands. HAWQ integration with Ranger provides policy-based authorization, enabling you to identify the conditions under which a user and/or group can access individual HAWQ resources, including the operations permitted on those resources.

Native HAWQ and Ranger authorization are mutually exclusive.

Native HAWQ and Ranger authorization share `pg_hba.conf`-based user authentication. Native HAWQ authorization is used for certain database operations, even when Ranger is enabled. Additionally, HAWQ always verifies superuser privileges.


## <a id="pghbaconf"></a> pg_hba.conf
The `pg_hba.conf` file on the HAWQ master node identifies the users you permit to access the HAWQ cluster, and the hosts from which the access may be initiated. This authentication is the first line of defense for both HAWQ-Native and HAWQ-Ranger authorization.


## <a id="alwaysnative"></a> HAWQ Native Authorization
HAWQ *always* employs its native authorization for operations on its catalog. HAWQ also uses only native authorization for the following HAWQ operations, *even when Ranger is enabled*. These operations are available to superusers and may be available to those non-admin users for whom access was specifically configured:

- operations on HAWQ catalog
- `CREATE CAST` command when function is NULL
- `CREATE DATABASE`, `DROP DATABASE`, `createdb`, `dropdb`
- `hawq filespace` management tool
- `CREATE`, `DROP`, or `ALTER` commands for resource queues
- `CREATE ROLE`, `DROP ROLE`, `SET ROLE`, `createuser`, `dropuser`
- `CREATE TABLESPACE`, `DROP TABLESPACE` (Ranger does manage authorization for creating tables and indexes _within_ an existing tablespace.)
- HAWQ catalog-related built-in functions such as pg\_logdir\_ls, pg\_ls\_dir, pg\_read\_file, pg\_reload\_conf, pg\_rotate\_logfile, pg\_signal\_backend, pg\_start\_backup, pg\_stat\_file, pg\_stat\_get\_activity, pg\_stat\_get\_backend\_activity\_start, pg\_stat\_get\_backend\_activity, pg\_stat\_get\_backend\_client\_addr, pg\_stat\_get\_backend\_client\_port, pg\_stat\_get\_backend\_start, pg\_stat\_get\_backend\_waiting, pg\_stop\_backup, pg\_switch\_xlog, and pg\_stat\_reset.


The following SQL operations do not require any authorization checks:

- `DEALLOCATE`
- `SET`, `RESET`


## <a id="rangersuperuser"></a> Ranger Authorization
When Ranger authorization is enabled, HAWQ uses Ranger policies to determine access to all user database objects, apart from the operations listed above. HAWQ denies a user operation if no policy exists to provide the necessary permissions for the requesting user to access the specific resource(s).

In cases where an operation requires superuser privileges, HAWQ first performs a superuser check, and then requests the Ranger policy check. Operations that require superuser checks include:

- `CREATE`, `DROP`, or `ALTER` commands that involve a foreign-data wrapper
- `CREATE LANGUAGE` and `DROP LANGUAGE` for non-built-in languages
- `CREATE FUNCTION` command for untrusted languages
- `CREATE EXTERNAL TABLE` commands that include the `EXECUTE` clause
- `CREATE OPERATOR CLASS` command
- `COPY` command. Using `COPY` is always limited to the superuser. When Ranger policy management is enabled, the superuser must have `SELECT` or `INSERT` privileges on a table in order to `COPY` from or to that table.


## <a id="authalgorithm"></a> Access Check Summary

When determining if a database operation is supported for a specific user, HAWQ:

1. Confirms that the `pg_hba.conf` file allows the user access.
2. Determines if the operation requires superuser access, and if so, verifies that the requesting user has such privileges.
3. Determines if the operation requires native HAWQ authorization.
4. Determines if Ranger authorization for HAWQ is enabled.
5. Performs a HAWQ native authorization check if the operation requires it or if Ranger is not enabled; otherwise, performs a HAWQ Ranger policy check.
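
The steps above can be sketched as a small shell function. This is only an illustrative model of the ordering, not HAWQ code: the function name, its `yes`/`no` arguments, and the printed labels are all hypothetical, and it assumes that a failed `pg_hba.conf` or superuser check ends the evaluation.

```shell
#!/usr/bin/env bash
# Hypothetical model of the access check order; not part of HAWQ.
# Every argument is "yes" or "no":
#   $1 pg_hba_allows    pg_hba.conf permits the connection
#   $2 needs_superuser  the operation requires superuser privileges
#   $3 is_superuser     the requesting user is a superuser
#   $4 native_only      the operation always uses native authorization
#   $5 ranger_enabled   Ranger authorization is enabled for HAWQ
access_check_path() {
    local pg_hba_allows=$1 needs_superuser=$2 is_superuser=$3
    local native_only=$4 ranger_enabled=$5

    # Step 1: pg_hba.conf authentication comes first.
    if [ "$pg_hba_allows" != "yes" ]; then
        echo "denied: pg_hba.conf"
        return 0
    fi

    # Step 2: superuser check, only when the operation needs one.
    # (Assumption: a failed check stops the evaluation here.)
    if [ "$needs_superuser" = "yes" ] && [ "$is_superuser" != "yes" ]; then
        echo "denied: superuser required"
        return 0
    fi

    # Steps 3-5: choose which authorization check runs.
    if [ "$native_only" = "yes" ] || [ "$ranger_enabled" != "yes" ]; then
        echo "native check"
    else
        echo "ranger policy check"
    fi
}

# An ordinary operation by a regular user with Ranger enabled.
access_check_path yes no no no yes
```

Note that the function only reports which check would run; whether the native or Ranger check then grants access depends on the configured privileges or policies.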
@@ -0,0 +1,42 @@
---
title: Using MADlib with Ranger Authorization
---

<!--
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied. See the License for the
specific language governing permissions and limitations
under the License.
-->


You can use MADlib, an open source library for in-database analytics, with your HAWQ installation. MADlib functions typically operate on source, output, and model tables. When Ranger is enabled for HAWQ authorization, you will need to provide access to all MADlib-related databases, schemas, tables, and functions to the appropriate users.

Consider the following when setting up HAWQ policies for MADlib access:

- Assign `temp` permission to the database on which users will run MADlib functions.
- MADlib users often share their output tables. If this is the case in your deployment, create a shared schema dedicated to output tables, assigning `usage-schema` and `create` privileges for all MADlib users to this shared schema.
- Assign `create-schema` database permission to those MADlib users that do not choose to share their output tables.

- `madlib` Schema-Level Permissions
    - Assign `usage-schema` and `create` privileges to the `madlib` schema.
    - Assign `execute` permissions on all functions within the `madlib` schema, including any functions called within.
    - Assign `insert` and `select` permissions to all tables within the `madlib` schema.
    - Assign the `usage-schema` and `create` permissions for the current schema, and any schema in which the source, output, and model tables may reside.

- Function-Specific Permissions
    - Assign `insert` and `select` permissions for the source, output, and model tables.
    - Assign `insert` and `select` permissions for the output \_summary and \__group tables.

@@ -113,54 +113,45 @@ To use HAWQ Ranger integration, install a compatible Hadoop distribution and Apa

7. After HAWQ reloads the configuration, use the fully-qualified domain name to log into the Ambari server. Click the **Ranger** link to display the Ranger Summary page, then select **Quick Links > Ranger Admin UI**.

8. Log into the Ranger Access Manager. Click the **Edit** button for the **HAWQ** service. Ensure that the Active Status is set to Enabled, and click **Test Connection**. You should receive a message that Ranger connected successfully. If the connection fails, verify the `hawq` service Config Properties, as well as your `pg_hba.conf` entries, and re-test the connection.
8. Log in to the Ranger Access Manager. Click the **Edit** button for the **HAWQ** service. Ensure that the Active Status is set to Enabled, and click **Test Connection**. You should receive a message that Ranger connected successfully. If the connection fails, verify the `hawq` service Config Properties, as well as your `pg_hba.conf` entries, and re-test the connection.

## <a id="enable"></a>Step 2: Configure HAWQ to Use Ranger Policy Management

The default Ranger service definition for HAWQ assigns the HAWQ administrator (typically `gpadmin`) all privileges to all objects.

Once the connection between HAWQ and Ranger is configured, you can either set up policies for the HAWQ users according to the procedures in [Creating HAWQ Authorization Policies in Ranger](ranger-policy-creation.html) or enable Ranger with only the default policies.
Once the connection between HAWQ and Ranger is configured, you may choose to set up policies for the HAWQ users according to the procedures in [Creating HAWQ Authorization Policies in Ranger](ranger-policy-creation.html) or enable Ranger with only the default policies.

**Note**: Any authorization defined using GRANT commands will no longer apply after enabling HAWQ Ranger. Only gpadmin access is allowed when Ranger is first initialized.
**Note**: Any authorization defined using `GRANT` commands will no longer apply after enabling HAWQ Ranger. Enabling Ranger authorization for HAWQ with only the default policies defined provides access only to the `gpadmin` user.

1. On Ambari, select the **HAWQ** Service, and then select the **Configs** tab.
1. Log in to the Ambari UI, select the **HAWQ** Service, and then select the **Configs** tab.
2. Select the **Advanced** tab, and then expand **Custom hawq-site**.
4. Click **Add Property...** and add the new property `hawq_acl_type=ranger`. (If the property already exists, change its value from `standalone` (the default) to `ranger`.)
5. Click **Save** to save your changes.
6. Select **Service Actions > Restart All** and confirm that you want to restart the HAWQ cluster.


## <a id="customconfig"></a> Custom Configuration

Configuration files for the HAWQ Ranger Plug-in Service are located in the `$GPHOME/ranger/etc` directory. These files include:

| File | Description |
|-------------|---------------------------|
| ranger-hawq-audit.xml | HAWQ Ranger audit-related configuration, including the audit provider (log4j, Solr, HDFS) and provider-specific configuration |
| ranger-hawq-security.xml | HAWQ Ranger service configuration, including the policy change polling interval |
| rps.properties | HAWQ Ranger deployment-related configuration, including the HAWQ Ranger Plug-in Service port definition and JVM parameters|

Any configuration changes you make after you have registered the HAWQ Ranger Plug-in require a restart of the service. You can either restart the HAWQ cluster or restart just the HAWQ Ranger Plug-in Service:

``` shell
gpadmin@master$ /usr/local/hawq/ranger/bin/rps.sh stop
gpadmin@master$ /usr/local/hawq/ranger/bin/rps.sh start
```

### <a id="caching"></a>Changing the Frequency of Policy Caching

The default polling interval for HAWQ Ranger Plug-in Service policy updates is 30 seconds. To increase or decrease this value, update the `ranger.plugin.hawq.policy.pollIntervalMs` property setting in the `ranger-hawq-security.xml` file:

<pre>
&lt;property&gt;
&lt;name&gt;ranger.plugin.hawq.policy.pollIntervalMs&lt;/name&gt;
<b>&lt;value&gt;30000&lt;/value&gt;</b>
&lt;description&gt;
How often to poll for changes in policies?
&lt;/description&gt;
&lt;/property&gt;
</pre>

Provide a value in milliseconds.

You must restart the HAWQ Ranger Plug-in Service as described above after updating the polling interval.
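
Because the property takes milliseconds, it is easy to be off by a factor of 1000 when thinking in seconds. The following sketch converts a value in seconds and prints the matching property element. The helper name `poll_interval_property` is hypothetical, and the snippet only prints text; it does not modify `ranger-hawq-security.xml`.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the property element for an interval in seconds.
poll_interval_property() {
    local seconds=$1
    local millis=$(( seconds * 1000 ))   # the property expects milliseconds
    printf '<property>\n'
    printf '  <name>ranger.plugin.hawq.policy.pollIntervalMs</name>\n'
    printf '  <value>%s</value>\n' "$millis"
    printf '</property>\n'
}

# Example: a 60 second interval becomes a value of 60000.
poll_interval_property 60
```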

## <a id="troubleshoot"></a> Troubleshooting Ranger Configuration

If resource name lookup is not working in the Ranger Admin UI:

1. Verify that the HAWQ Ranger plug-in JARs and JDBC driver have been copied to \<ranger-admin-node\>.
2. Test the connection between the Ranger Admin UI and the HAWQ master node by clicking the edit icon associated with the active HAWQ service definition, then clicking the **Config Properties: > Test Connection** button.
3. Verify that the HAWQ master node `pg_hba.conf` file includes a `host` entry for \<ranger-admin-node\>, HAWQ user (typically `gpadmin`).
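
For step 3, a quick scripted check can help. The sketch below assumes the common single-line entry layout of `pg_hba.conf` (type, database, user, address, method) and looks for an uncommented `host` entry. The helper name, the sample address `192.0.2.10/32`, and the `trust` method in the sample file are placeholders for illustration only; the match is deliberately simple and does not handle comma-separated user lists or a separate netmask column.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if FILE contains an uncommented "host" entry
# whose user field is USER (or "all") and whose address field equals ADDRESS.
has_host_entry() {
    local file=$1 user=$2 address=$3
    awk -v user="$user" -v addr="$address" '
        $1 == "host" && ($3 == user || $3 == "all") && $4 == addr { found = 1 }
        END { exit (found ? 0 : 1) }
    ' "$file"
}

# Demonstration against a throwaway sample file, not a real pg_hba.conf.
sample=$(mktemp)
printf '%s\n' \
    '# TYPE  DATABASE  USER     ADDRESS          METHOD' \
    'host    all       gpadmin  192.0.2.10/32    trust' > "$sample"

if has_host_entry "$sample" gpadmin 192.0.2.10/32; then
    echo "entry found"
else
    echo "entry missing"
fi
rm -f "$sample"
```

To use it on a real system, point the function at the `pg_hba.conf` file on the HAWQ master node and pass the address of \<ranger-admin-node\> in the form that appears in your file.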

@@ -35,7 +35,7 @@ HAWQ also provides a JAR library that enables the Ranger Policy Manager to looku

A single configuration parameter, `hawq_acl_type`, determines whether HAWQ defers all policy management to Ranger via the plug-in service, or whether HAWQ handles authorization natively using catalog tables. By default, HAWQ uses SQL commands to create all access policies, and the policy information is stored in catalog tables. When you enable Ranger integration for policy management, any authorization policies that you have configured in HAWQ using SQL no longer apply to your installation; you must create new policies using the Ranger interface. See [Creating HAWQ Authorization Policies in Ranger](ranger-policy-creation.html).

The Ranger plug-in service caches Ranger policies locally on each HAWQ node to avoid unnecessary round trips between the HAWQ node and the Ranger Policy Manager server. You can use the configuration property `ranger.plugin.hawq.policy.pollIntervalMs` to control how frequently the plug-in service contacts the Ranger Policy Manager to refresh cached policies. See [Changing the Frequency of Policy Caching](ranger-integration-config.html#caching).
The Ranger plug-in service caches Ranger policies locally on each HAWQ node to avoid unnecessary round trips between the HAWQ node and the Ranger Policy Manager server.

## <a id="limitations"></a>Limitations of Ranger Policy Management in HAWQ 2.2.0.0-incubating
Neither Kerberos authentication nor SSL encryption is supported between a HAWQ node and the Ranger plug-in service, or between the plug-in service and the Ranger Policy Manager.
