Adding draft documentation for Ranger integration feature
dyozie committed Mar 30, 2017
1 parent 9175d25 commit a7e32e0ce4f8cd15001f7b4900218bbfa2ba8d45
Showing 7 changed files with 866 additions and 0 deletions.
@@ -282,6 +282,12 @@ PXF provides both service- and database-level logging. Refer to [PXF Logging](..

Ambari log files may be useful in helping diagnose general cluster problems. The Ambari server log files are located in the `/var/log/ambari-server/` directory. Ambari agent log files are located in `/var/log/ambari-agent/`. Refer to [Reviewing Ambari Log Files](https://docs.hortonworks.com/HDPDocuments/Ambari-2.2.1.1/bk_ambari_troubleshooting/content/_reviewing_ambari_log_files.html) for additional information.

## <a id="rangerlogs"></a> Ranger Log Files

The HAWQ Ranger Plug-in Service log files may be useful in helping diagnose Ranger connectivity and authorization problems. You will find these log files in the `$GPHOME/ranger/plugin-service/logs/` directory. In addition to HAWQ Ranger Plug-in service-related logs, this directory includes the `log4j` provider `audit.log` file. (Refer to [Auditing Authorization Events](../ranger/ranger-auditing.html) for information on configuring HAWQ Ranger audit logging.)

Ranger log files are located in the `/var/log/ranger/admin/` directory.


## <a id="logging_other"></a>Hadoop Log Files

@@ -0,0 +1,156 @@
---
title: Auditing Authorization Events
---

<!--
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied. See the License for the
specific language governing permissions and limitations
under the License.
-->

The HAWQ Ranger Plug-in Service supports storing auditing records in any of the Ranger auditing framework audit destinations, referred to as *audit sinks*. The `/usr/local/hawq/ranger/etc/ranger-hawq-audit.xml` file specifies the audit configuration. It contains sample definitions for the HDFS, Solr, and Log4j audit sinks.

As a best practice, configure one or more audit sinks in `ranger-hawq-audit.xml` before you register the HAWQ Ranger Plug-in Service. By default, only the Log4j sink is enabled. Production deployments should use both a Solr and an HDFS audit sink, with the Solr destination configured to automatically purge audit records after some period of time. This configuration enables you to search the most recent audit records while keeping a full history of audit records in HDFS.

If you modify `ranger-hawq-audit.xml` after you have registered the HAWQ Ranger Plug-in, you must restart the plug-in for the changes to take effect.

Full documentation for the Ranger auditing configuration properties and the Ranger auditing framework is available at [Ranger 0.5 Audit Configuration](https://cwiki.apache.org/confluence/display/RANGER/Ranger+0.5+Audit+Configuration).

## <a id="solr"></a>Configuring Solr Auditing
To configure a Solr audit sink, you define a different set of properties in `ranger-hawq-audit.xml` depending on whether you use Zookeeper or a direct URL to connect to your Solr destination. For a production environment, use Zookeeper instead of a direct URL.

If you use Zookeeper to connect to Solr, configure these auditing properties in `ranger-hawq-audit.xml`:

Table 1. Properties for Zookeeper Configuration

| Property | Value | Description |
| -------- | ----- | ----------- |
| xasecure.audit.destination.solr | true | Use this property to enable or disable the Solr sink. |
| xasecure.audit.destination.solr.zookeepers | &lt;zookeeper connect string&gt; | Specify the Zookeeper connection string for the Solr destination. |
| xasecure.audit.destination.solr.collection | &lt;collection name&gt; | Specify the Solr collection name to use for indexing the HAWQ audit records. By default HAWQ uses the `ranger_audits` collection. |
| xasecure.audit.destination.solr.batch.filespool.* | Multiple Properties | See [Configuration related to File spooling](https://cwiki.apache.org/confluence/display/RANGER/Ranger+0.5+Audit+Configuration#Ranger0.5AuditConfiguration-ConfigurationrelatedtoFilespooling) in the Ranger documentation if you want to configure spooling of auditing events to disk when the in-memory buffer is full. |
| xasecure.audit.destination.solr.urls | NONE | Leave this property value empty or set it to `NONE` when using Zookeeper to connect to Solr. |

For example:

```
<!-- ********************************* -->
<!-- SOLR audit provider configuration -->
<!-- ********************************* -->
<property>
  <name>xasecure.audit.destination.solr</name>
  <value>true</value>
</property>

<property>
  <name>xasecure.audit.destination.solr.zookeepers</name>
  <value>zkhost1:2181,zkhost2:2181/infra-solr</value>
</property>

<property>
  <name>xasecure.audit.destination.solr.collection</name>
  <value>ranger_audits</value>
</property>

<property>
  <name>xasecure.audit.destination.solr.urls</name>
  <value>NONE</value>
</property>

<property>
  <name>xasecure.audit.destination.solr.batch.filespool.enabled</name>
  <value>true</value>
</property>

<property>
  <name>xasecure.audit.destination.solr.batch.filespool.dir</name>
  <value>/usr/local/hawq_2_2_0_0/ranger/plugin-service/logs/spool/audit/solr</value>
</property>
```
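
Once the Solr sink is active, you can confirm that audit records are being indexed by querying the collection through Solr's standard select handler. The host and port below are assumptions; substitute the values for your Solr deployment:

``` pre
http://<solr-host>:8983/solr/ranger_audits/select?q=*:*&rows=5
```

The response lists the most recently indexed audit documents for the collection.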
## <a id="hdfs"></a>Configuring HDFS Auditing
To configure an HDFS audit sink, define these auditing properties in `ranger-hawq-audit.xml`:

Table 2. Properties for HDFS Configuration

| Property | Value | Description |
| -------- | ----- | ----------- |
| xasecure.audit.destination.hdfs | true | Use this property to enable or disable the HDFS sink. |
| xasecure.audit.destination.hdfs.dir | &lt;HDFS directory&gt; | Specify the HDFS directory in which the plug-in records audit events. |
| xasecure.audit.destination.hdfs.batch.filespool.* | Multiple Properties | See [Configuration related to File spooling](https://cwiki.apache.org/confluence/display/RANGER/Ranger+0.5+Audit+Configuration#Ranger0.5AuditConfiguration-ConfigurationrelatedtoFilespooling) in the Ranger documentation if you want to configure spooling of auditing events to disk when the in-memory buffer is full. |

For example:

```
<!-- ********************************* -->
<!-- HDFS audit provider configuration -->
<!-- ********************************* -->
<property>
  <name>xasecure.audit.destination.hdfs</name>
  <value>true</value>
</property>

<property>
  <name>xasecure.audit.destination.hdfs.dir</name>
  <value>hdfs://localhost:8020/ranger/audit</value>
</property>

<property>
  <name>xasecure.audit.destination.hdfs.batch.filespool.enabled</name>
  <value>true</value>
</property>

<property>
  <name>xasecure.audit.destination.hdfs.batch.filespool.dir</name>
  <value>/usr/local/hawq_2_2_0_0/ranger/plugin-service/logs/spool/audit/hdfs</value>
</property>
```
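
After the plug-in writes its first audit events, you can verify the HDFS sink by listing the configured audit directory. This is a sketch assuming the example settings above; the Ranger audit framework typically organizes files into subdirectories by service name and date, so the exact layout may differ in your deployment:

``` bash
gpadmin@master$ hdfs dfs -ls -R /ranger/audit
```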
## <a id="log4j"></a>Configuring Log4j Auditing
To configure a Log4j audit sink, define these auditing properties in `ranger-hawq-audit.xml`:

Table 3. Properties for Log4j Configuration

| Property | Value | Description |
| -------- | ----- | ----------- |
| xasecure.audit.destination.log4j | true | Use this property to enable or disable the Log4j sink. |
| xasecure.audit.destination.log4j.logger | &lt;Logger Name&gt; | Specify the name of the logger to use for sending audit events. |
| xasecure.audit.destination.log4j.batch.filespool.* | Multiple Properties | See [Configuration related to File spooling](https://cwiki.apache.org/confluence/display/RANGER/Ranger+0.5+Audit+Configuration#Ranger0.5AuditConfiguration-ConfigurationrelatedtoFilespooling) in the Ranger documentation if you want to configure spooling of auditing events to disk when the in-memory buffer is full. |

For example:

```
<!-- ********************************** -->
<!-- Log4j audit provider configuration -->
<!-- ********************************** -->
<property>
  <name>xasecure.audit.destination.log4j</name>
  <value>true</value>
</property>

<property>
  <name>xasecure.audit.destination.log4j.logger</name>
  <value>ranger_audit_logger</value>
</property>
```
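
The Log4j sink writes audit events through the named logger, so that logger must be defined in the plug-in service's Log4j configuration. The following is a minimal sketch of a `log4j.properties` fragment for the `ranger_audit_logger` named above; the appender class and file path are illustrative assumptions, not values shipped with HAWQ:

```
log4j.logger.ranger_audit_logger=INFO,RANGERAUDIT
log4j.additivity.ranger_audit_logger=false
log4j.appender.RANGERAUDIT=org.apache.log4j.DailyRollingFileAppender
log4j.appender.RANGERAUDIT.File=/usr/local/hawq/ranger/plugin-service/logs/audit.log
log4j.appender.RANGERAUDIT.layout=org.apache.log4j.PatternLayout
log4j.appender.RANGERAUDIT.layout.ConversionPattern=%d{ISO8601} %m%n
```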

## <a id="reconfigure"></a>Changing the Plug-in Auditing Settings
If you modify `ranger-hawq-audit.xml` after you have registered the HAWQ Ranger Plug-in, you must either restart the HAWQ cluster or restart the plug-in for the changes to take effect.

To restart only the HAWQ Ranger Plug-in:

``` bash
$ /usr/local/hawq_2_2_0_0/ranger/bin/rps.sh stop
$ /usr/local/hawq_2_2_0_0/ranger/bin/rps.sh start
```
@@ -0,0 +1,131 @@
---
title: Configuring HAWQ to use Ranger Policy Management
---

<!--
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied. See the License for the
specific language governing permissions and limitations
under the License.
-->

Your HAWQ 2.2.0 installation includes the following HAWQ-related Ranger components:

- Ranger Administrative UI
- HAWQ Ranger Plug-in Service

The Ranger Administrative UI is installed when you install HDP. You configure the Ranger service itself through Ambari. You configure HAWQ-Ranger authorization policies through the Ranger Administrative UI, which you can access at `http://<ranger-admin-node>:6080`.

Installing or upgrading to HAWQ 2.2.0 installs the HAWQ Ranger Plug-in Service, but neither configures nor registers the plug-in.

To use Ranger for managing HAWQ authorization, you must first install and register several HAWQ JAR files on the Ranger Administration host. This one-time configuration establishes connectivity to your HAWQ cluster from the Ranger Administration host.

After registering the JAR files, you enable or disable Ranger integration in HAWQ by setting the `hawq_acl_type` configuration parameter. After Ranger integration is enabled, you must use the Ranger interface to create all security policies to manage access to HAWQ resources. Ranger is only pre-populated with policies to allow `gpadmin` superuser access to default resources. See [Creating HAWQ Authorization Policies in Ranger](ranger-policy-creation.html) for information about creating policies in Ranger. When Ranger is enabled, all access to HAWQ resources is controlled by security policies on Ranger.

Use the following procedures to register the HAWQ Ranger Plug-in Service and enable Ranger authorization for HAWQ.

## <a id="prereq"></a>Prerequisites
To use HAWQ Ranger integration, install a compatible Hadoop distribution and Apache Ranger 0.6. You must also have `admin` access to the **Ranger Admin UI**.

## <a id="jar"></a>Step 1: Install Ranger Connectivity to HAWQ
1. `ssh` into the Ranger Administration host as a user with root privileges:

``` bash
$ ssh root@<ranger-admin-node>
root@ranger-admin-node$
```
2. Create the directory for the HAWQ JAR files:

``` bash
root@ranger-admin-node$ cd /usr/hdp/current/ranger-admin/ews/webapp/WEB-INF/classes/ranger-plugins
root@ranger-admin-node$ mkdir hawq
```
3. Copy the necessary HAWQ JAR files (`postgresql-9.1-901-1.jdbc4.jar` and `ranger-plugin-admin-2.2.0.0.jar`) from the HAWQ master node to the new directory:

``` bash
root@ranger-admin-node$ scp <hawq-master>:/usr/local/hawq/ranger/lib/*.jar ./hawq
```
4. Change the ownership of the new folder and JAR files to the `ranger` user:

``` bash
root@ranger-admin-node$ chown -R ranger:ranger hawq
```
5. The `enable-ranger-plugin.sh` script configures Ranger connectivity to your HAWQ cluster. The command has the syntax:

``` pre
enable-ranger-plugin.sh -r <ranger_admin_node>:<ranger_port> -u <ranger_user> -p <ranger_password> -h <hawq_master>:<hawq_port> -w <hawq_user> -q <hawq_password>
```

Log in to the HAWQ master node as the `gpadmin` user and execute the `enable-ranger-plugin.sh` script. Ensure \<hawq_master\> identifies the fully qualified domain name of the HAWQ master node. For example:

``` bash
sudo su - gpadmin
gpadmin@master$ cd /usr/local/hawq/ranger/bin
gpadmin@master$ ./enable-ranger-plugin.sh -r ranger_host:6080 -u admin -p admin -h hawq_master:5432 -w gpadmin -q gpadmin
```

***Note:*** You can also enter the short form of the command, `./enable-ranger-plugin.sh -r`, and the script prompts you for the remaining entries.

When the script completes, the default HAWQ service definition is registered in the Ranger Admin UI. This service definition is named `hawq`.
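
To confirm that registration succeeded without opening the UI, you can also query the Ranger Admin REST API from any host that can reach it. The credentials and host below mirror the example above; adjust them for your environment:

``` bash
$ curl -u admin:admin "http://ranger_host:6080/service/public/v2/api/service?serviceType=hawq"
```

The response should include a JSON entry for the `hawq` service definition.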

6. Locate the `pg_hba.conf` file on the HAWQ master node; it resides in the HAWQ master data directory, which you can display with `hawq config`. For example:

``` bash
gpadmin@master$ hawq config --show hawq_master_directory
GUC : hawq_master_directory
Value : /data/hawq/master

```

Edit the `pg_hba.conf` file on the HAWQ master node to configure HAWQ access for \<hawq_user\> on the \<ranger-admin-node\>. For example, you would add an entry similar to the following for the example `enable-ranger-plugin.sh` call above:

``` bash
host all gpadmin ranger_host/32 trust
```

Then reload the HAWQ configuration:

``` bash
gpadmin@master$ hawq stop cluster --reload
```

7. When setup is complete, log in to the Ambari server using its fully qualified domain name. Select the Ranger service in the left navigation pane to display the Ranger Summary pane, then follow the Quick Links to open the Ranger login interface.

8. Log in to the Ranger Access Manager. The Service Manager displays a list of service icons. Under the HAWQ service icon, click the **Edit** icon on the right. Ensure that the Active Status is set to Enabled, and click the **Test Connection** button. You should receive a message that Ranger connected successfully. If the connection fails, correct the Ranger entry in `pg_hba.conf`, reload the HAWQ configuration:

``` bash
gpadmin@master$ hawq stop cluster --reload
```

and re-test the connection.


## <a id="enable"></a>Step 2: Configure HAWQ to Use Ranger Policy Management

The default Ranger service definition for HAWQ assigns the HAWQ administrator (typically `gpadmin`) all privileges to all objects.

Once the connection between HAWQ and Ranger is configured, you can either set up policies for the HAWQ users according to the procedures in [Creating HAWQ Authorization Policies in Ranger](ranger-policy-creation.html) or enable Ranger with only the default policies.

**Note**: Any authorization defined using `GRANT` commands no longer applies after you enable HAWQ Ranger integration. Only `gpadmin` access is allowed when Ranger is first initialized.

1. On Ambari, select the **HAWQ** service, and then select the **Configs** tab.
2. Select the **Advanced** tab, and then expand **Custom hawq-site**.
3. Click **Add Property...** and add the `hawq_acl_type=ranger` property. (If the property already exists, change its value from `standalone`, the default, to `ranger`.)
4. Click **Save** to save your changes.
5. Select **Service Actions > Restart All** and confirm that you want to restart the HAWQ cluster.
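
As an alternative to the Ambari steps above, you can set the same parameter from the command line with HAWQ's `hawq config` utility; a cluster restart is still required. This is a sketch of the equivalent change (on an Ambari-managed cluster, prefer the Ambari procedure so the setting is not overwritten):

``` bash
gpadmin@master$ hawq config -c hawq_acl_type -v ranger
gpadmin@master$ hawq restart cluster
```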


## <a id="caching"></a>Changing the Frequency of Policy Caching

You may want to change how frequently the HAWQ Ranger Plug-in Service refreshes its cached copy of the Ranger policies to suit your needs.

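Ranger plug-ins periodically poll the Ranger Admin service for policy updates, typically every 30 seconds by default. The polling interval is generally controlled by a property of the following form in the plug-in's security configuration; the property name below follows the standard Ranger plug-in convention, so verify the exact file and property name against your HAWQ installation:

```
<property>
  <name>ranger.plugin.hawq.policy.pollIntervalMs</name>
  <value>30000</value>
</property>
```
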