Skip to content

Commit

Permalink
docs - add template upgrade topics for pxf v6 (#487)
Browse files Browse the repository at this point in the history
* docs - add template upgrade topics for v6

* overview page to link to 5->6 upgrade topic
  • Loading branch information
Lisa Owen committed Nov 12, 2020
1 parent 0aa66cd commit d924bb7
Show file tree
Hide file tree
Showing 4 changed files with 190 additions and 1 deletion.
2 changes: 1 addition & 1 deletion docs/content/overview_pxf.html.md.erb
Expand Up @@ -65,7 +65,7 @@ Check out the [PXF introduction](intro_pxf.html) for a high level overview impor
The Greenplum Database administrator manages PXF, Greenplum Database user privileges, and external data source configuration. Tasks include:

- [Installing](about_pxf_dir.html), [configuring](instcfg_pxf.html), [starting](cfginitstart_pxf.html), [monitoring](monitor_pxf.html), and [troubleshooting](troubleshooting_pxf.html) the PXF service.
- Managing PXF [upgrade](upgrade_pxf_rpm.html).
- Managing PXF [upgrade](upgrade_5_to_6.html).
- [Configuring](cfg_server.html) and publishing one or more server definitions for each external data source. This definition specifies the location of, and access credentials to, the external data source.
- [Granting](using_pxf.html) Greenplum user access to PXF and PXF external tables.

Expand Down
122 changes: 122 additions & 0 deletions docs/content/upgrade_5_to_6.html.md.erb
@@ -0,0 +1,122 @@
---
title: Upgrading from PXF 5
---

If you have installed, configured, and are using PXF in your Greenplum Database 5 or 6 cluster, you must perform some upgrade actions when you install PXF 6.x.

<div class="note">If you are using PXF with Greenplum Database 5, you must upgrade Greenplum to version 5.21.2 or newer before you upgrade to PXF 6.</div>

The PXF upgrade procedure has two parts. You perform one procedure before, and one procedure after, you install a new version to upgrade to PXF 6:

- [Step 1: Perform the PXF Pre-Upgrade Actions](#pxfpre)
- Install the new version of PXF
- [Step 2: Complete the Upgrade to PXF 6](#pxfup)


## <a id="pxfpre"></a>Step 1: Performing the PXF Pre-Upgrade Actions

Perform this procedure before you upgrade to a new version of PXF:

1. Log in to the Greenplum Database master node. For example:

``` shell
$ ssh gpadmin@<gpmaster>
```

1. Identify and note the version of PXF currently running in your Greenplum cluster:

``` shell
gpadmin@gpmaster$ pxf version
```

1. add other steps here - TBD like backing up $PXF_HOME, $PXF_CONF

1. Stop PXF on each segment host as described in [Stopping PXF](cfginitstart_pxf.html#stop_pxf).

1. Install the new version of PXF, identify and note the new PXF version number, and then continue your PXF upgrade with [Step 2: Complete the Upgrade to PXF 6](#pxfup).


## <a id="pxfup"></a>Step 2: Completing the Upgrade to PXF 6

After you install the new version of PXF, perform the following procedure:

1. Log in to the Greenplum Database master node. For example:

``` shell
$ ssh gpadmin@<gpmaster>
```

1. other steps here

1. THE STEPS BELOW ARE FOR CONFIG PARITY WITH 5.16, TBD WHERE THEY END UP IN THIS SECTION.

3. **If you are upgrading from PXF version 5.9.x or earlier** and you have configured any JDBC servers that access Kerberos-secured Hive, you must now set the `hadoop.security.authentication` property to the `jdbc-site.xml` file to explicitly identify use of the Kerberos authentication method. Perform the following for each of these server configs:

1. Navigate to the server configuration directory.
2. Open the `jdbc-site.xml` file in the editor of your choice and uncomment or add the following property block to the file:

```xml
<property>
<name>hadoop.security.authentication</name>
<value>kerberos</value>
</property>
```
3. Save the file and exit the editor.

4. **If you are upgrading from PXF version 5.11.x or earlier**: The PXF `Hive` and `HiveRC` profiles now support column projection using column name-based mapping. If you have any existing PXF external tables that specify one of these profiles, and the external table relied on column index-based mapping, you may be required to drop and recreate the tables:

1. Identify all PXF external tables that you created that specify a `Hive` or `HiveRC` profile.

2. For *each* external table that you identify in step 1, examine the definitions of both the PXF external table and the referenced Hive table. If the column names of the PXF external table *do not* match the column names of the Hive table:

1. Drop the existing PXF external table. For example:

``` sql
DROP EXTERNAL TABLE pxf_hive_table1;
```

2. Recreate the PXF external table using the Hive column names. For example:

``` sql
CREATE EXTERNAL TABLE pxf_hive_table1( hivecolname int, hivecolname2 text )
LOCATION( 'pxf://default.hive_table_name?PROFILE=Hive')
FORMAT 'custom' (FORMATTER='pxfwritable_import');
```

3. Review any SQL scripts that you may have created that reference the PXF external table, and update column names if required.

4. **If you are upgrading from PXF version 5.15.x or earlier**:

1. The `pxf.service.user.name` property in the `pxf-site.xml` template file is now commented out by default. Keep this in mind when you configure new PXF servers.
2. The default value for the `jdbc.pool.property.maximumPoolSize` property is now `15`. If you have previously configured a JDBC server(s) and want that server to use the new default value, you must manually change the property value in the server's `jdbc-site.xml` file.
3. PXF 5.16 disallows specifying relative paths and environment variables in the `CREATE EXTERNAL TABLE` `LOCATION` clause file path. If you previously created any external tables that specified a relative path or environment variable, you must drop each external table, and then re-create it without these constructs.
4. Filter pushdown is enabled by default for queries on external tables that specify the `Hive`, `HiveRC`, or `HiveORC` profiles. If you have previously created an external table that specifies one of these profiles and queries are failing with PXF v5.16+, you can disable filter pushdown at the external table-level or at the server level:

1. (External table) Drop the external table and re-create it, specifying the `&PPD=false` option in the `LOCATION` clause.
2. (Server) If you do not want to recreate the external table, you can disable filter pushdown *for all* `Hive*` *profile queries using the server* by setting the `pxf.ppd.hive` property in the `pxf-site.xml` file to `false`:

``` pre
<property>
<name>pxf.ppd.hive</name>
<value>false</value>
</property>
```

You may need to add this property block to the `pxf-site.xml` file.

1. **If you are upgrading to PXF version 6.0**:

1. step 1
2. step 2
3. step 3

1. next step

4. Synchronize the PXF configuration from the master host to the standby master and each Greenplum Database segment host. For example:

``` shell
gpadmin@gpmaster$ pxf cluster sync
```

5. Start PXF on each segment host as described in [Starting PXF](cfginitstart_pxf.html#start_pxf).

58 changes: 58 additions & 0 deletions docs/content/upgrade_6.html.md.erb
@@ -0,0 +1,58 @@
---
title: Upgrading from an Earlier PXF 6 Release
---

If you have installed a PXF 6.x `rpm` or `deb` package and have initialized, configured, and are using PXF in your current Greenplum Database 5.21.2+ or 6.x installation, you must perform some upgrade actions when you install a new version of PXF 6.

The PXF upgrade procedure has two parts. You perform one procedure before, and one procedure after, you install a new version to upgrade PXF:

- [Step 1: Perform the PXF Pre-Upgrade Actions](#pxfpre)
- Install the new version of PXF
- [Step 2: Complete the Upgrade to a Newer PXF 6](#pxfup)


## <a id="pxfpre"></a>Step 1: Perform the PXF Pre-Upgrade Actions

Perform this procedure before you upgrade to a new version of PXF 6:

1. Log in to the Greenplum Database master node. For example:

``` shell
$ ssh gpadmin@<gpmaster>
```

1. Identify and note the version of PXF currently running in your Greenplum cluster:

``` shell
gpadmin@gpmaster$ pxf version
```

1. next step

1. Stop PXF on each segment host as described in [Stopping PXF](cfginitstart_pxf.html#stop_pxf).

1. Install the new version of PXF, identify and note the new PXF version number, and then continue your PXF upgrade with [Step 2: Complete the Upgrade to a Newere PXF 6](#pxfup).


## <a id="pxfup"></a>Step 2: Completing the Upgrade to a Newer PXF 6

After you install the new version of PXF, perform the following procedure:

1. Log in to the Greenplum Database master node. For example:

``` shell
$ ssh gpadmin@<gpmaster>
```

1. next step

1. next step

1. Synchronize the PXF configuration from the master host to the standby master and each Greenplum Database segment host. For example:

``` shell
gpadmin@gpmaster$ pxf cluster sync
```

1. Start PXF on each segment host as described in [Starting PXF](cfginitstart_pxf.html#start_pxf).

9 changes: 9 additions & 0 deletions docs/content/upgrade_landing.html.md.erb
@@ -0,0 +1,9 @@
---
title: Upgrading to PXF 6
---

PXF 6.x supports these upgrade paths:

- [Upgrading from PXF 5](upgrade_5_to_6.html)
- [Upgrading from an earlier PXF 6 Release](upgrade_6.html)

0 comments on commit d924bb7

Please sign in to comment.