Port smoke automation group to pg_regress
Includes HdfsSmokeTest, WritableSmokeTest, MultiBlockDataSmokeTest, HiveSmokeTest, and HBaseSmokeTest. Tests are split into external-table and foreign-table variants; foreign-table tests carry the prefix 'FDW_'.

Uses sed to search and replace template placeholders (e.g. `{{ TEMPLATE }}`) for values such as HCFS_CMD (the path to the `hdfs` CLI). Templates live under `sql` (scripts run through `psql`) and `expected` (the output that is expected); results end up under `results/`. Generated files have the same names as the sql/*.sql and expected/*.out files, but with a leading underscore. These interpolated files are what pg_regress runs; the interpolation step is sketched below.

All test names end in 'Test'. Tests can be grouped arbitrarily using schedule files under `schedules/`. Schedule names end in '_schedule', and FDW schedules begin with 'fdw_'. An example schedule appears below; see README.md for more info.

init_file contains regex ignore and substitution patterns. PXF_TEST_DEBUG may be set to any value to keep the generated files and run pg_regress in debug mode; `make clean` removes any remnants (usage sketched below).

It is convenient to use a writable external table or foreign table to write data to HDFS, but to avoid a 'garbage-in, garbage-out' scenario, we write to HDFS with a Greenplum table and then use the `hdfs` command, rather than a PXF read, to confirm that the data arrived in HDFS in the correct format (see the final sketch below).

Includes an example pipeline that runs pg_regress against all the clouds we support (WASBS, ADL, S3, GCS) along with Minio and local HDFS. The pipeline builds Greenplum, which is needed because this work currently relies on a specific branch of Greenplum. It is possible to run against a remote Greenplum and HDFS (e.g. a Kerberized Dataproc cluster), but this commit does not include test code to do that in CI.
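A minimal sketch of the interpolation step, assuming a single placeholder named `HCFS_CMD`; the real driver and substitution list live in this repo (likely in the Makefile), and the default path below is invented:

```sh
# Hypothetical interpolation driver; the actual logic lives in the repo.
# HCFS_CMD and its default path here are assumptions for illustration.
HCFS_CMD="${HCFS_CMD:-/usr/local/hadoop/bin/hdfs}"

for template in sql/*.sql expected/*.out; do
  dir="$(dirname "$template")"
  base="$(basename "$template")"
  case "$base" in _*) continue ;; esac   # skip already generated files
  # e.g. sql/HdfsSmokeTest.sql -> sql/_HdfsSmokeTest.sql
  sed "s|{{ HCFS_CMD }}|${HCFS_CMD}|g" "$template" > "${dir}/_${base}"
done
```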
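Schedule files follow pg_regress's plain-text schedule format, one `test:` line per entry. The grouping below is illustrative, and listing the underscore-prefixed generated names is an assumption based on the naming scheme above:

```
# schedules/smoke_schedule (illustrative grouping)
test: _HdfsSmokeTest
test: _WritableSmokeTest
test: _MultiBlockDataSmokeTest
test: _HiveSmokeTest
test: _HBaseSmokeTest
```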
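Debug usage might look like the following; `test` as the make target name is a guess, since this message only documents the variable and `make clean`:

```sh
PXF_TEST_DEBUG=1 make test   # any value works: keeps generated files, runs pg_regress in debug mode
make clean                   # removes the generated files and other remnants
```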
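The write-then-verify flow could be sketched as below; the table definition, PXF profile, and HDFS path are invented for illustration and are not the names the actual tests use:

```sh
# Write through a Greenplum writable external table backed by PXF...
psql -c "CREATE WRITABLE EXTERNAL TABLE smoke_write (id int, name text)
         LOCATION ('pxf://tmp/pxf_smoke?PROFILE=hdfs:text')
         FORMAT 'TEXT' (DELIMITER ',');"
psql -c "INSERT INTO smoke_write SELECT i, 'row_' || i FROM generate_series(1, 100) i;"

# ...then verify on the HDFS side rather than reading back through PXF,
# so a symmetric read/write bug cannot mask itself.
"${HCFS_CMD:-hdfs}" dfs -cat /tmp/pxf_smoke/* | sort -t, -k1,1n | head
```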
Authored-by: Oliver Albertini <oalbertini@pivotal.io>