Port smoke automation group to pg_regress

Includes HdfsSmokeTest, WritableSmokeTest, MultiBlockDataSmokeTest,
HiveSmokeTest, and HBaseSmokeTest.

Tests are separated into external table and foreign table variants; foreign
table tests begin with the prefix 'FDW_'. sed is used to search and replace
template placeholders (e.g. `{{ TEMPLATE }}`) for values such as HCFS_CMD
(the path to the `hdfs` CLI). Templates live in `sql/` (scripts run through
`psql`) and `expected/` (the expected output); actual output ends up under
`results/`. Generated files have the same names as the sql/*.sql and
expected/*.out files, but with a leading underscore, and these interpolated
files are what pg_regress runs. All test names end in 'Test'.

Tests can be grouped arbitrarily using schedule files under `schedules/`.
Schedule names end in '_schedule', and FDW schedules begin with 'fdw_'.
See README.md for more info.
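
For illustration, a pg_regress schedule file lists tests on `test:` lines
(tests named on the same line run in parallel). The test names below come
from this commit, but the exact schedule contents are an assumption:

```
# hypothetical contents of schedules/smoke_schedule
test: HdfsSmokeTest
test: WritableSmokeTest
test: MultiBlockDataSmokeTest
test: HiveSmokeTest
test: HBaseSmokeTest
```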

init_file contains regex ignore and substitution patterns.
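
A hedged example of what such a file can contain, assuming the
matchsubs/matchignore block format understood by Greenplum's pg_regress diff
filter; the patterns below are illustrative, not the ones in this commit:

```
-- start_matchignore
m/^DETAIL:  External scan error.*/
-- end_matchignore
-- start_matchsubs
m/line \d+ of file \S+/
s/line \d+ of file \S+/line ### of file FILE/
-- end_matchsubs
```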

PXF_TEST_DEBUG may be set to any value to keep the generated files and run
pg_regress in debug mode. Running `make clean` should remove any remnants.
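
A hedged example invocation (the make target name is an assumption made for
illustration):

```sh
# Keep the generated files and run pg_regress in debug mode, then clean up.
PXF_TEST_DEBUG=1 make -C regression smoke_schedule
make -C regression clean
```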

It is convenient to use a writable external table or foreign table to
write data to HDFS, but to avoid the 'garbage-in, garbage-out' scenario,
we write to HDFS with a Greenplum table, then use the `hdfs` command to
confirm that the data arrived in HDFS in the correct format.
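
A hedged sketch of that write-then-verify pattern; the table name, PXF URI,
and HDFS path are hypothetical, not taken from this commit:

```sh
# Write rows through a writable external table, then check the bytes on HDFS
# directly instead of reading them back through PXF, so a write bug cannot
# mask itself. Table name, URI, and path below are illustrative.
psql <<'SQL'
CREATE WRITABLE EXTERNAL TABLE smoke_test_write (name text, num int)
    LOCATION ('pxf://tmp/pxf_smoke/write?PROFILE=hdfs:text')
    FORMAT 'TEXT' (DELIMITER ',');
INSERT INTO smoke_test_write SELECT 'row_' || i, i FROM generate_series(1, 10) i;
SQL
hdfs dfs -cat /tmp/pxf_smoke/write/* | sort
```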

Includes an example pipeline that runs pg_regress against all of the
clouds that we support (WASBS, ADL, S3, GCS) along with Minio and local
HDFS. The pipeline builds Greenplum (needed because this work currently
relies on a specific branch of Greenplum).

It is possible to run against a remote Greenplum and HDFS (e.g. a
Kerberized Dataproc cluster); however, this commit doesn't include test
code to do that in CI.

Authored-by: Oliver Albertini <oalbertini@pivotal.io>
Oliver Albertini committed Sep 24, 2019
1 parent 1407664 commit edf8196
Showing 46 changed files with 4,158 additions and 29 deletions.
2 changes: 2 additions & 0 deletions README.md
@@ -31,6 +31,8 @@ Hadoop testing environment to exercise the pxf automation tests
## concourse/
Resources for PXF's Continuous Integration pipelines

## regression/
Contains the end-to-end (integration) tests for PXF against the various data sources, using the PostgreSQL testing framework `pg_regress`.

PXF Development
=================
18 changes: 18 additions & 0 deletions concourse/README.md
@@ -128,3 +128,21 @@ The master and 5X pipelines are exposed. Here are the commands to expose the pipelines
fly -t ud expose-pipeline -p pxf_master
fly -t ud expose-pipeline -p pxf_5X_STABLE
```

# Deploy `pg_regress` pipeline

This pipeline currently runs the smoke test group against the different clouds using `pg_regress` instead of the automation framework.
It uses both external and foreign tables.
You can adjust the `folder-prefix`, `gpdb-git-branch`, `gpdb-git-remote`, `pxf-git-branch`, and `pxf-git-remote` variables;
for example, you may want to work off of a development branch for PXF or Greenplum.

```
fly -t ud set-pipeline -p pg_regress \
-c ~/workspace/pxf/concourse/pipelines/pg_regress_pipeline.yml \
-l ~/workspace/gp-continuous-integration/secrets/gpdb6-integration-testing.dev.yml \
-l ~/workspace/gp-continuous-integration/secrets/ccp-integration-pipelne-secrets.yml \
-l ~/workspace/gp-continuous-integration/secrets/gpdb_common-ci-secrets.yml \
-v folder-prefix=dev/pivotal \
-v gpdb-git-branch=pxf-fdw-pass-filter-string -v gpdb-git-remote=https://github.com/pivotal/gp-gpdb-dev \
-v pxf-git-branch=master -v pxf-git-remote=https://github.com/greenplum-db/pxf
```
