Port smoke automation group to pg_regress
Includes HdfsSmokeTest, WritableSmokeTest, MultiBlockDataSmokeTest, HiveSmokeTest, and HBaseSmokeTest. Tests are split into external-table and foreign-table variants; foreign-table tests carry the prefix 'FDW_'.

Uses sed to search and replace template placeholders (e.g. `{{ TEMPLATE }}`) for values such as HCFS_CMD (the path to the `hdfs` CLI). Templates live under `sql` (scripts run through `psql`) and `expected` (the output that is expected); results end up under `results/`. Generated files have the same names as the sql/*.sql and expected/*.out files, but with a leading underscore. These interpolated files are what pg_regress runs; the interpolation step is sketched below.

All test names end in 'Test'. Tests can be grouped arbitrarily using schedule files under `schedules/`. Schedule names end in '_schedule', and FDW schedules begin with 'fdw_'. An example schedule appears below; see README.md for more info.

init_file contains regex ignore and substitution patterns. PXF_TEST_DEBUG may be set to any value to keep the generated files and run pg_regress in debug mode; `make clean` removes any remnants (usage sketched below).

It is convenient to use a writable external table or foreign table to write data to HDFS, but to avoid a 'garbage-in, garbage-out' scenario, we write to HDFS with a Greenplum table and then use the `hdfs` command, rather than a PXF read, to confirm that the data arrived in HDFS in the correct format (see the final sketch below).

Includes an example pipeline that runs pg_regress against all the clouds we support (WASBS, ADL, S3, GCS) along with Minio and local HDFS. The pipeline builds Greenplum, which is needed because this work currently relies on a specific branch of Greenplum. It is possible to run against a remote Greenplum and HDFS (e.g. a Kerberized Dataproc cluster), but this commit does not include test code to do that in CI.
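A minimal sketch of the interpolation step, assuming a single placeholder named `HCFS_CMD`; the real driver and substitution list live in this repo (likely in the Makefile), and the default path below is invented:

```sh
# Hypothetical interpolation driver; the actual logic lives in the repo.
# HCFS_CMD and its default path here are assumptions for illustration.
HCFS_CMD="${HCFS_CMD:-/usr/local/hadoop/bin/hdfs}"

for template in sql/*.sql expected/*.out; do
  dir="$(dirname "$template")"
  base="$(basename "$template")"
  case "$base" in _*) continue ;; esac   # skip already generated files
  # e.g. sql/HdfsSmokeTest.sql -> sql/_HdfsSmokeTest.sql
  sed "s|{{ HCFS_CMD }}|${HCFS_CMD}|g" "$template" > "${dir}/_${base}"
done
```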
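Schedule files follow pg_regress's plain-text schedule format, one `test:` line per entry. The grouping below is illustrative, and listing the underscore-prefixed generated names is an assumption based on the naming scheme above:

```
# schedules/smoke_schedule (illustrative grouping)
test: _HdfsSmokeTest
test: _WritableSmokeTest
test: _MultiBlockDataSmokeTest
test: _HiveSmokeTest
test: _HBaseSmokeTest
```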
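Debug usage might look like the following; `test` as the make target name is a guess, since this message only documents the variable and `make clean`:

```sh
PXF_TEST_DEBUG=1 make test   # any value works: keeps generated files, runs pg_regress in debug mode
make clean                   # removes the generated files and other remnants
```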
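The write-then-verify flow could be sketched as below; the table definition, PXF profile, and HDFS path are invented for illustration and are not the names the actual tests use:

```sh
# Write through a Greenplum writable external table backed by PXF...
psql -c "CREATE WRITABLE EXTERNAL TABLE smoke_write (id int, name text)
         LOCATION ('pxf://tmp/pxf_smoke?PROFILE=hdfs:text')
         FORMAT 'TEXT' (DELIMITER ',');"
psql -c "INSERT INTO smoke_write SELECT i, 'row_' || i FROM generate_series(1, 100) i;"

# ...then verify on the HDFS side rather than reading back through PXF,
# so a symmetric read/write bug cannot mask itself.
"${HCFS_CMD:-hdfs}" dfs -cat /tmp/pxf_smoke/* | sort -t, -k1,1n | head
```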
Authored-by: Oliver Albertini <oalbertini@pivotal.io>