[DOCS] Document BigQuery test dataset configuration
jdimatteo committed Aug 18, 2021
1 parent 88b636a commit 1c287f8
Showing 4 changed files with 15 additions and 11 deletions.
1 change: 1 addition & 0 deletions docs/changelog.md
@@ -4,6 +4,7 @@ title: Changelog
 
 ### Develop
 * [BUGFIX] Fix deprecation warning for importing from collections (#3228)
+* [DOCS] Document BigQuery test dataset configuration (#3273)
 
 ### 0.13.28
 * [FEATURE] Implement ColumnPairValuesInSet metric for PandasExecutionEngine
7 changes: 3 additions & 4 deletions docs/contributing/contributing_test.md
@@ -23,17 +23,16 @@ In order to run BigQuery tests, you first need to go through the following steps
 
 1. [Select or create a Cloud Platform project](https://console.cloud.google.com/project).
 2. [Setup Authentication](https://googleapis.dev/python/google-api-core/latest/auth.html).
-3. In your project, [create a BigQuery dataset](https://cloud.google.com/bigquery/docs/datasets) named `test_ci` and [set the dataset default table expiration](https://cloud.google.com/bigquery/docs/updating-datasets#table-expiration) to `.1` days
+3. In your project, [create a BigQuery dataset](https://cloud.google.com/bigquery/docs/datasets) (e.g. named `test_ci`) and [set the dataset default table expiration](https://cloud.google.com/bigquery/docs/updating-datasets#table-expiration) to `.1` days
 
-After setting up authentication, you can run with your project using the environment variable `GE_TEST_BIGQUERY_PROJECT`, e.g.
+After setting up authentication, you can run with your project using the environment variables `GE_TEST_BIGQUERY_PROJECT` and `GE_TEST_BIGQUERY_DATASET`, e.g.
 
 ```bash
 GE_TEST_BIGQUERY_PROJECT=<YOUR_GOOGLE_CLOUD_PROJECT>
+GE_TEST_BIGQUERY_DATASET=test_ci
 pytest tests/test_definitions/test_expectations_cfe.py --bigquery --no-spark --no-postgresql
 ```
 
-Note that if you prefer to use a different dataset besides "test_ci", you can specify a different dataset with `GE_TEST_BIGQUERY_DATASET`.
-
 ### Writing unit and integration tests
 
 Production code in Great Expectations must be thoroughly tested. In general, we insist on unit tests for all branches of every method, including likely error states. Most new feature contributions should include several unit tests. Contributions that modify or extend existing features should include a test of the new behavior.
9 changes: 4 additions & 5 deletions docs_rtd/contributing/testing.rst
@@ -41,20 +41,19 @@ In order to run BigQuery tests, you first need to go through the following steps
 
 1. `Select or create a Cloud Platform project.`_
 2. `Setup Authentication.`_
-3. `In your project, create a BigQuery dataset named "test_ci"`_ and `set the dataset default table expiration to .1 days`_
+3. `In your project, create a BigQuery dataset (e.g. named "test_ci")`_ and `set the dataset default table expiration to .1 days`_
 
 .. _Select or create a Cloud Platform project.: https://console.cloud.google.com/project
 .. _Setup Authentication.: https://googleapis.dev/python/google-api-core/latest/auth.html
-.. _`In your project, create a BigQuery dataset named "test_ci"`: https://cloud.google.com/bigquery/docs/datasets
+.. _`In your project, create a BigQuery dataset (e.g. named "test_ci")`: https://cloud.google.com/bigquery/docs/datasets
 .. _`set the dataset default table expiration to .1 days`: https://cloud.google.com/bigquery/docs/updating-datasets#table-expiration
 
-After setting up authentication, you can run with your project using the environment variable `GE_TEST_BIGQUERY_PROJECT`, e.g.
+After setting up authentication, you can run with your project using the environment variables `GE_TEST_BIGQUERY_PROJECT` and `GE_TEST_BIGQUERY_DATASET`, e.g.
 
 .. code-block::
 
-    GE_TEST_BIGQUERY_PROJECT=<YOUR_GOOGLE_CLOUD_PROJECT> pytest tests/test_definitions/test_expectations_cfe.py --bigquery --no-spark --no-postgresql -k bigquery
+    GE_TEST_BIGQUERY_PROJECT=<YOUR_GOOGLE_CLOUD_PROJECT> GE_TEST_BIGQUERY_DATASET=test_ci pytest tests/test_definitions/test_expectations_cfe.py --bigquery --no-spark --no-postgresql -k bigquery
 
-Note that if you prefer to use a different dataset besides "test_ci", you can specify a different dataset with `GE_TEST_BIGQUERY_DATASET`.
 
 Writing unit and integration tests
 ----------------------------------
9 changes: 7 additions & 2 deletions great_expectations/self_check/util.py
@@ -2006,10 +2006,15 @@ def _create_bigquery_engine() -> Engine:
     gcp_project = os.getenv("GE_TEST_BIGQUERY_PROJECT")
     if not gcp_project:
         raise ValueError(
-            "Environment Variable GE_TEST_BIGQUERY_PROJECT is required to run expectation tests"
+            "Environment Variable GE_TEST_BIGQUERY_PROJECT is required to run BigQuery expectation tests"
         )
     return create_engine(f"bigquery://{gcp_project}/{_bigquery_dataset()}")
 
 
 def _bigquery_dataset() -> str:
-    return os.getenv("GE_TEST_BIGQUERY_DATASET")
+    dataset = os.getenv("GE_TEST_BIGQUERY_DATASET")
+    if not dataset:
+        raise ValueError(
+            "Environment Variable GE_TEST_BIGQUERY_DATASET is required to run BigQuery expectation tests"
+        )
+    return dataset
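
The two checks in this file follow the same pattern: read an environment variable and raise `ValueError` if it is unset or empty. A minimal sketch of how that pattern could be shared is shown here. The names `require_env` and `bigquery_url` are hypothetical and not part of this commit, the optional `env` mapping exists only to make the sketch easy to test, and the sketch builds the connection URL string without calling SQLAlchemy's `create_engine`.

```python
import os
from typing import Mapping, Optional


def require_env(name: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Return a required environment variable, or raise if unset or empty.

    Hypothetical helper (not in the commit); `env` defaults to os.environ.
    """
    source = os.environ if env is None else env
    value = source.get(name)
    if not value:
        raise ValueError(
            f"Environment Variable {name} is required to run BigQuery expectation tests"
        )
    return value


def bigquery_url(env: Optional[Mapping[str, str]] = None) -> str:
    """Build the bigquery://<project>/<dataset> URL used by the test engine."""
    project = require_env("GE_TEST_BIGQUERY_PROJECT", env)
    dataset = require_env("GE_TEST_BIGQUERY_DATASET", env)
    return f"bigquery://{project}/{dataset}"
```

Called with both variables set, `bigquery_url()` returns a string of the same shape as the f-string passed to `create_engine` in the diff; with either variable missing, it fails with a message naming that variable.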
