Merge branch 'develop' into pr/7339
* develop: (137 commits)
  [RELEASE] 0.16.4 (great-expectations#7525)
  [MAINTENANCE] Remove airflow2 min dependency test. (great-expectations#7524)
  [MAINTENANCE] Dedicated airflow 2.2.0 async test (great-expectations#7518)
  [MAINTENANCE] Use YAMLHandler in tests and docs (great-expectations#7507)
  [MAINTENANCE] Revert PR 7490 (great-expectations#7515)
  [BUGFIX] Typo in min versions install (great-expectations#7516)
  [DOCS] corrects typo in GCS setup guide (great-expectations#7514)
  [DOCS] Testing ADR (great-expectations#7495)
  [MAINTENANCE] Bump numpy from 1.21.0 to 1.22.0 in /docs_rtd (great-expectations#7509)
  [MAINTENANCE] Bump nbconvert from 5.6.1 to 6.5.1 in /docs_rtd (great-expectations#7508)
  [MAINTENANCE] Bump jupyter-core from 4.6.3 to 4.11.2 in /docs_rtd (great-expectations#7496)
  [MAINTENANCE] Bump certifi from 2020.6.20 to 2022.12.7 in /docs_rtd (great-expectations#7497)
  [MAINTENANCE] Bump gitpython from 3.1.7 to 3.1.30 in /docs_rtd (great-expectations#7494)
  [MAINTENANCE] Fix sqlalchemy warnings for pandas + sql fluent datasources (great-expectations#7504)
  [MAINTENANCE] Make `dataset_name` a parameter for Expectations tests or create name from `Expectation` name, which ensures only limited number of tables created. (great-expectations#7476)
  [MAINTENANCE] Pass `PandasDatasource` `batch_metadata` as `kwargs` to remove possibility of `None` on `DataAsset` model (great-expectations#7503)
  [BUGFIX] Corrected typographical errors in two docstrings (great-expectations#7506)
  [FEATURE] Introducing CapitalOne DataProfilerColumnDomainBuilder as well as multiple improvements to CapitalOne codebase and testability. (great-expectations#7498)
  [FEATURE] `BatchMetadata` for all fluent `DataAsset`s (great-expectations#7392)
  [MAINTENANCE] Connection.connect warning for SQLAlchemy 2.0 compatibility (great-expectations#7489)
  ...
Will Shin committed Mar 31, 2023
2 parents 196f931 + 397727d commit 510e49d
Showing 837 changed files with 56,670 additions and 25,669 deletions.
24 changes: 24 additions & 0 deletions .github/workflows/pr-title-checker.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,24 @@
# Validate that PR titles are prefixed with one of our accepted labels
name: "PR Title Checker"
on:
pull_request:
types:
- opened
- edited
- reopened
- synchronize
- auto_merge_enabled

jobs:
check:
runs-on: ubuntu-latest
steps:
- name: Check PR title validity
run: |
echo ${{ github.event.pull_request.title }} | grep -E "^\\[(FEATURE|BUGFIX|DOCS|MAINTENANCE|CONTRIB|RELEASE)\\]"
if [ $? -ne 0 ]; then
echo "Invalid PR title - please prefix with one of: [FEATURE] | [BUGFIX] | [DOCS] | [MAINTENANCE] | [CONTRIB] | [RELEASE]"
exit 1
fi
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
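The title gate above boils down to one regular expression; a quick local sketch of the same check in Python (the sample titles are illustrative, not from the workflow):

```python
import re

# Same pattern the workflow feeds to `grep -E`
PATTERN = re.compile(r"^\[(FEATURE|BUGFIX|DOCS|MAINTENANCE|CONTRIB|RELEASE)\]")

def is_valid_title(title: str) -> bool:
    """Return True if the PR title starts with an accepted label."""
    return PATTERN.match(title) is not None

print(is_valid_title("[BUGFIX] Typo in min versions install"))  # True
print(is_valid_title("Add a new datasource"))                   # False
```

Note the pattern is anchored with `^`, so a title with anything before the bracketed label (even a leading space) is rejected.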
2 changes: 2 additions & 0 deletions .gitignore
@@ -109,6 +109,7 @@ venv/
ENV/
ge_dev/
gx_dev/
.gx_dev/

# Spyder project settings
.spyderproject
@@ -146,3 +147,4 @@ metastore_db
# dependency management
Pipfile
Pipfile.lock

2 changes: 1 addition & 1 deletion .pre-commit-config.yaml
@@ -17,7 +17,7 @@ repos:
hooks:
- id: black-jupyter
- repo: https://github.com/charliermarsh/ruff-pre-commit
rev: 'v0.0.253'
rev: 'v0.0.255'
hooks:
- id: ruff
files: ^(great_expectations|contrib|scripts|tasks\.py) # TODO: add tests/ docs/ etc.
2 changes: 1 addition & 1 deletion README.md
@@ -12,7 +12,7 @@



<img align="right" src="./static/img/gx-mark-160.png">
<img align="right" src="./docs/docusaurus/static/img/gx-mark-160.png">

Great Expectations
================================================================================
2 changes: 1 addition & 1 deletion assets/partners/anthonydb/just_connect.py
@@ -2,7 +2,7 @@

connection = "mssql://sa:BK72nEAoI72CSWmP@db:1433/integration?driver=ODBC+Driver+17+for+SQL+Server&charset=utf&autocommit=true"
e = sa.create_engine(connection)
results = e.execute("SELECT TOP 10 * from dbo.taxi_data").fetchall()
results = e.execute(sa.text("SELECT TOP 10 * from dbo.taxi_data")).fetchall()
for r in results:

print(r)
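This one-line change tracks SQLAlchemy 2.0, where `execute()` no longer accepts a bare SQL string and the statement must be wrapped in `sa.text()`. A minimal sketch of the same pattern against an in-memory SQLite engine (the table and rows are made up for illustration):

```python
import sqlalchemy as sa

engine = sa.create_engine("sqlite://")

with engine.connect() as conn:
    conn.execute(sa.text("CREATE TABLE taxi_data (id INTEGER, fare REAL)"))
    conn.execute(sa.text("INSERT INTO taxi_data VALUES (1, 7.5), (2, 12.0)"))
    # Under SQLAlchemy 2.0, passing a plain string here raises an error;
    # text() turns it into an executable statement object
    results = conn.execute(sa.text("SELECT * FROM taxi_data")).fetchall()

for r in results:
    print(r)
```

Wrapping statements in `text()` also works under SQLAlchemy 1.4, which is why changes like this are a common forward-compatibility fix.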
@@ -36,7 +36,7 @@ const replicaIndexAndSettings = [
]

// Main Index setSettings
const attributesForFaceting = ['searchable(library_metadata.tags)', 'searchable(engineSupported)', 'searchable(exp_type)']
const attributesForFaceting = ['searchable(library_metadata.tags)', 'searchable(engineSupported)', 'searchable(exp_type)', 'searchable(package)', 'searchable(metrics)', 'searchable(contributors)']
const maxFacetHits = 100
const searchableAttributes = ['description.snake_name', 'description.short_description']
const customRanking = ['asc(description.snake_name)']
@@ -92,6 +92,21 @@ function formatExpectation (ExpecData) {
data.created_at = ExpecData[key].created_at
data.updated_at = ExpecData[key].updated_at
data.exp_type = ExpecData[key].exp_type
data.package = ExpecData[key].package
data.metrics = []
data.contributors = []
// Flatten the metrics array to get all the metrics
if (ExpecData[key].metrics) {
ExpecData[key].metrics.forEach((metric) => {
data['metrics'].push(metric.name)
});
}
// Flatten the contributors array to get all the contributors
if (ExpecData[key].library_metadata.contributors) {
ExpecData[key].library_metadata.contributors.forEach((contributor) => {
data['contributors'].push(contributor.replace(/@/g, ""))
});
}
dataset.push(data)
})
return dataset
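The added lines flatten the nested `metrics` and `contributors` records into plain string lists so they can be used as search facets. The same transformation sketched in Python (the record shape is inferred from the snippet above):

```python
def flatten_record(record):
    """Mirror the JS flattening: metric names, and contributor handles with '@' stripped."""
    metrics = [m["name"] for m in record.get("metrics") or []]
    contributors = [
        c.replace("@", "")
        for c in record.get("library_metadata", {}).get("contributors") or []
    ]
    return {"metrics": metrics, "contributors": contributors}

row = flatten_record({
    "metrics": [{"name": "column.mean"}, {"name": "column.stdev"}],
    "library_metadata": {"contributors": ["@alice", "@bob"]},
})
print(row)  # {'metrics': ['column.mean', 'column.stdev'], 'contributors': ['alice', 'bob']}
```

As in the original, a missing `metrics` or `contributors` field simply yields an empty list rather than an error.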
@@ -25,8 +25,10 @@ stages:
- bash: python -m pip install --upgrade pip==20.2.4
displayName: 'Update pip'

# includes explicit install of chardet, which was causing errors in pipeline
- script: |
pip install --constraint constraints-dev.txt ".[dev]" pytest-azurepipelines google-cloud-bigquery-storage
pip install chardet==3.0.4
displayName: 'Install dependencies'
- task: DownloadSecureFile@1
@@ -74,9 +76,6 @@ stages:
expectations_v3_api:
test_script: 'tests/test_definitions/test_expectations_v3_api.py'
extra_args: ''
expectations_v2_api:
test_script: 'tests/test_definitions/test_expectations_v2_api.py'
extra_args: ''
maxParallel: 1

steps:
@@ -88,8 +87,10 @@
- bash: python -m pip install --upgrade pip==20.2.4
displayName: 'Update pip'

# includes explicit install of chardet, which was causing errors in pipeline
- script: |
pip install --constraint constraints-dev.txt ".[dev]" pytest-azurepipelines google-cloud-bigquery-storage
pip install chardet==3.0.4
displayName: 'Install dependencies'
- task: DownloadSecureFile@1
@@ -113,3 +114,36 @@
GOOGLE_APPLICATION_CREDENTIALS: $(gcp_authkey.secureFilePath)
GE_TEST_GCP_PROJECT: $(GE_TEST_GCP_PROJECT)
GE_TEST_BIGQUERY_DATASET: $(GE_TEST_BIGQUERY_DATASET)
- job: snowflake_expectations_test
timeoutInMinutes: 45 # snowflake tests will run in about 30 min
variables:
python.version: '3.8'

steps:
- task: UsePythonVersion@0
inputs:
versionSpec: '$(python.version)'
displayName: 'Use Python $(python.version)'

- bash: python -m pip install --upgrade pip==20.2.4
displayName: 'Update pip'

# includes explicit install of grpcio-status and chardet, which were causing errors in the pipeline
- script: |
pip install --constraint constraints-dev.txt ".[dev]" pytest-azurepipelines
pip install chardet==3.0.4
pip install grpcio-status
displayName: 'Install dependencies'
- script: |
pytest -v --snowflake tests/test_definitions/test_expectations_v3_api.py
displayName: 'pytest'
env:
SNOWFLAKE_ACCOUNT: $(SNOWFLAKE_ACCOUNT)
SNOWFLAKE_USER: $(SNOWFLAKE_USER)
SNOWFLAKE_PW: $(SNOWFLAKE_PW)
SNOWFLAKE_DATABASE: $(SNOWFLAKE_DATABASE)
SNOWFLAKE_SCHEMA: $(SNOWFLAKE_SCHEMA)
SNOWFLAKE_WAREHOUSE: $(SNOWFLAKE_WAREHOUSE)
SNOWFLAKE_ROLE: $(SNOWFLAKE_ROLE)
29 changes: 13 additions & 16 deletions ci/azure-pipelines-dev.yml
@@ -124,6 +124,8 @@ stages:
- script: |
pip install --requirement requirements-types.txt
invoke type-check --ci --pretty
# initial run doesn't check `.pyi` source files
invoke type-check --ci --pretty --check-stub-sources
name: StaticTypeCheck
- job: docstring_linter
@@ -237,6 +239,7 @@ stages:
displayName: 'Unit Tests'
env:
GE_USAGE_STATISTICS_URL: ${{ variables.GE_USAGE_STATISTICS_URL }}
SQLALCHEMY_WARN_20: true
- script: |
# Run pytest
@@ -255,6 +258,7 @@
displayName: 'pytest'
env:
GE_USAGE_STATISTICS_URL: ${{ variables.GE_USAGE_STATISTICS_URL }}
SQLALCHEMY_WARN_20: true
- task: PublishTestResults@2
condition: succeededOrFailed()
@@ -270,6 +274,7 @@

# Runs pytest with Spark and Postgres enabled
- job: comprehensive
timeoutInMinutes: 90
condition: eq(stageDependencies.scope_check.changes.outputs['CheckChanges.GXChanged'], true)
strategy:
# This matrix is intended to split up our sizeable test suite into two distinct components.
@@ -316,6 +321,7 @@ stages:
displayName: 'pytest'
env:
GE_USAGE_STATISTICS_URL: ${{ variables.GE_USAGE_STATISTICS_URL }}
SQLALCHEMY_WARN_20: true
- task: PublishTestResults@2
condition: succeededOrFailed()
@@ -416,6 +422,7 @@
displayName: 'pytest'
env:
GE_USAGE_STATISTICS_URL: ${{ variables.GE_USAGE_STATISTICS_URL }}
SQLALCHEMY_WARN_20: true
- job: mssql
condition: eq(stageDependencies.scope_check.changes.outputs['CheckChanges.GXChanged'], true)
@@ -455,6 +462,7 @@
displayName: 'pytest'
env:
GE_USAGE_STATISTICS_URL: ${{ variables.GE_USAGE_STATISTICS_URL }}
SQLALCHEMY_WARN_20: true
- job: trino
condition: eq(stageDependencies.scope_check.changes.outputs['CheckChanges.GXChanged'], true)
@@ -500,6 +508,7 @@
displayName: 'pytest'
env:
GE_USAGE_STATISTICS_URL: ${{ variables.GE_USAGE_STATISTICS_URL }}
SQLALCHEMY_WARN_20: true
- stage: cli_integration
dependsOn: [scope_check, lint, import_ge, custom_checks]
@@ -536,6 +545,7 @@
displayName: 'pytest'
env:
GE_USAGE_STATISTICS_URL: ${{ variables.GE_USAGE_STATISTICS_URL }}
SQLALCHEMY_WARN_20: true
- stage: airflow_provider
dependsOn: [scope_check, lint, import_ge, custom_checks]
@@ -550,21 +560,8 @@
versionSpec: '3.7'
displayName: 'Use Python 3.7'

- script: ./ci/checks/check_min_airflow_dependency_compatibility.sh
name: CheckMinAirflowDependencyCompatibility

- script: ./ci/checks/run_gx_airflow_operator_tests.sh
name: RunAirflowProviderTests

- stage: test_build_docs
dependsOn: [scope_check, lint, import_ge, custom_checks]
pool:
vmImage: 'ubuntu-latest'

jobs:
- job: test_build_docs
steps:
- task: UsePythonVersion@0
inputs:
versionSpec: '3.8'
displayName: 'Use Python 3.8'

- script: cd docs/docusaurus && yarn install && bash ../build_docs
name: TestBuildDocs
