Skip to content

fix: schema defaulting to None instead of "public" in postgres database integration #136

@dusklight00

Description

@dusklight00

Describe the bug
When schema is not described in data_sources of config.yaml, then by default schema takes the value None, which is when passed to create_engine function from sqlalchemy, which is not properly handled and that creates a poor connection and tables are not loaded.

Expected behavior
There should be a proper condition to handle if schema is set to none instead of directly passing it to create_engine function.

To Reproduce
Step 1: Setup sample SQL given in https://docs.datachecks.io/getting_started/ in PostgresSQL
Step 2: Copy paste the config.yaml as given by the documentation, and customize it according to your database name, mine is following.

data_sources:
  - name: postgres
    type: postgres
    connection:
      host: 127.0.0.1
      port: 5432
      username: postgres
      password: password
      database: postgres
metrics:
  - name: count_of_products
    metric_type: row_count
    resource: postgres.products
    validation:
      threshold: "> 0 & < 1000"
  - name: max_product_price_in_india
    metric_type: max
    resource: postgres.products.price
    filters:
      where: "country_code = 'IN'"
    validation:
      threshold: "< 190"

Step 3: Run the project using the command poetry run datachecks inspect -C config.yaml

Screenshots
When schema is set to none
image
image

When schema is made public
image
image

Logs and additional context

Metadata

Metadata

Assignees

No one assigned

    Labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions