Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improving / Updating how some hooks work #118

Open
noel opened this issue Mar 23, 2023 · 3 comments
Open

Improving / Updating how some hooks work #118

noel opened this issue Mar 23, 2023 · 3 comments
Labels
enhancement New feature or request priority: high

Comments

@noel
Copy link
Collaborator

noel commented Mar 23, 2023

Background

When we started using pre-commit-dbt (the original project) we found that some hooks did not exhibit what we believe is the "correct" behavior. We made updates to the way some hooks worked and merged those changes. Now that we have moved to dbt-checkpoint and officially updated the version, some people are running into issues due to the new behavior.

What were we trying to solve for?

We noticed that a hook we were using check_model_has_all_columns was not always catching issues. The root cause was that it only checked that a property (yml) file contained all the columns in a model (sql) file if the model was changed. However, if the property file was the only file changed then it was not compared to the model file. This would allow someone to delete or add a column to the property file even though it did not match the model file. We felt this check should assure that these two files are always in sync and as such it should check if either file is changed.

How was this issue addressed?

We "fill in" the missing files so that if the yml file is changed we find the corresponding sql file so we can make sure we do the proper check.

Side effects

@followingell reported an issue with check-model-has-tests where excluded files were being included in the check. @karabulute found what we had implemented and submitted a PR that removed the functionality we added.

Our Opinion

We don't believe we should remove get_missing_file_paths from all the hooks because without this we are indirectly not complying with the spirit of the hook.

Proposal

We propose we add a parameter to the hooks that have this behavior so that files can be excluded, but we cannot use the pre-commit exclude parameter because we don't have that information in dbt-checkpoint.
Instead of doing this

- id: check-model-has-tests
  description: "Ensures that the model has a number of tests"
  args: ["--test-cnt", "1", "--"]
  exclude: |
    (?x)(
      models/demo
    )

We propose this

- id: check-model-has-tests
  description: "Ensures that the model has a number of tests"
  args: ["--test-cnt", "1", "--exclude","models/demo", "--"]

Which hooks are Impacted

Hooks that implement yml/sql file discovery:

@JFrackson
Copy link
Collaborator

@noel or @BAntonellini : Can you provide some more clarity on why the pre-commit exclude configuration is not an option here?

Its availability at multiple levels of the pre-commit-config is a compelling reason to use it if possible instead of creating another CLI exclude option. For example, if I always wanted to ignore a given directory for every single ID, it would be simpler/DRYer to use exclude at the repo level in my pre-commit-config.

@BAntonellini
Copy link
Collaborator

@JFrackson pre-commit's exclude works in a previous stage before reaching dbt-checkpoint hooks. What we receive in dbt-checkpoint are the results of pre-commit configurations, and we have no access to what the user specified in exclude, be it at repo or hook level.

@JFrackson
Copy link
Collaborator

Thanks for clarifying @BAntonellini . So without a more significant change it wouldn't be possible. That seems fine to me then to merge this as is! Let's merge this and then push a v1.2 so that there's a new stable release with this patch and the others recently.

Also, just in case users would find it useful: we can create an issue for making exclusions easier so that users can leave comments or reactions on what they would like to see (i.e. to keep conversation in one place). Would you mind creating that issue after you merge?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request priority: high
Projects
None yet
Development

No branches or pull requests

3 participants