Adding dbt parse hook #195
@BAntonellini I'd be interested in your opinion on this, especially,
|
How are you handling the hooks that require docs generate? In general, we are actually thinking of deprecating all dbt commands from the hooks, since they can be run as prior steps in the pipeline. Doing that also lets you fix issues such as the ones that come up when you run slim-ci. |
That would be interesting! Is there somewhere I can follow this conversation?
My concern with this approach is that for pre-commit to work effectively locally it needs the latest version of the dbt artifacts, so my preference is that these are generated locally on every run. Obviously this can be slow, but this is the downside that comes with the upside of these awesome hooks. And in a CI pipeline we can just skip these hooks as the artifacts are already present (I already do this here). Remember as well that Slim CI is a dbt Cloud feature, not every dbt project will have this.
Regarding this PR, my thought process is that I'd need to re-evaluate whether each hook requires a compile, parse or docs generate command to be run beforehand and update the documentation. Bit manual but not impossible to do. At this stage I just wanted validation that such a change would be acceptable, then it makes sense for me to invest this time. |
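To make the "skip these hooks in CI when the artifacts are already present" idea concrete, here is a minimal sketch. It relies on pre-commit's `SKIP` environment variable (a comma-separated list of hook ids). The `CI=true` check, the manifest location, and the helper name `precommit_env` are assumptions for illustration, not something this PR adds; some CI systems (Azure DevOps, for example) do not set `CI` by default.

```python
from pathlib import Path

# Hook ids that (re)generate dbt artifacts. Assumed list for illustration;
# adjust to the hooks actually present in your .pre-commit-config.yaml.
ARTIFACT_HOOKS = ("dbt-parse", "dbt-compile", "dbt-docs-generate")


def precommit_env(base_env, manifest):
    """Return a copy of base_env for `pre-commit run`.

    When running in CI (assumed to be signalled by CI=true) and the manifest
    already exists, the artifact-generating hooks are appended to SKIP so
    pre-commit does not run them again.
    """
    env = dict(base_env)
    in_ci = env.get("CI", "").lower() == "true"
    if in_ci and Path(manifest).is_file():
        already = [h for h in env.get("SKIP", "").split(",") if h]
        # dict.fromkeys de-duplicates while keeping the original order.
        env["SKIP"] = ",".join(dict.fromkeys(already + list(ARTIFACT_HOOKS)))
    return env
```

Locally (no `CI` variable) the environment is returned unchanged, so the hooks still regenerate the artifacts on every commit. In a pipeline this could be passed as `env=precommit_env(os.environ, "target/manifest.json")` to whatever launches `pre-commit run`.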
@pgoslatara Slim CI is not a dbt Cloud feature; it is available in Core as well, as is deferral. We have many clients doing this in Datacoves.
I am curious about your CI/CD process. Are you having dbt Cloud do the actual run, so that the CI runner does not need the credentials to the db? Have you looked at the dbt Cloud Admin API? You can pull the manifest from there as well. |
I don't think this link is what you intended.
True, I've implemented this using dbt Core. I don't think my process really matters here; my concern is the local development experience. If we are "deprecating all dbt commands from the hooks" then the developer who uses dbt Core suffers, as they will now have an additional step to perform to ensure their hooks have the latest artifact available every time they commit. Let's re-focus on the original question, is there value in adding a
|
Ok. Feel free to add it. Make sure to update any tests and readme. One thing I want to make sure is that if there are any hooks where this approach doesn’t work then we call it out. Conversely if this works for all hooks then it should be preferred over compile, no? |
18:19:33 Registered adapter: duckdb=1.7.3
18:19:33 Found 20 models, 53 tests, 0 sources, 0 exposures, 0 metrics, 615 macros, 0 groups, 0 semantic models
18:19:33
18:19:33 Concurrency: 8 threads (target='dev')
18:19:33
to
18:19:37 Running with dbt=1.7.10
18:19:37 Registered adapter: duckdb=1.7.3
18:19:37 Performance info: /home/pslattery/repos/medium_scraper/dbt/target/perf_info.json
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #195      +/-   ##
==========================================
+ Coverage   96.91%   96.94%   +0.03%
==========================================
  Files          55       56       +1
  Lines        2592     2622      +30
  Branches      349      349
==========================================
+ Hits         2512     2542      +30
  Misses         59       59
  Partials       21       21
|
@pgoslatara thanks for the PR. |
@noel You're fast 🚀! In this scenario it's rather simple, those hooks cannot be used. I've sometimes been in this situation when the system where the CI runs is not allowed to access the database, adding |
I've updated this PR, primarily adding a test and updating documentation.
Regarding hooks where
Regarding what should be the preferred approach, I've updated the documentation for all hooks that previously said:
It means that you need to run `dbt run`, `dbt compile` before run this hook.
To now read:
It means that you need to run `dbt parse` before run this hook.
The difference between the output of |
Hey @pgoslatara We found this in
Meaning this PR won't work in dbt <1.5. Could you add some kind of warning message (maybe in the README) with this information? |
@BAntonellini Good catch! In 123452f I added that dbt >= 1.5 is required when |
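For anyone who wants to guard against this in their own tooling, a minimal sketch of the dbt >= 1.5 check might look like the following. The helper names are hypothetical and this is not code from the PR; it only compares a version string such as the `1.7.10` shown in the logs above against the 1.5 minimum.

```python
import re

# Minimum dbt version discussed in this thread for the parse-based approach.
MIN_DBT_FOR_PARSE = (1, 5)


def parse_version(text):
    """Extract (major, minor) from a version string such as '1.7.10'."""
    match = re.match(r"\s*(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"unrecognised dbt version: {text!r}")
    return int(match.group(1)), int(match.group(2))


def supports_parse_hook(version_text):
    """True when the given dbt version meets the 1.5 minimum."""
    return parse_version(version_text) >= MIN_DBT_FOR_PARSE
```

Comparing `(major, minor)` tuples avoids the string-comparison trap where `"1.10"` would sort before `"1.5"`.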
@BAntonellini @noel Happy to move forward with this or is there another aspect we should look into? |
Why is the hook configured to run only for SQL files?
Changes in YAML files, e.g. in |
Also, the hook is missing:
|
Addresses #166.
I really like what `dbt-checkpoint` offers and want to use it on most of the dbt projects I work with. One downside is that it needs to run `dbt compile` for a lot of hooks and this requires a connection to the underlying database. This can be rather painful as a lot of my recent clients are using dbt Cloud and when we try to enable `dbt-checkpoint` through an Azure DevOps pipeline or GitHub workflow then we need to add authentication from the version control provider to the database, this is a barrier for a lot of clients and a hard blocker for some. This PR adds a `dbt-parse` hook that generates a `manifest.json` but doesn't require a database connection.
To do:
`dbt-parse` and not `dbt-compile`.
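As a rough illustration of what a parse step amounts to, the sketch below shells out to `dbt parse` and then checks that a manifest was written. It is not the hook implementation from this PR: the function names are hypothetical, and the `target/manifest.json` location assumes the project uses dbt's default target path.

```python
import subprocess
from pathlib import Path


def build_parse_command(project_dir, profiles_dir=None):
    """Build the `dbt parse` command line for a project directory."""
    cmd = ["dbt", "parse", "--project-dir", str(project_dir)]
    if profiles_dir is not None:
        cmd += ["--profiles-dir", str(profiles_dir)]
    return cmd


def run_parse(project_dir, profiles_dir=None):
    """Run `dbt parse`; return 0 only if it succeeds and a manifest exists.

    Assumes the default target path (`target/`) inside the project directory.
    """
    result = subprocess.run(build_parse_command(project_dir, profiles_dir), check=False)
    if result.returncode != 0:
        return result.returncode
    manifest = Path(project_dir) / "target" / "manifest.json"
    return 0 if manifest.is_file() else 1
```

Keeping the command construction separate from the subprocess call makes it easy to inspect or test the command without having dbt installed.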