Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Run dbt deps before running the dbt ls command #481

Merged
merged 48 commits into from
Aug 31, 2023

Conversation

DanMawdsleyBA
Copy link
Contributor

@DanMawdsleyBA DanMawdsleyBA commented Aug 21, 2023

Description

Runs the dbt deps command before running the dbt ls command

Related Issue(s)

Closes #445

Breaking Change?

Checklist

  • I have made corresponding changes to the documentation (if required)
  • I have added tests that prove my fix is effective or that my feature works

@DanMawdsleyBA DanMawdsleyBA requested a review from a team as a code owner August 21, 2023 20:08
@DanMawdsleyBA DanMawdsleyBA requested a review from a team August 21, 2023 20:08
@netlify
Copy link

netlify bot commented Aug 21, 2023

👷 Deploy Preview for amazing-pothos-a3bca0 processing.

Name Link
🔨 Latest commit 3103c30
🔍 Latest deploy log https://app.netlify.com/sites/amazing-pothos-a3bca0/deploys/64f07a9f6fe79a00081d0261

@pre-commit-ci pre-commit-ci bot temporarily deployed to internal August 21, 2023 20:09 Inactive
@pre-commit-ci pre-commit-ci bot temporarily deployed to internal August 21, 2023 20:21 Inactive
@DanMawdsleyBA DanMawdsleyBA temporarily deployed to internal August 22, 2023 06:36 — with GitHub Actions Inactive
Copy link
Collaborator

@tatiana tatiana left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great. Thanks a lot for your first contribution, @DanMawdsleyBA ! I'm hopeful this could be part of the upcoming 1.1 release.

I can see two areas for improvement:

  1. A test that represents the original issue this PR solves, that breaks before the change and passes after this modification. Ideally, it would highlight what happens when a profile depends on a specific dbt adapter that is not installed and which error would be raised with dbt ls.
  2. Do you think all users would be happy for, during DAG parsing time, to have dbt deps installed? What would happen if they had a read-only filesystem? A possibility to address this could be to have a new parameter added to RenderConfig, something like install_deps, set to True by default, but users could opt out (@jlaneve may have other ideas/suggestions)

Please do reach out if you'd like support writing the test.

@DanMawdsleyBA
Copy link
Contributor Author

DanMawdsleyBA commented Aug 22, 2023

This looks great. Thanks a lot for your first contribution, @DanMawdsleyBA ! I'm hopeful this could be part of the upcoming 1.1 release.

I can see two areas for improvement:

  1. A test that represents the original issue this PR solves, that breaks before the change and passes after this modification. Ideally, it would highlight what happens when a profile depends on a specific dbt adapter that is not installed and which error would be raised with dbt ls.
  2. Do you think all users would be happy for, during DAG parsing time, to have dbt deps installed? What would happen if they had a read-only filesystem? A possibility to address this could be to have a new parameter added to RenderConfig, something like install_deps, set to True by default, but users could opt out (@jlaneve may have other ideas/suggestions)

Please do reach out if you'd like support writing the test.

@tatiana
I've added a dbt_deps into RenderConfig with a default value as True but can be set to False to not run dbt deps this only applies when loading via dbt ls . I've added a test to catch the failure by having a package file with dbt_utils and manually setting the dbt deps to False. I also improved the errors to catch this specific case. Let me know if you have any more thoughts. If you're happy if you could add it to a pre-release version I can test it on an actual airflow instance

@pre-commit-ci pre-commit-ci bot temporarily deployed to internal August 22, 2023 14:31 Inactive
@harels harels added this to the 1.1.0 milestone Aug 22, 2023
@pre-commit-ci pre-commit-ci bot temporarily deployed to internal August 22, 2023 15:11 Inactive
@DanMawdsleyBA
Copy link
Contributor Author

One thing I have realised is that this fix won't work on systems that have the read only issue as by default the packages are stored in dbt project path/ packages . There is a variable that can be set in the dbt_project.yml packages_install_path which may help however I don't think there is a way to specify that via a flag or env_var (Although happy to be wrong on that). It might be possible to do some kind of project.yml update to that variable and override the current but feels a janky solution. Not sure if @harels or @tatiana do you have any better ideas?
I'm starting to feel it might be easier to just copy the whole dbt project into a new directory for read only systems and have everything point to that new directory.

@tatiana
Copy link
Collaborator

tatiana commented Aug 23, 2023

One thing I have realised is that this fix won't work on systems that have the read only issue as by default the packages are stored in dbt project path/ packages . There is a variable that can be set in the dbt_project.yml packages_install_path which may help however I don't think there is a way to specify that via a flag or env_var (Although happy to be wrong on that). It might be possible to do some kind of project.yml update to that variable and override the current but feels a janky solution. Not sure if @harels or @tatiana do you have any better ideas? I'm starting to feel it might be easier to just copy the whole dbt project into a new directory for read only systems and have everything point to that new directory.

Hi @DanMawdsleyBA , that's a very important point.
I refactored a bit the logic in the dbt ls and hopefully it will be able to address the issue of running dbt deps on project directories that were originally read only: #488

cosmos/dbt/graph.py Outdated Show resolved Hide resolved
tests/dbt/test_graph.py Outdated Show resolved Hide resolved
@tatiana tatiana temporarily deployed to external August 31, 2023 09:33 — with GitHub Actions Inactive
@tatiana tatiana temporarily deployed to external August 31, 2023 09:48 — with GitHub Actions Inactive
@tatiana tatiana temporarily deployed to external August 31, 2023 10:34 — with GitHub Actions Inactive
@tatiana tatiana changed the title Added deps before ls command Run dbt deps before running the dbt ls command Aug 31, 2023
@codecov
Copy link

codecov bot commented Aug 31, 2023

Codecov Report

Patch coverage: 92.85% and project coverage change: -0.08% ⚠️

Comparison is base (8c9e768) 91.59% compared to head (3103c30) 91.51%.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #481      +/-   ##
==========================================
- Coverage   91.59%   91.51%   -0.08%     
==========================================
  Files          50       50              
  Lines        1856     1874      +18     
==========================================
+ Hits         1700     1715      +15     
- Misses        156      159       +3     
Files Changed Coverage Δ
cosmos/dbt/graph.py 98.67% <92.30%> (-1.33%) ⬇️
cosmos/config.py 92.85% <100.00%> (+0.08%) ⬆️
cosmos/converter.py 95.77% <100.00%> (-1.37%) ⬇️

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@tatiana tatiana temporarily deployed to external August 31, 2023 11:34 — with GitHub Actions Inactive
@tatiana
Copy link
Collaborator

tatiana commented Aug 31, 2023

@DanMawdsleyBA We managed to get all the tests to pass 🎉 Let's see after the rebase. Once this is passing, we'll merge this. It will be released as part of 1.1.

Copy link
Collaborator

@tatiana tatiana left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Happy for these changes to be merged, I could test it locally.

It will help many Cosmos users with DAGs with no nodes due to dependencies not being installed. We'll improve the coverage in a follow up PR.

@tatiana tatiana merged commit 13087d5 into astronomer:main Aug 31, 2023
40 of 41 checks passed
@tatiana tatiana mentioned this pull request Sep 6, 2023
tatiana added a commit that referenced this pull request Sep 6, 2023
**Features**

* Support dbt global flags (via dbt_cmd_global_flags in operator_args)
by @tatiana in #469
* Support parsing DAGs when there are no connections by @jlaneve in #489

**Enhancements**

* Hide sensitive field when using BigQuery keyfile_dict profile mapping
by @jbandoro in #471
* Consistent Airflow Dataset URIs, inlets and outlets with `Openlineage
package <https://pypi.org/project/openlineage-integration-common/>`_ by
@tatiana in #485. `Read more
<https://astronomer.github.io/astronomer-cosmos/configuration/lineage.html>`_.
* Refactor ``LoadMethod.DBT_LS`` to run from a temporary directory with
symbolic links by @tatiana in #488
* Run ``dbt deps`` when using ``LoadMethod.DBT_LS`` by @DanMawdsleyBA in
#481
* Update Cosmos log color to purple by @harels in #494
* Change operators to log ``dbt`` commands output as opposed to
recording to XCom by @tatiana in #513

**Bug fixes**

* Fix bug on select node add exclude selector subset ids logic by
@jensenity in #463
* Refactor dbt ls to run from a temporary directory, to avoid Read-only
file system errors during DAG parsing, by @tatiana in #414
* Fix profile_config arg in DbtKubernetesBaseOperator by @david-mag in
#505
* Fix SnowflakePrivateKeyPemProfileMapping private_key reference by
@nacpacheco in #501
* Fix incorrect temporary directory creation in VirtualenvOperator init
by @tatiana in #500
* Fix log propagation issue by @tatiana in #498
* Fix PostgresUserPasswordProfileMapping to retrieve port from
connection by @jlneve in #511

**Others**

* Docs: Fix RenderConfig load argument by @jbandoro in #466
* Enable CI integration tests from external forks by @tatiana in #458
* Improve CI tests runtime by @tatiana in #457
* Change CI to run coverage after tests pass by @tatiana in #461
* Fix forks code revision in code coverage by @tatiana in #472
* [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #467
* Drop support to Python 3.7 in the CI test matrix by @harels in #490
* Add Airflow 2.7 to the CI test matrix by @tatiana in #487
* Add MyPy type checks to CI since we exceeded pre-commit disk quota
usage by @tatiana in #510
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Run dbt deps when using load_mode=LoadMode.DBT_LS
4 participants