Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

dbt run, test, seed, etc. counts exclude disabled models but include tests that depend on disabled models #1804

Closed
jrandrews opened this issue Oct 4, 2019 · 5 comments
Labels
bug Something isn't working good_first_issue Straightforward + self-contained changes, good for new contributors!

Comments

@jrandrews
Copy link

Describe the bug

When running dbt compile, dbt run, dbt test, etc., a count of objects in the dbt project are provided, like this:

Found 45 models, 252 tests, 0 snapshots, 0 analyses, 123 macros, 0 operations, 0 seed files, 57 sources

This count excludes dbt models which are disabled, but includes tests which depend on disabled models. This is confusing and inconsistent.

Steps To Reproduce

Create a model and some associated schema tests that are related to the model. Execute dbt compile and look at the counts for the models and tests in the project, e.g.

Found 45 models, 252 tests, 0 snapshots, 0 analyses, 123 macros, 0 operations, 0 seed files, 57 sources

Disable the model. Run dbt compile and observe the number of objects associated with the project. The number of models will have decreased by one but the number of tests will have remained constant even though some of the tests depend on the disabled model, e.g.:

Found 44 models, 252 tests, 0 snapshots, 0 analyses, 123 macros, 0 operations, 0 seed files, 57 sources

Expected behavior

The more immediate behavior to fix the bug would be to exclude tests that are disabled from the count of objects generated from dbt commands. So in the scenario above, assuming that two tests depend on the disabled model, the count of objects after disabling the model should exclude these two tests and be:

Found 44 models, 250 tests, 0 snapshots, 0 analyses, 123 macros, 0 operations, 0 seed files, 57 sources

A more helpful longer-term approach might be to include and count both enabled and disabled objects, such as:

Found 44 enabled models, 1 disabled model, 250 enabled tests, 2 disabled tests, 0 snapshots, 0 analyses, 123 macros, 0 operations, 0 seed files, 57 sources

I understand that there are a lot of mechanical reasons why it is difficult to count disabled models. But given that disabled tests are already being lumped in with enabled tests in the counts, it should hopefully not be that hard to at least call out the number of enabled vs. disabled tests as per the following:

Found 44 models, 250 enabled tests, 2 disabled tests, 0 snapshots, 0 analyses, 123 macros, 0 operations, 0 seed files, 57 sources

Screenshots and log output

Already above.

System information

Which database are you using dbt with?
Snowflake

The output of dbt --version:

installed version: 0.14.2
   latest version: 0.14.2

Up to date!

The operating system you're using:
Mac OS Mojave 10.14.5

The output of python --version:
Python 2.7.10

@jrandrews jrandrews added bug Something isn't working triage labels Oct 4, 2019
@drewbanin drewbanin removed the triage label Oct 7, 2019
@drewbanin
Copy link
Contributor

Thanks @jrandrews! I'd be very interested in merging a PR that excludes disabled models from this resource count log line. Is that something you'd be interested in submitting a patch for? I believe the operative lines of code are around here:
https://github.com/fishtown-analytics/dbt/blob/31e085b7df015eac92faee4912fa481d1085c4b9/core/dbt/ui/printer.py#L90-L106

@drewbanin drewbanin added the good_first_issue Straightforward + self-contained changes, good for new contributors! label Oct 7, 2019
@NiallRees
Copy link
Contributor

Thanks @jrandrews! I'd be very interested in merging a PR that excludes disabled models from this resource count log line. Is that something you'd be interested in submitting a patch for? I believe the operative lines of code are around here:
https://github.com/fishtown-analytics/dbt/blob/31e085b7df015eac92faee4912fa481d1085c4b9/core/dbt/ui/printer.py#L90-L106

The code invoked for compile stats is https://github.com/fishtown-analytics/dbt/blob/e51c942e91a94936f68f2965963d3b46f1257658/core/dbt/compilation.py#L34-L54

Not sure that this is a case of simply filtering out the test nodes in that function, I think the issue lies in a higher level function. From the digging I've done so far it, created manifests exclude disabled models, but includes the tests for them. Here is where disabled models are filtered out: https://github.com/fishtown-analytics/dbt/blob/e51c942e91a94936f68f2965963d3b46f1257658/core/dbt/parser/util.py#L79-L96

This is my first contribution so very new to the codebase, keen for thoughts @drewbanin.

@drewbanin
Copy link
Contributor

hey @NiallRees - I lost track of this one!

This change looks like it would be simple, but it's actually pretty complicated!! There's a lot going on here that's not super obvious.

Take a look at this method:
https://github.com/fishtown-analytics/dbt/blob/8cd8705d67627e8f701afb5f5490d553e76f3dfa/core/dbt/compilation.py#L204-L207

This is where the "stats" that are printed out get generated. In this case, test nodes that reference disabled models will have the property node.config.enabled set to False. This happens in the parsing code, check it out here:

https://github.com/fishtown-analytics/dbt/blob/8cd8705d67627e8f701afb5f5490d553e76f3dfa/core/dbt/parser/util.py#L200-L209

So, I think we can add some logic to this stats generating method to check where node.config.enabled is False, then exclude those nodes from the count. One small problem is that certain node types (like macros) don't have config properties! We could address that by:

  • using something like getattr(node, 'config', None) (not ideal)
  • using a conditional to check the node's resource type, eg (pseudocode):
if node.resource_type in [NodeTypes.model, NodeTypes.test, NodeTypes.snapshot, ....] \
   and not node.config.enabled:
  pass
else:
  stats[node.resource_type] += 1

Happy to follow up here if you have any questions or thoughts - let me know!

@NiallRees
Copy link
Contributor

@drewbanin Submitted a PR here: #2026

@drewbanin drewbanin added this to the Barbara Gittings milestone Jan 6, 2020
@drewbanin
Copy link
Contributor

closed by #2026

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working good_first_issue Straightforward + self-contained changes, good for new contributors!
Projects
None yet
Development

No branches or pull requests

3 participants