Add initial unit tests for ModelRunner class #10196

Merged · 5 commits into main · Jun 18, 2024

Conversation

@QMalcolm (Contributor) commented May 21, 2024

resolves #9867

Problem

Previously we had no easy way to unit test classes like ModelRunner, because a lot of objects had to be mocked or built before any actual testing could begin. Over the past few months we've been building up a fair number of testing fixtures and utilities, and we wanted to investigate how close they get us to being able to unit test the ModelRunner class.

Solution

Attempt to unit test the ModelRunner class. We were able to write a unit test for the print_result_line method, which itself calls many ModelRunner methods. However, we were unable to successfully test the primary method, execute.
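The approach leans on catching dbt's fired events rather than patching logging calls. For orientation, the event-catching utility the tests rely on can be sketched roughly as below. This is an illustration inferred from how the utility is used in the review thread further down; the import paths and exact field types are assumptions, not the merged implementation.

```python
from dataclasses import dataclass, field
from typing import List, Type

# assumed import paths for dbt-common's event base types
from dbt_common.events.base_types import BaseEvent, EventMsg


@dataclass
class EventCatcher:
    event_to_catch: Type[BaseEvent]
    caught_events: List[EventMsg] = field(default_factory=list)

    def catch(self, event: EventMsg) -> None:
        # callbacks on the event manager see every fired event;
        # keep only the event type this catcher was built for
        if event.info.name == self.event_to_catch.__name__:
            self.caught_events.append(event)
```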

Checklist

  • I have read the contributing guide and understand what's expected of me
  • I have run this code in development and it appears to resolve the stated issue
  • This PR includes tests, or tests are not required/relevant for this PR
  • This PR has no interface changes (e.g. macros, cli, logs, json artifacts, config files, adapter interface, etc) or this PR has already received feedback and approval from Product or DX
  • This PR includes type annotations for new and modified functions

@QMalcolm QMalcolm added the Skip Changelog Skips GHA to check for changelog file label May 21, 2024
@cla-bot cla-bot bot added the cla:yes label May 21, 2024
codecov bot commented May 21, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 88.71%. Comparing base (d4a6482) to head (4bda8ab).
Report is 3 commits behind head on main.

Additional details and impacted files

```
@@            Coverage Diff             @@
##             main   #10196      +/-   ##
==========================================
- Coverage   88.77%   88.71%   -0.06%     
==========================================
  Files         180      180              
  Lines       22483    22488       +5     
==========================================
- Hits        19959    19951       -8     
- Misses       2524     2537      +13     
```

| Flag        | Coverage   | Δ         |
| ----------- | ---------- | --------- |
| integration | 85.97% <ø> | -0.13% ⬇️ |
| unit        | 62.13% <ø> | +0.34% ⬆️ |


@QMalcolm QMalcolm force-pushed the qmalcolm--9867-model-runner-unit-testing branch from b225e21 to f745ba1 Compare May 22, 2024 05:16
@QMalcolm QMalcolm force-pushed the qmalcolm--9867-model-runner-unit-testing branch from f745ba1 to fbd2667 Compare June 13, 2024 19:19
An attempt at testing `ModelRunner.execute`. We should probably also be
asserting that the model has been executed. However, before we could get there,
we ran into runtime errors during `ModelRunner.execute`. Currently the
struggle is ensuring the adapter exists in the global factory when `execute`
goes looking for it. The error we're getting looks like the following:
```
    def test_execute(self, table_model: ModelNode, manifest: Manifest, model_runner: ModelRunner) -> None:
>       model_runner.execute(model=table_model, manifest=manifest)

tests/unit/task/test_run.py:121:
----
core/dbt/task/run.py:259: in execute
    context = generate_runtime_model_context(model, self.config, manifest)
core/dbt/context/providers.py:1636: in generate_runtime_model_context
    ctx = ModelContext(model, config, manifest, RuntimeProvider(), None)
core/dbt/context/providers.py:834: in __init__
    self.adapter = get_adapter(self.config)
venv/lib/python3.10/site-packages/dbt/adapters/factory.py:207: in get_adapter
    return FACTORY.lookup_adapter(config.credentials.type)
----
self = <dbt.adapters.factory.AdapterContainer object at 0x106e73280>, adapter_name = 'postgres'

    def lookup_adapter(self, adapter_name: str) -> Adapter:
>       return self.adapters[adapter_name]
E       KeyError: 'postgres'

venv/lib/python3.10/site-packages/dbt/adapters/factory.py:132: KeyError
```
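The traceback shows `lookup_adapter` reading straight out of `FACTORY.adapters`, so one hedged workaround is to patch that dict with an adapter double for the duration of a test. This is a sketch, not what the PR landed on (the next commit supplies a real `PostgresAdapter` instead), and `mock_adapter` is a hypothetical fixture providing a pre-built adapter double whose credentials report the type "postgres":

```python
from unittest import mock

import pytest
from dbt.adapters.factory import FACTORY  # the global AdapterContainer from the traceback


@pytest.fixture
def registered_adapter(mock_adapter):
    # get_adapter(self.config) resolves via FACTORY.lookup_adapter, which reads
    # self.adapters[adapter_name]; registering the double under the key the
    # credentials report ("postgres") is enough for the lookup to succeed.
    with mock.patch.dict(FACTORY.adapters, {"postgres": mock_adapter}):
        yield mock_adapter
```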
Previously we were running into an issue where, during `ModelRunner.execute`,
the mock adapter we were using wouldn't be found in the global adapter
factory. We've gotten past this error by supplying a "real" adapter, a
`PostgresAdapter` instance. However, we're now running into a new error
in which the materialization macro can't be found. This error looks like
```
model_runner = <dbt.task.run.ModelRunner object at 0x106746650>

    def test_execute(
        self, table_model: ModelNode, manifest: Manifest, model_runner: ModelRunner
    ) -> None:
>       model_runner.execute(model=table_model, manifest=manifest)

tests/unit/task/test_run.py:129:
----
self = <dbt.task.run.ModelRunner object at 0x106746650>
model = ModelNode(database='dbt', schema='dbt_schema', name='table_model', resource_type=<NodeType.Model: 'model'>, package_na...ected'>, constraints=[], version=None, latest_version=None, deprecation_date=None, defer_relation=None, primary_key=[])
manifest = Manifest(nodes={'seed.pkg.seed': SeedNode(database='dbt', schema='dbt_schema', name='seed', resource_type=<NodeType.Se...s(show=True, node_color=None), patch_path=None, arguments=[], created_at=1718229810.21914, supported_languages=None)}})

    def execute(self, model, manifest):
        context = generate_runtime_model_context(model, self.config, manifest)

        materialization_macro = manifest.find_materialization_macro_by_name(
            self.config.project_name, model.get_materialization(), self.adapter.type()
        )

        if materialization_macro is None:
>           raise MissingMaterializationError(
                materialization=model.get_materialization(), adapter_type=self.adapter.type()
            )
E           dbt.adapters.exceptions.compilation.MissingMaterializationError: Compilation Error
E             No materialization 'table' was found for adapter postgres! (searched types 'default' and 'postgres')

core/dbt/task/run.py:266: MissingMaterializationError
```
…xecute` test

Previously the `TestModelRunner::test_execute` test was running into a runtime error
due to the `materialization_table_default` macro not existing in the project. This
commit adds that macro to the project (though it should ideally get loaded via interactions
between the manifest and adapter). Manually adding it resolved our previous issue but created
a new one: the macro appears to not be properly loaded into the manifest, and thus isn't
discoverable later on when gathering the macros for the jinja context. This leads to an error
that looks like the following:
```
model_runner = <dbt.task.run.ModelRunner object at 0x1080a4f70>

    def test_execute(
        self, table_model: ModelNode, manifest: Manifest, model_runner: ModelRunner
    ) -> None:
>       model_runner.execute(model=table_model, manifest=manifest)

tests/unit/task/test_run.py:129:
----
core/dbt/task/run.py:287: in execute
    result = MacroGenerator(
core/dbt/clients/jinja.py:82: in __call__
    return self.call_macro(*args, **kwargs)
venv/lib/python3.10/site-packages/dbt_common/clients/jinja.py:294: in call_macro
    macro = self.get_macro()
---
self = <dbt.clients.jinja.MacroGenerator object at 0x1080f3130>

    def get_macro(self):
        name = self.get_name()
        template = self.get_template()
        # make the module. previously we set both vars and local, but that's
        # redundant: They both end up in the same place
        # make_module is in jinja2.environment. It returns a TemplateModule
        module = template.make_module(vars=self.context, shared=False)
>       macro = module.__dict__[get_dbt_macro_name(name)]
E       KeyError: 'dbt_macro__materialization_table_default'

venv/lib/python3.10/site-packages/dbt_common/clients/jinja.py:277: KeyError
```

It's becoming apparent that we need to find a better way to either mock or legitimately
load the default and adapter macros. At this point I think I've exhausted the time box
I should be using to figure out whether testing the `ModelRunner` class is currently
possible; the result is that more work has yet to be done.
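For anyone picking this up, here is a hedged sketch of the hand-registration approach the commit message describes: building a `Macro` node named `materialization_table_default` and attaching it to the manifest so `find_materialization_macro_by_name` can see it. The import paths and field values are assumptions for illustration only, and, as the traceback shows, the macro SQL must also make it into the rendered jinja template module for `MacroGenerator` to resolve it.

```python
from dbt.contracts.graph.nodes import Macro  # assumed import path
from dbt.node_types import NodeType  # assumed import path

# Illustrative values only; the real default table materialization lives in
# dbt's global project and should ideally be loaded rather than hand-built.
table_materialization = Macro(
    name="materialization_table_default",
    resource_type=NodeType.Macro,
    package_name="dbt",
    path="macros/materializations/table.sql",
    original_file_path="macros/materializations/table.sql",
    unique_id="macro.dbt.materialization_table_default",
    macro_sql="{% materialization table, default %}{% endmaterialization %}",
)

# register it where find_materialization_macro_by_name will look
manifest.macros[table_materialization.unique_id] = table_materialization
```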
@QMalcolm QMalcolm force-pushed the qmalcolm--9867-model-runner-unit-testing branch from fbd2667 to 161b32e Compare June 13, 2024 20:09
@QMalcolm QMalcolm marked this pull request as ready for review June 13, 2024 20:35
@QMalcolm QMalcolm requested a review from a team as a code owner June 13, 2024 20:35
Review thread on the new test setup:

```python
    run_result: RunResult,
) -> None:
    # Setup way to catch events
    event_manager = get_event_manager()
```
Contributor commented:
Should this be some setup steps vs just something in a single test?

QMalcolm (Contributor, Author) replied:
We can move this to the local log_model_result_catcher fixture. Additionally, the setup can be reduced by a few steps because, in the time since originally writing this, we've released a version of dbt-common which provides an add_callback_to_manager method. The log_model_result_catcher fixture would become:

```python
@pytest.fixture
def log_model_result_catcher(self) -> EventCatcher:
    catcher = EventCatcher(event_to_catch=LogModelResult)
    add_callback_to_manager(catcher.catch)
    return catcher
```
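A test then just requests the fixture and asserts on what was caught. A sketch using the fixture and object names from this thread (these names are assumptions taken from the discussion, not the merged test):

```python
def test_print_result_line(
    self,
    log_model_result_catcher: EventCatcher,
    model_runner: ModelRunner,
    run_result: RunResult,
) -> None:
    model_runner.print_result_line(run_result)

    # exactly one LogModelResult should have been fired for the printed result
    assert len(log_model_result_catcher.caught_events) == 1
```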

```python
def test_execute(
    self, table_model: ModelNode, manifest: Manifest, model_runner: ModelRunner
) -> None:
    model_runner.execute(model=table_model, manifest=manifest)
```
Contributor commented:
Any thoughts on how we can actually test this? I think we probably want to think a bit more about what the runner layer's logic is. Is it running a specific macro with the correct parameters, or something else?

QMalcolm (Author) replied:

A non-exhaustive list of things we could unit test about the execute function:

  1. The result on success
    a. execute should return a RunResult
    b. assert that the status is success
    c. assert that the adapter_response is as expected
  2. If the materialization_macro is not found, the appropriate error is raised
    a. this is what we are unintentionally doing currently (and we want to be doing it intentionally; see the sketch below)
  3. If config is not in context, the appropriate error is raised
  4. If the materialization macro doesn't support the language of the model, the appropriate error is raised
  5. Pre-model hook is fired
  6. Post-model hook is fired

What is blocking most of this currently is (2.a)
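For illustration, making (2) intentional would be a small test once the setup issues are resolved. A sketch using the fixture names from this thread, with the exception's import path taken from the traceback above:

```python
import pytest
from dbt.adapters.exceptions.compilation import MissingMaterializationError


def test_execute_raises_missing_materialization(
    self, table_model: ModelNode, manifest: Manifest, model_runner: ModelRunner
) -> None:
    # with no materialization macro loadable for the model, execute should
    # surface MissingMaterializationError rather than an opaque jinja KeyError
    with pytest.raises(MissingMaterializationError):
        model_runner.execute(model=table_model, manifest=manifest)
```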

Contributor replied:
Should we open a follow up ticket about 2.a?

QMalcolm (Author) replied Jun 18, 2024:
Yes indeedy 👍

@QMalcolm QMalcolm requested a review from ChenyuLInx June 17, 2024 15:50
@ChenyuLInx (Contributor) left a review:
LGTM!
Let's make sure we have a follow-up ticket to run `execute` (or maybe to have some sort of scoped-down materialization run).

@QMalcolm QMalcolm merged commit da19d7b into main Jun 18, 2024
60 checks passed
@QMalcolm QMalcolm deleted the qmalcolm--9867-model-runner-unit-testing branch June 18, 2024 00:06
Labels: cla:yes, Skip Changelog (Skips GHA to check for changelog file)
Projects: None yet
Development

Successfully merging this pull request may close these issues:

[SPIKE+] Make a plan for how to write a unittest for ModelRunner
2 participants