Add initial unit tests for ModelRunner class #10196

Merged · 5 commits into main · Jun 18, 2024

Conversation

@QMalcolm (Contributor) commented May 21, 2024

resolves #9867

Problem

Previously we had no easy way to unit test classes like ModelRunner, because a lot of objects had to be mocked or built before any actual testing could begin. Over the past few months we've been building up a fair number of testing fixtures and utilities, and we wanted to investigate how close they get us to being able to unit test the ModelRunner class.

Solution

Attempt to unit test the ModelRunner class. We were able to write a unit test for the print_result_line method, which itself calls many ModelRunner methods. However, we were unable to successfully test the primary method, execute.
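The approach leans on catching dbt's fired events rather than patching logging calls. For orientation, the event-catching utility the tests rely on can be sketched roughly as below. This is an illustration inferred from how the utility is used in the review thread further down; the import paths and exact field types are assumptions, not the merged implementation.

```python
from dataclasses import dataclass, field
from typing import List, Type

# assumed import paths for dbt-common's event base types
from dbt_common.events.base_types import BaseEvent, EventMsg


@dataclass
class EventCatcher:
    event_to_catch: Type[BaseEvent]
    caught_events: List[EventMsg] = field(default_factory=list)

    def catch(self, event: EventMsg) -> None:
        # callbacks on the event manager see every fired event;
        # keep only the event type this catcher was built for
        if event.info.name == self.event_to_catch.__name__:
            self.caught_events.append(event)
```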

Checklist

  • I have read the contributing guide and understand what's expected of me
  • I have run this code in development and it appears to resolve the stated issue
  • This PR includes tests, or tests are not required/relevant for this PR
  • This PR has no interface changes (e.g. macros, cli, logs, json artifacts, config files, adapter interface, etc) or this PR has already received feedback and approval from Product or DX
  • This PR includes type annotations for new and modified functions

@QMalcolm QMalcolm added the Skip Changelog Skips GHA to check for changelog file label May 21, 2024
@cla-bot cla-bot bot added the cla:yes label May 21, 2024
codecov bot commented May 21, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 88.71%. Comparing base (d4a6482) to head (4bda8ab).
Report is 3 commits behind head on main.

Additional details and impacted files

```
@@            Coverage Diff             @@
##             main   #10196      +/-   ##
==========================================
- Coverage   88.77%   88.71%   -0.06%     
==========================================
  Files         180      180              
  Lines       22483    22488       +5     
==========================================
- Hits        19959    19951       -8     
- Misses       2524     2537      +13     
```

| Flag        | Coverage   | Δ         |
| ----------- | ---------- | --------- |
| integration | 85.97% <ø> | -0.13% ⬇️ |
| unit        | 62.13% <ø> | +0.34% ⬆️ |


@QMalcolm QMalcolm force-pushed the qmalcolm--9867-model-runner-unit-testing branch from b225e21 to f745ba1 Compare May 22, 2024 05:16
@QMalcolm QMalcolm force-pushed the qmalcolm--9867-model-runner-unit-testing branch from f745ba1 to fbd2667 Compare June 13, 2024 19:19
An attempt at testing `ModelRunner.execute`. We should probably also be
asserting that the model has been executed. However, before we could get there,
we ran into runtime errors during `ModelRunner.execute`. Currently the
struggle is ensuring the adapter exists in the global factory when `execute`
goes looking for it. The error we're getting looks like the following:
```
    def test_execute(self, table_model: ModelNode, manifest: Manifest, model_runner: ModelRunner) -> None:
>       model_runner.execute(model=table_model, manifest=manifest)

tests/unit/task/test_run.py:121:
----
core/dbt/task/run.py:259: in execute
    context = generate_runtime_model_context(model, self.config, manifest)
core/dbt/context/providers.py:1636: in generate_runtime_model_context
    ctx = ModelContext(model, config, manifest, RuntimeProvider(), None)
core/dbt/context/providers.py:834: in __init__
    self.adapter = get_adapter(self.config)
venv/lib/python3.10/site-packages/dbt/adapters/factory.py:207: in get_adapter
    return FACTORY.lookup_adapter(config.credentials.type)
----
self = <dbt.adapters.factory.AdapterContainer object at 0x106e73280>, adapter_name = 'postgres'

    def lookup_adapter(self, adapter_name: str) -> Adapter:
>       return self.adapters[adapter_name]
E       KeyError: 'postgres'

venv/lib/python3.10/site-packages/dbt/adapters/factory.py:132: KeyError
```
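The traceback shows `lookup_adapter` reading straight out of `FACTORY.adapters`, so one hedged workaround is to patch that dict with an adapter double for the duration of a test. This is a sketch, not what the PR landed on (the next commit supplies a real `PostgresAdapter` instead), and `mock_adapter` is a hypothetical fixture providing a pre-built adapter double whose credentials report the type "postgres":

```python
from unittest import mock

import pytest
from dbt.adapters.factory import FACTORY  # the global AdapterContainer from the traceback


@pytest.fixture
def registered_adapter(mock_adapter):
    # get_adapter(self.config) resolves via FACTORY.lookup_adapter, which reads
    # self.adapters[adapter_name]; registering the double under the key the
    # credentials report ("postgres") is enough for the lookup to succeed.
    with mock.patch.dict(FACTORY.adapters, {"postgres": mock_adapter}):
        yield mock_adapter
```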
Previously we were running into an issue where, during `ModelRunner.execute`,
the mock adapter we were using wouldn't be found in the global adapter
factory. We've gotten past this error by supplying a "real" adapter, a
`PostgresAdapter` instance. However, we're now running into a new error
in which the materialization macro can't be found. This error looks like
```
model_runner = <dbt.task.run.ModelRunner object at 0x106746650>

    def test_execute(
        self, table_model: ModelNode, manifest: Manifest, model_runner: ModelRunner
    ) -> None:
>       model_runner.execute(model=table_model, manifest=manifest)

tests/unit/task/test_run.py:129:
----
self = <dbt.task.run.ModelRunner object at 0x106746650>
model = ModelNode(database='dbt', schema='dbt_schema', name='table_model', resource_type=<NodeType.Model: 'model'>, package_na...ected'>, constraints=[], version=None, latest_version=None, deprecation_date=None, defer_relation=None, primary_key=[])
manifest = Manifest(nodes={'seed.pkg.seed': SeedNode(database='dbt', schema='dbt_schema', name='seed', resource_type=<NodeType.Se...s(show=True, node_color=None), patch_path=None, arguments=[], created_at=1718229810.21914, supported_languages=None)}})

    def execute(self, model, manifest):
        context = generate_runtime_model_context(model, self.config, manifest)

        materialization_macro = manifest.find_materialization_macro_by_name(
            self.config.project_name, model.get_materialization(), self.adapter.type()
        )

        if materialization_macro is None:
>           raise MissingMaterializationError(
                materialization=model.get_materialization(), adapter_type=self.adapter.type()
            )
E           dbt.adapters.exceptions.compilation.MissingMaterializationError: Compilation Error
E             No materialization 'table' was found for adapter postgres! (searched types 'default' and 'postgres')

core/dbt/task/run.py:266: MissingMaterializationError
```
…xecute` test

Previously the `TestModelRunner::test_execute` test was running into a runtime error
due to the `materialization_table_default` macro not existing in the project. This
commit adds that macro to the project (though it should ideally get loaded via interactions
between the manifest and adapter). Manually adding it resolved our previous issue but created
a new one: the macro appears to not be properly loaded into the manifest, and thus isn't
discoverable later on when gathering the macros for the jinja context. This leads to an error
that looks like the following:
```
model_runner = <dbt.task.run.ModelRunner object at 0x1080a4f70>

    def test_execute(
        self, table_model: ModelNode, manifest: Manifest, model_runner: ModelRunner
    ) -> None:
>       model_runner.execute(model=table_model, manifest=manifest)

tests/unit/task/test_run.py:129:
----
core/dbt/task/run.py:287: in execute
    result = MacroGenerator(
core/dbt/clients/jinja.py:82: in __call__
    return self.call_macro(*args, **kwargs)
venv/lib/python3.10/site-packages/dbt_common/clients/jinja.py:294: in call_macro
    macro = self.get_macro()
---
self = <dbt.clients.jinja.MacroGenerator object at 0x1080f3130>

    def get_macro(self):
        name = self.get_name()
        template = self.get_template()
        # make the module. previously we set both vars and local, but that's
        # redundant: They both end up in the same place
        # make_module is in jinja2.environment. It returns a TemplateModule
        module = template.make_module(vars=self.context, shared=False)
>       macro = module.__dict__[get_dbt_macro_name(name)]
E       KeyError: 'dbt_macro__materialization_table_default'

venv/lib/python3.10/site-packages/dbt_common/clients/jinja.py:277: KeyError
```

It's becoming apparent that we need to find a better way to either mock or legitimately
load the default and adapter macros. At this point I think I've exhausted the time box
I should be using to figure out whether testing the `ModelRunner` class is currently
possible; the result is that more work has yet to be done.
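For anyone picking this up, here is a hedged sketch of the hand-registration approach the commit message describes: building a `Macro` node named `materialization_table_default` and attaching it to the manifest so `find_materialization_macro_by_name` can see it. The import paths and field values are assumptions for illustration only, and, as the traceback shows, the macro SQL must also make it into the rendered jinja template module for `MacroGenerator` to resolve it.

```python
from dbt.contracts.graph.nodes import Macro  # assumed import path
from dbt.node_types import NodeType  # assumed import path

# Illustrative values only; the real default table materialization lives in
# dbt's global project and should ideally be loaded rather than hand-built.
table_materialization = Macro(
    name="materialization_table_default",
    resource_type=NodeType.Macro,
    package_name="dbt",
    path="macros/materializations/table.sql",
    original_file_path="macros/materializations/table.sql",
    unique_id="macro.dbt.materialization_table_default",
    macro_sql="{% materialization table, default %}{% endmaterialization %}",
)

# register it where find_materialization_macro_by_name will look
manifest.macros[table_materialization.unique_id] = table_materialization
```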
@QMalcolm QMalcolm force-pushed the qmalcolm--9867-model-runner-unit-testing branch from fbd2667 to 161b32e Compare June 13, 2024 20:09
@QMalcolm QMalcolm marked this pull request as ready for review June 13, 2024 20:35
@QMalcolm QMalcolm requested a review from a team as a code owner June 13, 2024 20:35
Review thread on the new test setup:

```python
    run_result: RunResult,
) -> None:
    # Setup way to catch events
    event_manager = get_event_manager()
```
Contributor commented:
Should this be some setup steps vs just something in a single test?

QMalcolm (Contributor, Author) replied:
We can move this to the local log_model_result_catcher fixture. Additionally, the setup can be reduced by a few steps because, in the time since originally writing this, we've released a version of dbt-common which provides an add_callback_to_manager method. The log_model_result_catcher fixture would become:

```python
@pytest.fixture
def log_model_result_catcher(self) -> EventCatcher:
    catcher = EventCatcher(event_to_catch=LogModelResult)
    add_callback_to_manager(catcher.catch)
    return catcher
```
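A test then just requests the fixture and asserts on what was caught. A sketch using the fixture and object names from this thread (these names are assumptions taken from the discussion, not the merged test):

```python
def test_print_result_line(
    self,
    log_model_result_catcher: EventCatcher,
    model_runner: ModelRunner,
    run_result: RunResult,
) -> None:
    model_runner.print_result_line(run_result)

    # exactly one LogModelResult should have been fired for the printed result
    assert len(log_model_result_catcher.caught_events) == 1
```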

```python
def test_execute(
    self, table_model: ModelNode, manifest: Manifest, model_runner: ModelRunner
) -> None:
    model_runner.execute(model=table_model, manifest=manifest)
```
Contributor commented:
Any thoughts on how we can actually test this? I think we probably want to think a bit more about what the runner layer's logic is. Is it running a specific macro with the correct parameters, or something else?

QMalcolm (Author) replied:

A non-exhaustive list of things we could unit test about the execute function:

  1. The result on success
    a. execute should return a RunResult
    b. assert that the status is success
    c. assert that the adapter_response is as expected
  2. If the materialization_macro is not found, the appropriate error is raised
    a. this is what we are unintentionally doing currently (and we want to be doing it intentionally; see the sketch below)
  3. If config is not in context, the appropriate error is raised
  4. If the materialization macro doesn't support the language of the model, the appropriate error is raised
  5. Pre-model hook is fired
  6. Post-model hook is fired

What is blocking most of this currently is (2.a)
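For illustration, making (2) intentional would be a small test once the setup issues are resolved. A sketch using the fixture names from this thread, with the exception's import path taken from the traceback above:

```python
import pytest
from dbt.adapters.exceptions.compilation import MissingMaterializationError


def test_execute_raises_missing_materialization(
    self, table_model: ModelNode, manifest: Manifest, model_runner: ModelRunner
) -> None:
    # with no materialization macro loadable for the model, execute should
    # surface MissingMaterializationError rather than an opaque jinja KeyError
    with pytest.raises(MissingMaterializationError):
        model_runner.execute(model=table_model, manifest=manifest)
```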

Contributor replied:
Should we open a follow up ticket about 2.a?

QMalcolm (Author) replied Jun 18, 2024:
Yes indeedy 👍

@QMalcolm QMalcolm requested a review from ChenyuLInx June 17, 2024 15:50
@ChenyuLInx (Contributor) left a review:
LGTM!
Let's make sure we have a follow-up ticket to run `execute` (or maybe to have some sort of scoped-down materialization run).

@QMalcolm QMalcolm merged commit da19d7b into main Jun 18, 2024
60 checks passed
@QMalcolm QMalcolm deleted the qmalcolm--9867-model-runner-unit-testing branch June 18, 2024 00:06
Labels: cla:yes, Skip Changelog (Skips GHA to check for changelog file)
Projects: None yet
Development

Successfully merging this pull request may close these issues:

[SPIKE+] Make a plan for how to write a unittest for ModelRunner
2 participants