Enable inline csv format in unit testing #8743

gshank · 2023-09-28T20:30:15Z

resolves #8626

Problem

Users want to be able to use the csv format in addition to dictionaries in their unit tests.

Solution

Add a "format" field to the given and expect structures, and create an OutputFixture class (in addition to InputFixture).
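As a rough illustration of what the two formats look like once parsed, here is a minimal sketch that writes the same fixture rows as a list of dictionaries and as an inline csv string. The helper name `rows_from_csv` is hypothetical and is not taken from this PR. It relies on the standard library `csv.DictReader`, so every value comes back as a string.

```python
import csv
from io import StringIO
from typing import Dict, List


def rows_from_csv(csv_text: str) -> List[Dict[str, str]]:
    """Hypothetical helper: parse an inline csv string into row dictionaries."""
    # Strip the surrounding blank lines that a multi-line string usually carries.
    reader = csv.DictReader(StringIO(csv_text.strip()))
    return [dict(row) for row in reader]


# The "dict" format: rows are already a list of dictionaries.
dict_rows = [{"id": "1", "name": "a"}, {"id": "2", "name": "b"}]

# The "csv" format: the first line holds the column names.
csv_rows = """
id,name
1,a
2,b
"""

parsed = rows_from_csv(csv_rows)
```

With this input, `parsed` and `dict_rows` hold the same two rows, which is why the parser can treat both formats identically after conversion.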

Checklist

I have read the contributing guide and understand what's expected of me
I have run this code in development and it appears to resolve the stated issue
This PR includes tests, or tests are not required/relevant for this PR
This PR has no interface changes (e.g. macros, cli, logs, json artifacts, config files, adapter interface, etc) or this PR has already received feedback and approval from Product or DX
This PR includes type annotations for new and modified functions

codecov · 2023-09-28T20:32:35Z

Codecov Report

Attention: 2 lines in your changes are missing coverage. Please review.

Comparison is base (bb6fd30) 86.66% compared to head (ada13fb) 86.69%.
Report is 4 commits behind head on unit_testing_feature_branch.

Additional details and impacted files

@@                       Coverage Diff                       @@
##           unit_testing_feature_branch    #8743      +/-   ##
===============================================================
+ Coverage                        86.66%   86.69%   +0.02%     
===============================================================
  Files                              178      178              
  Lines                            26234    26276      +42     
===============================================================
+ Hits                             22737    22779      +42     
  Misses                            3497     3497

Flag	Coverage Δ
integration	`83.51% <95.45%> (+0.02%)`	⬆️
unit	`64.98% <65.90%> (-0.01%)`	⬇️

Flags with carried forward coverage won't be shown.

Files	Coverage Δ
core/dbt/contracts/graph/model_config.py	`92.34% <100.00%> (+0.08%)`	⬆️
core/dbt/contracts/graph/nodes.py	`95.21% <100.00%> (+<0.01%)`	⬆️
core/dbt/parser/unit_tests.py	`95.09% <100.00%> (+2.16%)`	⬆️
core/dbt/contracts/graph/unparsed.py	`93.76% <94.44%> (-0.01%)`	⬇️

... and 3 files with indirect coverage changes



MichelleArk · 2023-10-02T20:40:32Z

core/dbt/parser/unit_tests.py

+        if test_case.expect.format == "dict":
+            if isinstance(test_case.expect.rows, List):
+                expected_rows = test_case.expect.rows
+            else:
+                raise ParsingError("Wrong format for expected rows")
+        else:  # test_case.expect.format == "csv":
+            # build a dictionary from the csv string
+            if isinstance(test_case.expect.rows, str):
+                expected_rows = self._build_rows_from_csv(test_case.expect.rows)
+            else:
+                raise ParsingError("Wrong format for expected rows")


I'm brainstorming ways that we could consolidate this logic and share it across InputFixture and OutputFixture. The simplest thing I can think of is adding a parsed_rows property on a shared UnitTestFixture base class that does this conditional logic to parse the raw rows given a format and convert it into a consistent internal representation. What do you think?

An alternative could be to have a UnitTestFixture class that can be created from either an InputFixture or OutputFixture (making these just the external-facing interfaces). That way we could consolidate the logic of parsing the rows based on the format, into a consistent internal fixture object and store the parsed rows as state on that object.

Yeah, I thought it would be a good idea to have some kind of combined logic but when I wrote this my brain was kind of offline :). Since we know we're going to also support csv files soon, it should be a solution that makes that easy. Let me think about whether a base class or derived class makes more sense...
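One way the shared-logic idea discussed above could look is sketched below. This is only an illustration under stated assumptions, not the code that was merged: `FixtureFormat`, `FixtureRowsMixin`, `InputSketch`, `OutputSketch`, and `get_rows` are hypothetical names, and `ValueError` stands in for whatever parsing error the real parser raises.

```python
import csv
from dataclasses import dataclass, field
from enum import Enum
from io import StringIO
from typing import Any, Dict, List, Union


class FixtureFormat(Enum):
    DICT = "dict"
    CSV = "csv"


class FixtureRowsMixin:
    """Hypothetical shared base: one place that validates and parses rows."""

    def get_rows(self) -> List[Dict[str, Any]]:
        # Subclasses are expected to provide `format` and `rows`.
        if self.format is FixtureFormat.DICT:
            if not isinstance(self.rows, list):
                raise ValueError("dict format expects rows as a list of dictionaries")
            return self.rows
        # Otherwise the format is csv: build dictionaries from the csv string.
        if not isinstance(self.rows, str):
            raise ValueError("csv format expects rows as a string")
        reader = csv.DictReader(StringIO(self.rows.strip()))
        return [dict(row) for row in reader]


@dataclass
class InputSketch(FixtureRowsMixin):
    input: str
    rows: Union[str, List[Dict[str, Any]]] = field(default_factory=list)
    format: FixtureFormat = FixtureFormat.DICT


@dataclass
class OutputSketch(FixtureRowsMixin):
    rows: Union[str, List[Dict[str, Any]]] = field(default_factory=list)
    format: FixtureFormat = FixtureFormat.DICT


given = InputSketch(
    input="ref('my_model')",
    rows="id,name\n1,a\n2,b",
    format=FixtureFormat.CSV,
)
expect = OutputSketch(rows=[{"id": 1, "name": "a"}])
```

Both `given.get_rows()` and `expect.get_rows()` return a list of dictionaries, so the caller no longer needs to branch on the format. Supporting csv files later would, in this sketch, mean adding one more branch in `get_rows`.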

MichelleArk · 2023-10-04T19:35:49Z

core/dbt/contracts/graph/unparsed.py

+    @property
+    def format(self) -> UnitTestFormat:
+        return UnitTestFormat.Dict
+
+    @property
+    def rows(self) -> Union[str, List[Dict[str, Any]]]:
+        return []


Is it necessary for these to be properties? Could they instead be attributes that are inherited by UnitTestInputFixture and UnitTestOutputFixture?

e.g.

rows: Union[str, List[Dict[str, Any]]] = ""
format: UnitTestFormat = UnitTestFormat.Dict

If we do that, we run into the frustrating dataclass restriction that fields without defaults can't come after fields with defaults, so we would have to split them out into a special class and use a different order. I'd kind of rather not.
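The ordering restriction mentioned here can be reproduced with plain standard library dataclasses. The sketch below uses hypothetical class names and plain strings in place of UnitTestFormat. It first shows the failure when a base dataclass carries defaulted fields, then a base class that exposes properties instead, so each subclass can declare its own fields in a valid order.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Union


@dataclass
class BaseWithDefaults:
    rows: Union[str, List[Dict[str, Any]]] = ""
    format: str = "dict"


# A subclass that adds a required field after the inherited defaults
# fails at class-creation time with a TypeError.
try:
    @dataclass
    class InputWithRequiredField(BaseWithDefaults):
        input: str

    error_message = ""
except TypeError as exc:
    error_message = str(exc)


class BaseWithProperties:
    """Not a dataclass: contributes no fields, only fallback properties."""

    @property
    def format(self) -> str:
        return "dict"

    @property
    def rows(self) -> Union[str, List[Dict[str, Any]]]:
        return []


@dataclass
class InputWithProperties(BaseWithProperties):
    input: str  # the required field can come first
    rows: Union[str, List[Dict[str, Any]]] = ""
    format: str = "dict"


fixture = InputWithProperties(input="ref('my_model')")
```

In this sketch the subclass fields shadow the base properties, so `fixture.rows` and `fixture.format` are ordinary dataclass attributes, while the base class still documents the interface that shared methods can rely on.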


MichelleArk

some small suggestions, otherwise LGTM!

* Initial implementation of unit testing (from pr #2911)
  Co-authored-by: Michelle Ark <michelle.ark@dbtlabs.com>
* 8295 unit testing artifacts (#8477)
* unit test config: tags & meta (#8565)
* Add additional functional test for unit testing selection, artifacts, etc (#8639)
* Enable inline csv format in unit testing (#8743)
* Support unit testing incremental models (#8891)
* update unit test key: unit -> unit-tests (#8988)
* convert to use unit test name at top level key (#8966)
* csv file fixtures (#9044)
* Unit test support for `state:modified` and `--defer` (#9032)
  Co-authored-by: Michelle Ark <michelle.ark@dbtlabs.com>
* Allow use of sources as unit testing inputs (#9059)
* Use daff for diff formatting in unit testing (#8984)
* Fix #8652: Use seed file from disk for unit testing if rows not specified in YAML config (#9064)
  Co-authored-by: Michelle Ark <MichelleArk@users.noreply.github.com>
  Fix #8652: Use seed value if rows not specified
* Move unit testing to test and build commands (#9108)
* Enable unit testing in non-root packages (#9184)
* convert test to data_test (#9201)
* Make fixtures files full-fledged members of manifest and enable partial parsing (#9225)
* In build command run unit tests before models (#9273)

---------

Co-authored-by: Michelle Ark <michelle.ark@dbtlabs.com>
Co-authored-by: Michelle Ark <MichelleArk@users.noreply.github.com>
Co-authored-by: Emily Rockman <emily.rockman@dbtlabs.com>
Co-authored-by: Jeremy Cohen <jeremy@dbtlabs.com>
Co-authored-by: Kshitij Aranke <kshitij.aranke@dbtlabs.com>

Enable using inline csv format in unit testing

677031f

gshank requested review from a team as code owners September 28, 2023 20:30

gshank requested review from thisispvb, McKnight-42 and aranke and removed request for a team September 28, 2023 20:30

cla-bot bot added the cla:yes label Sep 28, 2023

Changie, return type

9959b05

gshank changed the base branch from main to unit_testing_feature_branch September 28, 2023 20:39

aranke reviewed Sep 29, 2023

core/dbt/contracts/graph/unparsed.py

core/dbt/parser/unit_tests.py

tests/functional/unit_testing/test_unit_testing.py

gshank requested a review from MichelleArk October 2, 2023 19:48

gshank added 2 commits October 2, 2023 15:59

Merge branch 'unit_testing_feature_branch' into 8626-ut_format_inline_csv

e87ea8d

fix a comment

9db2f72

MichelleArk reviewed Oct 2, 2023

core/dbt/parser/unit_tests.py

MichelleArk reviewed Oct 2, 2023

core/dbt/parser/unit_tests.py

MichelleArk reviewed Oct 2, 2023

gshank added 5 commits October 2, 2023 17:14

Use UnitTestFormat in parser/unit_test.py

f15277a

Move construction of rows to UnitTestFixture

9fe3683

Test for invalid format

c2e1968

Add test for invalid format

00b61c2

Fix unit test

6eb5d49

MichelleArk reviewed Oct 4, 2023

core/dbt/parser/unit_tests.py

Move format/rows validation to class method in UnitTestFixture

fb6f32f

gshank requested a review from MichelleArk October 5, 2023 00:27

MichelleArk reviewed Oct 5, 2023

core/dbt/contracts/graph/unparsed.py

MichelleArk reviewed Oct 5, 2023

core/dbt/contracts/graph/unparsed.py

MichelleArk approved these changes Oct 5, 2023

Make a validate_fixture a non-class method, tweak error message

ada13fb

gshank merged commit 3b6f9bd into unit_testing_feature_branch Oct 5, 2023
49 checks passed

gshank deleted the 8626-ut_format_inline_csv branch October 5, 2023 15:17

aranke mentioned this pull request Nov 13, 2023

Fix #8652: Use seed file from disk for unit testing if rows not specified in YAML config #9064

Merged

5 tasks
