Added validation checks for import_tasks_from #686

LawrenceCheng1570 · 2022-03-28T21:36:52Z

Issue #669
Changes:
When loading import_tasks_from: file.yaml

If yaml.safe_load returns None, replace it with an empty list
If yaml.safe_load returns something other than a list, raise an error saying that a list was expected
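
The two checks listed above could be sketched roughly as follows. This is only an illustration, not the code from the pull request: `normalize_imported` is a hypothetical name, the choice of `ValueError` is an assumption, and the function receives the object that `yaml.safe_load` would have returned instead of reading a file itself.

```python
def normalize_imported(imported):
    """Apply the two checks to the object loaded from the YAML file.

    `imported` stands in for the return value of yaml.safe_load;
    the function name and the ValueError type are assumptions.
    """
    # An empty YAML file loads as None: treat it as "no tasks"
    if imported is None:
        return []

    # Anything other than a list is rejected with an explicit message
    if not isinstance(imported, list):
        raise ValueError(
            'Expected list when loading YAML file from '
            f'import_tasks_from, but got {type(imported)}')

    return imported
```

Note that the review further down replaces the first branch with an error, because a downstream function does not work with empty lists.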

edublancas · 2022-03-28T22:15:33Z

thanks for your contribution! can you add some tests? please add them here

In the first case, we want to check that an empty file does not break initializing the DAGSpec object; in the second case, we want to test that an appropriate error is shown

let me know if you encounter issues or have questions!

edublancas · 2022-04-01T02:31:29Z

Hi @LawrenceCheng1570 , thanks for the contribution!

I reproduced the error. The problem is that a downstream function doesn't work with empty lists, so I think the original logic needs to change a bit

instead of this:

if imported is None:
    imported = []

it should be:

if not imported:
    path = str(self.data['meta']['import_tasks_from'])
    raise ValueError(f'expected import_tasks_from file ({path!r}) to return a list of tasks, got: {imported}')

Note that this conditional will take care of None, [], and empty dictionaries.

Please also update the test so it checks that the appropriate error message is shown: see here

the condition can be something like: assert "expected import_tasks_from" in str(excinfo.value)
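
To see why a single `if not imported` covers all three cases, here is a minimal standalone sketch. `check_imported` is a hypothetical helper written for illustration, and its `path` argument plays the role of `self.data['meta']['import_tasks_from']` in the snippet above; it is not the actual method in dagspec.py.

```python
def check_imported(imported, path):
    # Hypothetical helper: None, [] and {} are all falsy in Python,
    # so one condition rejects every "empty" result
    if not imported:
        raise ValueError(
            f'expected import_tasks_from file ({path!r}) to return '
            f'a list of tasks, got: {imported}')
    return imported


messages = []

for empty in (None, [], {}):
    try:
        check_imported(empty, 'some_tasks.yaml')
    except ValueError as e:
        # Same condition suggested for the test
        assert 'expected import_tasks_from' in str(e)
        messages.append(str(e))
```

A non-empty list passes through unchanged, so valid files are not affected by the check.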

LawrenceCheng1570 · 2022-04-01T08:15:05Z

Hi @edublancas, thanks for the suggestions! It fixed my error and I just pushed my changes. Please let me know if I need to make any other edits.

edublancas

almost there! just minor observations

edublancas · 2022-04-01T14:35:40Z

src/ploomber/spec/dagspec.py

+                        'Expected list when loading YAML file from '
+                        'import_tasks_from: file.yaml, '
+                        f'but got {type(imported)}')
+


looks good!

edublancas · 2022-04-01T14:37:15Z

tests/spec/test_dagspec.py

+        spec_d = yaml.safe_load(Path('pipeline.yaml').read_text())
+        spec_d['meta']['import_tasks_from'] = 'some_tasks.yaml'
+
+        spec = DAGSpec(spec_d)


As a best practice when writing tests that evaluate error messages, include only the statement that throws the error inside the pytest.raises block. In this case:

with pytest.raises(ValueError) as excinfo:
    DAGSpec(spec_d)

Please move everything before the spec = ... statement outside the with pytest.raises block, and you can remove the spec.to_dag().render()
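
The pattern described in this review comment might look like the sketch below. It is self-contained so it can run without Ploomber: `build_spec` is a hypothetical stand-in for `DAGSpec(spec_d)`, and the dictionary contents are made up for illustration.

```python
import pytest


def build_spec(spec_d):
    # Hypothetical stand-in for DAGSpec(spec_d): rejects an empty task list
    if not spec_d.get('tasks'):
        raise ValueError(
            'expected import_tasks_from file to return a list of tasks')
    return spec_d


def test_error_on_empty_tasks():
    # Setup stays outside the context manager, so an unrelated failure
    # here cannot be mistaken for the error under test
    spec_d = {'meta': {'import_tasks_from': 'some_tasks.yaml'}, 'tasks': []}

    # Only the statement expected to raise goes inside
    with pytest.raises(ValueError) as excinfo:
        build_spec(spec_d)

    # The assertion on the message runs after the block exits
    assert 'expected import_tasks_from' in str(excinfo.value)
```

Keeping the block this small means the test fails loudly if the setup itself raises a `ValueError` for some other reason.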

Thanks @edublancas! Just pushed my changes

edublancas · 2022-04-01T17:46:53Z

wooo great job! thanks a lot for contributing!

LawrenceCheng1570 · 2022-04-01T17:48:56Z

Thanks so much! Really appreciate all the help and support for my first time contributing :)

* Added validation checks for import_tasks_from
* Removed unnecessary blank line in DAGSpecPartial
* Added tests for empty yaml and non-list yaml file
* Changed logic for handling empty YAML files and updated empty and non-list import_tasks_from tests
* Removed unnecessary blank lines
* Updated import_tasks_from tests for empty and non-list YAML file

LawrenceCheng1570 added 2 commits March 28, 2022 17:19

Added validation checks for import_tasks_from

6ed115e

Removed unnecessary blank line in DAGSpecPartial

47d2276

Added tests for empty yaml and non-list yaml file

4c8fcb7

LawrenceCheng1570 added 2 commits April 1, 2022 04:05

Changed logic for handling empty YAML files and updated empty and non-list import_tasks_from tests

2a036dd

Removed unnecessary blank lines

0ee4cec

edublancas requested changes Apr 1, 2022

Updated import_tasks_from tests for empty and non-list YAML file

7ee7925

edublancas merged commit 8b0ea52 into ploomber:master Apr 1, 2022
