
Need a to_junit method to standardize the report format of TestSuiteResult #1685

Closed
luca-martial opened this issue Dec 18, 2023 · Discussed in #1681 · 8 comments
Labels: feature, good first issue

@luca-martial (Member)

Discussed in https://github.com/orgs/Giskard-AI/discussions/1681

To standardize the reporting from the test suite, we could add a to_junit method to our TestSuiteResult.

Originally posted by AdriMarteau December 15, 2023
I am looking at integrating Giskard into a CI process and was wondering whether it is compatible with pytest.
This would help with leveraging CI tools' reporting capabilities.

My idea was to use a command like:

pytest giskard-tests.py --junitxml=junit/test-results.xml

@Kranium2002 (Contributor)

Hi, I would like to work on this issue. Could you guide me a bit?

@luca-martial (Member, Author)

Hi @Kranium2002, sounds great, thanks for being open to contributing! The TestSuiteResult class needs a new method, named to_junit, that returns test suite results in JUnit XML format.

The JUnit XML format would look something like this for test suite results:

<testsuite>
    <testcase name="NameOfTheTest">
        <failure>Description of failure</failure>
    </testcase>
</testsuite>
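
For reference, here is a minimal sketch of what such a method could look like, using only the standard library and assuming (as in the script shared later in this thread) that results is a list of (name, result, params) tuples where result exposes passed and messages; the suite name is just a placeholder:

import xml.etree.ElementTree as ET

class TestSuiteResult:

    # Omitting existing methods

    def to_junit(self) -> str:
        # Root <testsuite> element
        suite_el = ET.Element("testsuite", name="giskard")
        for name, result, _params in self.results:
            case_el = ET.SubElement(suite_el, "testcase", name=name)
            # Only failed tests carry a <failure> child element
            if not result.passed:
                failure_el = ET.SubElement(case_el, "failure")
                failure_el.text = "\n".join(repr(m) for m in result.messages)
        return ET.tostring(suite_el, encoding="unicode")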

Let us know if you have any questions!

@Kranium2002 (Contributor)

I'll do this by this weekend. Could you maybe assign this issue to me?

@AdriMarteau (Contributor)

If I may, I wrote a quick script that can improve compatibility with pytest (let's call it test_ml.py):

from giskard import demo, Model, Dataset, testing, Suite
import pytest


model, df = demo.titanic()

wrapped_dataset = Dataset(
    df=df,
    target="Survived",
    cat_columns=["Pclass", "Sex", "SibSp", "Parch", "Embarked"],
)

suite = (
    Suite()
    .add_test(testing.test_f1(dataset=wrapped_dataset, threshold=.6))
    .add_test(testing.test_accuracy(dataset=wrapped_dataset))
)

my_first_model = Model(model=model, model_type="classification")
suite_results = suite.run(model=my_first_model)

@pytest.mark.parametrize("test_result", suite_results.results, ids=lambda t: t[0])
def test_giskard(test_result):
    name_, result_, data_ = test_result
    assert result_.passed, result_.messages

which can be called via:

python -W ignore -m pytest test_ml.py

and here is the output:

============================================= test session starts =============================================
platform win32 -- Python 3.11.5, pytest-7.4.3, pluggy-1.3.0
rootdir: C:\user\SE48561\GitHub\demo-ml-process
plugins: typeguard-4.1.5
collected 2 items

tests.py .F                                                                                              [100%] 

================================================== FAILURES =================================================== 
___________________________________________ test_giskard[Accuracy] ____________________________________________ 

test_result = ('Accuracy',
               Test failed
               Metric: 0.79

               , {'dataset': <gis...se.Dataset object at 0x000001E7DB224490>, 'model': <giskard.models.sklearn.SKLearnModel object at 0x000001E7D6043C50>})

    @pytest.mark.parametrize("test_result", suite_results.results, ids = lambda t: t[0])
    def test_giskard(test_result):
        name_, result_, data_ = test_result
>       assert result_.passed, result_.messages
E       AssertionError: []
E       assert False
E        +  where False = \n               Test failed\n               Metric: 0.79\n               \n.passed

tests.py:25: AssertionError
=========================================== short test summary info =========================================== 
FAILED tests.py::test_giskard[Accuracy] - AssertionError: []
========================================= 1 failed, 1 passed in 4.44s ========================================= 

This is not bad, but if we improve the TestResult class we can have richer asserts.
Also, having a list of tuples clutters the output a bit.

From this, running

python -W ignore -m pytest tests.py --junitxml=junit-example.xml

would create the report with the "base" library (see the attached file junit-example.txt).

@AdriMarteau (Contributor)

Looking at the code for the tests, it would be ideal to refactor the functions so that they implement an assert, so that they can easily be used with pytest.
Were there discussions about this in the past, and maybe reasons for not doing it, @luca-martial?

@kevinmessiaen (Member) commented Dec 22, 2023

I think that keeping tests returning a TestResult is more flexible than adding asserts inside the tests. For example, with asserts you can't also return a dataset containing all the failing cases.

In the meantime, it's fairly easy to add an assertion method to GiskardTest, as follows (named assert_ here, since assert is a reserved keyword in Python):

class GiskardTest(Artifact[TestFunctionMeta], ABC):

    # Omitting existing methods

    def assert_(self):
        result = self.execute()
        if isinstance(result, bool):
            assert result
        else:
            assert result.passed, "\n".join(repr(message) for message in result.messages)

Then you'll have this in your tests:

def test_model_accuracy(model, dataset):
    from giskard.testing import test_accuracy

    test_accuracy(model, dataset, threshold=0.75).assert_()
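
A hypothetical extension of the same idea at the suite level (a sketch only, assuming TestSuiteResult.results is the list of (name, result, params) tuples used earlier in this thread) would let a single call assert the whole suite:

class TestSuiteResult:

    # Omitting existing methods

    def assert_(self):
        # Collect the names of all failing tests so the AssertionError lists them
        failed = [name for name, result, _params in self.results if not result.passed]
        assert not failed, f"Failed tests: {', '.join(failed)}"

Calling suite.run(model=my_first_model).assert_() inside a single pytest test would then fail with the names of every failing Giskard test.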

@AdriMarteau (Contributor)

This is a great suggestion! Let me try this locally.

I think this should easily address design compatibility with pytest.
However, I still think the result would be nicer if we were able to see how the passed boolean is computed and assert on it in the test, to leverage pytest's reporting 😃
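
As an illustration, a hypothetical test asserting on the metric directly (assuming TestResult exposes the metric value that appears in the output above, and reusing the execute() method from the previous comment) would let pytest's assertion introspection report the actual numbers:

def test_model_accuracy(model, dataset):
    from giskard.testing import test_accuracy

    result = test_accuracy(model, dataset, threshold=0.75).execute()
    # pytest will show both operands of the comparison on failure
    assert result.metric >= 0.75, f"Accuracy {result.metric:.2f} is below the 0.75 threshold"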

@Kranium2002 (Contributor)

Started working on this in #1703
