
Need a to_junit method to standardize the report format of TestSuiteResult #1685

Closed
luca-martial opened this issue Dec 18, 2023 · Discussed in #1681 · 8 comments
Labels: feature, good first issue

@luca-martial (Member)

Discussed in https://github.com/orgs/Giskard-AI/discussions/1681

To standardize the reporting from the test suite, we could add a to_junit method to our TestSuiteResult.

Originally posted by AdriMarteau December 15, 2023
I am looking at integrating Giskard into a CI process and was wondering whether it is compatible with pytest.
This would help with leveraging CI tools' reporting capabilities.

My idea was to use a command like:

pytest giskard-tests.py --junitxml=junit/test-results.xml

@Kranium2002 (Contributor)

Hi, I would like to work on this issue. Could you guide me a bit?

@luca-martial (Member, Author)

Hi @Kranium2002, sounds great, thanks for being open to contributing! The TestSuiteResult class needs a new method, named to_junit, that returns test suite results in JUnit XML format.

The JUnit XML format would look something like this for test suite results:

<testsuite>
    <testcase name="NameOfTheTest">
        <failure>Description of failure</failure>
    </testcase>
</testsuite>
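
For reference, here is a minimal sketch of what such a method could look like, using only the standard library and assuming (as in the script shared later in this thread) that results is a list of (name, result, params) tuples where result exposes passed and messages; the suite name is just a placeholder:

import xml.etree.ElementTree as ET

class TestSuiteResult:

    # Omitting existing methods

    def to_junit(self) -> str:
        # Root <testsuite> element
        suite_el = ET.Element("testsuite", name="giskard")
        for name, result, _params in self.results:
            case_el = ET.SubElement(suite_el, "testcase", name=name)
            # Only failed tests carry a <failure> child element
            if not result.passed:
                failure_el = ET.SubElement(case_el, "failure")
                failure_el.text = "\n".join(repr(m) for m in result.messages)
        return ET.tostring(suite_el, encoding="unicode")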

Let us know if you have any questions!

@Kranium2002 (Contributor)

I'll do this by this weekend. Could you maybe assign this issue to me?

@AdriMarteau (Contributor)

If I may, I wrote a quick script that can improve compatibility with pytest (let's call it test_ml.py):

from giskard import demo, Model, Dataset, testing, Suite
import pytest


model, df = demo.titanic()

wrapped_dataset = Dataset(
    df=df,
    target="Survived",
    cat_columns=["Pclass", "Sex", "SibSp", "Parch", "Embarked"],
)

suite = (
    Suite()
    .add_test(testing.test_f1(dataset=wrapped_dataset, threshold=.6))
    .add_test(testing.test_accuracy(dataset=wrapped_dataset))
)

my_first_model = Model(model=model, model_type="classification")
suite_results = suite.run(model=my_first_model)

@pytest.mark.parametrize("test_result", suite_results.results, ids=lambda t: t[0])
def test_giskard(test_result):
    name_, result_, data_ = test_result
    assert result_.passed, result_.messages

which can be called via:

python -W ignore -m pytest test_ml.py

and here is the output:

============================================= test session starts =============================================
platform win32 -- Python 3.11.5, pytest-7.4.3, pluggy-1.3.0
rootdir: C:\user\SE48561\GitHub\demo-ml-process
plugins: typeguard-4.1.5
collected 2 items

tests.py .F                                                                                              [100%] 

================================================== FAILURES =================================================== 
___________________________________________ test_giskard[Accuracy] ____________________________________________ 

test_result = ('Accuracy',
               Test failed
               Metric: 0.79

               , {'dataset': <gis...se.Dataset object at 0x000001E7DB224490>, 'model': <giskard.models.sklearn.SKLearnModel object at 0x000001E7D6043C50>})

    @pytest.mark.parametrize("test_result", suite_results.results, ids = lambda t: t[0])
    def test_giskard(test_result):
        name_, result_, data_ = test_result
>       assert result_.passed, result_.messages
E       AssertionError: []
E       assert False
E        +  where False = \n               Test failed\n               Metric: 0.79\n               \n.passed

tests.py:25: AssertionError
=========================================== short test summary info =========================================== 
FAILED tests.py::test_giskard[Accuracy] - AssertionError: []
========================================= 1 failed, 1 passed in 4.44s ========================================= 

This is not bad, but if we improve the TestResult class we can have richer asserts.
Also, having a list of tuples clutters the output a bit.

From this, running

python -W ignore -m pytest tests.py --junitxml=junit-example.xml

would create the report with the "base" library (see the attached file junit-example.txt).

@AdriMarteau (Contributor)

Looking at the code for the tests, it would be ideal to refactor the functions so that they implement an assert, so that they can easily be used with pytest.
Were there discussions about this in the past, and maybe reasons for not doing it, @luca-martial?

@kevinmessiaen (Member) commented Dec 22, 2023

I think that keeping tests returning a TestResult is more flexible than adding asserts inside the tests. For example, with asserts you can't also return a dataset containing all the failing cases.

In the meantime, it's fairly easy to add an assertion method to GiskardTest, as follows (named assert_ here, since assert is a reserved keyword in Python):

class GiskardTest(Artifact[TestFunctionMeta], ABC):

    # Omitting existing methods

    def assert_(self):
        result = self.execute()
        if isinstance(result, bool):
            assert result
        else:
            assert result.passed, "\n".join(repr(message) for message in result.messages)

Then you'll have this in your tests:

def test_model_accuracy(model, dataset):
    from giskard.testing import test_accuracy

    test_accuracy(model, dataset, threshold=0.75).assert_()
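
A hypothetical extension of the same idea at the suite level (a sketch only, assuming TestSuiteResult.results is the list of (name, result, params) tuples used earlier in this thread) would let a single call assert the whole suite:

class TestSuiteResult:

    # Omitting existing methods

    def assert_(self):
        # Collect the names of all failing tests so the AssertionError lists them
        failed = [name for name, result, _params in self.results if not result.passed]
        assert not failed, f"Failed tests: {', '.join(failed)}"

Calling suite.run(model=my_first_model).assert_() inside a single pytest test would then fail with the names of every failing Giskard test.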

@AdriMarteau (Contributor)

This is a great suggestion! Let me try this locally.

I think this should easily address design compatibility with pytest.
However, I still think the result would be nicer if we were able to see how the passed boolean is computed and assert on it in the test, to leverage pytest's reporting 😃
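
As an illustration, a hypothetical test asserting on the metric directly (assuming TestResult exposes the metric value that appears in the output above, and reusing the execute() method from the previous comment) would let pytest's assertion introspection report the actual numbers:

def test_model_accuracy(model, dataset):
    from giskard.testing import test_accuracy

    result = test_accuracy(model, dataset, threshold=0.75).execute()
    # pytest will show both operands of the comparison on failure
    assert result.metric >= 0.75, f"Accuracy {result.metric:.2f} is below the 0.75 threshold"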

@Kranium2002 (Contributor)

Started working on this in #1703
