
Conversation

pyth0n1c
Contributor

Respect manual_testing flag.
Previously, there was a bug where the validator was not always working as expected because:

  1. If tests were not defined, then the validator was not called. This has been fixed by setting always=True on the validator so that it runs even when the field is absent (see the sketch below).
  2. Because of the way tests are run and results are generated, a detection marked manual_test was showing as a failure. We now show a manual test as a "successful" test with a special message indicating that it was marked as manual_test in the test_results/summary.yml output. Please see the attached image as an example. This should make it easy to review results.

Note that test 'name' and 'message' fields have been dynamically generated at run time due to the presence of the manual_test field:
[image: example test_results/summary.yml output showing the dynamically generated test name and message fields]
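A minimal sketch of the kind of validator fix described in point 1 (pydantic v1 style, using always=True so the validator runs even when tests is undefined; the class, field, and message names here are illustrative assumptions, not the actual contentctl code):

from typing import Optional

from pydantic import BaseModel, validator


class DetectionSketch(BaseModel):
    # Hypothetical fields for illustration only
    name: str
    manual_test: Optional[str] = None
    tests: Optional[list[dict]] = None

    # always=True ensures the validator runs even when `tests` was never provided,
    # which was the gap described in point 1 above.
    @validator("tests", always=True)
    def validate_tests(cls, v, values):
        if values.get("manual_test") is not None:
            # Manual-test detections may omit tests; they are reported with a special message.
            return v
        if not v:
            raise ValueError("At least one test is required unless manual_test is set.")
        return v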

Contributor

@cmcginley-splunk left a comment

Have a few comments for you Eric, but looks good!

slightly unexpected check for MANUAL_TEST when actually executing test cases.
@pyth0n1c pyth0n1c added the Draft label Feb 17, 2024
@pyth0n1c
Contributor Author

Merged the latest contentctl updates into this branch. Now that a test SKIPPED state has been introduced, detections marked with MANUAL_TEST will be SKIPPED instead of simply logged as successful.

pyth0n1c and others added 5 commits March 20, 2024 09:27
functionality to take advantage of the new SKIP state.
accounting for skipped unit tests (due to the presence of the manual_test flag).
# SKIPPED tests still show a success in this field, but we want to count them differently
pass_increment = 1
for test in summary.get("tests"):
    if test.get("test_type") == "unit" and test.get("status") == "skip":
Contributor

This doesn't need to be blocking, but I'd like to make detection state resolution (e.g. whether it is considered a skip/fail/error/pass) a property of the detection itself.

E.g. define a property function status which iterates over the tests:

  • Pass: If at least one test passes and the rest pass/skip
  • Skip: If all tests skip
  • Fail: If at least one test fails and the rest pass/skip/fail
  • Error: If at least one test errors (regardless of other test states)

Contributor Author

💯
I think we should make this a pydantic computed_property. On top of that, I would add an additional option to this set of values (which would be the default) along the lines of NOT_YET_TESTED.
This way, when we are rendering the current state of all tests, e.g. in a user interface, we can easily differentiate/sort the different test states.
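A minimal sketch of what that computed status property could look like (pydantic v2 computed_field; the class, field, and enum names are hypothetical, not existing contentctl code):

from enum import Enum

from pydantic import BaseModel, computed_field


class TestResultStatus(str, Enum):
    # Hypothetical states mirroring the resolution rules discussed above
    NOT_YET_TESTED = "not_yet_tested"
    PASS = "pass"
    SKIP = "skip"
    FAIL = "fail"
    ERROR = "error"


class DetectionSketch(BaseModel):
    name: str
    test_statuses: list[TestResultStatus] = []

    @computed_field
    @property
    def status(self) -> TestResultStatus:
        statuses = self.test_statuses
        if not statuses:
            # Default before any tests have run
            return TestResultStatus.NOT_YET_TESTED
        if any(s is TestResultStatus.ERROR for s in statuses):
            return TestResultStatus.ERROR
        if any(s is TestResultStatus.FAIL for s in statuses):
            return TestResultStatus.FAIL
        if all(s is TestResultStatus.SKIP for s in statuses):
            return TestResultStatus.SKIP
        # At least one pass and the rest pass/skip
        return TestResultStatus.PASS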


# Sort s.t. all failures appear first (then by name)
tested_detections.sort(key=lambda x: (x["success"], x["name"]))
# Second sort condition is a hack to get detections with skipped unit tests to appear above pass tests
Contributor

Can I ask why you want skip above pass?

Contributor Author

The results file can be, and often is, REALLY large. I think it makes sense for errors/failures to be at the top, as these are the ones that deserve the most attention (and what people are likely to be looking for when they open this file).

I personally think "pass" is the least interesting since it means the detections are working as expected.
I think SKIP falls somewhere in between a failure and a pass. For example, someone might want to manually evaluate these OR note how many tests have been skipped.

Definitely a personal opinion.
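One way to express that ordering explicitly is with a status rank rather than overloading the success boolean; a small sketch (the status strings and dict keys here are assumptions for illustration, not the actual summary schema):

# Lower rank sorts first: errors/failures, then skips, then passes; ties broken by name.
STATUS_RANK = {"error": 0, "fail": 1, "skip": 2, "pass": 3}

tested_detections = [
    {"name": "Detection B", "status": "pass"},
    {"name": "Detection A", "status": "skip"},
    {"name": "Detection C", "status": "fail"},
]

tested_detections.sort(key=lambda d: (STATUS_RANK.get(d["status"], 0), d["name"]))
# Resulting order: Detection C (fail), Detection A (skip), Detection B (pass)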

 )
 success_rate = Utils.getPercent(
-    total_pass, total_detections, 1
+    total_pass, total_detections - total_skipped, 1
Contributor

This is a non-blocking comment and a reiteration of a conversation in slack; just including it for good book-keeping

I think future work could break out another metric, tested_detections, which is the value you are calculating inline here. It would provide clarity to someone reading the YAML as to how the success rate is derived. Likewise, total_detections should become a count of all detections within the scope of this test run (e.g. all, selected, changes) and, if all, should include experimental and deprecated in its count, with separate counts for total_production, total_experimental, etc.

I think this is just good data to have, but it also serves a request from Javier around having a metric for test coverage of all detections delivered in ESCU (which would include experimental and deprecated).

Contributor Author

Yes, I think this type of additional breakdown is a good idea.
I also think it should be abstracted from this particular view into a Model so that it can be used in different UI components (a rough sketch follows the list below), such as:

  1. The file, like we use it today
  2. Command line output
  3. Other User Interfaces we may support in the future.
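A rough sketch of what such a summary model might look like (the field names are illustrative assumptions, not the actual contentctl schema):

from pydantic import BaseModel, computed_field


class TestSummaryMetrics(BaseModel):
    # Hypothetical counts gathered from a test run
    total_detections: int = 0
    total_production: int = 0
    total_experimental: int = 0
    total_deprecated: int = 0
    total_pass: int = 0
    total_skipped: int = 0

    @computed_field
    @property
    def tested_detections(self) -> int:
        # Detections whose tests actually ran (skips excluded)
        return self.total_detections - self.total_skipped

    @computed_field
    @property
    def success_rate(self) -> float:
        if self.tested_detections == 0:
            return 0.0
        return round(100 * self.total_pass / self.tested_detections, 1)

The same model could then back the YAML summary file, command line output, or any future UI.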



# A list of groups of tests, relying on the same data
test_groups: Union[list[TestGroup], None] = None
Contributor

I see you migrated this code from Detection.

Can you explain the intention behind having separate classes here (Detection and the abstract)? I'm fine with this code living in either class, but I chose Detection because I thought the spirit of the abstract class was to be a more faithful representation of the model as it exists in YAML, and since the test_groups attribute is an inferred field, I thought maybe it made sense for it to live elsewhere.

So tl;dr: I'm totally fine with this and no change is required; I'm just curious, and wondering whether, if we don't have a need for two separate classes, we could perhaps collapse them into one in the future.

Contributor Author

Great question! Many contentctl users want to change content somehow (and detections are the content they are most often interested in changing). These changes could be, but are not limited to:

  1. Adding new fields (and possibly building validators for those fields)
  2. Setting more or less strict validations on fields
  3. Setting values for fields already defined in that object (but which may be missing from their YMLs)
  4. Disabling some of the validations we do.

However, if they change Detection_Abstract and WE change Detection_Abstract, it has a high probability of creating painful merge conflict issues. As such, the plan was to retain control of Detection_Abstract and ship all our out-of-the-box functionality in that file. Then if customers want their own functionality, they can add it to Detection.py.
This way, pulling the latest contentctl updates should be painless and without merge conflicts. We are hoping to do this with all types, but have started with Detection since it is the one that receives user changes most often.
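A simplified sketch of that split (the module layout and fields are illustrative, not the exact contentctl code):

# detection_abstract.py -- shipped and maintained by contentctl; holds all
# out-of-the-box fields and validators, and is not meant to be edited by users.
from pydantic import BaseModel


class Detection_Abstract(BaseModel):
    name: str
    search: str
    # ... remaining built-in fields and validators ...


# detection.py -- thin subclass reserved for user customizations, so that
# pulling the latest contentctl updates does not create merge conflicts.
class Detection(Detection_Abstract):
    # Example of a user-added field (hypothetical):
    # internal_ticket_id: str | None = None
    pass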

Contributor

@cmcginley-splunk left a comment

Left some comments regarding philosophy and future work, but LGTM :)

@pyth0n1c pyth0n1c removed the Draft label Apr 5, 2024
@pyth0n1c pyth0n1c merged commit a6ab65e into main Apr 5, 2024