
Use danger to report test results in PR comment #19436

Open · wants to merge 4 commits into base: trunk

Conversation

@crazytonyli (Contributor)

Use danger-junit to report test failures in GitHub PRs, which can save us a couple of clicks to view the test failures if there are any. Let me know if you think this is not a good idea.
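For reference, the Dangerfile change under discussion boils down to something like the following sketch, based on the `junit.parse`/`junit.report` calls quoted later in this review; the artefact glob is illustrative, not the exact path used in the PR:

```ruby
# Dangerfile (sketch, not the exact change in this PR): feed each downloaded
# JUnit artefact to danger-junit and let it report failures on the PR.
Dir.glob("artifacts/**/*.junit").each do |file|
  junit.parse file
  junit.report
end
```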

Regression Notes

  1. Potential unintended areas of impact
    None.

  2. What I did to test those areas of impact (or what existing automated tests I relied on)
    None.

  3. What automated tests I added (or what prevented me from doing so)
    None.

PR submission checklist:

  • I have completed the Regression Notes.
  • I have considered adding unit tests for my changes.
  • I have considered adding accessibility improvements for my changes.
  • I have considered if this change warrants user-facing release notes and have added them to RELEASE-NOTES.txt if necessary.

@crazytonyli crazytonyli added this to the 21.0 milestone Oct 12, 2022
@crazytonyli crazytonyli requested a review from a team October 12, 2022 09:25
@crazytonyli crazytonyli requested a review from a team as a code owner October 12, 2022 09:25
@crazytonyli crazytonyli self-assigned this Oct 12, 2022
@pachlava (Contributor)

👋 Just leaving a note that I was pinged as a reviewer because the PR touches a UI-tests-related file. I assume this is a temporary change to make the tests fail for easier testing of the PR. I won't mark the PR as reviewed by the mobile-ui-testing-squad, because I expect the tests change will be reverted before merge, and I think the remaining non-UI-tests-related changes should be reviewed by a Release Manager 🦉.

@AliSoftware (Contributor) left a comment

It might be worth validating with the rest of the team whether Danger is the right tool for this (especially after @jkmassel's recent post in paaHJt-3NR-p2), as opposed to having the CI step run a script to parse the .junit file (similar to what is already done using the annotate_test_failures action from bash-cache-buildkite-plugin), and extending said action with comment_on_pr to post the PR comment (in addition to the buildkite-agent annotate it already does).

My gut feeling is that we might prefer using our existing annotate_test_failures, and consider expanding and improving it to also optionally do a comment_on_pr in addition to the Buildkite annotation, rather than relying on Danger.

Especially given that annotate_test_failures is already called by this repo's pipeline from our various unit test .sh scripts, expanding that action to add comment_on_pr will likely be a trivial change. It also doesn't require all three test steps (unit-tests, ui-tests-iphone and ui-tests-ipad) to finish before annotating, nor re-downloading the artefacts after the fact, since each annotation is done at the end of its respective CI step 💪


To be fair, using danger-junit has its merits (including that the parsing and comment-rendering logic is already written, so we don't have to reinvent the wheel). But Jeremy's arguments on when to use Danger vs a custom CI step make me lean towards the latter for this situation (especially since we already have a bash-cache action for it mostly ready). Another odd thing with Danger is that it ends up doing multiple things at once as the Dangerfile evolves, as opposed to having dedicated, isolatable steps for each aspect we'd want to check (Rubocop, JUnit, …), which would allow more fine-grained step dependencies (e.g. Rubocop not having to wait for the unit test steps to finish before it can run). This can be seen in your changes, which had to rename rubocop-via-danger.sh to danger.sh because it now does more than one thing. 🤷

Dangerfile Outdated
Comment on lines 7 to 8
junit.parse "#{file}.junit"
junit.report

I wonder if there's a risk that each junit.report could overwrite the comment made by a previous report? E.g. if unit-tests have failures and thus make Danger generate a comment to report those, but ui-tests-iphone also has failures, will the Danger comment for those overwrite the previous comment made for unit-tests? Or will it append to it? Or will it create a separate comment each time?

@wpmobilebot (Contributor)

You can test the changes in Jetpack from this Pull Request by:
  • Clicking here or scanning the QR code below to access App Center
  • Then installing the build number pr19436-1254ff7 on your iPhone

If you need access to App Center, please ask a maintainer to add you.

@wpmobilebot (Contributor)

wpmobilebot commented Oct 12, 2022

You can test the changes in WordPress from this Pull Request by:
  • Clicking here or scanning the QR code below to access App Center
  • Then installing the build number pr19436-447e15b on your iPhone

If you need access to App Center, please ask a maintainer to add you.

@crazytonyli (Contributor, Author)

e.g. not having to wait for the Unit tests steps to finish to have Rubocop being able to run

Good point. I see this as a big downside of the current setup; it's always good to get faster feedback from PR checks. We'd probably need to use multiple Dangerfiles.

But I'm keen to hear @jkmassel's thoughts on this. I've read the P2 post about Danger, but I don't see any recommendation around this kind of reporting feature.

@@ -6,6 +6,7 @@ gem 'cocoapods', '~> 1.11'
gem 'commonmarker'
gem 'danger', '~> 8.6'
gem 'danger-rubocop', '~> 0.10'
gem 'danger-junit', '~> 1.0'

🚫 Gems should be sorted in alphabetical order within their section of the Gemfile. Gem danger-junit should appear before danger-rubocop.

@wpmobilebot (Contributor)

1 Error
🚫 Tests have failed, see below for more information.

Tests:

Classname | Name
WordPressTest.AbstractPostTest | testTitleForStatus
WordPressTest.AbstractPostTest | testTitleForStatus
WordPressTest.AbstractPostTest | testTitleForStatus

Generated by 🚫 Danger

@wpmobilebot (Contributor)

1 Error
🚫 Tests have failed, see below for more information.

Tests:

Classname | Name
WordPressUITests.LoginTests | testWPcomLoginLogout
WordPressUITests.LoginTests | testWPcomLoginLogout
WordPressUITests.LoginTests | testWPcomLoginLogout

Generated by 🚫 Danger

@wpmobilebot (Contributor)

1 Error
🚫 Tests have failed, see below for more information.

Tests:

Classname | Name
WordPressUITests.LoginTests | testWPcomLoginLogout
WordPressUITests.LoginTests | testWPcomLoginLogout
WordPressUITests.LoginTests | testWPcomLoginLogout

Generated by 🚫 Danger

@AliSoftware (Contributor)

But I'm keen to hear @jkmassel 's thoughts on this. I had read the P2 post about danger, I don't see there is any recommendation around this kind of reporting feature.

To me it's more like:

  • Operations like linting and unit testing fall into the category of "operations we want run on the whole codebase, not just on the files changed in the PR", so they should be done by a CI step, not by Danger (which should probably only be used when we need to operate on the PR metadata or the changed subset).
  • And then, by extension, the reporting of said operations makes more sense (to me, at least) in the same place where the operation itself is done. In other words, imho the reporting should happen at the end of the .sh script that also ran the unit tests.

Imho, this is also one of the reasons why things like "zipping test results and attaching them as artefacts" and "reporting test failures as Buildkite annotations" are already commands that are part of the run_unit_tests.sh scripts, as opposed to being a separate step… or living in a different tool like Danger. This keeps both the execution of the tests and the reporting of their results in one single place.

And reporting those test failures as a PR comment feels very similar, feature- and goal-wise, to reporting them as Buildkite annotations, so I'd suggest doing them in a similar manner: either a separate dedicated comment_with_test_failures action, or, more logically and far more simply, extending our already-existing annotate_test_failures command to also support publishing the annotation message as a PR comment.


In fact, the message we'd want in the PR comment can actually be the exact same message that annotate_test_failures already generates. So basically, adopting this solution would only consist of adding a line calling comment_on_pr right after the one calling buildkite-agent annotate, with the same body.

Alternatively, the annotate_test_failures could write the markdown (generated from it parsing the junit files) into a file, and then in the run_unit_tests.sh script you could then call a fastlane lane that would use the comment_on_pr fastlane action (which supports the concept of reuse identifiers) similar to how it's already done for Installable Build comments here.
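That alternative could look roughly like the lane below. This is only a sketch: the comment_on_pr parameter names (project, pr_number, body, reuse_identifier) are assumptions about the release-toolkit action's interface, and the lane and file names are made up for illustration.

```ruby
# Fastfile (sketch): a lane that run_unit_tests.sh could call once
# annotate_test_failures has written its markdown report to a file.
# Parameter names below are assumptions; check the release-toolkit
# action's documentation before relying on them.
lane :comment_test_failures do |options|
  comment_on_pr(
    project: 'wordpress-mobile/WordPress-iOS',
    pr_number: options[:pr_number],
    reuse_identifier: 'test-failures-report',
    body: File.read(options[:report_file])
  )
end
```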

Whatever the solution, generating the comment directly from run_unit_tests.sh (and the similar scripts for UI tests), instead of in a separate pipeline step (which would have to re-download the artefacts and depend on the previous steps), feels more readable and logical to me personally.


That said, that's only my own and personal opinion 😛 so let's see what @jkmassel thinks 🙂

@crazytonyli (Contributor, Author)

Imho, this is also one of the reasons why things like [...] are part of the run_unit_tests.sh scripts, as opposed to be a separate step… or in a different tool like Danger. This helps keeping both the execution of the tests and the reporting of its result in the same single place.

Yeah, I agree that execution and reporting should be in the same place, which would simplify the scripts and give us faster feedback during the CI run, and I've updated this PR to use this approach. But I would argue that it's not necessary to rule out using a third-party tool like Danger to do the reporting. That's what I'm most curious about: when should we implement something ourselves, and when should we use a third-party tool?

DIY vs third-party tools is probably too broad a question to answer (unless there are some principles we want to adhere to); I feel it's easier to answer when thinking about a concrete problem. Like this one here: if we'd like to report test failures in PRs, what do we want the result to look like? Can we achieve that ideal result using a tool like Danger? If not, maybe we should build it ourselves so that we have full control over the result. But on the other side, does the cost of building our own solution outweigh using a third-party solution and accepting a less-than-ideal result?

I don't think the result we see here in this PR (test failures reported in three separate comments) is ideal. I'd want one single comment containing all test failures, but at the same time the reporting should be done along with the test execution, not in a dedicated Danger step. If that's the result we want, we probably have no choice but to implement it ourselves.

or, more logically and way more simply, to extend our already-existing annotate_test_failures.

Agreed. This step already parses the test results, so we might as well add an option like --with-pr-comments to report the parsing results in the PR.
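To make the idea concrete, here is a self-contained Ruby sketch of the kind of work such an option would do: parse a JUnit XML report and render the failed tests as a small markdown table. This is not the actual annotate_test_failures implementation; the function names, sample data, and table layout are illustrative.

```ruby
require "rexml/document"

# Collect the test cases that contain a <failure> child, i.e. the failed ones.
def failed_tests(junit_xml)
  doc = REXML::Document.new(junit_xml)
  failures = []
  doc.elements.each("//testcase") do |testcase|
    next unless testcase.elements["failure"]
    failures << { classname: testcase.attributes["classname"],
                  name: testcase.attributes["name"] }
  end
  failures
end

# Render the failures as a markdown table, ready to be posted as a comment.
def failures_as_markdown(failures)
  return "All tests passed." if failures.empty?
  rows = failures.map { |f| "| #{f[:classname]} | #{f[:name]} |" }
  ["| Classname | Name |", "| --- | --- |", *rows].join("\n")
end

# Minimal sample report: one failing test, one passing test.
sample = <<~XML
  <testsuite>
    <testcase classname="WordPressTest.AbstractPostTest" name="testTitleForStatus">
      <failure message="assertion failed"/>
    </testcase>
    <testcase classname="WordPressTest.AbstractPostTest" name="testPermalink"/>
  </testsuite>
XML

puts failures_as_markdown(failed_tests(sample))
```

Whether the markdown goes to buildkite-agent annotate or to a PR comment, the parsing step stays the same; only the delivery changes.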

so basically, adopting this solution would only consist of adding a line calling comment_on_pr right after the one calling buildkite-agent annotate, with the same body.

There is a small issue here. What we want is a single comment reporting test failures, so the script should either add a new comment or update an existing one, since we don't want to keep adding comments for every new test run. Danger uses a danger_id to keep track of its comments; we'll probably need to do the same if we implement it ourselves.
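The danger_id idea can be sketched as a hidden marker embedded in the comment body: scan the existing PR comments for the marker, update the matching comment if one exists, otherwise create a new one. The marker string and data shapes below are illustrative, not taken from any existing tooling.

```ruby
# Hidden HTML comment that survives markdown rendering but is invisible to
# readers; it plays the role of Danger's danger_id.
MARKER = "<!-- test-failures-report -->"

# existing_comments: array of { id:, body: } hashes, as a GitHub API client
# might return them. Returns [:update, id] or [:create, nil].
def upsert_action(existing_comments)
  previous = existing_comments.find { |c| c[:body].include?(MARKER) }
  previous ? [:update, previous[:id]] : [:create, nil]
end

# Prepend the marker so the next run can find this comment again.
def render_comment(markdown_report)
  "#{MARKER}\n#{markdown_report}"
end

comments = [
  { id: 1, body: "LGTM!" },
  { id: 2, body: "#{MARKER}\n| Classname | Name |" },
]
p upsert_action(comments)  # comment 2 carries the marker, so update it
p upsert_action([])        # no previous report yet, so create one
```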

@AliSoftware (Contributor)

But I would argue that it's not necessary to rule out using a third-party tool like Danger to do the reporting.
DIY vs third-party tools

That's totally fair and a good point.

Agreed. This step already parses the test results, so we might as well add an option like --with-pr-comments to report the parsing results in the PR.

Yeah, I think that if our annotate_test_failures command didn't exist yet, then I might not have suggested the DIY, home-brewed solution, and would probably have been fine with using Danger 🤷. The fact that we already have a command that parses the JUnit file and generates the right markdown (but as an annotation) is what made me think "let's reuse it".

What we want is having one single comment to report test failures. The script either adds a new comment or update an existing one since we don't want to keep adding comments to report failures in new test runs. Danger uses a danger_id to keep track of the comments, we'll probably need to do the same if we implement it ourselves.

Good point. That's why, later in my comment, I mentioned using the comment_on_pr fastlane action we have in the release-toolkit (which does handle this reuse mechanism via a reuse ID) instead of the comment_on_pr bash-cache command:

Alternatively, the annotate_test_failures could write the markdown (generated from it parsing the junit files) into a file, and then in the run_unit_tests.sh script you could then call a fastlane lane that would use the comment_on_pr fastlane action (which supports the concept of reuse identifiers) similar to how it's already done for Installable Build comments here.

But I can see how that could then become more convoluted than using Danger directly, so… ⚖️

@crazytonyli (Contributor, Author)

That's why I mentioned using the comment_on_pr fastlane action we have in the release-toolkit

Oh I see, sorry, I missed that. I got it confused with the one in the bash-cache repo 🥲

@mokagio (Contributor)

mokagio commented Oct 17, 2022

@crazytonyli I'm going to bump this to the next release because we'll be code freezing 21.0 today and it seems like this needs a few more touches. Besides, it's a GitHub tooling PR, so it doesn't really matter which release it lands in 😄

@mokagio mokagio modified the milestones: 21.0, 21.1 Oct 17, 2022
@mokagio (Contributor)

mokagio commented Oct 17, 2022

Besides, it's a GitHub tooling PR, so it doesn't really matter which release it lands in 😄

I wonder whether we could benefit from a dedicated milestone for this kind of PR, or if it's simpler to stick to the standard process and do a milestone bump in edge-case scenarios like this one.

🤔 ... currently leaning towards "stick to the standard process", so we have one less question to answer when choosing the milestone, i.e. we won't have to ask "is this a PR that doesn't affect the user-facing application code?".

@AliSoftware (Contributor)

Besides, it's a GitHub tooling PR, so it doesn't really matter which release it lands in 😄

I wonder whether we could benefit from a dedicated milestone for this kind of PRs, or if it's simpler to stick to the standard process and to do a milestone bump in edge-case scenario like this one.

🤔 ... currently leaning towards "stick to the standard" process, so we have one less question to answer when choosing the milestone. I.e. we won't have to ask "is this a PR that doesn't affect the user-facing application code".

What I also like about having a milestone even for Tooling and Core PRs is that:

  1. We get regular reminders every sprint that "hey, this PR is still open and still needs to be reviewed and merged at some point", so we don't forget it and it doesn't risk going stale. For these PRs the milestone isn't as hard a requirement as for feature PRs: it isn't saying "this has to be in this version" but rather "it would be nice if this didn't wait too long to land".
  2. As you suggest, some Tooling or Core PRs might need to land before a given version ships (e.g. fixing a tooling issue we want fixed before the next code freeze or final build), while others don't ("fix this ASAP, but not blocking"), so it could still be useful to distinguish between a set milestone (a specific version) and a generic "Future" one. But if we were to do that, I think it wouldn't apply just to Tooling/Core PRs anyway (see past discussions in Slack, e.g. p1650387248793879/1650385638.490069-slack-C02KLTL3MKM), so that's a whole new discussion to have (likely in a dedicated P2) 😅

@crazytonyli crazytonyli removed this from the 21.1 milestone Oct 27, 2022
@crazytonyli crazytonyli added this to the Someday milestone Oct 27, 2022