Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Infer more knowledge about CI failures #58

Closed
wilzbach opened this issue Feb 22, 2017 · 8 comments
Closed

Infer more knowledge about CI failures #58

wilzbach opened this issue Feb 22, 2017 · 8 comments

Comments

@wilzbach
Copy link
Member

From @WebDrake

I would suggest that some CI are more important than others -- so e.g. style CI should not block moving to 'needs review', but test-suite failure should probably see the tag set to 'needs work' (maybe with a friendly message about how to address the test-suite failure the first time it happens).

@wilzbach
Copy link
Member Author

My reply:

AFAIK this is not exposed via the GH API and I can think of these three options: [to gain this knowledge]

@WebDrake
Copy link

@wilzbach Ah, OK. I may have been misinterpreting L33-50 of your ci.d file, where I was making the assumption that you were able to query the individual CI providers for status info.

If that's difficult (for now) then I agree that a minimum threshold is a good compromise. In all likelihood this will in practise amount to higher-priority and more stable tests passing (it seems to be stuff like the style and code-coverage tests that fail more frequently and sometimes arbitrarily).

@wilzbach
Copy link
Member Author

I may have been misinterpreting L33-50 of your ci.d file, where I was making the assumption that you were able to query the individual CI providers for status info.

No you haven't. We do query the APIs of the CI providers for the PR, but we already get the status info as part of the hook payload, see e.g. this file for an example:

https://github.com/dlang-bots/dlang-bot/blob/master/data/hooks/github/dlang_dmd_status_6324.json

Moreover we do query the status endpoint as well to get an overview of the state of all CI providers, see e.g. this file for an example:

https://github.com/wilzbach/dlang-bot/blob/07f9393c071d45d5bceec44bedf6280a18b5721d/data/payloads/github_repos_dlang_dmd_status_eb53933e2d0989f6da3edf8581bb3de4acac9f0e

(it seems to be stuff like the style and code-coverage tests that fail more frequently and sometimes arbitrarily).

Yeah this is still due to https://github.com/codecov/support/issues/360 (don't forget to vote for it)

If that's difficult (for now) then I agree that a minimum threshold is a good compromise. In all likelihood this will in practise amount to higher-priority and more stable tests passing

AFAIK @MartinNowak's plan are

  • superseed CircleCi with the new Dlang CI system, so that the number of CI goes down
  • set all CIs except CodeCov to "enforced" (and maybe even CodeCov)

That's also why I didn't want to put any specific assumption about the CI system into the bot as of now.

@MartinNowak
Copy link
Member

All CIs should be created equal, attention is the most precious resource we have.
Any "bad" CIs not worth being noticed or not presenting understandable errors, are rather a burden, as they lower the credibility of "good" CIs.
Requiring people to learn a complex (and undocumented) rating system based on 5 ✓/✗s sounds horrible. It would be great if GH had a 🚥 status including unstable builds, but they don't so we have to live with binary.
IMO there should be a single green or red dot on every PR. CIs that are too flaky, still experimental, or only informative should always be green.
We should also stop abusing required as a tool to let PRs with red "bad" CIs pass through. That's a conflicting message and thus confusing.

@MartinNowak
Copy link
Member

MartinNowak commented Jan 15, 2018

(and maybe even CodeCov)

Even that, coverage is important, we can configure sufficient tolerances. BTW, we agreed to only keep codecov/patch as that has the best SNR.

@MartinNowak
Copy link
Member

MartinNowak commented Jan 15, 2018

I would suggest that some CI are more important than others -- so e.g. style CI should not block moving to 'needs review', but test-suite failure should probably see the tag set to 'needs work'

As for the workflow ideas in the OP, such a distinction sounds reasonable.
While all CI failures should mean failure and not mergeable, they have different implications on the review workflow.
We should abstract this a bit, in case we later move linting and unittests to the same infrastructure, though even then we should be able to extract the information from the status message.

@MartinNowak MartinNowak added 3-mi and removed 1-do labels Jan 15, 2018
@wilzbach
Copy link
Member Author

Even that, coverage is important, we can configure sufficient tolerances.

We did overwrite CodeCov to be always green.

BTW, we agreed to only keep codecov/patch as that has the best SNR.

dlang/dmd#7711

While all CI failures should mean failure and not mergeable

Absolutely agreed and for the main D repos all CIs are now enforced to pass. Well except for gigantically slow Travis OSX builds for DMD. It has worked well for the last seven days.

@CyberShadow
Copy link
Member

All CIs should be created equal, attention is the most precious resource we have.

This sounds right to me today as well, so I think we can close this issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants