extend timeout for CI tests #123

galargh · 2021-11-16T13:24:58Z

I believe the tests in #112 just about miss the 10m mark and that by extending the timeout to 20m we will be able to get more consistent runs.

I have tested it in CI on my forked repo: https://github.com/galargh/go-ds-crdt/actions/runs/1467239280 (the one failure on mac was cause by network issues).

By merging this PR we can restore the common, integrated CI and retire the custom action (note that the custom testing action was also prone to the timeout issue in some runs - it's just less noticeable because we're testing on fewer platforms there).

welcome · 2021-11-16T13:24:59Z

Thank you for submitting this PR!
A maintainer will be here shortly to review it.
We are super grateful, but we are also overloaded! Help us by making sure that:

The context for this PR is clear, with relevant discussion, decisions
and stakeholders linked/mentioned.
Your contribution itself is clear (code comments, self-review for the
rest) and in its best form. Follow the code contribution
guidelines
if they apply.

Getting other community members to do a review would be great help too on complex PRs (you can ask in the chats/forums). If you are unsure about something, just leave us a comment.
Next steps:

A maintainer will triage and assign priority to this PR, commenting on
any missing things and potentially assigning a reviewer for high
priority items.
The PR gets reviews, discussed and approvals as needed.
The PR is merged by maintainers when it has been approved and comments addressed.

We currently aim to provide initial feedback/triaging within two business days. Please keep an eye on any labelling actions, as these will indicate priorities and status of your contribution.
We are very grateful for your contribution!

galargh · 2021-11-16T14:59:20Z

@hsanjuan do you think that would be an acceptable approach to testing in this repo?

hsanjuan · 2021-11-16T18:24:26Z

It is acceptable but I would just modity the actual action file directly for the test that takes long and add the necessary flag there.

galargh · 2021-11-16T19:28:47Z

@marten-seemann please correct me if I'm wrong. The actual go call that times out is defined here:

go-ds-crdt/.github/workflows/go-test.yml

Line 42 in d4af8d4

run: go test -v ./...

. We cannot modify this file directly because the automatic PRs from web3dt-bot (like this one #112) would overwrite it in the future. AFAICT custom actions within .github/actions/go-test-setup is the most fitting place to put the timeout override, at least for now.

marten-seemann · 2021-11-16T19:34:06Z

@galargh Your understanding is correct.
Without having any insight into the code being tested here, is it really necessary to have a test that runs for more than 10 minutes? That seems like something we’d want to avoid anyway.

hsanjuan · 2021-11-17T08:47:26Z

I have removed the "automatic" CI stuff made by the bot because it only introduced tests that failed and took overly long. And honestly, this was a repo with its CI that was working well and I had no problem with it until things were automated for me. I am not sure that the 10 minute limit was the reason that things were failing though.

Regarding the 10 minute test and the last failure, the datastore test-suite is quite heavy on this datastore so yes, I think it is probably normal that it takes around that time, also the reason why it runs separately from the other tests and the reason also why it is disabled when running with race detector.

marten-seemann · 2021-11-17T09:07:02Z

I have removed the "automatic" CI stuff made by the bot because it only introduced tests that failed and took overly long.

Please don't modify / delete any files deployed by web3-bot. It will be re-added automatically the next time the deploy script runs. Really, I don't think "let's not test at all" is the right answer to a failing test case.

hsanjuan · 2021-11-17T09:16:23Z

It's not "let's not test". Things are tested. The bot is just causing me work and solving problems I don't have.

How can I opt out of web3-bot?

galargh · 2021-11-17T09:25:23Z

I am not sure that the 10 minute limit was the reason that things were failing though.

Looking at the past runs it does seem so (here for example https://github.com/ipfs/go-ds-crdt/runs/3905630735). You can also see a set of passing runs for the automatic CI workflow here: https://github.com/galargh/go-ds-crdt/actions/runs/1467239280 (this is with the extended timeout).

I'd also like to add that the custom CI workflow is also prone to timeout issue. See these runs for example:

just about passed in 578.364s https://github.com/ipfs/go-ds-crdt/runs/4067156848
failed after 600.018s without any failing test cases https://github.com/ipfs/go-ds-crdt/runs/4023201973

So even if the custom CI workflow was to stay in this repo I think extending the timeout for it or optimising the test might be a good idea.

I'm not too well versed in the PL ways yet but there are definitely some advantages I can see in the automatic CI approach:

more thorough testing (matrix of platforms + architectures)
adherence to PL policy (e.g support for current and previous Go versions)
automatic updates to the workflow code

It definitely comes at a cost of less customisation options.

I've had one more idea looking at the setup in this repo. Maybe we could setup tests here so that if -short flag is provided on the command line then TestDatastoreSuite is not executed. Then, we could add the -short flag to the automatic CI setup which would remove the need for extending timeout. Finally, we could remove the build section from https://github.com/ipfs/go-ds-crdt/blob/master/.github/workflows/go.yml#L6 and leave only the suite part there. That way we'll still have quick multi-platform, multi-arch CI + custom long running CI. What do you think? I could make the required changes if that's something you were interested in.

marten-seemann · 2021-11-17T09:51:29Z

How can I opt out of web3-bot?

In principle, by removing this repo from https://github.com/protocol/.github. However, I would heavily advise against this. We've decided to standardize our workflows because having a separate config in every single repo just didn't work, especially given the large number of repos we have. We had a long discussion on how Go repositories should be tested, and looking at https://github.com/ipfs/go-ds-crdt/blob/master/.github/workflows/go.yml, it seems like a lot of use cases we care about remain untested here:

it doesn't test with the current Go version (which is the version we use to cut releases with!),
it doesn't test on 32-bit systems,
it doesn't test on OSX and Windows
additionally, it's *vulnerable to supply-chain attacks (via staticcheck)

@galargh's suggestion to pass the timeout flag to our go test calls sounds pretty reasonable to me, if the test case can only test things meaningfully in that time frame. I think we can increase the timeout globally (@galargh, do you want to create a PR in .github to do so?), maybe to 30 minutes or so? I assume that repo maintainers still have an incentive to keep their tests fast, as nobody likes waiting on CI.

hsanjuan · 2021-11-17T10:05:51Z

Ok, I am not against web3-bot existing, I am against it being added here and causing test failures when the original testing works based on one-glove-fits-all assumptions.

All the "advantanges" are mostly irrelevant here, for this library concretely.

hsanjuan · 2021-11-17T10:14:57Z

I've merged because the patch should potentially solve the test failures, also with the web3-ci, whenever it gets re-added.

I have no problem having it around if it works. I have problem with failures that come from lack of options for customization but are not signalling any real issues.

galargh · 2021-11-17T10:46:47Z

I have problem with failures that ... are not signalling any real issues.

I think that's a very valid point. FYI, right now, I'm making my way through all of the unmerged web3d-bot PRs to figure out what the reasons are for the auto CI failures.

marten-seemann · 2021-11-17T11:03:37Z

when the original testing works based on one-glove-fits-all assumptions

Not sure which "one-glove-fits-all assumptions" you mean (as I've said before, I think extending the test timeout is a very reasonable thing to do). When it comes to running tests on all platforms that we care about, I see that pretty much as a requirement.

I have no problem having it around if it works. I have problem with failures that come from lack of options for customization but are not signalling any real issues.

Happy to discuss customizations further. If you're missing a customization option, can you please open an issue in https://github.com/protocol/.github?
The goal of Unified CI is not to make your life hard, the goal is that we have to worry about CI (and updating CI) less! And we've achieved this for more than 150 repos across our entire stack already.

* bump go.mod to Go 1.16 and run go fix * run go mod tidy * run gofmt -s * update .github/workflows/automerge.yml * update .github/workflows/go-test.yml * update .github/workflows/go-check.yml * extend timeout for CI tests (#123) Co-authored-by: web3-bot <web3-bot@users.noreply.github.com> Co-authored-by: Piotr Galar <piotr.galar@gmail.com>

hsanjuan · 2021-11-17T12:57:51Z

When it comes to running tests on all platforms that we care about, I see that pretty much as a requirement.

This is a small library and these tests are meant to test the logic implemented on this library, and this does not deal with OS, arch or anything close to that so for me testing on MacOS or Windows or 32bit HERE has very little value. That testing happens or should happen below (i.e. in badger) or above (i.e. in cluster) where those things maybe relevant.

For the same reason, it is going to pass all those tests, just taking more time. I already had to adapt things on my code because the assumption of the unified-CI that everything can run with the race detector enabled and still need to run the tests as I had them before so the datastore test suite runs correctly.

the goal is that we have to worry about CI (and updating CI) less!

Yes, so I am trying to explain that this is causing me worries, because as maintainer/babysitter of this library I had no trouble at all with what I had, whereas now I am spending time with Unified CI stuff, and that was the reason I removed it.

I do thank you and @galargh for solving the issues with the timeouts.

galargh force-pushed the go-test-setup branch 2 times, most recently from 244f553 to 79fa39c Compare November 16, 2021 14:14

extend timeout for CI tests

d4af8d4

galargh force-pushed the go-test-setup branch from 79fa39c to d4af8d4 Compare November 16, 2021 14:17

galargh marked this pull request as ready for review November 16, 2021 14:57

hsanjuan merged commit fbf3c3a into ipfs:web3-bot/sync Nov 17, 2021

galargh deleted the go-test-setup branch November 17, 2021 10:06

marten-seemann pushed a commit that referenced this pull request Nov 17, 2021

extend timeout for CI tests (#123)

b121953

galargh mentioned this pull request Nov 17, 2021

Increase go-test timeout to 30m protocol/.github#226

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

extend timeout for CI tests #123

extend timeout for CI tests #123

galargh commented Nov 16, 2021 •

edited

Loading

welcome bot commented Nov 16, 2021

galargh commented Nov 16, 2021

hsanjuan commented Nov 16, 2021

galargh commented Nov 16, 2021

marten-seemann commented Nov 16, 2021

hsanjuan commented Nov 17, 2021

marten-seemann commented Nov 17, 2021

hsanjuan commented Nov 17, 2021

galargh commented Nov 17, 2021

marten-seemann commented Nov 17, 2021

hsanjuan commented Nov 17, 2021

hsanjuan commented Nov 17, 2021

galargh commented Nov 17, 2021

marten-seemann commented Nov 17, 2021

hsanjuan commented Nov 17, 2021

extend timeout for CI tests #123

extend timeout for CI tests #123

Conversation

galargh commented Nov 16, 2021 • edited Loading

welcome bot commented Nov 16, 2021

galargh commented Nov 16, 2021

hsanjuan commented Nov 16, 2021

galargh commented Nov 16, 2021

marten-seemann commented Nov 16, 2021

hsanjuan commented Nov 17, 2021

marten-seemann commented Nov 17, 2021

hsanjuan commented Nov 17, 2021

galargh commented Nov 17, 2021

marten-seemann commented Nov 17, 2021

hsanjuan commented Nov 17, 2021

hsanjuan commented Nov 17, 2021

galargh commented Nov 17, 2021

marten-seemann commented Nov 17, 2021

hsanjuan commented Nov 17, 2021

galargh commented Nov 16, 2021 •

edited

Loading