Fix CI to correctly use OCaml version matrix #296

sim642 · 2021-07-21T12:13:16Z

Closes #184.

Changes

This implements a preliminary version of the CI setup I proposed:

First (fast) workflow for testing the locked state. This matrix has just 2 OSes; the OCaml version and Apron are fixed by the lock file.
Second (slow) workflow for testing the unlocked state. This matrix covers 2 OSes, multiple OCaml versions, and the presence or absence of the optional Apron dependency.
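
As a rough illustration of the second (unlocked) matrix, a workflow fragment could look like the sketch below. The job name, runner labels, version numbers, and the Apron install step are assumptions made for the example, not the contents of the actual workflow files in this PR. The locked workflow would keep only the `os` axis and install with `--locked`.

```yaml
# Hypothetical fragment of an unlocked workflow; all names and versions are illustrative.
jobs:
  regression:
    strategy:
      fail-fast: false          # let the other matrix combinations finish if one fails
      matrix:
        os: [ubuntu-latest, macos-latest]
        ocaml-compiler: ['4.12.0', '4.09.0']
        apron: [false, true]    # with and without the optional dependency
    runs-on: ${{ matrix.os }}
    steps:
      - uses: actions/checkout@v2
      - uses: ocaml/setup-ocaml@v2
        with:
          ocaml-compiler: ${{ matrix.ocaml-compiler }}
      # Unlocked install: opam is free to pick any versions matching the constraints.
      - run: opam install . --deps-only
      # Assumed way of making Apron conditional on the matrix axis.
      - if: ${{ matrix.apron }}
        run: opam install apron
```

With these example values the matrix expands to 2 × 2 × 2 = 8 jobs per run, which is why this workflow is the slow one.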

TODOs

Some questions are still open:

The fast workflow should run on every push, but when should the slow one run? Nightly on schedule? And when invoked manually?
Should the fast workflow run domain and marshalling tests or should they be left for the slow workflow?
Which OCaml versions should we test and thus support? Currently Goblint doesn't compile on 4.07 any more due to some newer Stdlib usage. 4.09 works, so we could make it the new minimum and have CI enforce that.
Review caching. setup-ocaml@v2 comes with some support built-in.

vogler · 2021-07-27T12:56:59Z

> The fast workflow should run on every push, but when should the slow one run? Nightly on schedule? And when invoked manually?

Yes, nightly, manually.

> Should the fast workflow run domain and marshalling tests or should they be left for the slow workflow?

How long do they take? Depends on how well they're covered by the reg. tests and their likelihood of failing. I assume domain might be worth it, marshalling maybe in the nightly?

> Which OCaml versions to test and thus support? Currently Goblint doesn't compile on 4.07 any more due to some newer Stdlib usage. 4.09 works, so we could make it the new minimum and have CI enforce that.

Yes, 4.09 min should be fine.

> Review caching. setup-ocaml@v2 comes with some support built-in.

You used it here. Are the times the same, or is it worse than what we had before?

vogler · 2021-07-27T19:01:19Z

By running on push and pull_request we have 2*n runs, where n is the number of runs from the matrix.

We could also just run on push and opened PRs:
https://github.community/t/how-to-trigger-an-action-on-push-or-pull-request-but-not-both/16662/8

However, pull_request has the advantage that it checks the merge every time. The question is if we need that or only when ready to merge / after review.
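
One common way to avoid the doubled runs is to restrict the push trigger to the main branch and let pull_request cover everything else. The fragment below is a sketch of that idea and may differ from what the linked thread recommends; only the branch name `master` comes from this repository.

```yaml
# Hypothetical trigger block for a workflow that should not run twice per PR commit.
on:
  push:
    branches:
      - master     # pushes to feature branches no longer start a run
  pull_request:    # every PR update is still tested as a merge with its base
```

The trade-off is that branches without an open PR get no CI at all until a PR is opened.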

sim642 · 2021-07-28T12:39:54Z

I'm surprised about the caching situation in #308, i.e. going back to v1 and doing it manually being so much faster. Because with regression tests, the situation seems to be the following:

Old (silently locked) workflow on Ubuntu and 4.12 takes ~7m 16s and has ~661MB cache: https://github.com/goblint/analyzer/runs/3180831067?check_suite_focus=true.
New explicitly locked workflow on Ubuntu and 4.12 takes ~5m 14s and has ~307MB cache in total: https://github.com/goblint/analyzer/runs/3123784005?check_suite_focus=true.

And from what I understand, the caching built into v2 doesn't actually cache the entire installed switch but just the local opam repository listing and downloads? How can the equivalent locked workflow be faster than the old one at all then?

Also, I guess that means we don't have to then explicitly put ${{ hashFiles('goblint.opam.locked') }} into the cache key like before, since it's not actually caching any of that? I was wondering about this when I meant to review the caching since that wouldn't work for the unlocked install anyway (newer dependencies might match the constraints, but wouldn't get installed if we have the entire switch cached). Not caching that is consistent with the fact that the setup-ocaml action is currently meant for unlocked installs: ocaml/setup-ocaml#166.

If caching the entire switch based on the lock file for the locked workflow brings additional speedup on top of what v2 currently offers, then I suppose our locked workflow could explicitly also do that, but I haven't looked into how well it would play with the v2 action. I don't like the idea of staying with the outdated v1, especially since somehow it seems to be slower despite caching the entire locked switch.

sim642 · 2021-07-28T12:58:37Z

> By running on push and pull_request we have 2*n runs, where n is the number of runs from the matrix.
>
> We could also just run on push and opened PRs:
> https://github.community/t/how-to-trigger-an-action-on-push-or-pull-request-but-not-both/16662/8
>
> However, pull_request has the advantage that it checks the merge every time. The question is if we need that or only when ready to merge / after review.

Currently I left it that way so I could see the unlocked workflow results each time, but I agree it's excessive. I think it'd be fine to limit the unlocked workflow to pull_request, nightly schedule on master and manual triggering.
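
A trigger block matching that proposal could look roughly like this. The cron time is an arbitrary assumption; scheduled runs always use the latest commit on the default branch, so no branch filter is needed for the nightly run on master.

```yaml
# Hypothetical trigger block for the unlocked workflow.
on:
  pull_request:          # test the merge result of every PR update
  workflow_dispatch:     # allow manual runs from the Actions tab
  schedule:
    - cron: '0 3 * * *'  # nightly; the time (03:00 UTC) is an assumed value
```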

vogler · 2021-07-28T15:47:50Z

> I'm surprised about the caching situation in #308, i.e. going back to v1 and doing it manually being so much faster. Because with regression tests, the situation seems to be the following: ...

There are several things:

Old calls bash -x scripts/travis-ci.sh after avsm/setup-ocaml@v1 which tries to install opam via apt ppa again whereas the new one just does opam install . --deps-only --locked and ./make.sh nat. But that's only 18s wasted.
4.12.0 takes 7m16s, but 4.07.1 only takes 4m6s. Both use the same cache, but only 4.07.1 seems to be cached, whereas 4.12.0 is compiled fresh. That's ~3m10s wasted.
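
The second point comes down to the cache key: if it does not contain the compiler version, all matrix jobs share one cache entry and only one compiler can be restored. A sketch of a per-version key for the v1-based workflow follows. The cached path and the exact key layout are assumptions, and it presumes that the setup action reuses an existing opam root instead of rebuilding it.

```yaml
# Hypothetical steps; the cached path and key layout are illustrative.
steps:
  - uses: actions/checkout@v2
  - uses: actions/cache@v2
    with:
      path: ~/.opam
      # One cache entry per OS, compiler version, and lock file content.
      key: ${{ runner.os }}-opam-${{ matrix.ocaml-version }}-${{ hashFiles('goblint.opam.locked') }}
  - uses: avsm/setup-ocaml@v1
    with:
      ocaml-version: ${{ matrix.ocaml-version }}
  - run: opam install . --deps-only --locked
```

Including the lock file hash means a dependency bump invalidates the cache instead of silently reusing an outdated switch.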

vogler · 2021-07-28T17:04:09Z

Now it's at 2m56s-3m22s: https://github.com/goblint/analyzer/runs/3184132409
Needed to add opam depext for v1 which was done manually in travis-ci.sh.

vogler · 2021-07-28T17:47:37Z

> If caching the entire switch based on the lock file for the locked workflow brings additional speedup on top of what v2 currently offers, then I suppose our locked workflow could explicitly also do that, but I haven't looked into how well it would play with the v2 action.

With an existing cache of the switch, it complains:
https://github.com/goblint/analyzer/runs/3184417966#step:4:231

 /opt/hostedtoolcache/opam/2.0.9/x86_64/opam switch create . --no-install --packages ocaml-base-compiler.4.12.0
  [ERROR] There already is an installed switch named /home/runner/work/analyzer/analyzer

vogler · 2021-07-29T12:59:40Z

> Now it's at 2m56s-3m22s: https://github.com/goblint/analyzer/runs/3184132409

With v2 and its cache only it now takes 6m59s for ubuntu (and 12m39s for macos):
https://github.com/goblint/analyzer/runs/3184910273

sim642 · 2021-08-11T15:42:40Z

These unlocked matrix builds really highlight how fragile our CI is. Regression test timeouts are not so infrequent on GitHub, even though I never see them locally. Restarting all the jobs might cause them to go through, which suggests it's some kind of resource sharing issue on GitHub. For example, is Ruby's timeout using wall time instead of CPU time, so that it fails if unfair scheduling doesn't give Goblint enough time to run?

EDIT: I guess we could also increase the regression test timeout to see if it helps.

sim642 added 9 commits July 21, 2021 12:49

Disable current tests workflow

9ab8bb7

Add workflow for testing locked OCaml

f34bb6b

Add octApron regression tests to locked workflow

7d77856

Override OPAMLOCKED for setup-ocaml in locked workflow

21064d0

Try disabling opam-pin in locked workflow instead of environment vari…

c6340b4

…able

Revert "Try disabling opam-pin in locked workflow instead of environm…

bcb9fe0

…ent variable" This reverts commit c6340b4.

Add setup-ocaml lock file issue link into comment

59802c0

Add workflow for testing unlocked OCaml

ac7fcaa

Make Apron conditional in unlocked workflow

b7329d7

sim642 added cleanup testing setup deps, CI, release labels Jul 21, 2021

Add custom name to unlocked workflow job

d6e6e67

vogler mentioned this pull request Jul 27, 2021

check indentation of changes on GitHub #308

Merged

vogler added a commit that referenced this pull request Jul 28, 2021

CI: separate cache per ocaml-version

72ee51d

We currently compile 4.12.0 every run since only 4.07.1 is cached. See #296 (comment).

vogler added a commit that referenced this pull request Jul 28, 2021

CI: fix: actually use ocaml-version for cache name not v2's ocaml-com…

5f7700e

…piler, #296

vogler added a commit that referenced this pull request Jul 28, 2021

CI: no longer use travis-ci.sh but just install deps with existing op…

92c3fdc

…am as in #296

vogler mentioned this pull request Jul 28, 2021

CI: macos opam setup fails #309

Closed

sim642 added 4 commits August 11, 2021 15:47

Merge branch 'master' into ocaml-ci

eb871c5

Add domaintest to locked & unlocked workflow

8ddf9ff

Remove failing OCaml 4.07 support

e78ac79

Add marshal regression testing to unlocked workflow

91a2e40

sim642 added 3 commits August 11, 2021 15:59

Allow manual trigger of unlocked workflow

73bf72f

Allow nightly trigger of unlocked workflow

f285bb8

Remove old tests workflow replaced by locked

e51b958

sim642 marked this pull request as ready for review August 11, 2021 14:13

sim642 added 2 commits August 11, 2021 17:20

Remove push trigger from unlocked workflow

d0b30ab

Update workflow badges in README

ec2963d

sim642 merged commit d512465 into master Aug 13, 2021

sim642 deleted the ocaml-ci branch August 13, 2021 08:14

vogler mentioned this pull request Oct 12, 2021

Repair CI for OS X goblint/cil#50

Closed
