Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add back fiat-crypto-legacy to the CI #12554

Merged
merged 2 commits into from
Jun 25, 2020
Merged

Conversation

JasonGross
Copy link
Member

This partially reverts commit 35a1cc4.
(I've not added back the nix things, since I'm not sure what purpose
they serve, and I've adjusted the targets slightly.)

The CI build should no longer take an enormously long time to start, and
fiat-crypto-legacy code is actively being used to track down memory
issues in #12487. Additionally, f-c-l revealed a genuine bug in #7825,
and so I'd like to keep f-c-l in the CI at least until #7825 is
finished.

I've been maintaining compatibility of f-c-l with master anyway on the
side, in part to continue some performance experiments with it, and
expect to continue to do so at least until the end of this calendar
year, and it'd be nice for me to get advance warning when a PR is going
to break it on master. (It seems reasonable to me to take it off the CI
again once I'm no longer maintaining it / collecting data from it, and /
or once #7825 is finished.) (I'm open to other opinions about it not being good to add it back to the CI.)

Kind: infrastructure.

@JasonGross JasonGross added the kind: infrastructure CI, build tools, development tools. label Jun 21, 2020
@JasonGross JasonGross requested a review from a team as a code owner June 21, 2020 01:38
Copy link
Member

@ejgallego ejgallego left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am quite supportive of this change; will mention it in the call tho just in case.

@ejgallego ejgallego self-assigned this Jun 21, 2020
@ejgallego ejgallego added this to the 8.13+beta1 milestone Jun 21, 2020
@SkySkimmer SkySkimmer added the needs: discussion Further discussion is needed. label Jun 22, 2020
@maximedenes
Copy link
Member

Maybe a bit of context would help preparing the discussion. The main reason why I personally wanted to remove fiat-crypto-legacy from the CI is that it was extremely hard / time consuming to investigate failures. Sure, occasionally it would catch a few bugs, but the cost / benefit ratio was not good for us.

So: if we put it back, what's the implicit contract? Who has to do the investigation in case of a failure on a PR?

@ejgallego
Copy link
Member

So: if we put it back, what's the implicit contract? Who has to do the investigation in case of a failure on a PR?

I understand that @JasonGross would take care of this, and we would keep it in the CI as long a @JasonGross can keep doing the maintenance job. It is not possible for us to maintain fcl effectively, I think we all agree.

I have a "soft" view in terms of the CI, in the sense that indeed:
a) we cannot maintain all devs ourselves
b) it is OK to remove a dev if it is lagging behind , it can be also very easily re-added following the normal process

In my view the principal mission of the CI is to allow for controlled change, so indeed having a diverse set of contributions here has the goal indeed to determine safety of a change.

Providing overlays is actually a service we have been doing to the community, but indeed, it is subtle. In some sense, we should not do the overlays, however that would carry a significant risk. I have done a few changes to later discovered that would not be possible to have the overlays work with them.

Indeed, in quite a few large mono-repos, the one who breaks is the one that fixes. But Coq is a bit special here I'm afraid. Additionally, our current overlay tooling is just bad.

@maximedenes
Copy link
Member

@JasonGross are there other people ready to help you with this work on systematically investigating f-c-l failures on all Coq PRs? It seems to me like a large amount of work, and I'd like to estimate the bus factor there.

Also for context: what has changed in f-c-l since we took the decision to remove it? You mention that it is faster to start its build (which indeed was one of the blocker). But has robustness also improved? Or do we (you) still risk having to inspect 5000+ lines of Ltac traces just to understand one change of behavior on the Coq side?

@JasonGross
Copy link
Member Author

are there other people ready to help you with this work on systematically investigating f-c-l failures on all Coq PRs?

No

It seems to me like a large amount of work, and I'd like to estimate the bus factor there.

The bus factor is 1. The amount of work is not so large, in my experience, maybe a couple of hours per PR that breaks f-c-l. If I'm too busy at the time, and f-c-l is the only blocker of a PR for, let's say, a week, I'm fine with moving it to allowed-fail, and then moving it back once I get around to fixing it. (Or just removing it, and I can add it back again once I fix it.)

I will note, btw, that creating overlays for most of f-c-l is no harder than creating overlays for fiat-crypto. It is only when the failing file lives in src/Specific/ that we hit the problem of massive amounts of hard-to-debug Ltac automation.

Also for context: what has changed in f-c-l since we took the decision to remove it?

Mainly just that I'm actually using it now for some performance measurements, and care for my own purposes about it continuing to work with the latest version of master.

You mention that it is faster to start its build (which indeed was one of the blocker). But has robustness also improved?

No

Or do we (you) still risk having to inspect 5000+ lines of Ltac traces just to understand one change of behavior on the Coq side?

It's more like 2M+ lines of Ltac traces, not 5k. git diff --no-index / diff -y --suppress-common-lines --speed-large-files is a life-saver here, since vimdiff takes about 13 hours on files this big (cf vim/vim#6269). But it's not so bad, as usually the breaking PR doesn't also introduce lots of other changes, and I generally still have enough of the project state in my head that seeing where the traces differ lets me extract a relatively small example without too much trouble, which then lets me figure out how to fix it. (I did this recently to track down a difference between 8.8 and master on some files I hadn't been building; I eventually tracked it down to the import-export fix in 8.11, which was causing a ring that previously worked without importing the ring structure to stop working. This led me to report #12553, btw.) The most painful part of the whole process is waiting for the hour or so that it takes to compile two different versions of Coq and then compile f-c-l with each of these Coq's.

@JasonGross
Copy link
Member Author

So: if we put it back, what's the implicit contract? Who has to do the investigation in case of a failure on a PR?

I understand that @JasonGross would take care of this, and we would keep it in the CI as long a @JasonGross can keep doing the maintenance job. It is not possible for us to maintain fcl effectively, I think we all agree.

Yes, that's my understanding as well

This partially reverts commit 35a1cc4.
(I've not added back the nix things, since I'm not sure what purpose
they serve, and I've adjusted the targets slightly.)

The CI build should no longer take an enormously long time to start, and
fiat-crypto-legacy code is actively being used to track down memory
issues in coq#12487.  Additionally, f-c-l revealed a genuine bug in coq#7825,
and so I'd like to keep f-c-l in the CI at least until coq#7825 is
finished.

I've been maintaining compatibility of f-c-l with master anyway on the
side, in part to continue some performance experiments with it, and
expect to continue to do so at least until the end of this calendar
year, and it'd be nice for me to get advance warning when a PR is going
to break it on master.  (It seems reasonable to me to take it off the CI
again once I'm no longer maintaining it / collecting data from it, and /
or once coq#7825 is finished.)
@Zimmi48
Copy link
Member

Zimmi48 commented Jun 24, 2020

See coq/bot#59.

@JasonGross
Copy link
Member Author

I think this should be ready for merge as soon as coq/bot#59 is merged

@JasonGross JasonGross added needs: merge of dependency This PR depends on another PR being merged first. and removed needs: discussion Further discussion is needed. needs: merge of dependency This PR depends on another PR being merged first. labels Jun 24, 2020
@JasonGross
Copy link
Member Author

Or perhaps it can even be merged now

@ejgallego
Copy link
Member

Or perhaps it can even be merged now

It seems ready to me @JasonGross , I'll merge tomorrow evening.

@ejgallego ejgallego merged commit d7d3c70 into coq:master Jun 25, 2020
@JasonGross JasonGross deleted the add-back-fcl branch June 25, 2020 18:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind: infrastructure CI, build tools, development tools.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants