Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Spurious documentation failures #7829

Closed
Zimmi48 opened this issue Jun 15, 2018 · 14 comments · Fixed by #7856
Closed

Spurious documentation failures #7829

Zimmi48 opened this issue Jun 15, 2018 · 14 comments · Fixed by #7856
Labels
kind: ci-failure Information about unexpected and random CI failures. kind: infrastructure CI, build tools, development tools.
Milestone

Comments

@Zimmi48
Copy link
Member

Zimmi48 commented Jun 15, 2018

We observe an alarming rate of failures of the documentation build on GitLab these days:

In all these cases, the error is the same:

Exception occurred:
  File "/usr/lib/python3/dist-packages/pexpect/expect.py", line 70, in timeout
    raise TIMEOUT(msg)
pexpect.exceptions.TIMEOUT: Timeout exceeded.
...
Error while sending the following sentence to coqtop: From Coq Require Import ssreflect ssrfun ssrbool List.

What is especially weird is that I'm under the impression that these failures started appearing after the merge of #7284 but what is weird is that this PR didn't touch the faulty line. It only added https://github.com/coq/coq/pull/7284/files#diff-d80ca9da497679f1161bfbf7e7ab7ce1R509 in some other place.
So I'm tempted by two solutions. Either separate these imports in several lines to make it less likely to reach timeout (which is already at 10s), or revert the change I just linked to, to see if it changes anything (in which case, it looks like a bug).

@Zimmi48 Zimmi48 added the kind: infrastructure CI, build tools, development tools. label Jun 15, 2018
Zimmi48 added a commit to Zimmi48/coq that referenced this issue Jun 18, 2018
We split a Require Import in two to avoid reaching the timeout.
@MSoegtropIMC MSoegtropIMC added the kind: ci-failure Information about unexpected and random CI failures. label Aug 27, 2018
@azarens-git
Copy link

azarens-git commented Feb 10, 2021

I could confirm that this issue occurs with coq-8.13.0 when it is built on NetBSD 9.1_STABLE. Furthermore, this issue is not spurious, and occurs consistently.

@Zimmi48
Copy link
Member Author

Zimmi48 commented Feb 10, 2021

@azarens-git are you building on a local machine or on a VM? Is this always this particular line which is timing out?

@Zimmi48
Copy link
Member Author

Zimmi48 commented Feb 10, 2021

Can you in fact share more details? (The line in question was already split to fix the problem in our CI. Therefore, it is likely that what you were hitting is actually a separate issue.)

@azarens-git
Copy link

azarens-git commented Feb 10, 2021

The relevant build log is attached. If desired, I could also supply the build configuration, but there's nothing extraordinary there besides a regular invocation of dune and the replacements python -> python3.9 and sphinx-build -> sphinx-build-3.9

The build was started in pkgsrc-current, by means of cd /usr/pkgsrc/lang/coq && make with version of coq pulled to 8.13.0. The failure occurs consistently; there has been no single build when it did not occur.

The source tree is from coq's github repository, with tag V8.13.0.

===> Creating toolchain wrappers for coq-8.13.0
/bin/mkdir -p /usr/obj/pkgsrc/lang/coq/work/.buildlink/lib/pkgconfig
cd /usr/obj/pkgsrc/lang/coq/work/.buildlink/lib/pkgconfig && /bin/ln -sf enchant-2.pc enchant.pc
===> Configuring for coq-8.13.0
=> Substituting "py-sphinx-build" in configure.ml  doc/dune
=> Substituting "python" in configure.ml
=> Replacing python interpreter in tools/TimeFileMaker.py tools/make-both-single-timing-files.py tools/make-both-time-files.py tools/make-one-time-file.py dev/tools/github-check-prs.py dev/tools/update-compat.py doc/sphinx/conf.py doc/tools/coqrst/regen_readme.py.
INFO: [replace-interpreter] Nothing changed in tools/TimeFileMaker.py.
=> Checking for portability problems in extracted files
===> Building for coq-8.13.0
bash doc/stdlib/index-list.html
Building file index-list.prehtml... Done
coqdep user-contrib/Ltac2/Env.v.d
*** Warning: in file Env.v,
required library Init matches several files in path
(found Init.v in ../../theories/Classes and .; used the latter)
bash doc/stdlib/_index.html,doc/stdlib/html
Line 28, character 157, warning: unterminated "]"
Line 26, character 1, warning: missing newline after ">>" block
env doc/refman-html (exit 2)
(cd _build/default/doc && /usr/obj/pkgsrc/lang/coq/work/.tools/bin/env COQLIB=.. sphinx-build-3.9 -q -W -b html sphinx refman-html)

Extension error:
/usr/obj/pkgsrc/lang/coq/work/coq-8.13.0/_build/default/doc/sphinx/proofs/writing-proofs/rewriting.rst:: Error while sending the following to coqtop:
Time assert (id (fact 9) = fact 9) by reflexivity.
coqtop output:

Full error text:
Timeout exceeded.
<pexpect.pty_spawn.spawn object at 0x7c689d0aa540>
command: /usr/obj/pkgsrc/lang/coq/work/coq-8.13.0/_build/install/default/bin/coqtop
args: [b'/usr/obj/pkgsrc/lang/coq/work/coq-8.13.0/_build/install/default/bin/coqtop', b'-q', b'-color', b'on']
buffer (last 100 chars): ''
before (last 100 chars): ''
after: <class 'pexpect.exceptions.TIMEOUT'>
match: None
match_index: None
exitstatus: None
flag_eof: False
pid: 25590
child_fd: 7
closed: False
timeout: 30
delimiter: <class 'pexpect.exceptions.EOF'>
logfile: None
logfile_read: None
logfile_send: None
maxread: 2000
ignorecase: False
searchwindowsize: None
delaybeforesend: 0
delayafterclose: 0.1
delayafterterminate: 0.1
searcher: searcher_re:
0: re.compile('\r\n[^< \r\n]+ < ')
Duplicate cmd name:  Extraction
Duplicate tacn name:  first (ssreflect)
Duplicate tacn name:  have
Duplicate tacn name:  apply (ssreflect)
Duplicate tacn name:  under
Duplicate tacn name:  over
Duplicate tacn name:  wlog
Duplicate tacn name:  set (ssreflect)
Duplicate tacn name:  unlock
Duplicate tacn name:  congr
Duplicate cmd name:  Hint View for apply
Duplicate cmd name:  Prenex Implicits
Duplicate exn name:  Not the right number of missing arguments
Duplicate exn name:  No such hypothesis in current goal
Duplicate exn name:  No such hypothesis
Duplicate exn name:  ‘ident’ is used in the hypothesis ‘ident’
Duplicate exn name:  No such hypothesis
Duplicate exn name:  No such hypothesis
Duplicate exn name:  ‘ident’ is already used
Duplicate exn name:  No such assumption
Duplicate exn name:  Not an inductive product
Duplicate exn name:  Not equal
Duplicate exn name:  Not equal (due to universes)
Duplicate exn name:  Unable to unify ‘term’ with ‘term’
Duplicate exn name:  No focused proof (No proof-editing in progress)
env doc/refman-pdf (exit 2)
(cd _build/default/doc && /usr/obj/pkgsrc/lang/coq/work/.tools/bin/env COQLIB=.. sphinx-build-3.9 -q -W -b latex sphinx refman-pdf)

Extension error:
/usr/obj/pkgsrc/lang/coq/work/coq-8.13.0/_build/default/doc/sphinx/proofs/writing-proofs/rewriting.rst:: Error while sending the following to coqtop:
Time assert (id (fact 9) = fact 9) by reflexivity.
coqtop output:

Full error text:
Timeout exceeded.
<pexpect.pty_spawn.spawn object at 0x794c54554c10>
command: /usr/obj/pkgsrc/lang/coq/work/coq-8.13.0/_build/install/default/bin/coqtop
args: [b'/usr/obj/pkgsrc/lang/coq/work/coq-8.13.0/_build/install/default/bin/coqtop', b'-q', b'-color', b'on']
buffer (last 100 chars): ''
before (last 100 chars): ''
after: <class 'pexpect.exceptions.TIMEOUT'>
match: None
match_index: None
exitstatus: None
flag_eof: False
pid: 28858
child_fd: 7
closed: False
timeout: 30
delimiter: <class 'pexpect.exceptions.EOF'>
logfile: None
logfile_read: None
logfile_send: None
maxread: 2000
ignorecase: False
searchwindowsize: None
delaybeforesend: 0
delayafterclose: 0.1
delayafterterminate: 0.1
searcher: searcher_re:
0: re.compile('\r\n[^< \r\n]+ < ')
Duplicate cmd name:  Extraction
Duplicate tacn name:  first (ssreflect)
Duplicate tacn name:  have
Duplicate tacn name:  apply (ssreflect)
Duplicate tacn name:  under
Duplicate tacn name:  over
Duplicate tacn name:  wlog
Duplicate tacn name:  set (ssreflect)
Duplicate tacn name:  unlock
Duplicate tacn name:  congr
Duplicate cmd name:  Hint View for apply
Duplicate cmd name:  Prenex Implicits
Duplicate exn name:  Not the right number of missing arguments
Duplicate exn name:  No such hypothesis in current goal
Duplicate exn name:  No such hypothesis
Duplicate exn name:  ‘ident’ is used in the hypothesis ‘ident’
Duplicate exn name:  No such hypothesis
Duplicate exn name:  No such hypothesis
Duplicate exn name:  ‘ident’ is already used
Duplicate exn name:  No such assumption
Duplicate exn name:  Not an inductive product
Duplicate exn name:  Not equal
Duplicate exn name:  Not equal (due to universes)
Duplicate exn name:  Unable to unify ‘term’ with ‘term’
Duplicate exn name:  No focused proof (No proof-editing in progress)
Done: 10971/10977 (jobs: 1)*** Error code 1

Stop.
make[1]: stopped in /usr/pkgsrc/lang/coq
*** Error code 1

Stop.
make: stopped in /usr/pkgsrc/lang/coq

@azarens-git
Copy link

I captured the chat between coqtop.py and coqtop binary just before the failure occurred:

> Set Coqtop Exit On Error.
< 
> Set Warnings "+default".
< 
> Unset Coqtop Exit On Error.
< 
> Set Warnings "default".
< 
> Goal False.
< 1 subgoal
<   
<   ============================
<   False

>   change_no_check True.
< 1 subgoal
<   
<   ============================
<   True

>   exact I.
< No more subgoals.

> Qed.
< Toplevel input, characters 0-4:
< > Qed.
< > ^^^^
< Error:
< The term "I" has type "True" while it is expected to have type "False".

> Abort All.
< 
> Set Coqtop Exit On Error.
< 
> Set Warnings "+default".
< 
> Unset Coqtop Exit On Error.
< 
> Set Warnings "default".
< 
> Goal True -> False.
< 1 subgoal
<   
<   ============================
<   True -> False

>   intro H.
< 1 subgoal
<   
<   H : True
<   ============================
<   False

>   change_no_check False in H.
< 1 subgoal
<   
<   H : False
<   ============================
<   False

>   exact H.
< No more subgoals.

> Qed.
< Toplevel input, characters 0-4:
< > Qed.
< > ^^^^
< Error:
< The term "fun H : True => H" has type "True -> True"
< while it is expected to have type "True -> False".

> Abort All.
< 
> Set Coqtop Exit On Error.
< 
> Set Warnings "+default".
< 
> Unset Coqtop Exit On Error.
< 
> Set Warnings "default".
< 
> Goal 0 <= 1.
< 1 subgoal
<   
<   ============================
<   0 <= 1

> unfold le.
< Toplevel input, characters 7-9:
< > unfold le.
< >        ^^
< Error: Cannot turn inductive le into an evaluable reference.

> Abort All.
< 
> Set Coqtop Exit On Error.
< 
> Set Warnings "+default".
< 
> Unset Coqtop Exit On Error.
< 
> Set Warnings "default".
< 
> Opaque Nat.add.
< 
> Goal 1 + 0 = 1.
< 1 subgoal
<   
<   ============================
<   1 + 0 = 1

> unfold Nat.add.
< Toplevel input, characters 0-14:
< > unfold Nat.add.
< > ^^^^^^^^^^^^^^
< Error: Nat.add is opaque.

> Abort All.
< 
> Set Coqtop Exit On Error.
< 
> Set Warnings "+default".
< 
> Goal ~0=0.
< 1 subgoal
<   
<   ============================
<   0 <> 0

> unfold not.
< 1 subgoal
<   
<   ============================
<   0 = 0 -> False

> Fail progress fold not.
< The command has indeed failed with message:
< Failed to progress.

> pattern (0 = 0).
< 1 subgoal
<   
<   ============================
<   (fun P : Prop => P -> False) (0 = 0)

> fold not.
< 1 subgoal
<   
<   ============================
<   0 <> 0

> Abort All.
< 
> Reset Initial.
< 
> Set Coqtop Exit On Error.
< 
> Set Warnings "+default".
< 
> Opaque id.
< 
> Goal id 10 = 10.
< 1 subgoal
<   
<   ============================
<   id 10 = 10

> Fail unfold id.
< The command has indeed failed with message:
< id is opaque.

> with_strategy transparent [id] unfold id.
< 1 subgoal
<   
<   ============================
<   10 = 10

> Abort All.
< 
> Reset Initial.
< 
> Set Coqtop Exit On Error.
< 
> Set Warnings "+default".
< 
> Fixpoint fact (n : nat) : nat :=
>   match n with
>   | 0 => 1
>   | S n' => n * fact n'
>   end.
< fact is defined
< fact is recursively defined (guarded on 1st argument)

> Strategy 1000 [id].
< 
> Goal True.
< 1 subgoal
<   
<   ============================
<   True

> Time assert (id (fact 8) = fact 8) by reflexivity.
< Finished transaction in 2.88 secs (1.141u,1.722s) (successful)
< 1 subgoal
<   
<   H : id (fact 8) = fact 8
<   ============================
<   True

> Time assert (id (fact 9) = fact 9) by reflexivity.

@azarens-git
Copy link

The result of applying all input commands directly is as follows.

---8<---

Unnamed_thm < 
Coq < 
Coq < 
Coq < 
Coq < Coq < Coq < Coq < Coq < fact is defined
fact is recursively defined (guarded on 1st argument)

Coq < 
Coq < Goal True.
1 subgoal
  
  ============================
  True

Unnamed_thm < Time assert (id (fact 8) = fact 8) by reflexivity.
Finished transaction in 2.743 secs (0.997u,1.742s) (successful)
1 subgoal
  
  H : id (fact 8) = fact 8
  ============================
  True

Unnamed_thm < Time assert (id (fact 8) = fact 8) by reflexivity.
Finished transaction in 0.537 secs (0.53u,0.003s) (successful)
1 subgoal
  
  H, H0 : id (fact 8) = fact 8
  ============================
  True

Unnamed_thm < Time assert (id (fact 9) = fact 9) by reflexivity.
Finished transaction in 72.36 secs (8.328u,14.418s) (successful)
1 subgoal
  
  H, H0 : id (fact 8) = fact 8
  H1 : id (fact 9) = fact 9
  ============================
  True

Unnamed_thm <     

The same version, but on Debian box:

Unnamed_thm < Finished transaction in 0.221 secs (0.101u,0.117s) (successful)
1 subgoal

  H : id (fact 8) = fact 8
  ============================
  True

Unnamed_thm < Finished transaction in 1.428 secs (1.421u,0.005s) (successful)
1 subgoal

  H : id (fact 8) = fact 8
  H0 : id (fact 9) = fact 9
  ============================
  True

Unnamed_thm <

The state of coqtop at that point is thus:

0.0 45.7 35478920 15320896 pts/5  I<+   6:09AM  0:26.44 ./coqtop -q -color on

(about ≈33.8Gb virtual memory and ≈14.6Gb residential memory)

on Debian box:

5.1  1.8 504708 300376 pts/27  S+   06:18   0:01 coqtop -q -color on

(about ≈492Mb virtual memory and ≈293Mb residential memory)

---8<---

I also noticed that during the build coqc tends to compile excessive amounts of RAM (around 100Gb).

---8<---

This is an interesting discrepancy. A question arises, whether it is natural and tolerable. If yes, then a reasonable solution would be to increase a time-out. However, if the opposite is true, then a different course of action would be desirable.

@Zimmi48
Copy link
Member Author

Zimmi48 commented Feb 10, 2021

No, that's not normal. Can you open a new issue (since this is really a different problem) with this example? It really looks out of the scope of a documentation issue to me.

@azarens-git
Copy link

No, that's not normal. Can you open a new issue (since this is really a different problem) with this example? It really looks out of the scope of a documentation issue to me.

Agreed. Let me do a few more tests, though, to gather more details.

@azarens-git
Copy link

I would still like to suggest that the timeout be increased, say to 120 seconds. Just to cover all bases (and machines where Coq is compiled into byte-code, etc.).

@Zimmi48
Copy link
Member Author

Zimmi48 commented Feb 10, 2021

This could lead to incredibly long compilation times for the doc without a clear diagnosis (whereas here you get a timeout failure and you understand and if you really need it, you can tweak the timeout locally). Note that compiling the doc is optional and a pre-compiled version being available on the web, it's not a big deal if it's not as robust as Coq compilation.

@azarens-git
Copy link

azarens-git commented Feb 10, 2021

Thank you for your input.

This could lead to incredibly long compilation times for the doc without a clear diagnosis (whereas here you get a timeout failure and you understand and if you really need it, you can tweak the timeout locally).

I suppose, a low default is not an unreasonable approach thereby. However, what about making the timeout easily settable. Could it be exposed in some sort of config file? Even an include file with just a constant would be acceptable, since it'll be easy for a package maintainer to patch it and then substitute values.

The local issues, like e.g. memory usage, could and should be caught through local tests.

Note that compiling the doc is optional and a pre-compiled version being available on the web, it's not a big deal if it's not as robust as Coq compilation.

I would disagree. The local documentation is quite important, especially for a program like Coq, which focuses on correctness, and a number of other academic virtues. I also found the need to work with coq in a disconnected environment as a real and relevant one, both for others and for myself. Also, a system administrator would in general find much more work upon himself, if the release versions of a package fail intermittently.


All in all, it is not unreasonable to say that making timeout configurable, while leaving the default as it is, would be a most pragmatic solution thereby.

@azarens-git
Copy link

azarens-git commented Feb 11, 2021

I have done some additional testing by building both packages with dune from opam on a NetBSD box and a Debian box. In both instances, I got a working coqtop that did not consume excessive amount of memory. In particular, it processed the input script previously referenced by me as follows:

original script:

Set Coqtop Exit On Error.
Set Warnings "+default".
Unset Coqtop Exit On Error.
Set Warnings "default".
Goal False.
  change_no_check True.
  exact I.
Qed.
Abort All.
Set Coqtop Exit On Error.
Set Warnings "+default".
Unset Coqtop Exit On Error.
Set Warnings "default".
Goal True -> False.
  intro H.
  change_no_check False in H.
  exact H.
Qed.
Abort All.
Set Coqtop Exit On Error.
Set Warnings "+default".
Unset Coqtop Exit On Error.
Set Warnings "default".
Goal 0 <= 1.
unfold le.
Abort All.
Set Coqtop Exit On Error.
Set Warnings "+default".
Unset Coqtop Exit On Error.
Set Warnings "default".
Opaque Nat.add.
Goal 1 + 0 = 1.
unfold Nat.add.
Abort All.
Set Coqtop Exit On Error.
Set Warnings "+default".
Goal ~0=0.
unfold not.
Fail progress fold not.
pattern (0 = 0).
fold not.
Abort All.
Reset Initial.
Set Coqtop Exit On Error.
Set Warnings "+default".
Opaque id.
Goal id 10 = 10.
Fail unfold id.
with_strategy transparent [id] unfold id.
Abort All.
Reset Initial.
Set Coqtop Exit On Error.
Set Warnings "+default".
Fixpoint fact (n : nat) : nat :=
  match n with
  | 0 => 1
  | S n' => n * fact n'
  end.
Strategy 1000 [id].
Goal True.
Time assert (id (fact 8) = fact 8) by reflexivity.
Time assert (id (fact 9) = fact 9) by reflexivity.

on NetBSD box:

Unnamed_thm < 1 subgoal
  
  ============================
  10 = 10

Unnamed_thm < 
Coq < 
Coq < 
Coq < 
Coq < Coq < Coq < Coq < Coq < fact is defined
fact is recursively defined (guarded on 1st argument)

Coq < 
Coq < 1 subgoal
  
  ============================
  True

Unnamed_thm < Finished transaction in 0.291 secs (0.15u,0.14s) (successful)
1 subgoal
  
  H : id (fact 8) = fact 8
  ============================
  True

Unnamed_thm < Finished transaction in 1.259 secs (1.259u,0.s) (successful)
1 subgoal
  
  H : id (fact 8) = fact 8
  H0 : id (fact 9) = fact 9
  ============================
  True

Unnamed_thm < 

on Debian box:

Coq < 1 subgoal

  ============================
  id 10 = 10

Unnamed_thm < The command has indeed failed with message:
id is opaque.

Unnamed_thm < 1 subgoal

  ============================
  10 = 10

Unnamed_thm <
Coq <
Coq <
Coq <
Coq < Coq < Coq < Coq < Coq < fact is defined
fact is recursively defined (guarded on 1st argument)

Coq <
Coq < 1 subgoal

  ============================
  True

Unnamed_thm < Finished transaction in 0.252 secs (0.132u,0.119s) (successful)
1 subgoal

  H : id (fact 8) = fact 8
  ============================
  True

Unnamed_thm < Finished transaction in 1.306 secs (1.302u,0.003s) (successful)
1 subgoal

  H : id (fact 8) = fact 8
  H0 : id (fact 9) = fact 9
  ============================
  True

Unnamed_thm <

All in all, I'm glad that the issue got caught while the documentation was being compiled.

@Zimmi48
Copy link
Member Author

Zimmi48 commented Feb 11, 2021

However, what about making the timeout easily settable.

Sounds good to me. Please open an issue for this specific request.

@azarens-git
Copy link

However, what about making the timeout easily settable.

Sounds good to me. Please open an issue for this specific request.

Thank you. This is a good suggestion. Opened issue #13848.

@Zimmi48 Zimmi48 added this to the 8.9+beta1 milestone Jul 4, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind: ci-failure Information about unexpected and random CI failures. kind: infrastructure CI, build tools, development tools.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants