Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Toric variety timing test failing on julia 1.10-alpha1 #2539

Closed
lgoettgens opened this issue Jul 10, 2023 · 6 comments
Closed

Toric variety timing test failing on julia 1.10-alpha1 #2539

lgoettgens opened this issue Jul 10, 2023 · 6 comments
Labels
bug Something isn't working nightly error with julia nightly or pre-releases optimization Simpler/more performant code or more/better tests topic: toric schemes

Comments

@lgoettgens
Copy link
Member

lgoettgens commented Jul 10, 2023

Describe the bug
The following test fails sometimes on julia 1.10-alpha1 (https://github.com/oscar-system/Oscar.jl/actions/runs/5507326061/jobs/10037138193?pr=2533#step:6:1630).

duration = @elapsed stanley_reisner_ideal(ntv5)
@test duration < 10

To Reproduce
In julia 1.10-alpha1:

using Oscar, Test
ntv5 = normal_toric_variety(polarize(polyhedron(Polymake.polytope.rand_sphere(5, 60; seed=42))); set_attributes=false)
duration = @elapsed stanley_reisner_ideal(ntv5)
@test duration < 10

System (please complete the following information):

julia> Oscar.versioninfo(full=true)
OSCAR version 0.13.0-DEV
  combining:
    AbstractAlgebra.jl   v0.30.9
    GAP.jl               v0.9.7
    Hecke.jl             v0.18.16
    Nemo.jl              v0.34.7
    Polymake.jl          v0.10.0
    Singular.jl          v0.18.8
  building on:
    Antic_jll               v0.201.500+0
    Arb_jll                 v200.2300.0+0
    Calcium_jll             v0.401.100+0
    FLINT_jll               v200.900.6+0
    GAP_jll                 v400.1200.200+5
    Singular_jll            v403.203.202+0
    libpolymake_julia_jll   v0.10.2+0
    libsingular_julia_jll   v0.35.0+0
    polymake_jll            v400.1000.1+0
See `]st -m` for a full list of dependencies.

Julia Version 1.10.0-alpha1
Commit f8ad15f7b16 (2023-07-06 10:36 UTC)
Platform Info:
  OS: Linux (x86_64-linux-gnu)
  CPU: 8 × Intel(R) Core(TM) i7-10510U CPU @ 1.80GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-15.0.7 (ORCJIT, skylake)
  Threads: 1 on 8 virtual cores
Official https://julialang.org/ release

cc @lkastner

@lgoettgens lgoettgens added the bug Something isn't working label Jul 10, 2023
@lgoettgens
Copy link
Member Author

I think this failure might be due to some compilation time included in the @elapsed output. Maybe one could first call stanley_reisner_ideal on some other input to compile it, and then test its runtime.

@benlorenz
Copy link
Member

I don't think this is related to compilation time, I would be really surprised if that jumped from below ten seconds to 40 seconds. I also saw this a few times with julia nightly (in CI) but with around 15-20 seconds instead of 40.

Have you been able to reproduce this locally? On my machine I get 1.5 seconds on the first call, i.e. plenty of margin to 10 seconds.

@lgoettgens
Copy link
Member Author

Have you been able to reproduce this locally? On my machine I get 1.5 seconds on the first call, i.e. plenty of margin to 10 seconds.

The highest I got was around 2.2 seconds, but most times around 1.5 seconds as well.

Then maybe some runner hickup?

@benlorenz
Copy link
Member

benlorenz commented Jul 10, 2023

Not really sure, I thought maybe memory / GC issues but if that were the case I would expect some other tests before that to take significantly longer as well. But that does not seem to be case.
I am currently doing some CI runs to check this: https://github.com/oscar-system/Oscar.jl/actions/runs/5509331514

@benlorenz
Copy link
Member

benlorenz commented Jul 10, 2023

Well, I did not expect a 10 seconds GC pause:
https://github.com/oscar-system/Oscar.jl/actions/runs/5510624622/jobs/10045095350#step:6:3307

I ran that test about 40 times in the CI and one time it got hit with a 10 seconds GC pause once and a 3.5 seconds pause a bit later:

GC live:    2880.903 MiB
JIT:          78.819 MiB
Max. RSS:   6029.176 MiB
Free mem:    184.719 MiB
Free pmem:   184.719 MiB
[ Info: i: 1 -- duration 0.870861302
[ Info: i: 2 -- duration 0.917865691
[ Info: i: 3 -- duration 1.26568221
[ Info: i: 4 -- duration 1.188737837
GC: pause 10726.64ms. collected 1879.166639MB. full 
[ Info: i: 5 -- duration 14.473311185
[ Info: i: 6 -- duration 0.40521015
[ Info: i: 7 -- duration 0.266128052
[ Info: i: 8 -- duration 0.253846305
[ Info: i: 9 -- duration 0.238968969
[ Info: i: 10 -- duration 0.27125293
[ Info: i: 11 -- duration 0.352951277
[ Info: i: 12 -- duration 0.313214449
[ Info: i: 13 -- duration 0.32096
[ Info: i: 14 -- duration 0.321512867
[ Info: i: 15 -- duration 0.282129481
[ Info: i: 16 -- duration 0.289936058
[ Info: i: 17 -- duration 0.364230346
GC: pause 3500.24ms. collected 1427.438326MB. incr 
[ Info: i: 18 -- duration 5.98371855
[ Info: i: 19 -- duration 0.306978372
[ Info: i: 20 -- duration 0.251166213
Test Summary:                                  | Pass  Total   Time
Normal toric varieties (set_attributes = true) |   10     10  55.0s
[ Info: i: 1 -- duration 0.310814275
[ Info: i: 2 -- duration 0.300577618
[ Info: i: 3 -- duration 0.290595061
[ Info: i: 4 -- duration 0.27208114
GC: pause 518.93ms. collected 1782.439136MB. incr 
[ Info: i: 5 -- duration 1.976023973
[ Info: i: 6 -- duration 0.314547121
[ Info: i: 7 -- duration 0.264213242
[ Info: i: 8 -- duration 0.26235665
[ Info: i: 9 -- duration 0.251494698
[ Info: i: 10 -- duration 0.243513433
[ Info: i: 11 -- duration 0.23962074
[ Info: i: 12 -- duration 0.217226548
GC: pause 238.03ms. collected 1466.542080MB. incr 
[ Info: i: 13 -- duration 1.381936208
[ Info: i: 14 -- duration 0.302096938
[ Info: i: 15 -- duration 0.258642647
[ Info: i: 16 -- duration 0.250137789
[ Info: i: 17 -- duration 0.236656203
[ Info: i: 18 -- duration 0.228981039
[ Info: i: 19 -- duration 0.227338446
[ Info: i: 20 -- duration 0.212192616
GC: pause 213.74ms. collected 1477.049784MB. incr 
Test Summary:                                   | Pass  Total   Time
Normal toric varieties (set_attributes = false) |   10     10  30.8s
-> Testing AlgebraicGeometry/ToricVarieties/normal_toric_varieties.jl took: runtime 81.665 seconds + compilation 6.530 seconds + recompilation 0.000 seconds, 7329.18 MB
GC live:    2549.578 MiB
JIT:          78.947 MiB
Max. RSS:   6247.332 MiB
Free mem:    282.309 MiB
Free pmem:   282.309 MiB

All the other jobs in that CI run for nightly+1.10 here have many small pauses instead of a few very long pauses instead. I don't really know why this happens.

But in any case I think there are two things to do:

  • We should really reduce some of our extremely memory intensive tests. We do get the occasional Terminated (15) in the tests which is probably killed due to memory starvation. [1]
    The Linux CI jobs are constantly very close to the memory-limit.
  • I will change the elapsed macro to something which ignores the GC time.

PS: The Linux runners have only about 7GB memory, macOS has 14GB, this would explain why this happens mostly on Linux.

Edit:

[1] Just to make sure that it is clear what I meant with killed jobs. It just happened with julia 1.8 here.

Test Summary:                                                       | Pass  Total     Time
Line bundle cohomologies and vanishing sets (set_attributes = true) |   30     30  6m18.2s

signal (15): Terminated
in expression starting at /home/runner/work/_actions/julia-actions/julia-runtest/latest/test_harness.jl:7
epoll_wait at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
Error: The operation was canceled.

Also please note the test-duration for the line bundled at more than 6 minutes.

In a different run without crash this was taking "only" 3 minutes:

Test Summary:                                                       | Pass  Total     Time
Line bundle cohomologies and vanishing sets (set_attributes = true) |   30     30  3m19.8s
Test Summary:                                                        | Pass  Total   Time
Line bundle cohomologies and vanishing sets (set_attributes = false) |   30     30  37.0s
-> Testing AlgebraicGeometry/ToricVarieties/line_bundle_cohomologies.jl took: runtime 232.885 seconds + compilation 31.740 seconds + recompilation 0.000 seconds, 12775.23 MB

12GB allocations is one of our larger testsets.

@lkastner lkastner added optimization Simpler/more performant code or more/better tests topic: toric schemes labels Jul 10, 2023
@benlorenz benlorenz added the nightly error with julia nightly or pre-releases label Jul 11, 2023
@lgoettgens
Copy link
Member Author

Closed via #2542

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working nightly error with julia nightly or pre-releases optimization Simpler/more performant code or more/better tests topic: toric schemes
Projects
None yet
Development

No branches or pull requests

3 participants