Add native support for Apple M1 #546

Closed
25 tasks done
fingolfin opened this issue Jul 9, 2021 · 20 comments

Comments

@fingolfin
Member

fingolfin commented Jul 9, 2021

This would be nice to have, as Julia 1.7 will probably support it "officially", and so people will start asking for this.

So, here is a list of JLLs that someone will have to update with support for M1 (I've omitted a few which already have it; I may have missed some, though):

There is a catch, though: to support new architectures, the JLL must require Julia >= 1.6, as Julia <= 1.5 chokes on platform "triples" in Artifacts.toml that it doesn't know.

Luckily, we already know in principle how to do these things. Also, in the version with Julia >= 1.6 support, we could then make use of the new extended "platform triplets" ("tuplets" now, really), which are now able to carry arbitrary (?) key-value pairs, so that one can specify "julia_version=XXX". Here is an example of a JLL doing that (search for julia_version to find the relevant parts of the code). BTW this is another reason why I look forward to eventually dropping support for Julia <= 1.5...
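
For anyone unfamiliar with the extended platform objects, here is a minimal sketch of what they look like from the Julia side. It assumes Julia >= 1.6 and uses only `Base.BinaryPlatforms`; the version strings are made up for illustration and are not taken from any actual JLL recipe.

```julia
using Base.BinaryPlatforms

# A plain Apple M1 platform, and the same platform with an extra
# key-value pair attached (illustrative version numbers only).
m1 = Platform("aarch64", "macos")
m1_julia17 = Platform("aarch64", "macos"; julia_version = "1.7.0")
m1_julia16 = Platform("aarch64", "macos"; julia_version = "1.6.0")

# The extra tag is stored alongside the usual arch/os tags ...
println(tags(m1_julia17)["julia_version"])

# ... and it shows up in the extended triplet string.
println(triplet(m1_julia17))

# A platform without the tag matches either tagged one, while two
# different julia_version values do not match each other.
println(platforms_match(m1, m1_julia17))
println(platforms_match(m1_julia16, m1_julia17))
```

The point of the last two lines is that the tag only restricts matching when both sides carry it, which is what should let a single JLL hold separate builds per Julia version.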

@thofma
Collaborator

thofma commented Jul 9, 2021

> Luckily, we already know in principle how to do these things. Also, in the version with Julia >= 1.6 support, we could then make use of the new extended "platform triplets" ("tuplets" now, really), which are now able to carry arbitrary (?) key-value pairs, so that one can specify "julia_version=XXX". Here is an example of a JLL doing that (search for julia_version to find the relevant parts of the code). BTW this is another reason why I look forward to eventually dropping support for Julia <= 1.5...

Would this mean we could drop the version shenanigans in libsingular_julia_jll?

@fingolfin
Member Author

> Would this mean we could drop the version shenanigans in libsingular_julia_jll?

Yes (assuming we require Julia >= 1.6 throughout). I mean, personally, I'd be all for that. Perhaps we can just do it now for most of our packages and JLLs (dunno if this would be acceptable for Nemo now or not, ping @wbhart)

@wbhart
Contributor

wbhart commented Jul 9, 2021

There's still no Julia LTS after 1.0.5. I am personally using 1.6, but I have to build docs on 1.5 due to the changes they made to the way arrays are printed. Those would be my main objections right now. Otherwise I am not opposed.

@thofma
Collaborator

thofma commented Jul 9, 2021

The LTS thing would also be my concern. Let's hope that 1.6 becomes LTS once 1.7 is out.

@fingolfin
Member Author

I asked on the Julia slack regarding LTS and how (un)likely it is that 1.6 would become the next LTS. I was told: "I would say quite likely" by one person. I expressed that I was looking forward to this as some of my colleagues have concerns about not supporting the LTS, to which Stefan Karpinski replied:

> This is why I think that having an LTS may do more harm than good. It would be perfectly reasonable to just decide that a floor of 1.3 across a related set of packages was what is supported.

Anyway, for Oscar/Singular/GAP/Polymake we should be free to drop support for versions before 1.6, and doing that would remove at least one of the various major headaches we currently have with updating JLLs. (We still need to submit staggered PRs for each JLL, but then only one per JLL, not one per JLL and supported Julia version.)
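
To make the "one per JLL" point concrete, here is a rough sketch of how a single artifact table could carry builds for several Julia versions once everything requires Julia >= 1.6. The version list, the table, and its string values are hypothetical stand-ins for real artifact metadata; only `Platform` and `select_platform` from `Base.BinaryPlatforms` are actual API.

```julia
using Base.BinaryPlatforms

# Hypothetical list of supported Julia versions.
julia_versions = [v"1.6.0", v"1.7.0"]

# One table with one entry per Julia version; the values stand in for artifact info.
entries = [Platform("aarch64", "macos"; julia_version = string(v)) => "build for Julia $v" for v in julia_versions]
artifacts = Dict(entries)

# Pick the entry matching a given (here hand-written) target platform.
target = Platform("aarch64", "macos"; julia_version = "1.7.0")
println(select_platform(artifacts, target))

# A platform with no matching entry yields `nothing`.
println(select_platform(artifacts, Platform("x86_64", "linux")))
```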

@benlorenz
Member

While experimenting with some preliminary polymake binaries for Apple M1:

     Testing Running tests...
Test Summary: | Pass  Total
Polymake      | 7432   7432
     Testing Polymake tests passed 

julia> versioninfo()
Julia Version 1.7.0-rc2
Commit f23fc0d27a (2021-10-20 12:45 UTC)
Platform Info:
  OS: macOS (arm64-apple-darwin20.6.0)
  CPU: Apple M1
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-12.0.1 (ORCJIT, cyclone)
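
A small sketch for checking from within Julia whether the session is a native Apple Silicon one (as opposed to an x86_64 build running under Rosetta), assuming Julia >= 1.6. The helper name `is_native_apple_silicon` is made up here.

```julia
using Base.BinaryPlatforms

# Hypothetical helper: true only for a native aarch64 macOS build of Julia.
is_native_apple_silicon() = Sys.isapple() && Sys.ARCH === :aarch64

# The host platform object carries the same information as tags.
host = HostPlatform()
println("host triplet: ", triplet(host))
println("native Apple Silicon: ", is_native_apple_silicon())
```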

I ran into some other errors when trying to test Oscar and it seems similar errors also appear when running just the Singular testsuite.

Singular prints several non-fatal errors during the tests, mostly about divisions by 0:

    singular error: div by 0
    singular error: div by 0
    singular error: div by 0
Test Summary:      | Pass  Total
n_Z.exact_division |    4      4
...
    singular error: 1/0
    singular error: div by 0
    singular error: div by 0
Test Summary:       | Pass  Total
n_Zp.exact_division |    5      5
...
    singular error: div by 0
    singular error: div by 0
    singular error: div by 0
Test Summary:       | Pass  Total
n_Zn.exact_division |    4      4
...
    singular error: div by 0
    singular error: div by 0
    singular error: div by 0
Test Summary:       | Pass  Total
n_GF.exact_division |    7      7
...
    singular error: div by 0
Test Summary:             | Pass  Total
n_transExt.exact_division |    5      5
...
    singular error: div by 0
    singular error: zero divisor found - your minpoly is not irreducible
    singular error: div by 0
    singular error: div by 0
    singular error: zero divisor found - your minpoly is not irreducible
    singular error: div by 0
Test Summary:           | Pass  Total
n_algExt.exact_division |    5      5
...
    singular error: not implemented
Test Summary:     | Pass  Total
poly.extended_gcd |    3      3

And then complains rather loudly about some missing NTL/FLINT stuff:

    singular error: multivariate factorization depends on NTL/FLINT(missing)
    singular error: NTL/FLINT missing: squarefreeFactorization
    singular error: NTL/FLINT missing: squarefreeFactorization
    singular error: NTL/FLINT missing: squarefreeFactorization
    singular error: multivariate factorization over Q(alpha) depends on NTL or FLINT (missing)
    singular error: NTL/FLINT missing:Farey
    singular error: NTL/FLINT missing:Farey
    singular error: NTL/FLINT missing:Farey
    singular error: NTL/FLINT missing:Farey
    singular error: NTL/FLINT missing:Farey
    singular error: NTL/FLINT missing:Farey
    singular error: NTL/FLINT missing:Farey
    singular error: NTL/FLINT missing:Farey
    singular error: NTL/FLINT missing:Farey
    singular error: NTL/FLINT missing:Farey
    singular error: NTL/FLINT missing:Farey
    singular error: NTL/FLINT missing:Farey
    singular error: NTL/FLINT missing:Farey
    singular error: NTL/FLINT missing:Farey
    singular error: NTL/FLINT missing:Farey
    singular error: NTL/FLINT missing:Farey
    singular error: NTL/FLINT missing:Farey
    singular error: NTL/FLINT missing:Farey
    singular error: NTL/FLINT missing:Farey
    singular error: NTL/FLINT missing:Farey
    singular error: multivariate factorization over Z/pZ(alpha) depends on NTL/Flint(missing)
    singular error: NTL/FLINT missing: squarefreeFactorization
Test Summary:          | Pass  Total
poly.test_spoly_factor |   12     12
Test Summary: | Pass  Total
poly.hash     |    3      3
Test Summary: | Pass  Total
poly.errors   |    2      2
poly.to_univariate: Error During Test at /Users/lorenz/.julia/dev/Singular/test/poly/poly-test.jl:601
  Test threw exception
  Expression: touni(S, (1 + 2y) ^ 6) == (1 + 2t) ^ 6
  multivariate factorization depends on NTL/FLINT(missing).  NTL/FLINT missing: squarefreeFactorization.  NTL/FLINT missing: squarefreeFactorization.  NTL/FLINT missing: squarefreeFactorization.  multivariate factorization over Q(alpha) depends on NTL or FLINT (missing).  NTL/FLINT missing:Farey.  NTL/FLINT missing:Farey.  NTL/FLINT missing:Farey.  NTL/FLINT missing:Farey.  NTL/FLINT missing:Farey.  NTL/FLINT missing:Farey.  NTL/FLINT missing:Farey.  NTL/FLINT missing:Farey.  NTL/FLINT missing:Farey.  NTL/FLINT missing:Farey.  NTL/FLINT missing:Farey.  NTL/FLINT missing:Farey.  NTL/FLINT missing:Farey.  NTL/FLINT missing:Farey.  NTL/FLINT missing:Farey.  NTL/FLINT missing:Farey.  NTL/FLINT missing:Farey.  NTL/FLINT missing:Farey.  NTL/FLINT missing:Farey.  NTL/FLINT missing:Farey.  multivariate factorization over Z/pZ(alpha) depends on NTL/Flint(missing).  NTL/FLINT missing: squarefreeFactorization
  Stacktrace:
    [1] error(s::CxxWrap.StdLib.StdStringAllocated)
      @ Base ./error.jl:33
    [2] check_error
      @ ~/.julia/dev/Singular/src/libsingular/errors.jl:3 [inlined]
    [3] divexact(x::n_Q, y::n_Q; check::Bool)
      @ Singular ~/.julia/dev/Singular/src/number/n_Q.jl:247
    [4] divexact
      @ ~/.julia/dev/Singular/src/number/n_Q.jl:244 [inlined]
    [5] pow_multinomial(a::AbstractAlgebra.Generic.Poly{n_Q}, e::Int64)
      @ AbstractAlgebra ~/.julia/packages/AbstractAlgebra/yXuIT/src/Poly.jl:772
    [6] ^(a::AbstractAlgebra.Generic.Poly{n_Q}, b::Int64)
      @ AbstractAlgebra ~/.julia/packages/AbstractAlgebra/yXuIT/src/Poly.jl:813
    [7] literal_pow(#unused#::typeof(^), x::AbstractAlgebra.Generic.Poly{n_Q}, #unused#::Val{6})
      @ AbstractAlgebra ~/.julia/packages/AbstractAlgebra/yXuIT/src/NCRings.jl:79
    [8] macro expansion
      @ /Volumes/Julia-1.7.0-rc2/Julia-1.7.app/Contents/Resources/julia/share/julia/stdlib/v1.7/Test/src/Test.jl:445 [inlined]
    [9] macro expansion
      @ ~/.julia/dev/Singular/test/poly/poly-test.jl:601 [inlined]

Is this known?

@tthsqe12
Contributor

tthsqe12 commented Nov 15, 2021

> Singular prints several non-fatal errors during the tests, mostly about divisions by 0:

These are indeed fatal and will be checked in the next Singular.jl versions. Something is very wrong here. Does the flint test code (make check) pass on this setup? How about the Singular test code (not Singular.jl)?

> And then complains rather loudly about some missing NTL/FLINT stuff:

Of course the errors from singular here are self-explanatory. I would just like to add that they may or may not be printing to the screen at the correct time.

@benlorenz
Member

> > Singular prints several non-fatal errors during the tests, mostly about divisions by 0:

> These are indeed fatal and will be checked in the next Singular.jl versions. Something is very wrong here. Does the flint test code (make check) pass on this setup? How about the Singular test code (not Singular.jl)?

I don't really know, this is just using the binaries that were built on Yggdrasil with the recipes Max added. Not really sure how to run the check in this environment, as the binaries are cross-compiled. Is there some way to run these (flint or singular) against an existing (installed) library?

> > And then complains rather loudly about some missing NTL/FLINT stuff:

> Of course the errors from singular here are self-explanatory. I would just like to add that they may or may not be printing to the screen at the correct time.

Why would this be missing on Apple M1 only?

@tthsqe12
Contributor

tthsqe12 commented Nov 15, 2021

> I don't really know, this is just using the binaries that were built on Yggdrasil with the recipes Max added. Not really sure how to run the check in this environment, as the binaries are cross-compiled. Is there some way to run these (flint or singular) against an existing (installed) library?

It would probably be easier to just run the test suite for Nemo.jl if you can. It doesn't catch everything, but should fail if there is something seriously wrong with the bins.

> Why would this be missing on Apple M1 only?

Not sure, the logs in Singular.v402.101.100.aarch64-apple-darwin.tar.gz indicate that flint was NOT found, but the logs, for example, in Singular.v402.101.100.aarch64-linux-gnu-cxx03.tar.gz indicate that flint was found. Both are configured with --with-flint=/workspace/destdir, so, it appears to be an apple-specific bug.

@benlorenz
Member

Good catch. I would have expected the configuration to fail if --with-flint is specified but no flint is found. It seems the build for Singular_jll used the wrong Dependency specification for FLINT_jll and picked the oldest compatible 200.800 version instead of the new 200.800.100 (or 200.800.300) series, which is the first to support Apple M1. This can be seen in the log here: https://dev.azure.com/JuliaPackaging/Yggdrasil/_build/results?buildId=14218&view=logs&j=415a8029-d79a-51a4-adf1-f542096f4bd3&t=0846dbfb-036c-5562-66f3-2c0bc2e711b5
This warning in particular indicates the problem:

┌ Warning: Dependency FLINT_jll does not have a mapping for artifact FLINT for platform aarch64-apple-darwin20-libgfortran5-cxx11

I think we need something like this for Singular? (@fingolfin)

    Dependency("FLINT_jll", v"200.800.301", compat = "~200.800.301"),

Or .101 at the end instead?
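
For reference, a tilde specifier with three components, such as `~200.800.301`, admits versions from 200.800.301 up to, but not including, 200.801.0. The following is a hand-rolled sketch of that rule, just to illustrate why such a bound would exclude an older release like 200.800.0; `tilde_allows` is a hypothetical helper and not part of Pkg or BinaryBuilder.

```julia
# Hypothetical helper mimicking `~major.minor.patch`:
# the allowed range is [major.minor.patch, major.(minor+1).0).
function tilde_allows(lower::VersionNumber, v::VersionNumber)
    upper = VersionNumber(lower.major, lower.minor + 1, 0)
    return lower <= v < upper
end

# Check a few candidate FLINT_jll versions against the proposed bound.
for v in (v"200.800.0", v"200.800.101", v"200.800.301", v"200.801.0")
    println(v, " allowed by ~200.800.301: ", tilde_allows(v"200.800.301", v))
end
```

With `~200.800.101` as the bound instead, both the .101 and the .301 releases would be admitted.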

@benlorenz
Member

Now that the list in the first post is complete I did some more experiments:

With self-built Singular_jll binaries with the updated FLINT compat, the Singular.jl tests succeed; Polymake.jl, Nemo.jl, Hecke.jl and AbstractAlgebra.jl also seem to work.
(All packages tested with their current master, but their dependencies were at the latest release)

Oscar dies here (no idea where that message comes from and no, there is no backtrace):

MPolyQuo.ideals |   19     19
Bad input data, stopped computation.
ERROR: Package Oscar errored during testing

GAP (this might be fixed by the switch to jll based GAP packages?):

compat        |    7      7
#I  no such package package is not available. Check that the name is correct
#I  and it is present in one of the GAP root directories (see '??RootPaths')
#I  no such package package is not available. Check that the name is correct
#I  and it is present in one of the GAP root directories (see '??RootPaths')
#I  Getting PackageInfo URLs...
#I  Retrieving PackageInfo.g from https://gap-packages.github.io/io/PackageInfo.g ...
#I  Downloading archive from URL https://github.com/gap-packages/io/releases/download/v4.7.2/io-4.7.2.tar.gz ...
#I  Saved archive to /var/folders/m4/b5jpt2d10cx2z6cx0r6hbtgr0000gr/T//gaptempdirqSCbOt/io-4.7.2.tar.gz.pkgman
#I  Extracting to /Users/lorenz/.julia/gaproot/v4.12/pkg/io-4.7.2 ...
#I  Running compilation script on /Users/lorenz/.julia/gaproot/v4.12/pkg/io-4.7.2 ...
#I  Possible error detected: see log at /var/folders/m4/b5jpt2d10cx2z6cx0r6hbtgr0000gr/T//gaptempdirjJK8SU/exec-log.txt
#I  Compilation failed (package may still be usable)
#I  Package availability test failed
#I  (for IO 4.7.2)
#I  Removed directory /Users/lorenz/.julia/gaproot/v4.12/pkg/io-4.7.2
packages: Test Failed at /Users/lorenz/GAP.jl/test/packages.jl:7
  Expression: GAP.Packages.install("io", interactive = false)

The error log ends like this:

configure: creating ./config.status
config.status: creating Makefile
config.status: creating gen/pkgconfig.h
Running 'make' 'clean' 
rm -rf bin/aarch64-apple-darwin20-julia64-kv8 gen
Running 'make' 
autoheader not available, proceeding with stale config.h
touch gen/pkgconfig.h.in
touch: gen/pkgconfig.h.in: No such file or directory
make: *** [gen/pkgconfig.h.in] Error 1

WARNING: Failed to build io-4.7.2

For Hecke and AbstractAlgebra I got two similar errors once but they seem to have disappeared now:

Hecke:

Test Summary: | Pass  Total
Map           |  358    358
 37.102377 seconds (49.16 M allocations: 19.656 GiB, 2.09% gc time, 26.30% compilation time)

signal (11): Segmentation fault: 11
in expression starting at /Users/lorenz/.julia/packages/Hecke/ye5vG/test/Misc/OrdLocalization.jl:9
ntuple at ./ntuple.jl:0
unknown function (ip: 0x2ce8f1607)
jl_apply_generic at /Volumes/Julia-1.7.0/Julia-1.7.app/Contents/Resources/julia/lib/julia/libjulia-internal.1.7.dylib (unknown line)
getindex at ./range.jl:373
typed_hvcat at ./abstractarray.jl:2038
hvcat at ./abstractarray.jl:2020 [inlined]
gcdx at /Users/lorenz/.julia/packages/Hecke/ye5vG/src/Misc/OrdLocalization.jl:392
jl_apply_generic at /Volumes/Julia-1.7.0/Julia-1.7.app/Contents/Resources/julia/lib/julia/libjulia-internal.1.7.dylib (unknown line)
macro expansion at /Users/lorenz/.julia/packages/Hecke/ye5vG/test/Misc/OrdLocalization.jl:65 [inlined]
macro expansion at /Users/administrator/src/julia/usr/share/julia/stdlib/v1.7/Test/src/Test.jl:1283 [inlined]
macro expansion at /Users/lorenz/.julia/packages/Hecke/ye5vG/test/Misc/OrdLocalization.jl:62 [inlined]
macro expansion at /Users/administrator/src/julia/usr/share/julia/stdlib/v1.7/Test/src/Test.jl:1283 [inlined]
top-level scope at /Users/lorenz/.julia/packages/Hecke/ye5vG/test/Misc/OrdLocalization.jl:11
jl_toplevel_eval_flex at /Volumes/Julia-1.7.0/Julia-1.7.app/Contents/Resources/julia/lib/julia/libjulia-internal.1.7.dylib (unknown line)
jl_toplevel_eval_flex at /Volumes/Julia-1.7.0/Julia-1.7.app/Contents/Resources/julia/lib/julia/libjulia-internal.1.7.dylib (unknown line)
jl_toplevel_eval_in at /Volumes/Julia-1.7.0/Julia-1.7.app/Contents/Resources/julia/lib/julia/libjulia-internal.1.7.dylib (unknown line)
eval at ./boot.jl:373 [inlined]

AbstractAlgebra:

Test Summary:                          | Pass  Total
Generic.Mat.can_solve_with_solution_lu |  262    262
Test Summary:        | Pass  Total
Generic.Mat.solve_ff |  200    200

signal (11): Segmentation fault: 11
in expression starting at /Users/lorenz/.julia/packages/AbstractAlgebra/yXuIT/test/generic/Matrix-test.jl:1832
ntuple at ./ntuple.jl:0
unknown function (ip: 0x282c1522f)
jl_apply_generic at /Volumes/Julia-1.7.0/Julia-1.7.app/Contents/Resources/julia/lib/julia/libjulia-internal.1.7.dylib (unknown line)
getindex at ./range.jl:373
typed_hvcat at ./abstractarray.jl:2038
hvcat at ./abstractarray.jl:2020 [inlined]
macro expansion at /Users/lorenz/.julia/packages/AbstractAlgebra/yXuIT/test/generic/Matrix-test.jl:1940 [inlined]
macro expansion at /Users/administrator/src/julia/usr/share/julia/stdlib/v1.7/Test/src/Test.jl:1283 [inlined]
top-level scope at /Users/lorenz/.julia/packages/AbstractAlgebra/yXuIT/test/generic/Matrix-test.jl:1833
jl_toplevel_eval_flex at /Volumes/Julia-1.7.0/Julia-1.7.app/Contents/Resources/julia/lib/julia/libjulia-internal.1.7.dylib (unknown line)
jl_toplevel_eval_flex at /Volumes/Julia-1.7.0/Julia-1.7.app/Contents/Resources/julia/lib/julia/libjulia-internal.1.7.dylib (unknown line)
jl_toplevel_eval_in at /Volumes/Julia-1.7.0/Julia-1.7.app/Contents/Resources/julia/lib/julia/libjulia-internal.1.7.dylib (unknown line)
eval at ./boot.jl:373 [inlined]
include_string at ./loading.jl:1196
jl_apply_generic at /Volumes/Julia-1.7.0/Julia-1.7.app/Contents/Resources/julia/lib/julia/libjulia-internal.1.7.dylib (unknown line)
_include at ./loading.jl:1253
include at ./client.jl:451
jl_apply_generic at /Volumes/Julia-1.7.0/Julia-1.7.app/Contents/Resources/julia/lib/julia/libjulia-internal.1.7.dylib (unknown line)
do_call at /Volumes/Julia-1.7.0/Julia-1.7.app/Contents/Resources/julia/lib/julia/libjulia-internal.1.7.dylib (unknown line)
eval_body at /Volumes/Julia-1.7.0/Julia-1.7.app/Contents/Resources/julia/lib/julia/libjulia-internal.1.7.dylib (unknown line)
jl_interpret_toplevel_thunk at /Volumes/Julia-1.7.0/Julia-1.7.app/Contents/Resources/julia/lib/julia/libjulia-internal.1.7.dylib (unknown line)
jl_toplevel_eval_flex at /Volumes/Julia-1.7.0/Julia-1.7.app/Contents/Resources/julia/lib/julia/libjulia-internal.1.7.dylib (unknown line)
jl_toplevel_eval_flex at /Volumes/Julia-1.7.0/Julia-1.7.app/Contents/Resources/julia/lib/julia/libjulia-internal.1.7.dylib (unknown line)
jl_toplevel_eval_in at /Volumes/Julia-1.7.0/Julia-1.7.app/Contents/Resources/julia/lib/julia/libjulia-internal.1.7.dylib (unknown line)
eval at ./boot.jl:373 [inlined]

@thofma
Collaborator

thofma commented Dec 6, 2021

The ntuple segfaults are not related to our stuff, but are known woes with Julia on M1 (https://github.com/JuliaLang/julia/issues?q=is%3Aissue+is%3Aopen+apple+label%3A%22apple+silicon%22).

@fingolfin
Member Author

Regarding the Bad input data, from a full non-crashing log, I see this:

MPolyQuo.ideals |   19     19
[ Info: The system has no solution.
Test Summary: | Pass  Total
msolve        |   13     13
Test Summary:   | Pass  Total

So it looks like this might be code in msolve by @ederc.

Regarding the GAP error building io: it calls autoheader, which it shouldn't. I don't understand why it would do that, but it should be easy enough to resolve. And indeed, the use of package JLLs should fix it. I'm on it.

@benlorenz
Member

> So it looks like this might be code in msolve by @ederc.

That error message does indeed look like one of those from msolve.
After removing the msolve test group all other Oscar tests succeed!

@ederc
Member

ederc commented Dec 10, 2021

Yes, this seems to come from msolve. Problem for me is that the error does not appear when applying the test cases to msolve directly (of course tested on an M1 machine). What is the easiest way for me to build some kind of Oscar on M1 such that I can see what happens in the testsuite inside julia? At the moment the problem seems to come from corrupted input data since the error happens when initializing data in msolve, before the real computations even start.

@thofma
Collaborator

thofma commented Dec 10, 2021

There is a machine sitting around in Kaiserslautern. The friendly system administrator on your floor will give you access.

@ederc
Member

ederc commented Dec 10, 2021

Getting access to a machine is not the problem, but how can I get Oscar, resp. its testsuite, working on it? I cannot get Oscar running as it is on 1.7 and M1 right now. What is the easiest way to get the testsuite to run?

@ederc
Member

ederc commented Dec 10, 2021

Ah sorry, my bad, I hadn't updated all my packages on the M1 system. Working on it over the weekend.

@ederc
Member

ederc commented Dec 11, 2021

The Bad input data error for msolve is fixed with #882.

@benlorenz
Member

I re-ran the tests and all Oscar tests pass now on the latest master, thanks everyone!
So I think we can close this ticket now.

The only thing that remains are the GAP packages for the GAP.jl tests but those have separate issues anyway.


6 participants