Add Travis-CI config to build aarch64 wheels #54
Conversation
@bashtage Any suggestions?
One possibility is that the linear algebra library might not be present or may be slow. Looking at the log it seems to take 8 minutes to install statsmodels, so I don't think the CPU is that fast. You could just disable tests on ARM, or run a small subset of tests.
It takes about 2x as long as it does on AMD64. This is going to make the run far too long to pass in a 50-minute window if that performance is consistent.
@bashtage How can I verify the presence/absence of the linear algebra library? Is there any mention of it in the logs?
It is probably there. You should set the build dependency for NumPy to the latest 1.19.x that has an aarch64 wheel. It looks like it is upgrading to 1.19.2, but the build dependency is 1.18.5.
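A sketch of what that pin could look like in a multibuild-style .travis.yml. The variable names (`BUILD_DEPENDS`, `TEST_DEPENDS`) follow the MacPython wheel-repo convention, and the exact specifier and surrounding packages are assumptions, not taken from this repo's actual config:

```yaml
# Hypothetical .travis.yml fragment: pin the NumPy build dependency to the
# newest 1.19.x release so pip resolves an existing aarch64 wheel instead of
# compiling NumPy from source inside the container.
env:
  global:
    - BUILD_DEPENDS="cython numpy==1.19.* scipy"
    - TEST_DEPENDS="pytest pytest-xdist numpy==1.19.*"
```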
Could you move the x86 builds off Travis, please? You can move them to Azure like scipy-wheels (and this file too), or to GitHub workflows like dipy-wheels.
@bashtage I ran the wheel build process in a local Graviton2 instance using "--durations=20".
Run 2:
It looks like test_mle takes the longest time. Any suggestion on what this might be due to?
Run 3:
Not really sure what's going on. Each run has a different run time, and the longest-running tests are not the same.
These all make use of compiled code generated from Cython. This seems extremely slow; 2953.54s is insane. Are optimizations turned off? Or do you have Cython line tracing for coverage enabled?
Add the environment variables that cap threading.
Right now the universal (`any`) Cython wheel is being used. There is an arch-specific wheel for x86 but not yet for aarch64. Would that cause performance issues? I'll also check with the environment variables.
No. I think the environment variables may be the issue. It is possible to end up with ncpu by ncpu threads all thrashing each other if tests are run in parallel.
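The arithmetic behind that thrashing, as a runnable illustration. The counts are assumptions about defaults: `pytest -n auto` starts one worker per core, and an OpenBLAS that has not been capped also defaults to one thread per core in each worker:

```shell
# With N cores, N pytest-xdist workers each driving an N-thread BLAS gives
# N*N runnable threads competing for N cores.
NCPU=$(nproc)
echo "$NCPU workers x $NCPU BLAS threads = $((NCPU * NCPU)) threads on $NCPU cores"
```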
@bashtage Looks like the tests are still taking their sweet time. https://travis-ci.com/github/janaknat/statsmodels-wheels/builds/208604526
You might also try replacing
OK. I was initially trying with
You need to set the number of threads to 1 when you are using many processes on many cores. |
So these are needed (probably only the first, but why not).
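The variables in question are presumably the usual BLAS/OpenMP thread caps; a hedged sketch, since the exact set that matters depends on which BLAS NumPy and SciPy were built against:

```shell
# Cap every threading layer at one thread per process, so that parallel
# pytest-xdist workers do not oversubscribe the cores.
export OMP_NUM_THREADS=1        # OpenMP, and OpenBLAS builds that use OpenMP
export OPENBLAS_NUM_THREADS=1   # pthreads builds of OpenBLAS
export MKL_NUM_THREADS=1        # MKL, if that is the BLAS in use
```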
I've exported those environment variables: https://travis-ci.com/github/janaknat/statsmodels-wheels/jobs/459594184#L181
@bashtage I ran the tests on a local Graviton2 instance with '-n auto'. It's been running for over 2 hours now.
Output of run with '-n auto':
Can you post a build log to a gist? What compiler is used? Does it know about Graviton, or is it generating generic aarch64 code? Have you tried using whatever the state-of-the-art compiler is?
@bashtage I'm using multibuild to build the wheel. I have a bash script that closely resembles .travis.yml.
multibuild uses the pypa/manylinux2014_aarch64 container to build the wheels.
I can't find anything about the FPU on N1, not even something simple like a SPEC fp rate. I suspect it is very weak. |
Something is wrong with the multibuild environment. I ran the statsmodels tests directly on an AWS instance outside of multibuild, and the same tests ran about 500x faster.
Try taking the wheel file produced by multibuild and running the tests against it directly. You can run the test suite,
assuming you have installed statsmodels and its dependencies (I just installed everything as root).
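A hedged sketch of that workflow; the wheelhouse path, the wheel filename pattern, and the pytest flags are all assumptions rather than details from this build:

```shell
# Install the wheel multibuild produced, then run the test suite against the
# installed package rather than the source tree. Prints a note instead of
# failing if no matching wheel is present.
WHEEL=$(ls wheelhouse/statsmodels-*aarch64.whl 2>/dev/null | head -n 1)
if [ -n "$WHEEL" ]; then
    pip install "$WHEEL" pytest pytest-xdist
    python -m pytest --pyargs statsmodels -n auto --durations=20
else
    echo "no aarch64 statsmodels wheel found in wheelhouse/"
fi
```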
Or if you get the wheel and upload it somewhere I can try. |
A full run on an 8 vCPU Graviton:
The slowest tests are totally different from yours.
maybe the "local Graviton2 instance" is some kind of qemu emulation? |
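One quick heuristic for testing that suspicion. It rests on an assumption about QEMU user-mode (binfmt) emulation in containers, not on anything in the build log: `uname -m` reports the emulated architecture, while `/proc/cpuinfo` still comes from the host kernel, so ARM-specific fields such as "CPU implementer" are missing:

```shell
# Compare the reported architecture against ARM-specific /proc/cpuinfo fields.
ARCH=$(uname -m)
if [ "$ARCH" = "aarch64" ] && ! grep -q "CPU implementer" /proc/cpuinfo; then
    RESULT="aarch64 userland without ARM cpuinfo fields: likely QEMU emulation"
else
    RESULT="architecture $ARCH, cpuinfo looks consistent"
fi
echo "$RESULT"
```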
That would explain it. I think NumPy builds its aarch64 wheels on a different platform.
What happens on travis if you copy NumPy? https://github.com/MacPython/numpy-wheels/blob/9bfb24a19e922d59b3109dd911a4486b1a7d1630/.travis.yml |
I ran the test again with 1 thread in pytest. Of the 3 Python builds, 2 pass. In Run 1, Python 3.8 failed to finish; in the next run, Python 3.7 failed to finish. There is a known issue with pthreads on aarch64. I'm checking whether the test container (multibuild/xenial_arm64v8) has a version that does not contain the fix. Run 1: https://travis-ci.com/github/janaknat/statsmodels-wheels/builds/209443060
@janaknat GH actions seems to support ARM, so it might be possible to do ARM along with the rest. https://azure.microsoft.com/en-gb/updates/azure-devops-pipelines-introduces-support-for-linuxarm64/
@bashtage I believe that is support for self-hosted runners?
Apparently on the roadmap for Q1 2021: |
Any news on this? Building statsmodels takes upwards of 1 hour on an M1 MacBook. Can I help with debugging/diagnostics?
No; it seems to take too long to use CI.
@bashtage The last run finished but had around 40 failures.
I see that Debian has a pre-built artifact in their repositories and I'm trying to hack my way around
conda-forge supports osx-arm64 with statsmodels. This appears to be the best choice, and should be easy to install.
Travis-CI allows for the creation of aarch64 wheels.
However, the tests are failing because of the 50-minute timeout. The pytest command says it is using "--skip-slow", yet the tests are really slow. Since arm64-graviton2 is being used, a slow CPU is less likely to be the cause.
Any suggestions on fixing this?
Build log: https://travis-ci.com/github/janaknat/statsmodels-wheels/builds/206644851