ATLAS: multiple definition of `ATL_SetAtomicCount' #15045
Comments
comment:1
I'm pretty sure this is http://sourceforge.net/p/math-atlas/support-requests/907/ |
comment:2
It's probably not something that is reproducible on modern hardware, so progress upstream has been slow. For now, I propose we just plow ahead and fall back to static libraries. |
Author: Volker Braun |
Attachment: atlas-p3-p4.diff.gz diff for review only |
Upstream: Reported upstream. No feedback yet. |
comment:4
Replying to @vbraun:
2009 might not meet your definition of "modern", but it's far from ancient either. |
comment:5
Replying to @vbraun:
If static libraries always work, why don't we always only use static libraries? Conversely, if static libraries don't always work, wouldn't this "fall back" cause other problems? |
comment:6
Because static libraries suck. It won't cause other problems for now. If we want to use the upstream shared libraries (with a different naming convention than the static libraries) then we'll run into troubles. But by then we hopefully have found a way to not hardcode atlas/blas/lapack library names in the Sage build system. |
comment:7
It still feels bad to build the shared libraries "sometimes" depending on some non-deterministic condition. If static libraries work (even if they don't work so well), I think it's better to stick with them in all cases. |
comment:8
I would agree with you if we wanted to stick with a hard-coded ATLAS in Sage until the end of time. But IMHO we should work towards a more configurable solution that will just fall back to reference / openblas as appropriate if atlas doesn't build. And perhaps not build atlas by default if it takes too long. And then static libraries would be a huge pain in the butt. |
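The "fall back as appropriate" idea can be sketched as an ordered list of providers that are tried in turn. This is only an illustration and not code from the Sage build system: the function names, the provider labels and the stand-in build step are all hypothetical.

```python
def first_working_blas(providers, try_build):
    """Return the first provider whose build succeeds, in preference order."""
    failures = []
    for name in providers:
        try:
            try_build(name)
        except RuntimeError as err:
            # Remember why this provider was skipped, then try the next one.
            failures.append(f"{name}: {err}")
            continue
        return name
    raise RuntimeError("no BLAS provider could be built: " + "; ".join(failures))


def demo_build(name):
    # Hypothetical stand-in for a real build step; here ATLAS is assumed to fail.
    if name == "atlas":
        raise RuntimeError("multiple definition of `ATL_SetAtomicCount'")


if __name__ == "__main__":
    print(first_working_blas(["atlas", "openblas", "reference"], demo_build))
```

Because the order is fixed and the outcome is returned explicitly, a build log can record which provider was actually chosen, which speaks to the predictability concern raised in this thread.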
comment:9
Replying to @vbraun:
If you want to work towards that goal, then the problem on this ticket must be fixed in a proper way anyway. I don't see how using shared libraries "sometimes" is closer to the stated goal than "never" using shared libraries. I am particularly bothered that the choice is non-deterministic and difficult to predict, so Sage builds on the same machine will be substantially different (one with shared ATLAS and one with static ATLAS). |
comment:10
Once we have an alternative to ATLAS that we can use, we disable the static library fallback for ATLAS. It's easy. Until then, we have this crutch. I agree that it's ugly as sin, but not shipping the new ATLAS would be a mistake imho. I just don't want to work on changing the blas build system in Sage until we have switched to git. It'll be much easier if I don't have to copy tarballs around just for a change to the build script. |
comment:11
Replying to @vbraun:
I am not saying that we shouldn't ship the new ATLAS. I am proposing to always use the static ATLAS libraries instead of only when the linker problem appears. |
comment:12
Replying to @jdemeyer:
I think that just means more changes to undo later. And mind you, the shared libraries apparently build fine on most systems; it's just some AMD platforms where presumably the performance impact of holding mutexes is different. |
comment:13
Replying to @vbraun:
Really? If you comment out the current approach, then it's essentially just removing |
comment:14
Replying to @vbraun:
So far, Sage has worked equally well for all CPUs of a given architecture, let's keep it that way. |
comment:15
As long as the library names are the same (which is currently the case), it does not make a difference for the build system if the library is shared or static. Deliberately disabling shared libraries would just make it suck for everybody, and not just for the small percentage where it's the only option. In other words, I'm against it ;-) |
comment:16
Replying to @vbraun:
And I'm against more special cases (especially non-deterministic) in the Sage build system ;-) |
comment:17
Noted, but the ATLAS spkg has always been trying different ways to build and falling back if they fail. Which always sucked, but the problem is that we don't have a deterministic alternative to deploy. |
comment:18
Replying to @vbraun:
Those "different ways" only were about tuning parameters and CPU instruction sets, right? Which is far less fundamental than static vs. dynamic libraries. Imagine we go with your solution and in the future somebody decides to change the Sage build system or some package such that it only works for a dynamic ATLAS library. Initial testing might reveal that everything works, even though it doesn't work in case your ATLAS spkg decides to use static libraries. This is what I want to avoid. |
comment:19
I disagree. Pretty much the only way to achieve your scenario would be to hardcode the shared library file name somewhere. But those already differ on Linux vs. OSX. What I'm mainly objecting to is the basic premise that we should keep this monstrosity alive for any longer than necessary. Things have to be fixed either in ATLAS or by fixing the Sage BLAS build system. As soon as that is done, we can switch off the static library fallback again. |
comment:20
Replying to @vbraun:
If there is no observable difference between shared and static libraries, then why do you care so much about shared libraries?
The problem is that it needs to be kept alive in any case precisely because of this ticket. |
comment:21
There is no difference at the linker command line. There is of course a price to be paid later in that you can never benefit from blas/atlas upgrades if you have linked statically. This ticket is just a workaround until we have a better fix. Unless you think we don't need a workaround for now, in which case I'm happy to wait for either upstream to fix it or we have switched to git... |
comment:22
I asked |
comment:23
For atlas issue 907, the relevant code would be:
Looking at
The file I suspect the problem is that
which presumably runs the code in
Perhaps just patch the system?
ATLAS/tune/threads/tune_count.c:244
- printf("\nNO REAL ADVANTAGE TO ASSEMBLY, FORCING USE OF MUTEX\n");
- ATL_assert(!system("make iForceUseMutex"));
+ printf("\nNO REAL ADVANTAGE TO ASSEMBLY, BUT WE LEAVE IT IN ANYWAY\n");
I think there is some hope that this will eliminate the (rare) build failures. |
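To illustrate why such a patch could remove the non-reproducibility, here is a small Python model of the decision. It assumes, as the printed message suggests, that the tuner compares a timing of the assembly implementation against the mutex implementation; the function name, the `min_speedup` threshold and the timings are made up for illustration and are not taken from `tune_count.c`.

```python
def choose_atomic_count_impl(asm_time, mutex_time, min_speedup=1.0, patched=True):
    """Toy model of the tuning decision; returns 'assembly' or 'mutex'.

    min_speedup is a hypothetical threshold, not a value from ATLAS.
    """
    asm_is_faster = mutex_time > min_speedup * asm_time
    if asm_is_faster or patched:
        # Patched behaviour: keep the assembly version even without a clear win.
        return "assembly"
    # Unpatched behaviour: no real advantage, so the mutex version is forced.
    return "mutex"


if __name__ == "__main__":
    # Two runs on the same (hypothetical) machine with slightly noisy timings.
    for asm, mutex in [(1.00, 1.02), (1.00, 0.99)]:
        print(
            choose_atomic_count_impl(asm, mutex, patched=False),
            choose_atomic_count_impl(asm, mutex, patched=True),
        )
```

In this model the unpatched choice flips when the two timings are nearly equal, while the patched choice no longer depends on the measurement at all.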
comment:24
I definitely cannot work on that at the moment, but this is really reminiscent of #10508 comment:455. OK, I just read the ticket description, and our workaround was working because of our custom way of building shared libraries. Now that we also build the upstream shared libraries, our workaround is not enough anymore... |
comment:25
From what I remember, the main problem is that ATLAS builds a (static) lib for tuning and then just updates it to get the final (static) library, so it ends up including two object files with the same symbols (because of the atomic.inc magic, which does not include the same pieces of code in the tuning and in the final phases, and because old object files of the tuning phase are still lurking around when the final phase is going on). It presumably does so because updating the static archive is faster than rebuilding everything from scratch. So another solution would be to erase the static lib built for tuning (and the corresponding object files) when the tuning phase is finished and let everything be built again when the real building phase starts. |
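The situation described here, one archive holding two object files that define the same symbol, can be checked mechanically. The following is a minimal sketch and not part of the spkg: it parses text in the layout that GNU `nm` prints for a static archive (a `member.o:` header followed by `address type name` lines) and reports global symbols defined by more than one member. The member names in the sample listing are hypothetical.

```python
from collections import defaultdict

# nm type letters for globally defined symbols (text, data, bss, read-only data).
DEFINED_TYPES = {"T", "D", "B", "R"}


def duplicate_definitions(nm_output):
    """Map each multiply-defined global symbol to the archive members defining it."""
    definers = defaultdict(list)
    member = None
    for line in nm_output.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.endswith(":"):
            # Archive member header, e.g. "foo.o:"
            member = line[:-1]
            continue
        parts = line.split()
        # Defined symbols have three fields; undefined ones ("U name") have two.
        if len(parts) == 3 and parts[1] in DEFINED_TYPES:
            definers[parts[2]].append(member)
    return {sym: objs for sym, objs in definers.items() if len(objs) > 1}


# Hypothetical listing: a stale tuning-phase object next to the final one.
SAMPLE = """
tune_mutex.o:
0000000000000000 T ATL_SetAtomicCount

final_asm.o:
0000000000000000 T ATL_SetAtomicCount
                 U pthread_mutex_lock
"""

if __name__ == "__main__":
    for sym, objs in duplicate_definitions(SAMPLE).items():
        print(sym, "defined in", ", ".join(objs))
```

Duplicates of this kind can stay unnoticed when programs link against the static archive, since the linker typically pulls in only the first member that satisfies a reference, and tend to surface once every member is combined into a shared library.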
comment:26
See #10508 comment:446 which might be clearer than the above :) |
comment:27
Replying to @nbruin:
So would it be a plan to
so that our logfiles at least tell us what the performance penalty is (well, there's none because ATLAS doesn't build dynamic libs without). |
comment:28
I've made the change to force |
Attachment: atlas-p4-p5.diff.gz diff for review only |
comment:29
Congratulations on getting this fixed. Just for the record: I think the mutex code can still get built on platforms where no assembly alternative is available. In that case, I don't think we should particularly expect conflicting symbols either, because there are no other implementations available that the mutex code can conflict with in that case. This is why I figured it was best to patch the code that takes action depending on the tuning result and not the target in the makefile. The comment in the SPKG.txt doesn't strictly contradict this, but I had to read the comment a second time to convince myself of that. |
comment:30
Nils: do you want to formally review this ticket? |
Reviewer: Nils Bruin |
comment:31
I have no idea what p4 is about but the change from p4 to p5 as posted on this ticket looks pretty reasonable. I haven't built or downloaded this new package, but other people have and it's a blocker, so a positive review helps things along. |
Merged: sage-5.12.beta5 |
This non-reproducible problem which can occur during the ATLAS build (with atlas-3.10.1.p3.spkg from #14754) was supposed to be fixed but it actually is not fixed:
full ATLAS build log
Upstream: http://sourceforge.net/p/math-atlas/support-requests/907/
New spkg: http://boxen.math.washington.edu/home/vbraun/spkg/atlas-3.10.1.p5.spkg
Upstream: Reported upstream. No feedback yet.
CC: @jpflori
Component: packages: standard
Author: Volker Braun
Reviewer: Nils Bruin
Merged: sage-5.12.beta5
Issue created by migration from https://trac.sagemath.org/ticket/15045