-
Notifications
You must be signed in to change notification settings - Fork 20
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update CMake build of ExaTensor #1
Comments
Been thinking about this. We need to set it up so that CMake will use the ExaTensor environment variables by default, i.e. you prepend your cmake command with BLASLIB=... cmake .., for example. You can do this with the CMake $ENV{BLASLIB} syntax. So check if its available, and if not, use defaults for the given platform. |
Okay, so I guess this is more or less what is already done for the OSX build if I understand correctly. Currently in the ExaTensor CMake file the OSX build grabs the e.g. CPP compiler from an environment variable in these lines to use in the build, whereas in the linux targets in e.g. this line the values are hard coded in. So all of these hard coded variables should have some reference variable similar to the current OSX target lines. |
Yes that's right. |
From what I see now in the build system, the ExaTensor MPI path is imported from CMake, which is good. However, the BLAS library path in general includes more than one path, depending on the BLAS library implementation. ExaTensor currently supports four different BLAS libraries: ATLAS, MKL, ACML, and ESSL. Each of them requires its own way to set either one or multiple paths for linking. Ideally, we need to import this information from CMAKE as well and then pass to ExaTensor. Otherwise, we will have to configure ExaTensor separately, risking to mix different BLAS libraries (e.g., ExaTensor built with MKL while exatn linking with ATLAS). |
Also, I think it is better to make EXA_TALSH_ONLY=YES by default in exatn build system. I sent an email last week with the following suggestion for differrnt build options (from the simplest to the hardest):
|
Implementation-wise, I would suggest the following:
|
Okay, I think this makes sense - the issue is ensuring that user defined implementations of MPI/BLAS don't conflict with each other in exatn and exatensor. Additionally, it should be ensured that user definitions don't conflict with default CMake assumptions.
I certainly agree with this as when I was starting last week trying to build on my OSX environment part of the struggles of building was ensuring that the compiler between MPI and exatn were matching.
I will work on this implementation. Additionally we should add this in the readme so that it is transparent to the user what the CMake configuration is or is not assuming if environment variables are or aren't declared. |
By the way, at the moment the build requires MPI and BLAS to be found, see this line. I assume then that this can be changed in order to implement your suggestion? I'm not sure if these lines were put here for a historical reason, or something else. |
You can remove the REQUIRED arg on those |
But look through the other CMakeLists files and make sure that if something like MPI_CXX_INCLUDE_DIRS is used you update to wrap it with if(MPI_FOUND) or something like that |
Right, this is why I asked - removing these lines completely for example causes the build to fail likely for the reasons you suggest. |
In branch devel_dil, I plugged in the TAL-SH backend to the tensor runtime package and most tests failed the linking step because our CMAKE files are not configured to address the dependencies of the TAL-SH library: CUDA libraries (if CUDA is enabled) and OpenMP runtime library (compiler specific). The devel_dil branch currently does not build. This is a stopper for me as I can't continue development until this works. We need to somehow resolve the systematic misconfiguration of CMAKE files throughout. I began fixing CUDA libraries part, but realized that I have no idea how to add OpenMP runtime library which is compiler specific. I see in a number of places where we hardcoded libgomp, but this is GNU only and will fail with other compilers. Any ideas how to fix this systematically throughout would be appreciated. |
I wouldn't call this a misconfiguration of CMake. This is a new addition, we just have to update the CMake build system to enable it. There is no systematic way to avoid these issues. We just have to implement the CMake code to enable new additions to the build as the come online. As for OpenMP, cmake provides
@osbornjd can help address these link issues tomorrow. |
I see, ok, will try tomorrow again (with openmp). |
One issue we'll have to figure out is that under this model, we have introduced a cyclic dependency: numerics depends on runtime (num_server -> tensor_runtime), but runtime also depends on numerics (graph -> tensor_operation). One proposal to break this would be to move Numerics Server to the exatn package. |
I say this because I think this is whats causing the errors on devel_dil:
Adding
|
Yes, agree, we need to move NumericsServer from numerics to exatn. Then NumericsServer will properly depend on numerics and runtime packages. |
We need to update tpls/CMakeLists.txt to provide a more robust build of the ExaTensor submodule.
The text was updated successfully, but these errors were encountered: