
Recipes which depend on blas and lapack #80

Closed
jjhelmus opened this issue Mar 7, 2016 · 22 comments

@jjhelmus
Contributor

jjhelmus commented Mar 7, 2016

I have a recipe that I am looking to submit that has compiled extensions which link to blas and lapack. Does anyone have experience building packages which depend on these, or have any suggestions on how to get them to build in a portable manner? I'm only looking to support Linux and OS X with this package, which hopefully makes the process a bit easier.

@pelson
Member

pelson commented Mar 7, 2016

No experience here I'm afraid. @msarahan any advice?

@jakirkham
Member

I have some experience with this. I was going to submit OpenBLAS with NumPy, SciPy, and scikit-learn support. I can also add cvxopt support, but will wait for things like FFTW ( #106 ) to go into feedstocks first. I can add numexpr to the mix, but haven't written that one yet. @abergeron and I wrote the OpenBLAS recipe in conda recipes, though I don't think I like the way I did things in that version and am contemplating how to rewrite it after discussions with @mcg1969. He has some good insight after the solver overhaul and the public release of mkl. In general, I am open to some discussion about improving this scheme going forward.

The main idea is that one should use a feature to select the underlying BLAS/LAPACK. This is how mkl works. Possibly two features to handle cases where one might want to use a different LAPACK with the BLAS, and such. As mkl has become publicly available, I have found that I have needed to require nomkl and track it in alternative BLAS/LAPACKs.

While features could be used for lots of different things, the main use case seems to be selecting an implementation of a common API like BLAS/LAPACK. Though I think this could be encoded a little more explicitly when choosing feature names. For instance, I kind of like the idea of all BLAS features being named blas_*, so there would be blas_openblas, blas_atlas, etc. Similarly this could be done with LAPACKs (i.e. lapack_*). There should be a way of fitting the existing mkl and nomkl into this schema, probably by implementing it above them, but I would like to see those become part of this scheme also. Going forward, this could be instructive in terms of how features get shaped in conda for these use cases, like better exclusionary power (only one BLAS-style feature at a time). A rough sketch of what this could look like in a recipe is below.
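To make the naming idea concrete, a minimal meta.yaml sketch of how a recipe might opt into such a feature (my illustration only; blas_openblas is the proposed convention, not an existing feature):

build:
  features:
    - blas_openblas      # this build requires the OpenBLAS variant of the feature

requirements:
  run:
    - openblas

A small tracker metapackage would then carry track_features: [blas_openblas], so that installing it switches the feature on, the same way nomkl works today.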

Any thoughts on this?

@patricksnape
Contributor

@jakirkham I like the proposal - though I would say that as a community guideline we should direct people towards using either MKL or OpenBLAS - since most people are not dogmatic about what BLAS they use, they just want it to be fast!

I've actually struggled with this before, because often recipes only require setting a key/path to the chosen BLAS library while the rest of the recipe stays the same. All the more reason (IMHO) that features should be passed through to scripts in some way, perhaps as ENV variables? It would make it easier to maintain one recipe and just switch on certain feature options - not that this repo has any control over that!

@pelson
Member

pelson commented Mar 23, 2016

@patricksnape - we can change the recipe behaviour based on external env vars if we need to...

$ cat meta.yaml
package:
  {% if os.environ.get('feature') == 'foo' %}
  name: foo
  {% else %}
  name: bar
  {% endif %}

$ feature=foo conda build .
BUILD START: foo--0

$ conda build .
BUILD START: bar--0

@jakirkham
Member

Nice proposal, @pelson. Is there already a spec for how we add and select these different features? Like a yaml file we add them to, or something? If not, we should definitely work one out before we start adding this stuff (my opinion at least 😄).

@patricksnape
Contributor

@pelson Awesome. This is the kind of stuff that would make a killer little blog post somewhere - really useful information, thanks. These kinds of 'best practices' would be great to compile in the Wiki or somewhere similar.

@183amir
Contributor

183amir commented Apr 12, 2016

Hey guys, assuming that we acquire a license for mkl, how do we compile against mkl?
I mean, do we install it on our CI machines? How do we handle the license and serial number? If we compile with mkl, can we distribute our binaries?
Would we need extra recipes for packages that build against mkl? How different would those be?

@jakirkham
Member

These are all great questions. Probably the first step would be to talk to people at Continuum and see how they handle this problem. After that we may need to talk with someone at Intel to understand the licensing constraints in this situation.

@msarahan
Member

Yes, you'd need to install it with the license key on the CI machines. I don't think you can bundle it outside of their installer (doing so would probably defeat their licensing software, if it would work at all). I don't know how permissible it is to install something like that in a docker image - or even whether it's possible to do their activation process in an unattended way. This is really a question for Intel. FWIW, with docker, you could install MKL in one image, and then create linked images with docker-compose so that you don't need the overhead of MKL for every build.
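For illustration, a rough docker-compose sketch of that layout (compose v1 syntax of the era; the image names are hypothetical):

mkl:
  image: mkl-base            # hypothetical image with MKL installed under /opt/intel
  volumes:
    - /opt/intel             # expose the install as a shared volume
builder:
  image: conda-builder       # hypothetical build image without MKL baked in
  volumes_from:
    - mkl                    # mount the MKL install from the other container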

Packages that link against MKL don't need to be different, other than providing a way of adding mkl as a dependency. This is the default in Anaconda, but it would be nicer to make a set of features for BLAS and then choose one. That would give mutual exclusivity, which might be what you want. These schemes are not completely supported in conda right now, though. I think the ideal would be some jinja2 placeholder for blas, where which blas gets used is a matter of configuration.
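As a sketch of that placeholder idea, driven by an environment variable in the spirit of @pelson's example above (BLAS_IMPL is a hypothetical variable name, not an established convention):

requirements:
  build:
    - {{ os.environ.get('BLAS_IMPL', 'openblas') }}
  run:
    - {{ os.environ.get('BLAS_IMPL', 'openblas') }}

Running BLAS_IMPL=mkl conda build . would then produce the MKL variant, while the default build would use OpenBLAS.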

@msarahan
Member

Also, something I didn't express clearly: the hard issue here is installation of MKL on the build side of the story. Redistribution of runtimes is permissible (please check the license yourself, but I'm pretty sure this is the case).

@183amir
Contributor

183amir commented Apr 13, 2016

I also wanted to know whether we should compile with the same mkl version that, for example, numpy was compiled with, if we link against numpy.

@jakirkham
Member

Basically, let's stick with nomkl for now. The problem is that we don't get headers in the mkl case. We can look into what Intel would allow us to do with mkl.

Ideally, though, I would like to have us use something like OpenBLAS. This is already what nomkl means on Linux. Accelerate on Mac has well-known issues, so it would be wise to switch to OpenBLAS. On Windows there is no nomkl, so we should make it OpenBLAS too. The first step would be to add a working OpenBLAS package. I will try to get on this soon.

Once we have that we can discuss the implications for features and feedstocks going forward.

@mcg1969
Contributor

mcg1969 commented Apr 13, 2016

The problem here is that features really don't provide mutual exclusivity by themselves. In theory you can install the nomkl feature and still have the MKL libraries installed. And to be honest we ought to allow OpenBLAS and MKL to be installed in the same environment; we just need to make sure that only one is selected within a given process.

We need the key-value functionality for features that we discussed in another thread, or my metapackage solution we discussed elsewhere. Consider this second option for a moment. We would create a set of metapackages with the name "python_blas". One version of this package would be built for each blas variant: python_blas-mkl-0.tar.bz2 for MKL, python_blas-openblas-0.tar.bz2 for OpenBLAS, etc.

Then any package that needs to link to BLAS directly builds different variants for each BLAS type and includes the corresponding version of the python_blas package as a dependency, as well as the BLAS library itself. Having to include both is a bit messy, so an alternative would be for python_blas itself to have those dependencies built in. This complicates the versioning strategy a bit, but it's still doable.

This approach utilizes the natural mutual exclusivity offered by package names to ensure that only one BLAS is being used in a given Python dependency chain.
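For concreteness, a sketch of what one such metapackage might look like (my illustration of the proposal above, not an existing recipe; the version field encodes the BLAS variant, matching the python_blas-openblas-0.tar.bz2 naming):

package:
  name: python_blas
  version: openblas    # one build per variant: openblas, mkl, atlas, ...

build:
  number: 0

A downstream package built against OpenBLAS would then depend on both the metapackage variant and the library itself:

requirements:
  run:
    - python_blas openblas
    - openblas

Since only one python_blas can be installed at a time, the package name provides the mutual exclusivity.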

@jakirkham
Member

And to be honest we ought to allow OpenBLAS and MKL to be installed in the same environment; we just need to make sure that only one is selected within a given process.

Sorry to derail this a bit. I have heard you bring this up before, @mcg1969, but I'm having trouble understanding why one would want to do this. What is the use case here? Are there some cases you have encountered where this is helpful?

@mcg1969
Contributor

mcg1969 commented Apr 13, 2016

I'm thinking of cases like, say, Python & R in the same environment. Why require them both to link to the same BLAS? I mean, sure, it would be convenient, but it puts more burden on the builders and the users. If it happens to be convenient to use MKL by default with Python and OpenBLAS by default with R, well, why not?

The argument is more clear with C runtime libraries. It's not easy to ensure that every program you're using in an environment links to the same C runtime. But obviously we need to make sure that every Python package does.

@jjhelmus
Contributor Author

Redistribution of runtimes is permissible

Permissible, yes; allowed with some/many open source licenses, no. Refer to the Intel EULA for details, but from what I recall, the last time the NumPy developers talked about distributing MKL-linked NumPy binaries the sticking points were:

  • Section 3(C), which includes a restriction against "reverse engineer, decompile, or disassemble the Materials;" and specifically excludes the use of MKL with software which would become subject to an "Excluded License", which explicitly mentions the GPL, LGPL, MPL, and CPL. BSD-licensed software seems to be alright, but the resulting binary would not be BSD.
  • The indemnification of Intel against any damages in section 8.

<Free Software Soapbox>
I will further add that any software distributed under the MKL license does not meet the definition of Free Software, as it restricts at least two of the four essential freedoms, specifically the freedom to study how the program works (freedom 1) and the freedom to distribute modified versions to others (freedom 3).
</Free Software Soapbox>

@jakirkham
Member

I'm thinking of cases like, say, Python & R in the same environment. Why require them both to link to the same BLAS? I mean, sure, it would be convenient, but it puts more burden on the builders and the users. If it happens to be convenient to use MKL by default with Python and OpenBLAS by default with R, well, why not?

I am concerned about performance issues here, but would need to think a bit more to come up with a reasonable example.

Though, as @jjhelmus mentions, licensing is a concern. I regularly interact with GPL programs that need a BLAS and NumPy. IANAL, but I feel like having MKL around is a murky area, especially as there would be linkage through to the GPL'd library.

The argument is more clear with C runtime libraries. It's not easy to ensure that every program you're using in an environment links to the same C runtime. But obviously we need to make sure that every Python package does.

Doesn't this become a slippery slope? How can we be sure that some C/C++ library isn't later used by some Python library with C/C++ bindings? It seems like it would be very hard to keep such interfaces between different C runtimes from ever showing up.

@mcg1969
Contributor

mcg1969 commented Apr 13, 2016

I'm not going to address the licensing concerns. We would have these same packaging & dependency issues with BLAS or C runtimes no matter what the licenses are.

I am concerned about performance issues here, but would need to think a bit more to come up with a reasonable example.

Performance issues certainly matter, but convenience does too. We can't necessarily control who is building every package we would like to use. So as long as running the two BLAS implementations in separate processes doesn't break anything, we should not prevent conda from installing them into the same environment.

How are we sure that some C/C++ library isn't later getting used by some Python library with C/C++ bindings?

Well, if dependencies aren't set correctly, there's nothing we can do. What we want here is the ability to get those dependency relationships right, and hopefully we can instruct people to do so, or find ways to automate those determinations in conda build.

But the ship has sailed on multiple C runtimes. We simply cannot synchronize on a single C runtime within conda environments involving mixed platforms like Python, R, lua, node, etc.

@mcg1969
Contributor

mcg1969 commented Apr 13, 2016

In fact, the C runtime problem is really the controlling example here. We don't have a choice but to get that one right, and if it helps us solve the BLAS problem too, even better.

@183amir
Contributor

183amir commented May 4, 2016

Hey guys, I went ahead and tried to compile bob.math with mkl and it seems to be working. Here is my recipe:

{% set version = "2.0.3" %}

package:
  name: bob.math
  version: {{ version }}

source:
  fn: bob.math-{{ version }}.zip
  url: https://pypi.python.org/packages/source/b/bob.math/bob.math-{{ version }}.zip
  md5: 0f010af6ce20fe6614570edff94e593f

build:
  number: 3
  skip: true  # [win]
  script: python -B setup.py install --single-version-externally-managed --record record.txt
  script_env:
   - LD_LIBRARY_PATH
   - LIBRARY_PATH
   - MIC_LD_LIBRARY_PATH
   - NLSPATH
   - CPATH

requirements:
  build:
  - python
  - setuptools
  - bob.core
  - boost
  - cmake
  - numpy x.x
  - pkg-config

  run:
  - python
  - bob.core
  - boost
  - numpy x.x

test:
  requires:
  - nose

  imports:
  - bob
  - bob.math

  commands:
  - nosetests -sv bob.math

about:
  home: http://github.com/bioidiap/bob.math
  license: Modified BSD License (3-clause)
  summary: LAPACK and BLAS interfaces for Bob

extra:
  recipe-maintainers:
  - 183amir

I installed mkl in our docker image with a student license.
At first I tried to add mkl as a feature and dependency, but it looks like that is not what Continuum does anymore.
To make the environment variables available, I ran something like:

$ source /intel/mkl/bin/mklvars.sh -v intel64

Now, what I don't like about this recipe is this part:

build:
  script_env:
   - LD_LIBRARY_PATH
   - LIBRARY_PATH
   - MIC_LD_LIBRARY_PATH
   - NLSPATH
   - CPATH

I guess having a package like mkl-nonfree would make the process a lot easier.
I think if we want to add mkl to conda-forge, we need to create a private repository for mkl-nonfree and not upload it to anaconda.org, but make it available in our CI builds so it can be used at compile time. We would probably have to apply for the open source license.
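If such a package existed, the recipe above could drop the script_env block in favor of an ordinary build dependency, roughly (mkl-nonfree is hypothetical, per the suggestion above):

requirements:
  build:
    - mkl-nonfree      # hypothetical package carrying the MKL headers and libraries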

@jakirkham
Member

So we have a way to do BLAS now and it works ok. Details can be found in this hackpad. There is certainly room for growth, but it will probably involve an enhancement proposal (once that framework is ironed out).

@jakirkham
Member

Closing, but feel free to discuss more as appropriate.

@jakirkham jakirkham removed their assignment Mar 29, 2018