Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

upgrade CMakeLists.txt to use external copies of dependencies & no downloading #19

Open
7 tasks
mr-c opened this issue Oct 1, 2015 · 19 comments
Open
7 tasks

Comments

@mr-c
Copy link
Contributor

mr-c commented Oct 1, 2015

Hello!

I'm packaging salmon and many of its dependencies for Debian in support of blahah/transrate#160

I have a messy patch to enable the use of external libraries instead of bundled or downloaded copies at http://anonscm.debian.org/cgit/debian-med/salmon.git/plain/debian/patches/dependency-fix

As I'm not a CMake expert I was only able to make it work for Debian instead of a generic solution that would fall back to the shipped copies or downloading as it is now.

Perhaps you all are better with CMake than I am? A generic solution would be best so I don't have to adjust the patch with every change to the CMakeLists.txt

Specifically it would be great to support

I also have a patch to support the latest release of jellyfish that I can turn into a pull request, should you want it: http://anonscm.debian.org/cgit/debian-med/salmon.git/plain/debian/patches/jellyfish-update

Thanks!

@rob-p
Copy link
Collaborator

rob-p commented Oct 1, 2015

Hi @mr-c,

First of all, thank you for your monumental effort in supporting salmon as a Debian package. I think this is fantastic. I'm also not a CMake expert, but I've failed with it enough times to start to get a handle. I have actually moved to the latest version of Jellyfish (v2.2.3) in the quasimapping branch, which will be merged back into master as soon as we've finished porting the bootstrapping feature from sailfish-master.

Regarding supporting external versions of the libraries — this absolutely makes sense. What is the standard location where they are assumed to be installed? In this case I can search with a FindPackage before I attempt to download them. From the list above, I see two potential problem libraries:

  • bwa — We actually use a modified version of bwa that accepts and uses an extra parameter that specifies the suffix array sampling frequency. Specifically, we, by default, use a denser sampling of the suffix array to trade off extra space usage for more speed in lookup. The standard bwa, therefore, probably wouldn't work.
  • jellyfish — I actually tried to use jellyfish without obtaining the source and building it early on in development. I ran into an issue where the config.h file generated during compile wasn't installed by Jellyfish, and this caused runtime failures when Salmon was running. It seems to me that either (1) config.h should be installed with jellyfish by default or (2) it shouldn't be necessary to use jellyfish as a library. However, as far as I know, this issue persists in the latest version of jellyfish (if you want to use it as a library as we do, and not just as a k-mer counter).

For the remaining libraries, we just use the standard versions, so this should be OK.

@mr-c
Copy link
Contributor Author

mr-c commented Oct 2, 2015

Hey @rob-p, you are quite welcome; thanks for the warm response.

Being system installed packages most headers and dynamic libraries will be installed using the standard prefixes: /usr/include, /usr/lib/${multiarch-tuple}/,

I've updated the checklist above with links to the file lists so you can see the paths yourself.

Interestingly I was able to build, run, and pass all the included tests using the version of BWA in the Debian archive. As for Jellyfish I had to update our package to include 'json.h' which had gotten dropped.

@sjackman
Copy link
Contributor

sjackman commented Mar 7, 2016

cmake fails when extracting external dependencies for me. See #10. I'd also like to be able to use the existing installed versions of dependencies. For me they're installed by Linuxbrew.

@rob-p
Copy link
Collaborator

rob-p commented Mar 8, 2016

Hi guys,

Which existing dependencies would you like to be able to use? There are some of these libraries that cannot be replaced by already installed variants. Specifically,

  • BWA --- since the version that is pulled in and used actually requires we expose certain functionality for our lightweight alignment procedure (though this dependency may go away all together if we deprecate lightweight alignment in favor of quasi-mapping).
  • Jellyfish --- here, we require the ability to use jellyfish as a library. Specifically, we rely on some headers that are not installed with the standard package. Perhaps here there could be some synergy with Guillaume on making all of the things Salmon uses part of the standard Jellyfish install, but, at least currently, this isn't the case.

The CMake build system already looks for existing versions of the following before fetching them:

  • Boost
  • tbb
  • jemalloc

So, the the remaining guys are libgff (which is just some small libraryification of a gff parser that I put together a while ago, I don't know that it's in any package manager --- is it? It doesn't even have an associated install script) and staden IO lib. For Staden, I'd be happy to have it look for an existing installation, but there is no FindStaden.cmake that I know of, and I don't really know how to write FindX.cmake files appropriately. However, I'd be happy to learn and / or accept pull requests.

@mr-c
Copy link
Contributor Author

mr-c commented Mar 11, 2016

It would be good to get your modifications accepted into BWA & Jellyfish

Re: libgff I did indeed package it as a separate library for Debian: https://packages.debian.org/source/sid/libgff

staden-io ships io_lib-config which reports on how to compile against the library

mcrusoe@mrcdev:~$ io_lib-config 
Usage: io_lib-config [option]

where 'option' is any one of:

  --cflags      C and preprocessor flags (eg -I/foo/include)
  --libs        Link-line parameters, eg -L/foo/lib -lstaden-read
  --version     List io_lib version number

mcrusoe@mrcdev:~$ io_lib-config --cflags
-I/usr/include
mcrusoe@mrcdev:~$ io_lib-config --libs
-L/usr/lib/x86_64-linux-gnu -lstaden-read -Wl,-z,relro -lm -lpthread   -lcurl -lz
mcrusoe@mrcdev:~$ io_lib-config --version
1.14.7

@rob-p
Copy link
Collaborator

rob-p commented Mar 11, 2016

Oh wow; I had no idea about libgff :).

Regarding Jellyfish, there's not a source "change" required upstream, rather the fact that I seem to require the config.h file that is not installed during the "normal" Jellyfish install process. I don't know if you have any idea how one might get around that.

Regarding staden, thanks for brining this to my attention. It will probably take a bit for me to wrap my head around the right way to access this information in CMake, but I'll see what I can manage to cobble together on that front (I really wish there was something better, with a less horrendous "language" than CMake, but nothing I know of exists that works nearly as well "out of the box").

@mr-c
Copy link
Contributor Author

mr-c commented Mar 11, 2016

I don't think config.h is needed when compiling against prebuilt libjellyfish*.so; when building salmon for Debian I explicitly patched out the include and everything built just fine.

http://anonscm.debian.org/cgit/debian-med/salmon.git/plain/debian/patches/jellyfish-update

@rob-p
Copy link
Collaborator

rob-p commented Mar 11, 2016

Very interesting! It used to be necessary, but perhaps Giullaume fixed this upstream at some point and I just wasn't aware. I'll see if I can merge those changes into Salmon.

@sjackman
Copy link
Contributor

As a user, I prefer autotools to cmake. cmake breaks way more often in my hands/experience than autotools. As a developer, I've only used autotools, so my previous opinion is probably biased.

@mr-c
Copy link
Contributor Author

mr-c commented Mar 11, 2016 via email

@rob-p
Copy link
Collaborator

rob-p commented Mar 11, 2016

I used autotools back in 2008 or so (when I was still working in computer graphics). I found it to be horribly archaic (not that CMake is a bastion of clarity). Also, as opaque as CMake sometimes is, I at least found it easier to discover how to force it to do what I wanted than with autotools. That being said, I feel like it is one of these situations where, if you are a wizard with the tool, everything seems relatively easy and straightforward (e.g. Guillaume uses autotools for Jellyfish, and he seems to have internalized all of the quirks fairly well). I guess I long for a configuration and build DSL that has an actual nice language, rather than the somewhat crazy invocations required by autotools and CMake. Then again, the annals of history are strewn with the wreckage of deprecated and defunct attempts to make better build systems.

@sjackman
Copy link
Contributor

The languages of both autotools and CMake are pretty terrible. I actually like the Make language; I think it gets a bad wrap.

Other than config.h was there any other files of Jellyfish that were missing from the install that you needed?

@rob-p
Copy link
Collaborator

rob-p commented Mar 11, 2016

No, I believe that config.h was it. Otherwise, I just use the already installed headers and the pre-compiled library libjellyfish-2.0.

@sjackman
Copy link
Contributor

I'll check that Homebrew installs the Jellyfish config.h file.

@rob-p
Copy link
Collaborator

rob-p commented Mar 12, 2016

I actually think that (as @mr-c points out), the config.h file isn't necessary with newer versions of Jellyfish. I'm removing the dependency upstream.

@rob-p
Copy link
Collaborator

rob-p commented Mar 12, 2016

Of course, I'm also bumping the version requirement so that we require a Jellyfish version that fixes the "empty read" bug (gmarcais/Jellyfish#55).

@sjackman
Copy link
Contributor

Ah, great.

gentoo-bot pushed a commit to gentoo/sci that referenced this issue Apr 21, 2018
The building procedure is still a mess.

COMBINE-lab/salmon#19

Package-Manager: Portage-2.3.28, Repoman-2.3.9
@mmokrejs
Copy link

mmokrejs commented Jun 24, 2018

And for us, who have blocked download on a computational cluster cmake silently continues even when scripts/fetchRapMap.sh failed (see error code 403 below). That is bad. Please propagate the error back to cmake so it dies immediately. Actually, remove the download altogether. Improve Requirements documentation and put a link to it there instead.

$blah/salmon-0.10.2 $ cmake -DCMAKE_VERBOSE_MAKEFILE:BOOL=ON -DBoost_NO_BOOST_CMAKE=BOOL:ON -DFETCH_BOOST=TRUE CMakeLists.txt
-- The C compiler identification is GNU 7.3.0
-- The CXX compiler identification is GNU 7.3.0
-- Check for working C compiler: /apps/gentoo/usr/bin/cc
-- Check for working C compiler: /apps/gentoo/usr/bin/cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Check for working CXX compiler: /apps/gentoo/usr/bin/c++
-- Check for working CXX compiler: /apps/gentoo/usr/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
Making Release build
running $blah/salmon-0.10.2/scripts/fetchRapMap.sh 2>&1
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
curl: (56) Received HTTP code 403 from proxy after CONNECT
-- Found ZLIB: /apps/gentoo/usr/lib/libz.a (found version "1.2.11")
-- Looking for lzma_auto_decoder in /apps/gentoo/usr/lib/liblzma.a
-- Looking for lzma_auto_decoder in /apps/gentoo/usr/lib/liblzma.a - found
-- Looking for lzma_easy_encoder in /apps/gentoo/usr/lib/liblzma.a
-- Looking for lzma_easy_encoder in /apps/gentoo/usr/lib/liblzma.a - found
-- Looking for lzma_lzma_preset in /apps/gentoo/usr/lib/liblzma.a
-- Looking for lzma_lzma_preset in /apps/gentoo/usr/lib/liblzma.a - found
-- Found LibLZMA: /apps/gentoo/usr/include (found version "5.2.3")
Found liblzma library: /apps/gentoo/usr/lib/liblzma.a
===========================================
-- Found BZip2: /apps/gentoo/usr/lib/libbz2.a (found version "1.0.6")
-- Looking for BZ2_bzCompressInit
-- Looking for BZ2_bzCompressInit - found
Found libbz2 library: /apps/gentoo/usr/lib/libbz2.a
===========================================
-- Looking for pthread.h
-- Looking for pthread.h - found
-- Looking for pthread_create
-- Looking for pthread_create - not found
-- Looking for pthread_create in pthreads
-- Looking for pthread_create in pthreads - not found
-- Looking for pthread_create in pthread
-- Looking for pthread_create in pthread - found
-- Found Threads: TRUE
-- Could NOT find Boost
BOOST_INCLUDEDIR =
BOOST_LIBRARYDIR =
Boost_FOUND = 0
Build system will fetch and build Boost
==================================================================
Setting Temporary Boost paths
BOOST INCLUDE DIR = $blah/salmon-0.10.2/external/install/include
BOOST INCLUDE DIRS = $blah/salmon-0.10.2/external/install/include
BOOST LIB DIR = $blah/salmon-0.10.2/external/install/lib
BOOST LIBRARIES =
Build system will build libdivsufsort
==================================================================
Build system will fetch and build the Cereal serialization library
==================================================================
Build system will fetch and build BWA (for Salmon)
==================================================================
-- Found TBB: /apps/gentoo/usr/include (found suitable version "2018.0", minimum required is "2018.0") found components:  tbb tbbmalloc tbbmalloc_proxy
TBB_LIBRARIES = /apps/gentoo/usr/lib/libtbbmalloc_proxy.so;/apps/gentoo/usr/lib/libtbbmalloc.so;/apps/gentoo/usr/lib/libtbb.so
Build system will compile libgff
==================================================================
==================================================================
Build system will compile Staden IOLib
==================================================================
Build system will fetch SPDLOG
==================================================================
-- Found PkgConfig: /apps/gentoo/usr/bin/pkg-config (found version "0.29.2")
-- Found Jemalloc: /apps/gentoo/usr/lib/libjemalloc.so (found version "")
Found Jemalloc library --- using this memory allocator
CPACK_SOURCE_IGNORE_FILES = /src/PCA.cpp;/src/PCAUtils.cpp;/build/;/scripts/AggregateToGeneLevel.py;/scripts/ExpressionTools.py;/scripts/GenerateExpressionFiles.sh;/scripts/ParseSoftFile.py;/scripts/PlotCorrelation.py;/scripts/junk;/scripts/sfstrace.log;/scripts/SFPipeline.py;/bin/;/lib/;/sample_data/;PublishREADMEToWebsite.sh;/external/;/src/obsolete/;/include/obsolete/;WebsiteHeader.txt;/experimental_configs/;.git/
TBB_LIBRARIES = /apps/gentoo/usr/lib/libtbbmalloc_proxy.so;/apps/gentoo/usr/lib/libtbbmalloc.so;/apps/gentoo/usr/lib/libtbb.so
-- Configuring done
CMake Error at src/CMakeLists.txt:158 (add_executable):
  Cannot find source file:

    $blah/salmon-0.10.2/external/install/src/rapmap/RapMapFileSystem.cpp

  Tried extensions .c .C .c++ .cc .cpp .cxx .cu .m .M .mm .h .hh .h++ .hm
  .hpp .hxx .in .txx


CMake Error at src/CMakeLists.txt:160 (add_executable):
  Cannot find source file:

    $blah/salmon-0.10.2/external/install/src/rapmap/rank9b.cpp

  Tried extensions .c .C .c++ .cc .cpp .cxx .cu .m .M .mm .h .hh .h++ .hm
  .hpp .hxx .in .txx


CMake Error at src/CMakeLists.txt:158 (add_executable):
  No SOURCES given to target: salmon


CMake Error at src/CMakeLists.txt:160 (add_executable):
  No SOURCES given to target: unitTests


-- Build files have been written to: $blah/salmon-0.10.2
$blah/salmon-0.10.2 $ make
make: *** No targets specified and no makefile found.  Stop.
$blah/salmon-0.10.2 $

gentoo-bot pushed a commit to gentoo/sci that referenced this issue Jun 25, 2018
Document zillions of cryptic dependencies of a lousy package

Kingsford-Group/libgff#1
COMBINE-lab/salmon#236
COMBINE-lab/salmon#19

Incomplete install docs are deemed to be at:
http://salmon.readthedocs.io/en/latest/building.html#requirements-for-building-from-source

Try the command below to see yourself:
salmon-0.10.2 $ find . -type f | xargs grep curl 2>/dev/null

Thanks to @kiwifb for comments at
  a1d0487

Package-Manager: Portage-2.3.40, Repoman-2.3.9
rob-p pushed a commit that referenced this issue Jun 25, 2018
@outpaddling
Copy link

And for us, who have blocked download on a computational cluster cmake silently continues even when scripts/fetchRapMap.sh failed (see error code 403 below).

Dists downloading their own dependencies is also forbidden in package managers such as FreeBSD ports and pkgsrc (which is cross-platform and I personally use on Mac, NetBSD, and RHEL). Trusting upstream scripts to pull stuff off the Internet is a security risk, so the package managers perform and validate (via checksum) all downloads in a separate stage. It would be nice not to have to hack out the download code from a build system in order to create and maintain a package.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants