
Improving pre-built DeepVariant binaries for conda packages #29

Closed
chapmanb opened this issue Jan 2, 2018 · 41 comments
chapmanb commented Jan 2, 2018

Hi all;
Thanks for all the help getting an initial conda package in place for DeepVariant (#9) through bioconda.

I wanted to follow up with some suggestions that would help make the pre-built binaries more portable as part of this process, in order of helpfulness:

An alternative to points 1 and 3 is making it easier to build DeepVariant as part of the conda build process. The major blocker here is the clif dependency, which is difficult to build; the pre-built binaries require unpacking into /usr. If we could make this relocatable and easier to install globally, we could build with portable binaries and an adjustable numpy as part of the bioconda preparation process.

Thanks again for all the help.

@bgruening

@chapmanb thanks for writing this up. I would prefer to get clif into conda-forge and reuse that to build a relocatable version of DeepVariant. If this is not possible, it would be great to address your points so we get relocatable binaries.

Thanks!

@chapmanb chapmanb changed the title Improving pre-build DeepVariant binaries for conda packages Improving pre-built DeepVariant binaries for conda packages Jan 5, 2018
chapmanb commented Jan 5, 2018

Björn -- agreed. I tried to look into building clif, but it was too involved (https://github.com/google/clif#building) and I had to give up. Right now the pre-built version assumes unpacking into /usr, so it is also not an option for a conda package. If you have time to investigate and think you can tackle it, that would be great. I've already gotten bazel up to date, so I should be able to try building with clif available.

depristo commented Jan 5, 2018

I've pinged @mrovner and @gpshead about CLIF. I'm hopeful they'll chime in here.

mrovner commented Jan 8, 2018

We understand that building CLIF is a tall order for our users, and we are putting in effort to make it easier.

pichuan pushed a commit that referenced this issue Jan 30, 2018
-- Bazel goes from 0.8.1 to 0.9.0.
-- Both TensorFlows (custom whl and C++ library) go from 1.4.x to 1.5.0.
-- -- As a result some tests update to absltest.main().
-- Numpy goes from 1.13 to 1.12 at the request of our users, to be more compatible with bioconda. (See discussion in #29.)

PiperOrigin-RevId: 182827446
pichuan commented Apr 27, 2018

Update (1) on numpy version:

Turns out getting back to numpy 1.12 is harder than I thought, because TensorFlow 1.4 requires numpy>=1.12.1.

When I try to revert the prereq script to numpy 1.12, I keep getting this message:
"tensorflow 1.4.1 has requirement numpy>=1.12.1, but you'll have numpy 1.12.0 which is incompatible."

which makes me uncomfortable pinning numpy back to 1.12.

I did try changing the numpy version to 1.12 and building with build_release_binaries.sh. It seems to build, and the call_variants step (the main step that uses TensorFlow) seems to run at a similar speed. The hap.py numbers are exactly the same.
However, it seems like bad practice to ignore that warning message that shows up in red.
Thoughts?
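A quick, generic way to sanity-check a version pin against a declared minimum (a sketch, not part of the DeepVariant scripts) is version-aware sort:

```shell
# Generic version-pin sanity check with version-aware sort: "installed"
# satisfies ">= required" iff "required" sorts first (or they are equal).
# The values below reproduce the pip warning quoted in this comment.
installed="1.12.0"
required="1.12.1"
if [ "$(printf '%s\n%s\n' "$required" "$installed" | sort -V | head -n 1)" = "$required" ]; then
  echo "numpy $installed satisfies >=$required"
else
  echo "numpy $installed does NOT satisfy >=$required"
fi
```

With the values above this reports the pin as unsatisfied, matching pip's complaint about 1.12.0 vs >=1.12.1.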

pgrosu commented Apr 28, 2018

Hi Pichuan and Brad,

So it looks like Bioconda's requirement for numpy 1.12 originated from this issue:

bioconda/bioconda-recipes#3961

which got merged into this PR:

bioconda/bioconda-recipes#4888

But it seems the driver for the 1.12 version was CNVkit, which just requires >= 1.9 based on the setup here:

https://github.com/etal/cnvkit/blob/master/setup.py#L19

Maybe updating Bioconda to 1.14 first might be a good start, to see if that PR passes Travis; otherwise, just update the DV scripts to use a virtualenv instance so that the local environment remains pristine. The alternative is for folks to use it via the Docker image.

What do you think?

Hope it helps,
Paul

@chapmanb

Pi-Chuan -- thanks so much for looking at this.
Paul -- thank you for digging into the bioconda history on the numpy pin. I suspect we could update numpy in bioconda but no one has yet taken that on.

Given that and the issues with tensorflow pinning, I'd advise just sticking with 1.14, and we can work on the dependency issues in bioconda. Apologies; I hadn't meant to cause a lot of work, and I didn't realize the pinning would prevent this.

If we can get older glibc compatible binaries that'll cover most of the issues and we can work around the numpy problems by installing in an isolated conda environment for now.

Thanks again for looking into the numpy and glibc work.

pichuan commented Apr 28, 2018

Another update on the CLIF dependency:
@chapmanb, as you noticed, CLIF is an issue here. We pre-built our own CLIF and used it directly in the DeepVariant build. I also just realized that we never released the script we used to build CLIF, which we definitely should. I was planning to push out a 0.6.1 today, but it's now late, so I'm going to wait until Monday for my own sanity and to avoid breaking things over the weekend. However, in case it's helpful, I'll paste the content here right now. Note that this is used to build for Ubuntu. I did start looking into whether we can modify it for CentOS 6, but got stuck on how to get protoc and haven't resumed that work yet. I'll just paste our script for Ubuntu, and hopefully it's helpful if you want to look into building a CentOS-compatible CLIF.

Next week I'll push a 0.6.1 that has this under the tools/ directory. And I'll also see if I can figure out how to build it for CentOS6.

# Builds OSS CLIF binary for DeepVariant.
#
# This script should be run on a cloud VM. Known to work on some versions of
# Linux OS.
#
# OSS CLIF takes a very long time to build (10+ minutes) since it needs to
# compile parts of clang and LLVM. To save this build time, we use this script
# to build CLIF, install it in /usr/local/clif, and then package up
# /usr/local/clif and the shared protobuf libraries from /usr/local/lib into a tgz
# called oss_clif.latest.tgz.
#
# This oss_clif.latest.tgz is used by build-prereq.sh to build DeepVariant.
# Various versions that we built and released can be found under:
# https://console.cloud.google.com/storage/browser/deepvariant/packages/oss_clif
#
# We do recognize that this should be temporary, and will update when there is
# an official solution from CLIF.
# GitHub issues such as https://github.com/google/deepvariant/issues/29 have
# some relevant pointers.

set -eux -o pipefail

# Figure out which Linux installation we are on, to fetch an appropriate
# version of the CLIF binary. Note that we currently only support Ubuntu
# (14 and 16) and Debian.
if [[ $(python -mplatform) == *"Ubuntu-16"* ]]; then
  export DV_PLATFORM="ubuntu-16"
  # For ubuntu 16 we install cmake
  sudo -H apt-get -y install cmake
elif [[ $(python -mplatform) == *"Ubuntu-14"* ]]; then
  export DV_PLATFORM="ubuntu-14"
  # For ubuntu 14 we install cmake3
  sudo -H apt-get -y install cmake3
elif python -mplatform | grep -q '[Dd]ebian-\(rodete\|9.*\)'; then
  export DV_PLATFORM="debian"
  # For recent Debian, we install cmake.
  sudo -H apt-get -y install cmake
else
  export DV_PLATFORM="unknown"
  # "exit" takes a numeric status, so print the message separately.
  echo "unsupported platform" >&2
  exit 1
fi

CLIF_DIR=/usr/local/clif
CLIF_PACKAGE="oss_clif.${DV_PLATFORM}.latest.tgz"

# Install prereqs.
sudo -H apt-get -y install ninja-build subversion
sudo -H apt-get -y install virtualenv python-pip pkg-config
sudo -H pip install 'pyparsing>=2.2.0'
sudo -H pip install 'protobuf>=3.4'

echo === building protobufs

sudo -H apt-get install -y autoconf automake libtool curl make g++ unzip
wget https://github.com/google/protobuf/releases/download/v3.4.1/protobuf-cpp-3.4.1.tar.gz
tar xvzf protobuf-cpp-3.4.1.tar.gz
(cd protobuf-3.4.1 &&
  ./autogen.sh &&
  ./configure &&
  make -j 32 &&
  make -j 32 check &&
  sudo make -j 32 install &&
  sudo ldconfig)

echo === building CLIF

git clone https://github.com/google/clif.git
sed -i 's/\$HOME\/opt/\/usr\/local/g' clif/INSTALL.sh
sed -i 's/-j 2//g' clif/INSTALL.sh
(cd clif && sudo ./INSTALL.sh)

echo === creating package tgz

sudo find ${CLIF_DIR} -type d -exec chmod a+rx {} \;
sudo find ${CLIF_DIR} -type f -exec chmod a+r {} \;
tar czf "${CLIF_PACKAGE}" /usr/local/lib/libproto* "${CLIF_DIR}"

echo === SUCCESS: package is "${CLIF_PACKAGE}"

@bgruening

@pichuan numpy should not be a problem. We can pin the recipe to 1.14 for this package if it's needed.

@chapmanb

Pi-Chuan -- thanks for this. We'd ideally build with CLIF directly in bioconda to avoid you needing to maintain these custom builds, but we'll hold off on that until there is an easier-to-build/install CLIF dependency. Happy to test the new version with reduced glibc requirements when it's ready.

Björn -- We do pin to 1.14 now in DeepVariant, with the downside that it's not compatible in a shared environment with other tools that pin to the bioconda 1.12 default. I can work around this for now by putting DeepVariant in a separate environment, but would love to synchronize bioconda to 1.14 at some point.

Thanks again for all this work and help.

pichuan commented Apr 30, 2018

Hi Brad,
just to confirm again:
if you want to build CLIF on your own, would something similar to
#29 (comment)
be useful for you? What are the requirements on your side? Is it for this to build on CentOS 6? Is there something else? Thanks.

@chapmanb

Pi-Chuan -- sorry, that's right, we would want to build on CentOS 6 to be compatible with bioconda. They have a restricted build environment for portability, so we'd need all the dependencies to be installable by bioconda (rather than as system packages). I had looked at this earlier and, after seeing all the prerequisites, got scared off from tackling it. It's definitely a help to have that information, but I think it would still take a bit of work to port over.

pichuan commented May 1, 2018

Update:
the build_clif_package.sh script (the same as in #29 (comment)) is now included in v0.6.1:
https://github.com/google/deepvariant/blob/r0.6/tools/build_clif_package.sh
https://github.com/google/deepvariant/releases/tag/v0.6.1

I haven't looked more into the CentOS6 build. I'll send another update when I make progress on that.

pichuan commented May 2, 2018

Hi @chapmanb , another update:

I went through a lot of hacky steps and built CLIF. I'm actually not sure whether it's usable or not, so if you have a setup where you can quickly give it a try, that would be great.

Here are the instructions for getting pyclif to run on a CentOS 6 machine:

# Get a machine
gcloud beta compute instances create "${USER}-centos6" \
--scopes "compute-rw,storage-full,cloud-platform" \
--image-family "centos-6" --image-project "centos-cloud" \
--machine-type "custom-64-131072" \
--boot-disk-size "300" --boot-disk-type "pd-ssd" \
--zone "us-west1-b"

# ssh into it
gcloud compute ssh ${USER}-centos6 --zone us-west1-b
##### On the GCE instance #####
# Install Python 2.7
sudo yum install -y centos-release-SCL
sudo yum install -y python27
source /opt/rh/python27/enable

gsutil -m cp gs://deepvariant/packages/oss_clif/oss_clif.centos-6.9.latest.tgz /tmp/
(cd / && sudo tar xzf "/tmp/oss_clif.centos-6.9.latest.tgz")
sudo ldconfig  # Reload shared libraries.

(I had to build with Python 2.7; I didn't figure out how to build with 2.6. Let me know if you actually need Python 2.6.)

Once you do this, you can run /usr/local/clif/bin/pyclif and should see the usage:

$ /usr/local/clif/bin/pyclif
usage: pyclif [-h] [--py3output] [--matcher_bin MATCHER_BIN] [--nc_test]
              [--dump_dir DUMP_DIR] [--binary_dump] [--modname MODNAME]
              [--prepend PREPEND] [--include_paths INCLUDE_PATHS]
              [--ccdeps_out MODNAME.cc] [--ccinit_out MODNAME_init.cc]
              [--header_out MODNAME.h] [--cc_flags CC_FLAGS] [--indent INDENT]
              input_filename
pyclif: error: too few arguments

Please let me know once you have a chance to try it.
CentOS 6 is tricky. It feels like everything is old :(
Let me know what other things are blocking you.

pichuan commented May 2, 2018

Another thing I did was build bazel 0.11.0 against the older GLIBC.

On my CentOS 6 GCE instance:

$ ldd --version
ldd (GNU libc) 2.12
Copyright (C) 2010 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
Written by Roland McGrath and Ulrich Drepper.

I basically followed https://gist.github.com/truatpasteurdotfr/d541cd279b9f7bf38ce967aa3743dfcb, but used bazel version 0.11.0 instead.
In this step:

echo 'cd /tmp/bazel-0.4.5-dist && bash ./compile.sh && cp output/bazel /usr/local/bin' | scl enable devtoolset-3 bash

I had to add sudo to the cp command.

After this, I have a bazel 0.11.0:

$ /usr/local/bin/bazel version
Extracting Bazel installation...
Build label: 0.11.0- (@non-git)
Build target: bazel-out/k8-opt/bin/src/main/java/com/google/devtools/build/lib/bazel/BazelServer_deploy.jar
Build time: Wed Nov 5 12:47:48 +50302 (1525237217268)
Build timestamp: 1525237217268
Build timestamp as int: 1525237217268

I haven't tried building with it, though.

pgrosu commented May 2, 2018

Pichuan, when in doubt try building it statically - libc is not really that large a library.

pgrosu commented May 2, 2018

Pichuan, to increase ease of use and expand adoption within the bioinformatics community, it might not hurt to have a collection of customized build-and-test environments at Google that match the variety of environment configurations users have in place, or that common packages recommend. Sometimes folks will be curious to try out a new bioinformatics software package, and the faster they get it to a running state on their own machines, the happier the experience, which lets the community around that package grow faster. Basically most people just want to use stuff and want a turn-key solution, though some of us like tinkering with puzzles :) If their experience is good on something local, or even a cluster, then they'll see the obvious need to try it out in a cloud environment.

I sort of did this from the other side. When I tested most of the GoogleGenomics tools in real-world scenarios, I usually ran them against a variety of configurations. That helped with having better error messages, control-flow decisions, documentation, and additional features.

Basically you have developed great software, which is evolving, and now comes the service component of supporting it, which is just as important.

Just a friendly recommendation,
~p

chapmanb commented May 2, 2018

Pi-Chuan;
Thanks for this update and the work. For bazel, we'd gotten that updated and building on conda with CentOS 6, so we're good to go with that dependency; it's really just CLIF that we're struggling to build.

For clif, this is great progress, thank you. I resuscitated my bioconda build script and gave it a try with this. It's making better progress, but unfortunately it needs to reconstitute the system-wide python install within the build environment, which we can't do in conda. Everything there is in an isolated work directory, so it won't have the system shared libraries it wants:

$ /opt/conda/conda-bld/deepvariant_1525283132666/work/deepvariant-0.6.1/usr/local/clif/bin/python
/opt/conda/conda-bld/deepvariant_1525283132666/work/deepvariant-0.6.1/usr/local/clif/bin/python: error while loading shared libraries: libpython2.7.so.1.0: cannot open shared object file: No such file or directory

and the included python libraries symlink to the system-wide ones you built against:

lrwxrwxrwx  1 conda conda   56 May  2 17:54 _weakrefset.py -> /opt/rh/python27/root/usr/lib64/python2.7/_weakrefset.py

I'm not sure if it's possible to make this more relocatable, with any python, as part of the build process. Sorry, I know it's a lot more work to make it relocatable like this, but it would allow installs on all the systems we support where users don't have root privileges to rely on system libraries. Thanks again for helping with this.
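A quick way to audit an unpacked tree for exactly this problem is to list symlinks whose targets are absolute paths, since those are the ones that dangle once the tree is relocated (a sketch; the directory is a placeholder):

```shell
# List symlinks under a tree whose targets are absolute paths; these are
# the links that break when the tree is relocated, e.g. into a conda
# build sandbox. DIR is a placeholder for the unpacked clif directory.
DIR="${DIR:-.}"
find "$DIR" -type l -lname '/*' -print
```

Links like the `_weakrefset.py -> /opt/rh/python27/...` one above would show up here, while relative symlinks (which survive relocation) would not.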

pichuan commented May 2, 2018

@chapmanb Thanks for giving it a try.
Before I built I did something like:

# Install Python 2.7
sudo yum install -y centos-release-SCL
sudo yum install -y python27
source /opt/rh/python27/enable

I think starting from there, it just assumes python is in /opt/rh/python27/root/usr/bin/python. I'll take a look and see if I can make it recognize python at any path.
Is there a convention people use when building something so that it can point to other Python locations?
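For reference, one common convention (a sketch only; CLIF's INSTALL.sh does not necessarily honor it) is to let an environment variable override the interpreter path, the way the TensorFlow configure step uses PYTHON_BIN_PATH:

```shell
# Sketch: resolve the interpreter from an overridable environment variable,
# falling back to whatever python is on PATH. PYTHON_BIN_PATH is the name
# TensorFlow's configure uses (it appears in the build logs later in this
# thread); treating it as a CLIF option is an assumption, not current behavior.
PYTHON_BIN_PATH="${PYTHON_BIN_PATH:-$(command -v python || command -v python3)}"
"$PYTHON_BIN_PATH" --version
```

A build script written this way picks up the caller's interpreter (e.g. a conda environment's python) without hard-coding /opt/rh/python27 paths.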

mrovner commented May 2, 2018 via email

pichuan commented May 2, 2018

@mrovner Thanks Mike. That is at build time, correct? If I choose a python interpreter, will the user (Brad) also need to have python at the same location?
I already built one here for CentOS 6: gs://deepvariant/packages/oss_clif/oss_clif.centos-6.9.latest.tgz
But it seems like @chapmanb is having trouble using it.
Ideally we'd be able to specify a different location at run time than the one used at build time. Do you think that's possible?

mrovner commented May 2, 2018 via email

chapmanb commented May 3, 2018

Pi-Chuan and Mike;
Thanks for all this background and help. I'm trying to fit this into the conda recipe's bazel build for DeepVariant, but I'm not sure how to take advantage of the local anaconda python in that context. The error I'm seeing is that bazel can't find pyclif_proto:

(17:56:01) INFO: Found 1 target...
(17:56:01) [0 / 7] [-----] BazelWorkspaceStatusAction stable-status.txt
(17:56:01) ERROR: missing input file '@clif//:clif/bin/pyclif_proto'
(17:56:01) ERROR: /opt/conda/conda-bld/deepvariant_1525283132666/work/deepvariant-0.6.1/third_party/nucleus/protos/BUILD:165:1: //third_party/nucleus/protos:variants_pyclif_clif_rule: missing input file '@clif//:clif/bin/pyclif_proto'
Target //deepvariant:binaries failed to build
(17:56:01) ERROR: /opt/conda/conda-bld/deepvariant_1525283132666/work/deepvariant-0.6.1/third_party/nucleus/protos/BUILD:165:1 1 input file(s) do not exist

which I thought was triggered by the difficulty of running pyclif without the local python installed. It could also be due to not installing it in /usr/local/bin, since I have to remain sandboxed in the work directory, but I did adjust the PATH to include the download location.

Sorry, I'm stuck here due to my limited knowledge of bazel tweaking. Either understanding how to handle a root install of the pre-built pyclif, or tweaking the build to use the local python, would be helpful. Alternatively, if you can already build DeepVariant on a CentOS 6 system yourself, I could use the pre-built binaries the way we're doing now, just built against an older glibc. Thanks again for the help with this.

mrovner commented May 3, 2018 via email

pichuan commented May 4, 2018

@chapmanb I've spent most of the last two days on this, and unfortunately I'm currently stuck at the same place you are:
https://gist.github.com/pichuan/c0d2e6cf59a0f5ae373054410e477b59

I might have to call it a day today and look at it again tomorrow...

pichuan commented May 4, 2018

OK. I noticed that my pyclif_proto is in /usr/local/bin/, not /usr/local/clif/bin. Not knowing whether that was really the issue, I did the following:

sudo ln -sf /usr/local/bin/pyclif_proto /usr/local/clif/bin/pyclif_proto

And added that to my experimental build-prereq.sh

Now I'm seeing a different error:

(06:15:00) INFO: Found 80 targets and 33 test targets...
(06:15:00) ERROR: /home/pichuan/.cache/bazel/_bazel_pichuan/01047f0bd74be1f8c2eae71c8557726c/external/nsync/BUILD:441:1: C++ compilation of rule '@nsync//:nsync_cpp' failed (Exit 1): gcc failed: error executing command
  (cd /home/pichuan/.cache/bazel/_bazel_pichuan/01047f0bd74be1f8c2eae71c8557726c/execroot/com_google_deepvariant && \
  exec env - \
    PATH=/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/pichuan/bin \
    PWD=/proc/self/cwd \
    PYTHON_BIN_PATH=/usr/local/bin/python2.7 \
    PYTHON_LIB_PATH=/usr/local/lib/python2.7/site-packages \
    TF_NEED_CUDA=0 \
    TF_NEED_OPENCL_SYCL=0 \
  /usr/bin/gcc -U_FORTIFY_SOURCE -fstack-protector -Wall -B/usr/bin -B/usr/bin -Wunused-but-set-parameter -Wno-free-nonheap-object -fno-omit-frame-pointer -g0 -O2 '-D_FORTIFY_SOURCE=1' -DNDEBUG -ffunction-sections -fdata-sections -DGEMMLOWP_ALLOW_SLOW_SCALAR_FALLBACK -Wno-maybe-uninitialized -Wno-unused-function -msse4.1 -msse4.2 -mavx -O3 -MD -MF bazel-out/k8-opt/bin/external/nsync/_objs/nsync_cpp/external/nsync/internal/common.pic.d -fPIC -iquote external/nsync -iquote bazel-out/k8-opt/genfiles/external/nsync -iquote external/bazel_tools -iquote bazel-out/k8-opt/genfiles/external/bazel_tools -isystem external/nsync/public -isystem bazel-out/k8-opt/genfiles/external/nsync/public -isystem external/bazel_tools/tools/cpp/gcc3 -x c++ '-std=c++11' -DNSYNC_ATOMIC_CPP11 -DNSYNC_USE_CPP11_TIMEPOINT -I./external/nsync//platform/c++11 -I./external/nsync//platform/gcc -I./external/nsync//platform/x86_64 -I./external/nsync//public -I./external/nsync//internal -I./external/nsync//platform/posix '-D_POSIX_C_SOURCE=200809L' -pthread -Wno-builtin-macro-redefined '-D__DATE__="redacted"' '-D__TIMESTAMP__="redacted"' '-D__TIME__="redacted"' -c external/nsync/internal/common.c -o bazel-out/k8-opt/bin/external/nsync/_objs/nsync_cpp/external/nsync/internal/common.pic.o)
cc1plus: error: unrecognized command line option "-std=c++11"
cc1plus: warning: unrecognized command line option "-Wno-maybe-uninitialized"
cc1plus: warning: unrecognized command line option "-Wno-free-nonheap-object"
(06:15:00) INFO: Elapsed time: 0.386s, Critical Path: 0.05s
(06:15:00) FAILED: Build did NOT complete successfully

pichuan commented May 4, 2018

After installing a few things that I had failed to install before and linking a few paths, I got to an error that concerns me:

ImportError: /lib64/libc.so.6: version `GLIBC_2.17' not found (required by /usr/local/lib/python2.7/site-packages/tensorflow/python/_pywrap_tensorflow_internal.so)

It's possible that TensorFlow itself requires a newer version of GLIBC than what's on CentOS 6.
I did some searching and found an old thread that could be relevant:
tensorflow/tensorflow#527

@chapmanb Is it possible at all to install this on a different OS? This is getting to the point where I'm worried I'm going down a path with no good ending in sight..
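To see exactly which glibc a binary demands, you can scan its dynamic symbol table for versioned GLIBC references (a sketch; the .so path is only the example from the error above):

```shell
# Print the highest GLIBC symbol version a shared object requires. If that
# exceeds the host glibc (2.12 on CentOS 6), the library cannot load.
# The path is the example from the ImportError above; fall back to any
# ELF binary so the sketch runs anywhere with binutils installed.
so=/usr/local/lib/python2.7/site-packages/tensorflow/python/_pywrap_tensorflow_internal.so
[ -e "$so" ] || so=/bin/ls
objdump -T "$so" | grep -o 'GLIBC_[0-9.]*' | sort -Vu | tail -n 1
```

For the tensorflow library in the error this would report GLIBC_2.17 or higher, confirming it was built against a newer toolchain than CentOS 6 provides.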

chapmanb commented May 4, 2018

Pi-Chuan;
Thanks for following up with the additional work on this. For the pyclif requirement, it sounds like you're hitting the same problem as me: I'm not sure where to put it so that bazel can find it. Do you understand which locations it looks in, and whether we can tweak them? I'm not able to put it into /usr/local like you did, since we're confined to our work directory. Can I stick it in an arbitrary location and let bazel know?

For tensorflow, it seems like that was built on a system with a more recent glibc than CentOS 6's. It is a pain to keep the full dependency tree compatible with an older glibc. This is part of what makes conda nice: you're guaranteed to have this (well, as long as the dependency exists). It looks from the thread you linked like the conda package for tensorflow is all good on CentOS 6, if installing from there for your build is doable.

Thanks again for helping tackle this; I look forward to working on actual fun things instead of compiling and porting.

pichuan commented May 4, 2018

1. @chapmanb: I think this is what you're looking for! @depristo pointed it out to me, and I felt dumb for not thinking to just edit the WORKSPACE (instead I had linked the file to the "right place").
In the WORKSPACE file of DeepVariant, you can see this at the bottom. I tried changing the path:
new_local_repository(
    name = "clif",
    build_file = "third_party/clif.BUILD",
    path = "/home/pichuan",
)

And I made sure the two files are there:

$ ls /home/pichuan/clif/bin/
pyclif  pyclif_proto

After this change, it seems to run past the part where it can't find clif! Basically the missing input file '@clif//:clif/bin/pyclif_proto' error was no longer there after this change.

2. You're right -- I just tried installing TensorFlow with conda install tensorflow on CentOS 6. It's so easy and smooth. That's great. However, I'm not sure which directory I should point to as a replacement for this pointer in our WORKSPACE file:
# Import tensorflow.  Note path.
local_repository(
    name = "org_tensorflow",
    path = "../tensorflow",
)

So I'm currently blocked on that. Maybe you'll have better luck once you get past 1). Please let me know.

chapmanb commented May 6, 2018

Pi-Chuan;
Awesome, thanks so much for the WORKSPACE path tip. That was perfect and exactly solved the clif issue. Nice one.

The issue I'm running into now is that the build setup assumes the libraries and include files are available in standard locations (/usr, I'm guessing), while within conda they live inside the conda environment. So as soon as I compile htslib, we get errors about not finding zlib.h, which is present in $PREFIX/include instead of /usr/include.

I've tried hacking this include directory into the htslib copts:

sed -i.bak "s|\"-Wno-error\",|\"-Wno-error\", \"-I${PREFIX}/include\",|" third_party/htslib.BUILD

but bazel is too smart and won't let us continue with non-bazel defined references:

(00:28:31) ERROR: /home/conda/.cache/bazel/_bazel_conda/b3bf6b0de2935c6a10ef1e7d7b61873f/external/htslib/BUILD.bazel:209:1: in cc_library rule @htslib//:htslib: The include path '/opt/conda/conda-bld/deepvariant_1525566343740/_b_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_p/include' references a path outside of the execution root.

So at this point I'm stuck by my lack of knowledge of how to incorporate this into the bazel build instructions. I couldn't find any conda bazel builds that already do this to use as a template, and I'm not familiar enough with bazel to build it up on my own.

Would it be possible to make the dependencies you're installing with apt explicit bazel targets, like clif? If so, I could adjust the paths to the conda $PREFIX rather than /usr. What do you think of that approach? Other bazel tips/tricks would be very welcome. Thanks again for helping with this.
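One hedged sketch of that approach: teach the WORKSPACE about the conda environment the same way it already learns about clif, via new_local_repository with a small build file for zlib. The repository name, path, and build-file contents below are assumptions for illustration, not anything DeepVariant ships:

```python
# Hypothetical WORKSPACE addition: expose conda's zlib as a bazel target so
# htslib could depend on "@conda_zlib//:zlib" instead of /usr/include.
new_local_repository(
    name = "conda_zlib",           # hypothetical name
    path = "/opt/conda/envs/dv",   # stands in for conda's $PREFIX
    build_file_content = """
cc_library(
    name = "zlib",
    srcs = glob(["lib/libz.so*"]),
    hdrs = glob(["include/zlib.h", "include/zconf.h"]),
    includes = ["include"],
    visibility = ["//visibility:public"],
)
""",
)
```

Because the headers come in through a declared repository rather than a raw `-I` copt, this sidesteps bazel's "references a path outside of the execution root" complaint shown above.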

pichuan commented May 7, 2018

@chapmanb
I'm actually pretty new to this whole build setup, and I'm not really that familiar with bazel either.
Do you have instructions for reproducing your setup all the way to the errors you hit? Having that would be useful for me in trying to figure this out.

And I'll try to see if I can find some bazel experts internally to look at your questions as well. Maybe this is a trivial question for people who have seen it before...

chapmanb commented May 8, 2018

Pi-Chuan;
Thanks for checking around to see if we can get a bazel expert involved. I think that would be the best way forward, as I'm just hacking around and don't have a strong understanding of how best to attack this. It's the general issue of dependencies living in a non-system directory and how to inject that into a build.

I'm trying to build this inside of bioconda. If you want to get it set up, there are instructions here: https://bioconda.github.io/contributing.html and I could share the current recipe I'm working from. That said, I don't want to make you wade into a new build system if we can get more high-level bazel advice and sort out improving the DeepVariant build process to handle this case.

pichuan commented May 8, 2018

@chapmanb In terms of finding zlib when building with bazel, I wonder if things like these are useful:
tensorflow/tensorflow#2536
bazelbuild/bazel#1353

I wonder if directly asking on the bazel GitHub issues is the best way:
https://github.com/bazelbuild/bazel/issues
Since I'm not familiar with bazel, the only way I could help is probably to repeat what you did and see if I can get unstuck.

Do you mind asking on Bazel issues first?

jerowe commented May 19, 2018

+1 for the work being done. Thanks!

I cannot use the binaries:

ImportError: /lib64/libm.so.6: version `GLIBC_2.23' not found (required by /tmpdata/Bazel.runfiles_52e5Mr/runfiles/com_google_deepvariant/third_party/nucleus/io/python/../../../../_solib_k8/libexternal_Shtslib_Slibhtslib.so)

On CentOS Linux release 7.2.1511 (Core), an HPC cluster, if that makes a difference.

jerowe commented May 19, 2018

FYI @bgruening - I don't see deepvariant on the galaxy repo.

https://depot.galaxyproject.org/singularity/ <- WHERE HAS THIS BEEN ALL MY LIFE?

@bgruening

@jerowe what do you mean by the Galaxy repos? The Singularity image store?

jerowe commented May 22, 2018

@bgruening, yes, I mean the singularity stores. There is no deepvariant in there. ;-(

I almost have clif built as a conda package, but it's kind of hacky.

@chapmanb

Jillian;
Thanks for offering to help with this. Unfortunately we're still stuck on building with bazel using conda dependencies. I have a build that works around the clif issue, but I don't know how to modify the DeepVariant bazel build infrastructure so that we can use an arbitrary prefix. Right now it assumes libraries are present in /usr, so it fails to pick up the anaconda zlib and friends. We've been trying to find someone who knows bazel to walk us through changing the DeepVariant build to support this.

Until we can get a native CentOS 6 build, it will unfortunately have issues on CentOS, because the pre-built binaries are compiled against a more recent glibc on Ubuntu.

If you have any bazel expertise or want to dig into this, that would be much appreciated.

jerowe commented May 23, 2018

@chapmanb , I might be able to help with the bazel builds, and if not I have some other talented folks around who could possibly be bribed. ;-)

Do you have a start on it somewhere?

chapmanb added a commit to chapmanb/bioconda-recipes that referenced this issue May 23, 2018
Currently blocked on not detecting conda installed libraries (zlib and
friends). See discussion in google/deepvariant#29
@chapmanb

Jillian;
Awesome, thanks so much. Here is a branch with where we're at right now:

https://github.com/chapmanb/bioconda-recipes/tree/deepvariant-compile/recipes/deepvariant

There's a lot of hacking in there to reference the conda python with pyclif, but that works; it should then get stuck on not detecting zlib during the htslib compile.

Let me know if you have any questions and thanks again for helping with this.

pichuan commented Aug 7, 2019

Hi all,
I know this was not fully resolved, but keeping it open forever doesn't seem very effective either.
I'm going to close it for now, but please feel free to comment here; I will continue to read and reply to anything posted.
If you have suggestions on how to re-engage this effort, feel free to let me know as well.
