Skip to content
This repository has been archived by the owner on May 10, 2024. It is now read-only.

PARQUET-267: Detach thirdparty code from build configuration. #16

Closed
wants to merge 7 commits into from

Conversation

kalaxy
Copy link
Contributor

@kalaxy kalaxy commented Jan 8, 2016

This is based off of pull request #14.

@kalaxy kalaxy changed the title Detach thirdparty code from build configuration. PARQUET-267: Detach thirdparty code from build configuration. Jan 8, 2016
- cd build
- THRIFT_HOME=$HOME/local cmake ..
- cd $HOME/build_dir
- THRIFT_HOME=$HOME/build_dir/thirdparty/installed/ SNAPPY_HOME=$HOME/build_dir/thirdparty/installed/ LZ4_HOME=$HOME/build_dir/thirdparty/installed/ cmake $TRAVIS_BUILD_DIR
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is fine for now; at some point we can roll up this into a

./set-sandbox.env.sh

So if you want to use some other toolchain, you can do

./setup-my-toolchain.sh

which would be responsible for setting the right FOO_HOME variables (per the README instructions you added)

@kalaxy kalaxy force-pushed the libparquet-library-update-build branch from 72263e9 to d39fb4d Compare January 11, 2016 18:59
@kalaxy
Copy link
Contributor Author

kalaxy commented Jan 11, 2016

I've rebased off of the new master, supporting both mac and linux builds. Please have a look and let me know if you have other concerns.

@wesm, did I resolve your concern about the finding both static and dynamic libraries? These cmake modules support it, it just takes two calls.

@nongli
Copy link
Contributor

nongli commented Jan 12, 2016

I think it would be helpful to add something like this and update the README for the simple path (with some more error checking):

~/Projects/parquet-cpp $ cat setup.sh
#!/bin/bash

TP_DIR=$(cd "$(dirname "$BASH_SOURCE")"; pwd)

mkdir -p $TP_DIR/build
cp -r $TP_DIR/thirdparty $TP_DIR/build
cd $TP_DIR/build
./thirdparty/download_thirdparty.sh
./thirdparty/build_thirdparty.sh

export SNAPPY_HOME=$TP_DIR/build/thirdparty/installed
export LZ4_HOME=$TP_DIR/build/thirdparty/installed

cd $TP_DIR

I tried to build thrift with these steps and it didn't work (error below). If this works for you, I can try to track down what's wrong with my env (OSX). I'm also fine if we don't manage thrift in this patch.

n file included from src/thrift/concurrency/ThreadManager.cpp:22:
In file included from ./src/thrift/concurrency/ThreadManager.h:24:
./src/thrift/cxxfunctional.h:93:18: error: no member named 'bind' in namespace 'std'; did you mean 'find'?
    using ::std::bind;
          ~~~~~~~^~~~
                 find
/Library/Developer/CommandLineTools/usr/bin/../include/c++/v1/algorithm:862:1: note: 'find' declared here
find(_InputIterator __first, _InputIterator __last, const _Tp& __value_)
^
In file included from src/thrift/concurrency/ThreadManager.cpp:22:
./src/thrift/concurrency/ThreadManager.h:193:9: warning: class member cannot be redeclared [-Wredeclared-class-member]
  class Task;

@wesm
Copy link
Member

wesm commented Jan 12, 2016

@kalaxy it's be nice to adopt the approach that Kudu uses for finding both dynamic and static libraries and deciding which one to link outside the FindXXX.cmake files. See for example:

https://github.com/cloudera/kudu/blob/master/CMakeLists.txt#L702

It's pretty nice -- see the ADD_THIRDPARTY_LIB helper function. Ideally all of the FindXXX.cmake modules would find both kinds of libraries without having to call them twice.

Not necessary for right now but food for thought

@kalaxy
Copy link
Contributor Author

kalaxy commented Jan 12, 2016

@nongli, compiling thrift 0.9.1 on mac isn't very easy. You can see from homebrew's 0.9.1 recipe that they applied several patches to make it work. It also requires numerous dependencies. One in particular, openssl, is no longer shipped by default on OSX in the later versions of the OS. So even upgrading to the 0.9.3 would still require us to find (such as via homebrew paths) or install openssl ourselves on mac builds. As I've said before, it seems beyond the scope of a library to manage build dependencies like that. But, I understand that it seems pretty important to you. I think for now we might want to just disable thrift in the thridparty scripts on mac builds. (We could also disable the scripts on mac altogether since the alternative of brew install thrift lz4 snappy is so convenient and easy.)

I can add the setup script and update the readme. As far as more error checking, I can add checking for directory existance and set -e to ensure that any command failure causes the script to fail. Let me know if you were thinking of something more than that.

@kalaxy
Copy link
Contributor Author

kalaxy commented Jan 12, 2016

@wesm, yeah I you are convincing me that think finding both can make sense, especially since parquet-cpp will probably be using both. As a application writer you probably only care about one or the other and thus a generic module that branches finding the one you are interested in is probably good. But as a library writer, you probably want access to both for building a shared object library and a static object library, so why not have the cmake_module find both. If this were a generic module distributed with cmake, then I think going the first route is probably most standard, but as it is checked in and specific to this code then going the second route is more tailored to what this project needs.

In refinement, I think my main issue with the master cmake configuration is that it essentially hardcodes the thirdparty directory where both snappy and lz4 need to exist there. I wanted a module that could default to system libraries as well as individually configure lz4 and snappy to non-hard coded locations. I think I was trying to follow standard practices with respect to finding dynamic vs shared libraries, but maybe that was overkill for this project. I'll look into backing out the dynamic vs static changes as it seems more inline with this project's goals and add to this pull request.

@wesm
Copy link
Member

wesm commented Jan 12, 2016

@kalaxy sounds good and much appreciated

lz4 has been version bumped and is now downloaded via
download_thirdparty.sh

This also updates the lz4 api call since the checked in lz4 code
appears to have been extremely old.
@kalaxy kalaxy force-pushed the libparquet-library-update-build branch 2 times, most recently from 8dc9156 to f40f200 Compare January 12, 2016 22:44
@kalaxy
Copy link
Contributor Author

kalaxy commented Jan 12, 2016

I've updated the cmake_modules to find both dynamic and shared libraries. They are patterned after kudu's cmake_modules except they allow customization of where to find the libraries via env variables.

Also update travis ci to follow the same pattern as in the build
instructions.
@kalaxy kalaxy force-pushed the libparquet-library-update-build branch from f40f200 to d096c64 Compare January 12, 2016 22:53
@kalaxy
Copy link
Contributor Author

kalaxy commented Jan 12, 2016

@nongli, I took a closer look at your suggested script and other than adding set -e I'm not sure what error checking you were thinking of. Can you give some more guidance there?

@wesm
Copy link
Member

wesm commented Jan 12, 2016

@kalaxy awesome, looks great, no further comments here

# build thrift
if [ -n "$F_ALL" -o -n "$F_THRIFT" ]; then
if [ "$(uname)" == "Darwin" ]; then
echo "thrift compilation under OSX is not currenlty supported."
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this exit 1? Does this mean the set up just doesn't work oob on osx?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I wasn't quite sure how to handle this. I meant for the exit 1 to indicate that it couldn't install thift on OSX and if it were asked to do so then it would be an error. I suppose I could make it just a warning that doesn't fail the script. Does that sound better?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea, let's make this just a warning. We can turn it into an error if/when we building thrift always works.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good. I turned this into an exit success, unless thrift is specified explicitly. I figured an error would still be accurate in that case. Let me know what you think.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense to me.

@wesm
Copy link
Member

wesm commented Jan 19, 2016

Looks good here, @nongli?

@asfgit asfgit closed this in 0149a7a Jan 19, 2016
@nongli
Copy link
Contributor

nongli commented Jan 19, 2016

Merged. Thanks!

@kalaxy kalaxy deleted the libparquet-library-update-build branch January 20, 2016 17:03
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants