New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Added Stan Math Support to Shogun #4216
Conversation
@@ -0,0 +1,18 @@ | |||
include(ExternalProject) | |||
ExternalProject_Add( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you should fix the commit sha, instead of following the master HEAD. basically set
GIT_TAG
and set it to the last stable release of stan
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I set up the GIT_TAG as required
src/shogun/CMakeLists.txt
Outdated
include(external/StanMath) | ||
SHOGUN_INCLUDE_DIRS(SCOPE PUBLIC SYSTEM | ||
$<BUILD_INTERFACE:${STAN_INCLUDE_DIR_STAN_MATH}> | ||
$<INSTALL_INTERFACE:include/shogun/lib/external/stan1> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what's the stan1
stan2
stan3
stands for?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed the names to be more descriptive.
great! there are some minor improvements needed |
@vigsterkr Done with the required improvements. |
@FaroukY lemme check why do the CIs fail |
@FaroukY does this work for you locally? maybe because boost is in your include paths? as you can see from here this does not compile: stan misses boost... lemme try to pull these commits and see locally what's happening and get back to you |
indeed the problem is that - i've just tested this with a machine that has boost installed and it worked fine... i'll test and try to come up with a fix in a clean docker container |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there's definitely a problem with INSTALL files missing.
src/shogun/CMakeLists.txt
Outdated
include(external/StanMath) | ||
SHOGUN_INCLUDE_DIRS(SCOPE PUBLIC SYSTEM | ||
$<BUILD_INTERFACE:${STAN_INCLUDE_DIR_STAN_MATH}> | ||
$<INSTALL_INTERFACE:include/shogun/lib/external/Stan> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
who and what will install those the folder here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i would actually start a new dir: include/shogun/third_party
and start putting stuff there that are external... we have too many things in lib
src/shogun/CMakeLists.txt
Outdated
) | ||
SHOGUN_INCLUDE_DIRS(SCOPE PUBLIC SYSTEM | ||
$<BUILD_INTERFACE:${STAN_INCLUDE_DIR_BOOST}> | ||
$<INSTALL_INTERFACE:include/shogun/lib/external/Stan_Boost> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
who and what will install those the folder here?
src/shogun/CMakeLists.txt
Outdated
) | ||
SHOGUN_INCLUDE_DIRS(SCOPE PUBLIC SYSTEM | ||
$<BUILD_INTERFACE:${STAN_INCLUDE_DIR_CVODES}> | ||
$<INSTALL_INTERFACE:include/shogun/lib/external/Stan_Cvodes> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
who and what will install those the folder here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you please rebase this over latest develop and change this to include/shogun/third_party/Stan_Cvodes
cmake/external/StanMath.cmake
Outdated
add_dependencies(libshogun StanMath) | ||
|
||
set(STAN_INCLUDE_DIR_STAN_MATH ${CMAKE_BINARY_DIR}/StanMath/src/StanMath) | ||
set(STAN_INCLUDE_DIR_BOOST ${CMAKE_BINARY_DIR}/StanMath/src/StanMath/lib/boost_1.66.0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this should be boost_1.64.0
@vigsterkr I think the problem is indeed that I had boost in my path. Can you share the Dockerfile that you used to test this? I want to make sure that I get it working on a general machine before the I push my commits. |
Since we can find weights from [1 1 1]^T to [1 1]^T in a perceptron, | ||
this error should be very close to zero after 100 epochs. | ||
*/ | ||
TEST(StanPerceptronTest, sample_perceptron) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
while I like this test, I think we should have something super simple. Like the gradient of the gaussian kernel wrt to bandwidth, left/right argument. Something that we know in closed form.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But please de-prioritise the test until all the cmake stuff is correct
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this test is very fine, lets concentrate on the actual inclusion of the lib
} | ||
|
||
//Error should be very close to 0.0 | ||
EXPECT_NEAR(last_error, 0.0, 1e-6); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this might fail randomly if the data is not separable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, but in this case, there is only one data point ( [1 1 1]^T) so the data is linearly separable (in an infinite number of ways) so it shouldn't fail no?
std::random_device rd{}; | ||
std::mt19937 gen{rd()}; | ||
normal_distribution<> d{0,1}; | ||
Matrix<var, 2, 3> W1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use N and D
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure what you mean here. What's N and D? Do you mean the variable names?
std::mt19937 gen{rd()}; | ||
normal_distribution<> d{0,1}; | ||
Matrix<var, 2, 3> W1; | ||
for(int i=0; i<2; ++i) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
auto
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will change it.
{ | ||
error += (outputs(i,0)-1)*(outputs(i,0)-1); //square error | ||
} | ||
error.grad(); //Calculate the gradient |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
superfluous comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will get rid of it.
Quite cool to have stan in. Let's get going with this patch! |
@FaroukY i've use the same docker image that is being used for travis CI jobs: |
@FaroukY basically the problems are the following:
|
@@ -0,0 +1,73 @@ | |||
#include <gtest/gtest.h> | |||
#include <stan/math.hpp> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i'm not so sure if this is the best way... namely that you are including stan like this.
meaning what if for some reason stan/math.hpp
is in the include path already... probably the -isystem
inclusion will take care of this, but maybe it'd be better that we explicitly specify in the include path that it's shogun's third-party include path... on the other hand probably this will actually break everything else within stan's headers....
it's more important actually that stan usage (or a model that is actually using stan) is being tested from a libshogun example. namely that the cmake target flags are set correctly when shogun is being used from a third-party lib, or just a simple CLI. note that it requires that either you:
- either include stan only in cpp files of shogun, or
- you expose the include flags in ShogunTarget.cmake; currently it's
PUBLIC SYSTEM
which should (? does it really? should be tested) reflect the correct exposure of include flags.... but i'm not so sure if we want to expose stan to the public, i would presume it's safe to think that we only use stan in the implementation of the models.
@FaroukY any update? |
@vigsterkr In progress. I just set up my environment with Docker (was new to it) and started changing the cmake and fixing the problems. I am a bit tied down these coming few days but I am working a bit on it everyday. Also was handling the other prs. |
@vigsterkr It appears that the only thing that was wrong here is that I miswrote the boost directory from 1.66 to 1.64. Once I changed 1.66 to 1.64, the build compiled and finished. For now, I will:
Also, it looks like there are two jobs in Travis which timed out (one during the testing phase) so we can assume the build worked for it. |
@FaroukY there's still a major part missing: installation of the Stan headers in the right folder. the tests are working because you rely on the fact that everything is available in build time. but once you ship this, the stan related directories are not going to be available as you dont copy them with |
@vigsterkr I am new to cmake so I'm a bit confused here. When I make and make install on my Docker machine, this seems to build and install correctly. What do you mean by "but once you ship this, the stan related directories are not going to be available" and how can I test this. Also, any guidance on what the next steps might be would be super helpful since I've been reading on Cmake for quite a few days now and I'm not sure where to go from here. |
@FaroukY ok so when you do |
@vigsterkr Yes, just tried it. I have the file directories |
@vigsterkr Actually it was my mistake. I forgot to commit the changes to this branch facepalm so now it has the Install commands. Should be okay now. Installs the 3 libraries into |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there has been some changes in develop... see my comments
src/shogun/CMakeLists.txt
Outdated
|
||
SHOGUN_INCLUDE_DIRS(SCOPE PUBLIC SYSTEM | ||
$<BUILD_INTERFACE:${STAN_INCLUDE_DIR_STAN_MATH}> | ||
$<INSTALL_INTERFACE:include/shogun/lib/external/Stan> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you please rebase this over latest develop and change this to include/shogun/third_party/Stan
src/shogun/CMakeLists.txt
Outdated
) | ||
SHOGUN_INCLUDE_DIRS(SCOPE PUBLIC SYSTEM | ||
$<BUILD_INTERFACE:${STAN_INCLUDE_DIR_BOOST}> | ||
$<INSTALL_INTERFACE:include/shogun/lib/external/Stan_Boost> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you please rebase this over latest develop and change this to include/shogun/third_party/Stan_Boost
src/shogun/CMakeLists.txt
Outdated
) | ||
SHOGUN_INCLUDE_DIRS(SCOPE PUBLIC SYSTEM | ||
$<BUILD_INTERFACE:${STAN_INCLUDE_DIR_CVODES}> | ||
$<INSTALL_INTERFACE:include/shogun/lib/external/Stan_Cvodes> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you please rebase this over latest develop and change this to include/shogun/third_party/Stan_Cvodes
cmake/external/StanMath.cmake
Outdated
set(STAN_INCLUDE_DIR_CVODES ${CMAKE_BINARY_DIR}/StanMath/src/StanMath/lib/cvodes_2.9.0/include) | ||
|
||
|
||
INSTALL( DIRECTORY ${STAN_INCLUDE_DIR_STAN_MATH} DESTINATION include/shogun/lib/external/Stan ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
instead of this you should actually copy these files into ${THIRD_PARTY_INCLUDE_DIR}/Stan
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remove these installs.... if you do a cmake copy thing then you will not have problems with the logs in travis.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
check how COMMAND ${CMAKE_COMMAND} -E copy_if_different
is being used in shogun
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
and the best place to do this copy would be ExternalProject_Add
's INSTALL_COMMAND
@vigsterkr Did the required changes and its not rebased with develop. However, I had a look at travis and its failed because the log messages of installing Stan is exceeding 4Mb (Since there are many files that are being installed). But the build itself is building fine. |
@FaroukY yeah first we need a rebase over develop and then fix those things i've mentioned i'll think about how to fix travis limitations. |
… the internal boost and cvodes that stan uses
@vigsterkr Did the required changes and the problem is not there anymore. However, I'm failing one of the tests: |
@FaroukY the segfault is unrelated, there's currently a bug in develop branch, so we can ignore that |
@FaroukY please check/fix the errors that clang-format reports, see: https://travis-ci.org/shogun-toolbox/shogun/jobs/375012262#L722 |
@vigsterkr Done, clang-format passes now. |
In this example, we want to learn the weights W such that the square | ||
Error loss from the output of the perceptron to [1 1]^T is minimized. | ||
Since we can find weights from [1 1 1]^T to [1 1]^T in a perceptron, | ||
this error should be very close to zero after 100 epochs. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Neat job documenting!
std::mt19937 gen{rd()}; | ||
normal_distribution<> d{0, 1}; | ||
Matrix<var, 2, 3> W1; | ||
for (int i = 0; i < 2; ++i) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No need to update, just informational. A small improvement would be to query the matrix dimensions instead of hardcoding them in the loops.
@FaroukY perfect! merging! |
I've added the ability to now use the Stan Math library in Shogun. I've also written a unit test to make sure it compiles and works properly (The unit test is a perceptron learning algorithm with gradient descent)
The CmakeLists.txt was changed to add the include directories of stan headers, and the last file is the unit test.