Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Widen int to 64-bit signed integer in generated code #2381

Closed
bob-carpenter opened this issue Aug 14, 2017 · 16 comments
Closed

Widen int to 64-bit signed integer in generated code #2381

bob-carpenter opened this issue Aug 14, 2017 · 16 comments

Comments

@bob-carpenter
Copy link
Contributor

bob-carpenter commented Aug 14, 2017

Summary:

Replace all the uses of int in the generated code with long (see below).

Description:

The current restriction to integers is hobbling our ability to do things like negative binomial or even Poisson random variates and to represent some inputs.

Care will have to be taken to make sure 32-bit architectures use 64-bit integers here. I'm not sure where we're at with Windows on this issue. If all our potential platforms do not treat long as 64 bits, we'll need to use 64-bit specific types. In C++11, we can use int_64_t, but we could also use the Boost equivalent, though that's seriously clunky if we use it everywhere compared to the language standard.

Additional Information:

All of the integer argument functions in in stan-dev/math should also be updated to accept long inputs.

Current Version:

v2.16.0

@bob-carpenter bob-carpenter added this to the v3 milestone Aug 14, 2017
@bob-carpenter bob-carpenter self-assigned this Aug 14, 2017
@bob-carpenter bob-carpenter changed the title Widen int to long int in generated code Widen int to 64-bit signed integer in generated code Sep 22, 2017
@bgoodri
Copy link
Contributor

bgoodri commented Oct 22, 2017

I almost have Stan Math converted in a safe-ish way (On a Mac, you have to put '' after each -i in sed)

git checkout -b feature/int_64_t
git grep -F -l '(int' stan | xargs sed -i 's/(int/(int64_t/g'
git grep -F -l 'long int' stan | xargs sed -i 's/long int/int64_t/g'
git grep -F -l 'inline int' stan | xargs sed -i 's/inline int/inline int64_t/g'
git grep -F -l '<int' stan | xargs sed -i 's/<int/<int64_t/g'
git grep -F -l '<long,' stan | xargs sed -i 's/<long,/<int64_t,/g'
git grep -F -l 'int>' stan | xargs sed -i 's/int>/int64_t>/g'
git grep -F -l 'int&' stan | xargs sed -i 's/int\&/int64_t\&/g'
git grep -F -l ' int ' stan | xargs sed -i 's/ int / int64_t /g'
sed -i 's/var(long x) : /var(int x) : /g' stan/math/rev/core/var.hpp

# Fix overzealous replacements
git grep -F -l 'unsigned int64_t' stan | xargs sed -i 's/unsigned int64_t/unsigned int/g'
sed -i 's/int64_t/int/g' stan/math/rev/core/operator_unary_decrement.hpp
sed -i 's/int64_t/int/g' stan/math/rev/core/operator_unary_increment.hpp
sed -i 's/int64_t/int/g' stan/math/fwd/core/fvar.hpp
sed -i 's/int64_t/int/g' stan/math/rev/scal/fun/boost_fpclassify.hpp
sed -i 's/int64_t/int/g' stan/math/rev/mat/fun/Eigen_NumTraits.hpp

sed -i 's/int64_t/int/g' stan/math/rev/mat/fun/cholesky_decompose.hpp
sed -i 's/(int64_t/(int/g' stan/math/rev/mat/functor/cvodes_utils.hpp

git grep -F -l "template" | xargs sed -i -e '/template/ s/int64_t/int/g'
sed -i 's/int64_t/int/g' stan/math/prim/mat/fun/mdivide_left_tri.hpp
sed -i 's/int64_t/int/g' stan/math/prim/mat/fun/mdivide_right_tri.hpp
sed -i 's/int64_t/int/g' stan/math/prim/mat/fun/trace_gen_inv_quad_form_ldlt.hpp
sed -i 's/int64_t/int/g' stan/math/rev/mat/fun/trace_gen_inv_quad_form_ldlt.hpp
sed -i 's/int64_t/int/g' stan/math/rev/mat/fun/trace_gen_quad_form.hpp
sed -i 's/int64_t/int/g' stan/math/prim/mat/err/check_matching_dims.hpp
sed -i 's/int64_t/int/g' stan/math/prim/mat/fun/matrix_exp_pade.hpp
sed -i 's/int64_t/int/g' stan/math/prim/mat/fun/MatrixExponential.h
sed -i 's/static_cast<int64_t>/static_cast<int>/g' stan/math/rev/mat/functor/integrate_ode_bdf.hpp
git checkout stan/math/rev/mat/functor/cvodes_ode_data.hpp
sed -i 's/<int>/<int64_t>/g' stan/math/rev/mat/functor/cvodes_ode_data.hpp

# These have `inline` and `int` on separate lines as the return type
sed -i 's/inline/inline int64_t/g' stan/math/fwd/scal/fun/is_inf.hpp
sed -i '/int$/d' stan/math/fwd/scal/fun/is_inf.hpp
sed -i 's/inline/inline int64_t/g' stan/math/fwd/scal/fun/is_nan.hpp
sed -i '/int$/d' stan/math/fwd/scal/fun/is_nan.hpp
sed -i 's/inline/inline int64_t/g' stan/math/prim/scal/fun/logical_lt.hpp
sed -i '/int$/d' stan/math/prim/scal/fun/logical_lt.hpp

# Remove now duplicative stuff
git checkout stan/math/prim/scal/meta/ad_promotable.hpp
sed -i 's/A long may be promoted to a double./An int64_t may be promoted to a double/' stan/math/prim/scal/meta/ad_promotable.hpp
sed -i 's/<long, double>/<int64_t, double>/' stan/math/prim/scal/meta/ad_promotable.hpp

# Header stuff
git grep -F -l "int64_t" | xargs sed -i 's/^namespace stan/#include <cstdint>\n\nnamespace stan/'
sed -i 's/#define STAN_MATH_PRIM_MAT_FUN_EIGEN_HPP/#define STAN_MATH_PRIM_MAT_FUN_EIGEN_HPP\n#define EIGEN_DEFAULT_DENSE_INDEX_TYPE int64_t/' stan/math/prim/mat/fun/Eigen.hpp
sed -i '/Eigen.hpp/d' stan/math/prim/mat.hpp
sed -i 's/#define STAN_MATH_PRIM_MAT_HPP/#define STAN_MATH_PRIM_MAT_HPP\n\n#include <stan\/math\/prim\/mat\/fun\/Eigen.hpp>/' stan/math/prim/mat.hpp
sed -i 's/#define STAN_MATH_REV_MAT_HPP/#define STAN_MATH_REV_MAT_HPP\n\n#include <stan\/math\/prim\/mat\/fun\/Eigen.hpp>/' stan/math/rev/mat.hpp
sed -i 's/#define STAN_MATH_FWD_MAT_HPP/#define STAN_MATH_FWD_MAT_HPP\n\n#include <stan\/math\/prim\/mat\/fun\/Eigen.hpp>/' stan/math/fwd/mat.hpp

# tests
git grep -F -l '<int>' test | xargs sed -i 's/<int>/<int64_t>/g'
git grep -F -l "static double apply_base(int x) {" test/ | xargs sed -i 's/static double apply_base(int x) {/static double apply_base(int64_t x) {/'

The remaining occurrences of the "word" int can be found via

git grep -F -n -w "int" stan

@bob-carpenter
Copy link
Contributor Author

Wow, that's awesome. I think we should be able to patch this up pretty easily. Not sure how to verify that the tests do what we want, but I suppose we should just trust our unit and integration tests at this point. @syclik ---you've done a lot of this before---any ideas?

We'll have to cleanup the residuals---assigning an int64_t to an int will be just fine by C++, but we want to eliminate all of that. So I'm surprised Eigen is balking at all here.

I'm pretty sure Eigen uses ptrdiff_t for indexing by default, but that's one of those platform-dependent types, as opposed to int64_t (to which it resolves on most modern platforms). We might have some badly behaved uses of Eigen that the default won't catch---can't we define that in the makefile?

@syclik
Copy link
Member

syclik commented Oct 23, 2017 via email

@bgoodri
Copy link
Contributor

bgoodri commented Oct 25, 2017

Okay, I can trigger this outside of Stan by trying to compile

#define EIGEN_DEFAULT_DENSE_INDEX_TYPE int64_t

#include <Eigen/Dense>
#include <iostream>

template <int64_t R, int64_t C>
inline double dot_self(const Eigen::Matrix<double, R, C>& v) {
  return v.squaredNorm();
}

int main() {
  Eigen::VectorXd x(3);
  x.setRandom();
  std::cout << "norm = " << dot_self(x) << std::endl;
  return 0;
}

I get:

$ g++ -O0 -g -I/usr/include/eigen3 -o eigen_test.o eigen_test.cpp
eigen_test.cpp: In function ‘int main()’:
eigen_test.cpp:14:39: error: no matching function for call to ‘dot_self(Eigen::VectorXd&)’
   std::cout << "norm = " << dot_self(x) << std::endl;
                                       ^
eigen_test.cpp:7:15: note: candidate: template<long int R, long int C> double dot_self(const Eigen::Matrix<double, R, C>&)
 inline double dot_self(const Eigen::Matrix<double, R, C>& v) {
               ^
eigen_test.cpp:7:15: note:   template argument deduction/substitution failed:
eigen_test.cpp:14:39: note:   mismatched types ‘long int’ and ‘int’
   std::cout << "norm = " << dot_self(x) << std::endl;
                                       ^
eigen_test.cpp:14:39: note:   ‘Eigen::VectorXd {aka Eigen::Matrix<double, -1, 1>}’ is not derived from ‘const Eigen::Matrix<double, R, C>’

@bgoodri
Copy link
Contributor

bgoodri commented Oct 31, 2017

Upon further review, the problem seems to be although inserting

#define EIGEN_DEFAULT_DENSE_INDEX_TYPE int64_t

allows the rows and / or columns of a dynamically sized Eigen::Matrix to be int64_t, the constructor

Matrix<typename Scalar, int RowsAtCompileTime, int ColsAtCompileTime>

is hard-coded for ints and you cannot implicitly cast an int64_t to an int. Of course, RowsAtCompileTime and ColsAtCompileTime only take the values -1 or 1 in Stan, so I guess the thing to do is not change from int to int64_t in those templates?

@bob-carpenter
Copy link
Contributor Author

Yes, it sounds like the template types should remain int as you wrote in the second code block above. I hadn't even thought about casting in template parameter type matching.

@bgoodri
Copy link
Contributor

bgoodri commented Nov 1, 2017

This is almost working now. Specifically, make test-headers works. There are some test failures under test/unit/math/fwd and test/unit/math/prim/scal/fun pertaining to promotion. Such as

./stan/math/prim/scal/fun/log_modified_bessel_first_kind.hpp:241:22: error: call of overloaded ‘log1p(const int&)’ is ambiguous
         lgam += log1p(v);
                 ~~~~~^~~
In file included from ./stan/math/prim/scal/fun/log1m.hpp:4:0,
                 from ./stan/math/prim/scal/fun/binary_log_loss.hpp:4,
                 from ./stan/math/prim/scal.hpp:67,
                 from test/unit/math/prim/scal/prob/von_mises_log_test.cpp:1:
./stan/math/prim/scal/fun/log1p.hpp:27:19: note: candidate: double stan::math::log1p(double)
     inline double log1p(double x) {
                   ^~~~~
./stan/math/prim/scal/fun/log1p.hpp:40:19: note: candidate: double stan::math::log1p(int64_t)
     inline double log1p(int64_t x) {
                   ^~~~~

So, it looks like if you call a function with an int it does not know whether to promote it to a double or an int64_t. This can be resolved by deleting the function signature for inline double log1p(int64_t x) (and a bunch of similar specializations of functions that are defined in cmath), but as I recall, we recently put those in to avoid ambiguous behavior.

@bob-carpenter
Copy link
Contributor Author

bob-carpenter commented Nov 1, 2017 via email

@bgoodri
Copy link
Contributor

bgoodri commented Nov 1, 2017 via email

@bob-carpenter
Copy link
Contributor Author

bob-carpenter commented Nov 1, 2017 via email

@bgoodri
Copy link
Contributor

bgoodri commented Nov 1, 2017 via email

@bgoodri
Copy link
Contributor

bgoodri commented Nov 1, 2017

Also, there are issues converting an int to a fvar<>

In file included from ./stan/math/fwd/scal.hpp:58:0,
                 from ./stan/math/mix/scal.hpp:19,
                 from test/unit/math/mix/scal/fun/lmgamma_test.cpp:1:
./stan/math/fwd/scal/fun/lmgamma.hpp: In instantiation of ‘stan::math::fvar<typename stan::return_type<T, long int>::type> stan::math::lmgamma(int64_t, const stan::math::fvar<T>&) [with T = stan::math::fvar<stan::math::var>; typename stan::return_type<T, long int>::type = stan::math::fvar<stan::math::var>; int64_t = long int]’:
test/unit/math/mix/scal/fun/lmgamma_test.cpp:46:35:   required from here
./stan/math/fwd/scal/fun/lmgamma.hpp:19:9: error: conversion from ‘int’ to non-scalar type ‘stan::math::fvar<stan::math::var>’ requested
       T deriv = 0;
         ^~~~~

I think this has to do with the constructor not knowing how to promote:

      template <typename V>
      fvar(const V& v,
           typename boost::enable_if_c<ad_promotable<V, T>::value>::type*
           dummy = 0)
        : val_(v), d_(0.0) {
        if (unlikely(is_nan(v)))
          d_ = v;
      }

@bgoodri
Copy link
Contributor

bgoodri commented Nov 1, 2017

I think I fixed the ad_promotable thing.

@bob-carpenter
Copy link
Contributor Author

If `x` is an `int64_t`, then the compiler should know to promote that to a double because it can't demote that to an int.

With C++, you can go both ways:

  int64_t a = 10;
  int b = 5;
  a = b;
  b = a;

@bob-carpenter
Copy link
Contributor Author

I think I fixed the ad_promotable thing.
That'd be great. We need to do better doc and edge cases on all of these---@seantalts and @bbbales2 have been cleaning some of them up.

@rok-cesnovar
Copy link
Member

This is a code-gen issue that I am not sure is still relevant with stanc3 anymore, but feel like it is not. Please reopen and we can move it if you feel otherwise.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants