Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rcpp is inconsistent with RInternals integers types #369

Closed
fplaza opened this issue Sep 8, 2015 · 5 comments
Closed

Rcpp is inconsistent with RInternals integers types #369

fplaza opened this issue Sep 8, 2015 · 5 comments

Comments

@fplaza
Copy link
Contributor

@fplaza fplaza commented Sep 8, 2015

Rcpp is inconsistent with RInternals integers types.
At some places in the code, 'size_t' and 'int' are used instead of 'R_xlen_t '

For all I know, it causes the following issues:

  • inability of creating in Rcpp a matrix with more than 2^31-1 elements(runtime error)
  • inability of querying a matrix with more than 2^31-1 (segfault caused by an integer overflow)

I proposed a time ago some patches #338
Some of them were subject of discussion because they changed the Rcpp interface.
However, some changes are mandatory to solve this issue:

Dimension::prod

From

inline int prod() const {
    return std::accumulate( dims.begin(), dims.end(), 1, std::multiplies<int>() ) ;
}

To

inline R_xlen_t prod() const {
    return std::accumulate( dims.begin(), dims.end(), static_cast<R_xlen_t>(1), std::multiplies<R_xlen_t>() )
}
Matrix::Matrix

From

    template <typename Iterator>
    Matrix( const int& nrows_, const int& ncols, Iterator start ) :
    VECTOR( start, start + (nrows_*ncols) ),

To

     template <typename Iterator>
     Matrix( const int& nrows_, const int& ncols, Iterator start ) :
     VECTOR( start, start + (static_cast<R_xlen_t>(nrows_)*ncols) ),
Matrix::offset

From

inline R_xlen_t offset( int i, int j) const { return i + nrows * j ; }

To

inline R_xlen_t offset(const int i, const int j) const { return i + static_cast<R_xlen_t>(nrows) * j ; }  
Vector::offset

From

size_t offset(const size_t& i, const size_t& j) const {
         if( !::Rf_isMatrix(Storage::get__()) ) throw not_a_matrix() ;

         int *dim = dims() ;
         size_t nrow = static_cast<size_t>(dim[0]) ;
         size_t ncol = static_cast<size_t>(dim[1]) ;

         if( i >= nrow || j >= ncol ) throw index_out_of_bounds() ;

         return i + nrow*j ;
}

To

R_xlen_t offset(const int& i, const int& j) const {
         if( !::Rf_isMatrix(Storage::get__()) ) throw not_a_matrix() ;

         const int* dim = dims() ;
         const int nrow = dim[0] ;
         const int ncol = dim[1] ;

         if(i < 0|| i >= nrow || j < 0 || j >= ncol ) throw index_out_of_bounds() ;

         return i + static_cast<R_xlen_t>(nrow)*j ;
}
Vector::offset

From

size_t offset(const size_t& i) const {
         if( static_cast<R_xlen_t>(i) >= ::Rf_xlength(Storage::get__()) ) throw index_out_of_bounds() ;

         return i ;
}

To

R_xlen_t offset(const R_xlen_t& i) const {
       if(i < 0 || i >= ::Rf_xlength(Storage::get__()) ) throw index_out_of_bounds() ;
         return i ;
}
@thirdwing
Copy link
Member

@thirdwing thirdwing commented Sep 8, 2015

I think first two are enough to fix your problem, right?

@fplaza
Copy link
Contributor Author

@fplaza fplaza commented Sep 8, 2015

The last is probably not necessary to solve my problem. However, this might solve an issue while calling the 'Vector::at( const size_t& i)' method with large vectors.

The third change is required because 'Matrix::at' calls 'Vector::offset '
EDIT: I just remembered that the third change is already merged (see PR #342)

@thirdwing
Copy link
Member

@thirdwing thirdwing commented Sep 8, 2015

Yes. Please set a new PR, which will make our discussion easier.

But I don't think I have access to a machine with enough memory for testing.

@eddelbuettel
Copy link
Member

@eddelbuettel eddelbuettel commented Sep 8, 2015

I do. But this needs more discussion / calibration against 'pure R'. I recall vaguely that 'rows time columns' is allowed at more than 2^31-1, hence R_xlen_t for counting, Each row / col index is still capped. Correct, or am I hallucinating?

@fplaza
Copy link
Contributor Author

@fplaza fplaza commented Sep 8, 2015

I guess you are correct

> v=numeric(2**31)
> length(v)
[1] 2147483648
> m=matrix(nrow=2**31, ncol=1)
Error in matrix(nrow = 2^31, ncol = 1) :
  invalid 'nrow' value (too large or NA)
In addition: Warning message:
In matrix(nrow = 2^31, ncol = 1) :
  NAs introduced by coercion to integer range

The changes are consistent with this behaviour.

@fplaza fplaza closed this Sep 24, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
3 participants
You can’t perform that action at this time.