Rcpp::algorithm proposal #426

Closed
dcdillon opened this issue Jan 18, 2016 · 10 comments

@dcdillon
Contributor

A thought about how we could give the end user more flexibility with respect to sugar (and, additionally, reimplement sugar on top of the more flexible functions). This is a proposal for an addition, not for changing any existing functionality. Suppose we add an include file that looks something like this:

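#include <cmath>     // std::log
#include <iterator>  // std::iterator_traits

namespace Rcpp
{
namespace algorithm
{
    // Sum of the elements in [begin, end); an empty range yields 0.
    template< typename InputIterator >
    typename std::iterator_traits< InputIterator >::value_type sum(InputIterator begin, InputIterator end)
    {
        typedef typename std::iterator_traits< InputIterator >::value_type value_type;

        // Guard: dereferencing begin on an empty range is undefined behavior.
        if (begin == end) return value_type();

        value_type start = *begin++;

        while (begin != end)
        {
            start += *begin++;
        }

        return start;
    }

    // Product of the elements in [begin, end); an empty range yields 1.
    template< typename InputIterator >
    typename std::iterator_traits< InputIterator >::value_type prod(InputIterator begin, InputIterator end)
    {
        typedef typename std::iterator_traits< InputIterator >::value_type value_type;

        if (begin == end) return value_type(1);

        value_type start = *begin++;

        while (begin != end)
        {
            start *= *begin++;
        }

        return start;
    }

    // Writes the natural log of each element in [begin, end) to out.
    template< typename InputIterator, typename OutputIterator >
    void log(InputIterator begin, InputIterator end, OutputIterator out)
    {
        while (begin != end)
        {
            *out = std::log(*begin++);
            ++out;
        }
    }
}
}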

Then we can write functions that look like this:

#include <Rcpp.h>

// In mySum and myProd, begin and end are 1-based and inclusive, as in R.
// They map to the half-open iterator range [begin - 1, end).
// No bounds checking is done here.

// [[Rcpp::export]]
double mySum(Rcpp::NumericVector v, int begin, int end)
{
    return Rcpp::algorithm::sum(v.begin() + (begin - 1), v.begin() + end);
}

// [[Rcpp::export]]
double myProd(Rcpp::NumericVector v, int begin, int end)
{
    return Rcpp::algorithm::prod(v.begin() + (begin - 1), v.begin() + end);
}

// [[Rcpp::export]]
Rcpp::NumericVector myLog(Rcpp::NumericVector v)
{
    // clone() is used only to get an output vector of the right length.
    Rcpp::NumericVector x = Rcpp::clone(v);
    Rcpp::algorithm::log(v.begin(), v.end(), x.begin());
    return x;
}

These produce output like this:

> library(Rcpp)
> sourceCpp("test.cpp")
> mySum(1:4, 1, 3)
[1] 6
> myProd(1:4, 2, 4)
[1] 24
> myLog(1:4)
[1] 0.0000000 0.6931472 1.0986123 1.3862944
> 
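
The wrappers above trust the caller's indices. As a rough sketch of how the 1-based, inclusive bounds could be validated before they become an iterator range, here is a standalone version in plain C++ (no Rcpp). The name `checked_sum` and the use of `std::vector<double>` are assumptions made for illustration only, and `std::accumulate` stands in for the proposed `sum`.

```cpp
#include <cstddef>
#include <numeric>
#include <stdexcept>
#include <vector>

// Hypothetical helper, not part of Rcpp: sums elements begin..end
// (1-based, inclusive, as in R) after validating the bounds.
double checked_sum(const std::vector<double>& v, std::size_t begin, std::size_t end) {
    if (begin < 1 || end > v.size() || begin > end) {
        throw std::out_of_range("checked_sum: need 1 <= begin <= end <= length");
    }
    // 1-based inclusive [begin, end] maps to the half-open range [begin - 1, end).
    return std::accumulate(v.begin() + static_cast<std::ptrdiff_t>(begin - 1),
                           v.begin() + static_cast<std::ptrdiff_t>(end),
                           0.0);
}
```

With the vector 1, 2, 3, 4, `checked_sum(v, 1, 3)` gives 6, matching `mySum(1:4, 1, 3)` above, while `checked_sum(v, 3, 5)` throws instead of reading past the end.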

Currently, most sugar functions only work on objects that are an Rcpp::Vector. This excludes, for instance, Rcpp::Matrix::Row. Range-based algorithms would simply give the end user more flexibility without removing any of the current functionality.
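
To illustrate the flexibility argument without depending on Rcpp, here is a minimal standalone sketch in plain C++. One range-based sum works on both a whole container and a single row of a column-major matrix, the latter through a small strided iterator. The names `range_sum`, `StridedIter`, and `row_sum` are hypothetical and exist only for this example; the matrix is just a `std::vector<double>`.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <vector>

// Range-based sum in the spirit of the proposal (hypothetical name).
// An empty range yields a value-initialized result, i.e. 0 for numbers.
template <typename InputIterator>
typename std::iterator_traits<InputIterator>::value_type
range_sum(InputIterator first, InputIterator last) {
    typedef typename std::iterator_traits<InputIterator>::value_type value_type;
    value_type total = value_type();
    for (; first != last; ++first) total += *first;
    return total;
}

// Hypothetical read-only iterator that visits every stride-th element
// starting at base, which is how one row of a column-major matrix is laid out.
class StridedIter {
public:
    typedef std::input_iterator_tag iterator_category;
    typedef double value_type;
    typedef std::ptrdiff_t difference_type;
    typedef const double* pointer;
    typedef const double& reference;

    StridedIter(const double* base, std::size_t stride, std::size_t index)
        : base_(base), stride_(stride), index_(index) {}

    reference operator*() const { return base_[index_ * stride_]; }
    StridedIter& operator++() { ++index_; return *this; }
    StridedIter operator++(int) { StridedIter old(*this); ++index_; return old; }
    bool operator==(const StridedIter& other) const { return index_ == other.index_; }
    bool operator!=(const StridedIter& other) const { return index_ != other.index_; }

private:
    const double* base_;
    std::size_t stride_;
    std::size_t index_;
};

// Sums row `row` (0-based) of an nrow x ncol column-major matrix.
double row_sum(const std::vector<double>& data, std::size_t nrow,
               std::size_t ncol, std::size_t row) {
    assert(row < nrow && data.size() == nrow * ncol);
    const double* base = data.data() + row;
    return range_sum(StridedIter(base, nrow, 0), StridedIter(base, nrow, ncol));
}
```

For the 2 x 3 matrix stored as 1, 2, 3, 4, 5, 6, row 0 holds 1, 3, 5 and row 1 holds 2, 4, 6. The same `range_sum` template handles both rows and the full vector, because all it needs is a pair of iterators.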

@thirdwing
Member

👍

@dcdillon
Contributor Author

So I'll plan to take a look at this and put up a simple PR with a couple of algorithms sometime soon. We can then discuss which functions we would like to see implemented this way.

@eddelbuettel
Member

Sounds good to me.

@nathan-russell
Contributor

Mostly an implementation detail, but since some of the sugar operations (cummax, cummin, any, all, ...) need to keep state, would it make sense to have a dedicated namespace of function objects? For example, something like:

#include <Rcpp.h>

namespace algorithm {
namespace Functors {

template <typename T>
class Cummax {  // no std::unary_function base: it is deprecated in C++11 and removed in C++17
public:
  enum { RTYPE = Rcpp::traits::r_sexptype_traits<T>::rtype };

private:
  bool unset;
  bool na_seen;
  T current;

public:
  Cummax() : unset(true), na_seen(false) {}

  inline T operator()(T value) {
    if (na_seen) {
      return Rcpp::traits::get_na<RTYPE>();
    }

    if (Rcpp::traits::is_na<RTYPE>(value)) {
      na_seen = true;
      return Rcpp::traits::get_na<RTYPE>();
    }

    if (!unset) {
      current = (value > current) ? value : current;
      return current;
    }

    unset = false;
    current = value;
    return current;
  }
};

} // Functors

template <typename InputIt, typename OutputIt>
void cummax(InputIt first, InputIt last, OutputIt dest) {
  typedef typename std::iterator_traits<InputIt>::value_type type;
  std::transform(first, last, dest, Functors::Cummax<type>());
}

} // algorithm

// [[Rcpp::export]]
Rcpp::NumericVector myCummax(Rcpp::NumericVector xx) {
  Rcpp::NumericVector x = Rcpp::clone(xx);
  algorithm::cummax(xx.begin(), xx.end(), x.begin());
  return x;
}

/*** R
set.seed(123); xx <- rpois(7, 25)

all.equal(cummax(xx), myCummax(xx))
#[1] TRUE

xx[5] <- NA
all.equal(cummax(xx), myCummax(xx))
#[1] TRUE

*/
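
One caveat with a stateful functor is that it relies on being called once per element, in order, and `std::transform` does not guarantee in-order application. As a hedged alternative sketch, a running maximum can be written with `std::partial_sum`, which is defined as a left-to-right accumulation, so the state lives in the accumulator instead of the functor. This is plain C++ without Rcpp: NaN stands in for R's NA as a simplifying assumption, and `NanAwareMax`, `running_max`, and `running_max_copy` are hypothetical names.

```cpp
#include <cmath>
#include <numeric>
#include <vector>

// Stateless binary operation for a running maximum.
// Once a NaN has been seen, it is carried forward to every later output.
struct NanAwareMax {
    double operator()(double acc, double value) const {
        if (std::isnan(acc)) return acc;      // NaN already seen: keep it
        if (std::isnan(value)) return value;  // first NaN: start propagating
        return value > acc ? value : acc;
    }
};

// Writes the running maximum of [first, last) to dest and returns
// the iterator one past the last element written.
template <typename InputIt, typename OutputIt>
OutputIt running_max(InputIt first, InputIt last, OutputIt dest) {
    return std::partial_sum(first, last, dest, NanAwareMax());
}

// Convenience wrapper that returns a new vector.
std::vector<double> running_max_copy(const std::vector<double>& x) {
    std::vector<double> out(x.size());
    running_max(x.begin(), x.end(), out.begin());
    return out;
}
```

For the input 3, 1, 4, 1, 5 this yields 3, 3, 4, 4, 5. If the third element is NaN instead, the outputs from that position onward are all NaN, mirroring how the functor above returns NA once one has been seen.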

@dcdillon
Contributor Author

As opposed to just naming them things like cummax_helper or somesuch? I'm not quite sure why they need to be in a namespace, but I'm not opposed. Could you add a little more color?

@dcdillon
Contributor Author

Anyhow, we merged the first version of this. I should probably write up some documentation. Can anyone give me some direction on this? Otherwise it will just sit around unnoticed and unused. If people get some documentation and a chance to use it, there may be a reason to implement more things.

@eddelbuettel
Member

> Can anyone give me some direction on this?

How can we help? Do you want to know how to write vignettes? How to extend existing vignettes? Were you thinking of help pages? External examples in the Rcpp Gallery?

We have done any and all of the above.

@dcdillon
Contributor Author

Not sure...what's the best "promotional" place to put it? I think it's a good idea, but without some promo work, it will likely die a horrible death.

I'm willing to explain why it's a good thing to use, but I need a place where that is appropriate. Perhaps Rcpp Gallery?

@kevinushey
Contributor

I agree the Rcpp gallery would be a good place. https://github.com/RcppCore/rcpp-gallery/wiki/Contributing-to-the-Rcpp-Gallery should help you get started.

@coatless
Contributor

This issue was addressed in PR #481 (the second coming of PR #428).

Note: PR #503 addressed a long long type issue reported in #502 regarding clang.

Rcpp Gallery Post:

http://gallery.rcpp.org/articles/rcpp-algorithm/

Please close. (Tagged #506)
