quantile function is not available in sugar #967

rsparapa · 2019-06-04T19:46:32Z

First of all, Rcpp is amazing! After 10 years of working with it, I'm finally capable enough to make most of my wishes come true. One sugar function that is an obvious omission is quantile. I realize that we can't have every single function provided by R within Rcpp. But, this seems like an obvious low-hanging fruit. If someone could give me some pointers on what the best way to implement it is, then I would be happy to contribute a version to Rcpp. Thanks

eddelbuettel · 2019-06-04T19:51:35Z

Let's do it, maybe even for subsets of R functionality. Because in R there are nine (!!) ways to do it, see help(quantile) and look for type.

The R source is here: https://github.com/wch/r-source/blob/trunk/src/library/stats/R/quantile.R I guess we'd need to figure want parts you a) really need and b) which ones are actually worth redoing in C(++).

rsparapa · 2019-06-05T15:03:06Z

Ok, I have type=7 coded and it is trivial (several hours of debugging still counts as trivial, right ;o) Three other functions are needed which are potential sugar additions: floor, ceiling and sort (unless I missed them somehow). However, my template foo is insufficient to sweeten those so I have simple workarounds builtin. Here is a minimal working example...

library(Rcpp)

sourceCpp(code='
#include <Rcpp.h>

/*
// what am I doing wrong here?
template <typename T>
class Qfloor : public std::unary_function<T,T> {
public:
T operator()( T t) const { return std::floor(t) ; }
} ;

template <typename T>
class Qceiling : public std::unary_function<T,T> {
public:
T operator()( T t) const { return std::ceil(t) ; }
} ;
*/

Rcpp::IntegerVector Qfloor(Rcpp::NumericVector x) {
  size_t N=x.size();
  Rcpp::IntegerVector y(N);
  for(size_t i=0; i<N; ++i) y[i]=std::floor(x[i]);
  return y;
}

Rcpp::IntegerVector Qceiling(Rcpp::NumericVector x) {
  size_t N=x.size();
  Rcpp::IntegerVector y(N);
  for(size_t i=0; i<N; ++i) y[i]=std::ceil(x[i]);
  return y;
}

Rcpp::NumericVector Qsort(Rcpp::NumericVector x) {
  Rcpp::NumericVector y = Rcpp::clone(x);
  std::sort(y.begin(), y.end());
  return y;
}

// [[Rcpp::export]]
Rcpp::NumericVector Quantile(Rcpp::NumericVector x,
                             Rcpp::NumericVector probs) {
  // implementation of type 7
  const int n=x.size(), np=probs.size();
  if(n==0) return x;
  if(np==0) return probs;
  /*
    int _np=_probs.size(), np;
    np = _np==0 ? 5 : _np;
    Rcpp::NumericVector probs(np);
    if(_np==0) {
    probs[0]=0.00; probs[1]=0.25; probs[2]=0.50; probs[3]=0.75; probs[5]=1.00;
    }
    Rcpp::IntegerVector lo(Rcpp::sapply(index, Qfloor<double>() )), 
    hi(Rcpp::sapply(index, Qceiling<double>() )); 
  */
  Rcpp::NumericVector index=(n-1.)*probs, y=Qsort(x), x_hi(np), qs(np);
  Rcpp::IntegerVector lo(Qfloor(index)), hi(Qceiling(index));

  for(size_t i=0; i<np; ++i) {
    qs[i]=y[lo[i]];
    x_hi[i]=y[hi[i]];
    if((index[i]>lo[i]) && (x_hi[i] != qs[i])) {
      double h;
      h=index[i]-lo[i];
      qs[i]=(1.-h)*qs[i]+h*x_hi[i];
    }
  }

  return qs;
  //return Rcpp::wrap(qs);
}
')

.quantile <-
    function(x, probs = seq(0, 1, 0.25), na.rm = FALSE, names = TRUE,
             type = 7, ...)
{
    if(is.factor(x)) {
	if(is.ordered(x)) {
	   if(!any(type == c(1L, 3L)))
	       stop("'type' must be 1 or 3 for ordered factors")
	} else
            stop("factors are not allowed")
        lx <- levels(x)
    } else lx <- NULL
    if (na.rm)
	x <- x[!is.na(x)]
    else if (anyNA(x))
	stop("missing values and NaN's not allowed if 'na.rm' is FALSE")
    eps <- 100*.Machine$double.eps
    if (any((p.ok <- !is.na(probs)) & (probs < -eps | probs > 1+eps)))
	stop("'probs' outside [0,1]")
    n <- length(x)
    if(na.p <- any(!p.ok)) { # set aside NA & NaN
        o.pr <- probs
        probs <- probs[p.ok]
        probs <- pmax(0, pmin(1, probs)) # allow for slight overshoot
    }
    np <- length(probs)
    if (n > 0 && np > 0) {
        if(type == 7) { # be completely back-compatible
            index <- 1 + (n - 1) * probs
            lo <- floor(index)
            hi <- ceiling(index)
            x <- sort(x, partial = unique(c(lo, hi)))
            qs <- x[lo]
	    i <- which(index > lo & x[hi] != qs) # '!=' for '>' working w/ complex
	    h <- (index - lo)[i] # > 0	by construction
            ## print(c(h=h))
##	    qs[i] <- qs[i] + .minus(x[hi[i]], x[lo[i]]) * (index[i] - lo[i])
##	    qs[i] <- ifelse(h == 0, qs[i], (1 - h) * qs[i] + h * x[hi[i]])
	    qs[i] <- (1 - h) * qs[i] + h * x[hi[i]]
        } else {
            if (type <= 3) {
                ## Types 1, 2 and 3 are discontinuous sample qs.
                nppm <- if (type == 3) n * probs - .5 # n * probs + m; m = -0.5
                else n * probs          # m = 0
                j <- floor(nppm)
		h <- switch(type,
			    (nppm > j),		# type 1
			    ((nppm > j) + 1)/2, # type 2
			    (nppm != j) | ((j %% 2L) == 1L)) # type 3
            } else {
                ## Types 4 through 9 are continuous sample qs.
                switch(type - 3,
                   {a <- 0; b <- 1},    # type 4
                       a <- b <- 0.5,   # type 5
                       a <- b <- 0,     # type 6
                       a <- b <- 1,     # type 7 (unused here)
                       a <- b <- 1 / 3, # type 8
                       a <- b <- 3 / 8) # type 9
                ## need to watch for rounding errors here
                fuzz <- 4 * .Machine$double.eps
                nppm <- a + probs * (n + 1 - a - b) # n*probs + m
                j <- floor(nppm + fuzz) # m = a + probs*(1 - a - b)
                h <- nppm - j
                if(any(sml <- abs(h) < fuzz)) h[sml] <- 0
            }
            x <- sort(x, partial =
                      unique(c(1, j[j>0L & j<=n], (j+1)[j>0L & j<n], n))
                      )
            x <- c(x[1L], x[1L], x, x[n], x[n])
            ## h can be zero or one (types 1 to 3), and infinities matter
####        qs <- (1 - h) * x[j + 2] + h * x[j + 3]
            ## also h*x might be invalid ... e.g. Dates and ordered factors
            qs <- x[j+2L]
            qs[h == 1] <- x[j+3L][h == 1]
	    other <- (0 < h) & (h < 1) & (x[j+2L] != x[j+3L]) # '!=' for '<' in complex case
            if(any(other)) qs[other] <- ((1-h)*x[j+2L] + h*x[j+3L])[other]
        }
    } else {
	qs <- rep(NA_real_, np)
    }
    if(is.character(lx))
        qs <- factor(qs, levels = seq_along(lx), labels = lx, ordered = TRUE)
    if(names && np > 0L) {
	##names(qs) <- format_perc(probs)
    }
    if(na.p) { # do this more elegantly (?!)
        o.pr[p.ok] <- qs
        names(o.pr) <- rep("", length(o.pr)) # suppress <NA> names
        names(o.pr)[p.ok] <- names(qs)
        o.pr
    } else qs
}

N=100
P=10
set.seed(21)
x=c(matrix(runif(N/P), nrow=N/P, ncol=P))
A=quantile(x, (19:21)/100)  ## R
B=.quantile(x, (19:21)/100) ## quantile.default
C=Quantile(x, (19:21)/100)  ## Rcpp
all(A==B)
all(A==C)
all(B==C)
cbind(A, B, C)

coatless · 2019-06-05T15:13:27Z

Use:


```cpp
// your c++ code here

```

eddelbuettel · 2019-06-05T18:58:00Z

BTW floor and ceil exist in Rcpp sugar, sort() is (just to be different) a member function:

R> Rcpp::cppFunction("List foo(NumericVector x) { return List::create(floor(x), ceil(x), x.sort()); }")
R> foo(c(1.23, 4.56, 2.34))
[[1]]
[1] 1 2 4

[[2]]
[1] 2 3 5

[[3]]
[1] 1.23 2.34 4.56

R>

eddelbuettel · 2019-06-05T18:59:31Z

~~And a little googling got me this too: http://systematicinvestor.github.io/Run-Quantile-Rcpp~~

Or maybe not. Seems for focussed on running (moving) quantiles.

eddelbuettel · 2019-06-06T17:12:07Z

@rsparapa here is a shorter version of what you did, taking advantage of the sugar functions:

#include <Rcpp.h>

// [[Rcpp::export]]
Rcpp::NumericVector Quantile(Rcpp::NumericVector x, Rcpp::NumericVector probs) {
  // implementation of type 7
  const size_t n=x.size(), np=probs.size();
  if (n==0) return x;
  if (np==0) return probs;
  Rcpp::NumericVector index = (n-1.)*probs, y=x.sort(), x_hi(np), qs(np);
  Rcpp::NumericVector lo = Rcpp::floor(index), hi = Rcpp::ceiling(index);

  for (size_t i=0; i<np; ++i) {
    qs[i] = y[lo[i]];
    x_hi[i] = y[hi[i]];
    if ((index[i]>lo[i]) && (x_hi[i] != qs[i])) {
      double h;
      h = index[i]-lo[i];
      qs[i] = (1.-h)*qs[i] + h*x_hi[i];
    }
  }
  return qs;
}

/*** R
N <- 100
P <- 10
set.seed(21)
x <- matrix(runif(N/P), nrow=N/P, ncol=P)
A <- quantile(x, (19:21)/100)  ## R
B <- Quantile(x, (19:21)/100)  ## Rcpp
all(A==B)
*/

Which, thanks to cute /**** R regexp at the end, even runs the demo when built:

R> Rcpp::sourceCpp("/tmp/quantile.cpp")

R> N <- 100

R> P <- 10

R> set.seed(21)

R> x <- matrix(runif(N/P), nrow=N/P, ncol=P)

R> A <- quantile(x, (19:21)/100)  ## R

R> B <- Quantile(x, (19:21)/100)  ## Rcpp

R> all(A==B)
[1] TRUE
R>

eddelbuettel · 2019-06-06T17:13:00Z

Stills needs a char vector with names though to be fully comparable to the R version.

rsparapa · 2019-06-22T17:38:43Z

Ok, I've almost got it now. But, how can we add the % to the end of the names?

#include <Rcpp.h>

// [[Rcpp::export]]
Rcpp::NumericVector Quantile(Rcpp::NumericVector x, Rcpp::NumericVector probs) {
  // implementation of type 7
  const size_t n=x.size(), np=probs.size();
  if (n==0) return x;
  if (np==0) return probs;
  Rcpp::NumericVector index = (n-1.)*probs, y=x.sort(), x_hi(np), qs(np);
  Rcpp::NumericVector lo = Rcpp::floor(index), hi = Rcpp::ceiling(index);
  Rcpp::CharacterVector qs_names(np);

  for (size_t i=0; i<np; ++i) {
    size_t j;
    qs_names[i] = probs[i]*100.;
    j=qs_names[i].size();
    //j++;
    //qs_names[i].reserve(j);
    //qs_names[i][j]='%'; 
    qs[i] = y[lo[i]];
    x_hi[i] = y[hi[i]];
    if ((index[i]>lo[i]) && (x_hi[i] != qs[i])) {
      double h;
      h = index[i]-lo[i];
      qs[i] = (1.-h)*qs[i] + h*x_hi[i];
    }
  }

  qs.names()=qs_names;
  return qs;
}

/*** R
N <- 100
P <- 10
set.seed(21)
x <- matrix(runif(N/P), nrow=N/P, ncol=P)
A <- quantile(x, (19:21)/100)  ## R
B <- Quantile(x, (19:21)/100)  ## Rcpp
all(A==B)
names(A)
names(B)
*/

Which produces...

> all(A==B)
[1] TRUE

> names(A)
[1] "19%" "20%" "21%"

> names(B)
[1] "19" "20" "21"

eddelbuettel · 2019-06-22T17:39:41Z

Off the cuff, I'd say work with C++ std::string which has a nice + operator.

Else of course C char vectors, proper sizing, concat or sprintf etc pp.

eddelbuettel · 2019-06-22T17:53:03Z

Equally quick and dirty:

    qs_names[i] = std::to_string(static_cast<int>(probs[i]*100)) + std::string("%");

This truncates. Do you need / want round() instead? Also, this may imply C++11 which you may or may not want as a minimal level. There are C++98 alternatives to to_string ....

eddelbuettel · 2019-06-22T17:53:35Z

I forgot:

R> Rcpp::sourceCpp("/tmp/rodney.cpp")

R> N <- 100

R> P <- 10

R> set.seed(21)

R> x <- matrix(runif(N/P), nrow=N/P, ncol=P)

R> A <- quantile(x, (19:21)/100)  ## R

R> B <- Quantile(x, (19:21)/100)  ## Rcpp

R> all(A==B)
[1] TRUE

R> names(A)
[1] "19%" "20%" "21%"

R> names(B)
[1] "19%" "20%" "21%"
R>

rsparapa · 2019-06-22T18:44:34Z

C++11 is the lowest I go since vector is so useful. Admittedly, round() I had not considered. But, I see that R doesn't seem to have an obviously logical shortening...

> names(quantile(pi, probs=pi/100))
[1] "3.141593%"

eddelbuettel · 2019-06-22T18:49:06Z

std::vector is a more than a decade older than C++11 :) But yes, I personally have issues with requiring C++11 or newer now that CRAN supports it. Some packages now require C++14 ...

As for the digits, I (personally) would not obsess over matching R here. When I used to_string on a double it have too many digits but you always pick a substring or do something else.

rsparapa · 2019-06-22T20:24:23Z

I am sure that you right about C++11 and std::vector being around for awhile. I meant support for std::vector in R packages which requires C++11 support, right?

eddelbuettel · 2019-06-22T20:26:38Z

No. We used std::vector<double> in RQuantLib in 2002. On CRAN. I think I first saw it (via my advisor) about a decade earlier when the STL was still an SGI project (long before the HP acquisition). It did become available in g++ and visual c++ during the 1990s.

(We interfaced it in a brutish way into R, but all that is history ... and gave to rise to Rcpp. So all good).

rsparapa · 2019-06-22T20:39:53Z

Oh, I was misinformed ;o) So, getting back to rounding. I'm not sure how to do this. Is the round function available with sugar? I don't see it, but I couldn't find floor or ceiling so I'm obviously not looking in the right places. If it is in sugar, then I think this is easy. If not, then should we just add round to sugar too?

eddelbuettel · 2019-06-22T20:41:50Z

But of course it is :) See below:

R> Rcpp::cppFunction("NumericVector myround(NumericVector x, int d) { return(round(x, d)); }")
R> myround(1.23456, 2)
[1] 1.23
R>

Searching for functions is still ... suboptimal. The Doxygen interface is searchable, grep'ing in the tests works etc pp.

rsparapa · 2019-06-22T21:45:00Z

Ok, I've got it now...

#include <Rcpp.h>
// [[Rcpp::plugins(cpp11)]]
// [[Rcpp::export]]
Rcpp::NumericVector Quantile(Rcpp::NumericVector x, Rcpp::NumericVector probs) {
  // implementation of type 7
  const size_t n=x.size(), np=probs.size();
  if (n==0) return x;
  if (np==0) return probs;
  Rcpp::NumericVector index = (n-1.)*probs, y=x.sort(), x_hi(np), qs(np);
  Rcpp::NumericVector lo = Rcpp::floor(index), hi = Rcpp::ceiling(index);
  Rcpp::NumericVector probs_names=Rcpp::round(probs*100., 6);
  Rcpp::CharacterVector qs_names(np);
  std::string _name, zero=".000000";

  for (size_t i=0; i<np; ++i) {
    _name = std::to_string(static_cast<double>(probs_names[i]));
    if(_name=="100.000000") _name=std::string("100");
    else if(_name.substr(1)==zero) _name=_name.substr(0, 1);
    else if(_name.substr(2)==zero) _name=_name.substr(0, 2);
    else if(_name.substr(3)==zero.substr(2)) _name=_name.substr(0, 3);
    else if(_name.substr(4)==zero.substr(3)) _name=_name.substr(0, 4);
    else if(_name.substr(5)==zero.substr(4)) _name=_name.substr(0, 5);
    else if(_name.substr(6)==zero.substr(5)) _name=_name.substr(0, 6);
    else if(_name.substr(7)==zero.substr(6)) _name=_name.substr(0, 7);
    else if(_name.substr(4)==zero.substr(2)) _name=_name.substr(0, 4);
    else if(_name.substr(5)==zero.substr(3)) _name=_name.substr(0, 5);
    else if(_name.substr(6)==zero.substr(4)) _name=_name.substr(0, 6);
    else if(_name.substr(7)==zero.substr(5)) _name=_name.substr(0, 7);
    else if(_name.substr(8)==zero.substr(6)) _name=_name.substr(0, 8);
/*  // why does this not work?
    else {
      for(size_t j=3; j<8; ++j) {
	if(_name.substr(j)==zero.substr(j-1)) _name=_name.substr(0, j);
	else if(_name.substr(j+1)==zero.substr(j-1)) _name=_name.substr(0, j+1);
      }
    }
*/
    qs_names[i] = _name + std::string("%");
    qs[i] = y[lo[i]];
    x_hi[i] = y[hi[i]];
    if ((index[i]>lo[i]) && (x_hi[i] != qs[i])) {
      double h;
      h = index[i]-lo[i];
      qs[i] = (1.-h)*qs[i] + h*x_hi[i];
    }
  }

  qs.names()=qs_names;
  return qs;
}

/*** R
N <- 100
P <- 10
set.seed(21)
x <- matrix(runif(N/P), nrow=N/P, ncol=P)
A <- quantile(x, c(0, 1.9, 1.99999, pi, 18.9, 18.99999, 19:21, 100)/100)  ## R
B <- Quantile(x, c(0, 1.9, 1.99999, pi, 18.9, 18.99999, 19:21, 100)/100)  ## Rcpp
all(A==B)
all(names(A)==names(B))
names(A)
names(B)
*/

Yielding...

> A <- quantile(x, c(0, 1.9, 1.99999, pi, 18.9, 18.99999, 19:21, 100)/100)  ## R

> B <- Quantile(x, c(0, 1.9, 1.99999, pi, 18.9, 18.99999, 19:21, 100)/100)  ## Rcpp

> all(A==B)
[1] TRUE

> all(names(A)==names(B))
[1] TRUE

> names(A)
 [1] "0%"        "1.9%"      "1.99999%"  "3.141593%" "18.9%"     "18.99999%"
 [7] "19%"       "20%"       "21%"       "100%"     

> names(B)
 [1] "0%"        "1.9%"      "1.99999%"  "3.141593%" "18.9%"     "18.99999%"
 [7] "19%"       "20%"       "21%"       "100%"

rsparapa · 2019-07-09T18:17:40Z

In hindsight, the name attribute would be easier to create if the R function format() existed in Rcpp. Does it? I don't see it in the unitTests.

eddelbuettel · 2019-07-09T21:00:19Z

I have not had time to look closely at what you do there would sprintf() not be a candidate?

rsparapa · 2019-07-09T21:07:24Z

I never thought you would have recommended C ;o) I'm not familiar with sprintf, but I'm trying to wrap my head around the man page now...

rsparapa · 2019-07-10T15:18:12Z

Before trying sprintf(), I just ran what I had above. That was working, but now I'm seeing this error...

> Rcpp::sourceCpp("~/quantile.cpp")
quantile.cpp: In function ‘Rcpp::NumericVector Quantile(Rcpp::NumericVector, Rcpp::NumericVector)’:
quantile.cpp:17:13: error: ‘to_string’ is not a member of ‘std’
     _name = std::to_string(static_cast<double>(probs_names[i]));
             ^
make: *** [quantile.o] Error 1
g++  -I"/usr/local/R/3.5.0/lib64/R/include" -DNDEBUG   -I"/usr/local/R/3.5.0/lib64/R/library/Rcpp/include" -I"/home/rsparapa" -I/usr/local/include   -fpic  -g -O2 -mtune=native  -c quantile.cpp -o quantile.o
Error in Rcpp::sourceCpp("~/quantile.cpp") : 
  Error 1 occurred building shared library.

kevinushey · 2019-07-10T15:19:32Z

std::to_string() is only available with C++11 and above, so you'll have to instruct the compiler to use that standard.

rsparapa · 2019-07-10T15:48:44Z

Ah, got it! Thanks Kevin. Fixed above: why it worked before is a mystery. But, I just don't grok sprintf. I have never read a more confusing man page. They could not have come up with more inscrutable formats if they tried. And what the hell is "char *restrict s"? From the following lines, I get a crash...

char *s;
sprintf(s, "%-10f", static_cast<double>(probs_names[0]));

Program terminated with signal 11, Segmentation fault.
#0  0x00007f38f1de381c in __GI__IO_str_overflow (fp=0x7ffe22601cd0, c=48) at strops.c:146
146	    *fp->_IO_write_ptr++ = (unsigned char) c;

eddelbuettel · 2019-07-11T06:02:59Z

You need to allocate the memory to sprintf into first:

> cppFunction('std::string f() { char v[32]; sprintf(v, "foo%-8f", 1.234); return std::string(v); }')
> f()
[1] "foo1.234000"
>

rsparapa · 2019-07-11T17:11:51Z

Aarghh!!! I don't think sprintf works here. What you need to have is a format like this "%1$*2.*3$f" or this "%1$*2$.$*3$f". However, neither appear to be supported.

#include <Rcpp.h>
#include <stdio.h>
// [[Rcpp::plugins(cpp11)]]
// [[Rcpp::export]]
Rcpp::NumericVector Quantile(Rcpp::NumericVector x, Rcpp::NumericVector probs) {
  // implementation of type 7
  const size_t n=x.size(), np=probs.size();
  if (n==0) return x;
  if (np==0) return probs;
  Rcpp::NumericVector index = (n-1.)*probs, y=x.sort(), x_hi(np), qs(np);
  Rcpp::NumericVector lo = Rcpp::floor(index), hi = Rcpp::ceiling(index);
  Rcpp::NumericVector probs_names=Rcpp::round(probs*100., 6);
  Rcpp::CharacterVector qs_names(np);
  std::string _name, zero=".000000";
  char string[11];
  double probs_full, probs_trunc;

  for (size_t i=0; i<np; ++i) {
    probs_full=static_cast<double>(probs_names[i]);

    if(probs_full==100.) _name=std::string("100");
    else if(probs_full==0.) _name=std::string("0");
    else {
      int k=std::ceil(log10(probs_full));
      if(probs_full==std::floor(probs_full)) 
	sprintf(string, "%1$*2$.0f", probs_full, k);
      else for(size_t j=0; j<7; ++j) {
	  probs_trunc==(std::floor(probs_full*pow(10, j))/pow(10, j)); // the equal sign
	  if(probs_trunc=probs_full)                                   // and its evils
	    sprintf(string, "%1$*2.*3$f", probs_trunc, j+k+1, j);
	}
      _name=string;
    }

    qs_names[i] = _name + std::string("%");
    qs[i] = y[lo[i]];
    x_hi[i] = y[hi[i]];
    if ((index[i]>lo[i]) && (x_hi[i] != qs[i])) {
      double h;
      h = index[i]-lo[i];
      qs[i] = (1.-h)*qs[i] + h*x_hi[i];
    }
  }

  qs.names()=qs_names;
  return qs;
}

/*** R
N <- 100
P <- 10
set.seed(21)
x <- matrix(runif(N/P), nrow=N/P, ncol=P)
A <- quantile(x, c(0, 1.9, 1.99999, pi, 18.9, 18.99999, 19:21, 100)/100)  ## R
B <- Quantile(x, c(0, 1.9, 1.99999, pi, 18.9, 18.99999, 19:21, 100)/100)  ## Rcpp
all(A==B)
all(names(A)==names(B))
names(A)
names(B)
*/

which yields...

> all(names(A)==names(B))
[1] FALSE

> names(A)
 [1] "0%"        "1.9%"      "1.99999%"  "3.141593%" "18.9%"     "18.99999%"
 [7] "19%"       "20%"       "21%"       "100%"     

> names(B)
 [1] "0%"        "%82.*3$f%" "%82.*3$f%" "%82.*3$f%" "%92.*3$f%" "%92.*3$f%"
 [7] "19%"       "20%"       "21%"       "100%"

eddelbuettel · 2019-07-12T08:13:35Z

I would try to simplify things a little. Remember that *printf patterns can be tried on the command-line (man printf -- there is a Unix command !!) or in R (help(Rprint)). And as I said earlier, nobody forces you to exactly replicate the R names vector. And/or you could use an Rcpp::Function() to have R create the names for you.

The main thing appears to be solved: you have your quantiles calculation. So the final step of finishing off names is just a nice to have detail, no?

rsparapa · 2019-07-12T15:47:40Z

Sorry, I was just ranting about C. Thank god we have C++! Got it now...

#include <Rcpp.h>
#include <stdio.h>
#include <string>

// [[Rcpp::plugins(cpp11)]]
// [[Rcpp::export]]
Rcpp::NumericVector Quantile(Rcpp::NumericVector x, Rcpp::NumericVector probs) {
  // implementation of type 7
  const size_t n=x.size(), np=probs.size();
  if (n==0) return x;
  if (np==0) return probs;
  Rcpp::NumericVector index = (n-1.)*probs, y=x.sort(), x_hi(np), qs(np);
  Rcpp::NumericVector lo = Rcpp::floor(index), hi = Rcpp::ceiling(index);
  Rcpp::NumericVector probs_names=Rcpp::round(probs*100., 6);
  Rcpp::CharacterVector qs_names(np);
  std::string _name, format;
  char string[11], width[2], digits[2];
  double probs_full, probs_trunc;

  for (size_t i=0; i<np; ++i) {
    probs_full=static_cast<double>(probs_names[i]);

    if(probs_full==100.) _name=std::string("100");
    else if(probs_full==0.) _name=std::string("0");
    else {
      int k=std::ceil(log10(probs_full));
      if(probs_full==std::floor(probs_full)) {
	sprintf(string, "%1$*2$.0f", probs_full, k);
	_name=string;
      }
      else for(size_t j=6; j>0; --j) {
	  probs_trunc=(std::floor(probs_full*pow(10, j))/pow(10, j)); 
	  if(probs_trunc==probs_full) {
	    sprintf(width, "%1d", j+k+1);
	    sprintf(digits, "%1d", j);
	    format = std::string("%") + width +
	      std::string(".") + digits + std::string("f");
	    sprintf(string, format.c_str(), probs_trunc);
	    _name=string;
	    //_name=format;
	  }
	}
    }

    qs_names[i] = _name + std::string("%");
    qs[i] = y[lo[i]];
    x_hi[i] = y[hi[i]];
    if ((index[i]>lo[i]) && (x_hi[i] != qs[i])) {
      double h;
      h = index[i]-lo[i];
      qs[i] = (1.-h)*qs[i] + h*x_hi[i];
    }
  }

  qs.names()=qs_names;
  return qs;
}

/*** R
N <- 100
P <- 10
set.seed(21)
x <- matrix(runif(N/P), nrow=N/P, ncol=P)
A <- quantile(x, c(0, 1.9, 1.99999, pi, 18.9, 18.99999, 19:21, 100)/100)  ## R
B <- Quantile(x, c(0, 1.9, 1.99999, pi, 18.9, 18.99999, 19:21, 100)/100)  ## Rcpp
all(A==B)
all(names(A)==names(B))
names(A)
names(B)
*/

... abracadabra ...

> N <- 100

> P <- 10

> set.seed(21)

> x <- matrix(runif(N/P), nrow=N/P, ncol=P)

> A <- quantile(x, c(0, 1.9, 1.99999, pi, 18.9, 18.99999, 19:21, 100)/100)  ## R

> B <- Quantile(x, c(0, 1.9, 1.99999, pi, 18.9, 18.99999, 19:21, 100)/100)  ## Rcpp

> all(A==B)
[1] TRUE

> all(names(A)==names(B))
[1] TRUE

> names(A)
 [1] "0%"        "1.9%"      "1.99999%"  "3.141593%" "18.9%"     "18.99999%"
 [7] "19%"       "20%"       "21%"       "100%"     

> names(B)
 [1] "0%"        "1.9%"      "1.99999%"  "3.141593%" "18.9%"     "18.99999%"
 [7] "19%"       "20%"       "21%"       "100%"

github-actions · 2020-12-23T02:10:38Z

This issue is stale (365 days without activity) and will be closed in 31 days unless new activity is seen. Please feel free to re-open it is still a concern, possibly with additional data.

rsparapa · 2020-12-23T14:36:30Z

Issue implemented as shown above

github-actions bot added the no-issue-activity label Dec 23, 2020

rsparapa closed this as completed Dec 23, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

quantile function is not available in sugar #967

quantile function is not available in sugar #967

rsparapa commented Jun 4, 2019

eddelbuettel commented Jun 4, 2019

rsparapa commented Jun 5, 2019 •

edited by eddelbuettel

coatless commented Jun 5, 2019

eddelbuettel commented Jun 5, 2019

eddelbuettel commented Jun 5, 2019 •

edited

eddelbuettel commented Jun 6, 2019

eddelbuettel commented Jun 6, 2019

rsparapa commented Jun 22, 2019 •

edited

eddelbuettel commented Jun 22, 2019 •

edited

eddelbuettel commented Jun 22, 2019

eddelbuettel commented Jun 22, 2019

rsparapa commented Jun 22, 2019 •

edited

eddelbuettel commented Jun 22, 2019

rsparapa commented Jun 22, 2019

eddelbuettel commented Jun 22, 2019 •

edited

rsparapa commented Jun 22, 2019

eddelbuettel commented Jun 22, 2019 •

edited

rsparapa commented Jun 22, 2019 •

edited

rsparapa commented Jul 9, 2019

eddelbuettel commented Jul 9, 2019 •

edited

rsparapa commented Jul 9, 2019

rsparapa commented Jul 10, 2019

kevinushey commented Jul 10, 2019

rsparapa commented Jul 10, 2019

eddelbuettel commented Jul 11, 2019

rsparapa commented Jul 11, 2019 •

edited

eddelbuettel commented Jul 12, 2019

rsparapa commented Jul 12, 2019 •

edited

github-actions bot commented Dec 23, 2020

rsparapa commented Dec 23, 2020

quantile function is not available in sugar #967

quantile function is not available in sugar #967

Comments

rsparapa commented Jun 4, 2019

eddelbuettel commented Jun 4, 2019

rsparapa commented Jun 5, 2019 • edited by eddelbuettel

coatless commented Jun 5, 2019

eddelbuettel commented Jun 5, 2019

eddelbuettel commented Jun 5, 2019 • edited

eddelbuettel commented Jun 6, 2019

eddelbuettel commented Jun 6, 2019

rsparapa commented Jun 22, 2019 • edited

eddelbuettel commented Jun 22, 2019 • edited

eddelbuettel commented Jun 22, 2019

eddelbuettel commented Jun 22, 2019

rsparapa commented Jun 22, 2019 • edited

eddelbuettel commented Jun 22, 2019

rsparapa commented Jun 22, 2019

eddelbuettel commented Jun 22, 2019 • edited

rsparapa commented Jun 22, 2019

eddelbuettel commented Jun 22, 2019 • edited

rsparapa commented Jun 22, 2019 • edited

rsparapa commented Jul 9, 2019

eddelbuettel commented Jul 9, 2019 • edited

rsparapa commented Jul 9, 2019

rsparapa commented Jul 10, 2019

kevinushey commented Jul 10, 2019

rsparapa commented Jul 10, 2019

eddelbuettel commented Jul 11, 2019

rsparapa commented Jul 11, 2019 • edited

eddelbuettel commented Jul 12, 2019

rsparapa commented Jul 12, 2019 • edited

github-actions bot commented Dec 23, 2020

rsparapa commented Dec 23, 2020

rsparapa commented Jun 5, 2019 •

edited by eddelbuettel

eddelbuettel commented Jun 5, 2019 •

edited

rsparapa commented Jun 22, 2019 •

edited

eddelbuettel commented Jun 22, 2019 •

edited

rsparapa commented Jun 22, 2019 •

edited

eddelbuettel commented Jun 22, 2019 •

edited

eddelbuettel commented Jun 22, 2019 •

edited

rsparapa commented Jun 22, 2019 •

edited

eddelbuettel commented Jul 9, 2019 •

edited

rsparapa commented Jul 11, 2019 •

edited

rsparapa commented Jul 12, 2019 •

edited