Quanteda regression #161
Comments
|
Appreciate the super-quick reply, Ken! I suspected something exactly along those lines. It could be that our code is too simplistic, i.e. in R I learned to test via
Can you point us to where This was essentially a "regression test race condition", as we both made changes at the same time. But there is a lot more sparse matrix code on our side which will hopefully help you all too. |
|
I think it may be as simple as your
setClass("dfmSparse",
         contains = c("dfm", "dgCMatrix"))
not meshing with our side of
// Get the type of sparse matrix
std::string type = Rcpp::as<std::string>(mat.slot("class"));
if (type == "dgCMatrix") { |
|
It all works if we just make this change to RcppArmadillo:
modified inst/include/RcppArmadilloAs.h
@@ -97,7 +97,7 @@ namespace traits {
// Get the type of sparse matrix
std::string type = Rcpp::as<std::string>(mat.slot("class"));
- if (type == "dgCMatrix") {
+ if (type == "dgCMatrix" || type == "dfmSparse") {
IntegerVector i = mat.slot("i");
IntegerVector p = mat.slot("p");
Vector<RTYPE> x = mat.slot("x");
@dmbates Is there a more S4-ish trick which could be done from the client-package side? We can't possibly enumerate all possible S4 classes on our end, but we also shouldn't have to force people to stick with |
|
I think I have an idea:
Thoughts? |
|
Sounds like we get exactly that (from today's R-devel NEWS feed)
|
|
Realized (on train) that the shim class may create problems as it will likely affect behaviour once back in R too. So the better bet may be to go with
R> class(dfm1)
[1] "dfmSparse"
attr(,"package")
[1] "quanteda"
R> "dgCMatrix" %in% methods::is(dfm1) ## this does the Right Thing (TM)
[1] TRUE
R> |
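To make the difference between the two tests concrete, here is a minimal standalone C++ sketch (no Rcpp involved). It assumes the class lineage has already been obtained as a vector of names, the way `methods::is(dfm1)` returns it in R. The function names and the abbreviated lineage are hypothetical and only for illustration.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Exact comparison: true only when the object's own class (first entry)
// is `wanted`. This is the behaviour of `type == "dgCMatrix"`.
bool exact_match(const std::vector<std::string>& lineage,
                 const std::string& wanted) {
    return !lineage.empty() && lineage.front() == wanted;
}

// Membership test: true when `wanted` appears anywhere in the lineage,
// analogous to `"dgCMatrix" %in% methods::is(obj)` on the R side.
bool in_lineage(const std::vector<std::string>& lineage,
                const std::string& wanted) {
    return std::find(lineage.begin(), lineage.end(), wanted) != lineage.end();
}

// Hypothetical, abbreviated lineage for a dfmSparse object
// (own class first, then the classes it contains).
std::vector<std::string> dfm_sparse_lineage() {
    return {"dfmSparse", "dfm", "dgCMatrix"};
}
```

With this lineage, `exact_match(dfm_sparse_lineage(), "dgCMatrix")` is false while `in_lineage(dfm_sparse_lineage(), "dgCMatrix")` is true, which is the gap the exact string comparison falls into.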
|
Sorry, I don't understand your latest solution. Do you mean a minor modification on the |
|
We can access the line So I guess I will implement that, and we then change all your With the news from R-devel we probably have a chance to conditionally on R-devel (and then R 3.5.* next) do it in C code. And as the In short: we just generalize your comparisons. At least, that is my current plan ... but then I am also currently at work and not looking at this. |
|
The following works (which is a start ;-) but feels a little ... rushed. Should the helper function be somewhere else? In Rcpp?
modified inst/include/RcppArmadilloAs.h
@@ -25,6 +25,19 @@
namespace Rcpp{
+ inline bool isIt(const std::string cls, Rcpp::S4 s) {
+ static Rcpp::Environment mthds("package:methods");
+ static Rcpp::Function is = mthds["is"];
+
+ Rcpp::CharacterVector res = is(s);
+ bool itis = false;
+ for (int i=0; !itis && i<res.size(); i++) {
+ itis = strcmp(cls.c_str(), res[i]) == 0;
+ }
+ return itis;
+ }
+
+
namespace traits {
template <typename T>
@@ -97,7 +110,7 @@ namespace traits {
// Get the type of sparse matrix
std::string type = Rcpp::as<std::string>(mat.slot("class"));
- if (type == "dgCMatrix") {
+ if (type == "dgCMatrix" || isIt("dgCMatrix", mat)) {
IntegerVector i = mat.slot("i");
IntegerVector p = mat.slot("p");
Vector<RTYPE> x = mat.slot("x");
@@ -110,7 +123,7 @@ namespace traits {
std::copy(p.begin(), p.end(), arma::access::rwp(res.col_ptrs));
std::copy(x.begin(), x.end(), arma::access::rwp(res.values));
}
- else if (type == "dtCMatrix") {
+ else if (type == "dtCMatrix" || isIt("dtCMatrix", mat)) {
IntegerVector i = mat.slot("i");
IntegerVector p = mat.slot("p");
Vector<RTYPE> x = mat.slot("x");
@@ -128,7 +141,7 @@ namespace traits {
res.diag().ones();
}
}
- else if (type == "dsCMatrix") {
+ else if (type == "dsCMatrix" || isIt("dsCMatrix", mat)) {
IntegerVector i = mat.slot("i");
IntegerVector p = mat.slot("p");
Vector<RTYPE> x = mat.slot("x");
@@ -148,7 +161,7 @@ namespace traits {
res = symmatl(res);
}
}
- else if (type == "dgTMatrix") {
+ else if (type == "dgTMatrix" || isIt("dgTMatrix", mat)) {
IntegerVector ti = mat.slot("i");
IntegerVector tj = mat.slot("j");
Vector<RTYPE> tx = mat.slot("x");
@@ -200,7 +213,7 @@ namespace traits {
std::copy(p.begin(), p.end(), arma::access::rwp(res.col_ptrs));
std::copy(x.begin(), x.end(), arma::access::rwp(res.values));
}
- else if (type == "dtTMatrix") {
+ else if (type == "dtTMatrix" || isIt("dtTMatrix", mat)) {
IntegerVector ti = mat.slot("i");
IntegerVector tj = mat.slot("j");
Vector<RTYPE> tx = mat.slot("x");
@@ -257,7 +270,7 @@ namespace traits {
res.diag().ones();
}
}
- else if (type == "dsTMatrix") {
+ else if (type == "dsTMatrix" || isIt("dsTMatrix", mat)) {
IntegerVector ti = mat.slot("i");
IntegerVector tj = mat.slot("j");
Vector<RTYPE> tx = mat.slot("x");
@@ -316,7 +329,7 @@ namespace traits {
res = symmatl(res);
}
}
- else if (type == "dgRMatrix") {
+ else if (type == "dgRMatrix" || isIt("dgRMatrix", mat)) {
IntegerVector rj = mat.slot("j");
IntegerVector rp = mat.slot("p");
Vector<RTYPE> rx = mat.slot("x");
@@ -366,7 +379,7 @@ namespace traits {
std::copy(p.begin(), p.end(), arma::access::rwp(res.col_ptrs));
std::copy(x.begin(), x.end(), arma::access::rwp(res.values));
}
- else if (type == "dtRMatrix") {
+ else if (type == "dtRMatrix" || isIt("dtRMatrix", mat)) {
IntegerVector rj = mat.slot("j");
IntegerVector rp = mat.slot("p");
Vector<RTYPE> rx = mat.slot("x");
@@ -421,7 +434,7 @@ namespace traits {
res.diag().ones();
}
}
- else if (type == "dsRMatrix") {
+ else if (type == "dsRMatrix" || isIt("dsRMatrix", mat)) {
IntegerVector rj = mat.slot("j");
IntegerVector rp = mat.slot("p");
Vector<RTYPE> rx = mat.slot("x");
@@ -478,7 +491,7 @@ namespace traits {
res = symmatl(res);
}
}
- else if (type == "indMatrix") {
+ else if (type == "indMatrix" || isIt("indMatrix", mat)) {
std::vector<int> i;
IntegerVector p(ncol + 1);
IntegerVector x(nrow, 1);
@@ -519,7 +532,7 @@ namespace traits {
std::copy(p.begin(), p.end(), arma::access::rwp(res.col_ptrs));
std::copy(x.begin(), x.end(), arma::access::rwp(res.values));
}
- else if (type == "pMatrix") {
+ else if (type == "pMatrix" || isIt("pMatrix", mat)) {
std::vector<int> i;
IntegerVector p(ncol + 1);
IntegerVector x(ncol, 1);
@@ -550,7 +563,7 @@ namespace traits {
std::copy(p.begin(), p.end(), arma::access::rwp(res.col_ptrs));
std::copy(x.begin(), x.end(), arma::access::rwp(res.values));
}
- else if (type == "ddiMatrix") {
+ else if (type == "ddiMatrix" || isIt("ddiMatrix", mat)) {
std::vector<int> i;
std::vector<int> p;
std::vector<double> x;
|
|
It looks like S4 objects already do have an
#include <Rcpp.h>
using namespace Rcpp;
// [[Rcpp::export]]
bool cpp_inherits(S4 object, std::string clazz)
{
return object.is(clazz);
}
/*** R
library(Matrix)
m <- Matrix(1)
cpp_inherits(m, "dsyMatrix")
cpp_inherits(m, "Matrix")
*/ |
|
Crikey. Did I just reimplement that? |
|
More or less :-) The Rcpp code does something similar, but extracts the class definition directly and looks to see if the compared class is a known parent class. |
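The idea of checking whether the compared class is a known parent class can be sketched with a toy class table in plain C++. This is not Rcpp's actual code: `ClassTable`, `extends` and `example_table` are hypothetical names, and the table holds only the `dfmSparse` definition quoted earlier plus one illustrative Matrix parent.

```cpp
#include <map>
#include <string>
#include <vector>

// Toy class registry: each class name maps to its direct parents.
using ClassTable = std::map<std::string, std::vector<std::string>>;

// Returns true if `cls` is `target` or inherits from it, directly or
// indirectly. Assumes the table contains no inheritance cycles.
bool extends(const ClassTable& table, const std::string& cls,
             const std::string& target) {
    if (cls == target) return true;
    auto it = table.find(cls);
    if (it == table.end()) return false;  // no recorded parents
    for (const std::string& parent : it->second) {
        if (extends(table, parent, target)) return true;
    }
    return false;
}

// Illustrative, abbreviated table (an assumption for this sketch).
ClassTable example_table() {
    return {
        {"dfmSparse", {"dfm", "dgCMatrix"}},
        {"dgCMatrix", {"CsparseMatrix"}},
    };
}
```

Here `extends(example_table(), "dfmSparse", "dgCMatrix")` holds, while the reverse query does not, so a caster keyed on parent classes accepts client subclasses without having to enumerate them.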
|
And exactly what we needed. Thanks for the cluebat, Kevin :)
R> cpp_inherits(dfm1, "dgCMatrix") # after library(quanteda); example(dfm)
[1] TRUE
R> |
|
Well I didn't commit it so maybe it doesn't count, right? ;-) That'll make the right generalization for the test and is what we needed. All good. |
|
PR #162 which fixes this has been merged. |
Those watching this repo and hence seeing this probably also follow the rcpp-devel list.
The quanteda package by @kbenoit et al came up as it at times seems to fail tests for me / us / CRAN. Now that RcppArmadillo 0.7.960.1.0 is on CRAN, it is seen to let quanteda fail 14 times:
Most of these are of the "dfmSparse is not supported" variety, leading to our as<>() caster.
For testing, I downgraded to the previously released version 0.7.900.2.0 of RcppArmadillo -- and with it quanteda 0.99 (which itself only got onto CRAN after we submitted) passes.
@binxiangni Could you possibly take a quick look at whether either we (or quanteda) can accommodate this with a simple cast? I can take a look too but I am simply not that deep in the sparse code ...