diff --git a/DESCRIPTION b/DESCRIPTION index 8d1c366..3501a90 100644 --- a/DESCRIPTION +++ b/DESCRIPTION @@ -1,18 +1,20 @@ Package: frab Type: Package -Title: An Alternative Interpretation of Named Vectors -Version: 0.0-1 +Title: How to Add Two Tables +Version: 0.0-3 Authors@R: person(given=c("Robin", "K. S."), family="Hankin", role = c("aut","cre"), email="hankin.robin@gmail.com", comment = c(ORCID = "0000-0001-5982-0415")) Maintainer: Robin K. S. Hankin -Description: An alternative interpretation of named vectors as - generalized tables, so that c(a=1,b=2,c=3) + c(b=3,a=-1) will - return c(b=5,c=3). Uses 'disordR' discipline (Hankin, 2022, - ). Extraction and replacement - methods are provided. The underlying mathematical structure is - the Free Abelian group, hence the name. +Description: Methods to "add" two tables; also an alternative + interpretation of named vectors as generalized tables, so that + c(a=1,b=2,c=3) + c(b=3,a=-1) will return c(b=5,c=3). Uses + 'disordR' discipline (Hankin, 2022, ). + Extraction and replacement methods are provided. The underlying + mathematical structure is the Free Abelian group, hence the name. + To cite in publications please use Hankin (2023) + . License: GPL (>= 2) Depends: R (>= 3.5.0) -Suggests: knitr, markdown, rmarkdown, testthat +Suggests: knitr, markdown, rmarkdown, testthat, mvtnorm VignetteBuilder: knitr Imports: Rcpp (>= 1.0-7), mathjaxr, disordR (>= 0.9-8-1), methods LinkingTo: Rcpp @@ -20,7 +22,7 @@ URL: https://github.com/RobinHankin/frab BugReports: https://github.com/RobinHankin/frab RdMacros: mathjaxr NeedsCompilation: yes -Packaged: 2023-07-19 21:37:40 UTC; rhankin +Packaged: 2023-08-15 23:57:04 UTC; rhankin Author: Robin K. S. Hankin [aut, cre] () Repository: CRAN -Date/Publication: 2023-07-20 11:20:02 UTC +Date/Publication: 2023-08-16 09:02:35 UTC diff --git a/MD5 b/MD5 index c0d98dd..210c267 100644 --- a/MD5 +++ b/MD5 @@ -1,34 +1,44 @@ -5061f6385aff6a83e8baad22fc5faa8d *DESCRIPTION +c5ce79e255519bae4ff515c8d28f4f71 *DESCRIPTION 55e0cf44c1a5c0e1ea52ca6bd526b242 *NAMESPACE -868be7dc206df4f129d20419aad0a976 *R/RcppExports.R -5fbaabf29578c25827e6b0e4769c9afe *R/frab.R -61ae6ae99ef7a8c14b6b97185f446c71 *README.md -b1870707d2b690351f4c7f515f90a24f *build/frab.pdf -68865b88cfa1f13c38ca79af192d28ff *build/partial.rdb -160d6dfb46956e52b633848573847aed *build/stage23.rdb -06c8dac28c5e939c6af6592635f694e3 *build/vignette.rds -d385e72c2f60e980006f1ec48eb3b4a1 *inst/doc/frab.R -f787f9bf341b82267132833e40263a09 *inst/doc/frab.Rmd -5d019a465de919645b02f3a3847cc906 *inst/doc/frab.html -3b9e3ed50c0b8ffa0c7d5cc793a7dd4f *man/Arith.Rd -5c51a0d12de2c5a2909271a3d1975d39 *man/Compare.Rd -7ed1312f15bad6ae61b053f88902cd84 *man/Extract.Rd +2f8bd80de715cde904dfc8430a47cb85 *NEWS.md +ff2988afb5dc1fad914fc6f1b90585cf *R/RcppExports.R +5e92e3d8d9847f5c3b0c59077d27b803 *R/frab.R +5ea61be7cb5e3c72236773ccd15a69b0 *R/sparsetable.R +d3efe4d379e4985a05ae467269ff8b5e *README.md +f36a9abac32324778b7ce1aecc16cd80 *build/frab.pdf +5b34e07bb2f5f9fac47e07ba59581ef0 *build/partial.rdb +0ac9cace5052ba8cbcb6a029f6a8eafe *build/stage23.rdb +ee23a57ddc61e60b2ca2b956a157605d *build/vignette.rds +61cd24468278a2c74bea8ae24c335166 *inst/doc/frab.R +0cff9a2f802561af620671db34a793b6 *inst/doc/frab.Rmd +1c4110d8c1af85b8053a38022c10d268 *inst/doc/frab.html +47a0055c494ebee78647cc56b8f5cadf *inst/frab.bib +f0465c6b8a13269fab9d99f8aa266527 *inst/frab_arxiv.Rnw +d29784da0b905da88244b5a5c40f2fdc *inst/read.me +a5aac65fb29bf3430f1e7e8b7f717749 *inst/wittgenstein.Rmd +316dd6430416f86fdc133737ce7c34af *man/Arith.Rd +308adc135b1aad80fd826a6a9fb7ae0c *man/Compare.Rd +190453a31e19248c587d5706c73ecb1f *man/Extract.Rd 0e38107b0ebeccecfc57cab27e7529fd *man/figures/frab.png -9a60484529733e2ab3c431106041f1a6 *man/frab-class.Rd +ceaca23242a1baa9d8fd2dc3a93b6163 *man/frab-class.Rd 300bf5b585a4b58d36105cbadb306092 *man/frab-package.Rd -dd68f9458a0069b5023c88198550cd72 *man/frab.Rd -d868cd2cfe4b8250f5c8b44c33a4dd6c *man/misc.Rd +05c3919de40c5344a16a2acd78d2eb48 *man/frab.Rd +9098c88fb16e6152646639f9c5486858 *man/misc.Rd 2966ded66d965c0c31c9c3eca0a2a916 *man/namedvector.Rd 680350e287b311cc55c30fed88cb9ed3 *man/pmax.Rd 7026a022f968605cdf89f2ac997f6979 *man/print.Rd 64ff6a2224e515d260be98513c02c97f *man/rfrab.Rd +123d72827e9188318b9243c9cb31160c *man/sparsetable.Rd 6f249b0f0c53fddd870854e344936e17 *man/table.Rd 89f8675cf461379a8c0f224f4917b4b4 *man/zero.Rd -616ea317e88c1aac6cedb199c835e5ec *src/RcppExports.cpp -04dc8cbd76b5bcc8a858ec91dcc5a44b *src/frab.cpp +d78e2af1f3ebf698083a133f355afd9e *src/RcppExports.cpp +f6b5e8fed1acee228a850a03c75a9a5b *src/frab.cpp 2d0eddd0677a2b902568cdacdbf125cb *src/frab.h +47f31a21c0cd8fc66314309354ea187f *src/sparsetable_ops.cpp 9e68b0a4b1df7d52541fa46fff582d29 *tests/testthat.R -d645568bd1aa4d8ecd32857541701db4 *tests/testthat/test_aaa.R -3ff3363c5e04cb4d05ca9809881db10c *tests/testthat/test_aab.R -f787f9bf341b82267132833e40263a09 *vignettes/frab.Rmd -aa694962e8901d679115abaf2386f83e *vignettes/frab.bib +3e79d52355dae72090cf1eb0ec4b5902 *tests/testthat/test_aaa.R +b7f4a07171618298ad285c1502062e95 *tests/testthat/test_aab.R +6358c120726711bbe0edcd3cc4030cba *tests/testthat/test_aac.R +a82f9f28656912b8f25fd647d5238c0c *tests/testthat/test_aad.R +0cff9a2f802561af620671db34a793b6 *vignettes/frab.Rmd +700ee49f464ee71ac60824b6b60955e1 *vignettes/frab.bib diff --git a/NEWS.md b/NEWS.md new file mode 100644 index 0000000..d583f09 --- /dev/null +++ b/NEWS.md @@ -0,0 +1,15 @@ +# frab 0.0-1 + +- initial release + + +# frab 0.0-2 + +- multiplication method +- new arxiv preprint +- new draft vignette in inst/ + +# frab 0.0-3 + +- sparsetable functionality +- near complete test coverage \ No newline at end of file diff --git a/R/RcppExports.R b/R/RcppExports.R index c9490da..e8a6e0d 100644 --- a/R/RcppExports.R +++ b/R/RcppExports.R @@ -9,6 +9,10 @@ c_frab_add <- function(names1, values1, names2, values2) { .Call(`_frab_c_frab_add`, names1, values1, names2, values2) } +c_frab_multiply <- function(names1, values1, names2, values2) { + .Call(`_frab_c_frab_multiply`, names1, values1, names2, values2) +} + c_frab_pmax <- function(names1, values1, names2, values2) { .Call(`_frab_c_frab_pmax`, names1, values1, names2, values2) } @@ -17,3 +21,35 @@ c_frab_eq <- function(names1, values1, names2, values2) { .Call(`_frab_c_frab_eq`, names1, values1, names2, values2) } +sparsetable_maker <- function(M, d) { + .Call(`_frab_sparsetable_maker`, M, d) +} + +sparsetable_add <- function(M1, d1, M2, d2) { + .Call(`_frab_sparsetable_add`, M1, d1, M2, d2) +} + +sparsetable_overwrite <- function(M1, d1, M2, d2) { + .Call(`_frab_sparsetable_overwrite`, M1, d1, M2, d2) +} + +sparsetable_accessor <- function(M, d, Mindex) { + .Call(`_frab_sparsetable_accessor`, M, d, Mindex) +} + +sparsetable_setter <- function(M1, d1, M2, d2) { + .Call(`_frab_sparsetable_setter`, M1, d1, M2, d2) +} + +sparsetable_equality <- function(M1, d1, M2, d2) { + .Call(`_frab_sparsetable_equality`, M1, d1, M2, d2) +} + +sparsetable_asum_include <- function(M, d, n) { + .Call(`_frab_sparsetable_asum_include`, M, d, n) +} + +sparsetable_pmax <- function(M1, d1, M2, d2) { + .Call(`_frab_sparsetable_pmax`, M1, d1, M2, d2) +} + diff --git a/R/frab.R b/R/frab.R index e35f594..c6c0216 100644 --- a/R/frab.R +++ b/R/frab.R @@ -8,6 +8,19 @@ setMethod("names","frab", return(disord(names(x@x),h=hashcal(x@x))) } ) +setReplaceMethod("names",signature(x="frab",value="disord"), + function(x,value){ + v <- values(x) + stopifnot(consistent(v,value)) + frab(setNames(elements(v),elements(value))) + } ) + +setReplaceMethod("names",signature(x="frab",value="character"), + function(x,value){ + stopifnot(length(x)==1) + frab(setNames(values(x),value)) + } ) + setGeneric("values",function(x){standardGeneric("values")}) setMethod("values","frab", function(x){ @@ -67,6 +80,8 @@ setMethod("as.table","frab",function(x,...){structure(as.namedvector(x),dim=leng return(list_to_frab(x)) } else if(is.1dtable(x)){ return(table_to_frab(x)) + } else if(is.sparsetable(x)){ + return(sparsetable_to_frab(x)) } else if(is.frab(x)){ return(x) } else { @@ -80,11 +95,16 @@ setMethod("as.table","frab",function(x,...){structure(as.namedvector(x),dim=leng as.frab(c_frab_add(elements(names(F1)), elements(values(F1)), elements(names(F2)), elements(values(F2)))) } - -`frab_multiply_numeric` <- function(e1,e2){frab(setNames(elements(values(e1)*e2),elements(names(e1))))} -`frab_power_numeric` <- function(e1,e2){frab(setNames(elements(values(e1)^e2),elements(names(e1))))} -`numeric_multiply_frab` <- function(e1,e2){frab(setNames(elements(e1*values(e2)),elements(names(e2))))} -`numeric_power_frab` <- function(e1,e2){frab(setNames(elements(e1^values(e2)),elements(names(e2))))} + +`frab_multiply_frab` <- function(F1,F2){ + as.frab(c_frab_multiply(elements(names(F1)), elements(values(F1)), + elements(names(F2)), elements(values(F2)))) +} + +`frab_plus_numeric` <- function(e1,e2){if(is.namedvector(e2)){return(e1+frab(e2))}else{return(frab(setNames(elements(values(e1)+e2),elements(names(e1)))))}} +`frab_multiply_numeric` <- function(e1,e2){if(is.namedvector(e2)){stop("not defined")}else{return(frab(setNames(elements(values(e1)*e2),elements(names(e1)))))}} +`frab_power_numeric` <- function(e1,e2){if(is.namedvector(e2)){stop("not defined")}else{return(frab(setNames(elements(values(e1)^e2),elements(names(e1)))))}} +`numeric_power_frab` <- function(e1,e2){stop(" ^ not defined")} `frab_unary` <- function(e1,e2){ switch(.Generic, @@ -100,13 +120,14 @@ setMethod("as.table","frab",function(x,...){structure(as.namedvector(x),dim=leng switch(.Generic, "+" = frab_plus_frab(e1, e2), "-" = frab_plus_frab(e1, frab_negative(e2)), + "*" = frab_multiply_frab(e1, e2), stop(gettextf("binary operator %s not implemented on frabs", dQuote(.Generic))) ) } `frab_arith_numeric` <- function(e1,e2){ # e1 frab, e2 numeric; e2 might be a named vector. switch(.Generic, - "+" = frab_plus_frab(e1, as.frab(e2)), - "-" = frab_plus_frab(e1, frab_negative(as.frab(e2))), + "+" = frab_plus_numeric(e1, e2), + "-" = frab_plus_numeric(e1, -e2), "*" = frab_multiply_numeric(e1,e2), "/" = frab_multiply_numeric(e1,1/e2), "^" = frab_power_numeric(e1,e2), @@ -115,10 +136,10 @@ setMethod("as.table","frab",function(x,...){structure(as.namedvector(x),dim=leng `numeric_arith_frab` <- function(e1,e2){ # e1 numeric, e2 frab; e2 _might_ be a named vector. switch(.Generic, - "+" = frab_plus_frab(as.frab(e1), e2), - "-" = frab_plus_frab(as.frab(e1), -e2), - "*" = numeric_multiply_frab(e1,e2), - "/" = numeric_multiply_frab(e1,frab_reciprocal(e2)), + "+" = frab_plus_numeric( e2,e1), + "-" = frab_plus_numeric(-e2,e1), + "*" = frab_multiply_numeric(e2,e1), + "/" = frab_multiply_numeric(frab_reciprocal(e2),e1), "^" = numeric_power_frab(e1,e2), stop(gettextf("binary operator %s not implemented on frabs", dQuote(.Generic))) ) } @@ -137,6 +158,7 @@ setMethod("as.table","frab",function(x,...){structure(as.namedvector(x),dim=leng } `frab_eq_num` <- function(e1,e2){values(e1) == e2} +`frab_ne_num` <- function(e1,e2){values(e1) != e2} `frab_gt_num` <- function(e1,e2){values(e1) > e2} `frab_ge_num` <- function(e1,e2){values(e1) >= e2} `frab_lt_num` <- function(e1,e2){values(e1) < e2} @@ -145,6 +167,7 @@ setMethod("as.table","frab",function(x,...){structure(as.namedvector(x),dim=leng `frab_compare_numeric` <- function(e1,e2){ # rfrab() > 3 switch(.Generic, "==" = frab_eq_num(e1, e2), + "!=" = frab_ne_num(e1, e2), ">" = frab_gt_num(e1, e2), ">=" = frab_ge_num(e1, e2), "<" = frab_lt_num(e1, e2), @@ -154,6 +177,7 @@ setMethod("as.table","frab",function(x,...){structure(as.namedvector(x),dim=leng `num_eq_frab` <- function(e1,e2){e1 == values(e2)} +`num_ne_frab` <- function(e1,e2){e1 != values(e2)} `num_gt_frab` <- function(e1,e2){e1 > values(e2)} `num_ge_frab` <- function(e1,e2){e1 >= values(e2)} `num_lt_frab` <- function(e1,e2){e1 < values(e2)} @@ -162,6 +186,7 @@ setMethod("as.table","frab",function(x,...){structure(as.namedvector(x),dim=leng `numeric_compare_frab` <- function(e1,e2){ # 4 <= rfrab() switch(.Generic, "==" = num_eq_frab(e1, e2), + "!=" = num_ne_frab(e1, e2), ">" = num_gt_frab(e1, e2), ">=" = num_ge_frab(e1, e2), "<" = num_lt_frab(e1, e2), @@ -250,6 +275,12 @@ setReplaceMethod("[",signature(x="frab",i="character",j="missing",value="numeric ) }) +setReplaceMethod("[",signature(x="frab",i="character",j="missing",value="logical"), + function(x,i,j,value){ + x[i] <- as.numeric(value) # the meat + return(x) + }) + setReplaceMethod("[",signature(x="frab",i="disord",j="missing",value="numeric"), function(x,i,j,value){ s <- names(x) @@ -268,6 +299,12 @@ setReplaceMethod("[",signature(x="frab",i="disord",j="missing",value="numeric"), ) }) +setReplaceMethod("[",signature(x="frab",i="disord",j="missing",value="logical"), + function(x,i,j,value){ + x[i] <- as.numeric(value) # the meat + return(x) + } ) + setReplaceMethod("[",signature(x="frab",i="disord",j="missing",value="frab"), function(x,i,j,value){ stop("not currently implemented. Idiom such as x[x<0] <- -x[x<0] is disord-compliant [and meaningful] but not yet implemented") @@ -280,13 +317,24 @@ setReplaceMethod("[",signature(x="frab",i="disindex",j="missing",value="numeric" return(frab(setNames(elements(p),names(x)))) } ) -setReplaceMethod("[",signature(x="frab",i="missing",j="missing",value="ANY"), +setReplaceMethod("[",signature(x="frab",i="missing",j="missing",value="numeric"), function(x,i,j,value){ v <- values(x) - v[] <- value + v[] <- value # disord discipline violations trapped here return(frab(setNames(v,names(x)))) } ) +setReplaceMethod("[",signature(x="frab",i="missing",j="missing",value="frab"), + function(x,i,j,value){ + stop("x[] <- y (with x, y frabs) does not make sense; try x <- y?") + } ) + +setReplaceMethod("[",signature(x="frab",i="missing",j="missing",value="ANY"), + function(x,i,j,value){ + stop("frab,missing,missing,ANY-method not implemented") + } ) + + setReplaceMethod("[",signature(x="frab",i="ANY",j="ANY",value="ANY"), function(x,i,j,value){ stop("replacement operation not defined in this case") @@ -323,7 +371,7 @@ setGeneric("pmin",function(...){standardGeneric("pmin")}) } else if(nargs()<3){ return(pmax_pair(x, ...)) } else { - return(pmax_pair(x, pmax_pair(...))) + return(pmax_pair(x, pmax_dots(...))) } } @@ -333,7 +381,7 @@ setGeneric("pmin",function(...){standardGeneric("pmin")}) } else if(nargs()<3){ return(pmin_pair(x, ...)) } else { - return(pmin_pair(x, pmin_pair(...))) + return(pmin_pair(x, pmin_dots(...))) } } @@ -341,13 +389,33 @@ setMethod("pmax",signature("..."="frab"), function(...){pmax_dots(...)} ) setMethod("pmin",signature("..."="frab"), function(...){pmin_dots(...)} ) setMethod("pmax",signature("..."="ANY"),function(...,na.rm=FALSE){base::pmax(..., na.rm=na.rm)}) -setMethod("pmin",signature("..."="ANY"),function(...,na.rm=FALSE){base::pmax(..., na.rm=na.rm)}) +setMethod("pmin",signature("..."="ANY"),function(...,na.rm=FALSE){base::pmin(..., na.rm=na.rm)}) setGeneric("is.na") setMethod("is.na","frab",function(x){which(is.na(values(x)))}) setGeneric("is.na<-") setReplaceMethod("is.na",signature("frab",value="disord"), function(x,value){ - values(x)[value] <- NA - return(x) + v <- values(x) + is.na(v) <- value # the meat + return(frab(setNames(elements(v),elements(names(x))))) } ) + +setGeneric("is.notna",function(x){standardGeneric("is.notna")}) +setMethod("is.notna","frab",function(x){which(!is.na(values(x)))}) + +setMethod("Summary", "frab", + function(x, ..., na.rm=FALSE){ + switch(.Generic, + max = max(values(x)), + min = min(values(x)), + range = c(min(values(x)),max(values(x))), + sum = sum(values(x)), + stop(gettextf("Summary function %s not implemented on frabs", dQuote(.Generic))) + ) + } + ) + + + + diff --git a/R/sparsetable.R b/R/sparsetable.R new file mode 100644 index 0000000..a81ab14 --- /dev/null +++ b/R/sparsetable.R @@ -0,0 +1,415 @@ +setClass("sparsetable", + representation = representation(index="matrix",values="numeric"), + prototype = list(index=matrix(),values=numeric()), + ) + +setGeneric("index",function(x){standardGeneric("index")}) +setMethod("index","sparsetable",function(x){x@index}) +setMethod("values","sparsetable", + function(x){ + disord(as.numeric(x@values),h=hashcal(list(x@index,x@values))) + ## no occurrences of "@" below this line; accessor methods end + } ) + +setMethod("names","sparsetable", + function(x){ + stop("sparsetable objects do not have a 'names' attribute; try dimnames()") + } ) + +setMethod("dimnames","sparsetable",function(x){colnames(index(x))}) +setReplaceMethod("dimnames","sparsetable",function(x,value){ + I <- index(x) + colnames(I) <- value # the meat + return(sparsetable(I,values(x))) +} ) + +setGeneric("nterms",function(x){standardGeneric("nterms")}) +setMethod("nterms","sparsetable",function(x){nrow(index(x))}) + +setValidity("sparsetable",function(object){ + i <- index(object) + v <- values(object) + if(!is.character(i)){ + stop("not a character, we need a character vector") + } else if(nrow(i) != length(v)){ + stop("length of values must match number of rows of index") + } else { + return(TRUE) + } +} ) + +setGeneric("is.empty",function(x){standardGeneric("is.empty")}) +setMethod("is.empty","sparsetable",function(x){nrow(index(x))==0}) +setGeneric("arity",function(x){standardGeneric("arity")}) +setMethod("arity","sparsetable",function(x){ncol(index(x))}) +setMethod("dim","sparsetable",function(x){apply(index(x),2,function(x){length(unique(x))})}) + +setGeneric("as.array") +setMethod("as.array","sparsetable",function(x){sparsetable_to_array(x)}) + +`sparsetable_to_array` <- function(x){ + if(is.empty(x)){return(array(0,rep(0,arity(x))))} + I <- apply(index(x),2,function(x){as.numeric(as.factor(x))}) + dims <- apply(I,2,max) + out <- array(0,dims) + out[I] <- values(x) + L <- apply(index(x),2,function(x){levels(as.factor(x))},simplify=FALSE) + names(L) <- colnames(I) + dimnames(out) <- L + return(out) +} + +`array_to_sparsetable` <- function(x){ + + I <- which(x != 0,arr.ind=TRUE) + J <- I + colnames(J) <- names(dimnames(x)) + for(i in seq_len(ncol(I))){ + J[,i] <- (dimnames(x)[[i]])[I[,i]] # the meat + } + return(sparsetable(J,x[x!=0])) +} + +`sparsetable_to_frab` <- function(x){ + frab(setNames(elements(values(x)),apply(index(x),1,paste,collapse="_"))) +} + +setMethod("show", "sparsetable", function(object){print_sparsetable_matrixform(object)}) + +`print_sparsetable_matrixform` <- function(S){ + if(is.empty(S)){ + cat(paste('empty sparsetable with ', arity(S), ' columns\n',sep="")) + } else if((arity(S)==2) && !isFALSE(getOption("print_2dsparsetables_as_matrices"))){ + print(sparsetable_to_array(S)) + } else { + jj <- + data.frame(index(S),symbol= " = ", val=round(elements(values(S)),getOption("digits"))) + mdc <- colnames(index(S)) + if(is.null(mdc)){ + colnames(jj) <- c(rep(" ",arity(S)+1),'val') + } else { + colnames(jj) <- c(mdc[seq_len(arity(S))],' ','val') + } + print(jj,row.names=FALSE) + } + return(invisible(S)) +} + +`sparsetable` <- function(i,v=1){ + if(length(v)==1){v <- rep(v,nrow(i))} + stopifnot(nrow(i) == length(v)) + jj <- sparsetable_maker(i,v) + if(is.null(jj$index)){ + jj$index <-matrix(character(0),0,ncol(i)) + jj$value <- numeric(0) + } + colnames(jj$index) <- colnames(i) + new("sparsetable",index=jj$index,values=jj$value)} # This is the only time new("sparsetable",...) is called + +`is.sparsetable` <- function(x){inherits(x,"sparsetable")} + +`as.sparsetable` <- function(x){ + if(is.sparsetable(x)){ + return(x) + } else if(is.frab(x)){ + return(sparsetable(cbind(names(x)),values(x))) + } else if(is.list(x)){ + return(sparsetable(x$index,x$value)) + } else if(is.table(x)){ + return(array_to_sparsetable(as.array(x))) + } else if(is.array(x)){ + return(array_to_sparsetable(x)) + } +} + +`sparsetable_negative` <- function(S){ + if(is.zero(S)){ + return(S) + } else { + return(sparsetable(index(S),-values(S))) + } +} + +`sparsetable_eq_sparsetable` <- function(S1,S2){ + if(arity(S1) != arity(S2)){ + return(FALSE) + } else if(nterms(S1) != nterms(S2)){ + return(FALSE) + } else { + return(sparsetable_equality(index(S1),values(S1),index(S2),values(S2))) + } +} + +`rspar` <- function(n=15,l=3,d=3){ + sparsetable( + matrix( + letters[sample(seq_len(l),n*d,replace=TRUE)], + n, d, dimnames=list(NULL,month.abb[seq_len(d)])),seq_len(n)) +} + +`rspar2` <- function(n=15,l=6){ + sparsetable(as.matrix(data.frame( + foo=letters[sample(seq_len(l),n,replace=TRUE)], + bar=LETTERS[sample(seq_len(l),n,replace=TRUE)])), + seq_len(n)) +} + +`rsparr` <- function(n=20,d=6,l=5,s=4){ + I <- sapply(seq_len(d),function(d){sample(replicate(s,paste(sample(letters[seq_len(l)],d,replace=TRUE),collapse="")),n,replace=TRUE)}) + colnames(I) <- month.abb[seq_len(d)] + return(sparsetable(I,seq_len(n))) +} + +`sparsetable_negative` <- function(x){sparsetable(index(x), -values(x))} +`sparsetable_reciprocal` <- function(x){stop("inverse not implemented")} +`sparsetable_plus_sparsetable` <- function(F1,F2){ + stopifnot(arity(F1) == arity(F2)) + out <- ( + sparsetable_add( + index(F1),values(F1), + index(F2),values(F2) + )) + if(is.null(out$index)){ + out$index <- index(F1)[FALSE,] + out$value <- numeric(0) + } + out <- as.sparsetable(out) + if(is.null(dimnames(F1))){ + dimnames(out) <- dimnames(F2) + } else { + dimnames(out) <- dimnames(F1) + } + return(out) +} + +`sparsetable_multiply_sparsetable` <- function(F1,F2){ + stop("multiplication not implemented") +} + +`sparsetable_multiply_numeric` <- function(e1,e2){sparsetable(index(e1),values(e1)*e2)} +`sparsetable_power_numeric` <- function(e1,e2){stop("sparsetable power not implemented")} +`numeric_multiply_sparsetable` <- function(e1,e2){sparsetable_multiply_numeric(e2,e1)} +`numeric_power_sparsetable` <- function(e1,e2){stop("sparsetable power not implemented")} + +`sparsetable_unary` <- function(e1,e2){ + switch(.Generic, + "+" = e1, + "-" = sparsetable_negative(e1), + stop(gettextf("unary operator %s not implemented on sparsetables", dQuote(.Generic))) + ) +} + +`sparsetable_arith_sparsetable` <- function(e1,e2){ + e1 <- as.sparsetable(e1) + e2 <- as.sparsetable(e2) + switch(.Generic, + "+" = sparsetable_plus_sparsetable(e1, e2), + "-" = sparsetable_plus_sparsetable(e1, sparsetable_negative(e2)), + "*" = sparsetable_multiply_sparsetable(e1, e2), + stop(gettextf("binary operator %s not implemented on sparsetables", dQuote(.Generic))) + ) } + +`sparsetable_arith_numeric` <- function(e1,e2){ # e1 sparsetable, e2 numeric; e2 might be a named vector. + switch(.Generic, + "*" = sparsetable_multiply_numeric(e1,e2), + "/" = sparsetable_multiply_numeric(e1,1/e2), + "^" = sparsetable_power_numeric(e1,e2), + stop(gettextf("binary operator %s not implemented in this case", dQuote(.Generic))) + ) } + +`numeric_arith_sparsetable` <- function(e1,e2){ # e1 numeric, e2 sparsetable; e2 _might_ be a named vector. + switch(.Generic, + "*" = numeric_multiply_sparsetable(e1,e2), + "/" = numeric_multiply_sparsetable(e1,sparsetable_reciprocal(e2)), + "^" = numeric_power_sparsetable(e1,e2), + stop(gettextf("binary operator %s not implemented in this case", dQuote(.Generic))) + ) } + + +`sparsetable_compare_sparsetable` <- function(e1,e2){ + switch(.Generic, + "==" = sparsetable_eq_sparsetable(e1, e2), + "!=" = !sparsetable_eq_sparsetable(e1, e2), + stop(gettextf("comparison '%s' not for sparsetables", dQuote(.Generic))) + ) +} + +`sparsetable_eq_num` <- function(e1,e2){values(e1) == e2} +`sparsetable_gt_num` <- function(e1,e2){values(e1) > e2} +`sparsetable_ge_num` <- function(e1,e2){values(e1) >= e2} +`sparsetable_lt_num` <- function(e1,e2){values(e1) < e2} +`sparsetable_le_num` <- function(e1,e2){values(e1) <= e2} + +`sparsetable_compare_numeric` <- function(e1,e2){ # rsparsetable() > 3 + switch(.Generic, + "==" = sparsetable_eq_num(e1, e2), + ">" = sparsetable_gt_num(e1, e2), + ">=" = sparsetable_ge_num(e1, e2), + "<" = sparsetable_lt_num(e1, e2), + "<=" = sparsetable_le_num(e1, e2), + stop(gettextf("Comparison operator %s not implemented in this case", dQuote(.Generic))) + ) } + +`num_eq_sparsetable` <- function(e1,e2){e1 == values(e2)} +`num_gt_sparsetable` <- function(e1,e2){e1 > values(e2)} +`num_ge_sparsetable` <- function(e1,e2){e1 >= values(e2)} +`num_lt_sparsetable` <- function(e1,e2){e1 < values(e2)} +`num_le_sparsetable` <- function(e1,e2){e1 <= values(e2)} + +`numeric_compare_sparsetable` <- function(e1,e2){ # 4 <= rsparsetable() + switch(.Generic, + "==" = num_eq_sparsetable(e1, e2), + ">" = num_gt_sparsetable(e1, e2), + ">=" = num_ge_sparsetable(e1, e2), + "<" = num_lt_sparsetable(e1, e2), + "<=" = num_le_sparsetable(e1, e2), + stop(gettextf("Comparison operator %s not implemented in this case", dQuote(.Generic))) + ) } + +setMethod("Arith" , signature(e1="sparsetable" , e2="missing"), sparsetable_unary ) +setMethod("Arith" , signature(e1="sparsetable" , e2="sparsetable" ), sparsetable_arith_sparsetable ) +setMethod("Arith" , signature(e1="sparsetable" , e2="numeric"), sparsetable_arith_numeric) +setMethod("Arith" , signature(e1="numeric", e2="sparsetable" ), numeric_arith_sparsetable) +setMethod("Arith" , signature(e1="ANY" , e2="sparsetable" ), sparsetable_arith_sparsetable ) +setMethod("Arith" , signature(e1="sparsetable" , e2="ANY" ), sparsetable_arith_sparsetable ) + +setMethod("Compare", signature(e1="sparsetable" , e2="sparsetable" ), sparsetable_compare_sparsetable ) +setMethod("Compare", signature(e1="sparsetable" , e2="numeric"), sparsetable_compare_numeric) +setMethod("Compare", signature(e1="numeric" , e2="sparsetable" ), numeric_compare_sparsetable ) + +setMethod("[", + signature(x="sparsetable",i="ANY",j="ANY"), + function(x,i, ...){ + if(is.matrix(i)){ + out <- sparsetable_accessor(index(x),values(x), i) + } else { + out <- sparsetable_accessor(index(x),values(x), matrix(c(i,j,unlist(list(...))),nrow=1)) + } + return(out) + } ) + +setMethod("[",signature(x="sparsetable",i="disord",j="missing"), + function(x,i){ + sparsetable(index(x)[i,,drop=FALSE],values(x)[i]) # the meat + } ) + +setMethod("[",signature(x="sparsetable",i="disindex",j="missing"), + function(x,i,j){ + vx <- frab::values(x) + vi <- disordR::values(i) + sparsetable(index(x)[vi,,drop=FALSE], vx[i]) # the meat + } ) + +setReplaceMethod("[",signature(x="sparsetable",i="disord",j="missing",value="numeric"), + function(x,i,j,value){ + v <- values(x) + stopifnot(consistent(v,i)) + v[i] <- value # the meat + sparsetable(index(x),v) + } ) + +setReplaceMethod("[",signature(x="sparsetable",i="disindex",j="missing",value="ANY"), + function(x,i,j,...,value){ + stopifnot(identical(hash(values(x)),hash(i))) + if(is.disord(value)){stop("replace methods for disindex do not take disords")} + jj <- values(x) + jj[disordR::values(i)] <- value # the meat + return(sparsetable(index(x),jj)) + } ) + +setReplaceMethod("[",signature(x="sparsetable",value="ANY"), + function(x,i,j,...,value){ + if(missing(i)){ # S[] <- something + if(is.sparsetable(value)){ + return( + as.sparsetable(sparsetable_overwrite( + index(x ),values(x ), + index(value),values(value)))) + } else { + stop("replacement method not defined (disord discipline violation?)") + ## return(sparsetable(index(x),value)) + } + } + + if(is.matrix(i)){ + M <- i + } else if(is.sparsetable(i)){ + warning("possible disord discipline violation") + M <- index(i) + } else { + if(missing(j)){j <- NULL} + M <- as.matrix(expand.grid(c(list(i), j, list(...)))) + } + if(ncol(M) != arity(x)){ + stop("incorrect number of dimensions specified") + } + + if(length(value)==1){value <- rep(value,nrow(M))} + stopifnot(length(value)==nrow(M)) + return(as.sparsetable(sparsetable_setter(index(x),values(x),M,value))) + } + ) + +setMethod("drop","sparsetable",function(x){frab(setNames(disordR::elements(values(x)),c(index(x))))}) + +`pmax_pair_sparsetable` <- function(F1,F2){ + as.sparsetable(sparsetable_pmax( + index(F1),elements(values(F1)), + index(F2),elements(values(F2)) + )) +} + +`pmin_pair_sparsetable` <- function(F1,F2){ + -pmax_pair_sparsetable(-F1,-F2)} + +`pmax_dots_sparsetable` <- function(x,...){ + if(nargs()==1){ + return(x) + } else if(nargs()<3){ + return(pmax_pair_sparsetable(x, ...)) + } else { + return(pmax_pair_sparsetable(x, pmax_dots_sparsetable(...))) + } +} + +`pmin_dots_sparsetable` <- function(x,...){ + if(nargs()==1){ + return(x) + } else if(nargs()<3){ + return(pmin_pair_sparsetable(x, ...)) + } else { + return(pmin_pair_sparsetable(x, pmin_dots_sparsetable(...))) + } +} + +setMethod("pmax",signature("..."="sparsetable"), function(...){pmax_dots_sparsetable(...)} ) +setMethod("pmin",signature("..."="sparsetable"), function(...){pmin_dots_sparsetable(...)} ) + +setGeneric("asum",function(S,dims){standardGeneric("asum")}) +setMethod("asum",signature(S="sparsetable"), function(S,dims){asum_sparsetable(S,dims)}) + +`asum_sparsetable` <- function(S, dims){ + if(is.character(dims)){ + dims <- which(dimnames(S) %in% dims) + } else { + if(is.logical(dims)){ dims <- which(dims) } + stopifnot(all(dims <= arity(S))) + stopifnot(all(dims >0)) + stopifnot(all(dims == round(dims))) + } + jj <- sparsetable_asum_include(index(S),elements(values(S)),dims) + I <- jj$index[,-dims,drop=FALSE] + colnames(I) <- dimnames(S)[-dims] + return(sparsetable(I,jj$value)) +} + +`asum_exclude_sparsetable` <- function(S,dims){ + if(is.character(dims)){ + dims <- which(!(dimnames(S) %in% dims)) + } else if(is.logical(dims)){ + dims <- !dims + } else { + dims <- which(!(seq_len(arity(S)) %in% dims)) + } + return(asum_sparsetable(S, dims)) +} diff --git a/README.md b/README.md index 51cf093..85ab639 100644 --- a/README.md +++ b/README.md @@ -1,4 +1,4 @@ -The frab package: an alternative interpretation of named vectors +The frab package: how to add tables ================ @@ -7,21 +7,30 @@ The frab package: an alternative interpretation of named vectors # Overview -The `frab` package furnishes an alternative interpretation of named -vectors wherein addition is defined using the (unique) names as the -primary key. This allows one to “add” tables in a consistent and -meaningful way. The underlying mathematical object is the Free Abelian -group. +The `frab` package allows one to “add” tables in a natural way. It also +furnishes an alternative interpretation of named vectors wherein +addition is defined using the (unique) names as the primary key. Support +for multi-dimensional tables is included. The underlying mathematical +object is the Free Abelian group. To cite in publications please use R. +K. S. Hankin 2023. “The free Abelian group in R: the frab package”, +arXiv, . + +The package has two S4 classes: `frab` and `sparsetable`. Class `frab` +is for one-dimensional tables and is an alternative implementation of +named vectors; class `sparsetable` handles multi-way tables in a natural +way. # The package in use -The package has a single S4 class of objects, `frab`. Primary -construction function `frab()` takes a named vector and returns a `frab` -object: +## One-dimensional tables: class `frab` + +Primary construction function `frab()` takes a named vector and returns +a `frab` object: ``` r suppressMessages(library("frab")) -frab(c(x=1,b=2,a=2,b=3,c=7,x=-1)) +p <- c(x=1,b=2,a=2,b=3,c=7,x=-1) +frab(p) #> A frab object with entries #> a b c #> 2 5 7 @@ -31,8 +40,20 @@ Above, we see from the return value that function `frab()` has reordered the labels of its argument, calculated the value for entry `b` \[as ![2+3=5](https://latex.codecogs.com/png.latex?2%2B3%3D5 "2+3=5")\], determined that the entry for `x` has vanished \[the values cancelling -out\], and printed the result using a bespoke show method. However, the -package’s most useful feature is the overloaded definition of addition: +out\], and printed the result using a bespoke show method. It is useful +to think of the input argument as a semi-constructed and generalized +“table” of observations. Thus + +``` r +p +#> x b a b c x +#> 1 2 2 3 7 -1 +``` + +Above we see `p` might correspond to a story: “look, we have one `x`, +two `b`s, two `a`s, another three `b`s, seven `c`s…oh hang on that `x` +was a mistake I had better subtract one now”. However, the package’s +most useful feature is the overloaded definition of addition: ``` r (x <- rfrab()) @@ -49,8 +70,170 @@ x+y #> 7 10 2 6 8 5 7 7 ``` +Above we see function `rfrab()` used to generate a random `frab` object, +corresponding to a table. It is *possible* to add `x` and `y` directly: + +``` r +xn <- as.namedvector(x) +yn <- as.namedvector(y) +table(c(rep(names(xn),times=xn),rep(names(yn),times=yn))) +#> +#> a b c d e f g i +#> 7 10 2 6 8 5 7 7 +``` + +but this is extremely inefficient and cannot deal with fractional (or +indeed negative) entries. + +# Multi-way tables + +Class `sparsetable` deals with multi-way tables. Taking three-way tables +as an example: + +``` r +(x3 <- rspar()) +#> Jan Feb Mar val +#> a a a = 10 +#> a c b = 15 +#> b a a = 11 +#> b a b = 9 +#> b a c = 12 +#> b b a = 6 +#> b b b = 3 +#> b b c = 14 +#> b c a = 9 +#> b c c = 21 +#> c c a = 10 +``` + +Function `rspar()` returns a random `sparsetable` object. We see that, +of the +![3^3=27](https://latex.codecogs.com/png.latex?3%5E3%3D27 "3^3=27") +possible entries, only 11 are non-zero. We may coerce to a regular +table: + +``` r +as.array(x3) +#> , , Mar = a +#> +#> Feb +#> Jan a b c +#> a 10 0 0 +#> b 11 6 9 +#> c 0 0 10 +#> +#> , , Mar = b +#> +#> Feb +#> Jan a b c +#> a 0 0 15 +#> b 9 3 0 +#> c 0 0 0 +#> +#> , , Mar = c +#> +#> Feb +#> Jan a b c +#> a 0 0 0 +#> b 12 14 21 +#> c 0 0 0 +``` + +In this case it is hardly worth taking advantage of the sparse +representation (which is largely inherited from the `spray` package) but +a larger example might be + +``` r +rspar(n=4,l=10,d=12) +#> Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec val +#> b c j e f j f a g i a d = 1 +#> g a j e c f e c a f g c = 4 +#> j b j g h c d c c b b i = 2 +#> j j h h a a i f c h g h = 3 +``` + +The random `sparsetable` object shown above would require +![10^{12}](https://latex.codecogs.com/png.latex?10%5E%7B12%7D "10^{12}") +floating point numbers in full array form, of which only 4 are nonzero. +Multi-way tables may be added in the same way as `frab` objects: + +``` r +y3 <- rspar() +x3+y3 +#> Jan Feb Mar val +#> a a a = 10 +#> a a b = 14 +#> a b a = 4 +#> a c a = 14 +#> a c b = 15 +#> b a a = 11 +#> b a b = 23 +#> b a c = 12 +#> b b a = 17 +#> b b b = 13 +#> b b c = 23 +#> b c a = 9 +#> b c b = 7 +#> b c c = 24 +#> c a a = 15 +#> c c a = 15 +#> c c c = 14 +``` + +## Two-way tables + +Two-way tables are something of a special case, having their own print +method. By default, two-dimensional `sparsetable` objects are coerced to +a matrix before printing, but otherwise operate in the same way as the +multi-dimensional case discussed above: + +``` r +(x2 <- rspar2()) +#> bar +#> foo A B D E F +#> a 3 20 0 0 9 +#> b 0 0 15 0 0 +#> c 0 0 0 4 0 +#> d 0 0 0 5 22 +#> e 0 2 0 11 29 +(y2 <- rspar2()) +#> bar +#> foo A C D E F +#> a 9 0 25 6 10 +#> b 7 0 0 0 1 +#> c 0 0 0 11 0 +#> d 8 5 0 4 0 +#> e 0 3 2 0 0 +#> f 0 0 14 0 15 +x2+y2 +#> bar +#> foo A B C D E F +#> a 12 20 0 25 6 19 +#> b 7 0 0 15 0 1 +#> c 0 0 0 0 15 0 +#> d 8 0 5 0 9 22 +#> e 0 2 3 2 11 29 +#> f 0 0 0 14 0 15 +``` + +Above, note how the sizes of the coerced matrices are different +(![5\times 5](https://latex.codecogs.com/png.latex?5%5Ctimes%205 "5\times 5") +for `x2`, +![6\times 5](https://latex.codecogs.com/png.latex?6%5Ctimes%205 "6\times 5") +for `y2`) but the addition method copes, using a bespoke sparse matrix +representation. Also note that the sum has *six* columns (corresponding +to six distinct column headings) even though `x2` and `y2` have only +five. + # Further information For more detail, see the package vignette `vignette("frab")` + +## References + +- R. K. S. Hankin 2023. “The free Abelian group in `R`: the `frab` + package”, arXiv, . +- R. K. S. Hankin 2022. “Disordered vectors in `R`: introducing the + `disordR` package”, arXiv, diff --git a/build/frab.pdf b/build/frab.pdf index 3e797cd..6547f8e 100644 Binary files a/build/frab.pdf and b/build/frab.pdf differ diff --git a/build/partial.rdb b/build/partial.rdb index d278d96..afe5e32 100644 Binary files a/build/partial.rdb and b/build/partial.rdb differ diff --git a/build/stage23.rdb b/build/stage23.rdb index 922405b..56cc568 100644 Binary files a/build/stage23.rdb and b/build/stage23.rdb differ diff --git a/build/vignette.rds b/build/vignette.rds index fad4529..9ff503d 100644 Binary files a/build/vignette.rds and b/build/vignette.rds differ diff --git a/inst/doc/frab.R b/inst/doc/frab.R index 3d1eda8..e28c5ac 100644 --- a/inst/doc/frab.R +++ b/inst/doc/frab.R @@ -2,6 +2,7 @@ knitr::opts_chunk$set(echo = TRUE) options(rmarkdown.html_vignette.check_title = FALSE) library("frab") +library("mvtnorm") set.seed(1) ## ----label=badtables---------------------------------------------------------- @@ -48,3 +49,22 @@ a b a+b +## ----label=twodeetables------------------------------------------------------- +(x <- rspar2(9)) +(y <- rspar2(9)) +x+y + +## ----label=threedeetables----------------------------------------------------- +A <- matrix(0.95,3,3) +diag(A) <- 1 +x <- round(rmvnorm(300,mean=rep(10,3),sigma=A/7)) +x[] <- letters[x] +head(x) +(sx <- sparsetable(x)) + +## ----label=showthreedeeadd---------------------------------------------------- +(sz <- sparsetable(matrix(sample(letters[9:11],12,replace=TRUE),ncol=3),1001:1004)) + +## ----label=usualsemantics----------------------------------------------------- +sx + sz + diff --git a/inst/doc/frab.Rmd b/inst/doc/frab.Rmd index 75a9a46..49e693f 100644 --- a/inst/doc/frab.Rmd +++ b/inst/doc/frab.Rmd @@ -1,5 +1,5 @@ --- -title: "An alternative implementation of named vectors: addition of `table`s with the `frab` package" +title: "Addition of `table` objects with the `frab` package" author: "Robin K. S. Hankin" output: html_vignette bibliography: frab.bib @@ -9,11 +9,11 @@ vignette: > %\VignetteEncoding{UTF-8} --- - ```{r setup, include=FALSE} knitr::opts_chunk$set(echo = TRUE) options(rmarkdown.html_vignette.check_title = FALSE) library("frab") +library("mvtnorm") set.seed(1) ``` @@ -24,8 +24,10 @@ set.seed(1) **TLDR**: Adding two objects of class `table` has a natural interpretation. However, in base R, adding two tables can give plausible but incorrect results. The `frab` package provides a -consistent and efficient way to add `table` objects. The underlying -mathematical structure is the Free Abelian group, hence "`frab`". +consistent and efficient way to add `table` objects, subject to +`disordR` discipline [@hankin2022_disordR]. The underlying +mathematical structure is the Free Abelian group, hence "`frab`". To +cite in publications, please use [@hankin2023_frab]. # Prologue: `table()` @@ -150,9 +152,48 @@ Above we see the `+` operator is defined between a `frab` and a `table`, coercing tables to `frab` objects to give consistent results. -Carefully regulated ambiguity +# Two dimensional tables + +The ideas above have a natural generalization to two-dimensional tables. + +```{r,label=twodeetables} +(x <- rspar2(9)) +(y <- rspar2(9)) +x+y +``` + +Above, note that the resulting sum is automatically resized to +accommodate both addends, and also that entries with nonzero values in +both `x` and `y` are correctly summed. + + +## Arbitrary-dimensioned tables +The one- and two- dimensional tables above have somewhat specialized +print methods and the general case with dimension $\geqslant 3$ uses +methods similar to those of the `spray` package. We can generate a +`sparsetable` object quite easily: +```{r label=threedeetables} +A <- matrix(0.95,3,3) +diag(A) <- 1 +x <- round(rmvnorm(300,mean=rep(10,3),sigma=A/7)) +x[] <- letters[x] +head(x) +(sx <- sparsetable(x)) +``` + +But we can add `sx` to other `sparsetable` objects: + +```{r label=showthreedeeadd} +(sz <- sparsetable(matrix(sample(letters[9:11],12,replace=TRUE),ncol=3),1001:1004)) +``` + +Then the usual semantics for addition operate: + +```{r label=usualsemantics} +sx + sz +``` ## References diff --git a/inst/doc/frab.html b/inst/doc/frab.html index 26eeade..c7ed70a 100644 --- a/inst/doc/frab.html +++ b/inst/doc/frab.html @@ -13,7 +13,7 @@ -An alternative implementation of named vectors: addition of tables with the frab package +Addition of table objects with the frab package @@ -168,7 +168,7 @@ -

An alternative implementation of named vectors: addition of tables with the frab package

+

Addition of table objects with the frab package

Robin K. S. Hankin

@@ -176,7 +176,7 @@

Robin K. S. Hankin

-

TLDR: Adding two objects of class table has a natural interpretation. However, in base R, adding two tables can give plausible but incorrect results. The frab package provides a consistent and efficient way to add table objects. The underlying mathematical structure is the Free Abelian group, hence “frab.”

+

TLDR: Adding two objects of class table has a natural interpretation. However, in base R, adding two tables can give plausible but incorrect results. The frab package provides a consistent and efficient way to add table objects, subject to disordR discipline (Hankin 2022). The underlying mathematical structure is the Free Abelian group, hence “frab.” To cite in publications, please use (Hankin 2023).

Prologue: table()

Suppose we have three tables:

@@ -282,12 +282,99 @@

The frab package

## a b c d e f g i ## 5 8 2 7 2 2 8 5

Above we see the + operator is defined between a frab and a table, coercing tables to frab objects to give consistent results.

-

Carefully regulated ambiguity

+
+
+

Two dimensional tables

+

The ideas above have a natural generalization to two-dimensional tables.

+
(x <- rspar2(9))
+
##    bar
+## foo A  B C D F
+##   b 3  0 8 0 2
+##   d 5 16 0 0 6
+##   f 1  0 0 4 0
+
(y <- rspar2(9))
+
##    bar
+## foo A C D E F
+##   a 0 0 0 9 0
+##   b 0 0 0 0 8
+##   e 0 0 4 0 0
+##   f 7 9 8 0 0
+
x+y
+
##    bar
+## foo A  B C  D E  F
+##   a 0  0 0  0 9  0
+##   b 3  0 8  0 0 10
+##   d 5 16 0  0 0  6
+##   e 0  0 0  4 0  0
+##   f 8  0 9 12 0  0
+

Above, note that the resulting sum is automatically resized to accommodate both addends, and also that entries with nonzero values in both x and y are correctly summed.

+
+

Arbitrary-dimensioned tables

+

The one- and two- dimensional tables above have somewhat specialized print methods and the general case with dimension \(\geqslant 3\) uses methods similar to those of the spray package. We can generate a sparsetable object quite easily:

+
A <- matrix(0.95,3,3)
+diag(A) <- 1
+x <- round(rmvnorm(300,mean=rep(10,3),sigma=A/7))
+x[] <- letters[x]
+head(x)
+
##      [,1] [,2] [,3]
+## [1,] "i"  "i"  "i" 
+## [2,] "j"  "j"  "j" 
+## [3,] "j"  "j"  "k" 
+## [4,] "j"  "j"  "j" 
+## [5,] "j"  "j"  "i" 
+## [6,] "j"  "j"  "j"
+
(sx  <- sparsetable(x))
+
##            val
+##  i i i  =   22
+##  i i j  =    2
+##  i j i  =    5
+##  i j j  =    4
+##  j i i  =    2
+##  j i j  =    1
+##  j j i  =    3
+##  j j j  =  223
+##  j j k  =    7
+##  j k j  =    3
+##  j k k  =    1
+##  k j j  =    2
+##  k j k  =    4
+##  k k j  =    1
+##  k k k  =   20
+

But we can add sx to other sparsetable objects:

+
(sz <- sparsetable(matrix(sample(letters[9:11],12,replace=TRUE),ncol=3),1001:1004))
+
##             val
+##  i k k  =  1003
+##  j j j  =  1004
+##  j j k  =  1001
+##  k k j  =  1002
+

Then the usual semantics for addition operate:

+
sx + sz
+
##             val
+##  i i i  =    22
+##  i i j  =     2
+##  i j i  =     5
+##  i j j  =     4
+##  i k k  =  1003
+##  j i i  =     2
+##  j i j  =     1
+##  j j i  =     3
+##  j j j  =  1227
+##  j j k  =  1008
+##  j k j  =     3
+##  j k k  =     1
+##  k j j  =     2
+##  k j k  =     4
+##  k k j  =  1003
+##  k k k  =    20
+

References

-Hankin, Robin K. S. 2022. “Disordered Vectors in R: Introducing the disordR Package.” arXiv. https://doi.org/10.48550/ARXIV.2210.03856. +Hankin, Robin K. S. 2022. “Disordered Vectors in R: Introducing the disordR Package.” arXiv. https://doi.org/10.48550/ARXIV.2210.03856. +
+
+———. 2023. “The Free Abelian Group in R: The frab Package.” arXiv. https://doi.org/10.48550/ARXIV.2307.13184.
R Core Team. 2022. R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing. https://www.R-project.org/. diff --git a/inst/frab.bib b/inst/frab.bib new file mode 100644 index 0000000..408ad42 --- /dev/null +++ b/inst/frab.bib @@ -0,0 +1,38 @@ +@Manual{rcore2023, + title = {R: A Language and Environment for Statistical Computing}, + author = {{R Core Team}}, + organization = {R Foundation for Statistical Computing}, + address = {Vienna, Austria}, + year = {2023}, + url = {https://www.R-project.org/}, + } + +@Misc{hankin2022_spray, + title = {Sparse arrays in {R}: the {{\tt spray}} package}, + url = {https://arxiv.org/abs/2210.10848}, + howpublished = {\url{https://arxiv.org/abs/2210.03856}}, + author = {Robin K. S. Hankin}, + year = {2022}, + publisher = {arXiv}, + doi = {10.48550/ARXIV.2210.10848}, + } + +@misc{hankin2022_disordR, + doi = {10.48550/ARXIV.2210.03856}, + howpublished = {\url{https://arxiv.org/abs/2210.03856}}, + author = {Hankin, Robin K. S.}, + title = {Disordered vectors in {R}: introducing the {{\tt disordR}} package}, + publisher = {arXiv}, + year = {2022} + } + +@misc{hankin2022_clifford, + doi = {10.48550/ARXIV.2209.13659}, + howpublished = {\url{https://arxiv.org/abs/2209.13659}}, + author = {Hankin, Robin K. S.}, + keywords = {Symbolic Computation (cs.SC), FOS: Computer and information sciences, FOS: Computer and information sciences}, + title = {Clifford algebra in {R}}, + publisher = {arXiv}, + year = {2022} + } + diff --git a/inst/frab_arxiv.Rnw b/inst/frab_arxiv.Rnw new file mode 100644 index 0000000..775cfcd --- /dev/null +++ b/inst/frab_arxiv.Rnw @@ -0,0 +1,1031 @@ +% -*- mode: noweb; noweb-default-code-mode: R-mode; -*- + + +\documentclass{article} + +\usepackage{arxiv} + +\usepackage[utf8]{inputenc} % allow utf-8 input +\usepackage[T1]{fontenc} % use 8-bit T1 fonts +\usepackage{hyperref} % hyperlinks +\usepackage{url} % simple URL typesetting +\usepackage{booktabs} % professional-quality tables +\usepackage{amsfonts} % blackboard math symbols +\usepackage{amssymb} % needed for \leqslant +\usepackage{amsmath} % needed for cases +\usepackage{nicefrac} % compact symbols for 1/2, etc. +\usepackage{microtype} % microtypography +\usepackage{lipsum} % Can be removed after putting your text content +\usepackage{graphicx} +\usepackage[numbers]{natbib} +\usepackage{doi} +\usepackage{wrapfig} +\usepackage{tikz-cd} +\usepackage{xcolor} + +\title{Adding tables with the free Abelian group in R: introducing the {\tt frab} package} + +%\date{September 9, 1985} % Here you can change the date presented in the paper title +%\date{} % Or removing it + +\author{ \href{https://orcid.org/0000-0001-5982-0415}{\includegraphics[width=0.03\textwidth]{orcid.pdf}\hspace{1mm}Robin K. S.~Hankin}\thanks{\href{https://academics.aut.ac.nz/robin.hankin}{work}; +\href{https://www.youtube.com/watch?v=JzCX3FqDIOc&list=PL9_n3Tqzq9iWtgD8POJFdnVUCZ_zw6OiB&ab_channel=TrinTragulaGeneralRelativity}{play}} \\ + Auckland University of Technology\\ + \texttt{hankin.robin@gmail.com} \\ +} + +% Uncomment to remove the date +%\date{} + +% Uncomment to override the `A preprint' in the header +%\renewcommand{\headeright}{Technical Report} +%\renewcommand{\undertitle}{Technical Report} + + +%%% Add PDF metadata to help others organize their library +%%% Once the PDF is generated, you can check the metadata with +%%% $ pdfinfo template.pdf +\hypersetup{ +pdftitle={The free Abelian group in R}, +pdfsubject={q-bio.NC, q-bio.QM}, +pdfauthor={Robin K. S.~Hankin}, +pdfkeywords={The free Abelian group, named vectors} +} + +\begin{document} +\maketitle + +\setlength{\intextsep}{0pt} +\begin{wrapfigure}{r}{0.2\textwidth} + \begin{center} +\includegraphics[width=1in]{frab.png} + \end{center} +\end{wrapfigure} + + +\begin{abstract} + + In this short article I introduce the {\tt frab} package which + provides an alternative interpretation of named vectors in the R + programming language; it is available on CRAN at\\ + \url{https://CRAN.R-project.org/package=frab}. The underlying + mathematical object is the free Abelian group. + +\end{abstract} + +\SweaveOpts{} + + +\section{Introduction} + +The {\bf Free Abelian Group} is a direct sum of infinite cyclic +groups. If these cyclic goups are generated by $\left\lbrace +x_i\colon i\in\mathcal{I}\right\rbrace$ for some (finite) index set +$\mathcal{I}$, then the Free Abelian group $F$ will be + +$$F=\bigoplus_{i\in\mathcal{I}}\left\langle x_i\right\rangle.$$ + +From now on we assume that $\left|\mathcal{I}\right| =k < \infty$; +thus the elements of $F$ will be of the form + +\begin{equation}\label{formal_form} +g=n_1x_1+n_2x_2+\cdots+n_kx_k +\end{equation} + +where $k_i\in\mathbb{Z}$, $1\leqslant i\leqslant k$. The group +operation (conventionally one uses additive notation) is then defined +by componentwise addition: + +$$g=n_1x_1+n_2x_2+\cdots+n_kx_k$$ +$$h=r_1x_1+r_2x_2+\cdots+r_kx_k$$ + +$$h+g=(n_1+r_1)x_1+(n_2+r_2)x_2+\cdots+(n_k+r_k)x_k$$ + +One can define $F$ formally by starting with a generating set +$X=\left\lbrace x_1,\ldots,x_k\right\rbrace$ of symbols and defining +$F$ as the set of all formal expressions of the form~\ref{formal_form} +under addition as defined above. + +The Free Abelian group is an interesting and useful mathematical +object. Here I show how it may be implemented in the R programming +language \citep{rcore2023}. I also show how a slight natural +generalization (which is convenient in the context of numerical +techniques), may be incorporated. + +\section{Package internals} + +The package uses the {\tt STL map} class for efficiency. This class +maps strings (symbols) to doubles; the declaration + +\begin{verbatim} +typedef std::map frab; +\end{verbatim} + +appears in the {\tt src/} package directory. Such maps are limited +only by memory availability. + +\section{The {\tt frab} package in use} + +The {\tt frab} package associates a numerical value with each of a set +of arbitrary (character string) symbols. This is accomplished using +the {\tt STL} {\tt map} class, a container that stores key-value pairs +and allows fast lookup and insertion based on the key. Here we have +keys as character strings and values are double-precision numbers. + +To use the package, it must first be installed and loaded: + +\vphantom{f}\\[2\baselineskip] +\begin{minipage}[t]{0.49\textwidth} +<>= +install.packages("frab") +<>= +library("frab") +set.seed(0) +@ +\end{minipage} +\hfill +\vrule +\hfill +\begin{minipage}[t]{0.49\textwidth} +\vspace{-\baselineskip} +\color{violet} +\begin{itemize} +\item Function {\tt install.packages()} downloads packages from CRAN +\item Function {\tt library()} loads packages to the current R session +\end{itemize} +\color{black} +\end{minipage} +\vphantom{f}\\[2\baselineskip] + +The package uses a single S4 class, {\tt frab}, for which a variety of +methods is defined. There are several ways to create {\tt frab} +objects, but the most straightforward is to coerce a named vector +using the {\tt frab()} function: + +\vphantom{f}\\[2\baselineskip] +\begin{minipage}[t]{0.49\textwidth} +<>= +frab(c(z=2,y=7,x=1)) +@ +\end{minipage} +\hfill +\vrule +\hfill +\begin{minipage}[t]{0.49\textwidth} +\vspace{-\baselineskip} +\color{violet} +\begin{itemize} +\item Function {\tt frab()} takes a named vector as its single argument +\item It returns an object of class {\tt frab} +\item The elements of the returned {\tt frab} object are reordered; they +appear in an implementation-specific order +\end{itemize} +\color{black} +\end{minipage} +\vphantom{f}\\[2\baselineskip] + +Above, see how {\tt frab()} takes a named numeric vector and returns +an object of class {\tt frab}. It takes the names of its argument, +possibly reordering them, and returns a {\tt frab} object. Function +{\tt frab()} considers the names of the elements to be the primary +extraction and replacement mechanism. If the argument has repeated +names, function {\tt frab()} sums them: + +\vphantom{f}\\[2\baselineskip] +\begin{minipage}[t]{0.49\textwidth} +<>= +frab(c(t=3,q=2,t=4,q=-1,p=6,a=3,t=5)) +@ +\end{minipage} +\hfill +\vrule +\hfill +\begin{minipage}[t]{0.49\textwidth} +\vspace{-\baselineskip} +\color{violet} +\begin{itemize} +\item Function {\tt frab()} coerces its argument, a named +vector, to an object of class {\tt frab} +\item Element {\tt t} and {\tt t} are summed, with values $3+5=8$ and +$3+4+5=12$ respectively +\end{itemize} +\color{black} +\end{minipage} +\vphantom{f}\\[2\baselineskip] + +Above we see that the entries for {\tt t} and {\tt q} are summed. +Zero entries are discarded: + +\vphantom{f}\\[2\baselineskip] +\begin{minipage}[t]{0.49\textwidth} +<>= +frab(c(pear=1,kiwi=0,fig=3,lime=2,fig=-3)) +@ +\end{minipage} +\hfill +\vrule +\hfill +\begin{minipage}[t]{0.49\textwidth} +\vspace{-\baselineskip} +\color{violet} +\begin{itemize} +\item Function {\tt frab()} coerces its argument, a named +vector, to an object of class {\tt frab} +\item Element {\tt kiwi} is discarded, having a zero value +\item Element {\tt fig} vanishes, its entries cancelling +\end{itemize} +\color{black} +\end{minipage} +\vphantom{f}\\[2\baselineskip] + +Above we see that zero entries are discarded, irrespective of whether +a zero is explicitly given, or repeated values cancel. However, the +main motivation for using {\tt frab} objects is that they may be +added: + +\vphantom{f}\\[2\baselineskip] +\begin{minipage}[t]{0.49\textwidth} +<>= +a <- frab(c(x=2,y=1,z=3)) +b <- frab(c(y=3,x=3,u=1)) +a+b +@ +\end{minipage} +\hfill +\vrule +\hfill +\begin{minipage}[t]{0.49\textwidth} +\vspace{-\baselineskip} +\color{violet} +\begin{itemize} +\item Objects {\tt a} and {\tt b} are of class {\tt frab} +\item Their sum is defined in terms of the keys of the summands, not +position +\item Thus, {\tt a+b} has 5 ($=2+3$) for its {\tt x} entry and 4 +($=1+3$) for its {\tt y} entry +\item {\tt a+b} has its entries in implementation-specific order, as per + {\tt disordR} discipline +\item Also, note that {\tt a+b} has length +4, while {\tt a} and {\tt b} have length 3 +\end{itemize} +\color{black} +\end{minipage} +\vphantom{f}\\[2\baselineskip] + +\section{Extensions of {\tt frab} objects to floating-point values} + +The {\tt frab} class is sufficiently flexible to incorporate +floating-point values, although one has to be a little careful with +numerical round-off errors: + +\vphantom{f}\\[2\baselineskip] +\begin{minipage}[t]{0.49\textwidth} +<>= +x <- frab(c(a=4,u=pi,p=exp(pi))) +y <- frab(c(p=-exp(pi)/3,u=-pi)) +z <- frab(c(p=-exp(pi)*2/3)) +x+y+z +@ +\end{minipage} +\hfill +\vrule +\hfill +\begin{minipage}[t]{0.49\textwidth} +\vspace{-\baselineskip} +\color{violet} +\begin{itemize} +\item Objects {\tt x}, {\tt y} and {\tt z} are of class {\tt frab} +\item Their sum {\tt x+y+z} should have zero entries for {\tt u} and {\tt p} +\item We see the entry for {\tt u} vanishes\ldots +\item \ldots But the entry for {\tt p} is nonzero, being subject to (small) + numerical roundoff error +\end{itemize} +\color{black} +\end{minipage} +\vphantom{f}\\[2\baselineskip] + + +\section{The {\tt frab} package and {\tt disordR} discipline} + +The {\tt frab} package conforms to disord +discipline~\cite{hankin2022_disordR}. Here I present some discussion +of the motivation for this design decision. Consider the following +short R session: + +\vphantom{f}\\[2\baselineskip] +\begin{minipage}[t]{0.49\textwidth} +\begin{Schunk} +\begin{Sinput} +(a <- frab(c(x=2,y=1,u=8,z=3,v=5))) +\end{Sinput} +\begin{Soutput} +A frab object with entries +u v x y z +8 5 2 1 3 +\end{Soutput} +\begin{Sinput} +> a["x"] +\end{Sinput} +\begin{Soutput} +A frab object with entries +x +2 +\end{Soutput} +\begin{Sinput} +> a[1] +\end{Sinput} +\begin{Soutput} +Error in .local(x, i, j = j, ..., drop): + not implemented +> +\end{Soutput} +\end{Schunk} +\end{minipage} +\hfill +\vrule +\hfill +\begin{minipage}[t]{0.49\textwidth} +\vspace{-\baselineskip} +\color{violet} +\begin{itemize} +\item Object {\tt a} is a map from symbols to numeric values +\item The {\tt STL map} class stores value-key pairs in an undefined order +\item Thus, extracting the value for {\tt "x"} is fine, but because the order is not +defind it makes no sense to extract the ``first" element +\item And attempting to do so results in a disord discipline error +\end{itemize} +\color{black} +\end{minipage} +\vphantom{f}\\[2\baselineskip] + +Observe that we cannot dispense with order of the values entirely, +because sometimes I am interested in the vector of keys, or their +values, in isolation. If we want to work with the names or values of +a {\tt frab} object, then the {\tt disord} print methods are used: + +<>= +a <- frab(c(x=2,y=1,z=3)) +names(a) +values(a) +@ + +Above we see that {\tt names(a)} and {\tt values(a)} return {\tt +disord} objects, in this case with the same hash code which indicates +that the objects are consistent with one another in the sense of {\tt +disordR::consitent()}. These objects may be displayed and +subsequently manipulated, subject to disord discipline: + +\vphantom{f}\\[2\baselineskip] +\begin{minipage}[t]{0.49\textwidth} +<>= +(a <- frab(c(x=2,y=1,z=3))) +names(a) <- toupper(names(a)) +a +@ +\end{minipage} +\hfill +\vrule +\hfill +\begin{minipage}[t]{0.49\textwidth} +\vspace{-\baselineskip} +\color{violet} +\begin{itemize} +\item {\tt a} is a {\tt frab} object +\item {\tt names(a)} is a {\tt disord} object as above +\item Replacement methods are defined, in this case {\tt toupper()} + returns a {\tt disord} object +\item The names of {\tt a} become their uppercase equivalents +\end{itemize} +\color{black} +\end{minipage} +\vphantom{f}\\[2\baselineskip] + +Again observe that there is no meaning to the operation ``extract the +first element of {\tt names(a)}", because the elements of {\tt +names(a)}, being a {\tt disord} object, are stored in an +implementation-specific order. We may manipulate the values of a {\tt +frab} object, if we are careful to be consistent with disord +discipline. The package includes a number of convenient replacement +idioms: + +\vphantom{f}\\[2\baselineskip] +\begin{minipage}[t]{0.49\textwidth} +<>= +(a <- frab(c(x=2,y=-1,z=3,p=-4,u=20))) +values(a) <- values(a)^2 +a +@ +\end{minipage} +\hfill +\vrule +\hfill +\begin{minipage}[t]{0.49\textwidth} +\vspace{-\baselineskip} +\color{violet} +\begin{itemize} +\item {\tt a} is a {\tt frab} object +\item We square the {\tt values()} of {\tt a} using the replacement method +\item And object {\tt a} is altered appropriately +\end{itemize} +\color{black} +\end{minipage} +\vphantom{f}\\[2\baselineskip] + +Further, we may use the {\tt disindex} class of the {\tt disordR} +package to replace certain values using standard square bracket +replacement idiom: + +\vphantom{f}\\[2\baselineskip] +\begin{minipage}[t]{0.49\textwidth} +<>= +(a <- frab(c(x=2,y=-1,z=11,p=-4,u=20))) +a[a>10] <- 19 +a +@ +\end{minipage} +\hfill +\vrule +\hfill +\begin{minipage}[t]{0.49\textwidth} +\vspace{-\baselineskip} +\color{violet} +\begin{itemize} +\item {\tt a} is a {\tt frab} object +\item We set any value exceeding 10 to 19 +\item And object {\tt a} is altered appropriately +\end{itemize} +\color{black} +\end{minipage} +\vphantom{f}\\[2\baselineskip] + + + +\section{Higher-dimensional tables} + +The ideas above have a natural generalization to multi-dimensional +tables. The {\tt frab} package {\tt S4} class is {\tt sparsetable}, +and this has arithmetic methods implemented. We start with +three-dimensional tables (two-dimensional {\tt sparstable} objects +have a print method that coerces them to arrays before displaying, and +the general case is easier to understand). + +\vphantom{f}\\[2\baselineskip] +\begin{minipage}[t]{0.49\textwidth} +<>= +abcd <- letters[1:4] +jj <- sample(abcd,99,repl=T,prob=1/(1:4)^2) +I <- matrix(jj,33,3) +head(I) +nrow(I) +(x <- sparsetable(I)) +@ +\end{minipage} +\hfill +\vrule +\hfill +\begin{minipage}[t]{0.49\textwidth} +\vspace{-\baselineskip} +\color{violet} +\begin{itemize} +\item Object {\tt abc} is just {\tt c("a","b","c")}, defined to save +horizontal space +\vspace{30mm} +\item Random matrix {\tt I} has elements a-d +\item It has 33 rows +\item Coercing {\tt I} to {\tt sparsetable} object {\tt x} shows the counts +of each type of row. +\item Object {\tt x} has 13 rows; not all $4^3=64$ possible row types +are represented. +\end{itemize} +\color{black} +\end{minipage} +\vphantom{f}\\[2\baselineskip] +@ + +Standard extraction and replacement operations work as expected on {\tt sparsetable} object {\tt x}: + + +\vphantom{f}\\[2\baselineskip] +\begin{minipage}[t]{0.49\textwidth} +<>= +x["a","a","b"] +x[rbind(c("a","b","b"),c("a","b","c"))] +@ +\end{minipage} +\hfill +\vrule +\hfill +\begin{minipage}[t]{0.49\textwidth} +\vspace{-\baselineskip} +\color{violet} +\begin{itemize} +\item Extraction can operate with the arguments sent separately\ldots +\vspace{3mm} +\item Or as a matrix. +\end{itemize} +\color{black} +\end{minipage} +\vphantom{f}\\[2\baselineskip] +@ + + +However, other more specialised extraction methods are also provided: + +\vphantom{f}\\[2\baselineskip] +\begin{minipage}[t]{0.49\textwidth} +<>= +x[x>3] +@ +\end{minipage} +\hfill +\vrule +\hfill +\begin{minipage}[t]{0.49\textwidth} +\vspace{-\baselineskip} +\color{violet} +\begin{itemize} +\item Extraction can operate with a disord object +\item Here we extract every element greater than 3 +\end{itemize} +\color{black} +\end{minipage} +\vphantom{f}\\[2\baselineskip] +@ + +Replacement methods are also defined: + +\vphantom{f}\\[2\baselineskip] +\begin{minipage}[t]{0.49\textwidth} +<>= +x[x==1] <- 0 +x +@ +\end{minipage} +\hfill +\vrule +\hfill +\begin{minipage}[t]{0.49\textwidth} +\vspace{-\baselineskip} +\color{violet} +\begin{itemize} +\item Replacement can operate with a disord object +\item Here we replace each element equal to 1 with zero +\item Only elements $>1$ remain +\end{itemize} +\color{black} +\end{minipage} +\vphantom{f}\\[2\baselineskip] +@ + +Replacement methods can add new entries if needed: + +\vphantom{f}\\[2\baselineskip] +\begin{minipage}[t]{0.49\textwidth} +<>= +x["a","x","y"] <- 1000 +x +@ +\end{minipage} +\hfill +\vrule +\hfill +\begin{minipage}[t]{0.49\textwidth} +\vspace{-\baselineskip} +\color{violet} +\begin{itemize} +\item The replacement operator can create new entries +\item Here we create a new entry corresponding to {\tt a x y} with value $1000$ +\item The new entry is placed in its own implementation-specific +location, as per {\tt disordR} discipline +\end{itemize} +\color{black} +\end{minipage} +\vphantom{f}\\[2\baselineskip] +@ + + +\section{Two dimensional tables} + +If a {\tt sparsetable} object has arity 2, it is coerced to matrix +form before printing. Otherwise, the semantics are the same as for +any other {\tt sparsetable} object. + +\vphantom{f}\\[2\baselineskip] +\begin{minipage}[t]{0.49\textwidth} +<>= +(x <- rspar2(9)) +(y <- rspar2(9)) +x+y +@ +\end{minipage} +\hfill +\vrule +\hfill +\begin{minipage}[t]{0.49\textwidth} +\vspace{-\baselineskip} +\color{violet} +\begin{itemize} +\item Function {\tt rspar()} gives a random 2D sparse table +\vspace{25mm} +\item {\tt x} and {\tt y} are random {\tt sparsetable} objects +\vspace{20mm} +\item Package idiom allows sparse tables to be added even if the tables are different sizes +\item The resulting sum is automatically resized to accommodate both +addends +\item Entries with nonzero values in both {\tt x} and {\tt y} +are correctly summed +\end{itemize} +\color{black} +\end{minipage} +\vphantom{f}\\[2\baselineskip] +@ + + + +\appendix +\section{Appendix: Named vectors in R} + +A {\em named vector} is a vector with a names attribute; they are a +convenient and useful feature of the R programming language (R Core +Team 2022). Each element of a named vector is associated with a name +or label. Objects of the {\tt frab} class bears some resemblance to +named vectors. However, there are some profound differences: + +\subsection{Uniqueness of names} + +The names of a named vector are not necessarily unique, unlike those +of a {\tt frab} object. This has consequences for extraction and +replacement operations. Consider the following: + +\vphantom{f}\\[2\baselineskip] +\begin{minipage}[t]{0.49\textwidth} +<>= +(x <- c(a=7,b=4,a=3)) +x["a"] +@ +\end{minipage} +\hfill +\vrule +\hfill +\begin{minipage}[t]{0.49\textwidth} +\vspace{-\baselineskip} +\color{violet} +\begin{itemize} +\item Object {\tt x} is a named vector with three elements. Both the +first and the third element are named {\tt "a"} +\item This is perfectly OK +\item Extracting element {\tt "a"} returns the {\em first} element +with this name (Technically, it returns a named numeric vector of +length 1) +\end{itemize} +\color{black} +\end{minipage} +\vphantom{f}\\[2\baselineskip] + +(we note in passing that double square bracket extraction, as in {\tt +x[["a"]]}, returns the value of the first element with name {\tt a}. + +\subsection{Replacement methods for named vectors} + +Replacement methods for named vectors is also somewhat problematic: + +\vphantom{f}\\[2\baselineskip] +\begin{minipage}[t]{0.49\textwidth} +<>= +(x <- c(b=7,a=4,b=3,c=5)) +x["a"] <- 100 +x +@ +\end{minipage} +\hfill +\vrule +\hfill +\begin{minipage}[t]{0.49\textwidth} +\vspace{-\baselineskip} +\color{violet} +\begin{itemize} +\item Object {\tt x} is a named vector with four elements. Both the +first and the third element are named {\tt "a"} +\item Replacing the element {\tt "a"} with 100 behaves as expected: +the element with name {\tt "a"} is returned +\end{itemize} +\color{black} +\end{minipage} +\vphantom{f}\\[2\baselineskip] + +This might lead one to believe that replacement of multiple elements +would behave as expected. But: + +\vphantom{f}\\[2\baselineskip] +\begin{minipage}[t]{0.49\textwidth} +<>= +x <- c(b=7,a=4,b=3,c=5) +x["b"] <- 100 +x +@ +\end{minipage} +\hfill +\vrule +\hfill +\begin{minipage}[t]{0.49\textwidth} +\vspace{-\baselineskip} +\color{violet} +\begin{itemize} +\item Object {\tt x} is as before +\item Replacing the elements (putatively) indexed with {\tt "b"} [of +which there are two] with {\tt 100} results in only one element +being replaced. +\end{itemize} +\color{black} +\end{minipage} +\vphantom{f}\\[2\baselineskip] + +We may also use multiple names for the index in a replacement operation: + +\vphantom{f}\\[2\baselineskip] +\begin{minipage}[t]{0.49\textwidth} +<>= +x <- c(a=7, b=4, a=3, c=5) +x[c("a","c")] <- c(100,101) +x +@ +\end{minipage} +\hfill +\vrule +\hfill +\begin{minipage}[t]{0.49\textwidth} +\vspace{-\baselineskip} +\color{violet} +\begin{itemize} +\item Object {\tt x} is as before +\item Replacing the elements (putatively) indexed with {\tt +c("a","c")} with {\tt c(100,101)} replaces the first (but not the +second) of the {\tt "a"} elements, and the {\tt "c"} element, with the +replacement value +\end{itemize} +\color{black} +\end{minipage} +\vphantom{f}\\[2\baselineskip] + +\subsection{Addition of named vectors} + +Named vectors obey the usual algebraic relations for vectors, although +the details can be unexpected. Firstly, if nontrivial recycling rules +are applied, the result retains only the names of the longer of the +two addends: + +\vphantom{f}\\[2\baselineskip] +\begin{minipage}[t]{0.49\textwidth} +<>= +(x <- c(a=7, b=4, a=3, c=5)) +x + c(uu=100) +@ +\end{minipage} +\hfill +\vrule +\hfill +\begin{minipage}[t]{0.49\textwidth} +\vspace{-\baselineskip} +\color{violet} +\begin{itemize} +\item Object {\tt x} is a standard named vector +\item Adding {\tt c(uu=100)} [a named vector of length 1] to {\tt x} changes the values but not the names of the result +\end{itemize} +\color{black} +\end{minipage} +\vphantom{f}\\[2\baselineskip] + +Now with {\tt x} and {\tt y} named vectors of the same length, there +are at least three plausible values that it might give, viz: + +\vphantom{f}\\[2\baselineskip] +\begin{minipage}[t]{0.49\textwidth} +<>= +x <- c(a=5,b=3,c=4) +y <- c(b=4,c=2,a=3) +plausible1 <- c(a=9,b=5,c=7) +plausible2 <- c(b=9,c=5,a=7) +plausible1 <- c(a=8,b=7,c=6) +@ +\end{minipage} +\hfill +\vrule +\hfill +\begin{minipage}[t]{0.49\textwidth} +\vspace{-\baselineskip} +\color{violet} +\begin{itemize} +\item Objects {\tt x} and {\tt y} are standard named vectors +\item Three plausible results: {\tt p1 p2 p3} +\item {\tt p1} adds elementwise and assigns names of {\tt x} +\item {\tt p2} adds elementwise and assigns names of {\tt y} +\item {\tt p3} adds namewise +\end{itemize} +\color{black} +\end{minipage} +\vphantom{f}\\[2\baselineskip] + +A good case could be made for any of the plausible outcomes above. +However, in standard R idiom, adding two named vectors is equivalent +to stripping the names attribute, performing the addition, then +inserting the names as appropriate. If the two addends are of equal +length, the names of the first one is given the the result: + +\vphantom{f}\\[2\baselineskip] +\begin{minipage}[t]{0.49\textwidth} +<>= +x <- c(a=1,b=2,c=3) +y <- c(c=4,b=1,a=1) +x+y +y+x +@ +\end{minipage} +\hfill +\vrule +\hfill +\begin{minipage}[t]{0.49\textwidth} +\vspace{-\baselineskip} +\color{violet} +\begin{itemize} +\item Objects {\tt x} and {\tt y} are standard named vectors +\item {\tt x+y} and {\tt y+x} have the same values but different names +\item {\tt x+y} inherits the names of {\tt x} +\item {\tt y+x} inherits the names of {\tt y} +\end{itemize} +\color{black} +\end{minipage} +\vphantom{f}\\[2\baselineskip] + +If the addends are of incompatible length, a warning is given: + + +\vphantom{f}\\[2\baselineskip] +\begin{minipage}[t]{0.49\textwidth} +\begin{Schunk} +\begin{Sinput} +> x <- c(a=1,b=2,c=3) +> y <- c(c=4,b=1,a=1,p=4) +> x+y +\end{Sinput} +\begin{Soutput} +c b a p +5 3 4 5 +Warning message: +In x + y : longer object length is not a + multiple of shorter object length +> +\end{Soutput} +\end{Schunk} +\end{minipage} +\hfill +\vrule +\hfill +\begin{minipage}[t]{0.49\textwidth} +\vspace{-\baselineskip} +\color{violet} +\begin{itemize} +\item Objects {\tt x} and {\tt y} are standard named vectors +\item {\tt x+y} is calculated using standard recycling rules +\item A warning (and optionally an error) is given +\item Names are inherited from the longer of the two addends, here {\tt y} +\end{itemize} +\color{black} +\end{minipage} +\vphantom{f}\\[2\baselineskip] + + +\section{Tables in R} + +Objects of class {\tt table} are created by function {\tt +table::base()}. Their behaviour is discussed here. Suppose we have +three tables: + +\vphantom{f}\\[2\baselineskip] +\begin{minipage}[t]{0.49\textwidth} +<>= +xl <- c("a","a","b","c","d","d","a") +yl <- c("a","a","b","d","d","d","e") +zl <- c("a","a","b","d","d","e","f") +x <- table(xl) +y <- table(yl) +z <- table(zl) +x +@ +\end{minipage} +\hfill +\vrule +\hfill +\begin{minipage}[t]{0.49\textwidth} +\vspace{-\baselineskip} +\color{violet} +\begin{itemize} +\item Objects {\tt xl yl zl} are character vectors +\item They may be tabulated using {\tt table()} +\item Object {\tt x} is of class {\tt table} +\item Its entries are in alphabetical order +\item The internal structure of {\tt x} is that of an array with named dimensions +\item Objects {\tt y} and {\tt z} are similar +\end{itemize} +\color{black} +\end{minipage} +\vphantom{f}\\[2\baselineskip] + +Can we ascribe any meaning to {\tt x+y}? We attempt standard R semantics: + +\vphantom{f}\\[2\baselineskip] +\begin{minipage}[t]{0.49\textwidth} +<>= +x +y +x+y +@ +\end{minipage} +\hfill +\vrule +\hfill +\begin{minipage}[t]{0.49\textwidth} +\vspace{-\baselineskip} +\color{violet} +\begin{itemize} +\item The sum is defined here +\item No error or warning is given +\item The result is clearly incorrect: the entries for {\tt c,d,e} +should be $1+0=1$, $2+3=5$, and $0+1=1$ respectively +\item The result given by R is comparable to the result of adding named vectors +\end{itemize} +\color{black} +\end{minipage} +\vphantom{f}\\[2\baselineskip] + +The correct way to add such tables would be by concatenating their +respective data: + +\vphantom{f}\\[2\baselineskip] +\begin{minipage}[t]{0.49\textwidth} +<>= +x +y +table(c(xl,yl)) +@ +\end{minipage} +\hfill +\vrule +\hfill +\begin{minipage}[t]{0.49\textwidth} +\vspace{-\baselineskip} +\color{violet} +\begin{itemize} +\item Objects {\tt xl,yl} are defined in the previous chunk +\item We may tabulate {\tt c(xl,yl)} +\item The resulting object in essence sums the named entries of the tables {\tt x,y} +\item For example, the entry for {\tt a} is $3+2=5$ +\item This is a reasonable interpretation of ``{\tt x+y}" +\item Note that the result is length 5, and that of each table is 4 +\end{itemize} +\color{black} +\end{minipage} +\vphantom{f}\\[2\baselineskip] + + +If we do not have access to {\tt xl} and {\tt yl} then the only way to +``add" {\tt x} and {\tt y} would be to reconstruct them: + +\vphantom{f}\\[2\baselineskip] +\begin{minipage}[t]{0.49\textwidth} +<>= +x +y +(xl_rec <- rep(names(x),times=x)) +(yl_rec <- rep(names(y),times=y)) +table(c(xl_rec,yl_rec)) +@ +\end{minipage} +\hfill +\vrule +\hfill +\begin{minipage}[t]{0.49\textwidth} +\vspace{-\baselineskip} +\color{violet} +\begin{itemize} +\item Objects {\tt x,y} are unchanged +\item We may reconstruct {\tt xl} by using {\tt base::rep(...)} with +the {\tt times} argument to form {\tt xl\_rec} +\item Similarly for {\tt yl} +\item And simply tabulate the concatentation {\tt c(xl\_rec,yl\_rec)} to ``add" {\tt x} and {\tt y} +\item The resulting table correctly sums the entries with regard to their labels +\end{itemize} +\color{black} +\end{minipage} +\vphantom{f}\\[2\baselineskip] + + +However, this is extremely inefficient, especially if the entries are +large. And indeed this method will not work for negative or +non-integral entries, although it is sufficiently robust to +accommodate zero entries consistently. + + +\bibliographystyle{apalike} +\bibliography{frab} + +\end{document} diff --git a/inst/read.me b/inst/read.me new file mode 100644 index 0000000..2466db4 --- /dev/null +++ b/inst/read.me @@ -0,0 +1,28 @@ +To create frab_arxiv.pdf, first use Sweave: + +R CMD Sweave frab_arxiv.Rnw + +this will create frab_arxiv.tex + +To process this with latex, you will need various auxiliary files such +as arxiv.sty and orcid.pdf. These are not under version control, you +will have to find these on the web. File frab.png (the hex sticker) +is in the man/figures directory. Copy these files to this directory, +inst/ + +There are two distinct versions of frab.bib, which use slightly +different markup. + + +Then to create frab_arxiv.pdf: + +pdflatex frab_arxiv +bibtex frab_arxiv +pdflatex frab_arxiv +pdflatex frab_arxiv + +should work. + + +For arxiv, upload frab_arxiv.tex which will be processed online. For +arxiv to process the file, you need to upload Sweave.sty as well. diff --git a/inst/wittgenstein.Rmd b/inst/wittgenstein.Rmd new file mode 100644 index 0000000..33e8a47 --- /dev/null +++ b/inst/wittgenstein.Rmd @@ -0,0 +1,42 @@ +--- +title: "That of which we cannot speak, we must remain silent" +author: "Robin Hankin" +date: "2023-07-25" +output: html_document +--- + +```{r setup, include=FALSE} +knitr::opts_chunk$set(echo = TRUE) +``` + +## Absent symbols and the `frab` package + +I do not really understand what Wittgenstein was trying to say in +proposition 7 of his _Tractatus_, but the interpretation of symbols +not present in a `frab` object is certainly an interesting problem. +Consider the canonical use-case for `frab` objects: + +```{r frabaplusb} +suppressMessages(library("frab")) +(a <- frab(c(u= -5,v=2,w=1))) +(b <- frab(c(v=1,x=2))) +a+b +``` + +Above we see that `a` has no `"x"` term, and `b` has `"x"=2`, the sum +being `"x"=2`; we are justified in asserting that, in object `b`, +`"x"=0`. However, consider this: + +```{r frabminus2} +(x <- frab(c(a=4, b=-3, c=1, d=-1, e=9))) +x[x > -2] +``` + +Above we extract every element greater than $-2$. We might say +"extract every element from `x` that is _known_ to exceed $-2$". It +has not extracted any absent symbols such as `f`, and as such we +cannot associate `f` with zero (because zero exceeds $-2$, and no `f` +element was extracted). But this is not consistent with the `a+b` +example above, in which absent symbols have the very definite and very +known value of zero. + diff --git a/man/Arith.Rd b/man/Arith.Rd index febf147..4dd8b5c 100644 --- a/man/Arith.Rd +++ b/man/Arith.Rd @@ -13,15 +13,18 @@ \alias{frab_negative} \alias{frab_reciprocal} \alias{frab_plus_frab} +\alias{frab_plus_numeric} +\alias{frab_multiply_frab} \alias{frab_unary} \alias{frab_arith_frab} \alias{frab_arith_numeric} \alias{numeric_arith_frab} \alias{c_frab_add} +\alias{c_frab_multiply} \alias{c_frab_eq} \alias{c_frab_pmax} \alias{c_frab_identity} -\title{Extraction and replacement methods for class \code{"frab"}} +\title{Arithmetic methods for class \code{"frab"}} \description{ The \code{frab} class provides basic arithmetic methods for frab @@ -38,10 +41,10 @@ frab_reciprocal(x) frab_plus_frab(F1,F2) frab_multiply_numeric(e1,e2) frab_power_numeric(e1,e2) -numeric_multiply_frab(e1,e2) numeric_power_frab(e1,e2) frab_unary(e1,e2) frab_arith_frab(e1,e2) +frab_plus_numeric(e1,e2) frab_arith_numeric(e1,e2) numeric_arith_frab(e1,e2) } @@ -64,10 +67,15 @@ numeric_arith_frab(e1,e2) \seealso{\code{\link{Compare}}} \examples{ -x <- frab(c(a=1,b=2,c=3)) -y <- frab(c(b=-2,d=8)) +(x <- frab(c(a=1,b=2,c=3))) +(y <- frab(c(b=-2,d=8,x=1,y=7))) +(z <- frab(c(c=2,x=5,b=1,a=6))) + x+y +x+y+z + +x*y } diff --git a/man/Compare.Rd b/man/Compare.Rd index f84c183..620d472 100644 --- a/man/Compare.Rd +++ b/man/Compare.Rd @@ -7,12 +7,14 @@ \alias{frab_eq} \alias{frab_compare_frab} \alias{frab_eq_num} +\alias{frab_ne_num} \alias{frab_gt_num} \alias{frab_ge_num} \alias{frab_lt_num} \alias{frab_le_num} \alias{frab_compare_numeric} \alias{num_eq_frab} +\alias{num_ne_frab} \alias{num_gt_frab} \alias{num_ge_frab} \alias{num_lt_frab} @@ -31,19 +33,22 @@ it calls low-level helper function \code{c_frab_eq()}, which calls its \proglang{C} namesake which is written for speed (specifically, returning \code{FALSE} as soon as it spots a difference between its - two arguments). + two arguments). Note that if any value is \code{NA}, \code{frab_eq()} + will return \code{FALSE}. } \usage{ frab_eq(e1,e2) frab_compare_frab(e1,e2) frab_eq_num(e1,e2) +frab_ne_num(e1,e2) frab_gt_num(e1,e2) frab_ge_num(e1,e2) frab_lt_num(e1,e2) frab_le_num(e1,e2) frab_compare_numeric(e1,e2) num_eq_frab(e1,e2) +num_ne_frab(e1,e2) num_gt_frab(e1,e2) num_ge_frab(e1,e2) num_lt_frab(e1,e2) diff --git a/man/Extract.Rd b/man/Extract.Rd index 0caaca1..783685c 100644 --- a/man/Extract.Rd +++ b/man/Extract.Rd @@ -5,6 +5,10 @@ \alias{values<-} \alias{values<-,frab,numeric-method} \alias{values<-,frab,disord-method} +\alias{names} +\alias{names,frab-method} +\alias{names<-,frab,disord-method} +\alias{names<-,frab,character-method} \alias{[} \alias{[.frab} \alias{[,frab-method} @@ -25,11 +29,15 @@ \alias{[<-,frab,ANY,ANY,ANY-method} \alias{[<-,frab,character,missing-method} \alias{[<-,frab,character,missing,numeric-method} +\alias{[<-,frab,character,missing,logical-method} \alias{[<-,frab,disord,missing-method} \alias{[<-,frab,disord,missing,frab-method} +\alias{[<-,frab,disord,missing,logical-method} \alias{[<-,frab,disord,missing,numeric-method} \alias{[<-,frab,disindex,missing,numeric,ANY-method} \alias{[<-,frab,disindex,missing,numeric-method} +\alias{[<-,frab,missing,missing,frab-method} +\alias{[<-,frab,missing,missing,numeric-method} \alias{[<-,frab,missing,missing,ANY-method} \title{Extraction and replacement methods for class \code{"frab"}} \description{ @@ -51,14 +59,20 @@ "missing", value = "ANY")}: \code{x["a"] <- 3}} \item{[<-}{\code{signature(x = "frab", i = "disord", j = "missing", value="frab")}: \code{x[x<0] <- -x[x<0]}; not implemented} + \item{[<-}{\code{signature(x = "frab", i = "disord", j = "missing", + value="logical")}: \code{x[x<0] <- NA}} \item{[<-}{\code{signature(x = "frab", i = "ANY",j = "ANY", value = "ANY")}: not implemented} \item{[<-}{\code{signature(x = "frab", i = "disindex",j = "missing", value = "numeric")}: \code{x[x>0] <- 3}} + \item{[<-}{\code{signature(x = "frab", i = "character", j = + "missing", value = "logical")}: \code{x["c"] <- NA}} } Double square extraction, as in \code{x[[i]]} and \code{x[[i]] <- - value}, is not currently defined. + value}, is not currently defined. In replacement methods, if + \code{value} is logical it is coerced to numeric (this includes + \code{NA}). } \author{Robin K. S. Hankin} @@ -81,5 +95,8 @@ v <- values(x) v[v<0] <- abs(v[v<0]) + 50 values(x) <- v +names(x) <- toupper(names(x)) +x + } diff --git a/man/frab-class.Rd b/man/frab-class.Rd index ed68d74..02c1c27 100644 --- a/man/frab-class.Rd +++ b/man/frab-class.Rd @@ -1,9 +1,7 @@ \name{frab-class} \docType{class} \alias{frab-class} -\alias{names} \alias{namedvector} -\alias{names,frab-method} \alias{namedvector,frab-method} \title{Class \dQuote{frab}} \description{The formal \proglang{S4} class for frab objects} @@ -15,7 +13,6 @@ and \code{namedvector()}. } \usage{ -\S4method{names}{frab}(x) \S4method{namedvector}{frab}(x) } \arguments{\item{x}{Object of class \code{frab}}} diff --git a/man/frab.Rd b/man/frab.Rd index 23c90fa..205644d 100644 --- a/man/frab.Rd +++ b/man/frab.Rd @@ -31,9 +31,15 @@ dispatches to \code{list_to_frab()}. If given a table it dispatches to \author{Robin K. S. Hankin} \seealso{\code{\link{frab-class}}} \examples{ + +frab(c(x=6,y=6,z=-4,u=0,x=3)) + as.frab(c(a=2,b=1,c=77)) as.frab(list(names=letters[5:2],values=1:4)) +x <- rfrab() +y <- rfrab() +x+y } diff --git a/man/misc.Rd b/man/misc.Rd index af6024d..be78aef 100644 --- a/man/misc.Rd +++ b/man/misc.Rd @@ -8,10 +8,13 @@ \alias{lapply.disord} \alias{is.na} \alias{is.na.frab} +\alias{is.na,frab-method} +\alias{is.notna} +\alias{is.notna.frab} +\alias{is.notna,frab-method} \alias{is.na<-} \alias{is.na<-.frab} \alias{is.na<-.frab} -\alias{is.na,frab-method} \alias{is.na<-.frab} \alias{is.na<-,frab,disord-method} \alias{!,frab-method} @@ -32,17 +35,22 @@ on \code{frab} object \code{x}, work with \code{values(x)} (which is a little more involved: \itemize{ - \item{length()} returns the length of the data component of the + \item\code{length()} returns the length of the data component of the object. - \item\code{which()} returns a \code{disind} object when given a - Boolean \code{frab} - \item{\code{is.na()} returns a logical \code{disord} object} -} + \item\code{which()} returns an error when called with a \code{frab} + object, but is useful here because it returns a \code{disind} when + given a Boolean \code{disord} object. This is useful for idiom such + as \code{x[x>0]} + \item Functions \code{is.na()} and \code{is.notna()} return a + \code{disind} object} } \value{Generally return frabs} \author{Robin K. S. Hankin} \note{ -note here + Constructions such as \code{!is.na(x)} do not work if \code{x} is a + \code{frab} object: this is because \code{is.na()} returns a + \code{disind} object, not a logical. Use \code{is.notna()} to + identify elements that are not \code{NA}. } \seealso{\code{\link{extract}}} \examples{ @@ -61,4 +69,8 @@ is.na(x) <- x<3 x x[is.na(x)] <- 100 x + +y <- frab(c(a=5,b=NA,c=3,d=NA)) +y[is.notna(y)] <- 199 +y } diff --git a/man/sparsetable.Rd b/man/sparsetable.Rd new file mode 100644 index 0000000..05ca6e8 --- /dev/null +++ b/man/sparsetable.Rd @@ -0,0 +1,196 @@ +\name{sparsetable} +\alias{sparsetable} +\alias{as.sparsetable} +\alias{is.sparsetable} +\alias{sparsetable-class} +\alias{index} +\alias{index,sparsetable-method} +\alias{values,sparsetable-method} +\alias{names,sparsetable-method} +\alias{dimnames,sparsetable-method} +\alias{dimnames<-,sparsetable-method} +\alias{dimnames<-,sparsetable,ANY-method} +\alias{is.empty,sparsetable-method} +\alias{arity} +\alias{arity,sparsetable-method} +\alias{dim} +\alias{dim,sparsetable-method} +\alias{asum} +\alias{asum.sparsetable} +\alias{asum,sparsetable-method} +\alias{asum_sparsetable} +\alias{asum_exclude_sparsetable} +\alias{as.array,sparsetable-method} +\alias{nterms} +\alias{nterms,sparsetable-method} +\alias{show,sparsetable-method} +\alias{print_sparsetable_matrixform} +\alias{sparsetable_to_table} +\alias{sparsetable_to_frab} +\alias{sparsetable_asum_include} +\alias{sparsetable_asum_exclude} +\alias{table_to_sparsetable} +\alias{sparsetable} +\alias{array_to_sparsetable} +\alias{sparsetable_to_array} +\alias{sparsetable_pmax} +\alias{sparsetable_pmin} +\alias{sparsetable_negative} +\alias{sparsetable_times_scalar} +\alias{sparsetable_eq_sparsetable} +\alias{sparsetable_equality} +\alias{sparsetable_maker} +\alias{rspar} +\alias{rspar2} +\alias{rsparr} +\alias{sparsetable_add} +\alias{sparsetable_negative} +\alias{sparsetable_reciprocal} +\alias{sparsetable_plus_sparsetable} +\alias{sparsetable_multiply_sparsetable} +\alias{sparsetable_multiply_numeric} +\alias{sparsetable_power_numeric} +\alias{numeric_compare_sparsetable} +\alias{numeric_multiply_sparsetable} +\alias{numeric_power_sparsetable} +\alias{sparsetable_accessor} +\alias{sparsetable_unary} +\alias{sparsetable_arith_sparsetable} +\alias{sparsetable_arith_numeric} +\alias{numeric_arith_sparsetable} +\alias{sparsetable_overwrite} +\alias{sparsetable_setter} +\alias{sparsetable_eq} +\alias{sparsetable_compare_sparsetable} +\alias{sparsetable_eq_num} +\alias{sparsetable_gt_num} +\alias{sparsetable_ge_num} +\alias{sparsetable_lt_num} +\alias{sparsetable_le_num} +\alias{sparsetable_compare_numeric} +\alias{num_eq_sparsetable} +\alias{num_gt_sparsetable} +\alias{num_ge_sparsetable} +\alias{num_lt_sparsetable} +\alias{num_le_sparsetable} +\alias{drop} +\alias{drop,sparsetable-method} +\alias{[,sparsetable-method} +\alias{[,sparsetable,disord,missing-method} +\alias{[,sparsetable,disindex,missing,ANY-method} +\alias{[,sparsetable,disord,missing,ANY-method} +\alias{[,sparsetable,ANY,ANY,ANY-method} +\alias{[<-,sparsetable-method} +\alias{[<-,sparsetable,ANY,ANY,ANY-method} +\alias{[<-,sparsetable,disindex,missing,ANY-method} +\alias{[<-,sparsetable,disord,missing,numeric-method} +\alias{pmax_sparsetable} +\alias{pmin_sparsetable} +\alias{pmax.sparsetable} +\alias{pmin.sparsetable} +\alias{pmax,sparsetable-method} +\alias{pmin,sparsetable-method} +\alias{pmax_pair_sparsetable} +\alias{pmin_pair_sparsetable} +\alias{pmax_dots_sparsetable} +\alias{pmin_dots_sparsetable} + +\title{Generalized sparse tables: \code{sparsetable} objects} +\description{Package idiom for creating and manipulating + \code{sparsetable} objects} +\usage{ +sparsetable(i,v=1) +rspar(n=15,l=3,d=3) +rspar2(n=15,l=6) +rsparr(n=20,d=6,l=5,s=4) +sparsetable_to_array(x) +array_to_sparsetable(x) +sparsetable_to_frab(x) +\S4method{index}{sparsetable}(x) +\S4method{values}{sparsetable}(x) +\S4method{dimnames}{sparsetable}(x) +\S4method{dim}{sparsetable}(x) +} +\arguments{ + \item{x}{In functions like \code{index()}, an object of class \code{sparsetable}} + \item{i,v}{In standard constructor function \code{sparsetable()}, + argument \code{i} is the index matrix of strings, and \code{v} a + numeric vector of values} + \item{n,l,d,s}{In functions \code{rspar()}, \code{rspar2()}, and + \code{rsparr()}, \code{n} is the number of terms, \code{l} the + number of letters, \code{d} the dimensionality and \code{s} the + number of distinct marginal values to return} +} +\details{ + +Most functions here mirror their equivalent in the \CRANpkg{spray} +package [which the \proglang{C} code is largely copied from] or the +\code{frab} functionality. So, for example, \code{num_eq_sparsetable()} +is the equivalent of \code{num_eq_spray()}. + +The print method treats arity-2 \code{sparsetable} objects differently +from other arities. By default, arity-2 \code{sparsetable} objects are +displayed as two-dimensional tables. Control this behaviour with option +\code{print_2dsparsetables_as_matrices}: + +\preformatted{ + options("print_2dsparsetables_as_matrices" = FALSE) +} + +The default value for this option, non-\code{FALSE} (including its +out-of-the-box status of \dQuote{unset}), directs the print method to +coerce arity-2 \code{sparsetable} objects to two-dimensional tables +before printing. If this option is \code{FALSE}, arity-2 sparsetables +are printed using matrix index form, just the same as any other arity. + +Functions \code{rspar()}, \code{rspar2()}, and \code{rsparr()} create +random \code{sparsetable} objects of increasing complexity. The +defaults are chosen to make the values of sensible sizes. + +Function \code{drop()} takes a sparsetable object of arity one and +coerces to a \code{frab} object. + +Function \code{dim()} returns a named vector, with names being the +\code{dimnames} of its argument. + +Extraction and replacement methods are a subset of \CRANpkg{spray} +methods, but most should work. There is special dispensation so that +standard idiom for arrays [e.g. \code{x['a','b','a']} and +\code{x['a','b','a'] <- 55}] work as expected, although the general +expectation is that access and replacement use (character) matrices and +an index object. However, indexing by \code{disord} and \code{disindex} +objects should also work [e.g. \code{x[x>7]}]. + +The \CRANpkg{spray} source code and the \code{sparstable} functionality +hve about 90\% overlap; there were enough small differences between the +codes to make it worth maintaining two sets of source code, IMO. + +There is a discussion of package idiom in the vignette, +\code{vignette("frab")}. + +} +\note{ + The pronunciation of \dQuote{sparsetable} has the emphasis on the + first syllable, so it rhymes with \dQuote{Barnable} or + \dQuote{Barnstaple}. +} +\author{Robin K. S. Hankin} +\seealso{\code{\link{frab-class}}} +\examples{ + +sparsetable(matrix(sample(letters[1:4],36,replace=TRUE),ncol=2),1:18) +sparsetable(matrix(sample(letters[1:4],39,replace=TRUE),ncol=3),1:13) + +(x <- rspar2(9)) +(y <- rspar2(9)) +x + y + +x["KT","FF"] <- 100 +x + +rsparr() + +a <- rspar(d=4) +asum(a,"Feb") + +} diff --git a/src/RcppExports.cpp b/src/RcppExports.cpp index 4bdeeb0..425948f 100644 --- a/src/RcppExports.cpp +++ b/src/RcppExports.cpp @@ -36,6 +36,20 @@ BEGIN_RCPP return rcpp_result_gen; END_RCPP } +// c_frab_multiply +List c_frab_multiply(const CharacterVector names1, const NumericVector values1, const CharacterVector names2, const NumericVector values2); +RcppExport SEXP _frab_c_frab_multiply(SEXP names1SEXP, SEXP values1SEXP, SEXP names2SEXP, SEXP values2SEXP) { +BEGIN_RCPP + Rcpp::RObject rcpp_result_gen; + Rcpp::RNGScope rcpp_rngScope_gen; + Rcpp::traits::input_parameter< const CharacterVector >::type names1(names1SEXP); + Rcpp::traits::input_parameter< const NumericVector >::type values1(values1SEXP); + Rcpp::traits::input_parameter< const CharacterVector >::type names2(names2SEXP); + Rcpp::traits::input_parameter< const NumericVector >::type values2(values2SEXP); + rcpp_result_gen = Rcpp::wrap(c_frab_multiply(names1, values1, names2, values2)); + return rcpp_result_gen; +END_RCPP +} // c_frab_pmax List c_frab_pmax(const CharacterVector names1, const NumericVector values1, const CharacterVector names2, const NumericVector values2); RcppExport SEXP _frab_c_frab_pmax(SEXP names1SEXP, SEXP values1SEXP, SEXP names2SEXP, SEXP values2SEXP) { @@ -64,12 +78,129 @@ BEGIN_RCPP return rcpp_result_gen; END_RCPP } +// sparsetable_maker +List sparsetable_maker(const CharacterMatrix& M, const NumericVector& d); +RcppExport SEXP _frab_sparsetable_maker(SEXP MSEXP, SEXP dSEXP) { +BEGIN_RCPP + Rcpp::RObject rcpp_result_gen; + Rcpp::RNGScope rcpp_rngScope_gen; + Rcpp::traits::input_parameter< const CharacterMatrix& >::type M(MSEXP); + Rcpp::traits::input_parameter< const NumericVector& >::type d(dSEXP); + rcpp_result_gen = Rcpp::wrap(sparsetable_maker(M, d)); + return rcpp_result_gen; +END_RCPP +} +// sparsetable_add +List sparsetable_add(const CharacterMatrix& M1, const NumericVector& d1, const CharacterMatrix& M2, const NumericVector& d2); +RcppExport SEXP _frab_sparsetable_add(SEXP M1SEXP, SEXP d1SEXP, SEXP M2SEXP, SEXP d2SEXP) { +BEGIN_RCPP + Rcpp::RObject rcpp_result_gen; + Rcpp::RNGScope rcpp_rngScope_gen; + Rcpp::traits::input_parameter< const CharacterMatrix& >::type M1(M1SEXP); + Rcpp::traits::input_parameter< const NumericVector& >::type d1(d1SEXP); + Rcpp::traits::input_parameter< const CharacterMatrix& >::type M2(M2SEXP); + Rcpp::traits::input_parameter< const NumericVector& >::type d2(d2SEXP); + rcpp_result_gen = Rcpp::wrap(sparsetable_add(M1, d1, M2, d2)); + return rcpp_result_gen; +END_RCPP +} +// sparsetable_overwrite +List sparsetable_overwrite(const CharacterMatrix& M1, const NumericVector& d1, const CharacterMatrix& M2, const NumericVector& d2); +RcppExport SEXP _frab_sparsetable_overwrite(SEXP M1SEXP, SEXP d1SEXP, SEXP M2SEXP, SEXP d2SEXP) { +BEGIN_RCPP + Rcpp::RObject rcpp_result_gen; + Rcpp::RNGScope rcpp_rngScope_gen; + Rcpp::traits::input_parameter< const CharacterMatrix& >::type M1(M1SEXP); + Rcpp::traits::input_parameter< const NumericVector& >::type d1(d1SEXP); + Rcpp::traits::input_parameter< const CharacterMatrix& >::type M2(M2SEXP); + Rcpp::traits::input_parameter< const NumericVector& >::type d2(d2SEXP); + rcpp_result_gen = Rcpp::wrap(sparsetable_overwrite(M1, d1, M2, d2)); + return rcpp_result_gen; +END_RCPP +} +// sparsetable_accessor +NumericVector sparsetable_accessor(const CharacterMatrix& M, const NumericVector& d, const CharacterMatrix& Mindex); +RcppExport SEXP _frab_sparsetable_accessor(SEXP MSEXP, SEXP dSEXP, SEXP MindexSEXP) { +BEGIN_RCPP + Rcpp::RObject rcpp_result_gen; + Rcpp::RNGScope rcpp_rngScope_gen; + Rcpp::traits::input_parameter< const CharacterMatrix& >::type M(MSEXP); + Rcpp::traits::input_parameter< const NumericVector& >::type d(dSEXP); + Rcpp::traits::input_parameter< const CharacterMatrix& >::type Mindex(MindexSEXP); + rcpp_result_gen = Rcpp::wrap(sparsetable_accessor(M, d, Mindex)); + return rcpp_result_gen; +END_RCPP +} +// sparsetable_setter +List sparsetable_setter(const CharacterMatrix& M1, const NumericVector& d1, const CharacterMatrix& M2, const NumericVector& d2); +RcppExport SEXP _frab_sparsetable_setter(SEXP M1SEXP, SEXP d1SEXP, SEXP M2SEXP, SEXP d2SEXP) { +BEGIN_RCPP + Rcpp::RObject rcpp_result_gen; + Rcpp::RNGScope rcpp_rngScope_gen; + Rcpp::traits::input_parameter< const CharacterMatrix& >::type M1(M1SEXP); + Rcpp::traits::input_parameter< const NumericVector& >::type d1(d1SEXP); + Rcpp::traits::input_parameter< const CharacterMatrix& >::type M2(M2SEXP); + Rcpp::traits::input_parameter< const NumericVector& >::type d2(d2SEXP); + rcpp_result_gen = Rcpp::wrap(sparsetable_setter(M1, d1, M2, d2)); + return rcpp_result_gen; +END_RCPP +} +// sparsetable_equality +bool sparsetable_equality(const CharacterMatrix& M1, const NumericVector& d1, const CharacterMatrix& M2, const NumericVector& d2); +RcppExport SEXP _frab_sparsetable_equality(SEXP M1SEXP, SEXP d1SEXP, SEXP M2SEXP, SEXP d2SEXP) { +BEGIN_RCPP + Rcpp::RObject rcpp_result_gen; + Rcpp::RNGScope rcpp_rngScope_gen; + Rcpp::traits::input_parameter< const CharacterMatrix& >::type M1(M1SEXP); + Rcpp::traits::input_parameter< const NumericVector& >::type d1(d1SEXP); + Rcpp::traits::input_parameter< const CharacterMatrix& >::type M2(M2SEXP); + Rcpp::traits::input_parameter< const NumericVector& >::type d2(d2SEXP); + rcpp_result_gen = Rcpp::wrap(sparsetable_equality(M1, d1, M2, d2)); + return rcpp_result_gen; +END_RCPP +} +// sparsetable_asum_include +List sparsetable_asum_include(const CharacterMatrix& M, const NumericVector& d, const IntegerVector& n); +RcppExport SEXP _frab_sparsetable_asum_include(SEXP MSEXP, SEXP dSEXP, SEXP nSEXP) { +BEGIN_RCPP + Rcpp::RObject rcpp_result_gen; + Rcpp::RNGScope rcpp_rngScope_gen; + Rcpp::traits::input_parameter< const CharacterMatrix& >::type M(MSEXP); + Rcpp::traits::input_parameter< const NumericVector& >::type d(dSEXP); + Rcpp::traits::input_parameter< const IntegerVector& >::type n(nSEXP); + rcpp_result_gen = Rcpp::wrap(sparsetable_asum_include(M, d, n)); + return rcpp_result_gen; +END_RCPP +} +// sparsetable_pmax +List sparsetable_pmax(const CharacterMatrix& M1, const NumericVector& d1, const CharacterMatrix& M2, const NumericVector& d2); +RcppExport SEXP _frab_sparsetable_pmax(SEXP M1SEXP, SEXP d1SEXP, SEXP M2SEXP, SEXP d2SEXP) { +BEGIN_RCPP + Rcpp::RObject rcpp_result_gen; + Rcpp::RNGScope rcpp_rngScope_gen; + Rcpp::traits::input_parameter< const CharacterMatrix& >::type M1(M1SEXP); + Rcpp::traits::input_parameter< const NumericVector& >::type d1(d1SEXP); + Rcpp::traits::input_parameter< const CharacterMatrix& >::type M2(M2SEXP); + Rcpp::traits::input_parameter< const NumericVector& >::type d2(d2SEXP); + rcpp_result_gen = Rcpp::wrap(sparsetable_pmax(M1, d1, M2, d2)); + return rcpp_result_gen; +END_RCPP +} static const R_CallMethodDef CallEntries[] = { {"_frab_c_frab_identity", (DL_FUNC) &_frab_c_frab_identity, 2}, {"_frab_c_frab_add", (DL_FUNC) &_frab_c_frab_add, 4}, + {"_frab_c_frab_multiply", (DL_FUNC) &_frab_c_frab_multiply, 4}, {"_frab_c_frab_pmax", (DL_FUNC) &_frab_c_frab_pmax, 4}, {"_frab_c_frab_eq", (DL_FUNC) &_frab_c_frab_eq, 4}, + {"_frab_sparsetable_maker", (DL_FUNC) &_frab_sparsetable_maker, 2}, + {"_frab_sparsetable_add", (DL_FUNC) &_frab_sparsetable_add, 4}, + {"_frab_sparsetable_overwrite", (DL_FUNC) &_frab_sparsetable_overwrite, 4}, + {"_frab_sparsetable_accessor", (DL_FUNC) &_frab_sparsetable_accessor, 3}, + {"_frab_sparsetable_setter", (DL_FUNC) &_frab_sparsetable_setter, 4}, + {"_frab_sparsetable_equality", (DL_FUNC) &_frab_sparsetable_equality, 4}, + {"_frab_sparsetable_asum_include", (DL_FUNC) &_frab_sparsetable_asum_include, 3}, + {"_frab_sparsetable_pmax", (DL_FUNC) &_frab_sparsetable_pmax, 4}, {NULL, NULL, 0} }; diff --git a/src/frab.cpp b/src/frab.cpp index a0acbbd..e9b65d7 100644 --- a/src/frab.cpp +++ b/src/frab.cpp @@ -46,6 +46,23 @@ frab sum2(frab F1, frab F2){ } } +frab prod2(frab F1, frab F2){ + frab out; + if(F1.size() > F2.size()){ + for(auto it = F2.begin() ; it != F2.end() ; ++it){ // iterate through the smaller one + const string symbol = it->first; + out[symbol] = F1[symbol] * F2[symbol]; + } + return remove_zeros(out); + } else { + for(auto it = F1.begin() ; it != F1.end() ; ++it){ + const string symbol = it->first; + out[symbol] = F2[symbol] * F1[symbol]; + } + return remove_zeros(out); + } +} + frab frabmaker(const CharacterVector names, const NumericVector values){ if(names.size() != values.size()) { throw std::invalid_argument("names and values are not same length"); @@ -117,6 +134,17 @@ List c_frab_add( ) ); } +//[[Rcpp::export]] +List c_frab_multiply( + const CharacterVector names1, const NumericVector values1, + const CharacterVector names2, const NumericVector values2 + ){ + return retval(prod2( + frabmaker(names1,values1), + frabmaker(names2,values2) + ) ); +} + //[[Rcpp::export]] List c_frab_pmax( const CharacterVector names1, const NumericVector values1, diff --git a/src/sparsetable_ops.cpp b/src/sparsetable_ops.cpp new file mode 100644 index 0000000..832e5e7 --- /dev/null +++ b/src/sparsetable_ops.cpp @@ -0,0 +1,262 @@ +// -*- mode: C++; c-indent-level: 4; c-basic-offset: 4; indent-tabs-mode: nil; -*- + +#define container vector // Could be 'vector' or 'deque' (both work but there may be performance differences) + +#define STRICT_R_HEADERS +#include + +#include +#include +#include +#include +#include +#include +#include + + +using namespace Rcpp; + +typedef std::container mycont; // a mycont is a container [vector or deque] of strings +typedef std::map sparsetable; + +sparsetable prepare(const CharacterMatrix M, const NumericVector d){ + sparsetable S; + sparsetable::iterator it; + mycont v; + + for(int i=0; isecond == 0){ + it = S.erase(it); // in C++11, erase() returns *next* iterator + } else { + ++it; // else just increment the iterator + } + } + return(S); +} + +CharacterMatrix makeindex(const sparsetable &S){ // takes a sparsetable, returns the matrix of indices + const unsigned int ncol = S.begin()->first.size(); + CharacterMatrix out(S.size(),ncol); // index + mycont v; + unsigned int row=0, col=0; + + for(auto it=S.begin(); it != S.end(); ++it){ + v = it->first; + col = 0; + for(auto ci=v.begin() ; ci != v.end() ; ++ci){ + out(row,col++) = *ci; + } + row++; + } + return(out); +} + +NumericVector makevalue(const sparsetable &S){ // takes a sparsetable, returns data + NumericVector out(S.size()); // data + unsigned int i=0; + sparsetable::const_iterator it; // it iterates through a sparse array + + for(it=S.begin(); it != S.end(); ++it){ + out(i++) = it->second; // initialize-and-fill is more efficient than out.push_back(it->second) + } + return(out); +} + +List retval (const sparsetable &S){ // used to return a list to R + + // In this function, returning a zero-row matrix results in a + // segfault ('memory not mapped'). So we check for 'S' being zero + // size and, if so, return a special Nil value. This corresponds to + // an empty sparsetable object. + + if(S.size() == 0){ + return List::create(Named("index") = R_NilValue, + Named("value") = R_NilValue + ); + } else { + return List::create(Named("index") = makeindex(S), + Named("value") = makevalue(S) + ); + } +} + + +// [[Rcpp::export]] +List sparsetable_maker +( + const CharacterMatrix &M, const NumericVector &d + ){ + return retval(prepare(M,d)); +} + +// [[Rcpp::export]] +List sparsetable_add +( + const CharacterMatrix &M1, const NumericVector &d1, + const CharacterMatrix &M2, const NumericVector &d2 + ){ + sparsetable S1 = prepare(M1, d1); + sparsetable S2 = prepare(M2, d2); + + for (sparsetable::const_iterator it=S2.begin(); it != S2.end(); ++it){ + const mycont v = it->first; + S1[v] += S2[v]; // the meat: S1=S1+S2 (S1 += S2) + if(S1[v]==0){S1.erase(v);} + } + + return retval(S1); +} + + +// [[Rcpp::export]] +List sparsetable_overwrite // something like S1[ind(S2)] <- S2 +( + const CharacterMatrix &M1, const NumericVector &d1, + const CharacterMatrix &M2, const NumericVector &d2 + ){ + sparsetable S1 = prepare(M1, d1); + sparsetable S2 = prepare(M2, d2); + + for (sparsetable::const_iterator it=S2.begin(); it != S2.end(); ++it){ + const mycont v = it->first; + S1[v] = S2[v]; // the meat + } + + return retval(S1); +} + +// [[Rcpp::export]] +NumericVector sparsetable_accessor // returns S1[] +( + const CharacterMatrix &M, const NumericVector &d, + const CharacterMatrix &Mindex + ){ + sparsetable S; + mycont v; + signed int k=0; + NumericVector out(Mindex.nrow()); + + S = prepare(M, d); + + for(int i=0; i index ; d2 -> value + ){ + mycont v; + + sparsetable S1 = prepare(M1, d1); + sparsetable S2 = prepare(M2, d2); + + for(int i=0; ifirst; + if(S1[v] != S2[v]){ + return FALSE; + } else { + S2.erase(v); + } + } + // at this point, S1[v] == S2[v] for every index 'v' of S1; and we + // know that S1 and S2 are the same size, so S1 and S2 are identical: + + return TRUE; + +} + +// [[Rcpp::export]] +List sparsetable_asum_include +( + const CharacterMatrix &M, const NumericVector &d, + const IntegerVector &n + ){ + sparsetable S; + mycont v; + + for(int i=0; ifirst; + if(S2[v] > S1[v]){ S1[v] = S2[v];} // S1[v] = max(S1[v],S2[v]); + S2.erase(v); // not S2[v] = 0; // OK because the iterator is it1 and this line modifies S2 + } + + for (sparsetable::const_iterator it = S2.begin(); it != S2.end(); ++it){ //iterate through S2 keys not in S1 + const mycont v = it->first; + if(S2[v] > 0){ S1[v] = S2[v]; } + } + + return retval(S1); +} + + diff --git a/tests/testthat/test_aaa.R b/tests/testthat/test_aaa.R index 7365e09..852d28d 100644 --- a/tests/testthat/test_aaa.R +++ b/tests/testthat/test_aaa.R @@ -1,20 +1,33 @@ test_that("Test suite aaa.R",{ x <- frab(c(a=1,b=2,c=3,i=4)) expect_true(is.frab(x)) + expect_true(is.frab(as.frab(x))) expect_true(x == frab(c(b=2,a=1,i=4,c=3))) expect_true(x + c(a=1) == frab(c(a=2,b=2,i=4,c=3))) expect_true(c(a=1) + x == frab(c(a=2,b=2,i=4,c=3))) + + jjx <- x + jjx[] <- 1 + expect_true(jjx == 1/jjx) + expect_error(as.frab(1:5)) + + expect_error(x > as.frab(c(a=3,b=4))) + expect_true(is.namedvector(as.namedvector(x))) expect_true(is.namedlogical(as.namedvector(x)>2)) expect_true(is.unnamedvector(as.vector(as.namedvector(x)))) expect_true(x^2 == frab(c(a=1,b=4,c=9,i=16))) - expect_true(2^x == frab(c(a=2,b=4,c=8,i=16))) + expect_error(2^x) + expect_error(x^c(r=3)) doublex <- frab(c(c=6,a=2,i=8,b=4)) minusx <- frab(c(c=-3,a=-1,i=-4,b=-2)) expect_true(2*x == doublex) expect_true(x*2 == doublex) + expect_error(x * c(a=1)) + expect_error(c(a=1) * x) + expect_true(x/0.5 == doublex) expect_true(1/frab(c(a=1,b=1,i=1)) == frab(c(a=1,i=1,b=1))) @@ -27,18 +40,21 @@ test_that("Test suite aaa.R",{ expect_true(-x == minusx) expect_true( x == -minusx) - expect_error(x+5) - expect_error(x-5) - expect_error(2+x) - expect_error(2-x) + expect_true(x+5 == 5+x) + expect_true(x-5 == -5+x) + + expect_true(-x+5 == 5-x) + expect_true(-x-5 == -5-x) expect_true(sum(x == 2) == 1) + expect_true(sum(x != 2) == 3) expect_true(sum(x > 2) == 2) expect_true(sum(x >= 2) == 3) expect_true(sum(x < 2) == 1) expect_true(sum(x <= 2) == 2) expect_true(sum(2 == x) == 1) + expect_true(sum(2 != x) == 3) expect_true(sum(2 <= x) == 3) expect_true(sum(2 > x) == 1) expect_true(sum(2 >= x) == 2) @@ -52,6 +68,7 @@ test_that("Test suite aaa.R",{ expect_output(print(x)) options("frab_print_hash" = TRUE) expect_output(print(x)) + expect_output(print(x*0)) options("frab_print_hash" = FALSE) expect_output(print(x)) options("frab_print_hash" = 233) @@ -145,6 +162,7 @@ test_that("Test suite aaa.R",{ expect_error(c_frab_identity(letters[1:6],1:5)) expect_output(print(zero())) + expect_output(print(frab(c(a=1,b=3,c=5))*0)) jj <- c(a=3,b=1,d=3) expect_false(is.namedlogical(jj)) @@ -158,6 +176,92 @@ test_that("Test suite aaa.R",{ expect_false(is.namedlogical(jj>2)) expect_true(is.unnamedlogical(jj>2)) + x <- frab(c(b=6,a=3,y=8)) + expect_error(values(x) <- 1:3) + values(x) <- 3 + expect_true(x == frab(c(a=3,b=3,y=3))) + + x <- frab(c(b=6,a=3,y=8)) + values(x) <- values(x)^2 + expect_true(x == frab(c(a=9,b=36,y=64))) + + x <- frab(c(b=6,a=3,y=8)) + expect_error(values(x) <- disord(letters[1:3])) + + + x <- frab(c(b=6,a=3,y=8)) + names(x) <- toupper(names(x)) + expect_true(x == frab(c(Y=8 , B=6 , A=3))) + + expect_error(names(x) <- "o") + expect_error(names(x) <- letters) + + x <- frab(c(u=9)) + names(x) <- "foo" + expect_true(x == frab(c(foo=9))) + + x <- frab(c(a=6,b=3,c=2)) + y <- frab(c(b=1,c=5,x=3,yy=8)) + expect_true(y*x == frab(c(b=3,c=10))) + x <- frab(c(a=6,b=3,c=2)) + y <- frab(c(b=1,c=5,x=3,yy=8)) + expect_true(x*y == frab(c(b=3,c=10))) + expect_true(x*y == y*x) + + x1 <- frab(c(b=6,a=3,y=8)) + x2 <- frab(c(a=6,b=3,c=2)) + x3 <- frab(c(b=1,c=5,x=3,yy=8)) + x4 <- frab(c(yy=-3,a=5,c=3)) + expect_true(pmax(x1,x2,x3,x4) == frab(c(a=6,b=6,c=5,x=3,y=8,yy=8))) + expect_true(pmin(x1,x2,x3,x4) == frab(c(yy=-3))) + + x <- frab(c(a=1,b=-4,c=2,d=5,e=6,f=-9)) + x["b"] <- NA + x[is.na(x)] <- 33 + expect_true(x == frab(c(a=1,b=33,c=2,d=5,e=6,f=-9))) + + x <- frab(c(a=1,b=-4,c=2,d=5,e=6,f=-9)) + x[c("b","e")] <- NA + x[is.na(x)] <- 34 + expect_true(x == frab(c(a=1,b=34,c=2,d=5,e=34,f=-9))) + + x <- frab(c(a=1,b=-4,c=2,d=5,e=6,f=-9)) + x[x<0] <- NA + x[is.na(x)] <- 35 + expect_true(x == frab(c(a=1,b=35,c=2,d=5,e=6,f=35))) + + x <- frab(c(a=1,b=-4,c=2,d=5,e=6,f=-9)) + expect_silent(is.na(x) <- x>0) + expect_true(x["b"] == -4) + + expect_error(list_to_frab(list(names=frab::values(rfrab())))) + + expect_true(is.frab(rfrabb())) + expect_true(is.frab(rfrabbb())) + expect_output(print(rfrabb()*0)) + + x <- frab(c(a=1,b=-4,c=2,d=5,e=6,f=-9)) + is.na(x) <- x>0 + expect_true(all(x[is.notna(x)] < 0)) + + x <- frab(c(a=1,b=-4,c=2,d=5,e=6,f=-9)) + expect_true(max(x) == 6) + expect_true(min(x) == -9) + expect_true(sum(x) == 1) + expect_true(all(range(x) == c(-9,6))) + expect_error(prod(x)) + values(x) <- 1 + expect_true(x == 1/x) + + expect_true(all(pmax(1:5,5:1) == c(5,4,3,4,5))) + expect_true(all(pmin(1:5,5:1) == c(1,2,3,2,1))) + + + x <- frab(c(a=1,b=-4,c=2,d=5,e=6,f=-9)) + expect_error(x[] <- x) + expect_error(x[x != 6] <- x) + expect_error(x[values(x)] <- x) + expect_error(x[x<5] <- 6*x[x<5]) }) diff --git a/tests/testthat/test_aab.R b/tests/testthat/test_aab.R index f0fe22b..26144f7 100644 --- a/tests/testthat/test_aab.R +++ b/tests/testthat/test_aab.R @@ -27,6 +27,9 @@ checker1 <- function(A){ expect_true(pmax(A)==A) expect_true(pmin(A)==A) + expect_true(pmax(A,A) == A) + expect_true(pmin(A,A) == A) + } # checker1() closes checker2 <- function(A,B){ @@ -43,6 +46,10 @@ checker2 <- function(A,B){ expect_true(all(pmin(A,B)-A <= 0)) expect_true(all(pmin(A,B)-B <= 0)) + expect_true(length(A+B) <= length(A) + length(B)) + expect_true(length(A-B) <= length(A) + length(B)) + + } # checker2() closes checker3 <- function(A,B,C){ diff --git a/tests/testthat/test_aac.R b/tests/testthat/test_aac.R new file mode 100644 index 0000000..540507c --- /dev/null +++ b/tests/testthat/test_aac.R @@ -0,0 +1,265 @@ +test_that("Test suite aac.R",{ + x_c <- sparsetable( + i = matrix(letters[c( + 1,1,1,2,2,2,3,3,3,3,3, + 1,2,3,1,2,3,1,1,2,2,3, + 3,2,3,2,3,2,1,3,1,2,3 + )], ncol=3,dimnames=list(NULL,c("Jan","Feb","Mar"))), + v = c(8,6,16,21,15,9,11,3,15,3,13)) + + x <- x_c + expect_true(is.sparsetable(x)) + expect_true(is.sparsetable(x[x>20])) + expect_true(x == x) + expect_false(x != x) + expect_true(x+x == 2*x) + expect_true(x+x == x*2) + expect_true(x+x == sparsetable(index(x),values(x)*2)) + expect_true(is.empty(x-x)) + expect_true(all(dim(as.array(x-x))==0)) + expect_false(x == x*0) + expect_false(x == 0*x) + expect_true(x != 0*x) + + expect_error(names(x)) + expect_true(nterms(x) == 11) + expect_true(all(dim(x) == 3)) + expect_true(as.sparsetable(as.array(x)) == x) + expect_output(print(x)) + expect_output(print(x*0)) + expect_error(1/x) + expect_error(x*x) + expect_error(x^x) + expect_error(x^6) + expect_error(6^x) + + + + x['a','a','c'] <- 100 + expect_true(x == sparsetable( + matrix(letters[c(1,1,1,2,2,2,3,3,3,3,3,1,2,3,1,2,3,1,1,2,2,3,3,2,3,2,3,2,1,3,1,2,3)],11,3), + c(100,6,16,21,15,9,11,3,15,3,13)) + ) + + expect_false(sparsetable_equality( + M1 = matrix(letters[1:9],3,3), + d1 = 1:3, + M2 = index(x_c), + d2 = c(100,6,16,21,15,9,11,3,15,3,13)) + ) + + x['a','a','c'] <- 0 + expect_true( + x == sparsetable( + matrix(letters[c( + 1, 1, 2, 2, 2, 3, 3, 3, 3, 3, + 2, 3, 1, 2, 3, 1, 1, 2, 2, 3, + 2, 3, 2, 3, 2, 1, 3, 1, 2, 3 )],10,3), c( + 6,16,21,15, 9,11, 3,15, 3,13 )) + ) + + x <- x_c + expect_true(as.frab(x) == + frab(c(a_a_c = 8, a_b_b = 6, a_c_c = 16, b_a_b = 21, + b_b_c = 15, b_c_b = 9, c_a_a = 11, c_a_c = 3, + c_b_a = 15, c_b_b = 3, c_c_c = 13))) + + + options("print_2dsparsetables_as_matrices" = TRUE) + expect_output(print(sparsetable( + matrix(c("a", "a", "b", "b", "b", "c", "c", "d", "e", "e", + "e", "f", "A", "E", "B", "C", "D", "B", "D", "D", "A", "D", "E", + "C"), 12, 2, dimnames = list(NULL, c("foo", "bar"))), + 1:12) + )) + + options("print_2dsparsetables_as_matrices" = FALSE) + expect_output(print(sparsetable( + matrix(c("a", "a", "b", "b", "b", "c", "c", "d", "e", "e", + "e", "f", "A", "E", "B", "C", "D", "B", "D", "D", "A", "D", "E", + "C"), 12, 2, dimnames = list(NULL, c("foo", "bar"))), + 1:12) + )) + + + options("print_2dsparsetables_as_matrices" = NULL) + + I <- index(x) + colnames(I) <- NULL + expect_output(print(sparsetable(I,1:11))) + + expect_output(print(sparsetable(matrix(letters[1:12],4,3),1))) + expect_true(as.sparsetable(frab(c(a=1,b=2,c=3))) == sparsetable(cbind(letters[1:3]),1:3)) + + expect_true(as.sparsetable(table(data.frame(fish=letters[1:4],chips=LETTERS[4:1]))) == sparsetable(matrix(c("a","D","b","C","c","B","d","A"),byrow=TRUE,ncol=2))) + + expect_error(x*(1:3)) + expect_error((1:3)*x) + + expect_false(x == x*2) + + expect_false(x == rspar2()) + + x <- x_c + x['x','y','a'] <- 99 + expect_false(x == x_c) + expect_true(x['x','y','a'] == 99) + +# xpc = 'x plus zero' + xpz <-sparsetable( # the 10,10,10 == j,j,j entry is 77-77==0 + i=matrix(letters[c( + 1,1,1,2,2,2,3,3,10,3,3,3,10, + 1,2,3,1,2,3,1,1,10,2,2,3,10, + 3,2,3,2,3,2,1,3,10,1,2,3,10 + )], ncol=3, dimnames=list(NULL,c("Jan","Feb","Mar"))), + v=c(8,6,16,21,15,9,11,3,77,15,3,13,-77)) + + expect_true(x_c == xpz) + + + + jj <-sparsetable( + i=matrix(letters[c( + 1,1,1, + 1,2,3, + 3,9,3 + )], ncol=3, dimnames=list(NULL,c("Jan","Feb","Mar"))), + v=c(77,78,79)) + + xpz[] <- jj + expect_true(xpz == sparsetable( + i = matrix(c( + "a", "a", "a", "a", "b", "b", "b", "c", "c", "c", "c", "c", + "a", "b", "b", "c", "a", "b", "c", "a", "a", "b", "b", "c", + "c", "b", "i", "c", "b", "c", "b", "a", "c", "a", "b", "c" + ),12, 3), + v = c(77, 6, 78, 79, 21, 15, 9, 11, 3, 15, 3, 13)) + ) + + x <- x_c + x[index(x)[1:4,]] <- 0 + expect_false(x == x_c) + expect_false(x_c == x) + + x['c','a','c'] <- 334 + expect_false(x == x_c) + expect_false(x_c == x) + + xas <- asum(x_c,"Feb") + xas_correct <- sparsetable(matrix( + c("a", "a", "b", "b", "c", "c", "c", + "b", "c", "b", "c", "a", "b", "c" + ), 7,2, dimnames = list(NULL, c("Jan", "Mar"))), c(6, 24, 30, 15, 26, 3, 16)) + expect_true(xas == xas_correct) + + xas <- asum(x_c,c(FALSE,TRUE,FALSE)) + xas_correct <- sparsetable(matrix( + c("a", "a", "b", "b", "c", "c", "c", + "b", "c", "b", "c", "a", "b", "c" + ), 7,2, dimnames = list(NULL, c("Jan", "Mar"))), c(6, 24, 30, 15, 26, 3, 16)) + expect_true(xas == xas_correct) + + x <- x_c + expect_true(all(x[x > 9] > 9)) + expect_true(all(x[x < 9] < 9)) + expect_true(all(x[9 > x] < 9)) + expect_true(all(x[9 < x] > 9)) + expect_true(all(9 < x[x > 9])) + expect_true(all(9 > x[x < 9])) + expect_true(all(9 > x[9 > x])) + expect_true(all(9 < x[9 < x])) + + expect_true(all(x[x >= 9] >= 9)) + expect_true(all(x[x <= 9] <= 9)) + expect_true(all(x[9 >= x] <= 9)) + expect_true(all(x[9 <= x] >= 9)) + expect_true(all(9 <= x[x >= 9])) + expect_true(all(9 >= x[x <= 9])) + expect_true(all(9 >= x[9 >= x])) + expect_true(all(9 <= x[9 <= x])) + + expect_true(all(x[x == 9] == 9)) + expect_true(all(x[9 == x] == 9)) + expect_true(all(9 == x[x == 9])) + expect_true(all(9 == x[9 == x])) + + expect_true(all(x[which(x > 9)] > 9)) + expect_true(all(x[which(x < 9)] < 9)) + expect_true(all(x[which(9 > x)] < 9)) + expect_true(all(x[which(9 < x)] > 9)) + expect_true(all(9 < x[which(x > 9)])) + expect_true(all(9 > x[which(x < 9)])) + expect_true(all(9 > x[which(9 > x)])) + expect_true(all(9 < x[which(9 < x)])) + + expect_true(all(x[which(x >= 9)] >= 9)) + expect_true(all(x[which(x <= 9)] <= 9)) + expect_true(all(x[which(9 >= x)] <= 9)) + expect_true(all(x[which(9 <= x)] >= 9)) + expect_true(all(9 <= x[which(x >= 9)])) + expect_true(all(9 >= x[which(x <= 9)])) + expect_true(all(9 >= x[which(9 >= x)])) + expect_true(all(9 <= x[which(9 <= x)])) + + expect_true(all(x[which(x == 9)] == 9)) + expect_true(all(x[which(9 == x)] == 9)) + expect_true(all(9 == x[which(x == 9)])) + expect_true(all(9 == x[which(9 == x)])) + + + expect_true(all(x[index(x)[c(1,3,5,7),]] %in% c(8,11,15,16))) + + x <- x_c + expect_silent(jjx <- x[which(x>9)]) + expect_true(all(jjx > 9)) + + x <- x_c + expect_silent(jjx <- x[values(x) > 9]) + expect_true(all(jjx > 9)) + + + x <- x_c + expect_silent(x[which(x>9)] <- 0) + expect_true(all(x<=9)) + + x <- x_c + expect_silent(x[values(x) >= 9] <- 8) + expect_true(all(x <= 8)) + + x <- x_c + expect_error(x[which(x<8)] <- values(x)[x<8] + 333333) + + x <- x_c + expect_error(x[] <- seq_len(nterms(x))) + + x <- x_c + I <- matrix("a",3,3) + diag(I) <- "b" + S <- sparsetable(I,777) + expect_warning(x[S] <- 1:3) + expect_true(x == sparsetable(matrix(c( + "a","a","a","a","a","b","b","b","b","c","c","c","c","c", + "a","a","b","b","c","a","a","b","c","a","a","b","b","c", + "b","c","a","b","c","a","b","c","b","a","c","a","b","c" + ),14,3),c(1,8,2,6,16,3,21,15,9,11,3,15,3, 13))) + + + x <- x_c + expect_error(x[list('a','a','x')] <- 3333) + + expect_true(drop(sparsetable(matrix(letters[1:3],3),1:3)) == frab(c(a=1,b=2,c=3))) + + x <- x_c + expect_true(asum(x,"Feb") == asum_exclude_sparsetable(x,c("Jan","Mar"))) + + x <- x_c + jj <- c(TRUE,FALSE,TRUE) + expect_true(asum(x, jj) == asum_exclude_sparsetable(x,!jj)) + expect_true(asum(x,!jj) == asum_exclude_sparsetable(x, jj)) + + expect_true(asum(x,c(2 )) == asum_exclude_sparsetable(x,c(1,3))) + expect_true(asum(x,c(1,3)) == asum_exclude_sparsetable(x,c(2 ))) + +}) + + diff --git a/tests/testthat/test_aad.R b/tests/testthat/test_aad.R new file mode 100644 index 0000000..255f090 --- /dev/null +++ b/tests/testthat/test_aad.R @@ -0,0 +1,75 @@ +## Function checker1() has one argument, checker2() two, and +## checker3() has three. + +test_that("Test suite aad.R",{ + +checker1 <- function(A){ + expect_true(A == +A) + expect_true(A == -(-A)) + expect_true(A+A == 2*A) + expect_true(A+A == A*2) + + expect_true(is.empty(A-A)) + expect_true(A+A+A == 3*A) + expect_true(A+A+A == A*3) + + expect_true(A/2 + A/2 == A) + + expect_error(A&A) + + expect_true(pmax(A)==A) + expect_true(pmin(A)==A) + + expect_true(pmax(A,A) == A) + expect_true(pmin(A,A) == A) + + expect_true(sum(values(asum(A,1))) == sum(values(A))) + + + + +} # checker1() closes + +checker2 <- function(A,B){ + expect_true(A+B == B+A) + expect_true(2*A+B == A+A+B) + expect_true(A+2*B == B+B+A) + + expect_true(pmax(A,B) == pmax(B,A)) + expect_true(pmin(A,B) == pmin(B,A)) + + expect_true(all(pmax(A,B)-A >= 0)) + expect_true(all(pmax(A,B)-B >= 0)) + + expect_true(all(pmin(A,B)-A <= 0)) + expect_true(all(pmin(A,B)-B <= 0)) + +} # checker2() closes + +checker3 <- function(A,B,C){ + expect_true(A+(B+C) == (A+B)+C) # addition is associative; 1.2 + + expect_true(pmax(A,pmax(B,C)) == pmax(pmax(A,pmax(B,C)))) + expect_true(pmin(A,pmin(B,C)) == pmin(pmin(A,pmin(B,C)))) + + expect_true(pmax(A,B,C) == pmax(pmax(A,pmax(B,C)))) + expect_true(pmin(A,B,C) == pmin(pmin(A,pmin(B,C)))) + +} # checker3() closes + + +for(i in seq_len(10)){ + A <- rspar() + B <- rspar() + C <- rspar() + + checker1(A) + checker2(A,B) + checker3(A,B,C) + + checker1(rsparr()) +} + + + +} ) diff --git a/vignettes/frab.Rmd b/vignettes/frab.Rmd index 75a9a46..49e693f 100644 --- a/vignettes/frab.Rmd +++ b/vignettes/frab.Rmd @@ -1,5 +1,5 @@ --- -title: "An alternative implementation of named vectors: addition of `table`s with the `frab` package" +title: "Addition of `table` objects with the `frab` package" author: "Robin K. S. Hankin" output: html_vignette bibliography: frab.bib @@ -9,11 +9,11 @@ vignette: > %\VignetteEncoding{UTF-8} --- - ```{r setup, include=FALSE} knitr::opts_chunk$set(echo = TRUE) options(rmarkdown.html_vignette.check_title = FALSE) library("frab") +library("mvtnorm") set.seed(1) ``` @@ -24,8 +24,10 @@ set.seed(1) **TLDR**: Adding two objects of class `table` has a natural interpretation. However, in base R, adding two tables can give plausible but incorrect results. The `frab` package provides a -consistent and efficient way to add `table` objects. The underlying -mathematical structure is the Free Abelian group, hence "`frab`". +consistent and efficient way to add `table` objects, subject to +`disordR` discipline [@hankin2022_disordR]. The underlying +mathematical structure is the Free Abelian group, hence "`frab`". To +cite in publications, please use [@hankin2023_frab]. # Prologue: `table()` @@ -150,9 +152,48 @@ Above we see the `+` operator is defined between a `frab` and a `table`, coercing tables to `frab` objects to give consistent results. -Carefully regulated ambiguity +# Two dimensional tables + +The ideas above have a natural generalization to two-dimensional tables. + +```{r,label=twodeetables} +(x <- rspar2(9)) +(y <- rspar2(9)) +x+y +``` + +Above, note that the resulting sum is automatically resized to +accommodate both addends, and also that entries with nonzero values in +both `x` and `y` are correctly summed. + + +## Arbitrary-dimensioned tables +The one- and two- dimensional tables above have somewhat specialized +print methods and the general case with dimension $\geqslant 3$ uses +methods similar to those of the `spray` package. We can generate a +`sparsetable` object quite easily: +```{r label=threedeetables} +A <- matrix(0.95,3,3) +diag(A) <- 1 +x <- round(rmvnorm(300,mean=rep(10,3),sigma=A/7)) +x[] <- letters[x] +head(x) +(sx <- sparsetable(x)) +``` + +But we can add `sx` to other `sparsetable` objects: + +```{r label=showthreedeeadd} +(sz <- sparsetable(matrix(sample(letters[9:11],12,replace=TRUE),ncol=3),1001:1004)) +``` + +Then the usual semantics for addition operate: + +```{r label=usualsemantics} +sx + sz +``` ## References diff --git a/vignettes/frab.bib b/vignettes/frab.bib index ea181b1..f701fd9 100644 --- a/vignettes/frab.bib +++ b/vignettes/frab.bib @@ -7,12 +7,22 @@ @Manual{rcore2022 url = {https://www.R-project.org/}, } - @misc{hankin2022_disordR, - title = "Disordered vectors in {R}: introducing the disordR package", + title = "Disordered vectors in {R}: introducing the \texttt{disordR} package", author = "Robin K. S. Hankin", year = "2022", publisher = "arXiv", doi = "10.48550/ARXIV.2210.03856", url = "https://arxiv.org/abs/2210.03856" } + +@misc{hankin2023_frab, + title = "The free Abelian group in {R}: the \texttt{frab} package", + author = "Robin K. S. Hankin", + year = "2023", + publisher = "arXiv", + doi = "10.48550/ARXIV.2307.13184", + url = "https://arxiv.org/abs/2307.13184" +} + +