%#in% family #27
I see I am bad at explaining and so the purpose of
So, a few issues here:
|
@moodymudskipper pinging for your opinion. |
I was thinking about 2b this morning. I think we do; I thought I had written those, actually. I remember now our talks about the Would this work as follows?
library(inops)
#>
#> Attaching package: 'inops'
#> The following object is masked from 'package:base':
#>
#> <<-
# Logical mask: TRUE where the element's frequency in x falls within `range`
`%#in[]%` <- function(x, range){
  if(is.data.frame(x))
    set <- names(table(as.matrix(x)) %[in[]% range)
  else
    set <- names(table(x) %[in[]% range)
  x %in{}% set
}
# Subset: the elements of x whose frequency falls within `range`
`%#[in[]%` <- function(x, range){
  if(is.data.frame(x))
    set <- names(table(as.matrix(x)) %[in[]% range)
  else
    set <- names(table(x) %[in[]% range)
  x %[in{}% set
}
# Replacement: overwrite the elements whose frequency falls within `range`
`%#in[]%<-` <- function(x, range, value){
  if(is.data.frame(x))
    set <- names(table(as.matrix(x)) %[in[]% range)
  else
    set <- names(table(x) %[in[]% range)
  x %in{}% set <- value
  x
}
x <- c(1,1:5,5,5)
x %#in[]% c(2,3)
#> [1] TRUE TRUE FALSE FALSE FALSE TRUE TRUE TRUE
x %#[in[]% c(2,3)
#> [1] 1 1 5 5 5
x %#in[]% c(2,3) <- NA
x
#> [1] NA NA 2 3 4 NA NA NA
y <- data.frame(a = 1:4,b = c(1, 5, 5, 5))
y %#in[]% c(2,3)
#> a b
#> [1,] TRUE TRUE
#> [2,] FALSE TRUE
#> [3,] FALSE TRUE
#> [4,] FALSE TRUE
y %#[in[]% c(2,3)
#> [1] 1 1 5 5 5
y %#in[]% c(2,3) <- NA
y
#> a b
#> 1 NA NA
#> 2 2 NA
#> 3 3 NA
#> 4 4 NA
z <- factor(c("a",letters[1:5],"e", "e"))
z %#in[]% c(2,3)
#> [1] TRUE TRUE FALSE FALSE FALSE TRUE TRUE TRUE
z %#[in[]% c(2,3)
#> [1] a a e e e
#> Levels: a b c d e
z %#in[]% c(2,3) <- NA
z
#> [1] <NA> <NA> b c d <NA> <NA> <NA>
#> Levels: a b c d e
Created on 2019-11-04 by the reprex package (v0.3.0)
I changed the df to a matrix as this is what In the case of In that case, variants A few remarks :
It seems to be quite a useful functionality to you and I'm ok to incorporate it if you ponder these points and think it is worth it. |
Hmmm, a few ideas:
So maybe we can get away with using
|
@KKPMW I updated my post above while you were replying |
I believe the |
You mean `%in#%` might work and be less confusing? If so, I agree :) Also, in my particular case, I often work with biological data that has "technical replicates". I then have to select only the samples that have an exact number (like 3) of technical replicates, and analyse them separately. So |
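The technical-replicate use case described above can be sketched in base R, without the proposed operators. The data frame, its column names (`sample_id`, `value`) and the values are hypothetical and only serve as an illustration.

```r
# Hypothetical data: sample IDs with 3, 2 and 3 technical replicates.
samples <- data.frame(
  sample_id = c("s1", "s1", "s1", "s2", "s2", "s3", "s3", "s3"),
  value     = c(0.9, 1.1, 1.0, 2.0, 2.2, 3.1, 2.9, 3.0),
  stringsAsFactors = FALSE
)

# For every row, count how many rows share its sample_id.
n_rep <- ave(seq_len(nrow(samples)), samples$sample_id, FUN = length)

# Keep only the samples that have exactly 3 technical replicates.
complete <- samples[n_rep == 3, ]
unique(complete$sample_id)
```

A count-based operator would presumably reduce the last two steps to a single expression on `samples$sample_id`.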
Oh, maybe you rather meant:
library(inops)
#>
#> Attaching package: 'inops'
#> The following object is masked from 'package:base':
#>
#> <<-
# Logical mask: TRUE where the element's frequency in x is one of `counts`
`%in#%` <- function(x, counts){
  if(is.data.frame(x)){
    tb <- table(as.matrix(x))
  } else{
    tb <- table(x)
  }
  set <- names(tb[tb %in% counts])
  x %in{}% set
}
# Subset: the elements of x whose frequency is one of `counts`
`%[in#%` <- function(x, counts){
  if(is.data.frame(x)){
    tb <- table(as.matrix(x))
  } else{
    tb <- table(x)
  }
  set <- names(tb[tb %in% counts])
  x %[in{}% set
}
# Replacement: overwrite the elements whose frequency is one of `counts`
`%in#%<-` <- function(x, counts, value){
  if(is.data.frame(x)){
    tb <- table(as.matrix(x))
  } else{
    tb <- table(x)
  }
  set <- names(tb[tb %in% counts])
  x %in{}% set <- value
  x
}
x <- c(1,1:5,5,5)
x %in#% 2
#> [1] TRUE TRUE FALSE FALSE FALSE FALSE FALSE FALSE
x %[in#% 2
#> [1] 1 1
x %in#% 2 <- NA
x
#> [1] NA NA 2 3 4 5 5 5
x2 <- c(1,1:5,5,5)
x2 %in#% 2:3
#> [1] TRUE TRUE FALSE FALSE FALSE TRUE TRUE TRUE
x2 %[in#% 2:3
#> [1] 1 1 5 5 5
x2 %in#% 2:3 <- NA
x2
#> [1] NA NA 2 3 4 NA NA NA
y <- data.frame(a = 1:4,b = c(1, 5, 5, 5))
y %in#% 2
#> a b
#> [1,] TRUE TRUE
#> [2,] FALSE FALSE
#> [3,] FALSE FALSE
#> [4,] FALSE FALSE
y %[in#% 2
#> [1] 1 1
y %in#% 2 <- NA
y
#> a b
#> 1 NA NA
#> 2 2 5
#> 3 3 5
#> 4 4 5
z <- factor(c("a",letters[1:5],"e", "e"))
z %in#% 2
#> [1] TRUE TRUE FALSE FALSE FALSE FALSE FALSE FALSE
z %[in#% 2
#> [1] a a
#> Levels: a b c d e
z %in#% 2 <- NA
z
#> [1] <NA> <NA> b c d e e e
#> Levels: a b c d e
Created on 2019-11-04 by the reprex package (v0.3.0) |
Yup! I had in mind this last one, as it covers both points:
But I do wonder if we could ever have some kind of rounding issues doing %in% on numeric. |
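On the rounding question, a small base R sketch may help. The counts returned by `table()` are stored as integers, so the comparison is exact whenever `counts` holds whole numbers. A mismatch could only come from a `counts` value that was itself computed in floating point. The helper `counts_in()` is a hypothetical stand-in for the lookup step, not part of inops.

```r
x  <- c(1, 1:5, 5, 5)
tb <- table(x)

# The frequencies are stored as integers, not doubles.
typeof(tb) == "integer"

# Hypothetical helper mirroring the lookup step: values whose count is in `counts`.
counts_in <- function(tb, counts) names(tb)[tb %in% counts]

counts_in(tb, 2)    # whole-number double: matches the integer count
counts_in(tb, 2:3)  # integer vector: matches as well

# A count computed in floating point may not be exactly 2 ...
computed <- sqrt(2)^2
computed == 2                    # FALSE
counts_in(tb, computed)          # so nothing matches
counts_in(tb, round(computed))   # rounding first restores the match
```

So the risk seems limited to user-supplied `counts` that result from arithmetic, in which case rounding them (or warning about non-integer values) would be one possible safeguard.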
Good, I'm totally fine with this one: it doesn't add much complexity to the package, and I think it's easy enough to understand. As for edge cases and optimization (table() / ave() / tabulate(), how to deal with factors...), we need experiments and unit tests. But agreeing on the concept and naming is the main part. |
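As a starting point for those experiments, here is a minimal base R sketch of the three candidate ways to get a per-element count. It only compares results on the small vector used in the examples above; it makes no claim about which is fastest or how each behaves with factors or NA.

```r
x <- c(1, 1:5, 5, 5)

# 1. table(): count each distinct value, then look the counts up by name.
#    Assumes as.character(x) reproduces the table's names for these values.
n_table <- as.vector(table(x)[as.character(x)])

# 2. ave(): group the positions by value and take the group size.
n_ave <- ave(seq_along(x), x, FUN = length)

# 3. tabulate(): map values to integer codes, count the codes, map back.
idx        <- match(x, unique(x))
n_tabulate <- tabulate(idx)[idx]

# All three give 2 2 1 1 1 3 3 3 for this input.
n_table
n_ave
n_tabulate
```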
Agree, both with the message and with the discussed naming convention. Should I add your proposed functions to the codebase as a first iteration? |
Yes, but please think about the desired behavior with factors and share your thoughts |
Hmm, my thought as always: it should be as consistent as possible with all the other operations. For now, I think we do not allow assigning new levels to factors. I am not sure yet if this is a good or a bad idea. An argument can be made that we are preventing careless users from making mistakes, while at the same time forcing more sophisticated users to do several additional steps (like adding new levels). Maybe for now let's leave the current behaviour we have for factors and not worry about it. Adding a level once should not be a big deal. Anyone using factors will probably want to control the levels themselves anyway. |
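For reference, this is how base R itself treats assignment of an unknown level, which is the behaviour the "additional steps" remark refers to. The sketch uses plain `[<-` on a factor, not the inops replacement operators, and the level name "new" is arbitrary.

```r
z <- factor(c("a", "b", "a"))

# Assigning a value that is not an existing level yields NA (with a warning).
z2 <- z
suppressWarnings(z2[z2 == "a"] <- "new")
z2          # the two "a" entries are now <NA>
levels(z2)  # levels are unchanged: "a" "b"

# The extra step: add the level first, then assign.
z3 <- z
levels(z3) <- c(levels(z3), "new")
z3[z3 == "a"] <- "new"
as.character(z3)
```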
Also I think we should add |
First iteration in #30 |
The problem with using Which reminds me that we didn't test n>2 dimensional arrays, but I think our code so far should handle them properly. |
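A quick base R check for the n>2 case could look like the sketch below. It does not call the inops operators. It only shows that `table()` counts over all cells of a 3-dimensional array, and that base `%in%` drops the dimensions, so they are restored by hand here. The array contents are arbitrary.

```r
# Arbitrary 2 x 2 x 2 array holding the same values as the vector examples.
arr <- array(c(1, 1, 2, 3, 4, 5, 5, 5), dim = c(2, 2, 2))

# table() counts over every cell, whatever the number of dimensions.
tb  <- table(arr)
set <- names(tb)[tb %in% 2:3]   # values occurring 2 or 3 times

# Base %in% returns a plain vector, so the dim attribute is reattached.
mask <- array(arr %in% set, dim = dim(arr))
dim(mask)
sum(mask)
```

A unit test along these lines, run against the actual operators, would confirm whether the dimensions survive.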
You are correct. I missed that. Tried to fix it in #31 |
It should be quick enough to implement, and should be available in all set/range/regex variants.
It should be as simple as :
where I leave the na.rm to your judgment, as I'm not sure I'll use those much. @KKPMW