Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add $,pSet method to easily access columns in the phenoData #203

Merged
merged 1 commit into from
Apr 29, 2017

Conversation

jorainer
Copy link
Collaborator

@jorainer jorainer commented Apr 7, 2017

  • Add $, $&lt;- and pData<- methods to easily get and replace/add columns to the
    phenodata.
  • Add related unit tests and add documentation.

Basically, It facilitates accessing pheno data columns by just calling pset$pheno_col. I find that pretty convenient e.g. in ExpressionSet objects and alike, so I thought it might be nice to have it in MSnbase as well.
Also I added replace methods for the pData.

- Add $, $<- and pData<- methods to easily get and replace/add columns to the
  phenodata.
- Add related unit tests and add documentation.
@codecov-io
Copy link

Codecov Report

Merging #203 into master will increase coverage by 0.01%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #203      +/-   ##
==========================================
+ Coverage   71.72%   71.74%   +0.01%     
==========================================
  Files          67       67              
  Lines        7165     7170       +5     
==========================================
+ Hits         5139     5144       +5     
  Misses       2026     2026
Impacted Files Coverage Δ
R/methods-pSet.R 81.19% <100%> (+0.44%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update d67ab4d...77a8bd6. Read the comment docs.

@jorainer jorainer requested a review from lgatto April 7, 2017 12:43
@lgatto
Copy link
Owner

lgatto commented Apr 9, 2017

Thanks @jotsetung, I think this will prove very useful. I'll wait for the new release to merge.

On a similar note, I also thought it would be useful to be able to change the behaviour of $ through a MSnbase option, where it could either link to pData (default) or fData. The latter is much more useful in our spatial proteomics work in pRoloc, where we so often need to access feature variables in our MSnSets.

@jorainer
Copy link
Collaborator Author

Don't you think that might become a little confusing if $ is used to access both rows or columns? What about $$ to access fData and $ for pData?

@lgatto
Copy link
Owner

lgatto commented Apr 10, 2017

Hmm, $$, that is an interesting suggestion. Or what about £ (no, that's a joke) :-). Ok, let's go for $$, as that will even bypass the need to setting an option. Whoever has time for this is welcome to have a go, but won't be merged before the new release.

@jorainer
Copy link
Collaborator Author

I checked the $$ option but that does not seem to be that straight forward. Didn't find where $ is defined to check how and if a function named $$ could be defined to have the same behavior (i.e. that it is possible to call it like x$$y instead of $$(x, y).
@lgatto do you know by chance how and where $ is defined?

@sgibb
Copy link
Collaborator

sgibb commented Apr 27, 2017

I am not sure that it is possible to use $$ as function. We can define other operators:

library("MSnbase")
data(msnset)

setGeneric("%fd%", function(object, name)standardGeneric("%fd%"))
setGeneric("%pd%", function(object, name)standardGeneric("%pd%"))

setMethod("%fd%", "MSnSet", function(object, name)fData(object)[, as.character(substitute(name))])
setMethod("%pd%", "MSnSet", function(object, name)pData(object)[, as.character(substitute(name))])

msnset %fd% ProteinAccession
#  [1] BSA     ECA1422 ECA4030 ECA3882 ECA1364 ECA0871
#  [7] ECA4512 ECA4513 ECA3969 ECA3082 ECA1032 ECA1294
# [13] ECA4514 ECA1104 ECA3356 ECA4037 ECA0621 ECA1093
# [19] ECA0452 ENO     ECA2391 ECA0435 ECA1362 ECA1363
# [25] ECA3566 ECA0435 ECA0691 ECA3566 ECA3377 ECA0978
# [31] ECA2831 ECA4514 ECA0469 ECA4514 ECA4514 ECA0172
# [37] ECA0631 ECA3349 ECA0469 ECA4514 ENO     ENO
# [43] ECA4026 ECA3929 ECA4514 ECA4026 ECA4013 BSA
# [49] BSA     ECA2186 ENO     ECA3349 ECA2421 ECA1443
# [55] ECA3175
# 40 Levels: BSA ECA0172 ECA0435 ECA0452 ... ENO

msnset %pd% mz
# [1] 114.1 115.1 116.1 117.1

(%fd<-%, %pd<-% would be possible, too)
Of course tab completion is not working. Maybe we can hack something via ?rcompgen. But I am not sure that is worth the effort: msnset %fd% ProteinAccession is IMHO no real improvement over fData(msnset)$ProteinAccession.

@sgibb
Copy link
Collaborator

sgibb commented Apr 27, 2017

`$$` <- function(x, y) { x[[y]] }
l <- list(a=1, b=2)
l$$b
# Error: unexpected '$' in "l$$"
$$(l, "a")
# Error: unexpected '$' in "$"

@jorainer
Copy link
Collaborator Author

Yes, that was what I realized too, defining $$ is tricky.

Regarding the %fd% etc operators, it would be possible, but might not be intuitive. What I mean is that many users that use e.g. ExpressionSet, SummarizedExperiment etc are familiar with the use of $ to access phenodata columns. That was also the first thing I missed in the pSet. I would not implement a completely new shortcut to do the same (i.e. the %pd%).

Eventually we just add the $ and use the old fashioned fData(msd) to access feature annotations?

@lgatto
Copy link
Owner

lgatto commented Apr 27, 2017

Have a look in Biobase to see how things have been implemented for $.

@jorainer
Copy link
Collaborator Author

That's exactly our implementation for the $ in MSnbase. The problem is that $$ does not work in a similar fashion (as @sgibb pointed out).

@lgatto
Copy link
Owner

lgatto commented Apr 27, 2017

Following up from @sgibb's example:

[1] 1
> `$$` <- function(x, y) { x[[y]] }
> l <- list(a=1, b=2)
> `$`(l, "a")
[1] 1
> `$$`(l, "a")
[1] 1

I think the problem we have is that R can't parse $$ in l$$a, while it knows how to handle $.

@lgatto
Copy link
Owner

lgatto commented Apr 27, 2017

An alternative would be to use @, but I always stumble on the wrong signature

> .Foo <- setClass("Foo", slots = c(a = "numeric"))
> foo <- .Foo(a = 1:10)
> names(foo@a) <- letters[1:10]
> `$`
.Primitive("$")
> getGeneric("$")
standardGeneric for "$" defined from package "base"
function (x, name) 
standardGeneric("$", .Primitive("$"))
<bytecode: 0x1355f40>
<environment: 0x1221b00>
Methods may be defined for arguments: x
Use  showMethods("$")  for currently available ones.
> setMethod("$", "Foo", function(x, name) x@a[name])
[1] "$"
> foo$a
a 
1 

but

> getGeneric("@")
NULL
> `@`
.Primitive("@")
> setMethod("@", "Foo", function(x, name) x@a[name])
Error in setGeneric(f, where = where) :@dispatches internally;  methods can be defined, but the generic function is implicit, and cannot be changed.
> setMethod("@", "Foo", function(object, name) object@a[name])
Error in setGeneric(f, where = where) :@dispatches internally;  methods can be defined, but the generic function is implicit, and cannot be changed.

@jorainer
Copy link
Collaborator Author

I think @ would also confuse people since it's the accessor for the slots of the class.

@lgatto
Copy link
Owner

lgatto commented Apr 27, 2017

Yes, but sometime the perfect is the enemy of good. I am unsure that $$ would ever work, hence may be better @ than nothing, ... My idea or using £ might become the least bad solution in the end!

@lgatto lgatto mentioned this pull request Apr 29, 2017
@lgatto
Copy link
Owner

lgatto commented Apr 29, 2017

I am going to merge this. Issue #209 follows up the discussion to directly access fData columns.

@lgatto lgatto merged commit 4bcd3c9 into master Apr 29, 2017
@lgatto lgatto deleted the dollarAccessor branch September 6, 2017 09:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants