-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
first go at making a FacileInterface #10
Conversation
More or less "Yes", except:
Related to (1) should we use styler with a proper style guide (for this codebase, something close to the tidyverse style guide would be first preference, although it is violated already in original codebase in a few places that will be hard to change (ie. the |
Just to reiterate my feeling for a need to put ... in this way, maybe it makes sense to think of the FacileInterface/API being more like a Java |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. My only concern (due to lack of experience using it) is that it might be possible to end up trying to use some functions on an object which does not formally inherit from FacileDataSet
. I cannot think of any specific examples, though.
stopifnot(is.FacileDataSet(x))
are redundant if replying on dispatching, no? Or are we going to be defensive about not using explicit methods e.g. f.FacileDataSet(notAFacileDataSet)
?
@@ -10,6 +10,6 @@ | |||
#' @param x A \code{FacileDataSet} | |||
#' @param ... NSE claused to use in \code{\link[dplyr]{filter}} expressions | |||
#' @family API | |||
filter_features <- function(x, ...) { | |||
filter_features.FacileDataSet <- function(x, ...) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The only consideration I'd make here is whether or not x
will guaranteeably be a FacileDataSet
any time filter_features
is required. Are there any situations a user could get into where this won't dispatch?
@@ -24,7 +24,7 @@ | |||
#' data to be \code{\link[dplyr]{collect}}ed when \code{db} is provided, | |||
#' otherwise a \code{tbl_df} of the results. | |||
#' @family API | |||
fetch_assay_data <- function(x, features, samples=NULL, | |||
fetch_assay_data.FacileDataSet <- function(x, features, samples=NULL, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
stopifnot(is.FacileDataSet(x))
becomes redundant.
@@ -28,7 +28,7 @@ | |||
#' filter(variable == "subtype_crc_cms", value %in% c("CMS3", "CMS4")) %>% | |||
#' collect | |||
#' setequal(crc.34$sample_id, eav.query$sample_id) | |||
filter_samples <- function(x, ..., with_covariates=FALSE) { | |||
filter_samples.FacileDataSet <- function(x, ..., with_covariates=FALSE) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is guaranteed by dispatch, unless someone is calling the method directly (which I'm guessing they shouldn't).
R/facile_api_generics.R
Outdated
stop("The FacileAPI requires that a specific method be written for this type.") | ||
} | ||
|
||
#' Retrieves grouping table for samples within a FacileDataSet. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this documentation be in the specific method?
@@ -42,7 +42,7 @@ subtype_map <- function(x) { | |||
#' @param ... the NSE boolean filter criteria | |||
#' @return a facile sample descriptor | |||
#' @family API | |||
fetch_samples <- function(x, samples=NULL, assay="rnaseq", ...) { | |||
fetch_samples.FacileDataSet <- function(x, samples=NULL, assay="rnaseq", ...) { | |||
stopifnot(is.FacileDataSet(x)) | |||
dots <- lazy_dots(...) | |||
if (length(dots)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FYI, as of R 3.5.0
you can use ...length()
for the number of arguments in ...
.
@jonocarroll you say:
I think there's a bit of confusion here, but I also think this confusion speaks a bit to my point 3 in this comment and my follow-up comment/clarification Which is to say that I think you're saying that "sometimes you want to know if the object you have at least advertises itself as supporting the 'Facile API'" As a point of clarification, though: nothing will likely inherit from
Imagine a I expect having This is also why I slightly prefer thinking of the "FacileInterface" more like a Java-"abstract class", and calling "FacileInterface" something like "FacileDataStore" because |
The confusion may be mine, but I was thinking of a scenario more along the lines of some function calling e.g. I suggested I have no problem with having an additional class |
I'll start by addressing your last TL;DR question:
There is no "unique contact surface". If "I" see a To this end, a If you're asking me "well what's the use of that?" Then I can only say that "interfaces" and/or "abstract classes" are just a useful tool to have in any OO language, and R doesn't officially have it is all. There is, of course, a big drawback in injecting some API or abstract class into the class hiearachy of "Facile data stores", which is that there are no runtime guarantees that the object, in fact, does implement all functions, but we've got our hands tied here ... Back to your first paragraph Could you give a pseudo-code snippet that mocks up the scenario that provides an example of the scenario you are thinking of? To clarify part of a comment I made earlier, raw" dplyr verbs (like filter.facileframe <- function(.data, ...) {
orig.classes <- class(.data)
.fds <- fds(.data)
class(.data) <- setdiff(orig.classes, "facileframe")
res <- filter(.data, ...)
class(res) <- orig.classes
set_fds(res, .fds)
} (... how's that for a painful improv-implementation ...) I pretty strongly believe that we don't need S4 or R6, so if you could provide some pseudo-code that outlines your scenario here, maybe we can figure this out a bit better? To your last paragraph (FacileInterface hierarchy ...) We are simply trying to enable multiple data storage implementations to be used as "drop in replacements" for functions in the "facileverse". A scenario would be something like this, let's take: a. the b. Let's also take that same recount2 Further imagine that both have "sample covariates" named We'd like these function calls to both work, and (more or less) return equivalent data: fds.samples <- filter_samples(tcga.fds, sex == "f")
se.samples <- filter_samples(tcga.se, sex == "f") I'm pretty sure you get that though, what's confusing me is when you say:
Sorry for being pedantic, but we're not extending an API, but rather just implementing the same "Facile API" for different data storage containers, and injecting an API-like (or "abstract class"-like) "thing" in the class hierarchy simply tells "whoever is asking" that this "thing" (a FacileDataSet, or a FacileSummarizedExperiment, or a FacileMultiAssayExperiment, or a FacileDGEList) does, in fact, support the facile api. ... of course the program can realize that some object "FacileSomething" doesn't support the API when it errors-out because "FacileSomething" doesn't have a Does that help at all? If not, I think maybe we're just on the wrong frequency and will need another way to chat about this one ;-) EDIT: actually, another example that might help motivate this is to think of some of the higher level functions that combine results from lower level ones. We have some functions like "assays_over_samples" or something (sorry, already closed my laptop for the night :-). As long as some of the lower level facile api functions are defined and return the results in the correct data.frame format, this function can be sure to work on anything that is a "FacileDataStore" ... does that help, hopefully? |
Okay, I think I get the motivation behind adding In your TCGA example I would expect a My 'extends' comments were in the train of thought of In our case we have a single method and a disabled generic, so I wanted to be careful about what we would be falling back to if the dispatch was not correct (if, through error or mishandling by the user, I suspect a lot more of my misunderstandings can be solved by me buying you a few beers, so I'll leave it there. |
Glad you guys are excited about this!
I'm happy with making FacileDataStore the parent class of any of our FacileXXX classes. But, I think we can do SummarizedExperiment and such without decorator classes. The only code that should care what type of object we have are the methods in FacileInterface. And, they already know what class you are because they are methods and you got into them via dispatch. Let's see where that gets us. If we have to do FacileSummarizedExperiment, so be it. |
Oh, and by AbstractFacileXXX I just meant AbstractFacileDataStore or AbstractFacileBackend or something. They can be only one. |
Also, did anyone get my "Fetch" joke in the comments? |
If I may, I will join the fun. @phaverty, I also really like that you got this conversation going on a concrete example of an "R-Interface" for current and future FacileDataStores. As a side note, because of the is(x, "FacileDataStore"), highlighted by @lianos , I would also vote for the "interface" to be called "FacileDataStore" as opposed to FacileInterface. In terms of the API methods you illustrate (organism, samples, fetch_XXX, ...), I am not sure to understand why the fallback Facile API functions are on ***.default. |
I'm happy to have the class for FacileDataSet be The point of the default methods is that there is no default. You have to write a specific method for each class. With the inheritance above, you could fall back to the method on FacileDataStore, but these methods will probably all depend on the specific backend implementation. Where ever we can write generic code, we can put that in the default, or in the FacileDataStore, as low as possible. For example, if we can define fetch_x in terms of only fetch_y and fetch_z, then we win. I really like that pattern and I use it in gneDB and in my julia stuff. |
@VRouilly: thank you for chiming in, I was going to poke you via a CC when I got to work this morning. @phaverty: Pretty sure we're 99.83% driving towards the same destination:
If you feel strongly about sticking the So, if we agree to inject Just in the way that R works, I can't figure out how having both I totally get that using I think I was thinking along the same lines as what @VRouilly mentioned re: " fetch_assay_data.FacileDataStore <- function(x, ...) {
stop("This function needs to be implemented in a storage-specific manner")
} However, I think @phaverty's motivation for rather stoping on the Related to this, if we did go the "decorator" route, then @phaverty if you want to take a stab at going the "pure" API route w/o the decorator (not now, I know, just talking about sometime in the future), then I'm all for it. In conclusionAlthough there's a lot of discussion here, like I said before I actually think we're actually all driving towards the same vicinity given the "concluding" remarks from @phaverty and @VRouilly, so I "approve" of this PR in the general sense, because I think small technical details can be iterated over. @jonocarroll you're right in that we only have one implementation of a I think @phaverty is excited about leveraging TileDB for instance(?) And we also had spoke briefly somewhere on github about trying to make a |
FacileInterface is just a label for the set of functions. It will be seen in the Roxygen "family" tags and we can use it when we talk about these functions. It will not appear in the code. |
Yes, I want a few more backends. I'm interested in TileDB. We are re-doing the whole EP backend and I want facile to talk to that. There will be many ... |
I'm hopeful that we can pull this off without making everything inherit from AbstractFacileDataStore. At this point, my primary need is for a list of all the methods that make up the API so I can understand the landscape and think about building abstractions between the layers of the app. |
As we, @phaverty @lianos @jonocarroll , all seem to be a similar mindset, how about getting practical to feed our discussions ? In terms of backends of immediate interest, do we agree with this list:
Is it the right time to do this ? ps: could you add me as a reviewer so I can get notifications, thx. |
@VRouilly I agree with you that having each of us try and hammer out facile versions of different data stores will help to flesh out the API and its corner cases, but I don't think that right now is the right time to do really do it. I think laying down the groundwork as we've done here to just enable to do that in the future w/o having to change "much" in the core codebase is sufficient, for now. Again: this is just my opinion given time we all have to contribute directly to this effort and certain deadlines we (I, at least) would love to meet (with this and other projects). |
TileDB may be part of ExpressionPlot or I may swap it in for HDF5 in
FacileDataSet2. For ExpressionPlot, we'll just use their API to ask for a
SummarizedExperiment for a given project. I'll probably try that out so I
can see if this Interface provides an abstraction that is sufficient to
separate FacileExplorer from FacileDataSet.
Pete
…____________________
Peter M. Haverty, Ph.D.
Genentech, Inc.
phaverty@gene.com
On Tue, Aug 21, 2018 at 11:31 AM, Steve Lianoglou ***@***.***> wrote:
@VRouilly <https://github.com/VRouilly> I agree with you that having each
of us try and hammer out facile versions of different data stores will help
to flesh out the API and its corner cases, but I don't think that right now
is the right time to do really do it.
I think laying down the groundwork as we've done here to just enable to do
that in the future w/o having to change "much" in the core codebase is
sufficient, for now.
Again: this is just my opinion given time we all have to contribute
directly to this effort and certain deadlines we (I, at least) would love
to meet (with this and other projects).
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#10 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AH02K34cVWCfKmrXBCVKKD2NWKnBZ7yAks5uTFIVgaJpZM4WFFgH>
.
|
Thusly?