SIG: new data structures for Bioconductor #8
Comments
From @vjcitn on October 22, 2017 4:45 Levi, I'll try to function as a scribe.
|
From @bhaibeka on October 23, 2017 11:41 I strongly support this initiative of course. Many of these datasets are now available (see picture) and although PharmacoGx::PharmacoSet objects do their job, they do not deal efficiently with data access and storage. @p-smirnov has deep experience with these pharmacogenomics datasets and would be interested in contributing. |
From @vjcitn on October 23, 2017 13:4 Hi Ben -- where is that image from? Public domain? I am working on a
|
From @bhaibeka on October 24, 2017 20:5 I drew the picture from scratch, feel free to reuse. For more, you can borrow any slides from here: https://www.pmgenomics.ca/bhklab/research/presentations |
From @p-smirnov on November 7, 2017 1:47 I would like to attend, just waiting for confirmation from the conference about registration. It would be great for |
From @lgatto on November 7, 2017 9:18 @p-smirnov haven't you received your invitation email yet? |
From @p-smirnov on November 7, 2017 13:5 @lgatto I searched through my email and found it from last Friday. It was sorted out of my inbox so I missed seeing it. |
Great you can come @p-smirnov, I'm really looking forward to it! |
Initial agenda. Understood now from Laurent's comment below that we have four hours, 1-5pm. So here is a tentative schedule - I've scheduled more time for the pharmacogenomics component only because I know the measurable outcome I hope will come from it, but I certainly don't mind rebalancing if the VariantExperiment discussion needs more time.
Outcomes:
|
From @lgatto on November 28, 2017 18:25
Yes, it's meant to run from 1 pm to 5 pm. We will be serving coffee at 3 pm, but people are free to grab a cup and continue as they see fit. |
From @lawremi on November 28, 2017 18:45 I feel bad that I can't make it to this SIG. I guess it's not feasible for me to attend remotely? Looking forward to the minutes. |
@lawremi if you're willing to attend any of it between 1-5pm UK time (5-9am west coast time?), we'd certainly appreciate your presence. |
From @lawremi on November 29, 2017 3:50 Unfortunately I'll be in Australia and I think that's 12-4 AM so probably not. I'll at least be trying to sleep ;) |
A gist providing some dose-viability data to play with.
|
And some slides for pharmacogenomics and for on-disk data structures |
From @federicomarini on December 4, 2017 14:17 Here's the link for the benchmarking work by Mike Smith we touched upon: |
From @vjcitn on December 4, 2017 14:20 I had volunteered to be a scribe for this meeting. Very rudimentary notes: https://docs.google.com/document/d/15FWsVlQEGUTn5ys0GRL56ixOHzG04J7kq1IPMiRQyKM/edit?usp=sharing
|
@bhaibeka @p-smirnov @vjcitn want to continue this BOF at Bioc2018 in July? |
Levi, it is a good idea to try to continue this SIG. Thanks |
I have a prototype of a "long" format way of storing drug sensitivity data I would like some feedback on. @bhaibeka would you be able to attend? |
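For readers unfamiliar with the term, a "long" layout generally means one row per measurement rather than one cell per array position. The sketch below is only an illustration of that general idea in Python; it is not the prototype mentioned in this comment, and every name in it (`records`, `curves`, the cell line and drug labels, the viability values) is hypothetical.

```python
from collections import defaultdict

# Hypothetical long-format records: one row per viability measurement.
# Doses need not match across drugs or cell lines, which a dense array
# could only represent with padding.
records = [
    {"cell_line": "lineA", "drug": "drugX", "dose": 1.0, "viability": 0.50},
    {"cell_line": "lineA", "drug": "drugX", "dose": 0.1, "viability": 0.90},
    {"cell_line": "lineB", "drug": "drugX", "dose": 0.5, "viability": 0.70},
    {"cell_line": "lineA", "drug": "drugY", "dose": 2.0, "viability": 0.30},
]


def curves(rows):
    """Group rows into one dose-sorted curve per (cell_line, drug) pair."""
    grouped = defaultdict(list)
    for row in rows:
        key = (row["cell_line"], row["drug"])
        grouped[key].append((row["dose"], row["viability"]))
    # Sort each curve by dose so downstream code can assume increasing doses.
    return {key: sorted(points) for key, points in grouped.items()}


by_pair = curves(records)
print(by_pair[("lineA", "drugX")])
```

The trade-off in a layout like this is that ragged designs are stored without padding, at the cost of repeating the identifying columns on every row.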
You're on the agenda @p-smirnov . Anyone else, just let me know, either in advance or during the session... |
Here are the (more or less) slides I presented: https://www.slideshare.net/LeviWaldron/why-reuse-core-classes And the code I used to demo exploring the inheritance and methods of some classes:
|
And the repo @p-smirnov posted with some code to define a demo object: https://github.com/bhklab/longArray |
From @lwaldron on October 22, 2017 4:26
This SIG will discuss recent and needed Bioconductor data classes. Some recent or in-testing data classes to discuss are:
MultiAssayExperiment (for "gluing" different types of assays together)
RaggedExperiment (for copy number, mutations, or other data represented by different genomic ranges for each sample)
restfulSE::RESTfulSummarizedExperiment, restfulSE::BQSummarizedExperiment for remote storage + local interactive analysis of very large datasets
One presently identified need is a Bioconductor class for representing the drug sensitivity data from pharmacogenomics studies such as the Cancer Cell Line Encyclopedia (CCLE) and NCI-60. These studies perform standard -omics assays, but also dose-response experiments where cell lines are subjected to varying doses of each of numerous compounds. Responses are measured as cell viability, and the resulting dose-response curves are summarized using measures such as LC-50. The full dose-response data are a 3-D array (dose x time x cell line), which should be stored in addition to summary measure matrices (e.g. LC-50 concentration x cell line). The PharmacoGx Bioconductor package from the @bhaibeka lab provides numerous curated pharmacogenomics datasets as rich PharmacoSet objects, but these lack the flexibility and novel data storage models that would be available using a SummarizedExperiment-derived object for sensitivity data contained along with -omics assays within a MultiAssayExperiment. Therefore a desired outcome from this SIG is a draft class definition for cell line drug sensitivity data extending from SummarizedExperiment. This would yield both a needed new data class and, for those participating, experience in extending existing core data structures to novel data types.
Topic leader: Levi Waldron @lwaldron
Scribe: Vincent Carey @vjcitn (Vince can I volunteer you?)
Any interested participants are invited to use this issue to ask questions, suggest other relevant topics for discussion, and/or express their interest in participating.
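To make the data shapes described above concrete, here is a minimal Python sketch of a dose x time x cell line array for a single compound, reduced to a time x cell line summary matrix. It is an illustration only, not a proposed class definition: the numbers, names, and units are made up, and the summary used here (the dose at which viability first crosses 50%, interpolated on a log10 dose scale) is an assumed stand-in for a measure such as LC-50, whose exact computation the description does not specify.

```python
import math

# Hypothetical toy data for ONE compound. All values are invented.
doses = [0.01, 0.1, 1.0, 10.0]     # assumed concentrations, increasing
times = [24, 72]                   # assumed exposure times in hours
cell_lines = ["lineA", "lineB"]    # made-up cell line names

# viability[d][t][c]: fraction of control, indexed dose x time x cell line.
viability = [
    [[0.98, 0.95], [0.95, 0.90]],  # dose 0.01
    [[0.90, 0.80], [0.80, 0.60]],  # dose 0.1
    [[0.70, 0.45], [0.40, 0.30]],  # dose 1.0
    [[0.40, 0.20], [0.10, 0.05]],  # dose 10.0
]


def crossing_dose(doses, responses, threshold=0.5):
    """Dose where the response first drops below `threshold`.

    Interpolates linearly on a log10 dose scale between the two
    bracketing doses; returns None if the threshold is never crossed.
    """
    for i in range(1, len(doses)):
        above, below = responses[i - 1], responses[i]
        if above >= threshold > below:
            frac = (above - threshold) / (above - below)
            lo, hi = math.log10(doses[i - 1]), math.log10(doses[i])
            return 10 ** (lo + frac * (hi - lo))
    return None


# summary[t][c]: one value per (time, cell line), collapsing the dose axis.
summary = [
    [
        crossing_dose(doses, [viability[d][t][c] for d in range(len(doses))])
        for c in range(len(cell_lines))
    ]
    for t in range(len(times))
]

for t, hours in enumerate(times):
    for c, name in enumerate(cell_lines):
        print(hours, name, summary[t][c])
```

Keeping both the full array and the derived matrix, as the description suggests, means the summary can be recomputed with a different measure without losing the raw curves.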
Copied from original issue: Bioconductor/EuroBioc2017#5