Isolated immune cell reconstruction evaluation #5

jaclyn-taroni · 2018-04-11T12:58:45Z

This PR adds 04-isolated_immune_cell_reconstruction.

In this notebook, I evaluate the reconstruction of the sorted leukocyte microarray dataset introduced in 03-isolated_cell_type_populations (E-MTAB-2452) as compared to a sorted leukocyte RNA-seq dataset (SRP045500) that is included in recount2 and is therefore part of the training set for the PLIER model under consideration.

During #3, the idea of using only high-weight genes for reconstruction came up (see #4). I've chosen not to explore this at this time because my goal was/is to test how the subset of latent variables used for reconstruction affects the results (all vs. only pathway-associated vs. only those LVs that are not significantly associated with any gene sets -- I assume these capture variation from technical factors), rather than to improve the reconstruction performance. I think improving reconstruction performance would probably require a deeper dive than makes sense for this particular project. Please let me know what you think.

Here's the notebook HTML file for easy viewing:
04-isolated_immune_cell_reconstruction.nb.html.zip

I've made a few changes upstream of this notebook in 02-recount2_PLIER_exploration to save the recount2 reconstructed expression data and associated evaluation metrics, as well.

gwaybio

only minor comments

gwaybio · 2018-04-11T15:28:59Z

04-isolated_immune_cell_reconstruction.Rmd

+gs.file <- file.path("data", "expression_data", 
+                     "E-MTAB-2452_hugene11st_SCANfast_with_GeneSymbol.pcl")
+exprs.df <-readr::read_tsv(gs.file)
+exprs.mat <- as.matrix(exprs.df[, 3:ncol(exprs.df)])


Why start at column 3? Maybe add a comment about what the first 2 columns are.

gwaybio · 2018-04-11T15:34:15Z

04-isolated_immune_cell_reconstruction.Rmd

+                height = 11, width = 8.5)
+```
+
+### E-MTAB-2452 Boxplots


The x axis tick labels seem to bleed a bit into one another. Is it possible to rename them? For example, when there is an "n =" in the label, I tend to put it on a new line.
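One way to do the relabeling suggested above is sketched here in base R. The label format (`"<group>, n = <count>"`) and the names `tick.labels` and `wrapped.labels` are assumptions for illustration; the notebook's actual labels may differ.

```r
# hypothetical tick labels of the form "<group>, n = <count>"
tick.labels <- c("group A, n = 10", "group B, n = 25")

# move the "n = ..." part onto its own line
wrapped.labels <- sub(", n = ", "\nn = ", tick.labels, fixed = TRUE)

# the wrapped labels could then be supplied to a plot, e.g.
# ggplot2::scale_x_discrete(labels = wrapped.labels)
cat(wrapped.labels, sep = "\n\n")
```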

gwaybio · 2018-04-11T15:36:18Z

04-isolated_immune_cell_reconstruction.Rmd

+ggplot2::ggsave(plot.file, plot = ggplot2::last_plot())
+```
+
+## Summary


Are all datasets trained together? Or are different models trained on each individually? Or, are the models trained using a single dataset and the other datasets are transformed into this space?

Or, are the models trained using a single dataset and the other datasets are transformed into this space?

This one -- a single PLIER model is trained on the recount2 dataset, which includes SRP045500. E-MTAB-2452 is transformed into the recount2 PLIER space.

I'm working with LVs from this recount2 PLIER model exclusively, but in some cases I'm using only LVs that are significantly associated with a pathway or only those LVs that are not associated with a pathway.
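As a rough illustration of what "transformed into the recount2 PLIER space" and "using only some LVs" can look like, here is a minimal base-R sketch on toy data. It assumes a ridge-style projection of new expression data onto a fixed loading matrix; `z.toy`, `exprs.toy`, `lambda`, `ProjectToLV`, and `pathway.lvs` are hypothetical names and values, not the notebook's actual code.

```r
set.seed(123)

# toy loadings (genes x LVs), standing in for a trained model's loading matrix
n.genes <- 50
n.lvs <- 5
z.toy <- matrix(abs(rnorm(n.genes * n.lvs)), nrow = n.genes,
                dimnames = list(paste0("gene", 1:n.genes),
                                paste0("LV", 1:n.lvs)))

# toy expression data (genes x samples) for a dataset not used in training
exprs.toy <- matrix(rnorm(n.genes * 8), nrow = n.genes,
                    dimnames = list(rownames(z.toy), paste0("sample", 1:8)))

# hypothetical helper: ridge-style projection of new data onto fixed loadings
# (lambda is an assumed penalty, not a value from the PR)
ProjectToLV <- function(z, exprs, lambda = 1) {
  b <- solve(t(z) %*% z + lambda * diag(ncol(z))) %*% t(z) %*% exprs
  rownames(b) <- colnames(z)
  b
}

b.toy <- ProjectToLV(z.toy, exprs.toy)  # LVs x samples

# reconstruction from all LVs
recon.all <- z.toy %*% b.toy

# reconstruction from a subset of LVs (here, an assumed "pathway-associated" set)
pathway.lvs <- c("LV1", "LV3")
recon.pathway <- z.toy[, pathway.lvs, drop = FALSE] %*%
  b.toy[pathway.lvs, , drop = FALSE]
```

Both reconstructions have the same dimensions as the input expression matrix, so either can be compared against it sample by sample.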

small doc change, relabel boxplot x axis ticks

huqiwen0313

Looks good! Only some minor comments.

huqiwen0313 · 2018-04-11T19:30:09Z

04-isolated_immune_cell_reconstruction.Rmd

+                    dplyr::mutate(MASE = as.numeric(as.character(MASE)),
+                    `Spearman correlation` = 
+                      as.numeric(as.character(`Spearman correlation`)))
+


I don't quite understand why MASE needs the as.numeric(as.character(MASE)) conversion.

When binding together the columns, MASE and the Spearman correlation end up as factors.
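For readers unfamiliar with this behavior, the toy example below shows one way numeric columns can end up as factors after binding, and why the intermediate `as.character()` is needed. The data and names (`sample.names`, `mase`, `eval.df`) are made up for illustration, and `stringsAsFactors = TRUE` is set explicitly to mimic the default in R versions before 4.0.0.

```r
# binding a character column with a numeric column yields a character matrix
sample.names <- c("s1", "s2", "s3")
mase <- c(0.5, 1.25, 10)
bound.mat <- cbind(sample = sample.names, MASE = mase)

# with stringsAsFactors = TRUE, every column of the data.frame is a factor
eval.df <- as.data.frame(bound.mat, stringsAsFactors = TRUE)
is.factor(eval.df$MASE)

# as.numeric() on a factor returns the integer level codes, not the values
codes <- as.numeric(eval.df$MASE)

# converting to character first recovers the original numbers
values <- as.numeric(as.character(eval.df$MASE))
```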

huqiwen0313 · 2018-04-11T20:00:39Z

04-isolated_immune_cell_reconstruction.Rmd

+  ggplot2::theme_bw() +
+  ggplot2::scale_fill_manual(values = c("white", "gray50", "black")) +
+  ggplot2::ggtitle(paste("All, n =", ncol(z.matrix)))
+```


In the plot, does recount2 mean reconstructing the gene expression of the recount2 samples based on the recount2 PLIER model?

Yes, that is correct.

huqiwen0313 · 2018-04-11T21:02:31Z

04-isolated_immune_cell_reconstruction.Rmd

+                       "E-MTAB-2452_reconstruction_error_recount2_model.pdf")
+ggplot2::ggsave(plot.file, plot = ggplot2::last_plot())
+```
+


Maybe add statistics to show that the distributions are significantly different (t-test or ANOVA)? One benefit is that it provides a quantitative way to support the conclusion (e.g., the pre- and post-reconstruction correlation values are much more similar between two datasets), but it is up to you since the difference is clear from the plot.

pairwise.t.test coming up in the next commit
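A minimal sketch of what such a test could look like with `stats::pairwise.t.test` is below. The toy data, the grouping by LV subset, and the Bonferroni correction are assumptions for illustration; the commit itself may group or correct differently.

```r
set.seed(42)

# toy per-sample correlation values for three hypothetical LV subsets
toy.df <- data.frame(
  correlation = c(rnorm(20, mean = 0.90, sd = 0.02),
                  rnorm(20, mean = 0.80, sd = 0.02),
                  rnorm(20, mean = 0.60, sd = 0.02)),
  lv.subset = rep(c("all", "pathway", "non-pathway"), each = 20)
)

# pairwise t-tests between groups with multiple testing correction
t.res <- pairwise.t.test(x = toy.df$correlation, g = toy.df$lv.subset,
                         p.adjust.method = "bonferroni")

# lower-triangular matrix of adjusted p-values
t.res$p.value
```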

jaclyn-taroni added 4 commits April 9, 2018 15:48

Update: save reconstructed data and eval df

8e99919

Add evaluation df missed in last commit

009fbbb

Add sample names to eval df

62d3f71

Add notebook for reconstruction in sorted cell types data

609d77c

jaclyn-taroni requested review from gwaybio and huqiwen0313 April 11, 2018 12:58

gwaybio approved these changes Apr 11, 2018

Small updates in response to PR comments

009cd09

small doc change, relabel boxplot x axis ticks

huqiwen0313 approved these changes Apr 11, 2018

Add pairwise t-test, rerun

8e5510d

jaclyn-taroni merged commit 384dcf6 into greenelab:master Apr 12, 2018

jaclyn-taroni deleted the cell-type-recon branch April 12, 2018 01:08
