You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
During the standard analysis, filter_missval() silently sorts the rowData alphabetically. If we skip this step, this results in a rowData where the rownames and the name column contain the same information, but add_rejections() later sorts on the name column without reordering rownames, getting values that do not match. This creates problems later on when functions such as single_plot() use subset(), which work on rownames, then tries to use the name column.
For example, with the steps from ?add_rejections:
# Load example
data <- UbiLength
data <- data[data$Reverse != "+" & data$Potential.contaminant != "+",]
data_unique <- make_unique(data, "Gene.names", "Protein.IDs", delim = ";")
# Make SummarizedExperiment
columns <- grep("LFQ.", colnames(data_unique))
exp_design <- UbiLength_ExpDesign
se <- make_se(data_unique, columns, exp_design)
# Filter, normalize and impute missing values
# filt <- filter_missval(se, thr = 0) # <- no filtering step: no sorting!
norm <- normalize_vsn(se)
imputed <- impute(norm, fun = "MinProb", q = 0.01)
# Test for differentially expressed proteins
diff <- test_diff(imputed, "control", "Ctrl")
dep <- add_rejections(diff, alpha = 0.05, lfc = 1)
Then the resulting object has wrong names:
SummarizedExperiment::rowData(dep, use.names = TRUE) |>
as.data.frame() |> head() |> dplyr::select(name, ID)
#> name ID
#> RBM47 A7E2Y1-3 A7E2Y1-3
#> UBA6 AAAS Q9NRG9
#> ILVBL AAMDC Q9H7C9-3
#> FAM92A1 AAR2 Q9Y312
#> ACO2 ABCB6 H7BXK9
#> RPRD1B ABCB7 O75027-3
plot_single(dep, proteins = c("A7E2Y1-3","AAAS"))
#> Warning: 6 parsing failures.
#> row col expected actual
#> 1 -- value in level set ACTR3
#> 2 -- value in level set ACTR3
#> 3 -- value in level set ACTR3
#> 4 -- value in level set TUFM
#> 5 -- value in level set TUFM
#> ... ... .................. ......
#> See problems(...) for more details.
For users: always use filter_missval(), even with a high threshold.
The text was updated successfully, but these errors were encountered:
During the standard analysis,
filter_missval()
silently sorts therowData
alphabetically. If we skip this step, this results in arowData
where therownames
and thename
column contain the same information, butadd_rejections()
later sorts on thename
column without reorderingrownames
, getting values that do not match. This creates problems later on when functions such assingle_plot()
usesubset()
, which work on rownames, then tries to use thename
column.For example, with the steps from
?add_rejections
:Then the resulting object has wrong names:
For users: always use
filter_missval()
, even with a high threshold.The text was updated successfully, but these errors were encountered: