[Feature Request] Differing number of replicates in datasets #3

nlgittens · 2024-04-29T14:22:58Z

Issue: ReX can only handle datasets with an identical number of replicates across timepoints (and perhaps across states too?).

This may be quite a common problem: datasets can have missed timepoints, a different number of non-deuterated experiments, or a different number of replicates between states, none of which can currently be handled.

This might be because the matrix is defined by the number of timepoints and the number of replicates, rather than by distinct experiments? The error seems to come from the error_prediction function, but I can imagine there are implications for other functions too, since the number of timepoints is also defined elsewhere.

Reproducible example:

library(dplyr)         # %>% and filter()
library(S4Vectors)     # DataFrame()
library(BiocParallel)  # SerialParam()
library(RexMS)         # assumed package name for ReX: rex(), cleanHDX(), BRD4_apo

data("BRD4_apo")

# Filter the data so 0 s contains only 2 replicates; other timepoints contain 3 replicates
BRD4_apo <- BRD4_apo %>%
  filter(!(Exposure == 0 & replicate == 3))

BRD4_apo <- DataFrame(BRD4_apo)
BRD4_apo <- cleanHDX(res = BRD4_apo, clean = TRUE)
BRD4_apo <- data.frame(BRD4_apo) %>% filter(End < 100)
BRD4_apo <- DataFrame(BRD4_apo)

numTimepoints <- length(unique(BRD4_apo$Exposure))
Timepoints <- unique(BRD4_apo$Exposure)
numPeptides <- length(unique(BRD4_apo$Sequence))
set.seed(1)
rex_test <- rex(HdxData = BRD4_apo,
                numIter = 100,
                R = max(BRD4_apo$End),
                density = "laplace",
                numtimepoints = numTimepoints,
                timepoints = Timepoints,
                seed = 1L,
                tCoef = c(0, rep(1, numTimepoints - 1)),
                phi = 1,
                BPPARAM = SerialParam())

Warning: 'package:stats' may not be available when loading
Warning: 'package:stats' may not be available when loading
Fold 1 ... Fold 2 ... Fold 3 ... Fold 4 ... Fold 5 ...
Warning in res$Uptake[res$Sequence == unique(res$Sequence)[j]] - rep(mu, :
longer object length is not a multiple of shorter object length

Fold 1 ... Fold 2 ... Fold 3 ... Fold 4 ... Fold 5 ...
Warning in res$Uptake[res$Sequence == unique(res$Sequence)[j]] - rep(mu, :
longer object length is not a multiple of shorter object length

Error: BiocParallel errors
2 remote errors, element index: 1, 2
0 unevaluated and other errors
first remote error:
Error in .sd[j, ] <- rep(tCoef * numExch[[j]] * sqrt(sigmasq), each = numRep): number of items to replace is not a multiple of replacement length
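
The failing assignment repeats each value `numRep` times, so it seems to assume every peptide has exactly (number of timepoints) x (number of replicates) rows. A quick base-R check for that assumption could look like the sketch below. It runs on a made-up toy table that only borrows the column names used above (`Sequence`, `Exposure`, `replicate`, `Uptake`); the names `toy`, `repCounts` and `isBalanced` are illustrative and not part of ReX.

```r
# Toy data mimicking the columns used above; values are invented,
# not taken from BRD4_apo.
toy <- expand.grid(Sequence = c("PEPTIDEA", "PEPTIDEB"),
                   Exposure = c(0, 15, 60),
                   replicate = 1:3,
                   stringsAsFactors = FALSE)
toy$Uptake <- seq_len(nrow(toy)) / 10

# Drop replicate 3 at 0 s, as in the reproducible example.
toy <- toy[!(toy$Exposure == 0 & toy$replicate == 3), ]

# Count the replicates observed for each peptide/timepoint pair.
repCounts <- aggregate(replicate ~ Sequence + Exposure, data = toy, FUN = length)
names(repCounts)[3] <- "numRep"
print(repCounts)

# The design is balanced only if every pair has the same replicate count.
isBalanced <- length(unique(repCounts$numRep)) == 1
if (!isBalanced) {
  message("Unbalanced design: replicate counts range from ",
          min(repCounts$numRep), " to ", max(repCounts$numRep))
}

# Rows a timepoints-by-replicates layout expects per peptide,
# versus the rows that are actually present.
numTimepoints <- length(unique(toy$Exposure))
expectedPerPeptide <- numTimepoints * max(repCounts$numRep)
observedPerPeptide <- table(toy$Sequence)
```

Running a check like this on the real data before calling `rex()` would flag the mismatch up front instead of inside a BiocParallel worker.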


ococrook · 2024-04-29T17:15:55Z

Thanks Nathan, I had this one on my list. It's more of an enhancement than a bug. There are two ways to deal with this:

Impute them
Model them

Modelling them is quite computationally intensive, but heavy imputation can introduce bias. I suggest I write a simple imputation script that warns if there are lots of missing values?
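
A minimal sketch of what such an imputation script could look like, assuming missing replicates are filled with the mean of the observed replicates for the same peptide and timepoint. The function `imputeMissingReplicates`, its `maxFraction` threshold and the toy data are hypothetical; none of this is part of ReX, and mean imputation is only one possible choice.

```r
# Hypothetical helper: fill missing replicates with the per-peptide,
# per-timepoint mean and warn when too much of the data is imputed.
imputeMissingReplicates <- function(df, maxFraction = 0.1) {
  # Full grid of every peptide/timepoint/replicate combination.
  full <- expand.grid(Sequence = unique(df$Sequence),
                      Exposure = unique(df$Exposure),
                      replicate = unique(df$replicate),
                      stringsAsFactors = FALSE)
  out <- merge(full, df, by = c("Sequence", "Exposure", "replicate"),
               all.x = TRUE)
  missing <- is.na(out$Uptake)

  # Mean of the observed replicates in each peptide/timepoint group,
  # aligned to the rows of `out` (NaN if a whole group is missing).
  groupMean <- ave(out$Uptake, out$Sequence, out$Exposure,
                   FUN = function(x) mean(x, na.rm = TRUE))
  out$Uptake[missing] <- groupMean[missing]
  out$imputed <- missing

  fracMissing <- mean(missing)
  if (fracMissing > maxFraction) {
    warning(sprintf("%.0f%% of values were imputed; results may be biased.",
                    100 * fracMissing))
  }
  out
}

# Toy data with replicate 3 missing at 0 s (values are invented).
toy <- expand.grid(Sequence = c("PEPTIDEA", "PEPTIDEB"),
                   Exposure = c(0, 15, 60),
                   replicate = 1:3,
                   stringsAsFactors = FALSE)
toy$Uptake <- seq_len(nrow(toy)) / 10
toy <- toy[!(toy$Exposure == 0 & toy$replicate == 3), ]

filled <- imputeMissingReplicates(toy, maxFraction = 0.2)
```

Mean imputation shrinks the apparent replicate variance, which is one way the bias mentioned above can creep in. The `imputed` column keeps the filled rows identifiable downstream. On real data, any other columns of the imputed rows would be left as NA and would need filling separately.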

ococrook · 2024-04-29T17:19:52Z

An example dataset might be useful!

nlgittens changed the title ~~[BUG] A short description of the bug~~ [BUG] Differing number of replicates in datasets Apr 29, 2024

ococrook changed the title ~~[BUG] Differing number of replicates in datasets~~ [Feature Request] Differing number of replicates in datasets Apr 29, 2024

ococrook added the enhancement New feature or request label Apr 29, 2024
