set changes original data (as.data.table on DESeq/GRanges objects) #3230
Comments
Thank you for reporting. In future reports, please include the calls that attach the required libraries in your minimal reproducible example. From a brief investigation, it looks like the column addresses are identical:
sapply(res, address) == sapply(as.data.frame(res), address)
#       baseMean log2FoldChange          lfcSE           stat         pvalue           padj
#           TRUE           TRUE           TRUE           TRUE           TRUE           TRUE
Will fix.
Re-opening, as the PR with the fix is not yet merged.
I noticed a real pitfall (bug) while examining wrong results in my analysis.
Changing a data.table object also alters the original data from which it was copied with as.data.table. This happens for objects like DESeqResults and GRanges.
According to the vignette:
But this is apparently not true.
Here is a minimal example:
Now using a set function on DT also changes the values in the original res object.
We notice that the values for genes A and C are swapped. This is probably due to the fact that the values are sorted by setkey but the rownames of res are not!
Therefore any analysis using the original res will be completely wrong.
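To illustrate the by-reference behaviour behind this, here is a minimal sketch that uses only data.table. It does not reproduce the DESeqResults case; the table, its column names, and the `alias` variable are hypothetical stand-ins. It only shows that `setkey` reorders rows in place, so every name bound to the same data is affected.

```r
library(data.table)

# Hypothetical stand-in for a results table; not the data from this report.
DT <- data.table(gene = c("C", "A", "B"), value = c(3, 1, 2))

# Plain assignment does not copy a data.table: both names refer to one object.
alias <- DT

# setkey sorts the rows in place (by reference) and marks the key column.
setkey(alias, gene)

# DT was never passed to setkey, yet its rows are now sorted as well.
print(DT)
same_object <- identical(address(DT), address(alias))
print(same_object)
```

If the columns of a converted object share memory with the source object in the same way, reordering one reorders the other, while metadata stored elsewhere (such as row names) stays in the old order.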
I am aware that there are fixes for this like:
but my main issue is that this behaviour is not obvious and very dangerous for downstream work.
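One possible workaround of this kind, which may differ from the one meant above, is an explicit deep copy with data.table's `copy()` before calling any `set*` function. The sketch below assumes a plain data.table named `original` as a hypothetical stand-in for the converted results; it is an illustration, not a tested fix for DESeqResults or GRanges objects.

```r
library(data.table)

# Hypothetical stand-in for the table obtained from the conversion.
original <- data.table(gene = c("C", "A", "B"), value = c(3, 1, 2))

# copy() makes a deep copy, so the columns no longer share memory.
DT <- copy(original)

# Sorting the copy by reference leaves the source untouched.
setkey(DT, gene)

print(original$gene)   # still in the original order
print(DT$gene)         # sorted by the key

# Compare column addresses to confirm that nothing is shared.
shares_memory <- address(original$value) == address(DT$value)
print(shares_memory)
```

Comparing column addresses with `address()`, as in the last lines, is a quick way to check whether two objects still share a column before modifying either of them.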
This was already somehow reported but nothing has changed.
data.table issue
GRanges copy
I love using data.table, it is simply amazing.
Hopefully you can address this issue.
Thank you!