Problems if not all samples have the same chromosomes #4

Closed
MelinaKlostermann opened this issue Jan 18, 2023 · 2 comments
@MelinaKlostermann

Hi,
when I use makeBindingSites, I get the following error:
Error in .subset_by_GenomicRanges(x, i) :
‘x’ must have unique names when subsetting by a GenomicRanges subscript.

I think the error comes from the .collapseSamples function, which uses
for (i in seq_along(p)) {
    pSum = pSum + p[[i]]
}
This adds the samples element by element, so the chromosomes are not summed correctly if the samples contain different chromosomes or list them in a different order.
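To make the suspected failure mode concrete, here is a minimal base R sketch. It is not the package code: plain named lists of numeric vectors stand in for the per-sample coverage objects (an assumption, the real objects are presumably RleList-like), and all values are made up.

```r
# Stand-ins for two samples' per-chromosome signal (hypothetical values).
s1 <- list(KI270802.1 = c(1, 1), chr1 = c(5, 5))
s2 <- list(GL000225.1 = c(2, 2), chr1 = c(7, 7))

# Positional addition, mimicking `pSum = pSum + p[[i]]`:
# elements are paired by index, and the names of the first list are kept.
positional <- Map(`+`, s1, s2)

print(names(positional))
print(positional$KI270802.1)
```

In this sketch the first element of the result is labelled KI270802.1 but holds the sum of two different scaffolds, and GL000225.1 is gone from the names, which matches the behaviour described in this issue.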

For example, when I merge these two samples:

names(signal$signalPlus$`1_oe_FLAG`)
[1] "KI270802.1" "chr1" "chr10"
[4] "chr11" "chr12" "chr13"
[7] "chr14" "chr15" "chr16"
[10] "chr17" "chr18" "chr19"
[13] "chr2" "chr20" "chr21"
[16] "chr22" "chr3" "chr4"
[19] "chr5" "chr6" "chr7"
[22] "chr8" "chr9" "chrM"
[25] "chrX" "chrY"
names(signal$signalPlus$`2_oe_FLAG`)
[1] "GL000225.1" "chr1" "chr10"
[4] "chr11" "chr12" "chr13"
[7] "chr14" "chr15" "chr16"
[10] "chr17" "chr18" "chr19"
[13] "chr2" "chr20" "chr21"
[16] "chr22" "chr3" "chr4"
[19] "chr5" "chr6" "chr7"
[22] "chr8" "chr9" "chrM"
[25] "chrX" "chrY"
the merge does not contain "GL000225.1":
names(p)
[1] "KI270802.1" "chr1" "chr10" "chr11" "chr12"
[6] "chr13" "chr14" "chr15" "chr16" "chr17"
[11] "chr18" "chr19" "chr2" "chr20" "chr21"
[16] "chr22" "chr3" "chr4" "chr5" "chr6"
[21] "chr7" "chr8" "chr9" "chrM" "chrX"
[26] "chrY"

Instead, "KI270802.1" and "GL000225.1" are added together and the result is called "KI270802.1".

If I then merge all 4 samples, the merge looks like this:
names(sgnMergePlus)
[1] "KI270802.1" "chr1" "chr10"
[4] "chr11" "chr12" "chr13"
[7] "chr14" "chr15" "chr16"
[10] "chr17" "chr18" "chr19"
[13] "chr2" "chr20" "chr21"
[16] "chr22" "chr3" "chr4"
[19] "chr5" "chr6" "chr7"
[22] "chr8" "chr9" "chrM"
[25] "chrX" "chrY" NA
[28] NA

and it throws an error because two names are NA. However, it would probably not cause an error if only one name were NA, which is dangerous: the wrong chromosomes could be added up without any error being raised.
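One way to turn the silent case into a loud one is to compare chromosome names before summing. The following is only a sketch under assumptions: `checkChromosomeNames` is a hypothetical helper, not part of BindingSiteFinder, and `p` is taken to be a list with one named per-chromosome object per sample.

```r
# Hypothetical guard: stop early if samples disagree on chromosome names or order.
checkChromosomeNames <- function(p) {
    ref <- names(p[[1]])
    if (is.null(ref) || anyNA(ref)) {
        stop("sample 1 has missing chromosome names")
    }
    for (i in seq_along(p)) {
        if (!identical(names(p[[i]]), ref)) {
            stop("sample ", i, " differs from sample 1 in chromosome names or order")
        }
    }
    invisible(TRUE)
}

# Usage with made-up stand-in data: the second sample has a different scaffold.
p <- list(
    list(KI270802.1 = c(1, 1), chr1 = c(5, 5)),
    list(GL000225.1 = c(2, 2), chr1 = c(7, 7))
)
msg <- tryCatch(checkChromosomeNames(p), error = function(e) conditionMessage(e))
print(msg)
```

A check like this fails even when only a single chromosome is mismatched, so it does not depend on duplicated NA names to surface the problem.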

@MelinaKlostermann

I think a fix for the .collapseSamples function would probably be
pSum = p[[1]]
for (i in seq_along(p)[-1]) {
    pSum = c(pSum, p[[i]])
    # group the entries by chromosome name
    pSum = split(pSum, names(pSum))
    pSum = lapply(pSum, function(x) {
        if (length(x) == 2) {
            # chromosome present in both: add the two entries
            x[[1]] + x[[2]]
        } else {
            # chromosome present only once: keep it unchanged
            x[[1]]
        }
    })
}
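The same idea, matching chromosomes by name instead of by position, can be sketched in base R. This is an illustration under assumptions, not the package's implementation: `collapseByName` is a hypothetical helper, plain numeric vectors stand in for the coverage objects, and a chromosome is assumed to have the same length in every sample that contains it.

```r
# Hypothetical helper: sum per-chromosome signal across samples by name.
collapseByName <- function(p) {
    # union of chromosome names over all samples, in order of first appearance
    allChr <- unique(unlist(lapply(p, names)))
    out <- lapply(allChr, function(chr) {
        # keep only the samples that actually contain this chromosome
        have <- Filter(function(s) chr %in% names(s), p)
        parts <- lapply(have, function(s) s[[chr]])
        Reduce(`+`, parts)
    })
    names(out) <- allChr
    out
}

# Usage with made-up stand-in data.
p <- list(
    list(KI270802.1 = c(1, 1), chr1 = c(5, 5)),
    list(GL000225.1 = c(2, 2), chr1 = c(7, 7))
)
merged <- collapseByName(p)
print(names(merged))
print(merged$chr1)
```

Here a chromosome that is missing from a sample simply contributes nothing to the sum, so both scaffolds survive under their own names and only chr1 is actually added.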

@MelinaKlostermann

Hi, my fault. I updated to BindingSiteFinder Version 1.4.0 and this seems to solve the problem.

bds = BSFDataSetFromBigWig(ranges = pureclip_sites$PURB_oe_FLAG_merged_pureclip_sites_mtp001.bed, meta = meta)
Input ranges are not sorted, sorting for you.
Fixed ranges input, removing chr: GL000009.2 GL000195.1 GL000205.2 GL000213.1 GL000214.1 GL000219.1 GL000220.1 GL000224.1 GL000251.2 GL000252.2 GL000253.2 GL000254.2 GL000255.2 GL000256.2 ...

MirkoBr closed this as completed on Jun 22, 2023
MirkoBr self-assigned this on Apr 20, 2024