help with an erro please #8

dklcdbi · 2024-02-14T19:07:05Z

Hello, I was trying to use your BindingSiteFinder and got an error:

Error in if (all(!df.global$increaseOverMin)) { :
missing value where TRUE/FALSE needed

I was using BSFind(object = KObds, anno.genes = gns, anno.transcriptRegionList = regions,

          est.subsetChromosome = "chr3", veryQuiet = TRUE, est.maxBsWidth = 29)

and it also gave following message:
2.16% (17/788) peaks overlap with multiple anno.genes in the given gene annotation.
A single instance of each peak is kept. This is recommended.

Could you please tell me waht could be the problem? thanks

The text was updated successfully, but these errors were encountered:

MirkoBr · 2024-02-22T09:51:46Z

Hi,

thanks for point out this error.

It seems like you are running into a problem that happens with the automated binding sites width estimation. This is done through the estimateBsWidth() function, which is automatically invoked if you run BSFind() without specifying a particular binding sites size (via the parameter bsSize).

Without having seen some of your data as an example I can't follow it up in more detail.

For the moment I suggest, that you try to manually estimate the correct binding site size. Please have a look at the chapters 5.2 and 5.3 in the vignette on how to do this.

Hope this helps!

KevinDJMin · 2024-07-02T15:34:17Z

Hi, it seems like I also encountered the error right here. As the default vignette showed subset.chromosome to be limited, I have created a list to incorporate about ten more chromosomes, but still encountered the error.

I tried to manually estimate the bsSize following the manual, but it cannot detect binding site signals:

Current definition does not result in binding sites on all chromosomes where signal was present.
No binding sites on: chr8 chr13 chr14 chr15 chr20 chr22 ML143359.1 ML143366.1 ML143369.1 ML143371.1 ML143377.1 ML143380.1 GL000009.2 GL000214.1 GL000218.1 GL000219.1 GL000220.1 GL000224.1 GL000255.2 GL000258.2 GL339449.2 GL383522.1 GL383546.1 GL949747.2 KI270333.1 KI270336.1 KI270337.1 KI270442.1 KI270466.1 KI270467.1 KI270706.1 KI270728.1 KI270733.1 KI270745.1 KI270754.1 KI270819.1 KI270831.1 KI270832.1 KI270850.1 KI270853.1 KI270866.1 KI270868.1 KI270879.1 KI270897.1 KI270905.1 KI270908.1 KI270913.1 KN538372.1 KV575244.1 KV880768.1 KZ208915.1 KZ559112.1

Any suggestion?

Thank you.

MelinaKlostermann · 2024-07-02T15:44:58Z

Hi,
I have a few questions for clarification.
So you used your own data set and you find crosslinks on all of these chromosomes right? My first thought would be that the signal is to weak on these chromosomes to define binding sites. Maybe we can check that first. Did you get pureclip peaks on all of these chromosomes?
Did you check once in a Genome Browser (like IGV) if you encounter a bell shaped distribution of peaks somewhere on these chromosomes? What is the maximum number of crosslinks in the center of these peaks? (Both very low and very high numbers below 20-30 or above 1000 can cause problems here). Could you maybe send me a screenshot of the signal in the genome browser of one of these peaks that do not become binding sites?

MelinaKlostermann · 2024-07-02T15:50:55Z

I made a new issue for this as it sounds like a different problem for me.

KevinDJMin · 2024-07-02T15:56:58Z

So you used your own data set and you find crosslinks on all of these chromosomes right?

Yes, I have completed in RBFOX2 eCLIP-seq PureCLIP crosslinking site definitions globally.

**Did you get pureclip peaks on all of these chromosomes?

Yes, when I manually checked the crosslinking sites outputted by PureCLIP peaks, all peaks are present in the chromosomes.

I am thinking on possibly the bigwig file that is having the problem, if the signal is not well captured. I will get back on the IGV screenshots in chromosome 8 which shows high enrichment score according to PureCLIP. Any recommendations on making a bigwig file? I was using deeptools bamcoverage function.

KevinDJMin · 2024-07-02T16:01:09Z

Here is one IGV image that shows binding sites at chromosome 8. BED file denotes the PureCLIP crosslinking site defined. -f.bw denotes the forward bigwig, with -r.bw the reverse bigwig for your reference.

I highly think my bigwig is incorrect. Any suggestions for rectifications?

MelinaKlostermann · 2024-07-03T06:25:59Z

I think one problem are indeed the bigwig files. BindingSiteFinder is build to work with single nucleotide crosslinks in bigwig format. For this you cannot turn the complete bam file into a bigwig but you need to select the nucleotide position 1nt upstream of the 5'end of the read in the bam file. To do that I usually turn the bam first into bed, select the upstream nucleotide and then turn it to bigwig after. (This approach comes from the iCLIP processing workflow, published by Busch et al 2020: https://www.sciencedirect.com/science/article/pii/S1046202318304948)

Here is a possible bash code to do this:

#### Convert all read locations to intervals in bed file format using BEDTools
  bedtools bamtobed -i {input.bam} > {output.bed}  \
  #### Shift intervals depending on the strand by 1 bp upstream using BEDTools
  bedtools shift -m 1 -p -1 -i {output.bed} -g {params.genome_fasta}.fai | sort -k1,1 -k2,2n -k3,3n > {output.bed_shift}
  #### Extract the 5' end of the shifted intervals and pile up into coverage track in bedgraph file format (separately for each strand) using BEDTools (in case of RPM-normalised coverage tracks, use additional parameter -scale with 1,000,000/#mappedReads)
  bedtools genomecov -bg -strand + -5 -i {output.bed_shift} -g {params.genome_fasta}.fai | LC_COLLATE=C sort -k1,1 -k2,2n -k3,3n > {output.bed_plus} 
  bedtools genomecov -bg -strand - -5 -i {output.bed_shift} -g {params.genome_fasta}.fai | LC_COLLATE=C sort -k1,1 -k2,2n -k3,3n  > {output.bed_minus} 
  #### conversion of bedgraph files to bw file format files using bedGraphToBigWig of the kentUtils suite
  bedGraphToBigWig {input.bed_plus} {params.genome_fasta}.fai {output.bw_plus} 
  bedGraphToBigWig {input.bed_minus} {params.genome_fasta}.fai {output.bw_minus}

It uses bedtools and bedGraphToBigWig from the kentUtils suite. Both can be installed with conda.

MelinaKlostermann · 2024-07-03T06:33:04Z

The code above comes from our clip processing tool racoon_clip, which retrieves the single nucleotide crosslinks from the raw fastq files. There racoon_clip also does the alignment to obtain the bam files a bit differently from encode, because it does not allow soft-clipping at the 5' end of the read and by default does not allow multimappers. Both make the final crosslink positions a bit cleaner. So maybe you could consider running racoon_clip on the encode fastq files instead (https://racoon-clip.readthedocs.io/en/latest/). But it should also work with using the code above.

MelinaKlostermann · 2024-07-03T06:51:00Z

If you rather want to do the shifting in R instead of bash you could also:
Turn the bam files into bed files (in bash).
Then something like this R code should work:

library(rtracklayer)
library(GenomicRanges)
crosslinks <- import.bed("file.bed")
crosslinks <- resize(crosslinks, width = 1, fix="start")
crosslinks <- shift(crosslinks, upstream = 1)
export.bw("file.bw")

KevinDJMin · 2024-07-03T06:54:52Z

Awesome! Thanks for the suggestion. I will try the above options and get back to you as soon as possible.

By the way, thank you for developing an awesome software! I actually had a very hard time to filter the robust crosslinking peaks in PureCLIP.

KevinDJMin · 2024-07-04T03:49:28Z

Dear Melina,

I have converted the bigwig file as what you described, and it works without any errors. Apparently it was caused by my lack of knowledge on CLIP-seq bigwig file. Hope this forum can help on the others who are having the same troubles.

Thank you.

MirkoBr self-assigned this Apr 20, 2024

MelinaKlostermann mentioned this issue Jul 2, 2024

Lost Signal on some chromosomes ZarnackGroup/racoon_clip#6

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

help with an erro please #8

help with an erro please #8

dklcdbi commented Feb 14, 2024 •

edited

Loading

MirkoBr commented Feb 22, 2024

KevinDJMin commented Jul 2, 2024

MelinaKlostermann commented Jul 2, 2024

MelinaKlostermann commented Jul 2, 2024

KevinDJMin commented Jul 2, 2024

KevinDJMin commented Jul 2, 2024 •

edited

Loading

MelinaKlostermann commented Jul 3, 2024 •

edited

Loading

MelinaKlostermann commented Jul 3, 2024

MelinaKlostermann commented Jul 3, 2024

KevinDJMin commented Jul 3, 2024

KevinDJMin commented Jul 4, 2024

help with an erro please #8

help with an erro please #8

Comments

dklcdbi commented Feb 14, 2024 • edited Loading

MirkoBr commented Feb 22, 2024

KevinDJMin commented Jul 2, 2024

MelinaKlostermann commented Jul 2, 2024

MelinaKlostermann commented Jul 2, 2024

KevinDJMin commented Jul 2, 2024

KevinDJMin commented Jul 2, 2024 • edited Loading

MelinaKlostermann commented Jul 3, 2024 • edited Loading

MelinaKlostermann commented Jul 3, 2024

MelinaKlostermann commented Jul 3, 2024

KevinDJMin commented Jul 3, 2024

KevinDJMin commented Jul 4, 2024

dklcdbi commented Feb 14, 2024 •

edited

Loading

KevinDJMin commented Jul 2, 2024 •

edited

Loading

MelinaKlostermann commented Jul 3, 2024 •

edited

Loading