function error for BSJs #2

Closed
marvel479 opened this issue Jun 1, 2020 · 18 comments

Comments

@marvel479

marvel479 commented Jun 1, 2020

Hi! I have been doing some downstream analysis of my circRNAs and got as far as module 12. That was as of Friday, 28th May. Now the same R script that the package suggests, and that was working before, doesn't work anymore.

> backSplicedJunctions <- getBackSplicedJunctions(gen, pathToExperiment = NULL)

Here I get:
[image]

This is what my data looks like after reading the GTF file:
> head(gen)
  chrom   start     end width strand type     gene_name
1  chr1 3073253 3074322  1070      + exon 4933401J01Rik
2  chr1 3102016 3102125   110      + exon       Gm26206
3  chr1 3213609 3216344  2736      - exon          Xkr4
4  chr1 3205901 3207317  1417      - exon          Xkr4
5  chr1 3213439 3215632  2194      - exon          Xkr4
6  chr1 3206523 3207317   795      - exon          Xkr4
         transcript_id exon_number
1 ENSMUST00000193812.1           1
2 ENSMUST00000082908.1           1
3 ENSMUST00000162897.1           1
4 ENSMUST00000162897.1           2
5 ENSMUST00000159265.1           1
6 ENSMUST00000159265.1           2

@Aufiero
Owner

Aufiero commented Jun 2, 2020

Hi,
from what you reported, I don't think the problem is with the GTF file. The GTF file is read correctly.
The function getBackSplicedJunctions is trying to read the file containing the circRNA prediction results (circRNA_X.txt, see vignettes).
Check that you set your working directory correctly (folder projectFolderName) and that the subfolders (mapsplice, nclscan, knife, circexplorer2, uroborus, circmarker or other) contain the circRNA_X.txt file. Also check that experiment.txt reports the correct file name. Let me know if this solves the problem.

S
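
The checks described above could be scripted along these lines. This is only a minimal sketch in base R: the helper name `checkProjectLayout` is hypothetical (it is not part of circRNAprofiler), and it assumes that experiment.txt sits at the top level of the project folder and that the prediction files sit in the tool subfolders named in the comment. The file name `circRNA_demo.txt` is a placeholder.

```r
# Hypothetical helper (NOT part of circRNAprofiler): report which of the
# expected inputs are present in a project folder.
checkProjectLayout <- function(projectDir,
                               toolDirs = c("mapsplice", "nclscan", "knife",
                                            "circexplorer2", "uroborus",
                                            "circmarker", "other")) {
  # Assumption: experiment.txt lives at the top level of the project folder
  hasExperiment <- file.exists(file.path(projectDir, "experiment.txt"))

  # Keep only the tool subfolders that actually exist
  present <- toolDirs[dir.exists(file.path(projectDir, toolDirs))]

  # List the files found in each existing subfolder
  filesByTool <- lapply(present, function(d) {
    list.files(file.path(projectDir, d))
  })
  names(filesByTool) <- present

  list(hasExperiment = hasExperiment, filesByTool = filesByTool)
}

# Usage on a throwaway folder (placeholder file names)
demoDir <- file.path(tempdir(), "projectFolderName")
dir.create(file.path(demoDir, "circexplorer2"),
           recursive = TRUE, showWarnings = FALSE)
invisible(file.create(file.path(demoDir, "experiment.txt")))
invisible(file.create(file.path(demoDir, "circexplorer2", "circRNA_demo.txt")))

layout <- checkProjectLayout(demoDir)
print(layout)
```

If `hasExperiment` is FALSE or the relevant subfolder is missing from `filesByTool`, the working directory is probably not the project folder.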

@marvel479
Author

marvel479 commented Jun 2, 2020 via email

@Aufiero
Owner

Aufiero commented Jun 2, 2020

Hi Aayushi,

It might be that a recent update of the dplyr package is causing the problem.

I'll look into it in the next few days, and if any changes are needed I'll push them so that you can install the latest version of circRNAprofiler.

@marvel479
Author

marvel479 commented Jun 2, 2020 via email

@Aufiero
Owner

Aufiero commented Jun 7, 2020

Hi Aayushi,

I fixed the bugs. Now the release version of circRNAprofiler builds correctly on Bioconductor.
You can install the release version of circRNAprofiler using:

if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install("circRNAprofiler")

For the documentation see:
browseVignettes("circRNAprofiler")

Let me know if you can proceed with the analysis.

Best,
Simona

@marvel479
Author

marvel479 commented Jun 7, 2020 via email

@marvel479
Author

Hi Simona,
It looks like the BSJ function is working, but I get an error when using the motif functions.

> targetsBTS_gr <-
+   getSeqsFromGRs(
+     annotatedBackgroundCircs,
+     genome,
+     lIntron = 200,
+     lExon = 9,
+     type = "ie")
Error in .getOneSeqFromBSgenomeMultipleSequences(x, names[i], start[i],  : 
  sequence chrMT not found

I had no circRNA candidates from chrMT, as I filtered those out, and I still get this error when I manually delete the chrM entries from the GTF data frame. It looks like the problem is with the genome. I am using mm10. Any ideas?

@Aufiero
Owner

Aufiero commented Jun 8, 2020

Hi Aayushi,

thanks for letting me know. I'll do a test and I'll let you know.

S

@Aufiero
Owner

Aufiero commented Jun 8, 2020

You ran this to get the mm10 genome, right?

if (!requireNamespace("BSgenome.Mmusculus.UCSC.mm10", quietly = TRUE)){
  BiocManager::install("BSgenome.Mmusculus.UCSC.mm10")
}
genome <- BSgenome::getBSgenome("BSgenome.Mmusculus.UCSC.mm10")

And which genome annotation (GTF) did you use? GENCODE, UCSC, NCBI or Ensembl?

@marvel479
Author

marvel479 commented Jun 8, 2020 via email

@Aufiero
Owner

Aufiero commented Jun 8, 2020

It might be, and if you remove the circRNAs arising from chrMT, it should work.
So in your case, before running getSeqsFromGRs(), you should fix the chrom column:

library(dplyr)  # provides %>% and mutate()

annotatedBackgroundCircsFixed <- annotatedBackgroundCircs %>%
  dplyr::mutate(chrom = ifelse(.data$chrom == 'chrMT', 'chrM', .data$chrom))

BTW I will now introduce this check for chrMT in the code of circRNAprofiler.
I'll notify you when it's done.

S
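
To see which chromosome names would trip up the sequence lookup before calling getSeqsFromGRs(), something like the following might help. It is a sketch in base R on toy data: the helper `findUnknownChroms` is hypothetical (not part of circRNAprofiler), and the two character vectors stand in for `annotatedBackgroundCircs$chrom` and for the sequence names of the genome object (with a real BSgenome object these would come from `seqnames(genome)`).

```r
# Hypothetical helper (NOT part of circRNAprofiler): list chromosome names
# that appear in the annotation but are unknown to the genome.
findUnknownChroms <- function(chrom, genomeSeqnames) {
  setdiff(unique(chrom), genomeSeqnames)
}

# Toy stand-in for annotatedBackgroundCircs$chrom
chrom <- c("chr1", "chr2", "chrMT", "chr1")

# Toy stand-in for the genome's sequence names (assumption: with a real
# BSgenome object this vector would be seqnames(genome))
genomeSeqnames <- c("chr1", "chr2", "chrM")

# Names present in the annotation but missing from the genome
print(findUnknownChroms(chrom, genomeSeqnames))

# Same renaming as in the comment above, written in base R
chromFixed <- ifelse(chrom == "chrMT", "chrM", chrom)

# After the renaming nothing should be left over
print(findUnknownChroms(chromFixed, genomeSeqnames))
```

Any name still reported after the renaming points to another mismatch between the annotation and the genome naming style.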

@marvel479
Author

marvel479 commented Jun 9, 2020 via email

@marvel479
Author

Hey Simona,
Sorry to bother you again. I have a new problem now, this time with the getMotifs function.

> motifsFTS_gr <-
    getMotifs(targetsFTS_gr,
              width = 6,
              database = 'ATtRACT',
              species = "Mmusculus",
              rbp = TRUE,
              reverse = FALSE)

I am getting this:

Error in if (targetsToAnalyze$type[1] == "circ") { : 
  missing value where TRUE/FALSE needed

I do not believe this is something I was seeing before, and as I understand it, it has something to do with a default not being read correctly. Any idea what can be changed?

@Aufiero
Owner

Aufiero commented Jun 10, 2020

No problem. Can you run this command and show me the output:
head(targetsFTS_gr)

@marvel479
Author

This is what I get:

> head(targetsFTS_gr)
$upGR
                                id  gene           transcript strand chrom   startGR     endGR length
1 Syt14:-:chr1:192986882:192980345 Syt14 ENSMUST00000215093.1      -  chr1 192986873 192987082    210
seq
1 AUACUUAUUUAGAAACGUUUUGAAAAAUUUUGAAUGAAUGCUGUUGAAUAUAUUGAAAAUAUAUCUUUACACUUUGUGGGGUUUCUUACAAACUAUUUUAAAUUAUGACAUUUUAAAAGUAGUUGACAUUAGAUAGCAAACUGUCUAGCGUAUACUGGGAAAUUCCUUCUUAUUCUCAGCCUUGACUUUCUUUUCUCUAGUCUCUCCAGA
type
1   ie

$downGR
                                id  gene           transcript strand chrom   startGR     endGR length
1 Syt14:-:chr1:192986882:192980345 Syt14 ENSMUST00000215093.1      -  chr1 192980146 192980355    210
seq
1 AUGGAGGACGGUAAGAACGCUAUUAUUUUAGAUGAUUUAUACCUAAAAUUCUAGGUAUGUAUGGUUGCACACUGACAGAGAGAGUGCAGAAUUCUGUUUCUAAGUGCAGUCCAAAUAAAAAGUUUGAGCUUGUGAUCAGCUCCAAAAUACCCCAAAUGGAAAGACAAAGUGUUGGACUCAGUGUGAUACUGGGAUUCUCACUCACAGUCU
type
1   ie

@Aufiero
Owner

Aufiero commented Jun 10, 2020

It seems OK. You should not get any error, since your targetsFTS_gr$upGR$type[1] and targetsFTS_gr$downGR$type[1] are both equal to ie.

I did a test, and if targetsFTS_gr$upGR$type[1] or targetsFTS_gr$downGR$type[1] is NA you get that error message, but for that there would have to be an NA value in there. Could you maybe rerun targetsFTS_gr <- getSeqsFromGRs(...) and check again?
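
A quick way to look for such NA values, and to see why they produce this particular message, might be the following. It is a base R sketch on toy data: `findMissingTypes` is a hypothetical helper (not part of circRNAprofiler), and the small list only imitates the shape of `targetsFTS_gr` shown above.

```r
# Hypothetical helper (NOT part of circRNAprofiler): return the names of the
# list elements whose `type` column contains at least one NA.
findMissingTypes <- function(targets) {
  hasNA <- vapply(targets, function(df) anyNA(df$type), logical(1))
  names(targets)[hasNA]
}

# Toy stand-in for targetsFTS_gr, with an NA planted in downGR
targets <- list(
  upGR   = data.frame(id = "circ1", type = "ie",
                      stringsAsFactors = FALSE),
  downGR = data.frame(id = "circ1", type = NA_character_,
                      stringsAsFactors = FALSE)
)

# Which elements have a missing type?
print(findMissingTypes(targets))

# Comparing NA with a string yields NA, and if() cannot branch on NA;
# the error is caught here so that its message can be inspected.
msg <- tryCatch(
  if (targets$downGR$type[1] == "circ") "circ" else "other",
  error = function(e) conditionMessage(e)
)
print(msg)
```

On the real object, `findMissingTypes(targetsFTS_gr)` returning an empty result would suggest the NA comes from somewhere else.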

@marvel479
Author

marvel479 commented Jun 11, 2020 via email

@Aufiero
Owner

Aufiero commented Jun 11, 2020

Hi Aayushi, thanks, and thank you for reporting the issues so that I can improve the code.

The chrM/chrMT bug should be solved in the development version of circRNAprofiler (not in the release).

You can install the devel version of circRNAprofiler with:

BiocManager::install(version='devel')
BiocManager::install("circRNAprofiler")

I'll now close the issue that you opened on GitHub.
