New genome assembly #54

Open
marvel479 opened this issue Mar 26, 2020 · 9 comments · May be fixed by #55

@marvel479

Hi! I am new to coding and have been trying to annotate the splice sites I identified with find_circ.py. My reads are from Danio rerio (assembly 11), which is not currently supported. Is there a workaround or alternative code for loading ENSEMBL annotations so that I can still use ciRcus?

@mschilli87
Collaborator

@marvel479: Have a look at https://github.com/BIMSBbioinfo/ciRcus/pull/51/commits for an example of how to add a species. As long as your assembly is on Ensembl, it's straightforward.
Otherwise, you could create the database yourself instead of relying on ciRcus. If this is too complicated for you, but you are willing to test for me, I could support you by attempting a PR.
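
For the second route (creating the database yourself), a minimal R sketch could look like the following. It assumes that `loadAnnotation` reads a TxDb SQLite file, which this thread does not confirm. `build_txdb_sqlite` and both file paths are hypothetical names introduced for illustration. `makeTxDbFromGFF` is taken from GenomicFeatures here; depending on the Bioconductor release, it may live in the txdbmaker package instead.

```r
# Hypothetical helper: build a TxDb from a local GTF file and save it as SQLite.
build_txdb_sqlite <- function(gtf.file, db.file) {
  # Parse the GTF into a TxDb object (transcripts, exons, CDS).
  txdb <- GenomicFeatures::makeTxDbFromGFF(gtf.file, format = "gtf")
  # Write the TxDb to disk so it can be reloaded without re-parsing the GTF.
  AnnotationDbi::saveDb(txdb, file = db.file)
  invisible(db.file)
}

# Placeholder paths (assumptions): point these at your own files.
gtf.file <- "Danio_rerio.GRCz11.gtf.gz"
db.file  <- file.path(tempdir(), "zebrafish_GRCz11_txdb.sqlite")

# Only run when the packages and the input file are actually available.
if (requireNamespace("GenomicFeatures", quietly = TRUE) &&
    requireNamespace("AnnotationDbi", quietly = TRUE) &&
    file.exists(gtf.file)) {
  build_txdb_sqlite(gtf.file, db.file)
}
```

If the assumption about `loadAnnotation` holds, the resulting file could then be passed to it by its full path.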

@marvel479
Author

@mschilli87, I am still not sure how to do this bit. It would be great if you could add another genome. I am most willing to test this for you and to assist in any other way possible; unfortunately, my skills are limited at the moment. Your help is really appreciated.

@mschilli87 mschilli87 self-assigned this Mar 28, 2020
mschilli87 added a commit that referenced this issue Mar 30, 2020
Note that this pushes some unrelated documentation changes introduced by
`devtools::build`.

fixes #54
@mschilli87 mschilli87 linked a pull request Mar 30, 2020 that will close this issue
@mschilli87
Collaborator

@marvel479:

Could you please test the following branch and report back so we can add this to the development version if it works?

BiocManager::install("BIMSBbioinfo/circus@dr11")

@marvel479
Author

marvel479 commented Mar 30, 2020 via email

@marvel479
Author

Hi Marcel, I know this error has been seen before, but it was not clearly resolved. When I just try to load the human database for tests, I get the following error:

`> gtf2sqlite(assembly = "hg19", db.file = system.file("extdata/db/human_hg19_ens75_txdb.sqlite", package="ciRcus"))
snapshotDate(): 2019-10-29
downloading 1 resources
retrieving 1 resource
|=================================================================================================================| 100%

loading from cache
TxDb object:
# Db type: TxDb
# Supporting package: GenomicFeatures
# Genome: hg19
# transcript_nrow: 196354
# exon_nrow: 674156
# cds_nrow: 269141
# Db created by: GenomicFeatures package from Bioconductor
# Creation time: 2020-03-30 01:24:15 -0700 (Mon, 30 Mar 2020)
# GenomicFeatures version at creation time: 1.38.2
# RSQLite version at creation time: 2.2.0
# DBSCHEMAVERSION: 1.2
Warning message:
In .get_cds_IDX(mcols0$type, mcols0$phase) :
  The "phase" metadata column contains non-NA values for features of type stop_codon. This information was
  ignored.

> annot.list <- loadAnnotation(system.file("extdata/db/human_hg19_ens75_txdb.sqlite",
+                                          package="ciRcus"))
loading TxDb annotation from SQLite database file...
Error in dbFileConnect(file) : DB file '' not found`

Any idea how I can get this taken care of? If system.file is not required, how can I load the databases included in the package?

@mschilli87
Collaborator

@marvel479: I think there is a misconception. AFAIK, system.file(..., package = "ciRcus") just returns a standard path and you can completely ignore it. I'm not even sure if ciRcus actually does ship any annotation.
gtf2sqlite uses the AnnotationHub package to query it from ENSEMBL's servers and stores a local copy in a SQLite3 database at the path you tell it.
So just make sure to pass an existing, writable path (forget about system.file) and you should be able to load it with loadAnnotation.
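
A minimal sketch of that advice, using only the two calls already shown in this thread (`gtf2sqlite(assembly, db.file)` and `loadAnnotation(path)`). The variable names `missing.path` and `db.path`, and the choice of `tempdir()` as the writable location, are assumptions for illustration.

```r
# system.file() returns "" when the requested file is not found, which is
# consistent with the "DB file '' not found" error reported above.
missing.path <- system.file("extdata/db/no_such_file.sqlite", package = "base")
print(missing.path)

# Choose an explicit, writable location instead (tempdir() is only an example).
db.path <- file.path(tempdir(), "human_hg19_ens75_txdb.sqlite")

# Guarded so the snippet also runs where ciRcus is not installed.
if (requireNamespace("ciRcus", quietly = TRUE)) {
  # Fetch the annotation and store it at db.path ...
  ciRcus::gtf2sqlite(assembly = "hg19", db.file = db.path)
  # ... then load it back from the very same path.
  annot.list <- ciRcus::loadAnnotation(db.path)
}
```

For anything that should persist between R sessions, a permanent directory would be a better fit than `tempdir()`.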

@retaj: Maybe we should update the README and/or fix system.file to actually return something useful rather than '' for ciRcus? 😉

@mschilli87
Collaborator

@marvel479: Did my changes in #55 work for you? With some feedback from your side, we'd be able to include this in the next ciRcus version so other zebrafish researchers benefit from my work as well. But I'd be reluctant to share untested code, so please get back to us.

@marvel479
Author

marvel479 commented Apr 14, 2020 via email

@marvel479
Author

> @marvel479: Did my changes in #55 work for you? With some feedback from your side, we'd be able to include this in the next ciRcus version so other zebrafish researchers benefit from my work as well. But I'd be reluctant to share untested code, so please get back to us.

Hi Marcel,
I have verified that the new assembly works. Could you please add it as dr11 rather than GRCz11, as dr11 is more intuitive? Apart from that, it works perfectly. I would also like to mention that system.file does not work for creating annot.list; I used a local path instead.

Thanks for all your help through these issues.
